Hindsight Architecture Walkthrough

What is Hindsight?

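Hindsight is the long-term memory system for Hermes Agent. It lets an AI agent remember, across sessions, who you are, your preferences, your project decisions, and your technical environment, instead of getting to know you from scratch after every reset.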

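Three core operations:

retain (store) → extract structured facts from the conversation and store them in the knowledge graph + vector store
recall (retrieve) → semantic search + keyword match + entity-graph traversal + reranking to find the most relevant memories
reflect → reason across all memories and synthesize a structured answer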

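Architecture layers

From Hermes Agent down to the underlying database there are 6 layers; the diagram distinguishes them with alternating vermilion / terracotta / ink colors: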

Hermes Agent · run_agent.py

↓ MemoryManager.prefetch() / sync_turn()

HindsightMemoryProvider · 1760-line implementation

↓ _get_client() · _run_sync(coro)

hindsight_client · Hindsight() / HindsightEmbedded()

↓ HTTP REST · localhost:9177

Hindsight API Daemon · FastMCP · PID 95629

↙ pg0 ↘ Z.AI API

PostgreSQL 18.1 + pgvector · 127.0.0.1:5433

GLM-4-Flash · Z.AI · retain/reflect

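The three core operations

📥 retain: store

The agent calls hindsight_retain, or sync_turn fires automatically
The content enters an asynchronous writer queue (single thread, serial)
The Hindsight API receives it → an LLM extracts structured facts
Entities are stored in the knowledge graph (PG)
The text is vectorized → vector store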
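
The retain path above hands content to a single writer thread so the agent never blocks on storage. A minimal sketch of that pattern using only the Python standard library; `RetainQueue` and the `send` callback are hypothetical names for illustration, not the actual HindsightMemoryProvider API:

```python
import queue
import threading
from typing import Callable, Optional


class RetainQueue:
    """Serializes retain jobs through one background writer thread."""

    def __init__(self, send: Callable[[str], None]) -> None:
        # Hypothetical callback: stands in for the request to the Hindsight API.
        self._send = send
        self._jobs: "queue.Queue[Optional[str]]" = queue.Queue()
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def enqueue(self, content: str) -> None:
        """Return immediately; the writer thread does the slow work."""
        self._jobs.put(content)

    def _drain(self) -> None:
        while True:
            job = self._jobs.get()
            if job is None:  # sentinel: everything queued before it is done
                break
            try:
                self._send(job)
            except Exception:
                pass  # one failed retain should not kill the writer thread

    def shutdown(self) -> None:
        """Wait for queued jobs to be consumed, then stop the worker."""
        self._jobs.put(None)
        self._worker.join()
```

Because there is only one consumer, jobs are written in the order they were enqueued, which is one plausible reason to prefer a serial writer over a thread pool here.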

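🔍 recall: retrieve

The agent calls hindsight_recall
The Hindsight API runs a 3-way retrieval:
① Semantic search (vector similarity)
② Keyword match (FTS5)
③ Entity-graph traversal (knowledge graph)
Reranker reorders the candidates → returns Top-N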
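
To make the merge step concrete, the sketch below combines three ranked candidate lists into one Top-N list. The source does not say how Hindsight's reranker works, so this uses reciprocal rank fusion as a simple stand-in; `fuse_rankings`, the memory IDs, and the constant `k=60` are all assumptions for illustration:

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def fuse_rankings(rankings: Sequence[Sequence[str]], top_n: int = 5, k: int = 60) -> List[str]:
    """Merge several ranked lists of memory IDs with reciprocal rank fusion."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, memory_id in enumerate(ranking, start=1):
            # Items near the top of any list contribute the most.
            scores[memory_id] += 1.0 / (k + rank)
    # Highest score first; ties broken by ID so the output is deterministic.
    ordered = sorted(scores, key=lambda m: (-scores[m], m))
    return ordered[:top_n]


semantic = ["m1", "m2", "m3"]  # vector similarity
keyword = ["m2", "m4"]         # keyword match
graph = ["m2", "m1"]           # entity-graph traversal
top = fuse_rankings([semantic, keyword, graph], top_n=3)
print(top)  # ['m2', 'm1', 'm4']
```

A memory that appears in all three lists (`m2`) outranks one that tops a single list, which is the behavior you want from a multi-path retriever.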

🧠 reflect

The agent calls hindsight_reflect
Hindsight runs recall to fetch all relevant memories
The LLM reasons across those memories
Returns a structured, synthesized answer
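
Reflect can be read as recall followed by a single LLM call over the results. A rough sketch under that reading; the `recall` and `llm` callables and the prompt wording are hypothetical, and the real daemon may structure this differently:

```python
from typing import Callable, List


def reflect(question: str, recall: Callable[[str], List[str]], llm: Callable[[str], str]) -> str:
    """Fetch relevant memories, then ask the LLM to reason over all of them at once."""
    memories = recall(question)
    numbered = "\n".join(f"{i}. {m}" for i, m in enumerate(memories, start=1))
    prompt = (
        "Answer the question using only the memories below.\n"
        f"Memories:\n{numbered or '(none)'}\n"
        f"Question: {question}"
    )
    return llm(prompt)
```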

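🔄 Automatic prefetch

MemoryManager calls it automatically before every turn
Runs a background recall using the user's latest message as the query
Results are injected into the system prompt; no manual tool call needed
Configurable via auto_recall=true/false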
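
The prefetch steps above amount to "recall, then append to the system prompt, unless auto_recall is off". A small sketch of that logic; `build_system_prompt`, the `recall` callable, and the "Relevant memories" header are assumptions, not the format Hermes actually injects:

```python
from typing import Callable, List


def build_system_prompt(
    base_prompt: str,
    user_message: str,
    recall: Callable[[str], List[str]],
    auto_recall: bool = True,
) -> str:
    """Append recalled memories to the system prompt when auto_recall is on."""
    if not auto_recall:
        return base_prompt
    memories = recall(user_message)  # the latest user message is the query
    if not memories:
        return base_prompt
    lines = "\n".join(f"- {m}" for m in memories)
    return f"{base_prompt}\n\nRelevant memories:\n{lines}"
```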

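Session lifecycle

// At the start of each session:
Agent starts → MemoryManager → initialize(session_id)
                             → create an async event loop (background thread)
                             → initialize hindsight_client
                             → register atexit cleanup

// On every turn:
Before the turn: → prefetch(query) → background recall → inject into system prompt
After the turn:  → sync_turn(user_msg, asst_msg) → enqueue retain job
Tool calls:      → agent calls hindsight_retain / _recall / _reflect

// At session end:
→ shutdown() → wait for the queue to drain → close client session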
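
The lifecycle above relies on an event loop running in a background thread, with synchronous callers submitting coroutines to it. One way this could look with the standard asyncio API; `LoopRunner` and `fake_recall` are hypothetical names, and the provider's real `_run_sync` may differ:

```python
import asyncio
import atexit
import threading


class LoopRunner:
    """Runs coroutines on an event loop that lives in a background thread."""

    def __init__(self) -> None:
        self._loop = asyncio.new_event_loop()
        self._thread = threading.Thread(target=self._loop.run_forever, daemon=True)
        self._thread.start()
        atexit.register(self.shutdown)  # clean up even if the caller forgets

    def run_sync(self, coro, timeout: float = 30.0):
        """Submit a coroutine to the background loop and block for its result."""
        future = asyncio.run_coroutine_threadsafe(coro, self._loop)
        return future.result(timeout=timeout)

    def shutdown(self) -> None:
        """Stop the loop and join the thread; safe to call more than once."""
        if self._loop.is_closed():
            return
        self._loop.call_soon_threadsafe(self._loop.stop)
        self._thread.join()
        self._loop.close()


async def fake_recall(query: str) -> str:
    await asyncio.sleep(0)  # stands in for an HTTP round trip
    return f"memories for {query!r}"
```

Usage: `LoopRunner().run_sync(fake_recall("plan"))` blocks the calling thread until the coroutine finishes on the background loop, which lets synchronous agent code call an async client.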

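Full interaction sequence

  User sends a message: "That plan we discussed yesterday..."

  Hermes → receives the message
  Hermes → prefetch("the plan discussed yesterday")   // background recall
  Hermes → assembles a system prompt that includes the context returned by Hindsight
  GLM-5 → references the history in its reply
  Hermes → sync_turn(user_msg, asst_msg) → enqueues a retain job
  Hindsight writer thread → POST /api/retain { content, context, tags }
  Hindsight API → LLM extracts entities → vectorize → store in PG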
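
For the POST step in the sequence above, the sketch below builds a request body for one turn. The field names `content`, `context`, and `tags` come from the sequence; how each value is formatted is an assumption, and `build_retain_payload` is a hypothetical helper:

```python
import json
from typing import List


def build_retain_payload(user_msg: str, asst_msg: str, session_id: str, tags: List[str]) -> str:
    """Serialize one conversation turn as a retain request body."""
    body = {
        # Field names follow the sequence above; the value formats are assumed.
        "content": f"User: {user_msg}\nAssistant: {asst_msg}",
        "context": f"session {session_id}",
        "tags": tags,
    }
    # ensure_ascii=False keeps non-ASCII text readable in the payload.
    return json.dumps(body, ensure_ascii=False)
```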

Local process topology

Actual running state on the Mac mini:

Hindsight Daemon: PID 95629 · :9177
PostgreSQL 18.1: managed by pg0 · port 5433
Connection pool: daemon → PG
Database size: 11 MB (hindsight database)

⚠️ Leftover: a previously abandoned instance (redis-dembed- / 5432, 0 connections) has been cleaned up. Only hindsight-embed-hermes (5433) is currently running.

hindsight_api.main · PID 95629 · FastMCP

↓ pg0 internal connection

PostgreSQL 18.1 (pg0) · PID 94685 · 127.0.0.1:5433

Hermes Agent · hindsight_client (HTTP)

GLM-4-Flash · Z.AI API (remote)

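Key findings:

Daemon idle_timeout=0 → never shuts itself down, so its lifecycle is decoupled from Hermes
pg0 manages PG → embedded PostgreSQL, minimal configuration, password stored statically
Database is only 11 MB → usage is still small, leaving plenty of room for long-term accumulation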

Configuration

// ~/.hermes/hindsight/config.json
{
  "mode":          "local_external",
  "api_url":       "http://localhost:9177",
  "recall_budget": "mid",
  "auto_recall":   true,
  "auto_retain":   true,
  "retain_async":  true,
  "retain_every_n_turns": 1,
  "memory_mode":   "hybrid"
}
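
As an illustration of how such a file might be consumed, the sketch below overlays user settings on a set of fallbacks. The fallbacks simply mirror the values shown above and are not the provider's documented defaults; `parse_config` and `load_config` are hypothetical helpers, and the file on disk is assumed to be plain JSON without the `//` path line:

```python
import json
from pathlib import Path
from typing import Any, Dict

# Assumption: these fallbacks mirror the example config, not real defaults.
DEFAULTS: Dict[str, Any] = {
    "mode": "local_external",
    "api_url": "http://localhost:9177",
    "recall_budget": "mid",
    "auto_recall": True,
    "auto_retain": True,
    "retain_async": True,
    "retain_every_n_turns": 1,
    "memory_mode": "hybrid",
}


def parse_config(text: str) -> Dict[str, Any]:
    """Overlay user settings on DEFAULTS and reject unknown keys."""
    user = json.loads(text)
    unknown = set(user) - set(DEFAULTS)
    if unknown:
        raise ValueError(f"unknown config keys: {sorted(unknown)}")
    return {**DEFAULTS, **user}


def load_config(path: Path) -> Dict[str, Any]:
    """Read the config file, falling back to DEFAULTS when it is missing."""
    if not path.exists():
        return dict(DEFAULTS)
    return parse_config(path.read_text(encoding="utf-8"))
```

Rejecting unknown keys catches typos such as `auto_recal` early instead of silently ignoring them.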

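memory_mode	Behavior	Best for
context 🧠	Automatic prefetch + system-prompt injection; no tools exposed	Automatic recall without calling tools manually
tools 🔧	Exposes tool schemas only; no automatic context injection	Precise control; call tools only when needed
hybrid ◀	Prefetch injected into the system prompt + tools available	Current setting: automatic recall + fine-grained retrieval on demand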
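
The table reduces to two independent switches: whether to inject context and whether to expose tools. A compact sketch of that mapping; `ModeBehavior` and `behavior_for` are hypothetical names for illustration:

```python
from typing import NamedTuple


class ModeBehavior(NamedTuple):
    inject_context: bool  # prefetch results go into the system prompt
    expose_tools: bool    # hindsight_* tool schemas are offered to the agent


def behavior_for(memory_mode: str) -> ModeBehavior:
    """Translate a memory_mode value into the two switches from the table."""
    table = {
        "context": ModeBehavior(inject_context=True, expose_tools=False),
        "tools": ModeBehavior(inject_context=False, expose_tools=True),
        "hybrid": ModeBehavior(inject_context=True, expose_tools=True),
    }
    if memory_mode not in table:
        raise ValueError(f"unknown memory_mode: {memory_mode!r}")
    return table[memory_mode]
```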

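Three operating modes

Mode	Client	Requires	Use case
cloud	hindsight_client (HTTP)	API key + cloud URL	Lightweight; uses no local resources
local_embedded	HindsightEmbedded	~200MB model + LLM key	Fully local, no external dependencies
local_external ◀	hindsight_client (HTTP)	Existing daemon + PG	Current: the daemon starts first, then Hermes connects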

📚 Further reading

plugins/memory/hindsight/__init__.py in the Hermes source: the complete 1760-line implementation.