Version: 0.9.53.dev.260429

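Backend:
1. The streaming thinking pipeline switches from raw reasoning_content to the `thinking_summary` summary protocol, adding the summary prompt, the digestor, and the Lite compression path. plan / execute / fallback now uniformly emit summaries only and never pass raw reasoning through, and the summary stream shuts off automatically once the body text starts.
2. thinking_summary is wired through the full timeline / SSE / outbox persistence loop. Only detail_summary and the necessary metadata are persisted, and seq self-checks, idempotent conflict detection, and seq backfill are strengthened to make replay recovery more stable.
3. Conversation history rules are tightened further: assistant body text and the timeline no longer write back raw reasoning_content and keep only the body text and thinking duration, so a refresh-and-restore does not expose internal reasoning text again.

Frontend:
4. The assistant page starts integrating the thinking_summary live stream and history restore, adding the short-summary status, the collapsible long-summary area, automatic close-out once the body stream starts, and a debug entry point for protocol integration testing and acceptance.
5. The frontend assistant page is still in a rough transitional state. This version focuses on wiring up the thinking_summary protocol and basic rendering; styling, interaction, and detail polish are unfinished and will be addressed in the next version.

Repository:
6. Adds the thinking_summary integration guide, specifying the SSE protocol, the timeline restore rules, and the usage boundaries of short/detail summary.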
2026-04-29 01:00:38 +08:00
parent d89e2830a9
commit f81f137791
21 changed files with 8566 additions and 229 deletions
--- a/docs/frontend/newagent_thinking_summary_对接说明.md
+++ b/docs/frontend/newagent_thinking_summary_对接说明.md
@@ -0,0 +1,389 @@
+# NewAgent Thinking Summary Frontend Integration Guide
+
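+## Background
+
+The backend no longer passes the model's raw `reasoning_content` through to the frontend. The new display entry point is the SSE event whose top-level `extra.kind` is `"thinking_summary"`.
+
+Target experience:
+
+- While the user waits for the model to think deeply, the frontend receives a short summary every few seconds as a lightweight hint of the current thinking state.
+- When expanded, the UI shows the slightly longer `detail_summary`; multiple entries are appended in chronological order.
+- Once the model starts emitting body text, the current thinking summary stops updating.
+- After a conversation refresh, only long summaries are restored; short summaries are not.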
+
+## Live SSE protocol
+
+The chat endpoint is unchanged:
+
+```http
+POST /api/v1/agent/chat
+Content-Type: application/json
+Accept: text/event-stream
+```
+
+Each SSE business packet still uses the standard format:
+
+```text
+data: {json}
+
+data: [DONE]
+```
+
+The backend keep-alive heartbeat is an SSE comment line:
+
+```text
+: ping
+```
+
+The frontend can keep its existing logic and simply ignore any chunk that `JSON.parse` cannot parse.
+
+## thinking_summary event
+
+A live thinking summary event carries neither `delta.content` nor `delta.reasoning_content`. The frontend should read it from the top-level `extra.thinking_summary`.
+
+Example:
+
+```json
+{
+  "id": "trace-id",
+  "object": "chat.completion.chunk",
+  "created": 1777399000,
+  "model": "pro",
+  "extra": {
+    "kind": "thinking_summary",
+    "block_id": "plan.speak",
+    "stage": "plan",
+    "display_mode": "append",
+    "thinking_summary": {
+      "summary_seq": 1,
+      "short_summary": "Outlining the plan",
+      "detail_summary": "Breaking the user's goal into actionable steps and checking whether additional constraints are needed.",
+      "duration_seconds": 3.214
+    }
+  }
+}
+```
+
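+Field reference:
+
+| Field | Description |
+| --- | --- |
+| `extra.kind` | Always `thinking_summary`. |
+| `extra.block_id` | Display block the summary belongs to, e.g. `plan.speak`, `execute.speak`, `fallback.speak`. Recommended as part of the grouping key. |
+| `extra.stage` | Current node stage, e.g. `plan`, `execute`, `fallback`. |
+| `extra.display_mode` | Currently always `append`, meaning long summaries are appended entry by entry. |
+| `thinking_summary.summary_seq` | Increments within a single summarizer; used to drop duplicate or out-of-order summaries. Do not treat it as the global timeline seq. |
+| `thinking_summary.short_summary` | Live short summary, used only for the current streaming display; not persisted. |
+| `thinking_summary.detail_summary` | Long summary for the expanded view, appended under append semantics; this is the only field restored after a refresh. |
+| `thinking_summary.duration_seconds` | Seconds elapsed from the first reasoning received to the generation of this summary; may be fractional. |
+| `thinking_summary.final` | Optional. If `true`, the summarizer closed naturally without being interrupted by body text. Do not rely on it always appearing. |
+
+Removed fields:
+
+- `state` has been removed from the protocol, the prompt, and timeline persistence; the frontend should no longer depend on or display it.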
+
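+## Frontend handling recommendations
+
+Treat the thinking summary as a sub-structure inside the assistant message rather than as ordinary body text.
+
+Recommended key: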
+
+```ts
+const key = extra.block_id || extra.stage || 'thinking'
+```
+
+Recommended types:
+
+```ts
+export interface ThinkingSummaryPayload {
+  summary_seq?: number
+  short_summary?: string
+  detail_summary?: string
+  final?: boolean
+  duration_seconds?: number
+}
+
+export interface ThinkingSummaryBlock {
+  key: string
+  stage?: string
+  blockId?: string
+  latestSeq: number
+  latestShort: string
+  details: Array<{
+    seq: number
+    text: string
+    durationSeconds?: number
+    final?: boolean
+  }>
+  active: boolean
+  collapsed: boolean
+}
+```
+
+Live handling pseudocode:
+
+```ts
+function handleThinkingSummary(extra: StreamExtra, message: AssistantMessage) {
+  if (extra.kind !== 'thinking_summary') return false
+
+  const summary = extra.thinking_summary
+  if (!summary) return true
+
+  const key = extra.block_id || extra.stage || 'thinking'
+  const block = ensureThinkingSummaryBlock(message, key, {
+    stage: extra.stage,
+    blockId: extra.block_id,
+  })
+
+  const seq = summary.summary_seq ?? block.latestSeq + 1
+  if (seq <= block.latestSeq) return true
+
+  block.latestSeq = seq
+  block.active = summary.final !== true
+
+  if (summary.short_summary?.trim()) {
+    block.latestShort = summary.short_summary.trim()
+  }
+
+  if (summary.detail_summary?.trim()) {
+    block.details.push({
+      seq,
+      text: summary.detail_summary.trim(),
+      durationSeconds: summary.duration_seconds,
+      final: summary.final,
+    })
+  }
+
+  return true
+}
+```
+
+Handling when body text starts:
+
+```ts
+function handleAssistantContentStart(message: AssistantMessage) {
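+  // The backend stops summarizing the current block as soon as body text appears;
+  // the frontend can also close out active thinking blocks here so the animation stops flickering.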
+  message.thinkingSummaryBlocks?.forEach(block => {
+    block.active = false
+  })
+}
+```
+
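+Notes:
+
+- On `thinking_summary`, do not append to `assistantMessage.content`.
+- On `thinking_summary`, do not write to the legacy `assistantMessage.reasoning`.
+- If the legacy `delta.reasoning_content` still arrives, compatibility handling may stay, but the new UI should prefer `thinking_summary`.
+- Deduplicate `summary_seq` only within the same `block_id/stage`; do not compare across blocks.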
+
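+## Display semantics
+
+Short summary:
+
+- Show the latest `short_summary`.
+- Suited to a collapsed-state title, a pill, or a spot next to a loading bar.
+- Do not persist it to local history, and do not force it back in after a refresh restore.
+
+Long summary:
+
+- Append one entry each time a non-empty `detail_summary` arrives.
+- The expanded view shows the `details` list.
+- For a look closer to Gemini or Doubao (豆包), show only the latest short summary when collapsed and the chronological list of long summaries when expanded.
+
+Close-out conditions:
+
+- First `delta.content` chunk received: close the active thinking state in the current assistant message.
+- `finish_reason` or `[DONE]` received: close all active thinking states.
+- `thinking_summary.final === true` received: the corresponding block may be closed, but do not rely on it always appearing.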
+
+## Restoring history from the timeline
+
+On conversation refresh, read:
+
+```http
+GET /api/v1/agent/conversation-timeline?conversation_id={conversation_id}
+```
+
+The unified response envelope is unchanged:
+
+```json
+{
+  "status": "0",
+  "info": "success",
+  "data": []
+}
+```
+
+Example `thinking_summary` timeline item:
+
+```json
+{
+  "id": 123,
+  "seq": 8,
+  "kind": "thinking_summary",
+  "content": "Breaking the user's goal into actionable steps and checking whether additional constraints are needed.",
+  "payload": {
+    "stage": "plan",
+    "block_id": "plan.speak",
+    "display_mode": "append",
+    "summary_seq": 1,
+    "detail_summary": "Breaking the user's goal into actionable steps and checking whether additional constraints are needed.",
+    "duration_seconds": 3.214
+  },
+  "created_at": "2026-04-28T21:00:00+08:00"
+}
+```
+
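+History restore rules:
+
+- Restore only `detail_summary`; there is no `short_summary`.
+- Rendering in timeline item `seq` order is sufficient; the backend already returns items in ascending order.
+- `payload.block_id || payload.stage || "thinking"` can be used to group items near the corresponding assistant message.
+- If the frontend does not yet group across events, render each item as a "thinking summary entry" inside the assistant message, inserted in timeline order.
+
+Suggested update to the existing frontend types: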
+
+```ts
+export interface TimelineThinkingSummaryPayload {
+  stage?: string
+  block_id?: string
+  display_mode?: 'append'
+  summary_seq?: number
+  detail_summary?: string
+  duration_seconds?: number
+  final?: boolean
+}
+
+export interface TimelineEvent {
+  id: number
+  seq: number
+  kind:
+    | 'user_text'
+    | 'assistant_text'
+    | 'tool_call'
+    | 'tool_result'
+    | 'confirm_request'
+    | 'schedule_completed'
+    | 'business_card'
+    | 'thinking_summary'
+  role?: 'user' | 'assistant'
+  content?: string
+  payload?: {
+    stage?: string
+    block_id?: string
+    display_mode?: 'append' | 'replace' | 'card'
+    thinking_summary?: never
+    detail_summary?: string
+    summary_seq?: number
+    duration_seconds?: number
+    final?: boolean
+    tool?: TimelineToolPayload
+    confirm?: TimelineConfirmPayload
+    business_card?: TimelineBusinessCardPayload
+  }
+  tokens_consumed?: number
+  created_at?: string
+}
+```
+
+## Relationship to body text and tool cards
+
+A single streaming turn may contain:
+
+1. `thinking_summary`
+2. `tool_call` / `tool_result`
+3. `assistant_text` 或 `delta.content`
+4. `finish`
+5. `[DONE]`
+
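+Frontend recommendations:
+
+- `thinking_summary` is the "waiting process" component.
+- `tool_call` / `tool_result` keep using the existing tool cards.
+- `delta.content` keeps appending to the assistant body text.
+- `finish` / `[DONE]` only handle wrap-up and need not produce a visible message.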
+
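+## Test cases
+
+### 1. Summary only, no body text yet
+
+Input event:
+
+```json
+{
+  "extra": {
+    "kind": "thinking_summary",
+    "block_id": "plan.speak",
+    "stage": "plan",
+    "display_mode": "append",
+    "thinking_summary": {
+      "summary_seq": 1,
+      "short_summary": "Understanding the request",
+      "detail_summary": "Identifying the user's goal, constraints, and any information that still needs to be supplied.",
+      "duration_seconds": 2.1
+    }
+  }
+}
+```
+
+Expected:
+
+- The collapsed state shows "Understanding the request".
+- The expanded state gains one detail entry.
+- No text is added to the body area.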
+
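+### 2. Multiple summaries appended
+
+Input `summary_seq=1,2,3`.
+
+Expected:
+
+- `latestShort` uses the short summary from entry 3.
+- `details` has 3 entries, shown in arrival order or in ascending seq.
+
+### 3. Out-of-order or duplicate summaries
+
+After processing up to `summary_seq=3`, `summary_seq=2` arrives.
+
+Expected:
+
+- Ignore the stale event: do not roll back the short summary and do not append a detail.
+
+### 4. Body text starts
+
+Received:
+
+```json
+{
+  "choices": [
+    {
+      "delta": { "content": "I've sorted it out. Here are my suggestions:" }
+    }
+  ]
+}
+```
+
+Expected:
+
+- The currently active thinking block stops its loading animation.
+- Body text is appended normally.
+- If a summary for the same block still arrives unexpectedly afterwards, it can be processed by seq, but the UI should not re-activate the block.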
+
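+### 5. History restore
+
+The timeline returns `kind=thinking_summary`.
+
+Expected:
+
+- Show only `payload.detail_summary || content`.
+- Do not show a short-summary placeholder.
+- There is no need to display `state`; the protocol no longer has this field.
+
+## Minimal change checklist
+
+1. Add a `thinking_summary` field to `StreamEventPayload.extra`.
+2. Add `thinking_summary` to `TimelineEvent.kind`.
+3. In the SSE parser, add an `extra.kind === "thinking_summary"` branch to `handleStreamExtraEvent`.
+4. On body `delta.content`, mark the current thinking summary block inactive.
+5. When restoring the history timeline, support `kind === "thinking_summary"` and restore only long summaries.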
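
The SSE framing and history restore rules in the guide above can be sketched as two small helpers. This is a minimal sketch, not the project's actual frontend code: `parseSseLine`, `restoreDetails`, `SseFrame`, and `TimelineItem` are hypothetical names introduced here, and the sketch assumes the caller has already split the stream into individual lines. Only the field names (`extra.kind`, `block_id`, `stage`, `detail_summary`, `seq`, `content`) come from the guide.

```typescript
// Hypothetical shapes for illustration; only the field names come from the guide.
interface ThinkingSummary {
  summary_seq?: number;
  short_summary?: string;
  detail_summary?: string;
  duration_seconds?: number;
  final?: boolean;
}

type SseFrame =
  | { type: "done" }
  | { type: "summary"; key: string; summary: ThinkingSummary }
  | { type: "other"; payload: unknown };

// Parse one SSE line. Returns null for blank lines, comment lines such as
// ": ping", non-data fields, and payloads that JSON.parse rejects.
function parseSseLine(line: string): SseFrame | null {
  if (!line.startsWith("data:")) return null;
  const body = line.slice("data:".length).trim();
  if (body === "[DONE]") return { type: "done" };
  let payload: any;
  try {
    payload = JSON.parse(body);
  } catch {
    return null; // unparseable chunk: ignore, as the guide suggests
  }
  const extra = payload?.extra;
  if (extra?.kind === "thinking_summary" && extra.thinking_summary) {
    // Same grouping key the guide recommends.
    const key: string = extra.block_id || extra.stage || "thinking";
    return { type: "summary", key, summary: extra.thinking_summary };
  }
  return { type: "other", payload };
}

// Hypothetical subset of a timeline item, limited to what restore needs.
interface TimelineItem {
  seq: number;
  kind: string;
  content?: string;
  payload?: { stage?: string; block_id?: string; detail_summary?: string };
}

// Rebuild per-block detail lists from timeline items.
// Short summaries are never restored because they are not persisted.
function restoreDetails(items: TimelineItem[]): Map<string, string[]> {
  const groups = new Map<string, string[]>();
  // Defensive sort; the guide says the backend already returns ascending seq.
  const sorted = [...items].sort((a, b) => a.seq - b.seq);
  for (const item of sorted) {
    if (item.kind !== "thinking_summary") continue;
    const text = (item.payload?.detail_summary || item.content || "").trim();
    if (!text) continue;
    const key = item.payload?.block_id || item.payload?.stage || "thinking";
    const list = groups.get(key) ?? [];
    list.push(text);
    groups.set(key, list);
  }
  return groups;
}
```

A `summary` frame would then be handed to something like the guide's `handleThinkingSummary`, while `done` closes all active thinking blocks. Keeping the parser free of UI state is a design choice of this sketch, so that live streaming and history restore can share the same grouping key logic.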