Bug Fix·Pushed May 1, 2026
Structured output now works with reasoning models
Gitpulse's AI summarizer now handles reasoning models like DeepSeek-R1 and MiniMax, which prefix responses with <think> thinking traces. A robust JSON extractor strips these blocks automatically, enabling structured output across more model types.
Some AI models think out loud before answering. Models like DeepSeek-R1 and MiniMax prefix their responses with <think> blocks containing their reasoning process, and those blocks broke gitpulse's structured output pipeline.
The previous implementation relied on LangChain's withStructuredOutput, which uses tool-calling under the hood. When the parser received a response starting with raw thinking text instead of a clean tool-call, it failed.
A custom JSON extractor now handles the heavy lifting. It strips <think> blocks, removes markdown code fences, and narrows to the outermost {} even when prose wraps the JSON. The same Zod schema validation happens afterward, so output integrity is preserved.
The summarizer now works across reasoning and non-reasoning models — no configuration changes required.
Technical description
## Changes to [[code ref=1]]action/src/llm.ts[[/code]]
The core issue: reasoning models (DeepSeek-R1, MiniMax M2.x) prefix responses with [[code]]<think>[[/code]] thinking traces. The previous approach passed the [[code ref=2]]StorySchema[[/code]] Zod schema to LangChain's [[code]]withStructuredOutput[[/code]], which relies on tool-calling. When these models emit raw text before any structured response, the parser fails.
### The Fix
Replaced tool-calling with a custom extraction pipeline:
````typescript
// action/src/llm.ts

// Before: structured output via tool-calling
const structured = llm.withStructuredOutput(StorySchema, { name: 'gitpulse_story' });
const result = await structured.invoke([]); // prompt messages omitted in this excerpt

// After: plain chat + manual extraction
const response = await llm.invoke([]); // prompt messages omitted in this excerpt
const raw = typeof response.content === 'string' ? response.content : extractText(response.content);
const json = extractJson(raw);
const parsed = JSON.parse(json);
return StorySchema.parse(parsed);
````
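The diff above calls [[code]]extractText[[/code]] when the message content is an array of parts rather than a plain string. A minimal sketch of what such a helper can look like — the name comes from the diff, but the body below is an assumption, not the shipped code; it assumes LangChain-style content parts of the shape [[code]]{ type: 'text', text: '...' }[[/code]]:

````typescript
// Sketch only: concatenates the text-bearing parts of a mixed content array
// and ignores non-text parts (e.g. image parts). The part shape is assumed.
type ContentPart = string | { type?: string; text?: string };

function extractText(content: ContentPart[]): string {
  return content
    .map((part) => {
      if (typeof part === 'string') return part;
      // Keep only parts that actually carry text.
      return typeof part.text === 'string' ? part.text : '';
    })
    .join('');
}
````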
### JSON Extraction
The [[code ref=3]]extractJson[[/code]] function handles three cases in sequence:
1. **Strip [[code]]<think>[[/code]] blocks**: Uses the regex [[code]]/<think>[\s\S]*?<\/think>/gi[[/code]] to remove thinking traces
2. **Extract from code fences**: Matches fenced [[code]]json[[/code]] code blocks and extracts their content
3. **Narrow to JSON**: Finds the outermost [[code]]{}[[/code]] using [[code]]indexOf[[/code]]/[[code]]lastIndexOf[[/code]]
This approach is resilient — it works whether the model outputs clean JSON, JSON in fences, or JSON wrapped in explanatory prose.
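The three steps above can be sketched as follows. This is a reconstruction of the described behavior, not necessarily the exact gitpulse implementation:

````typescript
// Sketch of the three-step extraction: strip <think> blocks, unwrap a fenced
// code block if present, then narrow to the outermost JSON object.
function extractJson(raw: string): string {
  // 1. Remove reasoning traces emitted by models like DeepSeek-R1.
  let text = raw.replace(/<think>[\s\S]*?<\/think>/gi, '');

  // 2. If the JSON is wrapped in a ```json fence, take the fence body.
  const fence = text.match(/```(?:json)?\s*([\s\S]*?)```/i);
  if (fence) text = fence[1];

  // 3. Narrow to the outermost {...}, dropping any surrounding prose.
  const start = text.indexOf('{');
  const end = text.lastIndexOf('}');
  if (start === -1 || end === -1 || end < start) {
    throw new Error('No JSON object found in model output');
  }
  return text.slice(start, end + 1);
}
````

Note that step 3 is deliberately naive: it assumes a single top-level object, which holds here because the prompt asks for exactly one JSON document.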
### Schema Update
The Zod schema was simplified to remove inline descriptions (now in the system prompt instead), but validation rules remain identical.
### Files at a Glance
- [[code]]action/src/llm.ts[[/code]] — LLM integration with reasoning model support
Categories
- Bug Fix (65%) — Fixes incompatibility with reasoning models (MiniMax, DeepSeek-R1) that prefix responses with <think> blocks, breaking the previous tool-calling approach
- Refactoring (35%) — Changed the structured output approach from LangChain's withStructuredOutput (tool-calling) to manual JSON extraction with custom parsers