If you open a Claude Code session file for the first time, it is tempting to think you are looking at a chat log. You are not. A Claude Code transcript is closer to an event stream that happens to describe a conversation.
That difference matters if you want to analyze sessions correctly, calculate usage, rebuild conversation state, or understand where costs came from.
## One session, many record types
Claude Code stores each session as a JSONL file where every line is a self-contained record with a type.
Across the reverse-engineered build, there are 22 distinct record types, including:
- `user`
- `assistant`
- `system`
- `last-prompt`
- `custom-title`
- `summary`
- `mode`
- `file-history-snapshot`
- `marble-origami-commit`
- `marble-origami-snapshot`
This is the first place many parsers go wrong. If you assume every line is "a message," your analysis is already off. Some lines are conversation content. Some are metadata. Some are operational telemetry. Some exist purely to support resume, undo, or context compaction.
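A first sanity check for any parser is simply counting what record types a session actually contains. A minimal sketch (assuming only that each JSONL line is a JSON object with a top-level `type` field, as shown in the record example below):

```python
import json
from collections import Counter
from pathlib import Path

def record_type_histogram(session_file: str) -> Counter:
    """Count how many lines of each record type a session file contains."""
    counts = Counter()
    for line in Path(session_file).read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue
        record = json.loads(line)
        # Every transcript line carries a top-level "type" discriminator.
        counts[record.get("type", "unknown")] += 1
    return counts
```

Running this against a real session quickly shows how far the file is from "a list of messages": `user` and `assistant` lines sit next to `system`, snapshot, and summary records.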
## What one raw record actually looks like
At the storage layer, a transcript entry is not abstract. It is a concrete JSON object with enough fields to rebuild both conversation flow and runtime state.
```json
{
  "type": "user",
  "uuid": "13250cfe-9b8b-40ec-815f-5e342f4a2f3f",
  "parentUuid": "d99fa40e-0553-4471-817d-61914f1b77b1",
  "timestamp": "2026-03-27T10:07:35.130Z",
  "sessionId": "bb3a7def-9f72-4e9e-86a8-92cf2808b01b",
  "cwd": "/Users/martin/Documents/PagespeedBulk",
  "permissionMode": "bypassPermissions",
  "message": {
    "role": "user",
    "content": "the desktop version fetching is not scheduled regularly"
  }
}
```
Note how identity (`uuid`), ancestry (`parentUuid`), and runtime context (`cwd`, `permissionMode`) travel alongside the message itself. That is why transcript parsing is more than string processing: you are reconstructing causal state.
If you are coming from the storage side first, How Claude Code Stores Transcripts Without a Database is the better starting point before this parsing layer.
## Assistant turns are split across multiple records
A single Claude response is not stored as one blob. Instead, Claude Code persists one record per content block as the response streams in. A typical turn might look like:
- Assistant thinking block
- Assistant text block
- Assistant tool call block
- User tool result block
- Assistant follow-up text
- System turn duration record
That means line counts are a bad proxy for turns. To reconstruct actual conversational turns, watch the `stop_reason` markers: `"end_turn"` closes a turn, while `"tool_use"` means the response paused for a tool result and the turn is still open.
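Turn assembly can be sketched in a few lines. This assumes the API `stop_reason` is exposed under `message.stop_reason` on assistant records (a plausible but unconfirmed field placement for illustration):

```python
def group_into_turns(records):
    """Group streamed content-block records into user-visible turns.

    A turn closes when an assistant record reports stop_reason "end_turn".
    "tool_use" keeps the turn open, because a tool result record and an
    assistant follow-up are still coming.
    """
    turns, current = [], []
    for rec in records:
        current.append(rec)
        if rec.get("type") != "assistant":
            continue
        # Field placement is an assumption; adapt to the real record shape.
        if rec.get("message", {}).get("stop_reason") == "end_turn":
            turns.append(current)
            current = []
    if current:  # trailing, possibly still-streaming turn
        turns.append(current)
    return turns
```

The key property: a tool call, its result, and the follow-up text all land in one turn, so token and cost totals aggregate the way a user actually experienced them.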
This is exactly why Vibenalytics ended up with two ingestion modes instead of pretending one was enough. Hooks are great for low-latency metadata capture, but transcripts are where the raw response structure lives. Once you care about accurate token accounting, prompt assembly, and the difference between one assistant turn and several streamed content blocks, transcript parsing stops being optional.
## The transcript is a tree, not a list
This is the most important structural detail in the whole format. Messages are linked by `parentUuid`.
That means the transcript is not just ordered by time. It is ordered by ancestry. The shape is a parent-linked tree that lets Claude Code represent:
- normal turn flow
- tool use chains
- branch points
- subagent sidechains
- compaction boundaries
In practice, this lets you walk backwards from a record to reconstruct the chain that produced it. If you ignore `parentUuid` and only trust file order, your parser will work until it does not.
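The backwards walk described above is short to implement. A sketch, assuming only the `uuid` and `parentUuid` fields shown in the raw record example:

```python
def ancestor_chain(records, start_uuid):
    """Walk parentUuid links backwards from a record to the session root,
    returning the chain in root-first order."""
    by_uuid = {r["uuid"]: r for r in records if "uuid" in r}
    chain = []
    current = by_uuid.get(start_uuid)
    while current is not None:
        chain.append(current)
        parent = current.get("parentUuid")
        # A null/missing parent (or a parent compacted away) ends the walk.
        current = by_uuid.get(parent) if parent else None
    return list(reversed(chain))
```

Because multiple records can share a parent, the same index also lets you detect branch points: any `uuid` that appears as `parentUuid` more than once is where a sidechain or retry forked off.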
## Why the tree structure exists
There are three strong reasons for this design.
### 1. Tool flow needs causality
A tool result is not just "another user message." It is the result of a specific assistant tool request. Claude Code preserves that relationship both through the parent chain and through additional linking fields like `sourceToolAssistantUUID`, which makes tool-level reconstruction much more reliable.
### 2. Subagents need branch points
When Claude Code spawns a subagent, the conversation branches logically from a tool invocation in the main session. That is hard to model cleanly in a flat list. It is natural in a tree.
### 3. Context compaction breaks simple chronology
Once context compaction enters the picture, the model-visible conversation and the full UI-visible history are no longer identical. The transcript needs a way to represent both. That is where the tree starts doing real work.
## Thinking blocks are present, but not readable
Claude Code stores thinking records, but the actual thinking text is redacted. What remains is an empty `thinking` field plus a cryptographic signature.
This is a subtle but important design choice. The transcript proves that a thinking block existed and preserves enough integrity metadata to verify it later, but the internal reasoning itself does not land on disk in plaintext.
For transcript analysts, the practical implication is simple:
- you can count thinking blocks
- you can place them in sequence
- you cannot inspect their content
Any analytics product pretending otherwise is making something up.
## System messages carry more value than most people realize
System records are not just noise. They contain some of the most useful operational data in the transcript.
Examples include:
- API errors and retry metadata
- stop hook summaries
- turn duration
- compaction boundaries
This means Claude Code transcripts do not just tell you what the agent said. They also tell you what the runtime experienced. That is the difference between a chat export and an observability source.
## A useful parser mental model
If you are designing a parser, think in layers:
| Layer | Question it answers |
|---|---|
| Raw record stream | What lines exist in the session file? |
| Record typing | Which lines are message content vs metadata? |
| Parent reconstruction | Which records belong to the same conversational branch? |
| Turn assembly | Which content blocks together form one user-visible turn? |
| Session analytics | What happened, how much did it cost, and where? |
This is why simplistic line-based parsers keep drifting into wrong conclusions. They skip the middle layers and jump straight to totals.
That sounds theoretical until you build something on top of it. In Vibenalytics, some of the most important product decisions came from discovering that "just parse the files" was not enough. We needed:
- hooks for real-time ingestion
- transcript cursors for resumable reads
- prompt and request-level normalization
- explicit handling for compaction and subagents
Without those layers, the output looks analytical but is not trustworthy.
That becomes especially obvious once subagents enter the picture. For the storage and sync consequences of that, see How Claude Code Stores Subagents and Large Tool Results.
## What a correct parser needs to do
If you want reliable analysis, your parser should follow a few rules.
### Parse line by line
JSONL is append-oriented. Treat each line as an independent record.
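Because the file is append-oriented, the last line may be mid-write when a session is still active. A tolerant reader skips the truncated tail instead of failing the whole session:

```python
import json

def iter_records(path):
    """Yield one parsed record per JSONL line, tolerating a partially
    written final line."""
    with open(path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            try:
                yield json.loads(line)
            except json.JSONDecodeError:
                # A truncated trailing line is expected while Claude Code
                # is still appending; skip it rather than aborting.
                continue
```

The same loop doubles as a resumable cursor if you remember the byte offset you last consumed.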
### Reconstruct turns explicitly
Do not assume one line equals one turn. Assistant responses are chunked. Tool calls split the flow. System records sit between logical conversation steps.
### Follow parent links
Use `parentUuid` to rebuild chains. This is especially important when tool calls, compaction, or sidechains are involved.
### Keep metadata records
Do not throw away non-message lines too early. Titles, summaries, mode changes, undo snapshots, and compaction markers all affect interpretation.
### Handle "last write wins" metadata
Session metadata is appended, not updated in place. That means a session can have multiple title or summary records. The most recent one is authoritative.
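Last-write-wins resolution is a one-pass scan: keep the most recent record of each metadata type seen in file order. A minimal sketch:

```python
def latest_metadata(records, record_type):
    """Return the most recent record of a metadata type (e.g. "summary").

    Metadata is appended, never updated in place, so the last occurrence
    in file order is authoritative.
    """
    latest = None
    for rec in records:
        if rec.get("type") == record_type:
            latest = rec
    return latest
```

The same pattern applies to titles, mode changes, and any other record type where only the newest value should drive display or analytics.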
One concrete example of why this matters is historical import. The transcript mode in Vibenalytics exists partly because users install the plugin after they already have Claude Code history on disk. That means the parser cannot just handle the "happy path" of current sessions. It has to tolerate partially written state, old files, and whatever Claude Code left behind before Vibenalytics was present.
That is a very different engineering problem from "render a transcript viewer."
## The 30-day problem changes how you should think about parsing
Even if you parse the format perfectly, local transcripts still have a retention limit. Claude Code's local session storage is typically cleaned up after about 30 days, so the transcript is rich but temporary.
That creates an important asymmetry:
- the raw data is detailed enough for serious analysis
- the local history is short-lived enough to erase longer-term patterns
This is why transcript parsing alone is not the same thing as observability. A parser can tell you what happened inside a recent session. A history layer can tell you what has been happening across projects, machines, and months.
That distinction matters even more once compaction boundaries start changing what the model could actually see. For that, see marble-origami: How Claude Code Handles Context Compaction.

## Want this without building your own parser?
Vibenalytics turns recent Claude Code events into project-level analytics, long-term history, and session drill-downs so you do not have to keep rebuilding transcript logic from scratch.