
How Claude Code Stores Transcripts Without a Database

Claude Code stores transcripts in flat JSONL files instead of a database. Here is how the storage architecture works, why it fits agent workflows, and where the limits show up.

Martin Vančo

Most developer tools reach for SQLite the moment they need persistence. Claude Code does not. After reverse-engineering Claude Code v2.1.85 and inspecting a live installation, I found the storage model to be much simpler: flat files, mostly JSONL, arranged under ~/.claude/.

That sounds almost too simple for an agent that can spawn subagents, stream tool calls, track session state, support undo, and resume previous work. But the choice is more deliberate than it looks.

The big idea

Claude Code stores sessions as append-only transcript files instead of rows in a database. At a high level, the filesystem looks like this:

~/.claude/
├── history.jsonl
├── stats-cache.json
├── projects/
│   └── {encoded-project-path}/
│       ├── {session-id}.jsonl
│       ├── {session-id}/
│       │   ├── subagents/
│       │   └── tool-results/
│       └── memory/
├── sessions/
├── file-history/
├── shell-snapshots/
├── tasks/
└── plans/

One project directory. One transcript file per session. Extra folders hold large tool output, subagent transcripts, undo snapshots, and auxiliary state. No migrations, no connection pools, no WAL files, and no database corruption story.

Why JSONL fits this problem

Claude Code is not storing relational business data. It is storing an event stream, and that matters.

Each user message, assistant block, tool result, hook summary, title update, and system event gets appended as one JSON object per line. That makes JSONL a natural fit for the workload:

  • New events are mostly append-only.
  • Sessions are naturally isolated.
  • Files are easy to inspect with cat, rg, or jq.
  • Crash recovery is simpler than reconstructing a partially committed relational state.

For an agent transcript, this is usually the right trade. The most important design decision is not "JSONL vs SQLite." It is that Claude Code treats the transcript as a log first.
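To make the log-first idea concrete, here is what handling a single transcript line looks like. The sample record below is hypothetical; its field names (type, uuid, parentUuid, sessionId, timestamp) appear in real transcripts, but the exact shape varies by record type:

```typescript
// A hypothetical transcript line. Real records carry more fields, and the
// shape differs between user, assistant, and system events.
const line =
  '{"type":"user","uuid":"a1b2","parentUuid":null,"sessionId":"s-123",' +
  '"timestamp":"2026-01-01T12:00:00Z","message":{"role":"user","content":"hello"}}'

// Because each line is an independent JSON document, reading one event is just:
const record = JSON.parse(line)
console.log(record.type) // → "user"
```

No schema, no query planner: the parse unit is a line, and the transaction unit is an append.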

The storage model is optimized around sessions

The sharding strategy is almost embarrassingly practical. Claude Code groups sessions by the working directory where they started. The project path is encoded into a filesystem-safe folder name, and each session gets its own UUID-named JSONL file.

That immediately solves a lot of problems:

  • Minimal cross-session contention
  • Trivial cleanup
  • Natural project-level grouping
  • Easy session export and analysis
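The path encoding can be sketched roughly like this. This is an assumption about the scheme, not the verbatim implementation; the real encoder may preserve a different character set:

```typescript
// Hypothetical sketch of the project-path encoding: any character outside a
// filesystem-safe set (here, [A-Za-z0-9-]) becomes a dash. The real Claude
// Code encoder may differ in exactly which characters it keeps.
function encodeProjectPath(projectPath: string): string {
  return projectPath.replace(/[^A-Za-z0-9-]/g, '-')
}

// "/Users/martin/my-app" → "-Users-martin-my-app"
```

Note that an encoding like this is lossy: two distinct paths can collide, and decoding back to the original path is ambiguous, which matters later for project attribution.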

It also explains why session-level tooling built on top of Claude Code can work well with simple file scanning.

What we learned building Vibenalytics on top of this

The local storage model looks simple when you first inspect it. In practice, building a reliable analytics product on top of it is where things get interesting.

Vibenalytics now supports two collection paths:

  • hooks, for real-time capture through Claude Code's event system
  • transcripts, for historical import and exact raw API metadata

That dual approach was not theoretical. It came from real limitations we hit.

Hooks are excellent when you want events immediately and want to sync before Claude Code's local history rotates away. Transcripts are better when you need exact token counts, turn metadata, and older sessions that happened before the plugin was installed.

If you only use hooks, you can miss details that only exist in the transcript stream. If you only use transcripts, you inherit all the quirks of a storage format that was designed for Claude Code's own runtime, not for external analytics.

That is why the right answer for us ended up being both.

The practical difference from a database-first design

This architecture makes different tradeoffs than a small embedded database would. Neither approach is universally correct, but for an append-heavy transcript log, Claude Code's choices are unusually coherent.

| Concern | Flat JSONL files | SQLite-style storage |
| --- | --- | --- |
| Write pattern | Excellent for append-only event streams | Strong for mixed reads/writes |
| Human inspectability | Very high | Low without tooling |
| Per-session isolation | Natural | Requires schema and filtering |
| Migrations | Usually unnecessary | Eventually unavoidable |
| Concurrency handling | Simpler with one file per session | Centralized but more coupled |
| Ad hoc shell analysis | Easy with rg, jq, tail | Awkward |

That table is the core reason the design works. Claude Code is not trying to be a general-purpose datastore. It is trying to preserve a structured local event log.

If you want the next layer down from storage layout into how those records are actually reconstructed, see Inside Claude Code Transcripts: Record Types, Trees, and Turn Reconstruction.

Writes are cheap because they are batched

The storage engine does not flush every event immediately. It uses a small async batching window, around 100ms, to coalesce many transcript writes into fewer append operations. That matters in real usage.

A single agent turn can produce:

  • a thinking block
  • a text block
  • one or more tool calls
  • one or more tool results
  • system timing metadata

Without batching, that becomes a burst of tiny writes. With batching, it turns into one or two append operations per file. Same transcript, much less I/O noise.

This is one of those details that tells you the authors were optimizing for long-running, tool-heavy sessions rather than toy demos.

Here is the kind of logic this implies at a high level:

import { appendFile } from 'node:fs/promises'

type TranscriptEntry = { type: string; uuid: string }

const writeQueue = new Map<string, TranscriptEntry[]>()
let drainTimer: ReturnType<typeof setTimeout> | null = null

function enqueueWrite(filePath: string, entry: TranscriptEntry) {
  const pending = writeQueue.get(filePath) ?? []
  pending.push(entry)
  writeQueue.set(filePath, pending)
  scheduleDrain()
}

// Coalesce all writes that arrive within a ~100ms window into one drain pass.
function scheduleDrain() {
  if (drainTimer) return
  drainTimer = setTimeout(() => {
    drainTimer = null
    void drainWriteQueue()
  }, 100)
}

async function drainWriteQueue() {
  for (const [filePath, entries] of writeQueue) {
    const payload = entries.map((entry) => JSON.stringify(entry)).join('\n') + '\n'
    await appendFile(filePath, payload)
  }
  writeQueue.clear()
}

This is obviously simplified, but it captures the architectural point: coalesce first, append second.

On the Vibenalytics side, we also had to deal with a more awkward timing reality: the Claude Code Stop hook can fire before the final assistant message is fully flushed to disk. The practical fix in our CLI is not glamorous. After boundary events like Stop and SessionEnd, we sleep for two seconds before parsing the transcript.

// Wait for transcript writes to flush before parsing.
// Stop fires before the final assistant message is fully written (~500ms race).
std::thread::sleep(std::time::Duration::from_secs(2));

That is the kind of decision you only make after discovering the race in practice. It is not academically elegant. It is operationally honest.

The concurrency lesson showed up again on our side. One of the nastier production bugs in Vibenalytics was not about Claude Code's transcript writer at all. It was about our own metrics.jsonl.

Under heavy concurrent use, multiple hook-triggered processes could try to read and truncate the same local metrics file. That led to partial reads and occasional corruption risk. The fix was an atomic staging pattern:

metrics.jsonl
  -> rename atomically to metrics.staging.jsonl
  -> sync from staging
  -> archive on success
  -> prepend back on failure

That is a very specific example of the bigger point: once you build on append-only local files, correctness depends on being defensive about timing and atomicity.

Claude Code only creates the transcript file when the session becomes real

Another good detail: Claude Code does not create a session JSONL file the moment the process starts. It waits until there is an actual user or assistant message, which prevents empty junk sessions from piling up when someone opens Claude Code, looks around, and quits. The filesystem reflects real work, not just process launches.

Reliability is handled with targeted mechanisms, not one universal abstraction

This is where the architecture gets more interesting. Claude Code does not use the same persistence strategy everywhere.

  • Session transcripts rely on per-file isolation and in-process queues.
  • Shared history uses locking because multiple sessions append to it.
  • Cached stats are written through a temp-file-plus-rename pattern for atomicity.
  • Shutdown paths switch to synchronous writes so the last metadata survives process exit.

That is the right level of engineering. Instead of introducing a heavier persistence layer, Claude Code applies just enough coordination to each file based on how that file is used.

We ran into the same category of problem in project attribution. Vibenalytics hashes project paths locally using FNV-1a so raw filesystem paths never leave the machine, but the hook-based path and the transcript-derived path did not initially normalize the same way. The result was a very real bug: the same project showed up as two different projects in the dashboard.

The root cause was subtle. Hooks gave us the raw cwd. Transcript discovery had to decode Claude Code's encoded directory names. Those two inputs looked equivalent until they were not, especially once non-slash special characters were involved. The fix was not "hash better." The fix was to normalize paths before hashing and tighten the transcript decoder.
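The shape of the fix can be sketched like this. The normalization rule shown is illustrative; the point is only that both input paths must pass through the same function before hashing:

```typescript
// Illustrative: normalize the hook-provided cwd and the decoded transcript
// path identically before hashing, so equivalent paths collide on purpose.
function normalizeProjectPath(p: string): string {
  return p.replace(/\/+$/, '') // example rule: strip trailing slashes
}

// Standard 64-bit FNV-1a over UTF-8 bytes.
function fnv1a64(input: string): bigint {
  let hash = 0xcbf29ce484222325n // FNV offset basis
  for (const byte of Buffer.from(input, 'utf8')) {
    hash ^= BigInt(byte)
    hash = (hash * 0x100000001b3n) & 0xffffffffffffffffn // FNV prime, mod 2^64
  }
  return hash
}

const projectId = (p: string) => fnv1a64(normalizeProjectPath(p)).toString(16)
```

With normalization in front of the hash, "/Users/x/app" and "/Users/x/app/" map to one project ID instead of two.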

That kind of bug is exactly why transcript analytics is harder than it first appears. The format is simple enough to read, but correctness lives in the edge cases.

The downside of local flat-file storage

The tradeoffs are real. Flat files are excellent for append-only local logs, but they are not a complete analytics system.

Claude Code's local session data has a default retention window of roughly 30 days. On cleanup, older session JSONL files and related artifacts are removed. That is perfectly reasonable if the product goal is local operational state.

It is not enough if your goal is longitudinal insight. If you want to answer questions like:

  • Which project consumed the most usage over the last quarter?
  • When do subagents spike most often?
  • How has tool mix changed across machines?
  • Which workflows are getting slower over time?

...then a 30-day local transcript window is not a history layer. It is a short-lived cache.

That distinction matters because many developers assume "the data exists locally" means "the history is available." In practice, no history means no insight.

If you were parsing this yourself

If you want to build a local transcript parser, the storage layout already tells you how to approach it:

find ~/.claude/projects -maxdepth 3 -name '*.jsonl'

Then process each session file line by line:

jq -c '. | {type, timestamp, sessionId, parentUuid}' session.jsonl

And only after that start layering in:

  • subagents/*.jsonl
  • tool-results/*.txt
  • session metadata records
  • retention-aware historical aggregation

The order matters. Many transcript tools start with cost estimation and skip structural correctness. That is backwards.

Vibenalytics also uses transcript cursors as more than just resumable offsets. If the stored byte offset is greater than the current file size, the system treats that as a rotation or truncation signal and resets the cursor to zero. One state machine ends up solving two problems:

  • deduplicating already-read transcript bytes
  • recovering when Claude Code rotates or recreates local files

That is the sort of detail that rarely makes it into marketing copy, but it is exactly the kind of defensive engineering transcript-based systems need.

The same theme shows up again in the retention problem. Local files are a good source of truth for recent work, but they are not a durable history layer. That is covered in more detail in Claude Code's 30-Day Memory Problem.

There is also a product-level lesson here. If you are building analytics on top of Claude Code, the transcript directory is not just raw data. It is a moving, eventually-consistent local system with:

  • async writes
  • per-session sharding
  • sidecar subagent files
  • compaction artifacts
  • cleanup pressure

That means the first job is not charting. The first job is building a trustworthy ingestion path.

What this architecture gets right

Claude Code's storage system is not flashy. That is part of why it is good. It is:

  • debuggable
  • portable
  • append-friendly
  • easy to inspect
  • robust enough for concurrent local use

For transcript persistence, that is a strong set of priorities. The more important lesson is architectural: if the data is fundamentally an event stream, storing it like an event stream often beats forcing it into a database-shaped solution too early.

What this means for analytics tooling

If you are building around Claude Code transcripts, the local storage layer is a good source of truth for recent sessions. It gives you rich raw material with minimal ceremony, but it also has a built-in ceiling:

  • data is local
  • retention is limited
  • cross-machine visibility is fragmented
  • historical analysis expires

That is exactly why observability tools emerge around agent workflows. Token counts are just the surface. The real value is context: where usage happened, which project generated it, how it changed, and what pattern it belongs to. The transcript format gives you the raw events. A real analytics layer gives you the timeline.

This is where Vibenalytics fits

Claude Code's local transcript storage is rich, but short-lived. Vibenalytics preserves the useful metadata over time so you can compare projects, sessions, and workflows after the local files have rotated out.

See the dashboard