I noticed pretty high token usage when I finished working on the Blindfold Chess Trainer app. I was curious if I had done something stupid, and whether I could track it down using the logs/cache files that Claude Code maintains in the ~/.claude directory.

There's a ton of info there, but it's not readily usable as-is. You'd need to write some nontrivial scripts to stitch it all together and make it queryable.

Fortunately, the claude-code-log tool has already done that.

This is how you run it (assumes you have uv installed). By default, the command analyzes the entire ~/.claude folder. You can also restrict it to specific projects.

uvx claude-code-log@latest 

It will crunch for a minute. There are two important output files:

- ~/.claude/projects/index.html
- ~/.claude/projects/claude-code-log-cache.db

The index.html file is a sort of stats / project navigator tool. It's quite nice for reviewing old conversations with Claude. But we are interested in the SQLite database file: claude-code-log-cache.db. It contains timestamped information about every message, including token and message counts.
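Before asking Claude to chart anything, it helps to see which tables and columns the database actually has. Here is a minimal sketch that uses only Python's built-in sqlite3 module and makes no assumptions about the schema; the function name describe_schema is just an illustrative choice.

```python
import sqlite3
from pathlib import Path

# Path taken from the output files listed above.
DB_PATH = Path.home() / ".claude" / "projects" / "claude-code-log-cache.db"


def describe_schema(conn: sqlite3.Connection) -> dict[str, list[str]]:
    """Map each table name in the database to its list of column names."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        )
    ]
    schema = {}
    for table in tables:
        # PRAGMA table_info returns one row per column; index 1 is the name.
        quoted = table.replace('"', '""')
        columns = conn.execute(f'PRAGMA table_info("{quoted}")').fetchall()
        schema[table] = [col[1] for col in columns]
    return schema


if __name__ == "__main__":
    if DB_PATH.exists():
        conn = sqlite3.connect(DB_PATH)
        for table, columns in describe_schema(conn).items():
            print(f"{table}: {', '.join(columns)}")
        conn.close()
    else:
        print(f"No database found at {DB_PATH}; run claude-code-log first.")
```

Pasting the printed schema into your prompt should save Claude a round of guessing at column names.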

Making Charts

You could fire up Claude here and begin analyzing immediately. Assuming you've got a python environment, something like this should work:

Use python and seaborn to analyze the claude-code-log-cache.db sqlite database. Make an area chart showing the number of user prompts grouped by hour. Do not group by date. Convert timestamps to EST. Use a dark theme and semi-transparent blue for the chart.

Exactly what you get back is gonna depend on your particular projects, cache history, settings, CLAUDE.md files, the phase of the moon, etc.

But it will probably look KINDA like this:

[Figure: area chart showing user prompts by hour]
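If you'd rather see what such a script boils down to, here is a rough sketch of the hourly grouping and the chart. It is not what Claude generated for me. It assumes you have already pulled the user-prompt timestamps out of the database as ISO 8601 strings (the query for that depends on the actual schema, so it is left out), it treats EST as a fixed UTC-5 offset, and it uses plain matplotlib rather than seaborn to keep dependencies small. The names prompts_by_hour and plot_hourly are made up for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta, timezone

# EST as a fixed UTC-5 offset; this deliberately ignores daylight saving time.
EST = timezone(timedelta(hours=-5), "EST")


def prompts_by_hour(timestamps: list[str]) -> list[int]:
    """Count ISO 8601 timestamps per hour of day (0-23) in EST, ignoring the date."""
    counts = Counter()
    for ts in timestamps:
        # Older Python versions reject a trailing "Z" in fromisoformat.
        parsed = datetime.fromisoformat(ts.replace("Z", "+00:00"))
        if parsed.tzinfo is None:
            # Assumption: timestamps without an offset are UTC.
            parsed = parsed.replace(tzinfo=timezone.utc)
        counts[parsed.astimezone(EST).hour] += 1
    return [counts[hour] for hour in range(24)]


def plot_hourly(counts: list[int], out_path: str = "prompts_by_hour.png") -> None:
    """Save a semi-transparent blue area chart on a dark background."""
    import matplotlib

    matplotlib.use("Agg")  # render to a file without needing a display
    import matplotlib.pyplot as plt

    plt.style.use("dark_background")
    hours = list(range(24))
    fig, ax = plt.subplots(figsize=(10, 4))
    ax.fill_between(hours, counts, color="tab:blue", alpha=0.5)
    ax.plot(hours, counts, color="tab:blue")
    ax.set_xticks(hours)
    ax.set_xlabel("Hour of day (EST)")
    ax.set_ylabel("User prompts")
    ax.set_title("User prompts by hour")
    fig.savefig(out_path, dpi=150, bbox_inches="tight")
    plt.close(fig)
```

Calling plot_hourly(prompts_by_hour(timestamps)) writes the chart to prompts_by_hour.png in the current directory.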

That's a simple example, but Claude can come up with some pretty neat stuff here by joining the different fields together in creative ways.

But wait... there's more

With a quick detour, you can also get some additional, very useful data. For reasons evidently related to data compression, the original messages.content field is encoded. You can ask Claude to decode this field for you, and it will add extra columns to the database, something like:

- decoded_text
- decoded_thinking
- tool_name
- file_paths
- has_thinking
- has_tools

Having the complete user messages and assistant responses in the decoded_text and decoded_thinking fields opens a lot of possibilities.
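To give a feel for where those columns come from, here is a sketch of the flattening step only. It does not decode the stored field (I'm not documenting that encoding here). It assumes the decoded value is JSON shaped like Anthropic API message content: either a plain string, or a list of blocks typed text, thinking, or tool_use. It also assumes file-oriented tools pass their target as a file_path input, which is a guess. The function name flatten_content is hypothetical.

```python
import json


def flatten_content(decoded_json: str) -> dict:
    """Flatten one decoded message's content blocks into column values."""
    blocks = json.loads(decoded_json)
    if isinstance(blocks, str):
        # Plain-string content: treat it as a single text block.
        blocks = [{"type": "text", "text": blocks}]

    text, thinking, tools, paths = [], [], [], []
    for block in blocks:
        kind = block.get("type")
        if kind == "text":
            text.append(block.get("text", ""))
        elif kind == "thinking":
            thinking.append(block.get("thinking", ""))
        elif kind == "tool_use":
            tools.append(block.get("name", ""))
            # Assumption: file tools carry their target in input["file_path"].
            path = block.get("input", {}).get("file_path")
            if path:
                paths.append(path)

    return {
        "decoded_text": "\n".join(text),
        "decoded_thinking": "\n".join(thinking),
        "tool_name": ",".join(tools),
        "file_paths": ",".join(paths),
        "has_thinking": bool(thinking),
        "has_tools": bool(tools),
    }
```

Each returned key matches one of the extra columns listed above, so the dict can be written straight back to the row it came from.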

I've been exploring things like:

- cost per token
- context length vs session depth
- effect of reading/writing large files

Turns out that I had been doing two things that tended to balloon the context window: rarely using /clear, and asking for complex changes to large files. During the last third of the Blindfold Chess project, I had inlined everything into a single HTML file.

Whether a large context is desirable depends on exactly what you're doing. But it's at least important to understand how it behaves.
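One way to watch that behavior is to line up context size against how deep you are in a session. The sketch below assumes you have already queried rows of (session_id, timestamp, context_tokens) from the database. Here context_tokens is a hypothetical per-message figure, for example input tokens plus cached tokens, and the real column names will differ. The function name context_by_depth is also made up.

```python
from collections import defaultdict


def context_by_depth(rows):
    """Group rows by session and pair each message's depth with its context size.

    rows: iterable of (session_id, timestamp, context_tokens) tuples, where
    timestamp is an ISO 8601 string and context_tokens is a hypothetical
    per-message token count.
    """
    sessions = defaultdict(list)
    for session_id, timestamp, tokens in rows:
        sessions[session_id].append((timestamp, tokens))

    result = {}
    for session_id, messages in sessions.items():
        # ISO 8601 strings in the same timezone sort chronologically.
        messages.sort()
        result[session_id] = [
            (depth, tokens)
            for depth, (_, tokens) in enumerate(messages, start=1)
        ]
    return result
```

Plotting depth against tokens for each session should show a steady climb that resets whenever a new session starts, which is roughly what /clear buys you.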

Other Stats

The claude-code-log tool seems pretty comprehensive for project data stored in the ~/.claude folder. But there are some other ways to generate logs and stats that could be interesting to analyze.

One fun thing would be to correlate this cache data with the rich output logs available in the Hooks feature. You could analyze what packages are getting installed or what bash commands are running in detail. Maybe something to look at soon.