v0.3.0 · MIT License · Python 3.10+
codectx compiles your repository into a structured context file for AI agents — ranking files by dependency graph centrality, compressing to a token budget, and emitting a document an agent can reason from immediately.
Install: uv add codectx · pipx install codectx
When you dump a repository into an LLM context window, you get files in filesystem order — alphabetical, arbitrary, disconnected. The model sees tests before the modules they test, utility helpers before the architecture they support, config files before the code that reads them.
Naive context dumps waste the most valuable positions in the context window on noise. Most files in any codebase are boilerplate, test fixtures, and auto-generated code — none of which helps an agent understand the system.
Arbitrarily truncating at a token limit doesn't help either. You might cut off the core module and keep a lockfile.
codectx builds a dependency graph of your repository, then scores every file by its fan-in centrality — how many other files import it. Files that everything depends on rank highest.
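Fan-in counting of this kind can be sketched by walking each file's import statements with the standard-library `ast` module. This is an illustrative sketch, not codectx's actual implementation; real resolution of relative imports and packages is more involved:

```python
import ast
from collections import Counter
from pathlib import Path

def fan_in_scores(repo_root: str) -> Counter:
    """Count, for each imported module name, how many files import it.

    A crude stand-in for dependency-graph fan-in: every `import x` or
    `from x import y` statement found anywhere in the repo increments x.
    """
    fan_in: Counter = Counter()
    for path in Path(repo_root).rglob("*.py"):
        try:
            tree = ast.parse(path.read_text(encoding="utf-8"))
        except SyntaxError:
            continue  # skip files that don't parse
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                for alias in node.names:
                    fan_in[alias.name] += 1
            elif isinstance(node, ast.ImportFrom) and node.module:
                fan_in[node.module] += 1
    return fan_in
```

A module imported by many siblings (say, a `core` module) ends up with a high count, while leaf scripts and tests stay near zero.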
Git commit frequency, distance from entry points, and (optionally) semantic similarity to your query combine into a composite score. The top 15% of files get full AST-derived structured summaries. The next 30% get function signatures. The rest get one-liners.
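The scoring and tiering described above might be sketched as follows. The weights and the exact cutoff arithmetic are assumptions for illustration; only the 15% / 30% split comes from the text:

```python
def composite_score(fan_in: float, commit_freq: float, entry_depth: float,
                    weights: tuple[float, float, float] = (0.5, 0.3, 0.2)) -> float:
    """Weighted combination of signals (hypothetical weights):
    centrality and churn raise the score, distance from entry points lowers it."""
    w_c, w_g, w_d = weights
    return w_c * fan_in + w_g * commit_freq - w_d * entry_depth

def assign_tiers(scores: dict[str, float]) -> dict[str, str]:
    """Bucket files by rank: top 15% -> full structured summary,
    next 30% -> function signatures, the rest -> one-liners."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    n = len(ranked)
    full_cut = max(1, round(n * 0.15))
    sig_cut = max(full_cut, round(n * 0.45))  # 15% + 30%
    tiers = {}
    for i, name in enumerate(ranked):
        if i < full_cut:
            tiers[name] = "full"
        elif i < sig_cut:
            tiers[name] = "signatures"
        else:
            tiers[name] = "one-liner"
    return tiers
```

For a 20-file repository this yields 3 full summaries, 6 signature listings, and 11 one-liners.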
The output is not a source dump. It is a compiled document an agent can navigate from the first token — architecture first, then core modules, then periphery.
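Compressing such a document to a token budget can be sketched as priority-ordered packing: emit sections in score order and skip any that would overflow. The 4-characters-per-token heuristic and the section format here are assumptions, not codectx's actual output layout:

```python
def pack_to_budget(sections: list[tuple[str, str]], budget_tokens: int,
                   tokens=lambda s: len(s) // 4) -> str:
    """Concatenate (title, body) sections in priority order, skipping any
    section that would push the total past the budget.

    The default token estimator (~4 chars per token) is a rough heuristic;
    a real tool would use the target model's tokenizer.
    """
    out: list[str] = []
    used = 0
    for title, body in sections:
        cost = tokens(title) + tokens(body)
        if used + cost > budget_tokens:
            continue  # doesn't fit; a smaller later section still might
        out.append(f"## {title}\n{body}")
        used += cost
    return "\n\n".join(out)
```

Because high-centrality sections come first, anything dropped at the budget boundary is by construction the least important material.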
| Repository | Naive tokens | codectx tokens | Reduction |
|---|---|---|---|
| fastapi | 224k | 78k | 64.9% |
| requests | 41k | 6k | 84.7% |
| typer | 80k | 35k | 55.4% |
| rich | 354k | 28k | 92.0% |
| httpx | 63k | 6k | 89.5% |