CODEBASE TRAINING DATA
Codebase Intelligence
Full-history private codebases - commits, pull requests, reviews, and linked issues - packaged as training data for coding agents and SWE evals.
Overview
What's included
Source & Review History
Commits, diffs, pull requests, and review threads with author and repo entity resolution. Real engineering decisions, not synthetic tasks.
Issue-to-Code Linkage
Tickets and project history joined to the commits and PRs that resolved them - end-to-end traces for SWE-agent evals.
Build & Release Signal
CI runs, release tags, and deploy history - the full lifecycle from issue to shipped code.
Applications
Coding-Agent Training
Real multi-year software development lifecycle (SDLC) behavior across many codebases - grounded supervision for agents that read, modify, and ship inside real repos.
SWE Eval Harness
Point-in-time repo snapshots with the linked issue and the human PR that fixed it. Score an agent against what real engineers did.
Provenance-Clean Pretraining
A code corpus contributed with owner consent, along with its surrounding history - distinct from scraped public code of unknown license.
Request trial access.
90-day trial with restricted scope for evaluation. No commitment required.