SPORTS + FINANCE NLP

Sports Intelligence

Name: Sports Intelligence
Creator: Gerra

The only sports data source that powers real-time AI tool calls for the largest language models.

12 years
JSON · CSV · Parquet
Real-time (during live events) or daily batch

1B+ NL queries

8 sports covered

12 yrs history

2 response shapes


01 Sample

See the data

Representative records in the exact shape we deliver. Real provenance and full slices are shared under license.

Table-type, leaderboard

Representative shape, not real data. Cell values are strings; column and row are order-aligned; row_count equals the number of rows.

queries.jsonl (representative)

{
  "sport": "nba",
  "query": "who leads the nba in scoring this season",
  "type": "table",
  "column": ["RANK", "NAME", "PPG", "SEASON", "TM", "GP"],
  "row": [
    ["1", "Player A", "31.4", "2025-26", "XYZ", "44"],
    ["2", "Player B", "30.2", "2025-26", "ABC", "49"]
  ],
  "row_count": 2
}
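
As a minimal sketch of consuming this shape, the Python below pairs each row with the column headers so cells can be read by name. The function name table_to_dicts is hypothetical and not part of any delivered client; the record is the representative one above.

```python
from typing import Any


def table_to_dicts(record: dict[str, Any]) -> list[dict[str, str]]:
    """Turn a table-type record into one dict per row, keyed by column name.

    Hypothetical helper for illustration only.
    """
    if record.get("type") != "table":
        raise ValueError("expected a table-type record")
    columns = record["column"]
    out = []
    for row in record["row"]:
        # column and row are order-aligned, so widths must agree.
        if len(row) != len(columns):
            raise ValueError("row width does not match column width")
        out.append(dict(zip(columns, row)))
    return out


record = {
    "sport": "nba",
    "query": "who leads the nba in scoring this season",
    "type": "table",
    "column": ["RANK", "NAME", "PPG", "SEASON", "TM", "GP"],
    "row": [
        ["1", "Player A", "31.4", "2025-26", "XYZ", "44"],
        ["2", "Player B", "30.2", "2025-26", "ABC", "49"],
    ],
    "row_count": 2,
}

leaders = table_to_dicts(record)
# Cell values are strings, so numeric columns need an explicit cast.
top_ppg = float(leaders[0]["PPG"])
```

Because every cell is delivered as a string, any numeric comparison or sort should cast first, as the last line does.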

Answer-type, single value

Representative. On answer-type records, answer is populated and column, row, and row_count are omitted from the JSON (the schema treats them as null).

queries.jsonl (representative)

{
  "sport": "mlb",
  "query": "who won the world series last year",
  "type": "answer",
  "answer": "Team X won the most recent World Series."
}
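
Since the two shapes share one file, a consumer will usually branch on type. The sketch below shows one way to do that; summarize is a hypothetical name, and the one-line table summary is just an illustrative choice.

```python
import json


def summarize(record: dict) -> str:
    """Return a short text view of a record, branching on its type.

    Hypothetical helper for illustration only.
    """
    if record["type"] == "answer":
        # Answer-type: the single value lives in `answer`.
        return record["answer"]
    if record["type"] == "table":
        header = " | ".join(record["column"])
        return f'{record["row_count"]} rows: {header}'
    raise ValueError(f'unknown type: {record["type"]!r}')


line = (
    '{"sport": "mlb", "query": "who won the world series last year", '
    '"type": "answer", "answer": "Team X won the most recent World Series."}'
)
rec = json.loads(line)
text = summarize(rec)
```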

02 Schema

Record shape

Every field, its type, whether it can be null, and a representative value.

Field	Type	Constraint	Description
sport	string	required	Sport category: nba, nfl, mlb, nhl, wnba, cfb, pga, fc. e.g. nba
query	string	required	The natural-language query as posed. e.g. who leads the nba in scoring this season
type	string	required	Response type: table (structured rows) or answer (single value). e.g. table
column	string[]	nullable	Column headers, order-aligned to each row. Null on answer-type. e.g. ["RANK","NAME","PPG"]
row	string[][]	nullable	2D row data; each inner array matches column order. Null on answer-type. e.g. [["1","Player A","31.4"]]
row_count	int	nullable	Number of data rows returned. Null on answer-type. e.g. 25
answer	string	nullable	Natural-language answer for single-value responses. Present on answer-type only. e.g. Team X won the most recent title.
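
One way to mirror this table in code is a small typed record, sketched below in Python. The class name QueryRecord and the function parse_record are hypothetical; the field names, the nullable fields, and the eight sport codes come from the table above.

```python
from dataclasses import dataclass
from typing import Optional

# Sport codes as listed in the schema table.
SPORTS = {"nba", "nfl", "mlb", "nhl", "wnba", "cfb", "pga", "fc"}


@dataclass(frozen=True)
class QueryRecord:
    """Hypothetical typed view of one record; nullable fields default to None."""
    sport: str
    query: str
    type: str
    column: Optional[list[str]] = None
    row: Optional[list[list[str]]] = None
    row_count: Optional[int] = None
    answer: Optional[str] = None


def parse_record(raw: dict) -> QueryRecord:
    """Build a QueryRecord, treating absent nullable keys as None."""
    rec = QueryRecord(
        sport=raw["sport"],
        query=raw["query"],
        type=raw["type"],
        column=raw.get("column"),
        row=raw.get("row"),
        row_count=raw.get("row_count"),
        answer=raw.get("answer"),
    )
    if rec.sport not in SPORTS:
        raise ValueError(f"unknown sport: {rec.sport!r}")
    if rec.type not in ("table", "answer"):
        raise ValueError(f"unknown type: {rec.type!r}")
    return rec


parsed = parse_record({
    "sport": "mlb",
    "query": "who won the world series last year",
    "type": "answer",
    "answer": "Team X won the most recent World Series.",
})
```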

03 What's included

NLP Query Layer

Natural language in, structured data out. The conversational query layer that powers live AI tool calls.

Historical Stats

All major sports, all historical seasons, all available statistics. Complete structured coverage.

Finance Overlay

Sports performance data mapped to publicly traded tickers (DKNG, FLUT, DIS, apparel brands).

04 Methodology

How it is built

01 NL query capture
The dataset is a corpus of natural-language sports questions paired with their resolved structured responses. The unit of data is the query-to-structured-answer pair, not a raw stats table.
02 Structured resolution
Each query resolves to either a tabular result (column, row, row_count) or a single-value answer. Observed intents include leaderboard, career, season, standings, and historical.
03 Sport and stat coverage
Eight sports across all historical seasons and all available statistics. Per-sport stat vocabularies differ: an NBA leaderboard carries many more columns than a standings table.
04 Normalization to a common envelope
Heterogeneous stat responses across sports are wrapped in one consistent record envelope so consumers parse every sport identically.
05 Delivery and refresh
Served real-time during live events or as a daily batch.
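
To make the common-envelope step concrete, here is a minimal Python sketch. It is not the production pipeline: the function to_envelope, the input layout (a list of per-row dicts with mixed value types), and the NHL query text are all assumptions made for illustration. Only the output keys follow the record shape documented above.

```python
def to_envelope(sport: str, query: str, stat_rows: list[dict]) -> dict:
    """Wrap sport-specific stat rows in the shared table envelope.

    Hypothetical helper. Assumes every row dict has the same keys as the
    first row, in the same order.
    """
    columns = list(stat_rows[0].keys()) if stat_rows else []
    # Cells are delivered as strings, whatever the source type was.
    rows = [[str(r[c]) for c in columns] for r in stat_rows]
    return {
        "sport": sport,
        "query": query,
        "type": "table",
        "column": columns,
        "row": rows,
        "row_count": len(rows),
    }


# Two sports with different stat vocabularies end up in the same envelope.
nba = to_envelope(
    "nba",
    "who leads the nba in scoring this season",
    [{"RANK": 1, "NAME": "Player A", "PPG": 31.4}],
)
nhl = to_envelope(
    "nhl",
    "nhl standings",  # illustrative query text
    [{"TEAM": "Team Y", "W": 30, "L": 12, "PTS": 65}],
)
```

The column lists differ per sport, but the top-level keys do not, which is what lets a consumer parse every sport with one code path.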

05 Evals

How we validate

What each evaluation measures and how it is run. Where no benchmark is published, we show the methodology and say so.

NL-to-structured resolution accuracy

Measures

Whether a query resolves to the correct structured answer: right stat, right entities, right ranking.

Method

A held-out set of NL queries is scored against ground-truth stat tables by exact match on the returned rows and values, with columns aligned.

Result

Methodology-stage. Not yet run or published.
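
Since no harness is published, the following Python is only a sketch of how the described method could be scored. The names exact_match and resolution_accuracy are hypothetical, and the toy pairs exist to exercise the code; they are not evaluation results.

```python
from typing import Iterable


def exact_match(predicted: dict, truth: dict) -> bool:
    """True when type, columns, rows, and answer all agree exactly."""
    keys = ("type", "column", "row", "answer")
    return all(predicted.get(k) == truth.get(k) for k in keys)


def resolution_accuracy(pairs: Iterable[tuple[dict, dict]]) -> float:
    """Fraction of (predicted, truth) pairs that match exactly."""
    pairs = list(pairs)
    if not pairs:
        raise ValueError("need at least one pair")
    hits = sum(exact_match(p, t) for p, t in pairs)
    return hits / len(pairs)


truth_table = {
    "type": "table",
    "column": ["RANK", "NAME", "PPG"],
    "row": [["1", "Player A", "31.4"]],
}
pred_ok = dict(truth_table)
# Same shape, but one cell value is wrong, so this pair is a miss.
pred_bad = {**truth_table, "row": [["1", "Player A", "30.9"]]}

accuracy = resolution_accuracy([(pred_ok, truth_table), (pred_bad, truth_table)])
```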

Cross-sport schema consistency

Measures

Whether the record envelope parses uniformly across all eight sports.

Method

A schema-validation pass over records per sport: envelope conformance, column and row alignment, and row_count equal to the number of rows.

Result

Methodology-stage. No published figure.
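
A schema-validation pass of this kind could look like the Python sketch below. It is an assumption-laden illustration, not the published validator: envelope_errors and failures_by_sport are hypothetical names, and the checks are limited to the three conditions named in the method.

```python
from collections import Counter
from typing import Iterable


def envelope_errors(rec: dict) -> list[str]:
    """Return a list of conformance problems for one record (empty if clean)."""
    errors = []
    for field in ("sport", "query", "type"):
        if not isinstance(rec.get(field), str):
            errors.append(f"missing or non-string {field}")
    if rec.get("type") == "table":
        column, row = rec.get("column"), rec.get("row")
        if not isinstance(column, list) or not isinstance(row, list):
            errors.append("table record needs column and row")
        else:
            if any(len(r) != len(column) for r in row):
                errors.append("row width differs from column width")
            if rec.get("row_count") != len(row):
                errors.append("row_count differs from number of rows")
    elif rec.get("type") == "answer":
        if not isinstance(rec.get("answer"), str):
            errors.append("answer record needs answer")
    else:
        errors.append("unknown type")
    return errors


def failures_by_sport(records: Iterable[dict]) -> Counter:
    """Count non-conforming records per sport."""
    failed = Counter()
    for rec in records:
        if envelope_errors(rec):
            failed[rec.get("sport", "unknown")] += 1
    return failed


records = [
    {"sport": "nba", "query": "q1", "type": "table",
     "column": ["RANK", "NAME"], "row": [["1", "Player A"]], "row_count": 1},
    # row_count disagrees with the single row, so this record fails.
    {"sport": "nhl", "query": "q2", "type": "table",
     "column": ["TEAM", "PTS"], "row": [["Team Y", "65"]], "row_count": 3},
    {"sport": "mlb", "query": "q3", "type": "answer", "answer": "Team X"},
]
failed = failures_by_sport(records)
```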

06 Graders

Ground truth

What correct means for this data, and how it is established.

Ground truth

Verified official statistics for the queried season and entity: the canonical numeric stats a query should resolve to.

How it is established

Compare the returned rows or answer against ground-truth stats by exact value match with column-order alignment; for answer-type records, match the value against the known fact.

Agreement

No grader implementation or agreement figure is published at this stage.
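
In the absence of a published grader, the Python below sketches one possible reading of the comparison described above. Everything here is an assumption: the function names are hypothetical, "column-order alignment" is interpreted as reordering returned cells into the ground-truth column order before comparing, and the answer-type check is a case-insensitive containment test on the known fact.

```python
def align_rows(column: list[str], row: list[list[str]],
               target_columns: list[str]) -> list[list[str]]:
    """Reorder each row's cells to follow target_columns."""
    index = {name: i for i, name in enumerate(column)}
    if set(index) != set(target_columns):
        raise ValueError("column sets differ")
    return [[r[index[name]] for name in target_columns] for r in row]


def grade_table(returned: dict, truth: dict) -> bool:
    """Exact value match after aligning returned columns to the truth."""
    try:
        aligned = align_rows(returned["column"], returned["row"], truth["column"])
    except ValueError:
        return False
    return aligned == truth["row"]


def grade_answer(returned: dict, known_fact: str) -> bool:
    """Assumed rule: the known fact must appear in the answer text."""
    return known_fact.casefold() in returned["answer"].casefold()


truth = {"column": ["RANK", "NAME"], "row": [["1", "Player A"]]}
# Same values, different column order: still a match after alignment.
swapped = {"column": ["NAME", "RANK"], "row": [["Player A", "1"]]}
wrong = {"column": ["RANK", "NAME"], "row": [["1", "Player B"]]}

table_ok = grade_table(swapped, truth)
table_bad = grade_table(wrong, truth)
answer_ok = grade_answer(
    {"answer": "Team X won the most recent World Series."}, "team x"
)
```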

07 Application

Sports Betting Intelligence

Query volume patterns reveal where attention and money flow before lines move. Track which matchups and players drive the most analytical interest.

Ticker-Mapped Performance

Correlate sports outcomes and engagement with stock performance of gaming, media, and apparel companies.

AI Model Training

The same NLP-to-structured-data pipeline that powers major AI models. Licensed for model training with full historical coverage.

08 Environment & integration

How you load it

Delivery

S3, REST API, WebSocket

Formats

JSON, CSV, Parquet

Auth

Licensed access. Derived natural-language and structured data.

Cadence

Real-time during live events, or daily batch.
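
For the JSON delivery, a batch export in the queries.jsonl layout can be read one record per line. The Python sketch below uses only the standard library and an in-memory stream in place of a real file; iter_records is a hypothetical name, and no S3 path, REST endpoint, or WebSocket URL is assumed.

```python
import io
import json
from typing import Iterator, TextIO


def iter_records(stream: TextIO) -> Iterator[dict]:
    """Yield one parsed record per non-empty line of a JSONL stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


# Stand-in for an opened export file, e.g. open("queries.jsonl").
sample = io.StringIO(
    '{"sport": "nba", "query": "who leads the nba in scoring this season", "type": "table", '
    '"column": ["RANK", "NAME", "PPG"], "row": [["1", "Player A", "31.4"]], "row_count": 1}\n'
    "\n"
    '{"sport": "mlb", "query": "who won the world series last year", "type": "answer", '
    '"answer": "Team X won the most recent World Series."}\n'
)

loaded = list(iter_records(sample))
tables = [r for r in loaded if r["type"] == "table"]
```

Reading lazily, line by line, keeps memory flat on large daily batches; the same generator works on any text stream.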

Request access.

Restricted-scope evaluation access for qualified teams. We share real samples, full schema, and provenance under a mutual NDA.

Talk to us: team@gerra.com
