HoopQ · Open NBA Analytics · Home of Q-Rating

Deep-RL player impact, matchup-aware neural presence, per-game skill trajectories, doubly-robust off-policy evaluation, and lineup chemistry. Every metric is built possession-by-possession over 2000-01 through 2025-26.

Home of Q-Rating

A deep-RL player impact metric with an action-level decomposition and a possession-level presence rating. Same points-per-100 units as RAPM, with a breakdown of where the value came from.

Featured Players

Top 15 by 2025-26 Q-Rating Total Impact (action + career presence) · ≥10,000 career possessions · ≥3,000 possessions in 2025-26 · Top Skills = highest per-action contribution buckets.

Leaderboards

Top 3 by each metric · click any row to open the player profile or the full ranking.

Explore

Chemistry

Pair- and trio-augmented ridge synergy, beyond what individual RAPMs predict.

Lineup Optim

Best 5-man lineups per team via Q-Rating presence + pair chemistry.

Projections

DARKO-style 1/2/3-year forecasts with aging-curve adjustments.

OPE

Doubly-robust per-event value above the state-conditional league policy.

Clutch / WPA

Per-event win-probability-added with a clutch breakdown.

Possession Value

Sequence-transformer expected reward; per-player Shot Luck vs the model.

Playstyle

Action-frequency vectors and nearest stylistic neighbors.

Play Types

Twelve offensive option buckets via hierarchical RL.

Revealed Prefs

Inverse-RL action × context heatmap of decision preferences.

Coaches

Two-step residual RAPM + Q-Rating: coach effect above player baseline. Kalman trajectory, shot chart, style profile.

▶Multi-season pooled RAPM (2000-01 through 2025-26), points per 100 possessions.

Multi-season pooled RAPM with a box-score Bayesian prior, computed across 26 seasons and 5.49M possessions. Each row is one player. Values are in points per 100 possessions contributed.

Click a column header to sort. Click a row to open the full player profile. See About for methodology.

▶Single-season RAPM leaderboard. Scrub the season dropdown to see the top of each era.

Best RAPM players in a single season. Each row is one (player, season) pair, computed from a prior-informed Ridge fit on just that season's possessions. Use the season dropdown to scrub through the era.

▶Projected 2026-27, 2027-28, and 2028-29 player metrics. Recency-weighted recent seasons plus league aging curve. 95% CI shown.

Each projection combines a recency-weighted average of the player's most recent three seasons (weights 0.2 / 0.3 / 0.5) with the age-indexed league aging-curve delta from their current age to the target season's age. Applied independently to RAPM, Q-Rating, offensive presence, and defensive presence.

Aging curves are possession-weighted means by actual age, from birthdates when available and career year otherwise. The 95% credible interval reflects year-over-year drift plus the variance of the player's recent seasons.
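
The point estimate described above can be sketched in a few lines. This is a minimal illustration, not the site's code: the function name `project`, the dictionary-based aging curve, and the assumption that the 0.5 weight falls on the most recent season are all hypothetical, and the credible interval is not modeled here.

```python
def project(recent_three, aging_curve, current_age, years_ahead):
    """Recency-weighted base plus league aging-curve delta.

    recent_three: metric values for the last three seasons, oldest first
                  (assumption: the 0.5 weight goes on the most recent season).
    aging_curve:  hypothetical dict mapping integer age -> league mean of the metric.
    """
    weights = (0.2, 0.3, 0.5)
    base = sum(w * v for w, v in zip(weights, recent_three))
    # Aging delta from the player's current age to the target season's age.
    delta = aging_curve[current_age + years_ahead] - aging_curve[current_age]
    return base + delta


# Toy inputs, invented purely for illustration.
toy_curve = {27: 0.5, 28: 0.4, 29: 0.2}
one_year = project([1.0, 2.0, 3.0], toy_curve, current_age=27, years_ahead=1)
two_year = project([1.0, 2.0, 3.0], toy_curve, current_age=27, years_ahead=2)
```

The same function would be called once per metric (RAPM, Q-Rating, offensive presence, defensive presence), since the projections are applied independently.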

Click any player row above to see their per-metric projection curves.

League aging curves

Possession-weighted mean per age (player-seasons ≥500 poss, age 19-42). Toggle the metric to see how it ages across the league. Lower chart breaks RAPM down by Q-Rating action bucket. Pick a skill to see when in a career it peaks.

Per-action aging

▶Team Q-Rating = poss-weighted sum of player Q-Rating (action + presence). Split into offense + defense. Team RAPM shown alongside for reference.

Per-season team strength from two independent estimators. Team Q-Rating (total) is the possession-weighted sum of each player's Q-Rating (action + presence combined), capturing both event-attributed contribution and lineup-swap presence (rim deterrence, spacing, gravity). Split into offense (off action + off presence) and defense (def action − def presence, both flipped to good-direction). Team RAPM = net points per 100 possessions (offense minus defense), the independent ridge-regression estimator. When Q-Rating and RAPM disagree, the delta is informative. Q-Rating covers 2005-26; pre-2005 team-seasons show blank (retrain in progress will extend this to 2000-26). Click a team row for its season-by-season trajectory.

▶Coach effect isolated as the possession-level residual after controlling for player quality. Two metrics side by side — Coach Q-Rating (default) and Coach RAPM (cross-check) — plus shot chart and action-style profile on the coach profile page.

Coach effect isolated via a two-step fit. Step 1 fixes each player's expected contribution using a player-quality prior. Step 2 regresses the per-possession residual (actual minus player-implied points) on the offensive and defensive head coach one-hots plus season fixed effects. The coach coefficient captures whatever pts/100 the players + league scoring era couldn't explain — read as coach + team-baseline effect for the coach's tenure.

Two priors, two views: Coach Q-Rating (default) uses the full deep-RL Q-Rating — action head plus presence — as the player prior. Coach RAPM swaps in box-score-prior player RAPM instead, with the defensive term as a 50/50 blend of def_rapm and Q-Rating def presence so rim-deterrence gets credit. The two agree overall at Pearson r=0.76 on the ≥250-game qualified pool; their defense signals correlate at 0.97 but offense signals diverge (r=0.49). When both metrics agree the coach's effect is robust; when they disagree the ranking is method-sensitive on that coach (typically an offensive-scheme case Q-Rating captures better than box-score RAPM).

Beyond the ratings. Click any coach row for their profile: Kalman-smoothed season trajectory (raw shrunk + smoothed line), aggregate shot chart of teams under this coach (frequency + efficiency), and a "Coach style" widget showing which action buckets they over- or under-index on relative to league mean.

Sign convention: off positive = raises team pts/100; def negative = lowers opponent pts/100. Total = off − def (positive is good, mirroring player RAPM). Career shrunk with κ=40000 possessions; per-season with κ=12500 plus a 1-D Kalman smoother across seasons for trajectory. Click a coach row for their season-by-season chart.

▶Win Probability Added per event. Clutch = 4th quarter or later, under 5 minutes, score within 5.

Win Probability Added credits each event by its contribution to the offensive team's win probability. A clutch shot that moved WP by 5 percentage points contributes +0.05 WPA. Per-season models are trained across 2000-01 through 2025-26; the selector defaults to the most recent season.

Total WPA is cumulative and rewards volume as well as efficiency; WPA per event isolates efficiency. Defensive WPA only credits explicit defensive events (steals, blocks, defensive rebounds, fouls). Rim deterrence and off-ball spacing effects are not captured, so elite rim protectors read systematically low here.
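
As a small sketch of the definitions above, the clutch filter and the per-event WPA arithmetic might look like the following. The function names and argument layout are assumptions made for illustration; the win-probability values themselves would come from the site's per-season models, which are not reproduced here.

```python
def is_clutch(period, seconds_left_in_period, margin):
    """Clutch = 4th quarter or later, under 5 minutes left, score within 5."""
    return period >= 4 and seconds_left_in_period < 300 and abs(margin) <= 5


def wpa(wp_before, wp_after):
    """Win Probability Added for one event, from the offensive team's view."""
    return wp_after - wp_before


# A shot that moves win probability from 50% to 55% is worth +0.05 WPA.
example_wpa = wpa(0.50, 0.55)
```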

▶Action-frequency distribution per player. Click a row for signature bars and stylistic nearest neighbors.

Playstyle vectors are the per-player distribution over action classes for the 2024-25 season. Each column shows the fraction of that player's actions that fall in the given category.

Click a row to open the player's signature bar chart and their ten most-similar players by cosine similarity.
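
The nearest-neighbor lookup can be illustrated with plain cosine similarity over action-frequency vectors. This is a sketch under assumptions: the function name, the toy players, and the three-category vectors are invented, and every player is assumed to have at least one recorded action so that no vector has zero length.

```python
import numpy as np


def nearest_neighbors(vectors, names, query, k=10):
    """Return the k most similar players to `query` by cosine similarity."""
    V = np.asarray(vectors, dtype=float)
    # Normalize each row to unit length so a dot product equals cosine similarity.
    unit = V / np.linalg.norm(V, axis=1, keepdims=True)
    sims = unit @ unit[names.index(query)]
    order = np.argsort(-sims)  # most similar first
    ranked = [(names[i], float(sims[i])) for i in order if names[i] != query]
    return ranked[:k]


# Toy action-frequency vectors (each row sums to 1), invented for illustration.
toy_names = ["A", "B", "C"]
toy_vectors = [[0.5, 0.5, 0.0], [0.6, 0.4, 0.0], [0.0, 0.0, 1.0]]
neighbors_of_a = nearest_neighbors(toy_vectors, toy_names, "A", k=2)
```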

▶Twelve offensive play types via hierarchical RL options. Click a player for per-option success rates.

Per-player play-type distribution from a hierarchical RL options framework. Each offensive action is labeled as one of twelve play types: pick-and-roll handler, pick-and-roll finisher, isolation, post-up, cut, putback, spot-up, off-screen, transition, free throw, rebound, or other.

Columns show the fraction of a player's offensive actions in each option. Defensive events are tracked separately as per-100-poss steals, blocks, and fouls. Click a player row to see their per-option success breakdown.

▶Inverse RL: action × context heatmap. Positive = player picks that action more than league baseline.

Inverse RL, Revealed Preferences. Treats each player's action choices as a softmax over a personal reward function and recovers the per-player reward weights via log-ratio against the league baseline, conditioned on game state (clutch / blowout / early or late shot clock / transition / normal). Positive cell = player chooses that action far more than expected; negative = chooses it less. Pick a player to see their action × context heatmap.
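
One cell row of the heatmap can be sketched as a log-ratio of action frequencies within a single game context. The function name and the additive smoothing term are assumptions added here to keep the logarithm finite for unseen actions; the site does not say how zero counts are handled.

```python
import math


def preference_log_ratio(player_counts, league_counts, smoothing=0.5):
    """Log-ratio of player vs league action frequency within one context.

    Both arguments map action name -> count. `smoothing` is a hypothetical
    additive pseudo-count, not something the site documents.
    """
    actions = sorted(league_counts)
    p_total = sum(player_counts.get(a, 0) for a in actions) + smoothing * len(actions)
    l_total = sum(league_counts[a] for a in actions) + smoothing * len(actions)
    return {
        a: math.log((player_counts.get(a, 0) + smoothing) / p_total)
        - math.log((league_counts[a] + smoothing) / l_total)
        for a in actions
    }


# Toy counts for one context: this player takes midrange shots more than the league.
toy_ratio = preference_log_ratio(
    {"midrange": 30, "three": 10}, {"midrange": 20, "three": 20}, smoothing=0.0
)
```

Positive values correspond to actions chosen more often than the league baseline, negative values to actions chosen less often.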

▶OPE: per-event value added above the league-policy baseline in the same state context. DR is the headline.

Off-policy evaluation measures the per-event marginal value a player generates above the league-average action distribution in the same state context, split across six contexts (clutch, blowout, early-clock, late-clock, transition, normal). Three estimators are reported: importance sampling (IS), self-normalized IS (WIS), and doubly-robust (DR). DR is the headline; the 95% CI comes from a 200-sample bootstrap.

A positive DR marginal means the player's actual decisions outperform what league-average action choices would produce in similar situations. Behavior policy is a learned π_b(a | s, p); target policy is the state-conditional empirical league action frequency; the Q baseline is per-action-bucket calibrated.

The metric reflects per-event scoring efficiency rather than total impact. Pure playmakers and defensive stars rank lower than their RAPM would suggest because OPE only measures the per-shot value-add.

▶Best 5-man lineups per team. Score = Q-Rating presence sum + pair chemistry.

Best 5-man lineups per team (default 2025-26; pick a season via the dropdown), found by exhaustive search over rotation players. Score = sum of player Q-Rating presence (Off − Def, neg-good convention) + offensive pair chemistry − defensive pair chemistry. The player term uses neural-lineup-aware Q-Rating presence (from a leave-one-out swap on the trained Q model with attention pooling over lineups) instead of career RAPM, which puts scores in a more realistic range (~+15 to +25 per 100 poss instead of the previous +30 to +40).

▶Lineup Simulator: swap any player into a 5-man lineup and see the projected net pts/100, combining per-player Q-Rating presence + observed pair chemistry.

How the score is built.
1. Per-player contribution = off_presence_shrunk + (−def_presence_shrunk). These come from the Q model's leave-one-out lineup-swap procedure, and they capture rim deterrence, spacing, gravity, and off-ball value (the lineup-level effects that pure action-bucket ratings miss). Both sides are shown "good-direction" (positive = good).
2. Pair chemistry = Σ across all C(5,2)=10 pairs. We use the observed coefficient from the multi-year ridge fit (residual above each player's individual contribution). Pairs that haven't played together show 0.00. A Q-model-based predictor for hypothetical pairs is on the roadmap but not yet wired into this page.
3. Net pts/100 = Σ per-player contributions + Σ pair chemistry.
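
The three steps above can be sketched as a single scoring function. The dictionary layout and the names `lineup_net_rating`, `off_presence`, `def_presence`, and `pair_chem` are hypothetical; the toy numbers are invented and are not real ratings.

```python
from itertools import combinations


def lineup_net_rating(lineup, off_presence, def_presence, pair_chem):
    """Net pts/100 = sum of per-player contributions + sum of pair chemistry."""
    if len(lineup) != 5:
        raise ValueError("a lineup needs exactly five players")
    # Step 1: off presence plus negated def presence (positive = good).
    player_term = sum(off_presence[p] + (-def_presence[p]) for p in lineup)
    # Step 2: all C(5,2) = 10 pairs; pairs with no shared possessions count as 0.00.
    pair_term = sum(
        pair_chem.get(frozenset(pair), 0.0) for pair in combinations(lineup, 2)
    )
    # Step 3: add the two terms.
    return player_term + pair_term


toy_lineup = ["a", "b", "c", "d", "e"]
toy_off = {p: 1.0 for p in toy_lineup}
toy_def = {p: -0.5 for p in toy_lineup}
toy_pairs = {frozenset({"a", "b"}): 0.4}
toy_net = lineup_net_rating(toy_lineup, toy_off, toy_def, toy_pairs)
```

Keying pair chemistry by `frozenset` makes the lookup order-independent, so ("a", "b") and ("b", "a") resolve to the same coefficient.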

Caveats. Triple-and-higher chemistry isn't modeled. Multi-year ratings are the most stable input; single-season ratings are noisier and overweight 2025-26 small samples. Presence-based defense does capture rim deterrence and off-ball value (that's the whole point of the leave-one-out lineup swap), but the magnitude depends on how much between-lineup variance the Q model saw during training. Players with very repetitive teammates may have their deterrence partly absorbed into "average team defense."

▶Pair-Augmented RAPM: extra pts/100 a pair adds beyond what their individual RAPMs predict.

Pair-Augmented RAPM on multi-season pooled data (2000-01 → 2025-26, 5.49M possessions). Ridge regression with both player indicators and same-team pair indicators in one joint fit. Coefficients shown here are extra points per 100 possessions when the pair is on the floor together, beyond what their individual RAPM predicts. Held-out R² improvement is positive (vs zero for single-season alone), so this is the validated signal version.

Context filter splits the same ridge into three possession subsets: clutch (Q4, last 5 min, |margin|≤5), regular (|margin|≤15, not clutch), and blowout (|margin|>15). Curry+Draymond drop from +4.1 regular to +1.4 clutch — defenses take away the DHO late. Dirk+JET surface as a top clutch pair. Fewer qualified pairs in the clutch subset (~2,500 vs ~34,000 regular) since each pair needs more clutch minutes to qualify.

▶Possession Value: possession-context prefix transformer (2024-25). Positive = player beat league-average expectation on the possessions they finished (shot, missed, or turned over).

A per-possession expected-reward model (transformer over event-prefix sequences) predicts how many points a possession will yield given the setup events. The actual-vs-predicted residual is attributed to the possession's shot-taker (or turnover-er), with rebounds passing credit through to the shooter of the preceding shot so misses count against the player who took them, and and-one free throws roll into the shot they came from. Positive Poss Luck = player hit shots / avoided turnovers better than the model predicted given the possession context. Distinct from xFG shot-making (which only sees location); this uses the full event sequence, so hard shots (contested pull-ups, late-clock isolations, drives against elite defenders) get their proper difficulty. Currently trained on 2024-25 only; multi-year retrain queued.

About HoopQ

HoopQ is an independent NBA analytics site focused on deep-RL player impact, matchup-aware neural presence, per-game skill trajectories, and lineup chemistry. Every metric on the site is built possession-by-possession from NBA play-by-play covering the 2000-01 through 2025-26 regular seasons: 5.49M possessions across 30,815 games.

The rest of this page is a glossary. Each entry explains one metric on the site: what it is, how it's computed, how to read it, and where it falls short. Use the search box to filter, or the Expand all / Collapse all buttons.

Not affiliated with the NBA. Independent hobby project. Data is derived from public play-by-play feeds. Contact: hoopqhq@gmail.com.

Multi-season RAPM (2000-2026)

What it is. Regularized Adjusted Plus-Minus, the standard "all-in-one" basketball impact metric. Each player gets two coefficients: an offensive rating (points contributed per 100 possessions when they're on offense) and a defensive rating (points conceded per 100 possessions when they're on defense, where more negative is better). Total RAPM = ORAPM − DRAPM.

How it's computed. For every NBA possession from 2000-01 through 2025-26 (5.49M possessions across 30,815 games), we record the 5-on-5 lineup and the points scored. A sparse linear regression then solves for each player's marginal contribution while controlling for who they shared the floor with. We use a Bayesian prior built from each player's box-score profile (per-100 stats: points, assists, turnovers, rebounds, steals, blocks, etc.) so that low-minute players shrink toward what their box stats predict rather than toward zero. Ridge regularization strength (λ) was chosen by 3-fold cross-validation on held-out possessions.
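
A ridge fit that shrinks toward a prior rather than toward zero has a simple closed form, sketched below. This is an illustration under assumptions, not the site's pipeline: the column layout, the toy possessions, and the prior values are invented, a dense solve stands in for the sparse solver a 5.49M-row problem would need, and cross-validation of λ is omitted.

```python
import numpy as np


def ridge_with_prior(X, y, prior, lam):
    """Minimize ||y - X b||^2 + lam * ||b - prior||^2 in closed form."""
    n_coef = X.shape[1]
    lhs = X.T @ X + lam * np.eye(n_coef)
    rhs = X.T @ y + lam * prior
    return np.linalg.solve(lhs, rhs)


# Toy design matrix (assumed layout): columns are
# [off_player0, off_player1, def_player0, def_player1], one row per possession.
X = np.array([
    [1.0, 0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 1.0, 0.0],
    [0.0, 1.0, 1.0, 0.0],
])
y = np.array([2.0, 0.0, 3.0, 0.0])        # points scored on each possession
prior = np.array([0.5, 0.2, -0.3, 0.1])   # hypothetical box-score prior

beta_light = ridge_with_prior(X, y, prior, lam=0.1)  # data dominates
beta_heavy = ridge_with_prior(X, y, prior, lam=1e6)  # prior dominates
```

With a large λ, or with no possessions at all for a player, the coefficient falls back to the prior, which is the behavior described for low-minute players.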

Strengths. Captures impact you can't see in the box score: screens, spacing, defensive positioning. Multi-season pooling reduces single-season noise by ~10×. The top of the pooled list is dominated by Jokić, LeBron, Embiid, Chris Paul, Curry, Giannis, Kawhi, Garnett, Luka, and Nash, which matches consensus.

Limitations. RAPM is correlational and makes no causal claim. Small-sample players (Wembanyama, Chet Holmgren: 2 seasons each) still appear in the top 25 with wide uncertainty. It doesn't separate role from talent (a great defender on a great defense gets partial credit for teammates' help).

Q-Rating (RL Q-Model)

What it is. A deep-RL player rating with two complementary flavors per side, both in points per 100 possessions (RAPM units):

Off / Def (action): the sum of per-action-bucket contributions. Each event is credited to its actor, which tells you which buckets (3pt pullups, rim finishes, blocks, rebounds, etc.) drove the value.
Off / Def (presence): neural ORAPM / DRAPM via a leave-one-out lineup swap on the trained Q model. Captures value not tied to any single event: spacing, screen-setting, rim deterrence, screen navigation. This is the closer analog to RAPM.

How it's computed. Each NBA possession is an RL episode; each event (shot, rebound, turnover, foul) is a step. A neural network learns Q(state, action, player), the expected remaining points in the possession given the game context, action, and actor. Player identity is a learned embedding; lineups are pooled by a 4-head self-attention layer. Action ratings = average advantage (Q with actual embedding − Q with league-average embedding) × usage per 100 possessions, summed per side. Presence ratings = average Q change at the start of each possession when the player is replaced with a neutral baseline (the mean of all real player embeddings), averaged across every possession they were in the off / def lineup, × 100.

Empirical-Bayes shrinkage. Per-100-poss rates have high variance for low-sample players, so every rating is shrunk by n / (n + κ) with κ = 8000 on-floor events (~1 season for a half-time player). Veterans keep ~90% of their raw rating; small-sample noise players collapse toward 0.
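
The shrinkage factor is simple enough to check by hand. In this sketch the function name `shrink` is hypothetical; the formula and the κ = 8000 default are taken from the text above.

```python
def shrink(raw_rating, n_events, kappa=8000.0):
    """Shrink a raw per-100 rating toward 0 by the factor n / (n + kappa)."""
    return raw_rating * n_events / (n_events + kappa)


# A player with exactly kappa events keeps half of the raw rating.
half_kept = shrink(4.0, 8000)
# Keeping ~90% of the raw rating takes 72,000 events: 72000 / 80000 = 0.9.
veteran = shrink(4.0, 72000)
```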

How to read it. Presence is the headline number; action shows where the value came from. For elite defenders the two diverge: Def (action) is mostly blocks/steals/rebounds, while Def (presence) includes the value of just being out there. Player embeddings also cluster meaningfully (Curry near Lillard near Trae; Jokić near Sabonis), so the model isn't just memorizing role.

Kalman Career Trajectory (game-by-game)

What it is. A DARKO-style Kalman filter over per-game Q-Rating, producing a smoothed skill trajectory across every game a player has played. The observation is per-game Q-Rating in points per 100 team possessions, not a Game Score composite: it's the neural action-bucket decomposition redistributed per event, so the trajectory tracks actual skill rather than box-score production.

How it's computed. Each season's per-player action-bucket coefficients are decomposed into per-event credits, aggregated per (player, game) as a per-100 rate, and fed one at a time into a Kalman filter. Aging drift comes from a smoothed age curve fit on all observations pooled by age. The rookie prior is seeded from a draft-slot prior (median career-year-1 Q-Rating by pick). Career-average presence is folded in as a season-constant so elite defenders and off-ball creators surface at their real magnitudes.
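
The game-by-game update can be illustrated with a textbook one-dimensional Kalman filter on a random-walk state. This is a generic sketch, not the site's filter: the constant `drift` stands in for the age-curve drift, and the variances and prior are hypothetical inputs (the site seeds the prior from draft slot).

```python
import math


def kalman_filter(observations, obs_var, process_var, prior_mean, prior_var, drift=0.0):
    """1-D Kalman filter: returns posterior means and variances after each game."""
    mean, var = prior_mean, prior_var
    means, variances = [], []
    for z in observations:
        # Predict: skill drifts a little and uncertainty grows between games.
        mean += drift
        var += process_var
        # Update: blend the prediction with the new per-game observation.
        gain = var / (var + obs_var)
        mean += gain * (z - mean)
        var *= 1.0 - gain
        means.append(mean)
        variances.append(var)
    return means, variances


def band_95(mean, var):
    """The +/-1.96 sigma band around a posterior mean."""
    half_width = 1.96 * math.sqrt(var)
    return mean - half_width, mean + half_width


toy_means, toy_vars = kalman_filter(
    [2.0], obs_var=1.0, process_var=0.0, prior_mean=0.0, prior_var=1.0
)
```

With equal prior and observation variance the first update lands halfway between the prior and the observation, which is why rookies with few games stay close to their prior and carry wide bands.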

How to read it. The bold line is the posterior mean; the shaded band is a ±1.96σ 95% confidence interval; small dots are per-game observations. For players drafted since 2000-01 the x-axis is career game number so trajectories are apples-to-apples across eras; for earlier careers the axis falls back to date since our data window starts mid-career. The Compare-With input overlays a second player.

Limitations. Career-average presence gives the trajectory its right magnitude but is constant within a career, so within-season presence changes (injury, role shift) don't move the line beyond what action-bucket noise contributes. Rookies have wide bands until they've played enough games.

Lineup GNN + Matchup Influence

What it is. A heterogeneous graph neural network over each 5-vs-5 possession, treated as a 10-node graph with three edge types: within-offense (chemistry among the offensive 5), within-defense (chemistry among the defensive 5), and cross (offense-vs-defense matchup edges). Two hops of message passing let signal like "my teammate has a favorable matchup, which frees me up" propagate through the graph.

How it's computed. Node features are learned player embeddings plus a role token (offense or defense). Each layer aggregates three separate messages (within-offense, within-defense, cross) into an update. Two layers are applied, node representations are pooled by side, and a standard Q head predicts expected reward on the possession. Downstream artifacts: per-player GNN presence via leave-one-out lineup swaps, a per-player synthetic matchup sweep (X vs candidate defender in an isolated 5-vs-5 context), and team-vs-team predictions aggregated over each team's most-used 5-man lineups.

Where it shows up. The Matchup Influence table on every player profile (defenders whose presence lowers or raises the player's expected Q per possession), the vs-team preview widget on team profiles, and the GNN presence stat card on the Lineup Sim page.

Limitations. Play-by-play has no per-shot defender attribution, so the matchup signal is on-court concurrence not literal 1-on-1 assignment. Same-era filtering restricts candidate defenders to seasons a player actually played in so anachronistic pairings don't populate the table.

Pair Chemistry (Pair-Augmented RAPM)

What it is. Extra points per 100 possessions a specific pair generates (offense) or prevents (defense) when they're on the floor together, beyond what their individual RAPMs predict. A measure of two-player chemistry / fit / synergy.

How it's computed. Same ridge regression as RAPM but with an extra binary feature for every player pair that played enough possessions together. The pair coefficient is what's left after the individual player effects are accounted for: the "team-up bonus." Positive on offense = synergy. Negative on defense = synergy (fewer points allowed).
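
Building the extra pair features can be sketched as follows. The function name, the 0/1 on-floor matrix for a single side, and the `min_poss_together` threshold are assumptions for illustration; the augmented matrix would then go into the same ridge fit as the player-only version.

```python
from itertools import combinations

import numpy as np


def add_pair_columns(X, min_poss_together):
    """Append one binary column per player pair with enough shared possessions.

    X is assumed to be a 0/1 on-floor indicator matrix (possessions x players).
    """
    pair_ids, pair_cols = [], []
    for i, j in combinations(range(X.shape[1]), 2):
        together = X[:, i] * X[:, j]  # 1 only when both players are on the floor
        if together.sum() >= min_poss_together:
            pair_ids.append((i, j))
            pair_cols.append(together)
    X_aug = np.column_stack([X] + pair_cols) if pair_cols else X.copy()
    return X_aug, pair_ids


# Toy data: players 0 and 1 share two possessions; players 0 and 2 share one.
toy_X = np.array([
    [1, 1, 0],
    [1, 1, 0],
    [1, 0, 1],
])
toy_X_aug, toy_pairs_kept = add_pair_columns(toy_X, min_poss_together=2)
```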

Limitations. Identifiable pairs need a meaningful number of possessions together; one-game cameos shrink to zero. Pair coefficients are still correlational, not causal, and can't separate "lineup synergy" from "shared strategy" or "shared coach."

QR-DQN Skill Advantage

What it is. A distributional-RL companion to the Q-Rating model. For every offensive event, we predict the per-event reward distribution above a no-actor baseline Q(s, a, teammates, defenders, context). The per-player skill advantage is the mean of that residual: the points per event the actor adds beyond what a league-average player would have produced in the same situation. The 95% CI shown in the Player Profile is a proper SE-of-the-mean confidence interval, not the spread of individual outcomes.

How it's computed. Two-stage: first a baseline mean-Q model is trained with the actor masked out of the offensive lineup; then a Quantile-Regression DQN with 51 quantile heads is fit (pinball Huber loss) on advantage = actual return − baseline prediction. Per-player aggregation averages the distributional mean across each player's events.
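
The pinball Huber loss has a standard form in the QR-DQN literature, sketched here in NumPy for a single scalar target. Quantile levels placed at bin midpoints and the threshold `kappa=1.0` are the usual defaults and are assumptions here; the site states only that 51 quantile heads are used.

```python
import numpy as np


def quantile_huber_loss(pred_quantiles, target, kappa=1.0):
    """Quantile (pinball) Huber loss of predicted quantiles against one target."""
    pred = np.asarray(pred_quantiles, dtype=float)
    n = pred.size
    taus = (np.arange(n) + 0.5) / n  # quantile levels at bin midpoints
    u = target - pred                # residual per quantile head
    abs_u = np.abs(u)
    # Huber: quadratic near zero, linear in the tails.
    huber = np.where(abs_u <= kappa, 0.5 * u ** 2, kappa * (abs_u - 0.5 * kappa))
    # Asymmetric weight: under-prediction costs tau, over-prediction costs 1 - tau.
    weight = np.abs(taus - (u < 0).astype(float))
    return float(np.mean(weight * huber / kappa))


# 51 heads all predicting 0.0 against an observed advantage of 0.3.
toy_loss = quantile_huber_loss(np.zeros(51), target=0.3)
```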

Strengths. Interpretable units (pts/event) and clean residualization that removes the action-outcome variance dominating a raw RTG estimate. Top players by mean advantage are consistently the league's high-usage per-event creators.

Limitations. Per-event efficiency systematically undersells cumulative creators (Jokić, Giannis show small positives) because action-by-action credit can't capture possession-spanning value the way RAPM does. Best read alongside RAPM, not as a replacement.

Shot Quality (xFG)

What it is. A LightGBM classifier that estimates expected FG% for each shot given the shot location, action type, and shot-clock state. Per-player it aggregates to two axes: Shot Making (actual points minus expected points, how much the player beats their location-baseline) and Shot Selection (mean xFG per attempt, how high-quality the shots they take are on average).

How it's computed. Gradient-boosted trees trained on every field-goal attempt with the shot's location coordinates, primary action bucket, and clock features. Per-shot xFG is used to compute per-player pts and selection aggregates, then compared against actual pts scored on those shots.

How to read it. Shot Making of +150 pts means the player scored 150 more points on their attempts than xFG expected. Shot Selection of 0.55 means the average shot they attempted had a 55% baseline FG probability. Two axes let a low-volume rim finisher (high selection, modest making) and a high-volume creator taking hard shots (lower selection, positive making) end up on different quadrants.
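
The two axes reduce to simple sums once per-shot xFG is available. In this sketch the tuple layout and the function name are hypothetical, and the xFG values are toy numbers rather than model output.

```python
def shot_aggregates(shots):
    """Aggregate per-shot xFG into Shot Making and Shot Selection.

    shots: list of (xfg, shot_value, made) tuples, where xfg is the predicted
    make probability, shot_value is 2 or 3, and made is a bool.
    """
    expected_pts = sum(xfg * value for xfg, value, _ in shots)
    actual_pts = sum(value for _, value, made in shots if made)
    return {
        "shot_making": actual_pts - expected_pts,                    # points above expectation
        "shot_selection": sum(xfg for xfg, _, _ in shots) / len(shots),  # mean xFG per attempt
    }


toy_shots = [(0.5, 2, True), (0.4, 3, False), (0.6, 2, True)]
toy_axes = shot_aggregates(toy_shots)
```

Here the player scored 4 points against 3.4 expected, so Shot Making is +0.6, and the average attempt had a 50% baseline make probability.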

Limitations. The model doesn't see defender proximity or contested-vs-open state (no tracking data). That context lives in the Possession Value transformer instead.

Possession Value Transformer

What it is. A prefix-only transformer over possession event sequences that predicts the expected reward of the possession-ending event given every event that preceded it (dribbles, screens, passes, off-ball actions). Publishes per-player Shot Luck as actual reward minus predicted reward.

How it's computed. The transformer reads the sequence of events in a possession and produces an expected reward for the final event conditioned on the prefix. Actor attribution walks back from the possession end to find the shot-taker, so rebounds pass credit through to the missed shot's shooter and and-one free throws roll into the generating shot. A calibration constant is applied at scoring time to keep the metric zero-centered on possession totals, since the training target was last-event reward.

How to read it. Positive Shot Luck = player hit shots or avoided turnovers better than the model predicted given the possession context. Distinct from xFG because the model sees the full sequence of dribbles/screens/passes, not just where the shot went up. A contested pull-up after 22 seconds of dribbling and a wide-open catch-and-shoot get different difficulty estimates.

Limitations. Currently trained on 2024-25 only; multi-year retrain queued. The metric captures per-possession-context beat over baseline but doesn't distinguish shot-making from decision-making the way OPE does.

Play Types (Hierarchical Options)

What it is. Per-player distribution over 12 offensive "options" inferred heuristically from the action stream: pnr_handler, pnr_finisher, isolation, post_up, cut, putback, spot_up, off_screen, transition, free_throw, rebound, other. Tells you what a player does, not how well they do it. Click a player for offensive distribution + per-option success (pts/action) + defensive rates per 100 def-poss (steals, blocks, fouls).

How it's computed. Each event is labeled by a rule combining action bucket, is_assisted, possession step, and seconds-into-possession. Isolation vs PnR handler split uses the assisted flag (assisted pullup/drive → PnR; unassisted → isolation). Defensive credits are computed separately: stealer/blocker via player3_id from steal/block events, normalized by the player's defensive possessions on the floor.

Limitations. Heuristic, not learned. Without tracking data, "iso vs PnR" is an imperfect proxy; missed shots default to isolation since assist credits only attach to makes. Transition uses a clock-only proxy (no start_type available in the possessions parquet). Use it for shape/role inference, not for skill grading.

Win Probability + WPA

What it is. Win Probability is a gradient-boosted classifier that predicts the home team's chance of winning from (period, clock, score margin, possession side, off/def lineups). Win Probability Added (WPA) is the per-event change in WP attributable to the actor, split into offensive and defensive halves and broken down by action bucket. Powers the Clutch / WPA page.

Player Projections

What it is. 1-, 2-, and 3-year-ahead forecasts of each per-100 metric (points, assists, rebounds, etc.) using DARKO-style aging curves and a Kalman-flavored update that blends a player's own trajectory with a prior built from comparable players at the same age. Surfaced on the Projections page and in the Player Profile career trajectory chart.

Playstyle vectors

What it is. Each player's action-frequency vector projected into a low-dim space, then used to find their nearest neighbors in style (not in impact). Useful for "who plays like X?" comparisons. Distinct from the Play Types view: that one shows the raw distribution, this one shows similarity.

Revealed Preferences (Inverse RL)

What it is. For each (player, action, game-context) triple, the log-ratio of the player's action frequency against the league baseline. The actions a player chooses reveal their inferred reward weights, e.g. DeMar DeRozan's strong midrange preference or Curry's elevated clutch-3 preference. State contexts: clutch, blowout, early-clock, late-clock, transition, normal.

Lineup Optimization

What it is. For each team, exhaustively scores every 5-man lineup over its rotation players and returns the best by predicted net rating.

How it's computed. Lineup score = sum of the 5 players' Q-Rating presence (Off − Def, neg-good convention) + offensive pair chemistry summed over all 10 pairs − defensive pair chemistry summed over all 10 pairs. Q-Rating presence is the leave-one-out swap on the trained Q model with attention pooling over lineups (closer analog to RAPM than the action decomposition).
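
The exhaustive search is small enough to brute-force, since a rotation of 10 players yields only 252 five-man combinations. The sketch below assumes a precomputed net presence per player (Off − Def) and pair-chemistry dictionaries keyed by `frozenset`; all names and numbers are hypothetical.

```python
from itertools import combinations


def best_lineup(rotation, presence_net, off_pair, def_pair):
    """Score every 5-man combination of the rotation and return the best one."""

    def score(lineup):
        players = sum(presence_net[p] for p in lineup)
        # Offensive chemistry adds; defensive chemistry subtracts (neg-good).
        pairs = sum(
            off_pair.get(frozenset(pr), 0.0) - def_pair.get(frozenset(pr), 0.0)
            for pr in combinations(lineup, 2)
        )
        return players + pairs

    return max(combinations(rotation, 5), key=score)


toy_rotation = ["a", "b", "c", "d", "e", "f"]
toy_presence = {"a": 2.0, "b": 2.0, "c": 2.0, "d": 2.0, "e": 1.0, "f": 0.5}
# Player f is weaker alone than e, but has strong offensive chemistry with a.
toy_best = best_lineup(toy_rotation, toy_presence, {frozenset({"a", "f"}): 1.0}, {})
```

In the toy data the pair bonus lifts the lineup with f (8.5 + 1.0) above the lineup with e (9.0), which is the kind of swap the chemistry term exists to surface.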

Team Q-Rating (detrended)

What it is. A team-season summary of the roster's Q-Rating quality, centered so the sign is meaningful. Raw team Q-Rating is a possession-weighted sum of every roster player's Q-Rating (action + presence). Detrended = raw minus the season league mean. Positive means the roster grades above the league that year, negative means below.

How it's computed. For each (team, season): weighted mean over the roster of each player's total-impact (action + presence − opponent-side presence, good-direction on both sides), weighted by that player's possessions with that team that year. The result is multiplied by 5 to put it on per-100-poss scale (five players share each possession). Detrending subtracts the season-wide league mean so ~50% of teams end up on each side of zero.
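
The aggregation and detrending steps can be sketched as two small functions. The input layout (a list of per-player total impact and possessions) and the function names are assumptions; total impact is taken as already combined and in good-direction sign.

```python
def team_q_rating(roster):
    """Possession-weighted mean of player total impact, scaled by 5.

    roster: list of (total_impact, possessions_with_team) tuples.
    """
    total_poss = sum(poss for _, poss in roster)
    weighted_mean = sum(impact * poss for impact, poss in roster) / total_poss
    return 5.0 * weighted_mean  # five players share each possession


def detrend(raw_by_team):
    """Subtract the season league mean so ratings are centered on zero."""
    league_mean = sum(raw_by_team.values()) / len(raw_by_team)
    return {team: raw - league_mean for team, raw in raw_by_team.items()}


toy_raw = team_q_rating([(1.0, 3000), (0.0, 1000)])
toy_detrended = detrend({"AAA": 3.0, "BBB": 1.0})
```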

How to read it. Semantically parallels Team RAPM (already zero-centered). +2 means "roster grades ~2 pts/100 poss above the league that year," −3 means "3 below." The trajectory chart on the team page shows total plus off/def breakdown, with Team RAPM overlaid as a reference line.

Limitations. Aggregates the whole roster equally by possessions played, so it doesn't distinguish starter-heavy vs bench-heavy usage. Presence uses career-average magnitudes broadcast across seasons, so team-season snapshots reflect the long-run quality of the players on the roster rather than that specific year's form.

Team Action Profile

What it is. Per (team, season, action bucket) rate of how much value each team generated from each of the ~30 offensive and defensive action buckets, centered against the season league mean per bucket. Surfaces two views on the team page: "Season strengths" ranks the current-season buckets from most above to most below league, and "By season" shows a heatmap of every season the team has on record.

How it's computed. For each (team, season, bucket): sum every roster player's per-bucket Q-Rating coefficient weighted by their possessions with that team that year, divided by total team possessions. Detrend by subtracting the season league mean per bucket. Green cells indicate the team was above league on that bucket that year; red indicates below.

How to read it. Bucket rows in the heatmap are sorted by how identity-defining the bucket is over the team's history, so signature strengths and weaknesses float to the top. Low-volume specialty buckets (3pt corner pullup, midrange post hook, foul-take, etc.) swing more than high-volume ones (rim finishes, midrange catches).

Limitations. Only sees action-level credit, not presence. Two teams that earn the same action-level credit but differ in rim protection would look identical here even though their defenses are not.

▶

Off-Policy Evaluation (OPE)

What it is. A per-event marginal: how much value did this player's actual decisions add above what the league-average action distribution would produce in the same state context? Three estimators are reported: importance sampling (IS), self-normalized IS (WIS), and doubly-robust (DR). The headline number is the DR marginal in points per event.

How it's computed. For each event we have (state, observed action, reward). Behavior policy π_b(a|s,p) is a learned neural classifier. Target policy π_target(a|s) is the state-conditional empirical league action frequency (no player conditioning), bucketed into six contexts: clutch / blowout / early-clock / late-clock / transition / normal. IS weight = π_target(a|s) / π_b(a|s,p), clipped at 20. DR baseline = Σ_a π_target(a|s) × Q_calibrated(s, a, neutral_player), where Q is calibrated per action bucket via post-hoc OLS so that each bucket's predicted Q matches the observed expected event reward. Marginal = observed_avg_reward − DR_value. The 95% CI comes from a 200-sample bootstrap.
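
As a rough illustration of how these pieces fit together, the sketch below computes IS, WIS, and a DR value on two toy events. It assumes the textbook doubly-robust form (baseline plus the importance-weighted residual r − Q(s, a)); the page does not spell out how the weight enters DR_value, so treat that step, the event dictionary layout, and all function names as assumptions.

```python
import random


def is_weight(pi_target, pi_behavior, clip=20.0):
    """Importance weight pi_target / pi_b, clipped from above."""
    return min(pi_target / pi_behavior, clip)


def ope_estimates(events, clip=20.0):
    """events: dicts with a (observed action), r (reward), pi_b (behavior prob of a),
    pi_t ({action: target prob}), q ({action: calibrated Q for a neutral player})."""
    n = len(events)
    w = [is_weight(e["pi_t"][e["a"]], e["pi_b"], clip) for e in events]
    weighted_reward = sum(wi * e["r"] for wi, e in zip(w, events))
    dr_total = 0.0
    for wi, e in zip(w, events):
        # DR baseline: expected calibrated Q under the target policy.
        baseline = sum(p * e["q"][a] for a, p in e["pi_t"].items())
        # Assumed standard DR correction: weighted residual on the observed action.
        dr_total += baseline + wi * (e["r"] - e["q"][e["a"]])
    observed = sum(e["r"] for e in events) / n
    dr_value = dr_total / n
    return {
        "IS": weighted_reward / n,
        "WIS": weighted_reward / sum(w) if sum(w) > 0 else 0.0,
        "DR": dr_value,
        "marginal": observed - dr_value,
    }


def bootstrap_ci(events, n_boot=200, seed=0):
    """Percentile 95% interval for the marginal from n_boot resamples of the events."""
    rng = random.Random(seed)
    vals = sorted(
        ope_estimates([rng.choice(events) for _ in events])["marginal"]
        for _ in range(n_boot)
    )
    return vals[(n_boot * 25) // 1000], vals[(n_boot * 975) // 1000 - 1]


# Two made-up events over a two-action vocabulary.
events = [
    {"a": "shot", "r": 2.0, "pi_b": 0.5,
     "pi_t": {"shot": 0.25, "pass": 0.75}, "q": {"shot": 1.0, "pass": 0.0}},
    {"a": "pass", "r": 0.0, "pi_b": 0.5,
     "pi_t": {"shot": 0.5, "pass": 0.5}, "q": {"shot": 1.0, "pass": 0.5}},
]
est = ope_estimates(events)
ci_low, ci_high = bootstrap_ci(events)
print(est, ci_low, ci_high)
```

The clip matters mostly for rare actions: a behavior probability of 0.01 against a target probability of 0.5 would give a raw weight of 50, which `is_weight` caps at 20.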

What this measures. Per-event scoring efficiency relative to league policy in the same situation. Positive = player's actual decisions outperform league defaults in their state mix.

Limitations. Per-event efficiency, not total impact: pure passers (Jokić, CP3) and defensive stars rank lower than their RAPM would suggest because the metric only measures the per-shot/per-action value-add, not playmaking or defense. Low-event samples (rookie years, partial seasons) have wide CIs; for those, the CI lower-bound column shows whether the marginal is statistically distinguishable from zero.

▶

Decision Quality

What it is. Per-player aggregate of how often the action they chose matches the action with the highest predicted Q from the Q-Rating model. Three views: top-1 rate (chose the best action), top-3 rate (chose one of the top three), and avg rank percentile (1.0 = always optimal, 0.0 = always worst).

How it's computed. For each event in which the actor took action a in state s, we evaluate Q(s, a', actor) for every valid action a' in the action vocabulary, compute the 0-based rank of the chosen action among all alternatives (0 = highest Q), and report the percentile = 1 − rank/(N_actions − 1). Per-player aggregation averages over all their decisions.
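
The ranking step is small enough to show directly. This sketch assumes a 0-based rank (so the best action scores 1.0), breaks ties by dictionary order, and uses made-up Q values; `decision_quality` and `player_summary` are illustrative names only.

```python
def decision_quality(q_values, chosen):
    """q_values: {action: Q(s, a', actor)} over valid actions; chosen: the action taken."""
    ranked = sorted(q_values, key=q_values.get, reverse=True)
    rank = ranked.index(chosen)  # 0 = highest predicted Q
    n = len(ranked)
    percentile = 1.0 - rank / (n - 1) if n > 1 else 1.0
    return {"top1": rank == 0, "top3": rank < 3, "percentile": percentile}


def player_summary(decisions):
    """decisions: list of (q_values, chosen) pairs for one player; averages the three views."""
    results = [decision_quality(q, a) for q, a in decisions]
    n = len(results)
    return {
        "top1_rate": sum(r["top1"] for r in results) / n,
        "top3_rate": sum(r["top3"] for r in results) / n,
        "avg_percentile": sum(r["percentile"] for r in results) / n,
    }


# Made-up Q values for one state with five valid actions.
q = {"rim": 1.2, "corner3": 1.1, "mid": 0.8, "pass": 0.9, "iso": 0.7}
one = decision_quality(q, "pass")          # "pass" is third best, so rank 2 of 0..4
summary = player_summary([(q, "pass"), (q, "rim")])
print(one, summary)
```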

Limitations. "Best" is defined by the Q model, so this measures agreement with Q, not absolute optimality. Players whose role pushes them into low-Q decisions (e.g. forced late-shot-clock isolations) will score lower even if their decision was the best of bad options.

▶

Behavior Policy (π_b)

What it is. A neural classifier that learns π_b(a | state, player): which action does this specific player tend to choose in this specific state? Used as the denominator in the OPE importance-sampling weights.

How it's computed. Same lineup-attention encoder as the Q model, but the output head is a softmax over the action vocabulary instead of a scalar Q. Trained on cross-entropy against the observed action; achieves ~38% top-1 and ~68% top-3 accuracy on held-out events.
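
For readers unfamiliar with the reported metrics, here is a small stdlib-only sketch of a softmax head's cross-entropy loss and top-k accuracy. The logits and observed actions are invented, and none of this is the site's PyTorch code; it only shows what "top-1" and "top-3" mean.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one row of action logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]


def cross_entropy(probs, observed):
    """Negative log-probability assigned to the observed action index."""
    return -math.log(probs[observed])


def top_k_accuracy(prob_rows, observed, k):
    """Share of events whose observed action is among the k most probable actions."""
    hits = 0
    for probs, action in zip(prob_rows, observed):
        top = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)[:k]
        hits += action in top
    return hits / len(observed)


# Three made-up events over a four-action vocabulary.
logits = [[2.0, 1.0, 0.0, -1.0], [0.0, 0.5, 3.0, 1.0], [1.0, 0.9, 0.8, 2.0]]
observed = [0, 3, 2]
probs = [softmax(row) for row in logits]
loss = sum(cross_entropy(p, a) for p, a in zip(probs, observed)) / len(observed)
top1 = top_k_accuracy(probs, observed, 1)
top3 = top_k_accuracy(probs, observed, 3)
print(loss, top1, top3)
```

The probability the classifier assigns to the observed action is the quantity that ends up in the denominator of the OPE importance weights.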

Why we care. Without a learned π_b, IS weights would have to fall back to a uniform action prior, which inflates variance dramatically for high-volume players whose action mix is far from uniform.

▶

How to read the numbers

RAPM-style numbers are in points per 100 possessions. A +5 RAPM player is roughly +5 net points per 100 possessions compared to a league-average player on a neutral team. League scoring runs ~115 per 100, so +5 is genuinely elite (top-10ish in any given season). For defense, more negative is better: a −3 DRAPM player concedes 3 fewer points per 100 than average. QR-DQN skill advantage is in different units (points per event, not per 100 possessions). Play Types percentages are shares of a player's offensive actions; defensive rates are per 100 defensive possessions they were on the floor for.

▶

Data & pipeline

NBA play-by-play covering the 2000-01 through 2025-26 regular seasons, 5.49M possessions across 30,815 games, parsed into per-event tuples with the 5-on-5 lineup, action, actor, and reward. Models are implemented in scikit-learn (ridge with box-score prior, K-fold CV), PyTorch (Q-network with lineup attention, distributional QR-DQN, lineup GNN, prefix transformer for possession value, Kalman filter for career trajectories), LightGBM (win probability, expected FG%), and a closed-form Gaussian-Gaussian Bayesian solve for the RAPM prior step. Ratings regenerate end-to-end from raw play-by-play, with no hand curation.
