About This Dataset
USMLEPredictor.com's prediction model is built on 5,039 verified student score reports collected between January 2022 and March 2026. This is the largest independently collected dataset of USMLE practice-to-actual score correlations available through a free public tool.
How Scores Are Collected and Verified
Score reports are submitted voluntarily by USMLE test-takers after completing their exam. To ensure data integrity, every submission goes through the following verification process:
- Document verification: Submitters are required to upload an official NBME or USMLE score document. Screenshots, PDFs, and MyUSMLE portal exports are accepted and cross-checked for formatting consistency with official score report templates.
- Duplicate screening: Submissions are screened by IP address, email, and score pattern to identify and remove duplicate entries.
- Outlier review: Statistical outliers (scores that deviate more than 3 standard deviations from the mean for a given NBME form) are manually reviewed before inclusion.
- Anonymization: All personal identifiers (name, medical school, AAMC ID) are removed at the point of submission. USMLEPredictor.com stores only the practice exam score(s), final USMLE score, and exam date.
No student data is sold, shared, or used for any purpose other than improving prediction accuracy.
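The outlier review step above can be sketched in a few lines. This is a minimal illustration, not our production code: the function name `flag_outliers`, the `(form, score)` tuple layout, and the use of the population standard deviation are all assumptions made for the example. It only flags entries for manual review; it does not remove them.

```python
from collections import defaultdict
from statistics import mean, pstdev


def flag_outliers(submissions, threshold=3.0):
    """Return (form, score) pairs lying more than `threshold` SDs from their form's mean.

    `submissions` is a list of (form_name, practice_score) tuples; this layout
    is an assumption made for illustration.
    """
    scores_by_form = defaultdict(list)
    for form, score in submissions:
        scores_by_form[form].append(score)

    # Compute the mean and population SD once per form.
    stats = {form: (mean(s), pstdev(s)) for form, s in scores_by_form.items()}

    flagged = []
    for form, score in submissions:
        form_mean, form_sd = stats[form]
        # A form with zero spread cannot produce a z-score, so skip it.
        if form_sd > 0 and abs(score - form_mean) / form_sd > threshold:
            flagged.append((form, score))
    return flagged


# Hypothetical data: twenty typical scores and one implausible entry.
sample = [("NBME Form 14", 240)] * 20 + [("NBME Form 14", 150)]
print(flag_outliers(sample))
```

Note that with very few submissions for a form, no single score can exceed 3 standard deviations, so a check like this only becomes meaningful once a form has a reasonable number of reports.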
Who Built This Tool
USMLEPredictor.com was developed by a team of medical educators and data scientists with direct experience in USMLE preparation and medical education research. The prediction algorithm was designed to address a well-documented gap: existing score prediction methods (Reddit spreadsheets, estimates) rely on self-reported, unverified data with no statistical rigor.
Our team includes individuals with backgrounds in:
- Medical education and USMLE preparation coaching
- Statistical modeling and machine learning
- Clinical medicine (MD-level contributors)
The accuracy data on this page is reviewed quarterly and updated as new NBME forms are released and additional score reports are collected.
Our 3-Method Ensemble Algorithm
USMLEPredictor.com does not use simple linear regression. We combine three independent modeling approaches to maximize accuracy across the full score range:
| Method | Weight | Purpose |
|---|---|---|
| K-Nearest Neighbors (KNN) | 40% | Accounts for score clustering |
| Weighted Average Model | 35% | Applies form-specific weighting |
| Per-Form Linear Regression | 25% | Handles outlier correction |
Combining these methods outperforms any single approach, particularly for students in the 220–250 range where score distributions are densest and single-method models show the highest variance.
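As a rough sketch of how the weights in the table combine, the example below takes one prediction from each method and returns their weighted sum. The weights come from the table above; the function name `ensemble_prediction`, the dictionary keys, and the sample predictions are hypothetical, and the three underlying models are not shown.

```python
# Weights taken from the table above; the keys are illustrative names.
WEIGHTS = {
    "knn": 0.40,
    "weighted_average": 0.35,
    "linear_regression": 0.25,
}


def ensemble_prediction(predictions):
    """Combine per-method score predictions into one weighted estimate."""
    missing = WEIGHTS.keys() - predictions.keys()
    if missing:
        raise ValueError(f"missing predictions for: {sorted(missing)}")
    return sum(WEIGHTS[name] * predictions[name] for name in WEIGHTS)


# Hypothetical outputs from the three methods for one student.
example = {"knn": 250, "weighted_average": 246, "linear_regression": 244}
print(round(ensemble_prediction(example), 1))
```

Because the weights sum to 1, the combined estimate always falls between the lowest and highest of the three individual predictions.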
Practice Test Accuracy & Correlation Data
The following correlation data is derived from the full 5,039-report dataset. Pearson's r measures the strength of the linear relationship between practice exam score and actual USMLE Step 2 CK score. Precision (±) represents the range within which 80% of actual scores fell relative to prediction.
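For readers who want to see how these two statistics are defined, here is a minimal sketch using only the Python standard library. The function names and the sample numbers are hypothetical, and reading "precision" as the smallest symmetric half-width that covers 80% of absolute errors is our interpretation of the definition above, not a description of the exact production calculation.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def precision_band(predicted, actual, coverage=0.80):
    """Smallest half-width h such that at least `coverage` of |actual - predicted| <= h."""
    errors = sorted(abs(a - p) for p, a in zip(predicted, actual))
    # Round before ceil to guard against floating-point noise in coverage * n.
    k = math.ceil(round(coverage * len(errors), 9))
    return errors[k - 1]


# Hypothetical practice and actual scores for five students.
practice = [230, 240, 250, 255, 262]
actual = [233, 238, 252, 251, 266]
print(round(pearson_r(practice, actual), 3))
print(precision_band(practice, actual))
```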
Primary Predictors (Highest Accuracy)
NBME Form 14
r = 0.92
Notes: Currently the most predictive NBME form available for 2025–2026 test-takers. Form 14 closely mirrors the current Step 2 CK item difficulty distribution.
UWSA 2
r = 0.89
Notes: Highly reliable but consistently overpredicts actual performance by an average of 3.2 points. Our model applies a downward correction factor for UWSA 2 inputs. Students should interpret a raw UWSA 2 score with this tendency in mind.
NBME Form 13
r = 0.88
Notes: Strong predictor, slightly less precise than Form 14. Recommended as a secondary data point alongside Form 14 or UWSA 2.
Free 120
r = 0.85
Notes: Lower precision than NBME forms because the Free 120 was designed to assess item type familiarity rather than content mastery. Best used as a final verification of testing stamina and clinical reasoning under exam conditions, not as a primary score predictor.
NBME Score Conversion & Form-Specific Notes
Modern NBME Forms (Recommended for Current Test-Takers)
- NBME Form 32: The newest release. Shows high fidelity with the current Step 2 CK question style and difficulty curve.
- NBME Form 31: Among the strongest predictors for 2024–2026 test-takers. Precision: ±5 points. Highly recommended.
- NBME Form 30 & 29: Reliable predictors. May slightly underpredict actual performance by 2–5 points due to minor item difficulty differences from current exam versions.
Legacy Forms (Useful for Trend Tracking)
- NBME Forms 9–12: Remain useful for broad benchmarking during early dedicated study. Not recommended as primary predictors for current exam dates due to evolving question style.
- NBME Forms 13–16: Excellent for tracking score trajectory over time and building exam stamina. Precision is lower than modern forms when used in isolation.
UWSA 2 Deep Dive: Accuracy in the 2025–2026 Cohort
UWSA 2 is the most widely used UWorld self-assessment for Step 2 CK prediction. In our 2025–2026 cohort (n = 1,847 submissions with UWSA 2 data):
- 74% of students scored within 5 points of their UWSA 2 result on actual Step 2 CK
- 89% of students scored within 8 points of their UWSA 2 result
- Average overprediction: +3.2 points (UWSA 2 tends to be slightly easier than the actual exam)
- Standard deviation of prediction error: 4.1 points
Our model applies a -3 point correction factor to all UWSA 2 inputs, which reduces prediction error by approximately 28% compared to using the raw UWSA 2 score directly.
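The effect of a fixed correction can be illustrated with a short sketch. The -3 point value comes from the text above; everything else (the function names, the four sample score pairs, and the use of mean absolute error as the error measure) is assumed for illustration. The toy numbers are not drawn from our dataset and are not meant to reproduce the 28% figure.

```python
UWSA2_CORRECTION = -3  # points, as stated in the text above


def corrected_uwsa2(raw_score):
    """Apply the fixed downward correction to a raw UWSA 2 score."""
    return raw_score + UWSA2_CORRECTION


def mean_abs_error(predicted, actual):
    """Average absolute gap between predicted and actual scores."""
    return sum(abs(a - p) for p, a in zip(predicted, actual)) / len(actual)


def share_within(predicted, actual, tolerance):
    """Fraction of students whose actual score is within `tolerance` points."""
    hits = sum(1 for p, a in zip(predicted, actual) if abs(a - p) <= tolerance)
    return hits / len(actual)


# Hypothetical raw UWSA 2 scores and actual Step 2 CK scores.
raw = [250, 245, 260, 238]
actual = [247, 243, 255, 236]
corrected = [corrected_uwsa2(score) for score in raw]

print(mean_abs_error(raw, actual), mean_abs_error(corrected, actual))
print(share_within(raw, actual, 5))
```

In this toy sample every raw score overpredicts, so subtracting 3 points lowers the average error. On real data the gain depends on how consistently UWSA 2 overpredicts across students.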
How USMLEPredictor Compares to Other Methods
| Method | Data Source | Sample Size | Precision | Last Updated |
|---|---|---|---|---|
| USMLEPredictor | Verified score reports | 5,039 | ±5 points (avg) | March 2026 |
| Reddit Predictor | Crowdsourced | ~300–500 | ±8–12 points | 2022 (stagnant) |
| Score | Internal | Undisclosed | ±8–10 points | Undisclosed |
The Reddit predictor spreadsheet, while widely shared, has several well-documented limitations: self-reported scores are not verified, the sample size is small and geographically concentrated, and the spreadsheet has not been updated since 2022.
Our model accounts for NBME form difficulty drift over time; the Reddit spreadsheet does not.
Limitations and Honest Disclosures
We believe transparency about what our model cannot do is as important as documenting what it can.
This tool has the following limitations:
- Predictions are statistical estimates, not guarantees. Individual scores will always vary. Our precision ranges describe where 80% of users fall — 20% of users will score outside the stated range.
- Accuracy decreases for students with fewer than two practice exams. Our model performs best when given two or more data points (e.g., NBME + UWSA 2). Single-input predictions carry higher variance.
- Very high scorers (265+) and very low scorers (below 215) show slightly lower predictive accuracy. Score distributions at the extremes of the USMLE scale are less densely represented in our dataset.
- Newly released NBME forms have lower accuracy initially. When NBME releases a new form, we have limited data until sufficient submissions are collected. We note this clearly for each form.
- This tool is not affiliated with NBME or the USMLE program. We are an independent tool built on independently collected data. We do not have access to official NBME datasets.
- The model is currently trained on Step 2 CK data only. Step 1 and Step 3 predictors are in active development.
What Students Say
Predicted 258, actual 259. The NBME Form 14 and UWSA 2 weighted average was spot on throughout my dedicated period.
— MD Student, Verified Score Report
USMLEPredictor gave me the confidence to move my test date up two weeks. It is much faster than filling out a spreadsheet and the accuracy was within 3 points for me.
— IMG Applicant, Verified Score 252
Step 1 and Step 3 Predictors (Coming Soon)
We are actively collecting score reports and training models for:
- USMLE Step 1 Score Predictor: Focusing on PASS/FAIL outcome prediction and NBME self-assessment correlation for Step 1 forms.
- USMLE Step 3 Score Predictor: Analyzing CCS (Clinical Case Simulation) performance patterns and specialty-specific scoring trends.
If you have completed Step 1 or Step 3 and would like to contribute to this dataset, please submit your score report through our contribution form.
Authoritative Sources and Research Basis
Our algorithm design and validation methodology draw on the following published sources:
- USMLE Step 2 CK Bulletin of Information — Official score interpretation guidelines and scale documentation
- NBME Performance Data Reports — Historical form difficulty data and item response theory (IRT) benchmarks used for form-to-form calibration
- Journal of Medical Education — Peer-reviewed studies on the predictive validity of Free 120 and UWSA self-assessments relative to actual USMLE performance
- National Board of Medical Examiners Technical Reports — IRT methodology documentation for USMLE score scaling
Frequently Asked Questions
How often is the prediction model updated?
The model is reviewed quarterly. When new NBME forms are released, form-specific regression parameters are updated as soon as sufficient data (minimum 100 score reports per form) is collected.
Can I submit my score to help improve the model?
Yes. After your exam, you can submit your practice scores and official result through our score contribution form. All submissions are anonymized and used solely to improve prediction accuracy.
Is my data private?
All personal identifiers are removed at the point of submission. We store only the practice exam score, actual USMLE score, and exam month/year. No data is sold or shared with third parties.
What if my score was very different from the prediction?
Approximately 20% of users score outside our stated precision range. If your result differed significantly, we encourage you to submit your data through the contribution form — these outlier cases are particularly valuable for improving model accuracy at the score extremes.