How the AMI Prevalence Score (P) Is Calculated
The AMI Prevalence Score (P) is the AMI's estimate of how widespread academic misconduct is in each country, expressed relative to the other countries in the index. This guide walks through how it is built: the six dimensions, the data sources, the weighting, and the rescaling.
TL;DR
The Prevalence Score is built from six dimensions (D1–D6), each scored on a 0–100 scale from live data or literature estimates, then weighted and combined into a country aggregate. Final scores are rescaled across the 39-country set so the lowest receives 5 and the highest receives 95.
The six dimensions
The P-Score is built from six dimensions of academic misconduct:
D1 Contract cheating
Demand for and incidence of paying someone else to complete assessed work. Primary data source: Google Trends queries for contract cheating keywords plus essay mill brand names. Secondary: McCabe / ICAI survey data where available.
D2 AI-generated submissions
Demand for and incidence of AI-generated work submitted as one's own. Primary data source: Google Trends queries for AI submission tools. Secondary: Guardian FOI data, Scarfe et al. (2024), institutional disclosures.
D3 Exam impersonation
Having someone else sit an examination. Data sources: published prosecution data, examination-authority statistics, peer-reviewed literature.
D4 Plagiarism
Submitting copied work without attribution. Primary data source: ICAI McCabe surveys (where country-specific data exists). Secondary: regional extrapolations and country-specific peer-reviewed studies.
D5 Collusion
Unauthorised collaboration on individual assessments. Primary data source: ICAI McCabe surveys. Secondary: regional patterns.
D6 Data fabrication
Fabricating or falsifying research data. Primary data source: Retraction Watch database, filtered to misconduct-linked retractions, normalised by publication volume from OpenAlex.
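As a rough illustration of what "normalised by publication volume" can look like for D6, the sketch below divides misconduct-linked retractions by total publications. The per-10,000 scaling, the function name `retractions_per_10k` and the counts are assumptions made for this example, not AMI data or the AMI's exact formula.

```python
def retractions_per_10k(misconduct_retractions: int, publications: int) -> float:
    """Misconduct-linked retractions per 10,000 publications (assumed scaling)."""
    if publications <= 0:
        raise ValueError("publications must be positive")
    return 10_000 * misconduct_retractions / publications


# Hypothetical counts, for illustration only: (retractions, publications).
counts = {
    "Country A": (120, 400_000),
    "Country B": (30, 50_000),
}

rates = {
    country: retractions_per_10k(retracted, published)
    for country, (retracted, published) in counts.items()
}

for country, rate in rates.items():
    print(f"{country}: {rate:.1f} per 10,000 publications")
```

Country B has fewer retractions in absolute terms but a higher rate, which is the reason for normalising by volume before comparing countries.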
Step 1 — Score each dimension
Each dimension is scored on a 0–100 scale where 100 represents the highest signal in the country set:
- Live-data dimensions (Retraction Watch, Google Trends, FOI) — scored directly from the data after normalisation by population or publication volume
- Survey dimensions (McCabe, ICAI) — scored from reported rates, with regional fallback for countries not in the survey
- Literature dimensions — country-specific peer-reviewed studies plus regional priors
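The regional fallback for survey dimensions can be sketched as follows. This guide does not specify how the fallback is computed, so the example assumes the simplest option: a country missing from the survey takes the mean of the surveyed countries in its region. The function `fill_with_regional_mean`, the region labels and the rates are all hypothetical.

```python
from statistics import mean
from typing import Optional


def fill_with_regional_mean(
    rates: dict[str, Optional[float]],
    regions: dict[str, str],
) -> dict[str, float]:
    """Replace missing survey rates with the mean of same-region countries.

    Assumption: the fallback is a plain regional mean. The AMI methodology
    may use a different rule.
    """
    by_region: dict[str, list[float]] = {}
    for country, rate in rates.items():
        if rate is not None:
            by_region.setdefault(regions[country], []).append(rate)

    filled: dict[str, float] = {}
    for country, rate in rates.items():
        if rate is not None:
            filled[country] = rate
            continue
        peers = by_region.get(regions[country])
        if not peers:
            raise ValueError(f"no surveyed country in region of {country}")
        filled[country] = mean(peers)
    return filled


# Hypothetical reported rates; None marks a country absent from the survey.
survey = {"Country A": 0.40, "Country B": 0.60, "Country C": None}
region_of = {"Country A": "R1", "Country B": "R1", "Country C": "R1"}

filled = fill_with_regional_mean(survey, region_of)
print(filled)
```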
Step 2 — Normalise
Dimension scores are normalised to 0–100 within the country set. On each dimension, the top-scoring country receives 100 and the lowest-scoring country receives 0.
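A minimal sketch of this normalisation, assuming plain min-max scaling (the text fixes only the endpoints, 0 and 100, so the linear mapping in between is an assumption). The function name `min_max_0_100` and the input values are hypothetical.

```python
def min_max_0_100(values: dict[str, float]) -> dict[str, float]:
    """Linearly map values so the minimum becomes 0 and the maximum 100."""
    lo = min(values.values())
    hi = max(values.values())
    if hi == lo:
        raise ValueError("all values are equal; the scale is undefined")
    return {k: 100 * (v - lo) / (hi - lo) for k, v in values.items()}


# Hypothetical pre-normalisation signals for one dimension.
signal = {"Country A": 2.0, "Country B": 5.0, "Country C": 8.0}

normalised = min_max_0_100(signal)
print(normalised)
```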
Step 3 — Aggregate
Dimension scores are combined into a country P-Score using dimension weights. The weights are documented in the methodology document — they reflect both the prevalence-level importance of each dimension and the data quality available.
The aggregation is:
> P_raw = Σ (w_i × D_i)
where w_i is the weight for dimension i and D_i is the normalised dimension score.
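The weighted sum can be sketched in a few lines. The weights below are placeholders chosen only so the example runs. The real v1.5 weights are in the methodology document and are not reproduced here. The assumption that the weights sum to 1 is also made only for this example.

```python
# Placeholder weights, NOT the AMI v1.5 weights (assumed to sum to 1).
WEIGHTS = {"D1": 0.20, "D2": 0.20, "D3": 0.10, "D4": 0.20, "D5": 0.10, "D6": 0.20}


def p_raw(scores: dict[str, float], weights: dict[str, float] = WEIGHTS) -> float:
    """Return the weighted sum of normalised dimension scores for one country."""
    if scores.keys() != weights.keys():
        raise ValueError("scores and weights must cover the same dimensions")
    return sum(weights[d] * scores[d] for d in weights)


# Hypothetical normalised dimension scores (0-100) for one country.
country_scores = {"D1": 100.0, "D2": 0.0, "D3": 0.0, "D4": 0.0, "D5": 0.0, "D6": 0.0}

raw = p_raw(country_scores)
print(f"P_raw = {raw:.1f}")
```

With these placeholder weights, a country that tops D1 and sits at the bottom of every other dimension gets a raw score equal to 100 times the D1 weight.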
Step 4 — Rescale
The raw aggregate scores are rescaled across the 39-country set so:
- Lowest-scoring country: P = 5
- Highest-scoring country: P = 95
This produces the published P-Score. The rescaling makes scores comparable across countries within the set.
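As an illustration, a plain linear rescale onto the 5–95 range looks like the sketch below. This is an assumption: the methodology document is the authority on the exact rescaling formula, which may differ from a simple linear map. The function name `rescale_5_95` and the raw values are hypothetical.

```python
def rescale_5_95(raw: dict[str, float], low: float = 5.0, high: float = 95.0) -> dict[str, float]:
    """Linearly map raw scores so the minimum lands on `low` and the maximum on `high`."""
    lo = min(raw.values())
    hi = max(raw.values())
    if hi == lo:
        raise ValueError("all raw scores are equal; cannot rescale")
    span = high - low
    return {c: low + span * (v - lo) / (hi - lo) for c, v in raw.items()}


# Hypothetical raw aggregate scores.
raw_scores = {"Country A": 10.0, "Country B": 30.0, "Country C": 50.0}

published = rescale_5_95(raw_scores)
print(published)
```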
What the rescaling means
A P-Score of 5 does not mean zero academic misconduct. It means the lowest estimated prevalence among the 39 countries currently scored. Canada's 4.90 is the floor of the current dataset, marginally below the nominal 5.
A P-Score of 95 does not mean every student cheats. It means the highest estimated prevalence among the 39 countries scored. China's 99.98 is the ceiling of the current dataset; it sits above the nominal 95 because of the specific rescaling formula.
Adding or removing countries from the index will shift the scale. Future versions with expanded coverage will produce different raw-to-published mappings.
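The effect of coverage changes can be seen with a toy example. It assumes a plain linear rescale onto 5–95, which may not be the exact AMI formula, and uses hypothetical raw scores. The helper `linear_rescale` is defined here only for the illustration.

```python
def linear_rescale(raw: dict[str, float], low: float = 5.0, high: float = 95.0) -> dict[str, float]:
    """Assumed linear map: minimum raw score -> low, maximum raw score -> high."""
    lo, hi = min(raw.values()), max(raw.values())
    return {c: low + (high - low) * (v - lo) / (hi - lo) for c, v in raw.items()}


# Hypothetical raw scores for a three-country set.
before = {"Country A": 10.0, "Country B": 30.0, "Country C": 50.0}

# Same countries, same raw scores, plus one new higher-scoring country.
after = {**before, "Country D": 70.0}

b_before = linear_rescale(before)["Country B"]
b_after = linear_rescale(after)["Country B"]
print(f"Country B before: {b_before:.1f}, after adding Country D: {b_after:.1f}")
```

Country B's raw score is unchanged, yet its published score drops because the top of the scale is now set by a different country.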
Data quality notes
Each country's P-Score carries a data quality flag (A, B, C) reflecting how much of the dimension data is from live, country-specific sources versus regional extrapolation or literature priors. Countries with 5 of 6 dimensions from live data receive the A flag.
See data quality flags explained for details.
Limitations and known issues
- Norway anomaly: Google Trends volume there includes academic and policy discussion, which the index reads as student demand. Documented in the methodology caveat section.
- Rescaling sensitivity: scores shift with country coverage changes.
- Survey age: McCabe data is from 2002–2015, so AI-era dynamics are not captured in the D4/D5 surveys.
- Detection-prevalence confounding: countries with stronger detection report more cases; this is addressed by the enforcement-detection correction documented in the methodology.
Sources
- AMI v1.5 methodology document
- Retraction Watch Database, Crossref/GitLab (2026)
- Google Trends API (2022–2026)
- ICAI / McCabe survey data
- OpenAlex publication counts
Full methodology | Download dataset
Frequently asked questions
What is the AMI Prevalence Score?
The Prevalence Score (P) is the AMI's estimate of how widespread academic misconduct is in a given country. It is built from six dimensions — contract cheating, AI submissions, exam impersonation, plagiarism, collusion, and data fabrication — weighted and combined into a country aggregate score on a 0–100 scale.
How are the six dimensions weighted?
Dimension weighting is documented in the methodology document. The weights reflect both the prevalence-level importance of each dimension (severity per unit of incidence) and the data quality available for each dimension. Higher-confidence dimensions with stronger live data sources receive slightly higher weight in the v1.5 weighting scheme.
Why is the P-Score rescaled to 5–95?
The rescaling places the lowest-scoring country in the current set at 5 and the highest at 95. This makes scores comparable across countries within the set but means scores are not absolute — adding or removing countries from coverage will shift the scale. A score near 5 indicates the lowest estimated prevalence among the countries scored, not zero misconduct.
How to cite this article
APA: Booth, F. (2026). How the AMI Prevalence Score (P) Is Calculated. Academic Misconduct Index. https://academicmisconductindex.com/blog/how-p-score-calculated
BibTeX: @misc{booth2026how, author={Booth, Francisco}, title={How the AMI Prevalence Score (P) Is Calculated}, year={2026}, url={https://academicmisconductindex.com/blog/how-p-score-calculated}}
Francisco Booth
Independent researcher, founder of the Academic Misconduct Index