Guide · 20 May 2026

How the AMI Prevalence Score (P) Is Calculated

The AMI Prevalence Score (P) is a relative index of the estimated prevalence of academic misconduct in each country. This guide walks through how it is built: the six dimensions, the data sources, the weighting, and the rescaling.

TL;DR

The Prevalence Score is built from six dimensions (D1–D6), each scored on a 0–100 scale from live data or literature estimates, then weighted and combined into a country aggregate. Final scores are rescaled across the 39-country set so the lowest receives 5 and the highest receives 95.

The six dimensions

The P-Score is built from six dimensions of academic misconduct:

D1 Contract cheating

Demand for and incidence of paying someone else to complete assessed work. Primary data source: Google Trends queries for contract cheating keywords plus essay mill brand names. Secondary: McCabe / ICAI survey data where available.

D2 AI-generated submissions

Demand for and incidence of AI-generated work submitted as one's own. Primary data source: Google Trends queries for AI submission tools. Secondary: Guardian FOI data, Scarfe et al. (2024), institutional disclosures.

D3 Exam impersonation

Having someone else sit an examination. Data sources: published prosecution data, examination-authority statistics, peer-reviewed literature.

D4 Plagiarism

Submitting copied work without attribution. Primary data source: ICAI McCabe surveys (where country-specific data exists). Secondary: regional extrapolations and country-specific peer-reviewed studies.

D5 Collusion

Unauthorised collaboration on individual assessments. Primary data source: ICAI McCabe surveys. Secondary: regional patterns.

D6 Data fabrication

Fabricating or falsifying research data. Primary data source: Retraction Watch database, filtered to misconduct-linked retractions, normalised by publication volume from OpenAlex.
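
As a rough illustration of the normalisation described for D6, the sketch below divides misconduct-linked retraction counts by publication volume. The country labels, the counts, and the per-10,000 scaling are invented for illustration; they are not AMI data, and the AMI's exact formula may differ.

```python
# Hypothetical inputs: misconduct-linked retraction counts and publication
# volumes per country. All numbers are invented for illustration.
retractions = {"A": 120, "B": 15, "C": 60}
publications = {"A": 400_000, "B": 30_000, "C": 600_000}


def retraction_rate(retracted: int, published: int, per: int = 10_000) -> float:
    """Retractions per `per` publications (the scaling factor is an assumption)."""
    if published <= 0:
        raise ValueError("publication volume must be positive")
    return retracted / published * per


# Rate per 10,000 publications for each hypothetical country.
rates = {c: retraction_rate(retractions[c], publications[c]) for c in retractions}
```

In this made-up example, country B has far fewer retractions than country A in absolute terms but the highest rate once publication volume is taken into account, which is the point of normalising.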

Step 1 — Score each dimension

Each dimension is scored on a 0–100 scale where 100 represents the highest signal in the country set:

Live-data dimensions (Retraction Watch, Google Trends, FOI) — scored directly from the data after normalisation by population or publication volume
Survey dimensions (McCabe, ICAI) — scored from reported rates, with regional fallback for countries not in the survey
Literature dimensions — country-specific peer-reviewed studies plus regional priors
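
The regional fallback for survey dimensions could look something like the sketch below. The guide does not say how the fallback is computed, so using the mean of surveyed countries in the same region is an assumption, and the rates and region labels are hypothetical.

```python
from statistics import mean

# Hypothetical survey rates (share of students self-reporting the behaviour).
# None marks a country that is not covered by the survey.
survey_rates = {"A": 0.32, "B": 0.41, "C": None, "D": 0.28}
regions = {"A": "north", "B": "south", "C": "north", "D": "north"}


def with_regional_fallback(rates, regions):
    """Fill missing rates with the mean of surveyed countries in the same region.

    The regional-mean rule is an assumption made for this sketch.
    """
    filled = {}
    for country, rate in rates.items():
        if rate is not None:
            filled[country] = rate
            continue
        peers = [r for c, r in rates.items()
                 if r is not None and regions[c] == regions[country]]
        if not peers:
            raise ValueError(f"no surveyed peers in region for {country}")
        filled[country] = mean(peers)
    return filled


filled = with_regional_fallback(survey_rates, regions)
```

Here country C has no survey data, so it inherits the average of A and D, the surveyed countries in its region. Countries with their own data are left unchanged.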

Step 2 — Normalise

Dimension scores are normalised to 0–100 within the country set. The top-scoring country on each dimension receives 100 and the lowest-scoring country receives 0.
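
A minimal sketch of this kind of normalisation, assuming a plain linear (min-max) mapping between the lowest and highest country. The guide fixes only the endpoints (0 and 100); the linear interpolation in between and the sample values are assumptions.

```python
def min_max_normalise(scores: dict[str, float]) -> dict[str, float]:
    """Map the lowest score to 0 and the highest to 100, linearly in between."""
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        raise ValueError("all countries have the same score; cannot normalise")
    return {c: (s - lo) / (hi - lo) * 100 for c, s in scores.items()}


# Hypothetical pre-normalisation dimension scores for three countries.
raw_dimension = {"A": 3.0, "B": 5.0, "C": 1.0}
normalised = min_max_normalise(raw_dimension)
```

With these inputs, B (the highest) maps to 100, C (the lowest) maps to 0, and A lands halfway at 50.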

Step 3 — Aggregate

Dimension scores are combined into a country P-Score using dimension weights. The weights are documented in the methodology document — they reflect both the prevalence-level importance of each dimension and the data quality available.

The aggregation is:

> P_raw = Σ (w_i × D_i)

where w_i is the weight for dimension i and D_i is the normalised dimension score.
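
The weighted sum can be sketched as follows. The weights below are placeholders chosen only so the arithmetic is easy to follow; the actual v1.5 weights are in the methodology document and are not reproduced here. That the weights sum to 1 is also an assumption.

```python
# Placeholder weights, NOT the published AMI weights.
WEIGHTS = {"D1": 0.20, "D2": 0.20, "D3": 0.10, "D4": 0.20, "D5": 0.10, "D6": 0.20}


def aggregate(dimension_scores: dict[str, float],
              weights: dict[str, float] = WEIGHTS) -> float:
    """Compute P_raw as the sum of w_i * D_i over the six dimensions."""
    if set(dimension_scores) != set(weights):
        raise ValueError("dimension scores and weights must cover the same dimensions")
    return sum(weights[d] * dimension_scores[d] for d in weights)


# Hypothetical normalised dimension scores for one country.
country = {"D1": 50.0, "D2": 100.0, "D3": 0.0, "D4": 25.0, "D5": 40.0, "D6": 75.0}
p_raw = aggregate(country)
```

For this invented country the terms are 10 + 20 + 0 + 5 + 4 + 15, giving a raw aggregate of 54.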

Step 4 — Rescale

The raw aggregate scores are rescaled across the 39-country set so:

Lowest-scoring country: P = 5
Highest-scoring country: P = 95

This produces the published P-Score. The rescaling makes scores comparable across countries within the set.
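
One simple way to perform such a rescaling is a linear map from the raw range onto 5–95, sketched below with invented raw scores. This is an assumption: the published formula evidently differs in detail, since the published extremes (4.90 and 99.98) are not exactly 5 and 95.

```python
def rescale(raw: dict[str, float], low: float = 5.0, high: float = 95.0) -> dict[str, float]:
    """Linearly map the lowest raw score to `low` and the highest to `high`."""
    lo, hi = min(raw.values()), max(raw.values())
    if hi == lo:
        raise ValueError("all raw scores are equal; cannot rescale")
    return {c: low + (s - lo) / (hi - lo) * (high - low) for c, s in raw.items()}


# Hypothetical raw aggregates for three countries.
p_raw = {"A": 20.0, "B": 54.0, "C": 71.0}
p_published = rescale(p_raw)
```

Under this linear sketch, A receives 5, C receives 95, and B falls at 65, two thirds of the way up the raw range.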

What the rescaling means

A P-Score of 5 does not mean zero academic misconduct. It means the lowest estimated prevalence among the 39 countries currently scored. Canada's 4.90, fractionally below the nominal 5, is the floor of the current dataset.

A P-Score of 95 does not mean every student cheats. It means the highest estimated prevalence among the 39 countries scored. China's 99.98 is the ceiling; it sits above the nominal 95 because of the specific rescaling formula.

Adding or removing countries from the index will shift the scale. Future versions with expanded coverage will produce different raw-to-published mappings.
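
The sensitivity to coverage can be seen with a toy example. The sketch below assumes a plain linear 5–95 rescaling (which may not match the AMI's exact formula) and uses invented raw scores: adding a fourth country with a higher raw score lowers the published score of an existing country even though its raw score is unchanged.

```python
def rescale(raw: dict[str, float], low: float = 5.0, high: float = 95.0) -> dict[str, float]:
    """Linearly map the lowest raw score to `low` and the highest to `high`."""
    lo, hi = min(raw.values()), max(raw.values())
    if hi == lo:
        raise ValueError("all raw scores are equal; cannot rescale")
    return {c: low + (s - lo) / (hi - lo) * (high - low) for c, s in raw.items()}


# Hypothetical raw aggregates before and after a new country D is added.
before = rescale({"A": 20.0, "B": 54.0, "C": 71.0})
after = rescale({"A": 20.0, "B": 54.0, "C": 71.0, "D": 88.0})

# Change in B's published score caused purely by the coverage change.
shift_b = after["B"] - before["B"]
```

In this example B moves from 65 to 50 and C drops from 95 to 72.5, because D now defines the top of the scale. Published scores are therefore only comparable within a single version of the country set.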

Data quality notes

Each country's P-Score carries a data quality flag (A, B, C) reflecting how much of the dimension data is from live, country-specific sources versus regional extrapolation or literature priors. Countries with 5 of 6 dimensions from live data receive the A flag.

See "Data quality flags explained" for details.

Limitations and known issues

Norway anomaly: Google Trends cannot distinguish academic and policy discussion from student demand, so search interest of the former kind is counted as demand. Documented in the methodology caveat section.
Rescaling sensitivity: scores shift with country coverage changes.
Survey age: McCabe data is from 2002–2015, so AI-era dynamics are not captured in the D4/D5 surveys.
Detection-prevalence confounding: countries with stronger detection report more cases; this is addressed by the enforcement-detection correction documented in the methodology.

Sources

AMI v1.5 methodology document
Retraction Watch Database, Crossref/GitLab (2026)
Google Trends API (2022–2026)
ICAI / McCabe survey data
OpenAlex publication counts

Frequently asked questions

What is the AMI Prevalence Score?

The Prevalence Score (P) is the AMI's estimate of how widespread academic misconduct is in a given country. It is built from six dimensions — contract cheating, AI submissions, exam impersonation, plagiarism, collusion, and data fabrication — weighted and combined into a country aggregate score on a 0–100 scale.

How are the six dimensions weighted?

Dimension weighting is documented in the methodology document. The weights reflect both the prevalence-level importance of each dimension (severity per unit of incidence) and the data quality available for each dimension. Higher-confidence dimensions with stronger live data sources receive slightly higher weight in the v1.5 weighting scheme.

Why is the P-Score rescaled to 5–95?

The rescaling places the lowest-scoring country in the current set at 5 and the highest at 95. This makes scores comparable across countries within the set but means scores are not absolute — adding or removing countries from coverage will shift the scale. A score near 5 indicates the lowest estimated prevalence among the countries scored, not zero misconduct.

How to cite this article

APA: Booth, F. (2026). How the AMI Prevalence Score (P) Is Calculated. Academic Misconduct Index. https://academicmisconductindex.com/blog/how-p-score-calculated

BibTeX: @misc{booth2026how, author={Booth, Francisco}, title={How the AMI Prevalence Score (P) Is Calculated}, year={2026}, url={https://academicmisconductindex.com/blog/how-p-score-calculated}}

Francisco Booth

Independent researcher, founder of the Academic Misconduct Index
