IRIS is pleased to announce that our sixth data release is now available on the virtual data enclave.

The 2024 dataset includes information on research spending, vendor contracts, and employees from more than 100 campuses across the country. It contains data on more than 580,000 sponsored research grants worth more than $192 billion, about $35 billion in payments to more than 1.2 million vendors of research-related goods and services, and wages paid to about 985,000 employees.

The dataset has been used by researchers in higher education, economics, sociology, and many other fields.

Highlights of the 2024 IRIS UMETRICS research dataset include:

  • Improved quality of vendor and subaward files by disambiguating organization entities.
  • Improved quality of imputation for gender, ethnicity, and race variables through the application of an LLM (DeBERTa) prediction model.
  • Improved quality of occupational classification by using the information embedded in UMETRICS employee job titles to develop staff and faculty functional roles.
  • Improved quality of object code (spending purpose) classification through the application of a machine learning method and the revision of categories to reflect the new data.
  • Improved quality of funding source data by cleaning, standardizing, and disambiguating entity names, with a particular focus on federal agencies, foundations, and non-profit organizations, using a new external data source (IRS 990).

For more on the data release, visit And for details on our research data generally, see