Workshops and Training

Joining the Data Revolution: Big Data in Education and Social Science Research

June 6-17, 2022 | Online

Jason Owen-Smith, IRIS Executive Director
Jinseok Kim, IRIS Research Assistant Professor
Matthew VanEseltine, IRIS Research Investigator
Robert Truex, IRIS Data Manager
Christopher Brown, IRIS Data Support Specialist
Natsuko Nicholls, IRIS Research Manager
Additional instructors TBD

The Institute for Research on Innovation and Science (IRIS) is offering an introductory-level workshop for researchers in education and social science (ESS) fields who are interested in gaining experience working with large-scale restricted data. The goal of the workshop is to increase participants' ability to define and develop projects that will result in competitive proposals for submission to NSF or other funders. A secondary goal is to support a community of researchers who can share information, tools, and insights to help strengthen research and teaching capabilities involving large-scale data analysis in ESS fields.

In this hands-on class, participants will work in teams using large-scale datasets curated by IRIS to achieve a better understanding of the research questions that can be answered with big data. Datasets on the makeup of research teams will be featured. We designed this workshop to help participants acquire or expand data analysis skills, to develop and articulate individual research questions, and to frame those questions in a fashion suitable for pursuing external funding to support ESS-focused research. While examples will be drawn from the IRIS UMETRICS dataset, topics of particular interest may focus on other data sources.

Workshop themes include data exploration, visualization, and linkage; basic data analysis; and grant proposal writing. During technical sessions, we will focus on working through well-documented examples of Python code in Jupyter Notebook. We introduce participants to the basics of data exploration and linkage techniques using Python and Structured Query Language (SQL), along with tutorials on data visualization using Python packages that enable the user to create effective, attractive figures (such as histograms, scatter plots, and box plots). Discussions, exercises, and pre-work materials will focus on examining whether and how different types of diversity in scientific research teams influence the amount, character, and impact of research produced by those teams.
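
To give a rough sense of what linkage with Python and SQL can look like, here is a minimal sketch. It is not workshop material and does not reflect the IRIS UMETRICS schema: the table names (`awards`, `employees`), the columns, and the five sample rows are all hypothetical, and Python's built-in `sqlite3` module stands in for whatever database the workshop environment provides. The query links people to awards on a shared key, then counts team size and the number of distinct roles on each team.

```python
import sqlite3

# Hypothetical toy tables for illustration only; NOT the IRIS UMETRICS schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE awards (award_id TEXT PRIMARY KEY, field TEXT);
CREATE TABLE employees (person_id TEXT, award_id TEXT, role TEXT);
INSERT INTO awards VALUES ('A1', 'Education'), ('A2', 'Sociology');
INSERT INTO employees VALUES
    ('P1', 'A1', 'Faculty'),
    ('P2', 'A1', 'Graduate Student'),
    ('P3', 'A2', 'Faculty'),
    ('P4', 'A2', 'Postdoc'),
    ('P5', 'A2', 'Graduate Student');
""")

# Linkage: join people to awards on the shared key, then summarize each team
# by its size and by how many distinct roles it contains.
query = """
SELECT a.award_id,
       a.field,
       COUNT(e.person_id)     AS team_size,
       COUNT(DISTINCT e.role) AS distinct_roles
FROM awards AS a
JOIN employees AS e ON e.award_id = a.award_id
GROUP BY a.award_id, a.field
ORDER BY a.award_id
"""
team_summary = conn.execute(query).fetchall()
conn.close()

for row in team_summary:
    print(row)
```

With real administrative data the same join-then-summarize pattern would typically feed a figure such as a histogram of team sizes, but the keys and variables involved would depend on the dataset.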

The guided, step-by-step analytic work will help participants develop critical skills in working with real administrative data in a privacy-protected research environment. Substantive discussions during introductory lectures, journal clubs, and other sessions will focus on the literature and on general approaches to questions that you will help define.

IRIS is a consortium of universities anchored on an IRB-approved data repository that uses administrative data to understand, explain, and improve the public value of research and higher education. IRIS ingests member university administrative data related to sponsored research projects into a highly protected data enclave environment and makes linkages to a variety of public and proprietary datasets (including those from the U.S. Census Bureau). Using these data, IRIS generates reports and creates research datasets. See for more information.

Supported by the National Science Foundation as part of its Building Capacity in STEM Education Research program, the IRIS workshop is designed to help investigators from a wide range of backgrounds and disciplines acquire the tools and knowledge to secure grant funding for data-driven social science and education research.

Prerequisites: No experience in quantitative data analysis is necessary, although some pre-workshop reading will be required.

Registration Fee: There is no registration fee for this workshop.

Eligibility: Participants must be currently affiliated with an academic or research institution in the United States. Participants must have a PhD or have achieved candidacy in an applicable field (e.g., education, higher education, economics, sociology, or related fields). Like our workshop sponsor, the National Science Foundation (NSF), IRIS is committed to broadening participation in this course to include scholars from underrepresented fields of study, institutions, and demographic groups.

No technical skills are required, but note that this is a hands-on coding workshop. Past participants have suggested that a basic familiarity with Python makes the class much more effective. IRIS will share resources for developing such skills prior to the workshop’s start date. Participants will be encouraged to complete pre-workshop readings and exercises.

Application: Admission to this workshop is competitive and enrollment is limited. Applications for 2022 are closed.

Course Times: Course meetings will be held beginning at 1 pm EDT via video conference. Specific details about course format will be provided in a syllabus sent to participants several weeks prior to the course start date.

For more information: Please visit for more information about IRIS. Contact with questions about the workshop.

Institute for Research on Innovation and Science
University of Michigan
Institute for Social Research
Survey Research Center
330 Packard St, 2354 Perry Bldg
Ann Arbor, MI, 48104-2910

P: (734) 615-0015
F: (734) 763-3862
@IRIS_UMETRICS / #irisumetrics

© 2023 The Regents of the University of Michigan | Ann Arbor, MI 48109 USA | Phone: +1 (734) 764-1817