Kern Institute Data Science Lab
The Data Science Lab partners with Kern Pillars and labs to align curriculum objectives with outcome measures, so that important outcomes of interest are neither missed nor captured in a way that will bias conclusions. The lab also supports scholars and conducts research.
Data Science Lab Focus Areas:
Design and Implementation
Building Data Pipelines
Linkage
Visualization
The Kern Data Science Mission
Innovation Through Experimentation and Play
- Create a safe environment to experiment with different pedagogical and curriculum strategies and understand their impact on learners and those who teach them.
- Support the education of physicians and other health professionals as individuals, in groups and across the UME-GME-CME continuum.
- Identify optimal ways to communicate relevant and important data to learners and teachers in an accessible and user-friendly manner. This includes visualization informed by cognitive, instructional design, and learning principles.
Statistical Learning and Development
- Evolve the understanding of statistics, research methodologies, and psychometrics (measurement).
- Increase the use and application of statistical theory in practice by minimizing barriers. This includes reviving Generalizability Theory and using modeling techniques to test whether theory holds true in the data.
- Facilitate the use of correct analytic techniques for Likert data (e.g., Pearson vs. polychoric correlations).
- Contribute to and evolve statistical theory, using proof-of-concept studies and simulation to understand the value, biases, and limits of particular statistical practices in small, institutional, and population-level studies.
- Implement natural language processing in the analysis of open-ended text content.
- Translate statistical theory into computational algorithms and open-source code so users can apply it to their own work.
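The point about Likert data and the role of simulation can be illustrated with a small sketch. It is not the lab's own code. It assumes two latent, normally distributed traits with a known correlation, cuts each into a 5-point Likert item using hypothetical thresholds, and compares the Pearson correlation before and after coarsening. The names `simulate`, `to_likert`, and `cuts`, and all parameter values, are assumptions chosen for illustration. Polychoric estimation itself is not implemented here; the sketch only shows the attenuation that motivates it.

```python
import random
from bisect import bisect_right
from math import sqrt


def pearson(x, y):
    """Plain Pearson product-moment correlation of two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def to_likert(z, cuts):
    """Map a latent score to a category 1..len(cuts)+1 using sorted thresholds."""
    return bisect_right(cuts, z) + 1


def simulate(rho=0.6, n=20000, cuts=(-0.5, 0.3, 1.0, 1.8), seed=1):
    """Return (Pearson r on latent scores, Pearson r on coarsened Likert scores).

    rho, n, cuts, and seed are illustrative assumptions, not values from the lab.
    """
    rng = random.Random(seed)
    latent_x, latent_y = [], []
    for _ in range(n):
        u, v = rng.gauss(0, 1), rng.gauss(0, 1)
        latent_x.append(u)
        # Construct y so that corr(x, y) equals rho in the population.
        latent_y.append(rho * u + sqrt(1 - rho ** 2) * v)
    likert_x = [to_likert(z, cuts) for z in latent_x]
    likert_y = [to_likert(z, cuts) for z in latent_y]
    return pearson(latent_x, latent_y), pearson(likert_x, likert_y)


if __name__ == "__main__":
    r_latent, r_likert = simulate()
    print(f"Pearson r, latent scores: {r_latent:.3f}")
    print(f"Pearson r, Likert scores: {r_likert:.3f}")
```

Under these assumptions, the Pearson correlation on the coarsened items comes out lower than the correlation between the latent scores. A polychoric correlation is designed to estimate the latent correlation from the ordinal responses instead.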
Scholarship and Open-Source Sharing
- Collaborate on scholarship with other data scientists and institutions engaged in this work.
- Develop and share data pipelines from commonly used systems (e.g., Qualtrics and REDCap), surveys and measurement tools, and data wrangling techniques through open-source platforms.
Make 'Data Reign' With Educational Data Repositories and Facilitate Data Linkage
- Develop and manage an educational data repository of UME and GME learner data over time to support curriculum and program evaluation and research in medical education. This repository will support the transformative and evidence-based changes needed in the education of medical students, improvements in physician health and wellbeing, and ultimately patient outcomes.
- Make data from the educational repository available to researchers in an anonymized fashion through a careful data access request process.
- Examine and unpack student learning trajectories as they progress through a curriculum in order to help students who are struggling with the material, evaluate curriculum components, and understand the role teachers play so they can continue to be impactful in the right ways.
- Develop a data access request process that lets researchers request data from the educational repository and contribute their own data for others to access. This requires creating appropriate technological systems for data linkage.
- Conduct data linkage as a trusted third-party entity following ethical, privacy-preserving, and secure data practices.
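One common way a trusted third party can link records without releasing identifiers is to replace each identifier with a keyed hash and join on the resulting token. The sketch below illustrates that idea only; the page does not describe the lab's actual linkage method. The function names (`pseudonymize`, `link`), the field name `student_id`, and the sample records are all hypothetical.

```python
import hashlib
import hmac


def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """Return a keyed SHA-256 token for an identifier (hypothetical helper).

    The identifier is trimmed and lowercased first so that trivial formatting
    differences do not prevent a match.
    """
    normalized = identifier.strip().lower()
    return hmac.new(secret_key, normalized.encode("utf-8"), hashlib.sha256).hexdigest()


def link(records_a, records_b, id_field, secret_key):
    """Join two lists of dicts on a pseudonymized identifier.

    The raw identifier is dropped from the output; only the token is kept.
    """
    index_b = {pseudonymize(r[id_field], secret_key): r for r in records_b}
    linked = []
    for record in records_a:
        token = pseudonymize(record[id_field], secret_key)
        match = index_b.get(token)
        if match is None:
            continue  # no counterpart in the second dataset
        merged = {k: v for k, v in record.items() if k != id_field}
        merged.update({k: v for k, v in match.items() if k != id_field})
        merged["link_token"] = token
        linked.append(merged)
    return linked


if __name__ == "__main__":
    # Illustrative data only; the key would be held solely by the linking party.
    key = b"example-key-held-by-the-linking-party"
    surveys = [{"student_id": "A001", "wellbeing": 4}, {"student_id": "A002", "wellbeing": 2}]
    exams = [{"student_id": " a001 ", "exam_score": 87}]
    for row in link(surveys, exams, "student_id", key):
        print(row)
```

Because the key stays with the linking party, researchers who receive the linked rows cannot recompute tokens from known identifiers. Keyed hashing alone does not make a dataset anonymous, so a real process would still need review of the remaining fields.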
Data Science Lab Members

Tavinder K. Ark, PhD
Director, Data Science Lab, Kern Institute

Andrew Gleave
Software Engineer
