The MSt Healthcare Data Science is a part-time Master's course designed to fit around the demands of full-time employment. The course is delivered through a combination of face-to-face sessions requiring attendance in Cambridge (blended with remote learning where suitable) and self-directed learning supported through a virtual learning environment (VLE).
Teaching delivery:
Full in-person attendance is required at the teaching blocks, which are held on a bi-monthly basis commencing October 2025. The master class sessions will take place online.
Teaching dates below are indicative and will be updated as soon as possible.
The course is structured across the following modules:
Module 1 – Data driven decision making
This module explores data-driven decision-making in healthcare, focusing on project setup in Trusted Research Environments (TREs), data analysis, and reporting using RMarkdown to create dynamic, reproducible reports for healthcare insights. Outcomes include a critical understanding of data-driven procedures, reproducible evaluation methods for diverse data sources, and advanced skills in appraising relevant literature. Students will gain a conceptual understanding of healthcare systems, patient pathways, legal/ethical data-sharing principles, and agile development processes, applying these concepts to real-world health data science projects.
- TBC October-November 2025 (5-day teaching block plus 2-day master class)
Module 2 – Principles of Health Data Science
This module introduces the foundational principles of Health Data Science, focusing on the complexities of accessing and working with patient-level data. It covers the data access procedures and governance considerations that apply to such data. Students will develop professional skills in using reproducible tools and methodologies essential for conducting rigorous health data science projects, ensuring data integrity and compliance with industry standards.
- TBC November-December 2025 (5-day teaching block plus 2-day master class)
Module 3 – Health Data Science II
This advanced module builds on foundational Health Data Science concepts, focusing on data integration, cleaning, and preprocessing techniques essential for managing complex healthcare datasets. Students will develop comprehensive knowledge of data management strategies and best practices for creating effective project protocols, equipping them with the skills to handle diverse health data sources and ensure high-quality analysis and reproducibility in health data science projects.
- TBC January-February 2026 (5-day teaching block plus 2-day master class)
Module 4 – Data Visualisation
This module focuses on the principles and techniques of effective data visualization in health data science. Students will develop advanced skills in designing visual elements that communicate complex healthcare data insights clearly and effectively. The course emphasizes practical approaches to crafting visualizations that enhance understanding and decision-making in health data science projects, equipping students to convey complex concepts to diverse audiences.
- TBC February-March 2026 (5-day teaching block plus 2-day master class)
Module 5 – Machine learning
This module provides an in-depth exploration of Machine Learning (ML) approaches in the healthcare domain, emphasizing their role in developing and implementing health data science projects. Students will critically evaluate both human-based and automated ML techniques, assessing their effectiveness in real-world healthcare applications. The course covers key ethical considerations and the responsible use of ML in healthcare, fostering a comprehensive understanding of how these technologies can drive innovation while ensuring patient safety and data integrity.
- TBC April-May 2026 (5-day teaching block plus 2-day master class)
Module 6 – Databases
This module covers advanced concepts in database systems, focusing on data retrieval, management, and application in health data science. Students will gain expertise in using Structured Query Language (SQL) and other query languages to interrogate database servers. Key topics include database design standards and data linkage. The module also explores metadata development, data mining, and feature selection. By the end of the course, students will be able to translate complex clinical questions into effective SQL queries.
- TBC May-June 2026 (5-day teaching block plus 2-day master class)
Module 7 – Data analysis and inference
This module provides an in-depth understanding of data analysis and inference in the context of health data science. Students will develop professional skills in applying statistical and epidemiological principles to analyse health data, producing actionable insights. The course emphasizes a critical understanding of the scientific concepts that guide data interpretation and their implications for individuals and organizations. Additionally, students will evaluate the strengths and limitations of various statistical and epidemiological methods, analytical tools, and approaches, equipping them to make informed decisions in health data analysis.
- TBC October-November 2026 (5-day teaching block plus 2-day master class)
Module 8 – Advanced statistical methods
This module focuses on advanced statistical methods, providing students with the skills to evaluate and apply complex statistical techniques in health data science. It covers a wide range of advanced modelling approaches, enabling students to select and implement the most appropriate methods for specific scenarios. By the end of the module, students will demonstrate professional competence in using these advanced statistical tools, ensuring rigorous and robust analysis in health-related research and projects.
- TBC November-December 2026 (5-day teaching block plus 2-day master class)
Module 9 – Research Dissertation
For the dissertation, students choose from a range of healthcare data sources available to the programme and gain facilitated access to these data based on their research proposal. They then undertake the project, applying advanced analytical and data visualization techniques to large-scale health-related databases. Students will develop a systematic understanding of data-driven decision-making, focusing on reproducible approaches. The module emphasizes a deep conceptual understanding of the legal and ethical principles surrounding data sharing, equipping students to critically evaluate current methodologies and propose innovative approaches in health data research. Through this project, students will demonstrate their ability to formulate hypotheses and assess the suitability of these methods within a health organization context.
- TBC January 2027 (2-day teaching session)
- TBC February 2027 (1-day teaching session)
- TBC April 2027 (2-day teaching session)
- TBC May 2027 (1-day teaching session)
Each module is worth the equivalent of 15 credits of study, with the exception of the dissertation, which is worth 60. Fifteen credits is approximately equivalent to 150 hours of study, consisting of face-to-face teaching, blended learning, and self-directed learning.
Supervision
Each learner will be assigned a dissertation supervisor who is experienced in the area and/or methodology being studied in the dissertation. Learners will meet their supervisor regularly during the dissertation development process, either in person or remotely. These meetings will support development of the research question and methodology, acquisition and analysis of data, and feedback on a single draft of the dissertation.
Assessment
Each module (with the exception of the research dissertation) requires the submission of a piece of summatively assessed work of between 2,500 and 3,000 words, or equivalent.
The research dissertation is between 10,000 and 12,000 words.