Explainable Machine Learning Using Electronic Health Records to Select High-risk Patients for Lung Cancer Screening S&T61

  • School: School of Science and Technology
  • Study mode(s): Full-time / Part-time
  • Starting: 2022
  • Funding: UK student / EU student (non-UK) / International student (non-EU) / Fully-funded


NTU's Fully-funded PhD Studentship Scheme 2022

Project ID: S&T61

This project aims to develop explainable machine learning methods that more accurately select eligible high-risk patients for lung cancer screening. Lung cancer is one of the commonest cancers and the most common cause of cancer mortality in the UK. Almost three-quarters of lung cancers are diagnosed at an advanced stage, but early detection may allow curative treatment and reduce mortality. Currently, computed tomography (CT) screening of patients at high risk of lung cancer is an important method for detecting the disease early and substantially reducing mortality from it. The cost-effectiveness of screening depends on selecting a high-risk group in which the benefit of early cancer detection outweighs the potential harms. The NHS Targeted Lung Health Check uses mathematical risk prediction models to determine the risk threshold for eligibility for the programme. The models are applied to people aged 55–75 years who have ever smoked and are willing to take part in a free lung check.

Machine learning is a novel way of building models for predicting lung cancer risk. It can extract predictive information and patterns from electronic health records (EHRs), with the potential to select the people who will benefit most and to avoid screening people who have little or no chance of benefitting. This is an important priority, as it will likely influence the efficacy and cost-effectiveness of the programme. This project aims to design and implement explainable machine learning models, such as decision trees, for selecting high-risk patients. These models will be built on a primary healthcare database containing coded, anonymised patient information from GPs, from which confidential and private information has been removed. The main research questions in this project are:

  1. What are explainable machine learning models for predicting lung cancer risk, and how should they be designed?
  2. Do explainable machine learning models perform better than current risk prediction models?
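
As a rough illustration of what "explainable" can mean here, the sketch below fits a one-split decision tree (a decision stump) and prints the resulting rule in plain language. It is a minimal sketch under stated assumptions, not the project's method: the patient records are synthetic, and the feature names (age, pack_years), the toy risk formula and all function names are hypothetical and are not taken from any real EHR database or existing risk model.

```python
import random


def make_synthetic_patients(n, seed=0):
    """Generate synthetic toy records. This is NOT real patient data."""
    rng = random.Random(seed)
    patients = []
    for _ in range(n):
        age = rng.randint(55, 75)
        pack_years = rng.uniform(0, 60)
        # Assumed toy relationship: risk rises with smoking exposure.
        risk = 0.02 + 0.004 * pack_years
        label = 1 if rng.random() < risk else 0
        patients.append({"age": age, "pack_years": pack_years, "label": label})
    return patients


def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def fit_stump(patients, features):
    """Find the single (feature, threshold) split with the lowest weighted Gini impurity."""
    best = None
    n = len(patients)
    for feature in features:
        for threshold in sorted({p[feature] for p in patients}):
            low = [p["label"] for p in patients if p[feature] <= threshold]
            high = [p["label"] for p in patients if p[feature] > threshold]
            if not low or not high:
                continue  # a split must leave records on both sides
            impurity = (len(low) * gini(low) + len(high) * gini(high)) / n
            if best is None or impurity < best["impurity"]:
                best = {
                    "feature": feature,
                    "threshold": threshold,
                    "impurity": impurity,
                    "rate_low": sum(low) / len(low),
                    "rate_high": sum(high) / len(high),
                }
    return best


def describe(stump):
    """Render the learned split as a human-readable rule."""
    return (
        f"If {stump['feature']} > {stump['threshold']:.1f}: "
        f"observed rate {stump['rate_high']:.2%}; "
        f"otherwise {stump['rate_low']:.2%}"
    )


FEATURES = ["age", "pack_years"]  # hypothetical feature names
patients = make_synthetic_patients(500)
stump = fit_stump(patients, FEATURES)
print(describe(stump))
```

The printed rule can be read and checked by a clinician directly, which is the property that distinguishes explainable models from black-box ones. A full decision tree applies the same splitting step recursively, and comparing it with current risk models would need a held-out evaluation on real data.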

School strategic research priority

This research aligns with the Computer Science and Informatics Research Centre.

Entry qualifications

For the eligibility criteria, visit our studentship application page.

How to apply

For guidance and to make an application, please visit our studentship application page. The application deadline is Friday 14 January 2022.

Fees and funding

This is part of NTU's 2022 fully-funded PhD Studentship Scheme.

Guidance and support

Download our full applicant guidance notes for more information.

Still need help?

+44 (0)115 941 8418