Project
PHASE IV AI
School: School of Science and Technology
Overview
Privacy compliant health data as a service for AI development
Artificial intelligence (AI) enables data-driven innovations in health care. AI systems, which process vast amounts of data quickly and in detail, show promise as tools for both preventive health care and clinical decision-making. However, the distributed storage of health data and the limited access to it form a barrier to innovation, as developing trustworthy AI systems requires large datasets for training and validation. Furthermore, the availability of anonymous datasets would increase the adoption of AI-powered tools by supporting health technology assessments and education. Secure, privacy compliant data utilization is key to unlocking the full potential of AI and data analytics.
In this proposal, we will advance the current state-of-the-art data synthesis methods towards a more generalized approach to synthetic data generation. We will also develop metrics for testing and validation, as well as protocols that enable synthetic data generation without access to real-world data (through multi-party computation).
We aim to provide:
1) Improved methods and technical pipelines for privacy-preserving data synthesis across different data formats, such as EHRs and medical images,
2) Easy-to-use, configurable data services that give AI developers access to larger pools of decentralized, de-identified data through multi-party computation,
3) Anonymous data on demand or from a (temporary) repository,
4) A Data Market that facilitates data sharing and monetization, including incentive-based provision of data to the services,
5) Integration of the data market and the data service ecosystem as a cross-European health data hub in the European Health Data Space, and
6) Validation of the results in real-world use cases focusing on high-impact diseases, cancer types in particular.
PHASE IV AI will provide and validate a comprehensive set of scientific, technological, and enabling results accessible as services. Beyond research on technology that advances the state of the art (SotA), the accessibility and applicability of the technological results are key to the competitiveness of the European health industry, and to citizens, health care providers, and health systems that benefit from a swift uptake of innovative health technologies and services. With the holistic approach of PHASE IV AI, researchers, innovators, and the European health industry shall gain access to high-quality, privacy compliant data and computation as a service, which shortens time-to-market for data-driven innovation and increases the industry's competitiveness.
Therefore, PHASE IV AI follows an agile, integrated approach, developing (1) data synthetization services (DaaS), (2) multi-party computation services (MaaS), and (3) a Health Data Hub. Underlying these developments, innovative tools for in-situ data extraction and data management (e.g. filtering, cleansing, and joining) shall be utilized. The development of PHASE IV AI will build upon clinical data environments across several European regions: Scandinavia, Central Europe, Western Europe, and the United Kingdom. This way, a wide variety of data can be accessed, and a broad validation of the developed tools and services across diverse work environments and professional user groups is ensured.
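To illustrate the general idea behind multi-party computation referred to above, the following is a minimal sketch of additive secret sharing, a textbook building block, and not the project's actual protocol. It assumes three hypothetical clinical sites that want to learn a joint patient count without any site revealing its own number. The function names, the modulus, and the example counts are illustrative assumptions only.

```python
import secrets

# Illustrative modulus for share arithmetic (an assumption, not a project parameter).
PRIME = 2**61 - 1


def share(value, n_parties):
    """Split a non-negative integer into n additive shares modulo PRIME."""
    shares = [secrets.randbelow(PRIME) for _ in range(n_parties - 1)]
    # The last share is chosen so that all shares sum to the value modulo PRIME.
    last = (value - sum(shares)) % PRIME
    return shares + [last]


def secure_sum(private_values):
    """Compute the sum of private values using only shares and partial sums."""
    n = len(private_values)
    # Each site splits its own value; share j is sent to party j.
    distributed = [share(v, n) for v in private_values]
    # Each party adds up the shares it received; these look random on their own.
    partial_sums = [
        sum(distributed[i][j] for i in range(n)) % PRIME for j in range(n)
    ]
    # Only the partial sums are combined, which yields the total.
    return sum(partial_sums) % PRIME


# Hypothetical per-site patient counts (made-up numbers for illustration).
site_counts = [120, 75, 310]
total = secure_sum(site_counts)
print(total)
```

In this sketch, no single share or partial sum reveals a site's individual count, yet the combined result equals the true total. Production-grade multi-party computation adds secure channels, protection against dishonest parties, and support for richer computations than a sum.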
Objective 1 - To establish an integrated project workflow based on several development cycles, active reporting and efficient implementation.
Objective 2 - To generate privacy compliant, individual-level health data to support the development of innovative health technologies.
Objective 3 - To advance the conditions for the effective, cross-border utilization of real-world evidence through multi-party computation.
Objective 4 - To facilitate the GDPR-compliant secondary use of health data, especially for industry-based research, development, and innovation.
Objective 5 - To promote the uptake of breakthrough technologies through testing and validation in real-world use cases.
Objective 6 - To develop practical data market solutions and concepts for the competitive and sustainable European health industry.
Specifically, the developments of the PHASE IV AI project will be validated in three real-life use cases in relevant high-impact diseases: (i) Lung Cancer, (ii) Prostate Cancer, and (iii) Ischemic Stroke. All three diseases are key topics of the European health ecosystem. Lung Cancer and Prostate Cancer are among the top 3 priorities in tackling cancer, and neurodegenerative diseases are one of the most relevant issues for the EU’s ageing population.