Research

The proliferation of medical data has opened significant opportunities to advance biomedical research and P4 medicine (Predictive, Preventive, Personalized, and Participatory) through AI technologies such as deep learning. Realizing this potential requires efficient sharing of large-scale clinical and research data among stakeholders, but the sensitivity of medical information makes such sharing difficult. Our research addresses this challenge by developing privacy-preserving technologies that enable secure and responsible data sharing across federations of medical institutions, hospitals, and research laboratories. We aim to balance usability, scalability, and data protection to facilitate effective P4 medicine.

We are currently working on the following themes and techniques:

Privacy-Preserving Similar Patient Queries: We focus on methods that allow querying for similar patients across diverse biomedical data types while preserving individual privacy. Such queries improve the understanding of treatment responses and disease progression, especially in large patient cohorts.

Privacy-Preserving Medical Record Linkage: Our goal is to securely combine patient data from different sources to uncover new relationships between diseases and medical indications. We aim to improve Privacy-Preserving Record Linkage (PPRL) methods to efficiently and securely match patient records without compromising privacy.
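
As a minimal sketch of one widely used PPRL encoding, and not necessarily the approach taken in this work, each institution can hash the character bigrams of an identifier into a keyed Bloom filter and compare filters with the Dice coefficient. The function names, filter size, number of hash functions, and shared key below are illustrative assumptions.

```python
import hashlib
import hmac
from typing import Set


def bigrams(value: str) -> Set[str]:
    """Split a normalized, padded string into its set of character bigrams."""
    padded = "_" + value.strip().lower() + "_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def encode(value: str, key: bytes, size: int = 256, num_hashes: int = 4) -> Set[int]:
    """Return the set bit positions of a keyed Bloom filter for `value`.

    Assumption: all linking institutions share `key`; `size` and
    `num_hashes` are illustrative, not tuned parameters.
    """
    positions = set()
    for gram in bigrams(value):
        for i in range(num_hashes):
            message = f"{i}:{gram}".encode()
            digest = hmac.new(key, message, hashlib.sha256).digest()
            positions.add(int.from_bytes(digest[:8], "big") % size)
    return positions


def dice(a: Set[int], b: Set[int]) -> float:
    """Dice coefficient between two Bloom filters given as bit-position sets."""
    if not a and not b:
        return 1.0
    return 2 * len(a & b) / (len(a) + len(b))


key = b"hypothetical-shared-secret"
same = dice(encode("Johnson", key), encode("johnson ", key))
typo = dice(encode("Johnson", key), encode("Jonson", key))
other = dice(encode("Johnson", key), encode("Miller", key))
```

Here `same` is exactly 1.0 after normalization, and `typo` should be well above `other`, so records with small spelling differences can still be matched without exchanging the plain-text identifiers. Plain Bloom-filter encodings are known to be vulnerable to frequency attacks, which is one reason improved PPRL methods are needed.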

Privacy-Preserving Genome-Wide Association Study (GWAS): We design privacy-preserving methods for GWAS, which identify genetic variants associated with phenotypes. Our focus is on secure analysis that scales effectively with cohort size, which facilitates the discovery of rare but significant genetic factors.

Three-Party Secure Computation Framework for Privacy-Preserving Machine Learning: We have developed our own multi-party computation (MPC) framework called CECILIA. Our research extends privacy-preserving machine learning techniques beyond Convolutional Neural Networks (CNNs) to other architectures such as Recurrent Kernel Networks (RKNs), Long Short-Term Memory networks (LSTMs), and Generative Adversarial Networks (GANs). We address the challenge of applying these techniques efficiently while maintaining data privacy.
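
To give a flavor of the building blocks behind MPC, the following is a minimal sketch of additive secret sharing over the ring of integers modulo 2^64. It is a generic textbook construction and is not meant to describe CECILIA's actual protocols; the function names and the choice of modulus are assumptions made for illustration.

```python
import secrets
from typing import List

MODULUS = 2 ** 64  # assumed ring size for this illustration


def share(value: int, parties: int = 3) -> List[int]:
    """Split `value` into random shares that sum to it modulo MODULUS."""
    shares = [secrets.randbelow(MODULUS) for _ in range(parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def reconstruct(shares: List[int]) -> int:
    """Recover the secret by summing all shares modulo MODULUS."""
    return sum(shares) % MODULUS


def add_shared(x_shares: List[int], y_shares: List[int]) -> List[int]:
    """Each party adds its own two shares locally; no communication needed."""
    return [(x + y) % MODULUS for x, y in zip(x_shares, y_shares)]


def scale_shared(x_shares: List[int], constant: int) -> List[int]:
    """Multiply a shared value by a public constant, again locally."""
    return [(x * constant) % MODULUS for x in x_shares]


x_shares = share(120)
y_shares = share(85)
total = reconstruct(add_shared(x_shares, y_shares))
scaled = reconstruct(scale_shared(x_shares, 3))
```

Any proper subset of the shares is uniformly random and reveals nothing about the secret, yet `total` reconstructs to 205 and `scaled` to 360. Operations such as multiplying two shared values or evaluating non-linear activation functions need interactive protocols, which is where most of the efficiency challenges arise.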

Privacy-Preserving Performance Evaluation of Collaborative Machine Learning Models: We develop methods that allow clients to evaluate the performance of machine learning models using pooled test data without compromising privacy. This is crucial for clients with limited data who need to assess models trained on aggregated datasets, supporting collaborative learning architectures like federated learning.

Privacy-Preserving Explainable Machine Learning: We investigate the privacy risks associated with model explanations, such as the potential exposure of sensitive training data. Our research aims to design methods that provide insightful model explanations while preserving the privacy of individuals whose data was used in training, thus enhancing the trustworthiness of machine learning models.

Secure and Private Federated Learning: We address the privacy challenges in federated learning, particularly with high-dimensional medical data. Our focus is on developing frameworks that enhance security and privacy, ensuring that model updates and data remain protected throughout the collaborative learning process.
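
One common way to protect individual model updates is secure aggregation with pairwise masks that cancel in the sum. The sketch below illustrates the idea under simplifying assumptions and is not a description of our frameworks: updates are already quantized to integers, all clients stay online, and a single seeded generator stands in for the pairwise shared secrets that a real protocol would derive through key agreement. All names are hypothetical.

```python
import random
from typing import List

MODULUS = 2 ** 32  # assumed modulus for quantized updates


def pairwise_masks(num_clients: int, dim: int, seed: int) -> List[List[int]]:
    """Build one mask per client such that all masks sum to zero mod MODULUS.

    For every pair (i, j) with i < j, client i adds a random vector and
    client j subtracts the same vector.
    """
    rng = random.Random(seed)  # stand-in for pairwise shared secrets
    masks = [[0] * dim for _ in range(num_clients)]
    for i in range(num_clients):
        for j in range(i + 1, num_clients):
            for k in range(dim):
                r = rng.randrange(MODULUS)
                masks[i][k] = (masks[i][k] + r) % MODULUS
                masks[j][k] = (masks[j][k] - r) % MODULUS
    return masks


def mask_update(update: List[int], mask: List[int]) -> List[int]:
    """Hide a client's update by adding its mask element-wise."""
    return [(u + m) % MODULUS for u, m in zip(update, mask)]


def aggregate(masked_updates: List[List[int]]) -> List[int]:
    """Server-side sum; the pairwise masks cancel, leaving the true total."""
    return [sum(column) % MODULUS for column in zip(*masked_updates)]


updates = [[1, 2, 3], [10, 20, 30], [100, 200, 300]]
masks = pairwise_masks(num_clients=3, dim=3, seed=7)
masked = [mask_update(u, m) for u, m in zip(updates, masks)]
result = aggregate(masked)
```

The server only sees the masked vectors, yet `result` equals the element-wise sum `[111, 222, 333]`. Handling client dropouts and very high-dimensional updates efficiently requires additional machinery beyond this sketch.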