Genomics Whiteboard Seminar Series
Sean Simmons (Massachusetts Institute of Technology)
Calvin Lab Room 116
Differential privacy in Genomics
The growing stockpiles of genomic data found in biomedical repositories and patient records promise to be an invaluable resource for improving our understanding of human diseases. In particular, there is interest in using this genomic data to perform genome-wide association studies (GWAS). Recent work, however, has shown that sharing this data, even when aggregated to produce p-values, regression coefficients, or other study statistics, may compromise patient privacy. This raises a fundamental question: how do we protect patient privacy while still making the most of their data?
One proposed solution is to use a privacy-preserving technique known as differential privacy. This approach, which works by slightly perturbing the data, protects patient privacy while still allowing researchers access to their genomic data. Unfortunately, existing differentially private GWAS techniques have limitations in terms of accuracy, computational efficiency, and their ability to deal with heterogeneous populations. I will share recent work we have done to help overcome these bottlenecks, work that moves privacy-preserving GWAS closer to real-world applicability.
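As a rough illustration of what "slightly perturbing" can mean, the sketch below applies the standard Laplace mechanism to a single aggregate count. It is not the method presented in the talk. The function names, the example count, and the choice of sensitivity are all hypothetical: the sensitivity of 2 assumes a minor-allele count over diploid individuals, where adding or removing one person changes the count by at most 2.

```python
import random


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample.

    The difference of two independent exponential variables with
    mean `scale` follows a Laplace distribution with that scale.
    """
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(true_count, epsilon, sensitivity=1.0, rng=None):
    """Release a count with epsilon-differential privacy (Laplace mechanism).

    `sensitivity` is the most the count can change when one individual
    is added to or removed from the study (an assumption the caller
    must justify for their own statistic).
    """
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    if rng is None:
        rng = random.Random()
    return true_count + laplace_noise(sensitivity / epsilon, rng)


if __name__ == "__main__":
    # Hypothetical example: 120 minor alleles observed among the cases.
    noisy = private_count(120, epsilon=1.0, sensitivity=2.0)
    print(f"noisy minor-allele count: {noisy:.2f}")
```

A smaller epsilon gives a stronger privacy guarantee but a larger noise scale, which is one source of the accuracy limitations mentioned above.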