Privacy-preserving techniques on genomic data
Abstract
Genomic data hold salient information about the characteristics of a living organism. Over the last decade, major advances have made genome sequencing more accurate and less expensive. However, as genomic research advances, security and privacy concerns are growing around the collection, storage, and analysis of such sensitive data. Recent results show that, given some background information, an adversary can re-identify an individual from a specific genomic dataset. This can reveal that individual's current association with, or future susceptibility to, certain diseases (and sometimes the kinship between individuals), resulting in a privacy violation.
This thesis has two parts and proposes several techniques to mitigate privacy issues relating to genomic data. The first part targets data privacy when using an external computational environment. We propose privacy-preserving frameworks to store genomic data in an untrusted computational environment (i.e., the cloud). In particular, we employ prefix and suffix tree structures to represent genomic data while keeping the data encrypted throughout their computational life-cycle. The underlying methods can therefore perform different string search queries and arbitrary computations under encryption without requiring access to the raw sensitive data. We also propose a GPU-parallel Fully Homomorphic Encryption framework that optimizes existing algorithms and can compute string distance metrics such as Hamming distance, edit distance, and Set Maximal Matching. The GPU-parallel framework is 14.4 and 46.81 times faster than existing techniques for standard and matrix multiplications, respectively.
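For readers unfamiliar with the string metrics named above, the following is a minimal plaintext sketch of Hamming distance and edit (Levenshtein) distance. It is only an illustration of what these metrics compute: it does not use encryption, a GPU, or tree structures, and the function names `hamming_distance` and `edit_distance` are hypothetical, not taken from the thesis.

```python
def hamming_distance(a: str, b: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length sequences")
    return sum(x != y for x, y in zip(a, b))


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via dynamic programming, one row at a time."""
    # prev[j] holds the distance between the processed prefix of a and b[:j].
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty string
        for j, y in enumerate(b, start=1):
            curr.append(min(
                prev[j] + 1,             # delete x from a
                curr[j - 1] + 1,         # insert y into a
                prev[j - 1] + (x != y),  # substitute, or match at no cost
            ))
        prev = curr
    return prev[-1]
```

Evaluating the same comparisons under Fully Homomorphic Encryption would require expressing each step as operations on ciphertexts, which this sketch does not attempt.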
The second part of the thesis targets another privacy setting in which the outputs of different genomic data analyses are deemed sensitive. Here, we propose several differentially private mechanisms to share partial genome datasets and intermediate statistics with a strict privacy guarantee. Experimental results demonstrate that the proposed methods effectively protect data privacy during the computation and analysis of genomic data. Overall, the techniques proposed in this thesis are not specific to genomic data and can be generalized to protect other types of sensitive data.
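As a rough illustration of what a differentially private release of a statistic can look like, the sketch below applies the standard Laplace mechanism to a count. This is a textbook mechanism chosen for illustration, not necessarily one of the mechanisms proposed in the thesis. The function name `private_count`, the example count, and the value of epsilon are all assumptions.

```python
import random


def private_count(true_count: int, epsilon: float, rng: random.Random) -> float:
    """Release a count with Laplace noise calibrated to epsilon."""
    if epsilon <= 0:
        raise ValueError("epsilon must be positive")
    # Adding or removing one individual changes a count by at most 1.
    sensitivity = 1.0
    scale = sensitivity / epsilon
    # The difference of two i.i.d. exponential samples with rate 1/scale
    # follows a Laplace distribution with mean 0 and the given scale.
    noise = rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)
    return true_count + noise


# Hypothetical usage: number of individuals carrying some variant.
rng = random.Random(7)
noisy = private_count(true_count=120, epsilon=0.5, rng=rng)
```

A smaller epsilon gives a larger noise scale, so the released value is more private but less accurate.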