Show simple item record

dc.contributor.advisorChatterjee, Sanjit
dc.contributor.authorChim, Sonali
dc.date.accessioned2021-07-02T04:46:07Z
dc.date.available2021-07-02T04:46:07Z
dc.date.submitted2021
dc.identifier.urihttps://etd.iisc.ac.in/handle/2005/5180
dc.description.abstractTwo-party data mining is a win-win game if played with a guarantee of data privacy from each other. This guarantee is provided by the use of cryptographic techniques in designing the two-party protocol. The need to obtain collaborative data mining results is growing and so is the need for privacy-preserving data mining protocols. Clustering is one of the data mining techniques and one of the popular clustering algorithms is k-means clustering. We studied the recent work for the secure two-party k-means clustering by Bunn and Ostrovosky and found that the protocol is inefficient for practical purposes. The protocol requires communication rounds which are linear in security parameter for the center initialization step and are quadratic in security parameter for an iterative Lloyd's step of the k-means clustering algorithm. The challenge in the secure two-party k-means clustering is the exorbitant communication cost occurring due to the high number of interactions between the parties for performing computations on the data. Our work attempts to resolve this problem of inefficiency in k-means clustering protocol in a two-party setting by proposing some modifications. We have come up with two comparison protocols that are required in the k-means clustering protocol. One of the protocols is to find a minimum of two shared numbers which runs in constant communication rounds. Using this protocol as a building block, another protocol is designed to find a minimum of n shared numbers, which runs in O(n) communication rounds. We have also improved a protocol that selects a random value from a domain oblivious to both parties. Apart from this, the idea to avoid the two-party integer division altogether is incorporated in the k-means clustering protocol. With these improvements, we propose a two-party k-means clustering protocol for which the initialization step requires communication rounds linear in security parameter and Lloyd's step requires communications rounds that are independent of the security parameter. The protocol provides a security guarantee in the semi-honest model except for some minor information leakage. We argue that this leakage in the protocol can be tolerated considering the substantial gain in the communication cost We have verified the gain in performance of the modified protocol by implementing both the k-means clustering protocols and running their instances in the same set-up.en_US
dc.language.isoen_USen_US
dc.rightsI grant Indian Institute of Science the right to archive and to make available my thesis or dissertation in whole or in part in all forms of media, now hereafter known. I retain all proprietary rights, such as patent rights. I also retain the right to use in future works (such as articles or books) all or part of this thesis or dissertationen_US
dc.subjectk-means clusteringen_US
dc.subjectprivacy preserving clusteringen_US
dc.subjecttwo party clusteringen_US
dc.subject.classificationResearch Subject Categories::TECHNOLOGY::Information technology::Computer science::Computer science::Cryptographyen_US
dc.titleTowards Effcient Privacy-Preserving Two-Party k-Means Clustering Protocolen_US
dc.typeThesisen_US
dc.degree.nameMTech (Res)en_US
dc.degree.levelMastersen_US
dc.degree.grantorIndian Institute of Scienceen_US
dc.degree.disciplineEngineeringen_US


Files in this item

This item appears in the following Collection(s)

Show simple item record