
Monday, August 17, 2015

CLOSENESS: A NEW PRIVACY MEASURE FOR DATA PUBLISHING

ABSTRACT


The k-anonymity privacy requirement for publishing microdata requires that each equivalence class (i.e., a set of records that are indistinguishable from each other with respect to certain “identifying” attributes) contains at least k records. Recently, several authors have recognized that k-anonymity cannot prevent attribute disclosure. The notion of l-diversity has been proposed to address this; l-diversity requires that each equivalence class has at least l well-represented values for each sensitive attribute. In this project, we show that l-diversity has a number of limitations. In particular, it is neither necessary nor sufficient to prevent attribute disclosure. Motivated by these limitations, we propose a new notion of privacy called “closeness.” We first present the base model t-closeness, which requires that the distribution of a sensitive attribute in any equivalence class is close to the distribution of the attribute in the overall table. We then propose a more flexible privacy model called (n,t)-closeness that offers higher utility. We describe our desiderata for designing a distance measure between two probability distributions and present two distance measures. We discuss the rationale for using closeness as a privacy measure and illustrate its advantages through examples and experiments.



SYSTEM ANALYSIS
PROBLEM DEFINITION:

      The problem of information disclosure has been studied extensively in the framework of statistical databases, and a number of disclosure-limitation techniques have been designed for data publishing. The first category of work aims at devising privacy requirements. Several subsequent works recognize that the adversary may also have knowledge of the distribution of the sensitive attribute in each group; t-closeness further assumes that the distribution of the sensitive attribute in the overall table is public information. Privacy-preserving data publishing has been studied extensively
in several other aspects as well. First, background knowledge presents additional challenges in defining privacy requirements. For anonymization, we use the Mondrian algorithm, which partitions the high-dimensional quasi-identifier space into regions and encodes the data points in each region by the region's representative. On the theoretical side, computing an optimal k-anonymization has been proved NP-hard, and approximation algorithms for finding the anonymization that suppresses the fewest cells have been proposed.
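The Mondrian partitioning step mentioned above can be sketched as follows. This is a minimal illustration, assuming numeric quasi-identifiers; the record layout, attribute names, and the function name `mondrian_partition` are illustrative, not the project's actual code.

```python
def mondrian_partition(records, qid_attrs, k):
    """Recursively split records at the median of the widest
    quasi-identifier; stop when no cut leaves both halves with >= k
    records, so every resulting region is one equivalence class."""
    # Try attributes from widest value range to narrowest.
    for attr in sorted(qid_attrs,
                       key=lambda a: max(r[a] for r in records) -
                                     min(r[a] for r in records),
                       reverse=True):
        values = sorted(r[attr] for r in records)
        median = values[len(values) // 2]
        left = [r for r in records if r[attr] < median]
        right = [r for r in records if r[attr] >= median]
        if len(left) >= k and len(right) >= k:      # allowable cut
            return (mondrian_partition(left, qid_attrs, k) +
                    mondrian_partition(right, qid_attrs, k))
    return [records]  # no allowable cut: keep this region whole

# Toy example: six records, k = 2.
records = [{"age": 25, "zip": 10}, {"age": 27, "zip": 11},
           {"age": 52, "zip": 30}, {"age": 55, "zip": 31},
           {"age": 60, "zip": 33}, {"age": 23, "zip": 12}]
partitions = mondrian_partition(records, ["age", "zip"], 2)
```

Each region returned would then be generalized to its representative (e.g., an age and ZIP range) before publishing.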
 Existing System
            In the existing system, several works have recognized that k-anonymity cannot prevent attribute disclosure, so it cannot maintain privacy and is of limited use for publishing census-style data. Earlier systems did not make secure data publishing possible and do not provide sufficient protection against attribute disclosure. Some hospitals only maintain a database of patient histories, with no integrated structure: anyone can enter the application and view a patient's history without any security check. These are the major drawbacks of the existing applications. To overcome these problems, an integrated solution with privacy protection must be built into future applications.

 Problems in Existing System
K-anonymity cannot prevent attribute disclosure.
It is not able to maintain the privacy of sensitive attributes.
There is no integrated structure in the existing systems.
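The first limitation above, the homogeneity problem, can be seen in a toy example (the records below are made up for illustration): an equivalence class can satisfy k-anonymity while every member shares the same sensitive value, so linking a person to the class reveals their disease.

```python
# A 3-anonymous equivalence class: the quasi-identifiers (generalized
# age and ZIP) are identical, so no record can be singled out...
equivalence_class = [
    {"age": "2*", "zip": "476**", "disease": "heart disease"},
    {"age": "2*", "zip": "476**", "disease": "heart disease"},
    {"age": "2*", "zip": "476**", "disease": "heart disease"},
]

k = len(equivalence_class)                              # k = 3
is_k_anonymous = k >= 3                                 # satisfied
# ...yet the sensitive attribute has only one value in the class,
# so membership alone discloses every member's disease.
distinct_sensitive = {r["disease"] for r in equivalence_class}
attribute_disclosed = len(distinct_sensitive) == 1
```

This is exactly the gap that l-diversity, and later closeness, were designed to address.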

Proposed System

            Here a novel privacy notion called “closeness” is proposed. Starting from the idea of global background knowledge, the base model t-closeness is formalized first; it requires that the distribution of a sensitive attribute in any equivalence class be close to the distribution of that attribute in the overall table. While the released table gives useful information to researchers, it presents a disclosure risk to the individuals whose data are in the table. The objective, therefore, is to limit the disclosure risk to an acceptable level while maximizing the benefit. Closeness achieves a better balance between privacy and utility than existing privacy models such as l-diversity and t-closeness. Finally, the effectiveness of the closeness model in both privacy protection and utility preservation is evaluated through an integrated hospital website with a real data set.


Hardware Requirements

Processor                     : Pentium IV
RAM                           : 512 MB
Hard Disk                     : 40 GB
Monitor                       : 15” Color Monitor
Keyboard                      : Multimedia
Mouse                         : Optical

Software Requirements

            Framework                  : Visual Studio 2005
            Front End                  : ASP.NET 2.0
            Code Behind                : C#
            Database                   : SQL Server 2000



