CLOSENESS: A NEW PRIVACY MEASURE FOR DATA PUBLISHING
ABSTRACT
The k-anonymity privacy requirement for publishing microdata requires that each
equivalence class (i.e., a set of records that are indistinguishable from each
other with respect to certain “identifying” attributes) contains at least k
records. Recently, several authors have recognized that k-anonymity cannot
prevent attribute disclosure. The notion of l-diversity has been proposed to
address this; l-diversity requires that each equivalence class has at least l
well-represented values for each sensitive attribute. In this project, we show
that l-diversity has a number of limitations. In particular, it is neither
necessary nor sufficient to prevent attribute disclosure. Motivated by these
limitations, we propose a new notion of privacy called “closeness.” We first
present the base model t-closeness, which requires that the distribution of a
sensitive attribute in any equivalence class is close to the distribution of
the attribute in the overall table. We then propose a more flexible privacy
model called (n,t)-closeness that offers higher utility. We describe our
desiderata for designing a distance measure between two probability
distributions and present two distance measures. We discuss the rationale for
using closeness as a privacy measure and illustrate its advantages through
examples and experiments.
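As a concrete illustration of the t-closeness requirement stated above, the following sketch checks whether every equivalence class's sensitive-attribute distribution lies within distance t of the overall distribution. Total variation distance is used here because, for categorical attributes with equal ground distances, it coincides with the Earth Mover's Distance; the function names are our own, not from the paper:

```python
from collections import Counter

def distribution(values):
    """Empirical distribution of a sensitive attribute as a dict."""
    n = len(values)
    return {v: c / n for v, c in Counter(values).items()}

def variational_distance(p, q):
    """Total variation distance between two distributions.
    For categorical attributes with equal ground distances, the
    EMD measure reduces to this value."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(v, 0.0) - q.get(v, 0.0)) for v in support)

def satisfies_t_closeness(classes, table_values, t):
    """True if every equivalence class's sensitive-attribute
    distribution is within distance t of the overall distribution."""
    overall = distribution(table_values)
    return all(variational_distance(distribution(c), overall) <= t
               for c in classes)
```

For example, splitting a table of two "flu" and two "cancer" records into two mixed classes gives distance 0 and passes, whereas splitting it into an all-"flu" class and an all-"cancer" class gives distance 0.5 and fails for small t.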
SYSTEM ANALYSIS
PROBLEM DEFINITION
The problem of information disclosure has been studied extensively in the
framework of statistical databases, and a number of information disclosure
limitation techniques have been designed for data publishing. The first
category of work aims at devising privacy requirements. A few subsequent works
recognize that the adversary also has knowledge of the distribution of the
sensitive attribute in each group. t-Closeness additionally assumes that the
distribution of the sensitive attribute in the overall table is public
information. Privacy-preserving data publishing has been extensively studied
in several other aspects. First, background knowledge presents additional
challenges in defining privacy requirements. To generalize the data, we use
the Mondrian algorithm, which partitions the high-dimensional space into
regions and encodes the data points in each region by the region's
representative. On the theoretical side, optimal k-anonymity has been proved
NP-hard, and approximation algorithms for finding the anonymization that
suppresses the fewest cells have been proposed.
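The median-cut idea behind the Mondrian algorithm mentioned above can be sketched as follows. This is a simplified illustration assuming numeric quasi-identifiers, not the paper's exact implementation:

```python
def mondrian_partition(records, k, dims):
    """Recursively split records at the median of the widest
    quasi-identifier dimension; stop cutting when a split would
    create a region smaller than k (the k-anonymity parameter).
    `records` is a list of tuples; `dims` are the indices of the
    quasi-identifier attributes."""
    def spread(d):
        vals = [r[d] for r in records]
        return max(vals) - min(vals)

    # choose the dimension with the widest range of values
    d = max(dims, key=spread)
    records = sorted(records, key=lambda r: r[d])
    mid = len(records) // 2
    left, right = records[:mid], records[mid:]
    if len(left) < k or len(right) < k or spread(d) == 0:
        return [records]  # cannot cut further; emit one region
    return mondrian_partition(left, k, dims) + mondrian_partition(right, k, dims)
```

Each returned region would then be encoded by a representative (for example, the value range on each quasi-identifier dimension).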
Existing System
In the existing system, several methods have recognized that k-anonymity
cannot prevent attribute disclosure: it does not maintain privacy adequately
and is of limited use for formulating census evaluations. Earlier systems did
not make secure data publishing possible and do not provide sufficient
protection against attribute disclosure. Some hospitals only maintain a
database of each patient's history, and there is no integrated structure in
the existing systems: anyone can enter the application and view a patient's
history without any security check. These are the major drawbacks of the
existing applications. To overcome these problems, future applications must
provide an integrated solution with privacy protection built in.
Problems in Existing System
K-anonymity cannot prevent attribute disclosure.
It is not able to maintain privacy.
There is no integrated structure in the existing systems.
Proposed System
Here a novel privacy notion called "closeness" is proposed. Starting from the
assumption that the adversary knows the global distribution of the sensitive
attribute, the base model t-closeness is first formalized; it requires that
the distribution of a sensitive attribute in any equivalence class be close to
the distribution of that attribute in the overall table. While the released
table gives useful information to researchers, it presents a disclosure risk
to the individuals whose data are in the table. The objective, therefore, is
to limit the disclosure risk to an acceptable level while maximizing the
benefit. Closeness achieves a better balance between privacy and utility than
existing privacy models such as l-diversity and t-closeness. Finally, the
effectiveness of the closeness model, in both privacy protection and utility
preservation, is evaluated through an integrated hospital website with a real
data set.
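The more flexible (n,t)-closeness model mentioned in the abstract can be sketched as follows, under the assumption that candidate "natural" supersets of each equivalence class are supplied (the paper derives them from the generalization structure). Total variation distance stands in for the paper's EMD-based measures, and all names here are illustrative:

```python
from collections import Counter

def _dist(values):
    """Empirical distribution of a sensitive attribute as a dict."""
    n = len(values)
    return {v: c / n for v, c in Counter(values).items()}

def _tvd(p, q):
    """Total variation distance between two distributions."""
    return 0.5 * sum(abs(p.get(v, 0.0) - q.get(v, 0.0))
                     for v in set(p) | set(q))

def satisfies_nt_closeness(class_values, natural_supersets, n, t):
    """An equivalence class satisfies (n,t)-closeness if some natural
    superset containing at least n records has a sensitive-attribute
    distribution within distance t of the class's own distribution."""
    p = _dist(class_values)
    return any(len(s) >= n and _tvd(p, _dist(s)) <= t
               for s in natural_supersets)
```

Relaxing the comparison from the whole table to a large enough superset is what lets (n,t)-closeness retain more utility than t-closeness while still bounding what the adversary learns.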
Hardware Requirements
Processor : Pentium IV
RAM : 512 MB
Hard Disk : 40 GB
Monitor : 15" Color Monitor
Keyboard : Multimedia
Mouse : Optical

Software Requirements
Framework : Visual Studio 2005
Front End : ASP.NET 2.0
Code Behind : C#.NET
Database : SQL Server 2000