Differential Privacy

Differential privacy is a measure of the risk to an individual that arises from participating in a statistical database. The term falls within the area of secure, privacy-preserving publication of sensitive information. Mechanisms that satisfy differential privacy prevent an attacker from distinguishing whether or not a particular person is included in a database.

Situation

Differential privacy is used in the context of publishing sensitive information. One would like to make general statistical information available while at the same time protecting the privacy of the individual participants.

A conceivable scenario is the publication of a hospital's patient data in order to enable researchers to detect statistical correlations. The data must be processed in such a way that individual patients cannot be identified. For research purposes, it is sufficient that the statistical distributions of the published data correspond to those of the true data; whether individual attributes, such as a patient's name, have been modified does not matter.

Motivation

With the advent of outsourcing and cloud computing, awareness of data protection in this context has grown. The cost reductions and performance improvements that one hopes to gain from these new technologies come at the price of losing control over the data.

While encryption is an established tool for effectively ensuring privacy, it also restricts how the data can be processed. To perform operations on encrypted data, a full or partial decryption usually has to be carried out, which incurs corresponding costs. In addition, the use of encryption prevents full publication of the data.

One therefore looks for methods that allow information to be published in a privacy-preserving way without encrypting the data.

Differential privacy pursues the approach of adding noise to the data in order to make definite statements about certain properties of the data impossible.

ε-differential privacy

The original concept of differential privacy uses a central parameter ε, which provides a measure of privacy.

Definition

A randomized function $K$ satisfies $\varepsilon$-differential privacy if, for all data sets $D_1$ and $D_2$ that differ in at most one entry, and for all $S \subseteq \mathrm{Range}(K)$,

$$\Pr[K(D_1) \in S] \le e^{\varepsilon} \cdot \Pr[K(D_2) \in S].$$
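
For illustration (a standard argument, added here as a sketch): the Laplace mechanism $K(D) = f(D) + \mathrm{Lap}(\Delta f / \varepsilon)$ satisfies this definition for any query $f$ whose value changes by at most the sensitivity $\Delta f$ between data sets differing in one entry, since its output density $p_{D}(x) \propto \exp(-\varepsilon\,|x - f(D)|/\Delta f)$ obeys

$$\frac{p_{D_1}(x)}{p_{D_2}(x)} = \exp\!\left(\frac{\varepsilon\,(|x - f(D_2)| - |x - f(D_1)|)}{\Delta f}\right) \le \exp\!\left(\frac{\varepsilon\,|f(D_1) - f(D_2)|}{\Delta f}\right) \le e^{\varepsilon},$$

where the first inequality is the triangle inequality and the second uses the definition of $\Delta f$.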

Mechanisms

There are different approaches to implementing the requirements of differential privacy in mechanisms. By adding noise to a given data set, the desired properties can be obtained. Noise can also be introduced by generating new entries. These new entries, also called dummies, must be indistinguishable from the original data in order to meet the requirements of differential privacy.
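
One standard way to add noise is the Laplace mechanism. The following minimal Python sketch (the function name and sample data are illustrative assumptions, not from the original text) perturbs the answer to a counting query:

    import numpy as np

    def laplace_mechanism(true_value, sensitivity, epsilon, rng=None):
        # Add Laplace noise with scale sensitivity/epsilon; this is the
        # classic mechanism achieving epsilon-differential privacy.
        rng = rng or np.random.default_rng()
        return true_value + rng.laplace(loc=0.0, scale=sensitivity / epsilon)

    # A counting query ("how many patients are older than 40?") has
    # sensitivity 1: adding or removing one person changes it by at most 1.
    ages = [34, 51, 29, 62, 47]
    true_count = sum(1 for a in ages if a > 40)
    noisy_count = laplace_mechanism(true_count, sensitivity=1.0, epsilon=0.5)
    print(noisy_count)

The noise scale grows as ε shrinks, so stronger privacy directly reduces the accuracy of the published answer.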

Relaxation

ε-differential privacy places high demands on mechanisms, which can cause the results to lose much of their utility. If too much noise is generated and the result deviates too strongly from the original data, the information content is severely limited.

(ε, δ)-differential privacy

Since ε-differential privacy imposes certain restrictions on applicability, the extension to (ε, δ)-differential privacy was introduced. This notion allows the conditions to remain unfulfilled to a certain degree.

Definition

A randomized mechanism $M$ is said to satisfy $(\varepsilon, \delta)$-differential privacy if, for all data sets $D_1$ and $D_2$ that differ in at most one entry, and for all $S \subseteq \mathrm{Range}(M)$, the following equation holds:

$$\Pr[M(D_1) \in S] \le e^{\varepsilon} \cdot \Pr[M(D_2) \in S] + \delta.$$
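
A mechanism commonly used to meet this relaxed guarantee is the Gaussian mechanism. The sketch below is illustrative, assuming the classical noise calibration, which is valid for $0 < \varepsilon < 1$:

    import math
    import numpy as np

    def gaussian_mechanism(true_value, sensitivity, epsilon, delta, rng=None):
        # Classical Gaussian mechanism: for 0 < epsilon < 1, noise with
        # sigma >= sqrt(2 * ln(1.25/delta)) * sensitivity / epsilon gives
        # (epsilon, delta)-differential privacy. "sensitivity" is the (L2)
        # sensitivity of the query; for a scalar count it equals 1.
        rng = rng or np.random.default_rng()
        sigma = math.sqrt(2.0 * math.log(1.25 / delta)) * sensitivity / epsilon
        return true_value + rng.normal(loc=0.0, scale=sigma)

    noisy = gaussian_mechanism(true_value=3.0, sensitivity=1.0,
                               epsilon=0.5, delta=1e-5)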

Probabilistic definition

Given a randomized mechanism $M$ and constants $\varepsilon$ and $\delta$, we say that $M$ satisfies probabilistic $(\varepsilon, \delta)$-differential privacy if, for every data set $D_1$, the range of values of $M$ can be divided into two sets $\Omega_1$ and $\Omega_2$ such that $\Pr[M(D_1) \in \Omega_2] \le \delta$ and, for every data set $D_2$ that differs from $D_1$ in at most one entry and for every $S \subseteq \Omega_1$, the following equation is valid:

$$\Pr[M(D_1) \in S] \le e^{\varepsilon} \cdot \Pr[M(D_2) \in S].$$
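
As a brief illustration of how this relates to the previous definition (a standard argument, added here): probabilistic $(\varepsilon, \delta)$-differential privacy implies $(\varepsilon, \delta)$-differential privacy, since for any set $S$,

$$\Pr[M(D_1) \in S] \le \Pr[M(D_1) \in S \cap \Omega_1] + \Pr[M(D_1) \in \Omega_2] \le e^{\varepsilon} \cdot \Pr[M(D_2) \in S \cap \Omega_1] + \delta \le e^{\varepsilon} \cdot \Pr[M(D_2) \in S] + \delta.$$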

Differences from ε-differential privacy

ε-differential privacy has been extended by the parameter δ. The mechanism is thereby allowed to violate the requirement of ε-differential privacy to a certain degree, which is bounded by δ.

Composability

Differential privacy has the property of composability. If a number n of queries are posed to a differential-privacy mechanism with a given ε, and the mechanism uses an independent source of randomness for each evaluation, the combined result satisfies nε-differential privacy.
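
A minimal sketch of this composition behavior (the data set, predicates, and per-query budget are illustrative assumptions), reusing the Laplace mechanism for counting queries:

    import numpy as np

    rng = np.random.default_rng()

    def noisy_count(data, predicate, epsilon):
        # Counting query (sensitivity 1) answered via the Laplace mechanism,
        # drawing fresh, independent noise on every call.
        return sum(map(predicate, data)) + rng.laplace(scale=1.0 / epsilon)

    ages = [34, 51, 29, 62, 47]
    eps = 0.2  # per-query privacy budget
    answers = [
        noisy_count(ages, lambda a: a > 40, eps),
        noisy_count(ages, lambda a: a < 30, eps),
        noisy_count(ages, lambda a: a % 2 == 0, eps),
    ]
    # By composability, releasing all three answers together satisfies
    # (3 * 0.2) = 0.6-differential privacy.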

Computational Differential Privacy

Differential privacy is a statistical concept that places no formal restrictions on the power of an attacker. For practical applications, it is sensible to impose certain limitations. For example, the use of encryption requires that an attacker cannot find the private key by simple trial and error.

There are essentially two ways to achieve computational differential privacy (CDP). One is to simply extend the existing definition of differential privacy by a restriction on the attacker; this approach is called IND-CDP. The other is the notion of SIM-CDP, which requires that a simulation of the attacker's view exists.
