Data Leakage Detection Project

I've been assigned a project on data leakage detection, but I googled it and couldn't understand it. I need a very basic definition. What is its use, how can I build software for it, how is a data leak detected, and why is data lost during transfer? Pretty much everything about data leakage detection.

Data Leakage Detection Abstract: We study the following problem: A data distributor has given sensitive data to a set of supposedly trusted agents (third parties). Some of the data are leaked and found in an unauthorized place (e.g., on the web or somebody's laptop). The distributor must assess the likelihood that the leaked data came from one or more agents.

coders_org

1 Answer

Though the question is off topic...

Definition of: Data leakage

The unauthorized transfer of classified information from a computer or datacenter to the outside world. Data leakage can be accomplished by simply mentally remembering what was seen, by physical removal of tapes, disks and reports or by subtle means such as data hiding (see steganography).

Definition of: Data leakage protection software

See Data loss prevention software

DMK

Data Leakage Detection

In the course of doing business, sensitive data must sometimes be handed over to supposedly trusted third parties. For example, a hospital may give patient records to researchers who will devise new treatments. Similarly, a company may have partnerships with other companies that require sharing customer data. Another enterprise may outsource its data processing, so data must be given to various other companies. We call the owner of the data the distributor and the supposedly trusted third parties the agents. Our goal is to detect when the distributor’s sensitive data has been leaked by agents, and if possible to identify the agent that leaked the data.

We consider applications where the original sensitive data cannot be perturbed. Perturbation is a very useful technique where the data is modified and made “less sensitive” before being handed to agents. For example, one can add random noise to certain attributes, or one can replace exact values by ranges. However, in some cases it is important not to alter the distributor’s original data. For example, if an outsourcer is doing our payroll, the outsourcer must have the exact salary and customer bank account numbers. If medical researchers will be treating patients (as opposed to simply computing statistics), they may need accurate data for the patients.
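To make the two perturbation examples above concrete, here is a minimal Python sketch. It is only an illustration under assumed values: the function names `add_noise` and `to_range`, the sample salaries, the noise scale, and the range width are all hypothetical and do not come from the text.

```python
import random


def add_noise(value, scale, rng):
    """Return value plus zero-mean Gaussian noise with standard deviation `scale`."""
    return value + rng.gauss(0.0, scale)


def to_range(value, width):
    """Replace an exact value with the half-open range [low, high) that contains it."""
    low = (value // width) * width
    return (low, low + width)


# Hypothetical salary attribute; a fixed seed keeps the sketch reproducible.
rng = random.Random(0)
salaries = [41250, 58900, 73400]

# Perturbation 1: add random noise to each value.
noisy = [round(add_noise(s, 500.0, rng)) for s in salaries]

# Perturbation 2: replace each exact value by a range of width 10000.
ranges = [to_range(s, 10000) for s in salaries]

print(noisy)
print(ranges)
```

Either output is enough for computing rough statistics, but neither would do for the payroll case, where the exact figures are needed. That is the situation in which perturbation is ruled out.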