Privacy measures for free text documents : bridging the gap between theory and practice

Type: Book Chapter
Proceedings title: Trust, Privacy and Security in Digital Business: 8th International Conference, TrustBus 2011, Toulouse, France, August 29 - September 2, 2011. Proceedings
Series title: Lecture Notes in Computer Science; Volume 6863
Conference: 8th International Conference on Trust, Privacy & Security in Digital Business (TrustBus’11), August 29 - September 2, 2011, Toulouse, France
Pages: 161-173; # of pages: 12
Subject: Privacy compliance; ontology; privacy measure; personal health information
Abstract: Privacy compliance for free text documents is a challenge facing many organizations. Named entity recognition techniques and machine learning methods can be used to detect private information, such as personally identifiable information (PII) and personal health information (PHI), in free text documents. However, these methods cannot measure the level of privacy embodied in the documents. In this paper, we propose a framework to measure the privacy content in free text documents. The measure consists of two factors: the probability that the text can be used to uniquely identify a person, and the degree of sensitivity of the private entities associated with that person. We then instantiate the framework in the scenario of detecting and protecting PHI in medical records, which is a challenge for many hospitals, clinics, and other medical institutions. We conducted experiments on a real dataset to show the effectiveness of the proposed measure.
Publication date
Publisher: Springer Berlin Heidelberg
Affiliation: NRC Institute for Information Technology; National Research Council Canada
Peer reviewed: No
NPARC number: 18533383
Record identifier: a9402a69-4fa7-4d42-8bed-214a5cf618c7
Record created: 2011-09-03
Record modified: 2016-07-15