posted on 2017-02-06, 15:05 authored by Vanessa Ayala-Rivera, Andrés Omar Portillo-Domínguez, Liam Murphy, Christina Thorpe
Conducting extensive testing of anonymization techniques
is critical to assess their robustness and identify the scenarios where
they are most suitable. However, access to real microdata is highly
restricted, and the data that is publicly available is usually anonymized
or aggregated, which reduces its value for testing purposes. In this
paper, we present a framework (COCOA) for the generation of realistic
synthetic microdata that allows users to define multi-attribute relationships in
order to preserve the functional dependencies of the data. We demonstrate how
COCOA is useful to strengthen the testing of anonymization techniques
by broadening the number and diversity of the test scenarios. Results
also show that COCOA is practical for generating large datasets.
History
Publication
International Conference on Privacy in Statistical Databases: Lecture Notes in Computer Science (LNCS), vol. 9867, pp. 163-177
Publisher
Springer
Note
peer-reviewed
Other Funding information
SFI
Rights
The original publication is available at www.springerlink.com