CoDS: A representative sampling method for relational databases

Buda, Teodora Sandra; Cerqueus, Thomas; Murphy, John; Kristiansen, Morten


conference contribution

posted on 2014-08-05, 08:58 authored by Teodora Sandra Buda, Thomas Cerqueus, John Murphy, Morten Kristiansen

Database sampling has become a popular approach to handle large amounts of data in a wide range of application areas such as data mining or approximate query evaluation. Using database samples is a potential solution when using the entire database is not cost-effective, and a balance between the accuracy of the results and the computational cost of the process applied on the large data set is preferred. Existing sampling approaches are either limited to specific application areas, to single table databases, or to random sampling. In this paper, we propose CoDS: a novel sampling approach targeting relational databases that ensures that the sample database follows the same distribution for specific fields as the original database. In particular, it aims to maintain the distribution between tables. We evaluate the performance of our algorithm by measuring the representativeness of the sample with respect to the original database. We compare our approach with two existing solutions, and we show that our method performs faster and produces better results in terms of representativeness.

History

Publication

24th International Conference on Database and Expert Systems Applications (DEXA 2013), Lecture Notes in Computer Science, vol. 8055, pp. 342-356

Publisher

Springer

Note

peer-reviewed

Other Funding information

SFI

Rights

The original publication is available at www.springerlink.com

Language

English

External identifier

http://dx.doi.org/10.1007/978-3-642-40285-2_30
