Towards realistic sampling: generating dependencies in a relational database
Date
2013
Abstract
Managing large amounts of information is one of the most expensive, time-consuming, and non-trivial activities, and it usually requires expert knowledge. In a wide range of application areas, such as data mining, histogram construction, approximate query evaluation, and software validation, handling exponentially growing databases has become a difficult challenge, and a subset of the data is generally preferred. As a solution to the current challenges in managing large amounts of data, database sampling from the available operational data has proved to be a powerful technique. However, none of the existing sampling approaches considers the dependencies between the data in a relational database. In this paper, we propose a novel approach towards constructing a realistic testing environment: we analyze the distribution of data in the original database along these dependencies before sampling, so that the sample database is representative of the original database.
Description
peer-reviewed
Publisher
Association for Computing Machinery
Citation
ACM ICUIMC’13; Article no. 12
Files
Buda_2013_towards.pdf
Adobe PDF, 383.93 KB
Funding Information
Science Foundation Ireland (SFI)
