Scientific discovery through weighted sampling

Sidirourgos, Eleftherios; Kersten, Martin; Boncz, Peter

E. Sidirourgos (Eleftherios), M.L. Kersten (Martin) and P.A. Boncz (Peter)

2013-11-25

Presented at the IEEE International Conference on Big Data

Scientific discovery has shifted from an exercise of theory and computation to the exploration of an ocean of observational data. Scientists explore data originating from modern scientific instruments in order to discover interesting aspects of it and formulate their hypotheses. Such workloads press for new database functionality. We aim at sampling scientific databases to create many different impressions of the data, on which scientists can quickly evaluate exploratory queries. However, scientific databases pose different challenges for sample construction than classical business analytical applications. We propose adaptive weighted sampling as an alternative to uniform sampling. With weighted sampling, only the most informative data is sampled, so more data relevant to the scientific discovery is available for examining a hypothesis. Relevant data is considered to be the focal points of the scientific search, and can be defined either a priori with the use of functions, or by monitoring the query workload. We study such query workloads, and we detail different families of weight functions. Finally, we give a quantitative and qualitative evaluation of weighted sampling.
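
The contrast between uniform and weighted sampling described in the abstract can be illustrated with a minimal sketch. It is not the paper's algorithm: the Gaussian-shaped `focal_weight` function, its `bandwidth` parameter, and the use of exponential-clock keys (an Efraimidis-Spirakis style weighted sample without replacement) are all assumptions chosen for illustration, and every name below is hypothetical.

```python
import heapq
import math
import random


def focal_weight(value, focal_point, bandwidth=1.0):
    """Hypothetical weight function: largest at the focal point, decaying with distance."""
    distance = (value - focal_point) / bandwidth
    return math.exp(-0.5 * distance * distance)


def weighted_sample(rows, weight_fn, k, rng=None):
    """Draw up to k rows without replacement, favouring rows with larger weights.

    Assumed stand-in technique: each row gets the key E / w, where E is an
    Exp(1) draw and w is the row's weight; the k smallest keys win.
    """
    rng = rng or random.Random()
    keyed = []
    for row in rows:
        w = weight_fn(row)
        if w <= 0:
            continue  # rows with zero weight are never sampled
        keyed.append((rng.expovariate(1.0) / w, row))
    return [row for _, row in heapq.nsmallest(k, keyed, key=lambda pair: pair[0])]


def uniform_sample(rows, k, rng=None):
    """Baseline: every row is equally likely to be picked."""
    rng = rng or random.Random()
    return rng.sample(rows, k)


if __name__ == "__main__":
    rows = list(range(1000))  # stand-in for one numeric column
    near_500 = lambda v: focal_weight(v, focal_point=500, bandwidth=20.0)
    print(sorted(weighted_sample(rows, near_500, 10, random.Random(7))))
    print(sorted(uniform_sample(rows, 10, random.Random(7))))
```

Under these assumptions the weighted sample concentrates around the focal point, while the uniform sample spreads over the whole value range. A weight derived from a monitored query workload could be passed as `weight_fn` in the same way.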

Additional Metadata
Conference	IEEE International Conference on Big Data
Organisation	Database Architectures
Citation	Sidirourgos, E., Kersten, M., & Boncz, P. (2013). Scientific discovery through weighted sampling.
