Fast, explainable view detection to characterize exploration queries

Sellam, Thibault; Kersten, Martin

doi:10.1145/2949689.2949692

The aim of data exploration is to get acquainted with an unfamiliar database. Typically, explorers operate by trial and error: they submit a query, study the result, and then refine the query. In this paper, we investigate how to help them understand their query results. In particular, we focus on medium- to high-dimensional spaces: if the database contains dozens or hundreds of columns, which variables should they inspect? We propose to detect subspaces in which the users' selection differs from the rest of the database. From this idea, we built Ziggy, a tuple description engine. Ziggy can detect informative subspaces, and it can explain why it recommends them, with visualizations and natural language. It copes with mixed data and missing values, and it penalizes redundancy. Our experiments reveal that it is up to an order of magnitude faster than state-of-the-art feature selection algorithms, at minimal cost in accuracy.

Additional Metadata
Persistent URL	doi.org/10.1145/2949689.2949692
Project	The SciLens-II Infrastructure, Big Data at work, Commit: Time Trails (P019)
Conference	International Conference on Scientific and Statistical Database Management
Grant	This work was funded by the Netherlands Organisation for Scientific Research (NWO); grant id nwo/621.016.201 - The SciLens-II Infrastructure, Big Data at work
Organisation	Database Architectures
Citation	Sellam, T., & Kersten, M. (2016). Fast, explainable view detection to characterize exploration queries. doi:10.1145/2949689.2949692
