Extended methods to handle classification biases

Beauxis-Aussalet, Emmanuelle; Hardman, Lynda

doi:10.1109/DSAA.2017.52

E.M.A.L. Beauxis-Aussalet (Emmanuelle) and L. Hardman (Lynda)

2017-10-19

Presented at the International Conference on Data Science and Advanced Analytics (October 2017), Tokyo, Japan

Classifiers can provide counts of items per class, but systematic classification errors yield biases (e.g., if a class is often misclassified as another, its size may be under-estimated). To handle classification biases, the statistics and epidemiology domains devised methods for estimating unbiased class sizes (or class probabilities) without identifying which individual items are misclassified. These bias correction methods are applicable to machine learning classifiers, but in some cases yield high result variance and increased biases. We present the applicability and drawbacks of existing methods and extend them with three novel methods. Our Sample-to-Sample method provides accurate confidence intervals for the bias correction results. Our Maximum Determinant method predicts which classifier yields the least result variance. Our Ratio-to-TP method details the error decomposition in classifier outputs (i.e., how many items classified as class C_y truly belong to C_x, for all possible classes) and has properties of interest for applying the Maximum Determinant method. Our methods are demonstrated empirically, and we discuss the need for establishing theory and guidelines for choosing the methods and classifier to apply.
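As a rough illustration of the kind of existing bias correction the abstract refers to, the sketch below applies the classic misclassification-matrix approach: error rates are estimated from a labelled test set, then the observed per-class counts are corrected by solving a small linear system. It is not the paper's Sample-to-Sample, Maximum Determinant, or Ratio-to-TP method. The function names (`error_rates`, `solve`, `corrected_counts`) and all numbers are hypothetical and chosen only for this example.

```python
def error_rates(confusion):
    """Row-normalise a test-set confusion matrix.

    confusion[x][y] = number of test items of true class x classified as y.
    Returns theta[x][y], an estimate of P(classified as y | true class x).
    """
    return [[cell / sum(row) for cell in row] for row in confusion]


def solve(a, b):
    """Solve the linear system a * x = b by Gauss-Jordan elimination."""
    n = len(b)
    # Augmented matrix [a | b]; rows are copied so the inputs stay untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: pick the row with the largest entry in this column.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("error-rate matrix is singular; cannot correct")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * w for v, w in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def corrected_counts(confusion, observed):
    """Estimate true class sizes from classifier output counts.

    Assumes the error rates measured on the test set also hold for the
    target set, so that observed[y] = sum over x of theta[x][y] * true[x].
    """
    theta = error_rates(confusion)
    n = len(theta)
    # Transpose so that each equation corresponds to one output class y.
    theta_t = [[theta[x][y] for x in range(n)] for y in range(n)]
    return solve(theta_t, observed)


# Hypothetical two-class example (all numbers are made up).
test_confusion = [[90, 10],   # true class 0: 90 correct, 10 classified as 1
                  [20, 80]]   # true class 1: 20 classified as 0, 80 correct
observed = [500, 500]         # classifier output counts on the target set
estimate = corrected_counts(test_confusion, observed)
print(estimate)
```

With these made-up numbers the raw output suggests 500 items per class, while the corrected estimate is roughly 429 and 571. The correction preserves the total of 1000 items, but it depends on error rates estimated from a finite test set, which is one source of the result variance the abstract mentions.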

Additional Metadata
Persistent URL	doi.org/10.1109/DSAA.2017.52
Project	Supporting humans in knowledge gathering and question answering w.r.t. marine and environmental monitoring through analysis of multiple video streams
Conference	International Conference on Data Science and Advanced Analytics
Grant	This work was funded by the European Commission 7th Framework Programme; grant id fp7/257024 - Supporting humans in knowledge gathering and question answering w.r.t. marine and environmental monitoring through analysis of multiple video streams (FISH4KNOWLEDGE)
Organisation	Human-Centered Data Analytics
Citation	Beauxis-Aussalet, E., & Hardman, L. (2017). Extended methods to handle classification biases. In 2017 International Conference on Data Science and Advanced Analytics, DSAA 2017 (pp. 765–774). doi:10.1109/DSAA.2017.52

See Also
inProceedings Extended Methods to Handle Classification Biases E.M.A.L. Beauxis-Aussalet (Emmanuelle) and L. Hardman (Lynda)
presentation Extended methods to handle classification biases E.M.A.L. Beauxis-Aussalet (Emmanuelle) and L. Hardman (Lynda)