We present a four-category classification algorithm for the solar wind, based on Gaussian Process classification. The four categories are those previously adopted in Xu and Borovsky (2015): ejecta, coronal hole origin plasma, streamer belt origin plasma, and sector reversal origin plasma. The algorithm is trained and tested on a labeled portion of the OMNI data set. It uses seven inputs: the solar wind speed Vsw, the temperature standard deviation σT, the sunspot number R, the F10.7 index, the Alfvén speed vA, the proton specific entropy Sp, and the proton temperature Tp compared to a velocity-dependent expected temperature. The output of the Gaussian Process classifier is a four-element vector containing the probabilities that an event (one reading from the hourly averaged OMNI database) belongs to each category. The probabilistic nature of the prediction allows a more informative and flexible interpretation of the results; for instance, events can be classified as "undecided." The new method has a median accuracy above 90% for all categories, even when trained on a small set of data. The Receiver Operating Characteristic curve and the reliability diagram also demonstrate the excellent quality of the method. Finally, we use the algorithm to classify a large portion of the OMNI data set, and we present for the first time transition probabilities between different solar wind categories. Such probabilities represent the "climatological" statistics that determine the solar wind baseline.
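The sketch below illustrates, under stated assumptions, how a four-element probability vector could be turned into a label that allows "undecided," and how transition probabilities between categories could be estimated from a sequence of hourly labels. It is not the authors' implementation: the 0.5 threshold, the function names, the example probability vectors, and the choice to treat "undecided" as its own state are all hypothetical and are not taken from the paper.

```python
from collections import Counter

# The four categories named in the abstract (identifier spellings are ours).
CATEGORIES = ("ejecta", "coronal_hole", "streamer_belt", "sector_reversal")


def label_event(probs, threshold=0.5):
    """Return the most probable category, or 'undecided' if none is confident enough.

    `threshold` is a hypothetical cut-off, not a value from the paper.
    """
    if len(probs) != len(CATEGORIES):
        raise ValueError("expected one probability per category")
    best = max(range(len(probs)), key=lambda i: probs[i])
    return CATEGORIES[best] if probs[best] >= threshold else "undecided"


def transition_probabilities(labels):
    """Estimate P(next = b | current = a) from consecutive label pairs."""
    pair_counts = Counter(zip(labels, labels[1:]))
    row_totals = Counter()
    for (a, _b), n in pair_counts.items():
        row_totals[a] += n
    # Normalize each pair count by the number of transitions leaving state a.
    return {(a, b): n / row_totals[a] for (a, b), n in pair_counts.items()}


# Made-up probability vectors standing in for classifier output on five
# consecutive hourly events (illustrative only, not real OMNI results).
example_output = [
    (0.05, 0.85, 0.07, 0.03),
    (0.10, 0.70, 0.15, 0.05),
    (0.05, 0.40, 0.45, 0.10),
    (0.02, 0.08, 0.80, 0.10),
    (0.03, 0.07, 0.75, 0.15),
]
labels = [label_event(p) for p in example_output]
transitions = transition_probabilities(labels)
```

Here the third event has no class reaching the assumed threshold, so it is labeled "undecided" rather than forced into the streamer belt category. Whether undecided events should be kept as a separate state or excluded before counting transitions is a design choice that this sketch does not settle.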
Journal of Geophysical Research: Space Physics
Centrum Wiskunde & Informatica, Amsterdam (CWI), The Netherlands

Camporeale, E., Carè, A., & Borovsky, J. E. (2017). Classification of solar wind with machine learning. Journal of Geophysical Research: Space Physics, 122(11), 10910–10920. doi:10.1002/2017JA024383