We use vectorized User-Defined Functions (UDFs) to integrate unchanged machine learning pipelines efficiently into an analytical data management system. Entire pipelines, including data, models, parameters, and evaluation outcomes, are stored and executed inside the database system. Experiments with our MonetDB/Python UDFs show greatly improved performance, owing to reduced data movement and opportunities for parallel processing. In addition, this integration enables meta-analysis of models using relational queries.
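The idea can be illustrated with a minimal sketch. It is not the MonetDB/Python UDF interface: it assumes Python's built-in `sqlite3` module as a stand-in database and a toy nearest-centroid classifier as a stand-in model, and the names `train_centroids`, `predict`, `run_demo`, and the `points` and `models` tables are all hypothetical. It shows functions that receive whole columns at once, a trained model stored as a serialized value next to its parameters and accuracy, and a relational query over the stored models.

```python
import pickle
import sqlite3


def train_centroids(xs, ys, labels):
    """Column-at-a-time 'UDF' (hypothetical): takes whole columns, returns a pickled model."""
    sums = {}
    for x, y, label in zip(xs, ys, labels):
        sx, sy, n = sums.get(label, (0.0, 0.0, 0))
        sums[label] = (sx + x, sy + y, n + 1)
    # One centroid per class label.
    centroids = {label: (sx / n, sy / n) for label, (sx, sy, n) in sums.items()}
    return pickle.dumps(centroids)


def predict(xs, ys, model_bytes):
    """Column-at-a-time 'UDF' (hypothetical): returns one predicted label per input row."""
    centroids = pickle.loads(model_bytes)
    predictions = []
    for x, y in zip(xs, ys):
        nearest = min(
            centroids,
            key=lambda c: (x - centroids[c][0]) ** 2 + (y - centroids[c][1]) ** 2,
        )
        predictions.append(nearest)
    return predictions


def run_demo():
    # sqlite3 is only a stand-in here; the paper's system is MonetDB.
    con = sqlite3.connect(":memory:")
    con.execute("CREATE TABLE points (x REAL, y REAL, label TEXT)")
    con.execute(
        "CREATE TABLE models (name TEXT, params TEXT, accuracy REAL, classifier BLOB)"
    )
    rows = [(0.0, 0.1, "a"), (0.2, 0.0, "a"), (1.0, 0.9, "b"), (0.9, 1.1, "b")]
    con.executemany("INSERT INTO points VALUES (?, ?, ?)", rows)

    # Fetch the data as columns rather than processing it row by row.
    fetched = con.execute("SELECT x, y, label FROM points ORDER BY rowid").fetchall()
    xs, ys, labels = (list(col) for col in zip(*fetched))

    model_bytes = train_centroids(xs, ys, labels)
    predictions = predict(xs, ys, model_bytes)
    accuracy = sum(p == t for p, t in zip(predictions, labels)) / len(labels)

    # Store the model with its parameters and evaluation outcome.
    con.execute(
        "INSERT INTO models VALUES (?, ?, ?, ?)",
        ("nearest_centroid", "features=x,y", accuracy, model_bytes),
    )

    # Meta-analysis with a relational query: pick the most accurate stored model.
    best = con.execute(
        "SELECT name, accuracy FROM models ORDER BY accuracy DESC LIMIT 1"
    ).fetchone()
    con.close()
    return predictions, best


if __name__ == "__main__":
    print(run_demo())
```

In this sketch the data still crosses from the database into the Python process. The integration described above avoids that movement by running the functions inside the database system itself.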

doi.org/10.5441/002/edbt.2018.50
International Conference on Extending Database Technology
Database Architectures

Raasveldt, M., Holanda, P., Mühleisen, H., & Manegold, S. (2018). Deep integration of machine learning into column stores. In Advances in Database Technology - EDBT 2018 (pp. 473–476). doi:10.5441/002/edbt.2018.50