In this research project, we investigate an alternative to the standard cloud-centralized data architecture. Specifically, we aim to leave part of the application data under the control of the individual data owners, in conceptually decentralized personal data stores. Our primary goal is to increase data minimization, i.e., to keep more sensitive personal data under the control of its owners, while providing architects with a straightforward and efficient framework for designing data architectures that allow applications to run and their data to be analyzed. To serve this purpose, the centralized part of the schema contains aggregating views over the decentralized data. We propose to design a declarative language that extends SQL, with which architects specify at the schema level the different kinds of tables (decentralized, centralized, and replicated) and centralized materialized views, as well as the sensitivity of decentralized columns and the minimum granularity levels they must respect when they end up in centralized views. When users modify their personal data stores, the changes need to be reflected in the centralized views while preserving privacy; this calls for the integration of cryptographic techniques into distributed materialized view maintenance. Finally, we aim to implement this system, in which the personal data stores could live either on mobile devices or in encrypted cloud storage, in order to evaluate its performance experimentally.
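The interplay between decentralized personal data stores and a centralized aggregating view can be illustrated with a minimal Python sketch. It is not the proposed system or its SQL extension: the class names, the `city` and `spend` columns, the city-to-country mapping, and the reading of "minimum granularity" as coarsening plus a minimum group size are all assumptions made for illustration, and the cryptographic protection of the deltas is omitted entirely.

```python
from collections import defaultdict

# Hypothetical schema annotations (assumptions, not from the paper):
# the sensitive column "city" may only leave a personal store at "country"
# granularity, and a group is only visible centrally once it has at least
# MIN_GROUP_SIZE contributing owners.
CITY_TO_COUNTRY = {"Amsterdam": "NL", "Utrecht": "NL", "Berlin": "DE"}
MIN_GROUP_SIZE = 2


class CentralizedView:
    """Materialized aggregating view: owner count and total spend per country."""

    def __init__(self):
        self._count = defaultdict(int)
        self._total = defaultdict(float)

    def apply_delta(self, country, count_delta, spend_delta):
        """Incrementally maintain the view from an already-coarsened delta."""
        self._count[country] += count_delta
        self._total[country] += spend_delta

    def rows(self):
        """Expose only the groups that meet the minimum group size."""
        return {
            country: (self._count[country], self._total[country])
            for country in self._count
            if self._count[country] >= MIN_GROUP_SIZE
        }


class PersonalDataStore:
    """Holds one owner's row; only coarsened deltas are sent to the view."""

    def __init__(self, view):
        self._view = view
        self._row = None  # the raw row never leaves this object

    def upsert(self, city, spend):
        # Retract the previous contribution, if any, then add the new one.
        if self._row is not None:
            old_city, old_spend = self._row
            self._view.apply_delta(CITY_TO_COUNTRY[old_city], -1, -old_spend)
        self._view.apply_delta(CITY_TO_COUNTRY[city], +1, spend)
        self._row = (city, spend)
```

In this sketch each store coarsens its own data before anything is sent, so the view never sees a city, and a change to one store is propagated as a retraction followed by an insertion rather than by recomputing the aggregate.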

CEUR Workshop Proceedings
49th International Conference on Very Large Data Bases PhD Workshop, VLDB-PhD Workshop 2023
Database Architectures

Battiston, I., & Boncz, P.A. (2023). Improving data minimization through decentralized data architectures. In CEUR Workshop Proceedings.