2014-08-01
Predicting sense of community and participation by applying machine learning to open government data
Publication
Community capacity is used to monitor socio-economic development. It is composed of a number of dimensions, which can be measured to understand possible issues in the implementation of a policy or the outcome of a project targeting a community. Measuring community capacity dimensions is usually expensive and time-consuming, requiring locally organised surveys. Therefore, we investigate a technique to estimate them by applying the Random Forests algorithm to secondary open government data. Our research focuses on the prediction of measures for two dimensions: sense of community and participation. The most important variables for this prediction were determined. The variables included in the datasets used to train the predictive models complied with two criteria: nationwide availability and a sufficiently fine-grained geographic breakdown, i.e. neighbourhood level. The models explained 76.6% of the sense of community measures and 62.5% of the participation measures. Due to the low geographic detail of the outcome measures available, further research is required to apply the predictive models at neighbourhood level. The most important variables were only partially in agreement with the factors that, according to the social science literature consulted, influence sense of community and participation the most.
| Additional Metadata | |
|---|---|
| UvA | |
| L. Hardman (Lynda) | |
| Organisation | Human-Centered Data Analytics |

Piscopo, A. (2014, August). Predicting sense of community and participation by applying machine learning to open government data. UvA.