A formal method for rule analysis and validation in distributed data aggregation service

World Wide Web

Abstract

The use of Cloud services has grown rapidly in recent years. Because they are implemented in a distributed way, the data management systems behind any Cloud service are a major concern for scalability, flexibility and reliability. A Distributed Data Aggregation Service (DDAS) relying on a storage system meets these demands and serves as a repository back-end for complex analysis and automatic mining of any type of data. In this paper we continue our previous work on data management in Cloud storage. We present a formal approach for expressing retrieval and aggregation rules with a compact yet powerful tool called Rule Markup Language. Our extended solution proposes a standard form for schemes and uses the tool to match the rules against the XML form of the structured data in order to obtain the unstructured entries from the BlobSeer data storage system. This allows the DDAS to bypass several steps when processing a retrieval request. Our new architecture is more loosely coupled, with a separate module, the new tool, that transforms the XML entries into standard XML files representing the final result. We model the dynamic behavior of the system using this new standard to ensure a simpler and more efficient representation of the operations performed by the client while maintaining the constraints imposed by a distributed system running in the Cloud. Furthermore, we prove that this method correctly performs the translation between the storage model’s unstructured view of data and the client’s structured objects.
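To make the idea of matching an aggregation rule against an XML view of the data more concrete, the following is a minimal sketch in Python. It is not Rule Markup Language and not the DDAS implementation: the rule is a plain dictionary, and the element names, attribute names and the `apply_rule` helper are all hypothetical, chosen only to illustrate the match-then-aggregate step that produces an XML result.

```python
import xml.etree.ElementTree as ET

# Hypothetical structured-data view: element and attribute names are
# invented for illustration and do not come from the paper.
ENTRIES_XML = """
<entries>
  <entry sensor="s1" region="north"><value>4.0</value></entry>
  <entry sensor="s2" region="north"><value>6.0</value></entry>
  <entry sensor="s3" region="south"><value>10.0</value></entry>
</entries>
"""

# A toy rule: a pattern to match plus an aggregation to apply.
# This is NOT Rule Markup Language syntax, only a stand-in for the idea.
RULE = {"tag": "entry", "where": {"region": "north"}, "field": "value", "op": "avg"}

AGGREGATORS = {
    "sum": sum,
    "avg": lambda xs: sum(xs) / len(xs),
    "max": max,
    "min": min,
}


def apply_rule(xml_text, rule):
    """Match the rule against the XML entries and return a <result> element."""
    root = ET.fromstring(xml_text)
    values = []
    for elem in root.iter(rule["tag"]):
        # An entry matches when every attribute constraint in the rule holds.
        if all(elem.get(k) == v for k, v in rule["where"].items()):
            field = elem.find(rule["field"])
            if field is not None and field.text:
                values.append(float(field.text))
    result = ET.Element("result", {"op": rule["op"], "matched": str(len(values))})
    if values:
        # Only aggregate when something matched; an empty match has no value.
        result.text = str(AGGREGATORS[rule["op"]](values))
    return result


result = apply_rule(ENTRIES_XML, RULE)
print(ET.tostring(result, encoding="unicode"))
```

In this sketch the rule is kept separate from the code that applies it, loosely mirroring the abstract's description of a separate module that transforms XML entries into a result file; a real rule language would also need to express how the result itself is structured.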

Acknowledgment

The research presented in this paper was supported by projects: “SideDOWN: Smart Internet Data Downloader and Aggregator,” ID: PN-II-IN-CI-2012-1-0324; CyberWater grant of the Romanian National Authority for Scientific Research, CNDI-UEFISCDI, project number 47/2012; MobiWay: Mobility Beyond Individualism: an Integrated Platform for Intelligent Transportation Systems of Tomorrow - PN-II-PT-PCCA-2013-4-0321; clueFarm: Information system based on cloud services accessible through mobile devices, to increase product quality and business development farms - PN-II-PT-PCCA-2013-4-0870.

The work was developed under the DataCloud@Work associated team between the KerData and Myriads teams from INRIA Rennes - Bretagne Atlantique and the Computer Science Department of Politehnica University of Bucharest.

The work is partly funded by the EU project FP7-610582 ENVISAGE: Engineering Virtualized Services (http://www.envisage-project.eu).

We would like to thank the reviewers for their time and expertise, constructive comments and valuable insight.

Corresponding author

Correspondence to Florin Pop.

Cite this article

Serbanescu, V., Pop, F., Cristea, V. et al. A formal method for rule analysis and validation in distributed data aggregation service. World Wide Web 18, 1717–1736 (2015). https://doi.org/10.1007/s11280-015-0334-4

  • DOI: https://doi.org/10.1007/s11280-015-0334-4
