Graph Based Data Governance Model for Real Time Data Ingestion

Author: Hiren Dutta

Journal: International Journal of Information Technology and Computer Science (IJITCS)

Published in: Vol. 8, No. 10, 2016.

Open access

Data governance is one of the strongest pillars of a data management program and goes hand in hand with data quality. In an industrial Data Lake, a huge amount of unstructured data is ingested at high velocity from different source systems. Similarly, data is queried and transformed from the Data Lake through multiple channels. Given the 3Vs of big data (volume, velocity, variety), setting up rules with a traditional data governance system is a real challenge for an enterprise. Today, governing semi-structured or unstructured data in an industrial Data Lake is a real issue for the enterprise in terms of querying, creating, maintaining and storing that data effectively and securely. At the same time, different stakeholders, i.e. business, IT and policy teams, want to visualize the same data in different views in order to analyze it, impose constraints, and put in place an effective approval workflow for policy makers. In this paper the author proposes a property-graph-based governance architecture and process model so that real-time unstructured data can be effectively governed, visualized, managed and queried in an industrial Data Lake.
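To make the idea of a property graph for governance concrete, here is a minimal sketch in Python. It is not the architecture from the paper: the class names, the `GOVERNED_BY` edge label, and the dataset and policy names are all hypothetical, chosen only to show how labeled nodes and edges carrying key-value properties could link an ingested dataset to the policies that govern it.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: str
    label: str                      # e.g. "Dataset" or "Policy" (hypothetical labels)
    props: dict = field(default_factory=dict)


@dataclass
class Edge:
    src: str
    dst: str
    label: str                      # e.g. "GOVERNED_BY" (hypothetical label)
    props: dict = field(default_factory=dict)


class PropertyGraph:
    """Tiny in-memory property graph: labeled nodes and edges with properties."""

    def __init__(self):
        self.nodes = {}
        self.edges = []

    def add_node(self, node_id, label, **props):
        self.nodes[node_id] = Node(node_id, label, props)

    def add_edge(self, src, dst, label, **props):
        # Reject edges whose endpoints have not been registered as nodes.
        if src not in self.nodes or dst not in self.nodes:
            raise KeyError("both endpoints must exist before adding an edge")
        self.edges.append(Edge(src, dst, label, props))

    def neighbors(self, node_id, label):
        # Follow outgoing edges with the given label, in insertion order.
        return [self.nodes[e.dst] for e in self.edges
                if e.src == node_id and e.label == label]


# Illustrative data only: one ingested dataset and two policies.
g = PropertyGraph()
g.add_node("sensor_feed", "Dataset", format="json", source="plant_a")
g.add_node("pii_masking", "Policy", owner="policy_team", status="approved")
g.add_node("retention_90d", "Policy", owner="policy_team", status="pending")
g.add_edge("sensor_feed", "pii_masking", "GOVERNED_BY")
g.add_edge("sensor_feed", "retention_90d", "GOVERNED_BY")

# Which policies attached to the dataset have already been approved?
approved = [p.node_id for p in g.neighbors("sensor_feed", "GOVERNED_BY")
            if p.props["status"] == "approved"]
```

Because policy status lives on the node as a property, different stakeholders could filter the same graph by different properties (owner, status, source) without changing its structure.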


Keywords: Data Governance Architecture, Property Graph Process Model, Near Real-Time Data Governance, Data Lake Governance

Short URL: https://sciup.org/15012567

IDR: 15012567

Article type: research article