Big Data Architecture - Introduction
Dr. Atif Farid Mohammad PhD
Chief AI | Cyber Security | Officer | AI Advisory Board CapTechU | AI/ML/Quantum Computing | Chair | Board Member | Professor, Adjunct
This article develops a Data Architecture that supports the Enterprise Architecture and the organization's vision of using all the data it ingests and egresses for its short-term and long-term analytical needs. The design phase of such a data architecture must address both directly and indirectly related stakeholders, since every stakeholder has its own interest in using the transformed data sets.
This article also breaks most of the Big Data Architecture Roadmaps into smaller components by conducting a gap analysis. The gap analysis matters because it sets the Baseline Big Data Architecture against the target Architectures that result once the Data Architects have completed the distillation of the main Big Data Architecture.
Big Data Architectural Transformation Strategy
Let us take a healthcare concern as our first example; later we will look at an automotive service provider as a second example from the contemporary enterprise landscape. In this example, the healthcare concern has chosen to undertake a complete architectural transformation to meet all of its Big Data utilization and processing needs. It is important to recognize and address data management issues, since data management has traditionally been done with RDBMS solutions. All data management, including data ingestion and egress, has been handled by traditional SQL-oriented applications. A regulated and inclusive methodology, compliant with the healthcare laws applicable to data management, has been in place for quite some time and has enabled the effective use of data to capitalize on competitive advantages in both healthcare and other associated services.
The strategy for introducing a Big Data Architecture takes the following considerations into account.
- It is vital to establish a clear characterization of which application components in the broader organizational landscape will serve as the system of data storage that the applications reference as Organizational Master Data Management.
- Establish an organization-wide norm that all components of the Big Data system dealing with data ingestion or egress after transformation, including software packages (Hive, Pig, Impala, etc.), adopt prescriptive data models, so that the road map remains open to unforeseen needs arising from future organizational expansion.
- Clearly recognize the diverse needs of current organizational components in how data entities are employed by business functions, methods, and services, such as clinic applications in relation to the pharmaceutical and other medicinal concerns of all related stakeholders of our healthcare concern.
- Clearly understand the ETL (Extract, Transform & Load) processes, making sure that CRUD (Create, Read, Update & Delete) is taken into consideration in terms of Big Data technologies such as Hadoop/MapReduce.
- Determine the level and complexity of the data transformations required to support message passing between applications within the new Big Data system, since the analytical needs will also introduce tools such as Tableau or Jaspersoft, or the organization can opt to use R, the statistical programming language.
- Since the Big Data system will require software to support data integration using the Cloud to serve organizational customers (patients) and other stakeholders (medical equipment suppliers, pharmaceutical organizations), consider ETL tools, which can be developed in Java if using open source, in C# in the case of Microsoft, or in Scala, using a NoSQL store such as HBase, Cassandra, or CouchDB for data migration, along with data profiling tools such as Hive, Pig, or Impala to evaluate data quality.
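To make the ETL, prescriptive data model, and data quality considerations above more concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the architecture of any real system: the `PatientVisit` model, its field names, and the in-memory dictionary standing in for a NoSQL table are all hypothetical, and a production pipeline would run on the Big Data tooling named in the list rather than on plain Python collections.

```python
from dataclasses import dataclass
from typing import Optional


# Hypothetical prescriptive data model for one patient visit record.
@dataclass(frozen=True)
class PatientVisit:
    patient_id: str
    clinic: str
    medication: Optional[str]


def extract(rows):
    """Extract: yield raw rows (dicts) from a source, here an in-memory list."""
    for row in rows:
        yield row


def transform(row):
    """Transform: validate a raw row against the model; return None if invalid."""
    patient_id = (row.get("patient_id") or "").strip()
    clinic = (row.get("clinic") or "").strip()
    if not patient_id or not clinic:
        return None  # required fields are missing, so the row is rejected
    medication = (row.get("medication") or "").strip() or None
    return PatientVisit(patient_id, clinic.title(), medication)


def load(store, visit):
    """Load: upsert into a key-value store (a dict standing in for a NoSQL table)."""
    store[visit.patient_id] = visit


def run_etl(rows, store):
    """Run extract, transform, load and return simple data quality counts."""
    profile = {"read": 0, "loaded": 0, "rejected": 0, "missing_medication": 0}
    for row in extract(rows):
        profile["read"] += 1
        visit = transform(row)
        if visit is None:
            profile["rejected"] += 1
            continue
        if visit.medication is None:
            profile["missing_medication"] += 1
        load(store, visit)
        profile["loaded"] += 1
    return profile


if __name__ == "__main__":
    # Made-up sample rows, for illustration only.
    raw_rows = [
        {"patient_id": "p1", "clinic": "north clinic", "medication": "aspirin"},
        {"patient_id": "p2", "clinic": "south clinic", "medication": ""},
        {"patient_id": "", "clinic": "north clinic", "medication": "ibuprofen"},
    ]
    target_store = {}
    print(run_etl(raw_rows, target_store))
```

Keeping the model in one typed definition means every ingestion or egress component validates against the same shape, and the counts returned by `run_etl` give a first, very rough data quality profile before any dedicated profiling tool is involved.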
To be continued...