Big Data Architecture - Introduction

This article develops a Data Architecture that supports the Enterprise Architecture and the organization's vision of using all the data it ingests, and all the data it egresses, to meet its short-term and long-term analytical needs. The design phase of such a data architecture must address the needs of both directly and indirectly related stakeholders, since every stakeholder has an interest in using the transformed data sets.

This article also breaks most Big Data Architecture roadmaps down into smaller components by conducting a gap analysis. The gap analysis is significant because it compares the Baseline Big Data Architecture with the target architectures that result once the Data Architects have completed the distillation of the main Big Data Architecture.

Big Data Architectural Transformation Strategy

Let us take two examples from the contemporary enterprise landscape: a Healthcare concern first and, later, an Automotive Service provider. In this example, our Healthcare concern has chosen to undertake a complete, 360-degree architectural transformation to meet all its Big Data utilization and processing needs. It is important to recognize and address data management issues, since traditional data management has been done with RDBMS solutions, and all data management tasks, such as data ingestion and egress, have been handled by traditional SQL-oriented applications. A regulated and inclusive methodology, driven by compliance with the healthcare laws applicable to data management, has been in place for quite some time. It has enabled the organization to use its data effectively and to capitalize on its competitive advantages in both healthcare and other associated services.

The considerations and strategy for the move to a Big Data Architecture include the following points.

  1. It is vital to characterize clearly which application components in the broader organizational landscape will serve as the system of data storage that applications reference as Organizational Master Data Management.
  2. Establish an organization-wide norm that all Big Data system components dealing with data ingestion, or with egress after transformation, including software packages (Hive, Pig, Impala, etc.), adopt prescriptive data models, so that the roadmap stays open to unforeseen needs arising from future organizational expansion.
  3. Clearly recognize the diverse needs of current organizational components in terms of how data entities are used by business functions, methods, and services. An example is how Clinic Applications relate to the pharmaceutical and other medicinal concerns of all related stakeholders of our healthcare concern.
  4. Clearly understand the ETL (Extract, Transform & Load) processes, making sure that CRUD (Create, Read, Update & Delete) is taken into consideration in terms of Big Data technologies such as Hadoop/MapReduce.
  5. Determine the level and complexity of the data transformations required to support message passing between applications within the new Big Data system, since the analytical needs will also introduce tools such as Tableau or Jaspersoft, or the organization can opt to use R, the statistical programming language.
  6. Since the Big Data system will require software to support data integration using the Cloud to serve organizational customers (patients) and other stakeholders (medical equipment suppliers, pharmaceutical organizations), consider the use of ETL tools, which can be developed in Java if using open source, in C# in the case of Microsoft, or in Scala; a NoSQL store such as HBase, Cassandra or CouchDB for data migration; and data profiling tools such as Hive, Pig or Impala to evaluate data quality.
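
To make points 1, 2 and 4 more concrete, here is a minimal sketch in Python of how a prescriptive data model, a master data store with CRUD operations, and an ETL pass might fit together. Everything in it is an assumption for illustration only: the `PatientRecord` fields, the `MasterDataStore` class, and the `extract`, `transform` and `load` functions are hypothetical names, and the in-memory dictionary merely stands in for whatever system of record (HBase, Cassandra, Hive tables, and so on) the organization actually selects.

```python
from dataclasses import dataclass, replace
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class PatientRecord:
    """Hypothetical prescriptive model for one master-data entity."""
    patient_id: str
    name: str
    clinic: str


class MasterDataStore:
    """In-memory stand-in (an assumption) for the system of record."""

    def __init__(self) -> None:
        self._records: Dict[str, PatientRecord] = {}

    def create(self, record: PatientRecord) -> None:
        if record.patient_id in self._records:
            raise KeyError(f"duplicate id: {record.patient_id}")
        self._records[record.patient_id] = record

    def read(self, patient_id: str) -> Optional[PatientRecord]:
        return self._records.get(patient_id)

    def update(self, patient_id: str, **changes: str) -> PatientRecord:
        # Records are immutable, so an update stores a modified copy.
        updated = replace(self._records[patient_id], **changes)
        self._records[patient_id] = updated
        return updated

    def delete(self, patient_id: str) -> None:
        del self._records[patient_id]


def extract(rows: Iterable[dict]) -> List[dict]:
    """Extract: keep only rows that carry every required key."""
    required = {"patient_id", "name", "clinic"}
    return [row for row in rows if required.issubset(row.keys())]


def transform(rows: Iterable[dict]) -> List[PatientRecord]:
    """Transform: normalize whitespace and casing into the model."""
    return [
        PatientRecord(
            patient_id=str(row["patient_id"]).strip(),
            name=str(row["name"]).strip().title(),
            clinic=str(row["clinic"]).strip().upper(),
        )
        for row in rows
    ]


def load(store: MasterDataStore, records: Iterable[PatientRecord]) -> int:
    """Load: create new records, update existing ones; return the count."""
    count = 0
    for record in records:
        if store.read(record.patient_id) is None:
            store.create(record)
        else:
            store.update(record.patient_id, name=record.name, clinic=record.clinic)
        count += 1
    return count
```

A caller would chain the three steps as `load(store, transform(extract(raw_rows)))`. Rows missing a required field are dropped at the extract step, and a record whose identifier already exists is updated rather than duplicated. A production design would replace the dictionary with the chosen storage layer and add the audit and compliance controls that healthcare data requires.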

To be continued...

Shehzad Manzoor

RFID & IoT Innovator | Secure Software Solutions Leader

10 years

Very precise and useful article, Good job Atif!

