Edition 2: Introduction
Sitaram Choudary Yarlagadda
Data Technology Architect and Engineer, capable of applying the People, Process, and Technology framework and the DAMA-DMBOK concepts to create and manage mission-critical enterprise data platforms.
Overview
In the ever-changing world of technology, data has become a powerful force that has radically changed the way we collect, analyze, and understand information. To build a Data-Driven firm, corporate leaders need not only a thorough understanding of the changes required, but also a clear grasp of the tactics that will drive those changes forward.
Traditional definitions of data highlight its function in reflecting objective information about the world. In the context of information technology, data refers to information that has been kept in a digital format. However, it is important to note that data is not exclusively restricted to digitized material, since data management concepts also apply to information gathered on paper as well as in databases.
Evolution of Data
Data analysis has undergone a remarkable transformation, progressing from modest origins to its present prominence as a crucial discipline in both the corporate and academic realms. Its modern origins may be traced back to the 1960s and 1970s, when the fields of computer science and statistics started to converge. During this period, statisticians were among the early adopters who saw the immense potential of computers for data analysis. The development of tools and software for data processing and analysis was a pivotal milestone. Today, data is an omnipresent and influential factor in many sectors, guiding decisions and fostering innovation. The widespread availability of large amounts of data, often referred to as "big data," has required the creation of advanced data analysis tools and processes. Accurate data is now crucial for making well-informed decisions. Businesses use it to optimize operations, improve customer experiences, and identify new opportunities for growth.
Machine learning models, driven by artificial intelligence, are now widespread. They enable predictive analytics, recommendation systems, and task automation, thereby transforming companies and services. The integration of artificial intelligence with data is expected to become more seamless. AutoML is expected to let people without specialist expertise use advanced analytics. Combining data with augmented reality is likely to enhance user experiences in areas such as gaming, entertainment, education, and training.
The influence of AI may be seen in domains like banking, healthcare, marketing, and other fields. The ongoing evolution of data presents tremendous potential for the future. Data professionals will have the capability to tackle complex problems at unprecedented speeds.
Data-Driven Organization
In recent times, companies, both domestic and international, have developed a high level of expertise in collecting, organizing, and using data to inform their strategic decision-making. Many companies use a combination of descriptive, diagnostic, predictive, and prescriptive analytics to help decision-makers steer their operations in an increasingly digitalized world. Forward-looking enterprises have begun transforming themselves around a Data-Driven strategy. What does this entail, and how can you position yourself as a leader in identifying and executing this shift? A Data-Driven strategy rests on three main pillars: Data As A Product, Data As An Asset, and Data As A Platform.
Data As A Product
A data product is a fusion of three elements: a profound understanding of data, expertise in a certain field, and the application of product management principles.
Once a mature data infrastructure has been established to collect, organize, and surface vital information for digital decision-makers inside the organization, the next step is to determine whether any of that data can be monetized. Data monetization has traditionally been the purview of a relatively small group of data brokers. These brokers have concentrated on data points that companies often request but find difficult to obtain, and that businesses need to fill gaps in their data analytics efforts. As firms shift towards a business model that prioritizes data, entities other than data brokers are now considering whether they have the resources to offer their data as a product. Data monetization will have a greater influence on organizational strategy and will place a heavier burden on data teams to verify data quality and provide secure mechanisms for accessing the information.
Data As An Asset
Under generally accepted accounting principles (GAAP), accounting for data on the company's books has proven difficult. The intrinsic significance of data is highlighted by the gap between a company's book value and its current market value. It is difficult to quantify the worth of your data and to determine whether your firm can transform that data into something that benefits the company.
To identify the worth of their data as a valuable resource, businesses are now expected to reevaluate this challenge and apply creative techniques to it. Accountability for data as an asset varies with the data's stage of refinement. In the early stages, the value of raw data is not yet known, but it has the potential to provide value in the future. The value of the resource increases as the raw data passes through a series of phases that include cleansing, integration, transformation, and enrichment. This is comparable to the steps that turn raw materials into finished goods in a manufacturing plant. In the same way that a manufacturing company adds value to physical raw materials, Data-Driven businesses should be able to attribute value to their data at every degree of refinement.
During data monetization, companies must evaluate whether the data, as it moves through these stages on its way to becoming a finished product, merely adds value to what the company already brings to market, or whether there is also an opportunity to price and sell the data itself to third parties.
To understand data as an asset and determine its worth, it is necessary to concentrate on four crucial areas:
Data Capture: To drive profitability, the business needs to understand which data is relevant and how it is gathered.
Data Store: After the data has been collected, it is essential to have a clear understanding of where the data is kept, how long it is retained, and who has access to it.
Data Users: Identify the users who are responsible for collecting data and using it for business analytics, as well as the roles and responsibilities they hold within the organization.
Data Usage: For businesses to fully appreciate the potential of their data, it is essential to understand who uses it, how they use it, and how it is incorporated into their decision-making processes.
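The four areas above can be sketched as a simple inventory record. This is only an illustrative sketch: the class name `DataAssetRecord`, its fields, and the sample values are hypothetical and do not come from DAMA-DMBOK or any specific tool.

```python
from dataclasses import dataclass, field


@dataclass
class DataAssetRecord:
    """Hypothetical inventory entry covering capture, store, users, and usage."""
    name: str
    capture_source: str      # Data Capture: where and how the data is gathered
    storage_location: str    # Data Store: where the data is kept
    retention_days: int      # Data Store: how long the data is retained
    allowed_roles: list[str] = field(default_factory=list)      # Data Users
    used_in_decisions: list[str] = field(default_factory=list)  # Data Usage

    def gaps(self) -> list[str]:
        """Return the areas that are still undocumented for this asset."""
        missing = []
        if not self.capture_source:
            missing.append("capture")
        if not self.storage_location or self.retention_days <= 0:
            missing.append("store")
        if not self.allowed_roles:
            missing.append("users")
        if not self.used_in_decisions:
            missing.append("usage")
        return missing


# Illustrative record: usage has not been documented yet.
orders = DataAssetRecord(
    name="customer_orders",
    capture_source="web checkout form",
    storage_location="orders database",
    retention_days=365,
    allowed_roles=["sales_analyst"],
)
print(orders.gaps())  # ['usage']
```

A record like this makes it visible which of the four areas still needs attention before the asset's worth can be assessed.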
Data As A Platform
Enterprises do not directly sell their data, but rather offer the capabilities that come with models, such as large language models, which have been developed, trained, and improved using their data. The data empowers the platform, and these firms offer access to the functionality built on these platforms. Large language models (LLMs) are versatile and perform well across a diverse range of queries. Companies are using transfer learning to build on these foundation LLMs and develop industry-specific models that excel within a particular domain. Organizations that prioritize data are exploring ways to develop platforms that use their exclusive data to extend these foundation models and provide business enablement processes to their customers.
One option they may explore is enhancing the foundation models with context-specific insights and data points, as well as implementing regulatory and security measures. Another option is to develop customized interfaces on top of their data-powered platforms to stand out from competitors.
A data platform architecture diagram visually represents the components and service areas involved in efficient data management. The main layers are:
Data Ingestion Layer: The data ingestion layer establishes the connection between the platform and the source systems that generate the raw data.
Data Storage Layer: The primary function of this layer is to store the data in order to facilitate its processing and analysis.
Data Processing Layer: The processing layer is responsible for cleansing and manipulating the data according to the specific requirements of the company.
Data Interface Layer: After the data is processed, it is sent to user interface apps where business executives may use graphs and charts to examine it and extract valuable insights that can inform decision-making in the firm.
Data Pipeline Layer: The data pipeline layer is the ultimate component of a data architecture plan, serving as the foundation for the whole process. The data pipeline layer is responsible for ensuring a continuous and uninterrupted flow of data across all tiers.
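As a rough illustration of how these layers fit together, the sketch below models each layer as one small function using only the Python standard library. The sample data and the function names (`ingest`, `store`, `process`, `present`, `run_pipeline`) are assumptions made for this example; a production platform would use managed services for each layer rather than in-memory lists.

```python
import csv
import io
from collections import defaultdict

# Hypothetical raw export from a source system (one row has a missing amount).
RAW_CSV = """region,amount
north,120
south,
north,80
south,50
"""


def ingest(source_text):
    """Ingestion layer: connect to a source system and read its raw records."""
    return list(csv.DictReader(io.StringIO(source_text)))


def store(records, storage):
    """Storage layer: keep the raw records so they can be processed later."""
    storage.extend(records)
    return storage


def process(storage):
    """Processing layer: cleanse (skip rows with no amount) and aggregate."""
    totals = defaultdict(int)
    for row in storage:
        if row["amount"]:
            totals[row["region"]] += int(row["amount"])
    return dict(totals)


def present(totals):
    """Interface layer: format the results for a report or dashboard."""
    return [f"{region}: {total}" for region, total in sorted(totals.items())]


def run_pipeline(source_text):
    """Pipeline layer: move the data through every tier in order."""
    storage = []
    store(ingest(source_text), storage)
    return present(process(storage))


print(run_pipeline(RAW_CSV))  # ['north: 200', 'south: 50']
```

Keeping each layer behind its own function mirrors the separation of concerns in the architecture: any single layer can be swapped out without changing the others.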
DAMA
The Data Management Association (DAMA), formerly referred to as the Data Administration Management Association, is an international non-profit organization that seeks to promote and enhance knowledge and methodologies related to information management and data management. The group defines itself as vendor-neutral and entirely run by volunteers. Its membership includes both technical and business experts. DAMA International, commonly known as DAMA-I, is the international division of DAMA. Additionally, DAMA has other branches at the continental and national levels worldwide.
DMBOK
DAMA has released the Data Management Body of Knowledge (DMBOK), which provides recommended practices and a standardized terminology for managing corporate data. It covers subjects such as data architecture, security, quality, modeling, governance, big data, data science, and other related areas.
To create a data platform that can support a variety of use cases and business requirements and help a company become a Data-Driven organization, this paper will adhere to the DAMA-DMBOK guidelines while using Amazon Web Services (AWS) and other pertinent technologies.
Data Characteristics
Data characteristics may be categorized using five primary dimensions, often referred to as the Five Vs: Volume, Velocity, Variety, Veracity, and Value.
Volume
The primary attribute of data is its volume. Every day, an enormous amount of data is generated globally from text, photographs, videos, and apps, and this figure is expected to keep growing, particularly due to the increasing usage of mobile phones. As data volumes grow rapidly, innovative database management solutions and skilled IT personnel will be needed to manage them effectively.
Velocity
Velocity, sometimes known as speed, is the second defining attribute of data. It describes the unprecedented pace at which data is produced and analyzed. If you send an SMS or publish anything on social media platforms like Facebook, Twitter, or Instagram, the data is processed almost immediately. Processing and presenting information used to be time-consuming, but with the introduction of new technologies, it now takes next to no time.
Variety
The third attribute, Variety, is closely related to the first two. Data is characterized not only by its vast volume and rapid processing, but also by its diverse forms: Structured, Semi-Structured, and Unstructured. Organizations and people produce and handle data based on their particular requirements, resulting in a diverse range of data across the planet.
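The three forms can be illustrated with a minimal sketch using Python's standard library. The sample customer data is invented purely for illustration.

```python
import csv
import io
import json

# Structured: fixed columns, as in a relational table or a CSV export.
structured = list(csv.DictReader(io.StringIO("customer_id,city\n17,Austin\n")))

# Semi-structured: self-describing fields that may vary by record, as in JSON.
semi_structured = json.loads(
    '{"customer_id": 17, "tags": ["vip"], "address": {"city": "Austin"}}'
)

# Unstructured: free text with no predefined schema.
unstructured = "Customer 17 called from Austin to ask about a late delivery."

print(structured[0]["city"])               # Austin
print(semi_structured["address"]["city"])  # Austin
print("Austin" in unstructured)            # True
```

The same fact (the customer's city) is retrievable by column name, by nested key, or only by searching the text, which is why each form tends to need different storage and processing techniques.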
Veracity
Veracity is the quality of being true or accurate. Data veracity pertains to the reliability of the data and encompasses its accuracy and quality. Information rapidly becomes obsolete, and because of its proliferation, it is challenging to ascertain the veracity of what one encounters. That is why many senior executives are hesitant to base their decisions on data alone. This also motivates data scientists and IT professionals to organize and analyze the appropriate data in order to use it accurately. The level of veracity directly affects the value of the data, as it determines the extent to which the data can be analyzed and transformed into useful information.
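One simple, partial proxy for veracity is completeness: the share of records in which every required field is actually filled in. The sketch below is only an illustration; the function name `completeness_ratio` and the sample rows are hypothetical, and real data quality checks would also cover accuracy, timeliness, and consistency.

```python
def completeness_ratio(rows, required_fields):
    """Return the share of rows where every required field is present and non-empty."""
    if not rows:
        return 0.0
    complete = sum(
        all(row.get(name) not in (None, "") for name in required_fields)
        for row in rows
    )
    return complete / len(rows)


# Illustrative records: two of the four have a usable email address.
rows = [
    {"id": 1, "email": "a@example.com"},
    {"id": 2, "email": ""},
    {"id": 3, "email": "c@example.com"},
    {"id": 4},
]
print(completeness_ratio(rows, ["id", "email"]))  # 0.5
```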
Value
The fifth crucial aspect of data is its intrinsic Value or importance. When data is appropriately organized and processed, it may be transformed into useful information. In the contemporary Data-Driven environment, organizations that lack a data strategy are likely to lag behind their peers. Organizations that make use of their data tend to be more profitable, as data yields crucial insights and consumer context. Contextualized data offers valuable information about client behavior, enabling businesses to optimize their operations and enhance service delivery.
Organizations have always had to manage their data, but advances in technology have broadened the scope of this need and changed people's understanding of what data is. These changes have enabled firms to use data in new ways to create products, share information, build knowledge, and improve organizational success. However, the rapid advance of technology, and the resulting growth in our ability to generate, collect, and analyze data for meaning, has heightened the need to manage data efficiently.
Data and Information
Data has been called the "raw material of information," and information has been called "data in context."
Frequently, a hierarchical pyramid is used to illustrate the relationship among data (at the foundation), information, knowledge, and wisdom (at the pinnacle). Although the pyramid helps explain why data needs to be managed well, it also poses several challenges for data management.
The pyramid is predicated on the premise that data simply exists. However, data does not simply exist; it has to be created. The description of a linear progression from data to wisdom overlooks the fact that knowledge is required to create data in the first place. The pyramid also implies that data and information are distinct entities, yet in practice the two notions are intertwined and dependent on one another: data is a form of information, and information is a form of data.
Within an organization, it might be beneficial to establish a distinction between information and data in order to facilitate clear communication on the specific needs and expectations of various stakeholders.
Data As An Organizational Asset
An asset is an economic resource that can be owned or controlled and that holds or produces value. Assets can be converted into money. Data is widely acknowledged as a valuable resource for businesses, though the practice of managing data as an asset is still developing. In the early 1990s, some organizations questioned whether goodwill could be valued. Today, profit and loss statements typically contain "goodwill" as a line item. Similarly, while not yet universally embraced, the practice of monetizing data is becoming more prevalent.
To remain competitive, businesses must stop relying on intuition or instinct when making decisions, and instead use event triggers and analytics to gain actionable insights. Being Data-Driven involves acknowledging the need to manage data properly and professionally, through a partnership between business executives and technical experts. Moreover, in today's fast-paced corporate environment, change has become a necessity rather than a choice, with digital disruption becoming the norm. To respond to this, businesses must collaborate with technical data specialists and work together with their line-of-business colleagues to produce information solutions.
References
Acceldata. (2022, September 7). How to Architect a Data Platform. Retrieved from acceldata.io : https://www.acceldata.io/article/what-is-a-data-platform-architecture
Amazon Web Services. (n.d.). AWS Well Architected Framework. Retrieved from aws.amazon.com : https://aws.amazon.com/architecture/well-architected/?wa-lens-whitepapers.sort-by=item.additionalFields.sortDate&wa-lens-whitepapers.sort-order=desc&wa-guidance-whitepapers.sort-by=item.additionalFields.sortDate&wa-guidance-whitepapers.sort-order=desc
Amazon Web Services. (n.d.). What is AWS? Retrieved from aws.amazon.com : https://aws.amazon.com/what-is-aws/?nc1=f_cc
DAMA International. (2024). DAMA-DMBOK: Data Management Body of Knowledge: 2nd Edition, Revised. Los Angeles: Technics Publications.
en.wikipedia.org . (n.d.). Data Management Association. Retrieved from en.wikipedia.org : https://en.wikipedia.org/wiki/Data_Management_Association
Groover, M. (2021). Speed of Advance. Lion Crest Publications.
Hiltbrand, T. (2024, May 9). From Data-Driven to Data-Centric: The Next Evolution in Business Strategy. Retrieved from tdwi.org : https://tdwi.org/Articles/2024/05/09/PPM-ALL-From-Data-Driven-to-Data-Centric-Next-Evolution-in-Business-Strategy.aspx
Intrepid Tech Ventures. (n.d.). Understand your data asset. Retrieved from theintrepidventures.com : https://theintrepidventures.com/value-proposition/understand-your-data-asset/
Khan, S. M. (2024, May 9). The data product lifecycle: Getting the most out of your data investments. Retrieved from starburst.io : https://www.starburst.io/blog/data-product-lifecycle/
Roberts, S. (2023, April 18). Understand the four Vs of Big Data. Retrieved from theknowledgeacademy.com : https://www.theknowledgeacademy.com/blog/4-vs-of-big-data/
Rowshankish, R. L. (2023, July 31). The evolution of the data-driven enterprise. Retrieved from mckinsey.com : https://www.mckinsey.com/capabilities/mckinsey-digital/our-insights/tech-forward/the-evolution-of-the-data-driven-enterprise
Simon, B. (2021, July 21). Complete Guide to PPT Framework | Smartsheet. Retrieved from smartsheet.com : https://www.smartsheet.com/content/people-process-technology#:~:text=for%20IT%20%26%20Ops-,What%20Is%20the%20People%2C%20Process%2C%20Technology%20Framework%3F,maintain%20good%20relationships%20among%20them .
Tharran, A. S. (2023, October 22). The Evolution of Data Science: Past, Present, and Future. Retrieved from linkedin.com : https://www.dhirubhai.net/pulse/evolution-data-science-past-present-future-aditya-singh-tharran-bmmre/