Edition 4a: Data Management - Data Lifecycle, Types of Data, Data Strategy and Data Management Frameworks
Sitaram Choudary Yarlagadda
Data Technology Architect and Engineer, capable of using the People, Process, and Technology framework and DAMA-DMBOK concepts to build and manage mission-critical enterprise data platforms.
Data Lifecycle
To manage data assets effectively, organizations need to understand and plan for the data lifecycle. Well-managed data is managed strategically, with a clear view of how the organization intends to use it. A strategic organization defines not only its data content requirements but also its data management requirements. These include policies and expectations for use, quality, regulatory compliance, and security; an enterprise approach to architecture and design; and a sustainable approach to both infrastructure and software development.
The data lifecycle covers the processes that create or obtain data, as well as those that move, transform, store, maintain, share, use, and dispose of it. Over its lifetime, data may be cleansed, transformed, merged, enhanced, or aggregated. Using or enhancing data often creates new data, so the lifecycle contains internal iterations that are not shown in the diagram. Data is rarely static. Managing data involves a set of interconnected processes aligned with the data lifecycle.
Data Lifecycle Management
· Creation and Usage: Managing data well requires understanding how it is produced or obtained and how it is used. Producing data costs money. Data is valuable only when it is consumed or applied.
· Data Quality Management: Data quality management is central to data management. Low-quality data represents cost and risk rather than value. Organizations often struggle to manage data quality because data is typically created as a by-product of operational processes, and they often do not set explicit quality standards. Because many lifecycle events can affect data quality, quality must be planned for as part of the data lifecycle.
· Metadata Quality: Because Metadata is a form of data, and organizations rely on it to manage other data, Metadata quality must be managed in the same way as the quality of other data.
· Data Security: Data management includes protecting data and mitigating the risks associated with it. Sensitive data must be protected throughout its lifecycle, from creation to disposal.
· Critical Data: Organizations produce a great deal of data, and much of it is never used. Managing every piece of data is not feasible. Lifecycle management means focusing on an organization's most critical data and minimizing data ROT (Redundant, Obsolete, Trivial data).
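The idea of minimizing data ROT while protecting critical data can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `DatasetRecord` fields, the checksum-based duplicate test, the 365-day threshold, and the "no consumers means trivial" rule are all hypothetical choices, not definitions taken from DAMA-DMBOK.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass(frozen=True)
class DatasetRecord:
    """Hypothetical inventory entry describing one dataset."""
    name: str
    checksum: str         # content fingerprint, used to spot duplicates
    last_accessed: date
    consumer_count: int   # number of known downstream users
    is_critical: bool     # marked by the business as critical data


def flag_rot(records, today, obsolete_after_days=365):
    """Return {dataset name: label} for datasets that look like ROT.

    Assumed rules (illustrative only):
      redundant: same checksum as an earlier record
      obsolete:  not accessed within the time window
      trivial:   no known consumers
    Critical datasets are never flagged.
    """
    cutoff = today - timedelta(days=obsolete_after_days)
    seen_checksums = set()
    flags = {}
    for record in records:
        is_duplicate = record.checksum in seen_checksums
        seen_checksums.add(record.checksum)
        if record.is_critical:
            continue
        if is_duplicate:
            flags[record.name] = "redundant"
        elif record.last_accessed < cutoff:
            flags[record.name] = "obsolete"
        elif record.consumer_count == 0:
            flags[record.name] = "trivial"
    return flags


inventory = [
    DatasetRecord("customers", "abc", date(2024, 5, 1), 4, True),
    DatasetRecord("customers_copy", "abc", date(2024, 5, 2), 1, False),
    DatasetRecord("legacy_orders", "def", date(2021, 1, 1), 0, False),
    DatasetRecord("scratch_export", "ghi", date(2024, 4, 1), 0, False),
    DatasetRecord("sales", "jkl", date(2024, 4, 20), 3, False),
]
flags = flag_rot(inventory, today=date(2024, 6, 1))
for name, label in flags.items():
    print(f"{name}: {label}")
```

In practice the thresholds and the definition of "critical" would come from the organization's own data governance policies rather than from hard-coded values.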
Different Types of Data
Managing data is made more complicated by the fact that there are different types of data, each with its own lifecycle management requirements. Data can be classified by type (transactional data, reference data, master data, or metadata), by content (data domains or subject areas), by format, or by the level of protection it requires. It can also be classified by how and where it is stored and accessed.
Because different types of data carry different requirements, risks, and roles within an organization, many data management tools focus on classification and control.
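The classification dimensions above (type, content, format, protection level, storage) could be captured in a small catalog record, with controls derived from the classification. The sketch below is a hypothetical illustration: the `Sensitivity` levels, the field names, and the mapping from classification to controls are assumptions made for the example, not a standard scheme.

```python
from dataclasses import dataclass
from enum import Enum


class DataKind(Enum):
    TRANSACTIONAL = "transactional"
    REFERENCE = "reference"
    MASTER = "master"
    METADATA = "metadata"


class Sensitivity(Enum):
    # Assumed four-level scale; real schemes vary by organization.
    PUBLIC = 1
    INTERNAL = 2
    CONFIDENTIAL = 3
    RESTRICTED = 4


@dataclass(frozen=True)
class DataClassification:
    """Hypothetical catalog entry, one field per classification dimension."""
    kind: DataKind            # type of data
    domain: str               # content: data domain or subject area
    data_format: str          # for example "relational", "parquet", "json"
    sensitivity: Sensitivity  # level of protection required
    storage: str              # where the data is stored and accessed


def required_controls(classification):
    """Map a classification to an illustrative list of controls."""
    controls = ["catalog entry"]
    level = classification.sensitivity.value
    if level >= Sensitivity.INTERNAL.value:
        controls.append("access control")
    if level >= Sensitivity.CONFIDENTIAL.value:
        controls.append("encryption at rest")
    if level >= Sensitivity.RESTRICTED.value:
        controls.append("access audit log")
    if classification.kind is DataKind.MASTER:
        controls.append("named data steward")
    return controls


customer_master = DataClassification(
    kind=DataKind.MASTER,
    domain="customer",
    data_format="relational",
    sensitivity=Sensitivity.CONFIDENTIAL,
    storage="data warehouse",
)
controls = required_controls(customer_master)
print(controls)
```

Keeping the classification separate from the control rules means the rules can change with policy while the catalog entries stay stable.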
Risk
Data not only has value; it also carries risk. Low-quality data (inaccurate, incomplete, or out of date) represents risk because the information it provides is wrong. Data is also risky because it can be misunderstood and misused.
Organizations get the most value from the highest-quality data: data that is available, relevant, complete, accurate, consistent, timely, usable, meaningful, and understood. Yet for many important decisions there are information gaps, the difference between what we know and what we need to know to make an effective decision. Information gaps are enterprise liabilities with potentially serious effects on operational effectiveness and profitability. Organizations that recognize the value of high-quality data can take concrete, proactive steps to improve the quality and usability of data and information within regulatory and ethical frameworks.
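Two of the quality attributes listed above, completeness and timeliness, lend themselves to simple measurement. The following is a minimal sketch, assuming records are plain dictionaries; the field names, the 30-day freshness window, and the convention that an empty dataset scores 1.0 are assumptions chosen for illustration.

```python
from datetime import date


def completeness(rows, required_fields):
    """Share of required field values that are present (not None or empty)."""
    total = len(rows) * len(required_fields)
    if total == 0:
        return 1.0  # assumed convention: nothing required, nothing missing
    filled = sum(
        1
        for row in rows
        for field in required_fields
        if row.get(field) not in (None, "")
    )
    return filled / total


def timeliness(rows, date_field, today, max_age_days):
    """Share of rows whose date_field is within max_age_days of today."""
    if not rows:
        return 1.0  # assumed convention for an empty dataset
    fresh = sum(
        1
        for row in rows
        if row.get(date_field) is not None
        and (today - row[date_field]).days <= max_age_days
    )
    return fresh / len(rows)


customers = [
    {"id": 1, "email": "a@example.com", "updated": date(2024, 5, 20)},
    {"id": 2, "email": "", "updated": date(2024, 1, 10)},
    {"id": 3, "email": "c@example.com", "updated": None},
    {"id": 4, "email": None, "updated": date(2024, 5, 30)},
]
completeness_score = completeness(customers, ["id", "email"])
timeliness_score = timeliness(customers, "updated", date(2024, 6, 1), max_age_days=30)
print(f"completeness={completeness_score:.2f} timeliness={timeliness_score:.2f}")
```

Scores like these only become useful when compared against explicit quality standards, which is why quality planning belongs in the lifecycle rather than after it.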
The growing role of information as an organizational asset across industries has led regulators and legislators to pay closer attention to its potential uses and abuses.
Data Strategy
A data strategy should include business plans for using information to gain competitive advantage and support the organization's goals. It must come from an understanding of the data needs inherent in the business strategy: what data the organization needs, how it will obtain that data, how it will manage the data and ensure its reliability over time, and how it will use it.
Components
· A compelling vision for data management.
· A summary business case for data management, with selected examples.
· Guiding principles, values, and management perspectives.
· The mission and long-term directional goals of data management.
· Proposed measures of data management success.
· Short-term (12-24 months) Data Management program objectives that are SMART: specific, measurable, actionable, realistic, and time-bound.
· Descriptions of data management roles and organizations, with a summary of their responsibilities and decision rights.
· Descriptions of Data Management program components and initiatives.
· A prioritized program of work with scope boundaries.
· A draft implementation roadmap with projects and action items.
Deliverables
· Data Management Charter: The overall vision, business case, goals, guiding principles, measures of success, critical success factors, recognized risks, and operating model.
· Data Management Scope Statement: Goals and objectives for a planning horizon (usually three years), together with the roles, organizations, and individual leaders accountable for achieving them.
· Data Management Implementation Roadmap: Specific programs, projects, task assignments, and delivery milestones.
Data Management Frameworks
Data management involves a set of interdependent functions, each with its own goals, activities, and responsibilities. Data management professionals need to account for the challenges inherent in deriving value from an abstract enterprise asset, while balancing strategic and operational goals, specific business and technical requirements, risk and compliance demands, and conflicting understandings of what the data represents and whether it is of high quality.
Frameworks developed at different levels of abstraction offer a range of perspectives on how to approach data management. These perspectives provide insight that can be used to clarify strategy, develop roadmaps, organize teams, and align functions.
Strategic Alignment Model
The Strategic Alignment Model (SAM), developed by Henderson and Venkatraman (1999), is a conceptual framework that identifies the fundamental drivers of any approach to data management.
At its center is the relationship between data and information. Information is most often associated with business strategy and the operational use of data. Data is associated with information technology and the processes that support the physical management of the systems that make data accessible for use. Surrounding this relationship are four fundamental domains of strategic choice: business strategy, information technology strategy, organizational infrastructure and processes, and information technology infrastructure and processes.
Amsterdam Information Model
The Amsterdam Information Model (AIM), like the Strategic Alignment Model, takes a strategic perspective on business and IT alignment (Abcouwer, Maes, and Truijens, 1997). Known as the 9-cell model, it recognizes a middle layer that focuses on structure and tactics, including planning and architecture. It also recognizes the need for information communication.
The creators of both the SAM and AIM frameworks describe in detail the relationships between the components, considering both the horizontal dimension (business and IT strategy) and the vertical dimension (business strategy and business operations).
References
Acceldata. (2022, September 7). How to Architect a Data Platform. Retrieved from acceldata.io : https://www.acceldata.io/article/what-is-a-data-platform-architecture
Amazon Web Services. (n.d.). AWS Well Architected Framework. Retrieved from aws.amazon.com : https://aws.amazon.com/architecture/well-architected/?wa-lens-whitepapers.sort-by=item.additionalFields.sortDate&wa-lens-whitepapers.sort-order=desc&wa-guidance-whitepapers.sort-by=item.additionalFields.sortDate&wa-guidance-whitepapers.sort-order=desc
Amazon Web Services. (n.d.). What is AWS? Retrieved from aws.amazon.com : https://aws.amazon.com/what-is-aws/?nc1=f_cc
DAMA International. (2024). DAMA-DMBOK: Data Management Body of Knowledge: 2nd Edition, Revised. Los Angeles: Technics Publications.
Wikipedia. (n.d.). Data Management Association. Retrieved from en.wikipedia.org : https://en.wikipedia.org/wiki/Data_Management_Association
Groover, M. (2021). Speed of Advance. Lion Crest Publications.
Hiltbrand, T. (2024, May 9). From Data-Driven to Data-Centric: The Next Evolution in Business Strategy. Retrieved from tdwi.org : https://tdwi.org/Articles/2024/05/09/PPM-ALL-From-Data-Driven-to-Data-Centric-Next-Evolution-in-Business-Strategy.aspx
Intrepid Tech Ventures. (n.d.). Understand your data asset. Retrieved from theintrepidventures.com : https://theintrepidventures.com/value-proposition/understand-your-data-asset/
Khan, S. M. (2024, May 9). The data product lifecycle: Getting the most out of your data investments. Retrieved from starburst.io : https://www.starburst.io/blog/data-product-lifecycle/
Roberts, S. (2023, April 18). Understand the four Vs of Big Data. Retrieved from theknowledgeacademy.com : https://www.theknowledgeacademy.com/blog/4-vs-of-big-data/
Rowshankish, R. L. (2023, July 31). The evolution of the data-driven enterprise. Retrieved from mckinsey.com : https://www.mckinsey.com/capabilities/mckinsey-digital/our-insights/tech-forward/the-evolution-of-the-data-driven-enterprise
Simon, B. (2021, July 21). Complete Guide to PPT Framework | Smartsheet. Retrieved from smartsheet.com : https://www.smartsheet.com/content/people-process-technology#:~:text=for%20IT%20%26%20Ops-,What%20Is%20the%20People%2C%20Process%2C%20Technology%20Framework%3F,maintain%20good%20relationships%20among%20them .
Tharran, A. S. (2023, October 22). The Evolution of Data Science: Past, Present, and Future. Retrieved from linkedin.com : https://www.dhirubhai.net/pulse/evolution-data-science-past-present-future-aditya-singh-tharran-bmmre/
Wikipedia. (n.d.). Agile Software Development. Retrieved from en.wikipedia.org : https://en.wikipedia.org/wiki/Agile_software_development
Wikipedia. (n.d.). Kanban (development). Retrieved from en.wikipedia.org : https://en.wikipedia.org/wiki/Kanban_(development)
Wikipedia. (n.d.). Scrum (software development). Retrieved from en.wikipedia.org : https://en.wikipedia.org/wiki/Scrum_(software_development)
Wikipedia. (n.d.). Scrumban. Retrieved from en.wikipedia.org : https://en.wikipedia.org/wiki/Scrumban