Building Big Data Center of Excellence with IBM Cloud and Hadoop
Image taken from https://www.ibmbigdatahub.com/blog/building-big-data-center-excellence

I often ask customers what inhibits big data initiatives in their organization. Frequent answers include: no compelling business need, or difficulty identifying use cases; lack of data science skills; not enough staff to support them; and the complexity of collecting and managing the data. The concept of a center of excellence (CoE) for big data, which I attempt to demystify here, helps ensure these responses are not inhibitors in any organization.

The key to a data-driven business is bringing data and insight to every workflow in the business and integrating them into decision making at every step. This approach enables organizations to take advantage of the longitudinal analytics available with new technology advances such as Hadoop and Spark as well as machine learning for past-, present- and future-looking analytics simultaneously.

Defining big data centers of excellence

A big data CoE is a framework that takes an organization from zero knowledge to having a fully functional practice of Hadoop, Spark and emerging open source technologies to deliver robust business results. A CoE is where organizations identify new technologies, learn new skills and develop appropriate processes that are then deployed into the business to accelerate adoption.

A centralized big data CoE can be the bedrock for establishing a data-driven company that treats data as a strategic asset. The big data CoE can partner with the business to identify data that is invaluable, explore use cases that differentiate its products and services in the market and help jump-start the business with insights that can yield real-time client value. Data’s strategic importance is the value it represents for the business, but success with big data is not just about data. The people and the organization also play a vital role in that success:

A) Building big data success stories with use cases

In many cases, the business comes up with the use cases, but the CoE has the responsibility of facilitating this work. The CoE needs to assume a leadership role in understanding which applications and use cases can be driven with the available data sources. Sometimes business units are more proactive and bring use cases to the CoE themselves. The resulting list can be overwhelming and put a strain on available resources, so a transparent process for prioritizing use cases is important and should be adopted. The CoE needs to prioritize use cases based on parameters such as ease of data availability, data quality, revenue-based business value and impact, costs and risks.
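One way to make such a prioritization process transparent is a simple weighted scoring sheet. The sketch below is only an illustration: the weights, the 1-to-5 rating scale, and the two sample use cases are assumptions made for this example, not values from any real CoE.

```python
from dataclasses import dataclass

# Hypothetical weights (they sum to 1.0); a real CoE would agree on these
# with its business stakeholders.
WEIGHTS = {
    "data_availability": 0.25,
    "data_quality": 0.20,
    "business_value": 0.35,
    "cost": 0.10,
    "risk": 0.10,
}


@dataclass
class UseCase:
    name: str
    data_availability: int  # 1 (hard to obtain) to 5 (readily available)
    data_quality: int       # 1 (poor) to 5 (clean)
    business_value: int     # 1 (low) to 5 (high)
    cost: int               # 1 (cheap) to 5 (expensive)
    risk: int               # 1 (low risk) to 5 (high risk)


def score(uc: UseCase) -> float:
    """Weighted score on a 1-to-5 scale; higher means higher priority."""
    benefits = (
        WEIGHTS["data_availability"] * uc.data_availability
        + WEIGHTS["data_quality"] * uc.data_quality
        + WEIGHTS["business_value"] * uc.business_value
    )
    # Cost and risk are inverted (6 - rating) so that cheaper, safer
    # use cases score higher.
    penalties = WEIGHTS["cost"] * (6 - uc.cost) + WEIGHTS["risk"] * (6 - uc.risk)
    return round(benefits + penalties, 2)


def prioritize(use_cases: list[UseCase]) -> list[UseCase]:
    """Return use cases ordered from highest to lowest score."""
    return sorted(use_cases, key=score, reverse=True)


# Illustrative (made-up) use cases and ratings.
backlog = [
    UseCase("network-capacity-forecast", 2, 3, 4, 4, 3),
    UseCase("customer-churn-prediction", 4, 4, 5, 2, 2),
]

for uc in prioritize(backlog):
    print(f"{uc.name}: {score(uc)}")
```

Publishing the weights alongside the ranked list lets business units see why one use case was scheduled ahead of another and challenge the ratings rather than the outcome.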

B) Applying agile methodology—the fail-fast approach

Agility and the ability to fail fast are essential to reaching the potential of big data. A lightweight agile process provides tools to deliver outcomes quickly and transparently, typically within two- to three-week sprints. The ability to fail fast is a key big data opportunity; business and technical roadmaps for delivering value need to change more often than in a traditional waterfall environment.

Data itself is also highly agile when it is collected in native form and transformed potentially many times to meet the needs of different use cases. Using the basic ideas of agile development methodology, a CoE can provide the leadership across the organization to ensure business users can quickly gain value from the data.

C) Developing financial models

At the heart of a big data CoE are creative financial models that support innovation. The charge-back strategy can be a function of data as a service, insights as a service or analytics as a service.

As is often the case with shared services, a charge-back model is necessary to properly handle the maintenance and growth of the emerging technologies, which in this case can be Hadoop and Spark clusters. An organization needs to develop a charge-back model for the business units that will be engaging with the CoE for project, personnel, infrastructure and application resources. Some important questions need to be considered when determining the charge-back model for business units: 

  • How many users will access the application and cluster?
  • How much data will be ingested initially?
  • How much data growth is expected over time?
  • What is the data retention policy? 
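The answers to these questions can feed a simple allocation formula. The following is a minimal sketch, not a recommended pricing scheme: it assumes the monthly cluster cost is split by a blend of each unit's share of users and share of projected storage, that the initial load is kept indefinitely, and that only incremental data ages out under the retention policy. All names, figures, and the 50/50 blend are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class BusinessUnit:
    name: str
    users: int                # users accessing the application and cluster
    initial_tb: float         # data ingested initially, in terabytes
    monthly_growth_tb: float  # expected data growth per month
    retention_months: int     # how long incremental data is kept


def projected_storage_tb(unit: BusinessUnit, month: int) -> float:
    """Storage held in a given month under the simplified retention rule."""
    retained_months = min(month, unit.retention_months)
    return unit.initial_tb + unit.monthly_growth_tb * retained_months


def charge_back(units, monthly_cost: float, month: int, user_weight: float = 0.5):
    """Split monthly_cost across units by blended user and storage share."""
    total_users = sum(u.users for u in units)
    total_tb = sum(projected_storage_tb(u, month) for u in units)
    if total_users == 0 or total_tb == 0:
        raise ValueError("need at least one user and some stored data")
    charges = {}
    for u in units:
        user_share = u.users / total_users
        storage_share = projected_storage_tb(u, month) / total_tb
        blended = user_weight * user_share + (1 - user_weight) * storage_share
        charges[u.name] = round(monthly_cost * blended, 2)
    return charges


# Illustrative (made-up) business units and cluster cost.
units = [
    BusinessUnit("marketing", users=30, initial_tb=10, monthly_growth_tb=2, retention_months=12),
    BusinessUnit("finance", users=10, initial_tb=5, monthly_growth_tb=0.5, retention_months=24),
]
charges = charge_back(units, monthly_cost=60_000, month=6)
print(charges)
```

A show-back variant would compute the same figures but only report them to each business unit without billing, which can be a gentler first step while the shared service is still small.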

Business leaders and decision makers acknowledge that creating a data-driven organization requires a change of culture. Big data CoEs can be the key to this culture change. An important recommendation for building a CoE framework is starting with a small, secure data lake—a Hadoop- or Spark-based service—that can store and process data from various internal groups to support multiple use cases. When building a data lake, organizations learn and employ operational best practices for a number of processes: 

  • Cluster build out
  • Data exploration
  • Data ingestion and processing
  • Disaster recovery
  • General operations and maintenance
  • Hadoop and Spark development
  • Infrastructure integration
  • Model building and testing
  • Multitenancy and security
  • Third-party software evaluation and integration
  • Use-case evaluation 

A leading telecommunications firm, for example, began by developing a CoE that asked each business division to come up with use cases that would generate powerful insights through analytics. It then established regular training boot camps in which business users learned how to use data with self-service tools, and it created a community of data scientists and data engineers to support line-of-business managers in their analyses and to validate findings. As a result, this CoE enabled big data as a shared service, which opened up the conversation about creative financial models that involve charge-backs and show-backs.

Leveraging big data centers of excellence

I foresee creative CoE adaptations such as the one just described helping businesses move beyond the hope of becoming a data-driven organization enabled by big data to the reality of an organization using a data-ingrained business model.

The article has been adapted from my original post at IBM Big Data Hub. The Big Data Hub is created and curated by IBM. It is the home for current content and conversation regarding big data and analytics for the enterprise from thought-leaders, subject matter experts and big data practitioners. 



