Implementing Data Foundations Using Agile (Part 2) – Potential Approaches
Introduction
Effectively matching the development approach or method to the problem being addressed is critical to increasing the odds of a successful implementation. Agile, a prominent development methodology in use today, can be leveraged everywhere! But should it be? Is it an effective match for all types of solutions? More specifically, is it the best method for developing and implementing data foundational or data-backbone components, e.g., Master Data Management solutions, Operational Data Stores, or even Data Warehouse solutions?
In my previous article, “Implementing Data Foundations Using Agile (Part 1) – The Challenges”, I discussed five key challenges I’ve faced when implementing foundational data solutions leveraging the Agile approach.
Challenge #1 – Obscure view of the complete picture of data requirements. This relates to how the Data Team can know requirements before each is actually determined or needed by the other teams, i.e., how it can predict the future.
Challenge #2 – Data Solution facades – the Hollywood Western Town. This relates to the backup of data foundational work due to the creation of service facades to keep the other Agile streams moving forward while the Data Team tries to keep up.
Challenge #3 – Absence of foundational product solution and tooling. This relates to the unforeseen tool requirements caused by not knowing all of the functional tooling needed to address all (known and unknown) requirements.
Challenge #4 – Suboptimal development sequencing of long running data solution components. This relates to “unnaturally” fitting the “natural” sequence of data foundational component development into the sprint-based discipline driven by the Agile model.
Challenge #5 – Data solution delays due to bi-directional misalignment of planning expectations. This relates to the inability of a Data Team to successfully implement functionality for which the need surfaces during the same or a parallel sprint, subsequently impacting one or more teams’ schedules.
This article, the second in the two-part series, discusses approaches and/or solutions I’ve used or seen used to address the challenges listed above when implementing data foundational solutions with the Agile approach.
Necessary “Tweaks” To the Agile Approach / Method
At the basic level, each of these challenges can be traced back to a lack of understanding of what’s actually required from the data foundational components, i.e., the use cases to be supported and/or the functionality and capabilities to be enabled. Without addressing the functionality and capabilities gaps, each inevitably leads to missed commitments by the Data Team to the other Agile teams and the overall program.
I previously asserted that “the effective matching of the approach or method to the problem being addressed is critical to increasing the odds of a successful implementation.” Working on the premise that the challenges I discussed will continue to surface unless addressed, tweaks or alternative approaches to the Agile method are necessary to properly match it to data foundational development. The next question, then, is: what are those necessary tweaks or alternative approaches?
The alternative approaches I’ve seen work fall into five general buckets – most focused on providing the Data Team with insight into what’s required before the actual functionality and/or capability is needed by the other Agile teams and/or the program as a whole.
Alternative Approach (Tweak) #1 – Hybrid approach. Iterative for some data sub-components, Agile for others, and maybe Waterfall for a couple of specific things:
The Hybrid Approach primarily focuses on three key activities supporting data foundational implementations: Architecture Definition, Requirements Gathering, and Technology Stack selection and deployment.
Having a good understanding of the requirements relating to the data foundational components, e.g., data sources, required ingestion and curation rules, required APIs, etc., prior to Agile development starting is critical to eliminating (or at least reducing) the “surprise factor” on the Data Team. Understanding the bulk of the required data foundational functionality (understanding all of it is not possible) gives the Data Team an obvious head start on what is needed. It can then begin development on the solution components, or at least on frameworks and code snippets, thereby reducing the time needed during the Agile build phase to build out the required use case functionality in the timeframe required to support the other Agile streams.
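To make “framework and code snippet development” concrete, here is a minimal sketch (in Python; all class and method names are hypothetical) of the kind of pluggable ingestion skeleton a Data Team could build before the full set of use cases is defined, with concrete sources and curation rules slotted in later as requirements land:

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Iterable, List


class CurationRule(ABC):
    """A single, pluggable curation step applied to each ingested record."""

    @abstractmethod
    def apply(self, record: Dict[str, Any]) -> Dict[str, Any]:
        ...


class Source(ABC):
    """Abstracts a data source; concrete sources are added as requirements surface."""

    @abstractmethod
    def read(self) -> Iterable[Dict[str, Any]]:
        ...


class IngestionPipeline:
    """Generic ingest-then-curate skeleton, built ahead of any specific use case."""

    def __init__(self, source: Source, rules: List[CurationRule]):
        self.source = source
        self.rules = rules

    def run(self) -> List[Dict[str, Any]]:
        curated = []
        for record in self.source.read():
            # Each rule is applied in order; new rules can be added per use case.
            for rule in self.rules:
                record = rule.apply(record)
            curated.append(record)
        return curated
```

The point of the sketch is the seam: when a new source or curation rule surfaces mid-sprint, only a small plug-in needs to be written, not the pipeline itself.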
Similar to requirements gathering, defining the supporting architecture prior to Agile development starting (driven to a large extent by the requirements gathered) is also critical to enabling the Data Team. Although there is a discipline referred to as “Agile Architecture” (taught to me by my mentor Darcy Lalor) in which you build out the architectural components as required, it doesn’t negate the need to understand, at least at a high level, the overall architecture of data foundational components. This definition of the architecture gives the Data Team the understanding and perspective necessary to ready itself to enable what will eventually be needed. Defining and understanding the architecture prior to development starting gives the Data Team the opportunity to prove out supporting technologies and make the proper selections – before the use cases requiring those capabilities are even defined. This allows the Data Team’s Agile implementation phase to focus on enabling the required capabilities rather than the underlying supporting architecture.
Understanding functional requirements within the context of a defined architecture drives effective technology selection, and therefore increases the probability that the correct, and complete, technology stack can be selected prior to Agile development starting. This greatly reduces the chance that required functionality, uncovered during the Agile implementation, cannot be supported or enabled, thereby reducing the potential schedule impact of having to deploy unforeseen tool functionality mid-stream.
By starting the requirements gathering, architecture definition, and technology stack selection and validation prior to the actual Agile implementation phase of a project, the negative impacts of Challenge #1 and Challenge #3 will be greatly mitigated.
Alternative Approach (Tweak) #2 – Stagger the data component development relative to the development of the consumers leveraging that functionality:
Getting an early start on defining the data foundational architecture, and even selecting the technology stack, can arguably be part of the initial or early sprints. However, complete requirements gathering prior to full Agile development starting is a tough sell, especially with all the other Agile teams “champing at the bit” to get going. What then?
An approach that I’ve used successfully is staggering the sprint in which data foundational requirements are gathered and the sprint in which the functionality and/or capability is actually required. Here the requirements gathering sprint precedes the implementation sprint by a number of program sprints; e.g., portal data API requirements are gathered in sprint 6, while the portal does not actually need the finished API until sprint 8. Using this approach, the Data Team has (theoretically) enough time to build out the capability in advance of the timeframe required by the leveraging application, minimizing the need for a façade to be developed and the challenges that evolve from it.
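As a simple illustration, here is a minimal sketch (Python; the capability names, sprint numbers, and minimum lead time are all hypothetical) of the kind of check a Data Team could run over its plan to flag capabilities whose requirements arrive too close to the sprint in which a consumer needs them:

```python
# Hypothetical capability plan: (capability, requirements-gathering sprint,
# sprint in which a consuming team needs the finished capability).
CAPABILITY_PLAN = [
    ("portal-data-api", 6, 8),
    ("customer-mdm-match", 7, 8),  # only one sprint of lead time
]

MIN_LEAD_SPRINTS = 2  # assumed minimum stagger for the Data Team to build ahead


def flag_insufficient_stagger(plan, min_lead=MIN_LEAD_SPRINTS):
    """Return capabilities whose requirements arrive too close to the need
    date, i.e., candidates for a facade and the backlog problems it brings."""
    return [
        (name, needed - gathered)
        for name, gathered, needed in plan
        if needed - gathered < min_lead
    ]


if __name__ == "__main__":
    for name, lead in flag_insufficient_stagger(CAPABILITY_PLAN):
        print(f"WARNING: {name} has only {lead} sprint(s) of lead time")
```

Anything the check flags is a façade waiting to happen, and a candidate for re-sequencing in the program plan.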
This approach achieves many of the benefits of knowing the requirements ahead of time without the need to determine all of them up-front and/or prior to the start of any Agile development. As such, the negative impacts of Challenge #1, Challenge #2, and, to some extent, Challenge #4 will be largely mitigated. But what if the potential consumers of data foundational capabilities don’t know their requirements two or three sprints earlier? Many have difficulty knowing their data requirements as little as one sprint ahead of the need. What then?
Alternative Approach (Tweak) #3 – Parallel Data Team development streams. A façade development team to support the other Agile teams, and a build team to deliver the Data Team’s commitments:
The Agile Method by its very design looks at two-week[1] intervals in which to develop features and/or functionality, based on defined stories, that are required during that interval. The idea of thinking three, four, or even a single sprint ahead of the need is somewhat contrary to the method. So how does the Data Team support the continued progress of the other Agile teams without running into the Western Town façade challenge and the associated work backup challenges?
An approach that tends to address this situation is having two parallel development teams: one focused on developing the necessary API and service façades supporting the immediate progress of the other Agile teams, the other focused on the full implementation of the capabilities required to support the target solution. As requirements are tabled, one team works with the API / service consuming teams to determine what is needed to keep them moving, while the other tackles the development and build-out of the full capabilities.
Governance between the two teams is critical. The team creating the façade is essentially defining the integration and interface “contracts” between the data foundational component and the consumers of that component’s services. The façade team must ensure alignment with the patterns defined by the architecture, and the development stream must align with the integration and interface patterns contained within the façade. For example, the fully functioning API must contain the same input and output parameters, with the same meaning, as those contained within the façade API provided to the consuming Agile team. Failing this, rework and the associated cost and schedule impacts will occur on the consumer side.
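One simple way to enforce that contract is to have both teams implement the same interface definition. Below is a minimal sketch (Python; the API, class, and field names are all hypothetical) of a façade and a full implementation sharing one contract:

```python
from abc import ABC, abstractmethod
from typing import Optional


class CustomerApi(ABC):
    """The integration 'contract': both the facade team and the build team
    implement this exact signature, so consumers never need rework."""

    @abstractmethod
    def get_customer(self, customer_id: str) -> Optional[dict]:
        ...


class CustomerApiFacade(CustomerApi):
    """Facade team: returns representative canned data so consuming Agile
    teams can keep moving before the real capability exists."""

    def get_customer(self, customer_id: str) -> Optional[dict]:
        return {"customer_id": customer_id, "name": "Test Customer", "status": "ACTIVE"}


class CustomerApiFull(CustomerApi):
    """Build team: the full implementation honours the same contract, backed
    by the real data foundation (repository details omitted)."""

    def __init__(self, repository):
        self.repository = repository

    def get_customer(self, customer_id: str) -> Optional[dict]:
        return self.repository.find_by_id(customer_id)
```

Because both implementations honour the same abstract contract, swapping the façade for the full capability becomes a configuration change rather than consumer-side rework.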
This approach focuses directly on mitigating the negative impacts associated with Challenge #2, i.e., the backup of work caused by the creation of façades. In addition, it will assist in reducing the negative impact of Challenge #5 by developing the target functionality in parallel with the façades using two development teams rather than sequentially with a single team. It should be noted, however, that this alternative will lead to additional costs relating to the larger Data Team.
Alternative Approach (Tweak) #4 – End-to-End Up-front Planning:
Regardless of the development methodology being leveraged, planning, at least at a high level, is critical. More specifically, end-to-end planning is critical. The end state has to be known and understood, but more importantly, dependencies between program components must be understood, as well as the timing of those dependencies. Knowledge and timing of bi-directional dependencies are of particular importance to the team implementing the data foundational component – primarily because most solution components depend on the data foundation to get data, or the data foundation depends on them to source data. Not knowing that a dependency is coming, or not understanding it until the sprint in which it exists, leads to very reactive development and, in the case of data foundational components, façades.
Whether the program as a whole performs some form of end-to-end up-front planning or not, it is critical that the Data Team does so in order to understand its upstream and downstream dependencies. Lacking this knowledge, the Data Team will be in a continuous reactive state, unable to plan and “get ahead of the curve” with respect to enabling the required functionality and capabilities in support of the data foundational consumers. Equipped with an understanding of both upstream and downstream dependencies, the Data Team can plan its sprints and stories with those dependencies in mind, enabling increased communication and more effective expectation management (Challenge #5).
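Even a lightweight dependency map goes a long way here. As a minimal sketch (Python; the component names and sprint numbers are hypothetical), the Data Team could record each bi-directional dependency along with the sprint in which it must be satisfied, then roll them up per sprint to see where commitments cluster:

```python
from collections import defaultdict

# Hypothetical bi-directional dependency list:
# (depends_on, dependent, sprint in which the dependency must be satisfied).
DEPENDENCIES = [
    ("billing-source-feed", "data-foundation", 4),  # upstream: sourcing
    ("data-foundation", "portal-team", 8),          # downstream: consumption
    ("data-foundation", "reporting-team", 8),
]


def sprint_workload(dependencies):
    """Group dependencies by sprint so the Data Team can see where
    multiple commitments land at once and plan stories ahead of them."""
    by_sprint = defaultdict(list)
    for upstream, downstream, sprint in dependencies:
        by_sprint[sprint].append((upstream, downstream))
    return dict(sorted(by_sprint.items()))


if __name__ == "__main__":
    for sprint, deps in sprint_workload(DEPENDENCIES).items():
        print(f"Sprint {sprint}: {len(deps)} dependency(ies) due -> {deps}")
```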
Alternative Approach (Tweak) #5 – Up-front Training:
Although not actually a tweak to the Agile method, training is key! The Agile method, and the development of data foundational solutions and/or components with it, is new to many data people. The necessary paradigm shifts can be difficult for even the most seasoned members of data development teams.
Referring back to Challenge #4, there is an inherent dependency and/or coupling between the steps and components of a data solution. Data requirements are gathered, and data is sourced, curated, modeled, stored, and consumed, all collectively serving a singular purpose: to fulfill a data consumer need, i.e., the end-to-end development of a data use case. Fitting this “natural” sequence of data foundational component development into the sprint-based discipline driven by the Agile model seems, to many, contrary and “unnatural”. Up-front education and training are critical for the Data Team to understand how to fit the world they know, and are comfortable with, into two-week parcels. It may seem simple, but in reality, once stories have been defined and story points assigned, it’s anything but simple.
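As a purely illustrative example (Python; the use case, story names, point values, and sprint assignments are all hypothetical), here is what decomposing one end-to-end data use case into sprint-sized parcels might look like, with a per-sprint roll-up to sanity-check capacity:

```python
# Illustrative only: one end-to-end data use case broken into sprint-sized
# stories. All story names and point values are hypothetical.
USE_CASE = "Customer 360 view for the service portal"

STORIES = [
    {"story": "Profile and document billing source data", "points": 3, "sprint": 5},
    {"story": "Build billing ingestion job",               "points": 5, "sprint": 6},
    {"story": "Curate and match customer records",         "points": 8, "sprint": 6},
    {"story": "Model and persist the customer entity",     "points": 5, "sprint": 7},
    {"story": "Expose customer lookup API",                "points": 5, "sprint": 8},
]

# A simple per-sprint roll-up helps check that the "natural" end-to-end
# sequence actually fits within each two-week parcel's capacity.
for sprint in sorted({s["sprint"] for s in STORIES}):
    total = sum(s["points"] for s in STORIES if s["sprint"] == sprint)
    print(f"Sprint {sprint}: {total} points")
```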
Training focuses directly on mitigating the negative impacts associated with Challenge #4. Without it, the necessary paradigm shifts at best will take too long to achieve and at worst never happen, either situation putting the overall program at risk.
Conclusion
I stated earlier that the effective matching of the approach or method to the problem being addressed is critical to increasing the odds of a successful implementation. Empirical evidence suggests the Agile approach requires some tweaks or alternative approaches in order to match it to the data foundational problem set. I discussed five alternative approaches (tweaks) that I’ve observed and leveraged, and how each worked in addressing the challenges I laid out in the previous article.
Alternative Approach (Tweak) #1 – Hybrid approach. Iterative for some data sub-components, Agile for others, and maybe Waterfall for a couple of specific things:
Challenge #1 – Obscure view of the complete picture of data requirements. This relates to how the Data Team can know requirements before each is actually determined or needed by the other teams, i.e., how it can predict the future.
Challenge #3 – Absence of foundational product solution and tooling. This relates to the unforeseen tool requirements caused by not knowing all of the functional tooling needed to address all (known and unknown) requirements.
Alternative Approach (Tweak) #2 – Stagger the data component development relative to the development of the consumers leveraging that functionality:
Challenge #1 – See above.
Challenge #2 – Data Solution facades – the Hollywood Western Town. This relates to the backup of data foundational work due to the creation of service facades to keep the other Agile streams moving forward while the Data Team tries to keep up.
Challenge #4 – Suboptimal development sequencing of long running data solution components. This relates to “unnaturally” fitting the “natural” sequence of data foundational component development into the sprint-based discipline driven by the Agile model.
Alternative Approach (Tweak) #3 – Parallel Data Team development streams. A façade development team to support the other Agile teams, and a build team to deliver the Data Team’s commitments:
Challenge #2 – See above.
Challenge #5 – Data solution delays due to bi-directional misalignment of planning expectations. This relates to the inability of a Data Team to successfully implement functionality for which the need surfaces during the same or a parallel sprint, subsequently impacting one or more teams’ schedules.
Alternative Approach (Tweak) #4 – End-to-End Up-front Planning:
Challenge #5 – See above.
Alternative Approach (Tweak) #5 – Up-front Training:
Challenge #4 – See above.
As with Part 1, the purpose of this article was not to undermine the value or effectiveness of the Agile method, or to suggest that it should never be used when implementing data foundational initiatives, but rather to drive discussion relating to the challenges of using it (Part 1) and some approaches / tweaks to address those challenges (Part 2 – this article). As I stated in the previous article, it sometimes appears to me that, from the perspective of matching the most effective methodology to the problem set being addressed, using Agile may be trying to fit “a round peg into a square hole” when it comes to data foundational development. In this article I’ve attempted to demonstrate, based on empirical evidence and my experience, how to make that peg a little more square, and the hole a little more round.
I’d love to hear your comments or thoughts on the content - good or bad, agreement or counter views, additional challenges, etc. If you like it, great. But more interesting to me (and potentially to other readers) are comments and potentially differing opinions and views that will drive further discussion.
Thanks for your time, and I look forward to hearing from you!
Author:
Chris Todd, Chief Architect, Data Platform Services - IBM Consulting – IBM Canada Ltd.
Projects will fail if we don’t get the data part right – and Chris has devoted his entire career to this. Chris is an award-winning Chief Data Solution Architect and global thought leader in Information Architecture, data solutions, and Analytics – including strategy and governance – but is also able to “go deep” into design and implementation. Chris has spent his career deploying foundational and First-Of-A-Kind IA-based data-driven solutions forming the data and analytics backbones of some of the largest organizations in Canada. He has worked with customers, and IBM teams, in Canada and the United States, as well as in countries such as Taiwan, Spain, Scotland, and England.
As a recognized expert providing technical leadership in the architecture, design, and implementation of a vast array of data solutions, Chris is acknowledged as the “go-to guy” for anything data by many of his peers. Continually pushing the envelope to find new and innovative ways to address the world’s data needs, Chris is skilled in driving the implementation and delivery of complex data solutions that enable the realization of data-driven use cases. By successfully addressing some of the most complex data challenges clients have faced, Chris has been a trusted advisor to IBM clients in a broad range of industries and sectors. Chris is also an acknowledged mentor to IBM and client personnel, taking pride in the development and growth of the next generation of data enablers.
Reviewers:
Wayne Pakkala – Senior Managing Consultant, Data Platform Services, IBM Consulting – IBM Canada Ltd.
[1] Agile implementations also tend to base work on Planning Increments (PIs), which can vary between 6 and 12 weeks, with each PI containing 3-6 two-week sprints.