Kanban vs. Scrum vs. Agile: Where do we sit in the Data Science World?

I recently had the pleasure of delivering some data science training, and like all training sessions, this one didn't disappoint: it threw up plenty of discussion points and areas to think about. Specifically, we explored project management in Data Science: what it looks like, and which approach is best to use.

Firstly, let's stop to understand the project management styles from the title.

Kanban – First developed in the 1940s by a Toyota engineer named Taiichi Ohno, the word in Japanese means 'sign' or 'visual board', which gives a good clue as to how this technique should be used. Its flexibility and adaptability are something that many in the technology industry are drawn to. The aim of the methodology is to help teams reduce bottlenecks, improve efficiency, increase quality, and boost output.

There are four main principles:

  1. Focus on now: fully understand the process already in place and what does and doesn't work.
  2. Incremental approach: change the process slowly over time and avoid the big bang.
  3. Keep roles: work within the roles your team already has. If you are a DBA, remain a DBA; if you are a Data Scientist, remain a Data Scientist.
  4. Encourage leadership: everyone should be encouraged to bring forward new ideas or areas we can improve.

Next, let's stop and explore Scrum. Scrum is an iterative and incremental technique for product delivery through feedback and collaborative decision-making. The work is time-boxed, so time and cost are fixed and the requirements flex to fit, with collaboration through specific feedback cycles used to prioritize the product backlog.

Some essential items of the Scrum technique:

  • The product owner creates a product backlog, a wish list of items that the product should have.
  • The Scrum team conducts a sprint planning session where the tasks necessary to complete items on the wish list are broken down into small, more easily manageable chunks, and people estimate how long they will take.
  • The team creates a sprint backlog and plans its implementation.
  • The team gets together every day for a brief Scrum meeting (often referred to as a Daily Stand up) where each team member shares daily updates, helping the team and the project manager assess the project's progress.
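As a rough illustration of the backlog steps above, here is a minimal Python sketch of sprint planning. Everything in it is an assumption made for the example rather than part of Scrum itself: the `BacklogItem` fields, the "lower number means higher priority" convention, the sample items, and the simple rule of filling the sprint in priority order until the time box is used up. Real teams negotiate this in the planning session rather than computing it.

```python
from dataclasses import dataclass


@dataclass
class BacklogItem:
    title: str
    priority: int         # assumed convention: lower number = more important
    estimate_days: float  # the team's estimate from sprint planning


def plan_sprint(product_backlog, capacity_days):
    """Fill a time-boxed sprint in priority order, skipping items that no longer fit."""
    sprint_backlog = []
    remaining = capacity_days
    for item in sorted(product_backlog, key=lambda i: i.priority):
        if item.estimate_days <= remaining:
            sprint_backlog.append(item)
            remaining -= item.estimate_days
    return sprint_backlog, remaining


# Hypothetical wish list owned by the product owner.
product_backlog = [
    BacklogItem("Build churn dashboard", priority=3, estimate_days=5),
    BacklogItem("Profile customer data", priority=1, estimate_days=3),
    BacklogItem("Train baseline model", priority=2, estimate_days=4),
]

sprint_backlog, slack = plan_sprint(product_backlog, capacity_days=8)
for item in sprint_backlog:
    print(item.title, item.estimate_days)
print("Unplanned capacity (days):", slack)
```

The sketch depends entirely on `estimate_days` being known up front, which is the assumption the rest of this article questions for data science work.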

So where does this leave us? Enter Data Science and Machine Learning.

For many Data Science projects, we use the CRISP-DM methodology (the Cross-Industry Standard Process for Data Mining).

Figure: the CRISP-DM process. Source: https://en.wikipedia.org/wiki/Cross-industry_standard_process_for_data_mining

The problem is that we often can't tell you how long it will take to understand the data. It depends on the history of the data, its quality, and its variability, to name a few. And that's before we put it in the context of the business understanding: is the problem I'm trying to solve even solvable?

So, time-boxing these tasks can be challenging; a task might take an hour, but equally, it could take a week. This is the first point where Kanban can be more conducive to the CRISP-DM approach. It focuses on the now (what is the business problem?) but also respects the iterative nature of machine learning: data elements are reworked through enrichment, and learning happens experimentally. Using the board to represent visually where we are in the process also helps others understand, while respecting that a data scientist should do data scientist tasks. Where this doesn't quite fit is the deployment aspect: how do I time my machine learning model to fit into the overall structures of the organization?

Here we are looking holistically at the problem. What I mean is that the machine learning model (the Data Science aspect) may be iterative. However, it needs to slot in with the upstream data feeds and, hopefully, the downstream data products being created for users. This is no longer an isolated problem; it involves a broader team, from business users, data engineers, and IT deployment experts through to potential trainers of front-end users.

Figure: ML models sit within a stream, with parts upstream and downstream that must be considered in the wider environment.

This is still an agile process, but the wider teams need regular feedback, an understanding of their needs, and clear endpoints. This is where Scrum helps. It also has three specific roles: product owner, Scrum team, and Scrum master. This is interesting, as the product owner keeps the overall project flowing towards the data application or product, ensuring a sense of "target" or definition of done.

We can see a case for both in the world of Data Science. Enter, stage right, Scrumban: initially created in 2009 as a transition methodology from Scrum to Kanban and Lean.

In Scrumban, the team's work is organized into small iterations and monitored with the help of a visual board, the Kanban board. Teams working in similar areas or spaces use the board to represent where they are in the work. Planning meetings determine which User Stories to complete in the next iteration. The User Stories are then added to the board, and the team completes them, working on as few User Stories at a time as practical.
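To make that flow concrete, here is a small Python sketch of a Scrumban-style board. The class name `ScrumbanBoard`, the three column names, the story titles, and the fixed `wip_limit` are all assumptions for illustration. The limit is one simple way to express "as few User Stories at a time as practical"; it is not a rule taken from any particular tool.

```python
class ScrumbanBoard:
    """A toy visual board with a work-in-progress (WIP) limit."""

    def __init__(self, wip_limit):
        self.wip_limit = wip_limit  # hypothetical cap on stories in progress
        self.columns = {"To Do": [], "In Progress": [], "Done": []}

    def plan(self, stories):
        """Planning meeting: add the iteration's User Stories to the board."""
        self.columns["To Do"].extend(stories)

    def start(self, story):
        """Pull a story into progress, refusing if the WIP limit is reached."""
        if len(self.columns["In Progress"]) >= self.wip_limit:
            raise RuntimeError("WIP limit reached; finish something first")
        self.columns["To Do"].remove(story)
        self.columns["In Progress"].append(story)

    def finish(self, story):
        self.columns["In Progress"].remove(story)
        self.columns["Done"].append(story)

    def show(self):
        """Render the board as plain text, one column per line."""
        return "\n".join(
            f"{name}: {', '.join(cards) or '-'}"
            for name, cards in self.columns.items()
        )


board = ScrumbanBoard(wip_limit=1)
board.plan(["Explore data quality", "Engineer features"])
board.start("Explore data quality")

try:
    board.start("Engineer features")  # a second story exceeds the limit
    blocked = False
except RuntimeError:
    blocked = True

board.finish("Explore data quality")
board.start("Engineer features")
print(board.show())
```

Note that nothing here needs a time estimate: a story can sit in "In Progress" for an hour or a week, which suits experimental work, while the planning step still gives the wider team a regular checkpoint.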

So, there isn't one best approach; different teams use different techniques, and understanding how your Data Science or ML work will fit within your organization means you can arrange your project management appropriately. The key thing to remember is the fluidity needed in something that can be experimental.
