How to Get Scrum Done on a Hybrid Data Team
Our Data Team is an even split between Data Scientists and Data Engineers, for a total of about 12 people. As our team took shape, we needed to acknowledge that Data Science and Data Engineering have different life cycles. We wanted to be mindful of this as we implemented Scrum into our practices.
In the software development life cycle, engineers ideally should spend most of their time on design, implementation, testing and maintenance. These steps represent the tail end of the life cycle. Steps such as planning and analysis are typically owned by other teams in the organization, such as Senior Leadership, Product, UX or Project Management.
Therefore, the emphasis for Data Engineers centers around:
- Clear analysis
- Tech stack design
- Development
- Delivery
- Efficient support and maintenance
Conversely, during the Data Science development life cycle, Analysts and Scientists will spend most of their time on business understanding, data mining and data cleansing or exploration — the beginning of the cycle. This is the essential process to understand and frame problems to tackle. Once the problem is clearly identified and the data is structured in such a way to solve that problem, most of the work is done.
As such, the emphasis for Scientists and Analysts centers around:
- Clear problem definition
- Fast and effective data munging
- Feature engineering
- Quick and iterative model engineering
- Feedback from stakeholders
The big challenge is reconciling these life cycles and identifying the most effective way for the teams to work together. Data Scientists usually focus on talking to customers to identify problems; creating data profiles; modeling and cleansing data; and creating machine learning models. Many times, ad-hoc Data Science solutions need to be built out with the help of Data Engineers. These solutions can help with the automation, cleansing, execution, monitoring and alerting of the Data Science models.
How We Do It
Now that we’ve established the workflow of each team, let’s take a look at how we execute Scrum, starting at a high level and funneling down to smaller details:
- Yearly Goals
- Quarterly Planning and Initiatives
- Scrum Ceremonies
We’ve found that in standardizing our Scrum practices — that is, having both teams use the same ceremonies and cadence — we’re able to automate the process and reconcile the different development life cycles.
In addition, as the teams got more familiar with what to expect and when, developers took more ownership of the process and productivity increased by at least 20%.
Did you just read the word process and want to stop reading? We get it. But for us, process is not that bureaucratic contortionism that ends up slowing everything down and directly conflicts with Agile. We believe our process provides developers with the right balance of flexibility and structure. It’s helpful for them to know exactly what to expect, and when. And when something isn’t working, we can talk through it as a team and try something new. This is also helpful for the Product and Leadership teams; it allows them to have a simple handle on complex software development and predictable check-ins with the team, without having to micromanage.
1. Yearly Goals
Every organization should have yearly goals and, ideally, multi-year goals. Data Scientists and Engineers are smart, capable and responsible people. In order to be the most effective at their jobs, they want to know where the company is going. In a perfect world, these goals will come in the form of OKRs or SMART goals: clear, measurable and backed by a “why.”
At our organization, goals are set top-down. Leadership identifies the strategic metrics we should be moving toward as a company, and then departments, teams and individuals huddle to create ideas on how to move the needle on those initiatives.
Once high-level company goals are set, our team drafts yearly goals and then splits those into quarterly milestones that we track during our bi-weekly team meeting (more on that later).
2. Quarterly Planning and Initiatives
The Data Team leaders (including our Product Owners) take a first stab at crafting quarterly goals, filling in as much detail as possible for the team to understand the why and just some of the what (initiatives) — but definitely not the how. Every initiative has two elements — content and process:
- Content: What you commit to achieving, such as writing software or implementing a visualization
- Process: The activities you perform to keep the work on track, such as having OKR check-ins
The goal over the span of the year is for the Product team to be at least a quarter ahead of Engineering/Scientists in terms of planning. In our business we also have to account for seasonality, but for illustrative purposes that has been left out of the diagram.
Each quarter, we hold a one-day planning session attended by both Scientists and Engineers. The team huddles around the initiatives and has representation from both sides as needed. Initiatives are brought to the team two weeks prior:
- A lead is assigned to each initiative.
- The team and the lead are in charge of studying the initiative, asking questions, developing potential solutions and defining the details in collaboration with others in the organization.
- Bigger initiatives may require as much as a whole quarter of a lead working part-time to make sure the solution is defined and approved appropriately before planning.
Outcomes of quarterly planning:
- Stories are created: The team breaks an initiative into mid-level stories (based on INVEST criteria) and points them.
- Commitment is made: The team identifies the commitment for the quarter and the priority of the stories. The commitment can be made based on past velocity. If velocity is not defined, assume approximately 4-5 points per person, per sprint.
- Sprints are staged and ready for work: The Product Owner and Team Leads stage all the sprints for the quarter and prioritize the work accordingly.
- Non-product work is considered: For each sprint, we allocate 10-20% of time to achieve personal or technical goals. This could include training, spiking, tech debt or a personal project.
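To make the commitment arithmetic concrete, here is a minimal sketch in Python. It assumes a team of six, six two-week sprints in the quarter, and that the 10-20% of non-product time comes off the top of the 4-5 points per person; the function name `quarterly_capacity` and all of those inputs are hypothetical and only there for illustration.

```python
def quarterly_capacity(people, sprints,
                       points_per_person=(4, 5),
                       non_product_pct=(10, 20)):
    """Return a (low, high) range of story points available for product work.

    Assumption: non-product time (training, spikes, tech debt) is subtracted
    from the per-person point estimate. Results are rounded down.
    """
    low_pts, high_pts = points_per_person
    low_pct, high_pct = non_product_pct
    # Conservative end: fewest points, largest non-product allocation.
    low = people * sprints * low_pts * (100 - high_pct) // 100
    # Optimistic end: most points, smallest non-product allocation.
    high = people * sprints * high_pts * (100 - low_pct) // 100
    return low, high


# Hypothetical team: 6 people, 6 sprints in the quarter.
low, high = quarterly_capacity(people=6, sprints=6)
print(f"Commit to roughly {low}-{high} points this quarter")
```

Once a team has a few sprints of real velocity, that history is a better input than the default range.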
Our motto? Things change, so plan for the best and prepare for the worst.
At the end of the quarterly planning session, everyone should have a clear understanding of what is expected to be delivered by the end of the quarter.
It’s worth noting our viewpoint that Data Science work can and should be planned the same as other engineering work. Our team has the flexibility to follow processes that work for them; in the case of Data Science, we huddle around an experiment and establish set goals. Initial estimation of these experiments might not be accurate at first, but as the team gets more familiar with the data, industry and problems they are trying to solve, estimates become more accurate over time.
3. Scrum Ceremonies
One of the best things we did for the team was to establish consistent Scrum ceremonies. As mentioned earlier, simply knowing exactly what to expect, and when, increased the team’s average productivity by at least 20%.
In addition to increased productivity, we also saw an increase in the teams’ ownership of the process. The developers know how to prepare for each ceremony and what to do during the ceremony. The presence of an Engineering Manager or Scrum Master was not needed as much — the team really started owning the process.
Our Scrum ceremonies include:
- Daily stand-up
- Sprint planning
- Sprint review prep
- Sprint review
- Sprint retrospective
- Backlog refinement
- Release planning
Our two-week sprints are staggered between the teams; when Engineers are on week 1 of their sprint, Scientists are on week 2 of theirs.
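The stagger is easy to express in code. The sketch below assumes each team's sprints start on a Monday and that the Scientists' cycle begins one week before the Engineers'; the anchor dates and the helper name `sprint_week` are hypothetical.

```python
from datetime import date, timedelta

SPRINT_DAYS = 14  # two-week sprints


def sprint_week(day, first_sprint_start):
    """Return 1 or 2: which week of its two-week sprint a team is in on `day`."""
    days_into_sprint = (day - first_sprint_start).days % SPRINT_DAYS
    return 1 if days_into_sprint < 7 else 2


# Hypothetical anchors: both are Mondays, one week apart.
scientists_start = date(2024, 1, 1)
engineers_start = scientists_start + timedelta(weeks=1)

today = date(2024, 1, 10)
print("Engineers are on week", sprint_week(today, engineers_start))
print("Scientists are on week", sprint_week(today, scientists_start))
```

Because the two anchors are exactly one week apart, the teams are never on the same sprint week on any given day.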
Example Sprint Schedule
Scrum Lessons Learned
Over time, we’ve tweaked our ceremonies to make Scrum work for us. Here are a few examples:
Team Agreements for Daily Stand-ups:
- Be mindful. Don’t tune out. Really listen to what your teammates say, and look for opportunities for pair programming or helping someone who is stuck.
- Please be on time. Be seated/standing and ready to roll.
- Please don't bring your phones, and if you have your laptop, close it. The whole point is to be actively listening to all of your teammates for the duration of the standup. (If you're on-call, have your phone, but please keep it out of sight.)
- Be as concise as humanly possible. Avoid big-picture stuff and implementation details. Let us know what you're working on today and what's blocking you from achieving that.
- When applicable, provide an ETA for when you expect to put up a PR for your ticket, and hold yourself accountable to that date.
Sprint Reviews:
We’ve found that Sprint Reviews are a great way to keep the whole team in the loop as to what the other team has delivered in the last sprint and what is coming up next. We use the “Sprint Goal” feature in JIRA, which guides the team in articulating what they will deliver at the end of each iteration. The Product Owner kicks off the Review by stating what the sprint goal was, and describing if we reached that goal or not. During Sprint Review the team demos what was accomplished. We also added in a Sprint Review Prep meeting the day before a Review. This usually takes only 5-10 minutes, and allows us to plan out who’s demoing what, in what order, and which stakeholders (if any) should be invited to the Review. The Scrum Master sends out an agenda the day before so everyone knows what to expect.
This is also the meeting that leadership can attend to see progress, ask questions and hear the goals for the next sprint. Having this checkpoint with the team at the end of each sprint helps leadership stay up to date with the team’s deliverables without micromanaging their daily activities.
Retrospectives:
We’ve experimented with doing one-hour retros once a month, but now do 30-minute retros once every two weeks. We use funretro.io, which allows remote employees to contribute to the retro board and post their thoughts about the last sprint. The Scrum Master posts the notes to the team’s Slack channel, and then we review any action items at the next retro.
Release Planning:
As we began to deploy more frequently, we identified the need for a weekly release planning meeting. Attendees include any developers who have code to deploy, a QA rep, Product Owner and Scrum Master. We agree on the deploy schedule for the following week, assess risk, talk through the testing plan for each ticket and discuss rollback steps, should we need them.
Sprint Protector/ On-Call Rotation:
Because our on-call issues were fairly low volume, we added a second set of responsibilities to this role: Sprint Protector. The Sprint Protector takes care of any ad-hoc requests that come to the Data Team so that the rest of the team can focus on the sprint goals. This person is not expected to commit to sprint work for the week they are on-call, but is free to pick up tickets as time allows. The Sprint Protector is expected to track actions that take more than a few minutes to resolve; this helps the next person on-call and keeps track of recurring issues that might get addressed by a permanent fix.
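As a rough illustration, the rotation and the tracking log could be modeled as follows. This sketch assumes a simple round-robin weekly rotation and a log where each entry carries an `issue` label; the names, the log format and the threshold of two occurrences are all hypothetical.

```python
from collections import Counter
from itertools import cycle, islice


def rotation(team, weeks):
    """Assign one Sprint Protector per week, cycling through the team in order."""
    return list(islice(cycle(team), weeks))


def recurring_issues(log, min_count=2):
    """Return issue labels logged at least `min_count` times.

    These are the candidates for a permanent fix.
    """
    counts = Counter(entry["issue"] for entry in log)
    return [issue for issue, n in counts.items() if n >= min_count]


# Hypothetical team and log entries.
team = ["Ana", "Ben", "Chen"]
print(rotation(team, weeks=5))

log = [
    {"week": 1, "protector": "Ana", "issue": "nightly load failed"},
    {"week": 2, "protector": "Ben", "issue": "dashboard access request"},
    {"week": 3, "protector": "Chen", "issue": "nightly load failed"},
]
print(recurring_issues(log))
```

Even a shared spreadsheet serves the same purpose; the point is that the log outlives each person's week in the role.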
No Meeting Days:
You read that right! We decided to make Wednesdays a “No Meeting Day” for the Data Team. This has helped to increase productivity, decrease context switching and give the Engineers and Scientists much-needed heads-down time.
Data Team Meeting:
This one-hour, bi-weekly meeting has become an important part of our team culture. It’s an opportunity for the whole team to come together (Leadership, Engineers, Scientists, Product Owner, Scrum Master and QA) and regroup on where we’ve been and where we’re headed. The meeting used to be held on Mondays, but we moved it to Tuesdays since energy tended to be low on Monday mornings. During this meeting we review metrics (such as quarterly and yearly goals, sprint velocity and time to release), talk about what’s going on at the company and open the floor to anything that people want to discuss. This meeting is truly about checking in and bonding.
Make The Process Work For You
While software and Data Science development should be independent due to the life cycle differences outlined at the beginning of this article, it is possible to smoothly integrate the two using tools and practices that emphasize speed, flexibility and performance.
Scrum is a tool that has helped us organize and collaborate in a systematic way, while still allowing us the freedom to be flexible. The process doesn’t have to be cumbersome. We like to think about the big picture first, and from there, funnel down to yearly goals, quarterly initiatives and finally day-to-day Scrum ceremonies.
That’s how we get Scrum done on our hybrid data team — we hope you can utilize some of these practices on your own teams.