Driving Data Science Initiative: a Simple Four-Stage Model
Photo by Erik Witsoe on Unsplash

An “initiative” is defined as “a new plan or process to achieve something or solve a problem”, according to the Cambridge dictionary, and a “data science initiative” creates value from data: it may take the form of a series of data insights that guide the company’s strategic decisions, or a data product that directly serves customers, such as a recommendation engine. For Data Scientists, driving an initiative is a common way to create impact inside the organization, and it is usually more complicated than just “writing code to build a machine learning model”.

In this article, I will share my thoughts on the general stages (or components) involved in driving a data science initiative, using one of my previous projects as a case study. I hope this sheds some light for those on the Data Science career path.

In my opinion, driving an initiative usually involves the following stages (or components):

  • Vision is “the act or power of anticipating that which will or may come to be”. It usually requires the initiative lead to paint a vivid picture of “what if”: if the solution were developed, how things would be different, what benefits it would bring, and how it might change the organization. Having a vision is the first step to getting people excited and starting to rally around you.
  • Sponsorship refers to executive or senior management support, which is necessary to ensure that the initiative’s resources (e.g. people, funding) are granted and to help achieve high-level organizational agreement on adoption once the solution is built. This is especially critical for initiatives started in a “bottom-up” fashion, where executive sponsorship is not a given.
  • Alignment means having members, partners, and stakeholders understand, agree to, and commit to their expected engagement. The expected engagement might be for team members to deliver an ML component, for partners to collect external customer feedback, or for stakeholders to sign off on the solution release. Your initiative will have expectations of others, but others may not always be on the same page as you for various reasons, so keep everyone aligned at all times.
  • Execution is the stage where the promised vision gradually turns into reality. Although there is less ambiguity about goals and organizational risks at this stage, it is very important to set tangible milestones, check in with team members to monitor progress, remove blockers, actively communicate progress, and deliver the milestones on time. This usually requires good project management skills, and you want to make sure the initiative is healthy and moving in the right direction.

The four stages in driving an initiative

This four-stage model serves as a simple way of conceptualizing the complexity involved. The stages can also be viewed as components, as each permeates the initiative’s full life cycle: for example, Vision is not only established at the beginning but also requires constant reinforcement; Execution does not necessarily come after everyone is aligned, since one needs to execute on the idea (e.g. a proof-of-concept) even before sponsorship is granted. It is also worth pointing out that different initiative types may shift the focus across stages: “top-down” initiatives might require more effort on “execution”, while “bottom-up” initiatives may need equal (or even more) effort on “vision” and “sponsorship”.

With the four-stage model introduced, let’s take a look at a case study: a “bottom-up” initiative to create a data product inside a large organization. I mainly focus on the structural perspective rather than the technical side, to highlight how this initiative moved through the four stages. For those interested in the technical side, please refer to these patent applications (https://uspto.report/patent/app/20200005215, https://uspto.report/patent/app/20200005412, https://uspto.report/patent/grant/11,270,234) for more details.

Case study: the creation of “Skill Match Index”

It all starts with a vision

In the summer of 2017, I joined the LinkedIn Data Science team focusing on the Learning Solution business. LinkedIn Learning is an online learning platform that helps members to develop and build new skills through e-learning and online classes, and the brand was formerly known as “Lynda.com”. I was very excited about this opportunity, as Lynda.com helped me a lot back in graduate school: I learned many useful tools (e.g. SQL, Python) through high-quality online classes from Lynda.com, and eventually landed my first Data Scientist job. This is all thanks to Lynda.com and its close partnership with the university which offered me access to these online learning resources.

So in my first month, as I onboarded and was amazed by the tremendous amount of data available at LinkedIn (commonly referred to as the “LinkedIn Knowledge Graph”), one question came up: could I leverage this data to help students land their first job smoothly?

My experience tells me that learning something new is difficult, but knowing what to learn is even more challenging, especially for students whose goal is to land a job right after graduation. I remember that back in 2012, only after reading hundreds of job postings and chatting with alumni did I learn that SQL was more important than C for data analytics, and that Python was becoming more popular than Perl in the industry. Could there be a better solution for graduating students? What if we could surface the “skill gap” between students and industry jobs? That would be fantastic.

So, I advocated for the concept and convinced several fellow newly onboarded Data Scientists to work together on this idea as part of the Data Science New Hire project. We developed a simple yet reasonable algorithm to quantify the skill gap, and the presentation was well received within the Data Science organization.
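
The article does not describe the algorithm itself (the patent applications linked earlier cover the real design). Purely as an illustration of what “quantifying a skill gap” could mean, here is a minimal sketch under my own assumptions: each job posting is a set of skill names, demand for a skill is the number of postings that mention it, and the match score is the share of that demand a student already covers. The function name `skill_match` and the sample data are hypothetical and not taken from the original project.

```python
from collections import Counter


def skill_match(student_skills, job_postings):
    """Return (match score in [0, 1], missing skills ordered by demand).

    Hypothetical illustration only: demand for a skill is the number of
    postings that list it, and the score is the covered share of demand.
    """
    # Count each skill once per posting.
    demand = Counter(skill for posting in job_postings for skill in set(posting))
    total = sum(demand.values())
    if total == 0:
        return 1.0, []  # no demand means nothing is missing

    have = set(student_skills)
    covered = sum(count for skill, count in demand.items() if skill in have)
    # Missing skills, most demanded first; ties broken alphabetically.
    gap = sorted(
        (skill for skill in demand if skill not in have),
        key=lambda skill: (-demand[skill], skill),
    )
    return covered / total, gap


# Made-up example data, echoing the SQL/Python vs. C/Perl anecdote above.
postings = [{"SQL", "Python"}, {"SQL", "Excel"}, {"Python", "SQL", "Tableau"}]
score, gap = skill_match({"C", "Perl", "SQL"}, postings)
print(f"match score: {score:.2f}, skills to learn: {gap}")
```

A frequency-weighted score like this is only one of many possible choices; a real system would also need to normalize skill names and weight postings by relevance to the student.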

(A featured picture about LinkedIn Knowledge Graph)

The path seeking sponsorship

The Data Science onboarding project was a fun experience. However, to move the idea forward, we needed sponsorship from the Learning organization, which would grant the resources to start the initiative and explore applications. Our earlier presentation was viewed as an interesting idea with an obvious issue: a student might refer to such an insight only once a year, and with such low-frequency usage it would have little impact on either engagement or monetization metrics. So one key next step was to connect this vague vision to concrete applications that would translate into real business impact, and to get the right sponsorship.

We then brainstormed with many people, and eventually came up with the following three potential applications, each addressing one team’s business target:

  • for the Product team, we could leverage the identified “skill gap” to provide better learning content recommendations for general members, to increase member engagement
  • for the Sales team, we could define the “skill match” as a unique insight to help educational institutions evaluate their students’ career readiness, to increase customer retention
  • for the Marketing team, we could leverage the “skill gap” to identify the right content to incorporate into marketing email campaigns, to increase marketing effectiveness

To seek sponsorship, we took two approaches in parallel:

  • On the indirect route, I assembled teams to participate in two organization hackday events and build prototypes showcasing different applications. Both teams included not only data scientists but also frontend engineers, backend engineers, and designers, so that we could create something that “works” and concretely share the application’s vision. It was pretty successful: one team won the “Best Hack” award and the other won the “Best Design” award. After the hackdays, our idea had gained considerable awareness across the Learning organization.
  • On the direct route, we made three different presentation decks and tried hard to share them with executive leaders and related managers in the respective teams. We got quite a bit of feedback: questions about how the algorithm works, questions about data quality and result reliability, and further clarifications about the business needs. These helped us refine our vision further and customize the applications.

These efforts were not made in vain; eventually, we got sponsorship from the Sales executive. The initiative was now formally established, and we could allocate resources to work on it!

(One figure inside one patent application to illustrate the concept)

Better alignment with others

When executive sponsorship is granted, an initiative is supported from the leadership perspective. However, organizations usually have competing forces, especially when an initiative requires many other teams to engage and contribute. The same applied in our case.

Our initiative aimed to build a data product that provides insights for customers, and it required multiple partner teams to be involved: we worked with the various Engineering teams that create the upstream datasets to identify the proper data interpretation, and we worked with the Insights team to ensure consistent sales messaging. Moreover, we asked to sit in on related client calls alongside our sales partners to collect first-hand information and feedback.

It took quite an effort to get everyone aligned, and I remember biking across the LinkedIn campus on a daily basis to chat with various partner teams. Every time I walked into a conference room, I’d start with an introduction (about myself and the initiative), and then continue with follow-up discussions and meetings. Eventually, we reached the point where other teams were 1) aware of our initiative and 2) willing to provide the needed help.

(Source: https://en.wiktionary.org/wiki/alignment)

Execution toward milestones

This was the time to fire on all cylinders and build the data product! It didn’t take me long to realize that the following two project management skills are quite practical.

[Set milestones] Although our algorithm design was advanced, it was neither realistic nor effective to go directly with the most complicated version. We needed to deliver a reasonable solution with the required reliability to enable the v1 launch, and then continuously iterate on it with further improvements. So, setting proper milestones for v1, v2, and later iterations is of foremost importance. In our case, a simple representation of the milestones is:

  • Milestone 1 [by time A], deliver v1, which provides opt-in customers with insights that are refreshed quarterly
  • Milestone 2 [by time B], scale v1 to enable insights for all customers
  • Milestone 3 [by time C], improve the algorithm into v2 to address more edge cases and roll it out to all customers

[Proper delegation] With a few team members joining forces, it’s also important to ensure that each member’s work is in sync and delivers the expected results. It was impossible for me to know every execution detail, so proper delegation was the key. For our initiative, each team member owned one specific area with a concrete deliverable. For example, one owned the data foundation, with the goal of creating the right result dataset and passing specific quality checks; another owned communication with the sales team, with the deliverable of educating them on methodology details and consolidating customer feedback. Through delegation, each team member had clear ownership, and together we moved toward the milestones.

In April 2018, the insights from this data product were announced at a summit for educational institution customers, and the feedback was predominantly positive: this could be a game-changer for higher education institutions seeking to understand student career readiness and curriculum effectiveness. Subsequently, these insights were incorporated into our product offerings to better serve LinkedIn Learning’s mission. For me, it was a dream come true: finally, I had contributed to the product that helped me in the past.

(Presented the work as keynote speaker at the 2018 LinkedIn Data Science MeetUp)

Learnings and Looking Back

There were many learnings from driving this data science initiative from start to end. At a high level, I found the following three elements most relevant and important:

  • Your strong passion is a key hidden factor in making an initiative successful. Many times there are extra things that need to be done besides those on paper (e.g. thinking about communication strategy, evaluating team dynamics), and passion is the driving force that pushes you forward. Meanwhile, your passion also has a halo effect on the team: others easily feel the energy and are more likely to collaborate if they know that you, the leader, have strong passion for and belief in the initiative.
  • Clear communication is very, very important. An initiative is rarely a solo game, so consistent and clear communication with both the internal team and external people is necessary to keep everyone aware of progress and of what’s happening next. In our case, we compiled monthly newsletters to share initiative updates with all partners, and they were well received.
  • Let go of your ego, and think of others first. It might be a cliche to say that “leaders need to think about how to make others successful”, but it is indeed the case, even for technical leaders driving initiatives. If one does not delegate the right scope and ownership to others and make sure they are rewarded proportionally, they will be less motivated, which eventually hurts the team’s dynamics and progress. In our case, although I created the first version of the algorithm and had thought deeply about the architecture, that did not mean I should be the person taking the spotlight on algorithm improvement. Having everyone involved in the initiative succeed is the real success.

Looking back, the story always sounds smoother than it was, and in reality there were many challenges. For instance, in the “vision” stage, I was uncertain whether my idea was merely inspirational and impractical, and whether anything concrete would ever land. In the “sponsorship” stage, there were multiple times I walked into a meeting with hope but left without it. In the “alignment” stage, I was told at one point that the same idea had already been done by another team (luckily, we found out later that we just needed to revise our name to avoid a conflict). In the “execution” stage, I grew anxious as a milestone deadline approached while there were still unresolved bugs in the system.

Still, I had a lot of fun: meeting lots of great people along the way and bringing value to the organization. Even after I moved to a different business organization a few years later, I still received “Thank you” emails telling me how insights from our data product had won a customer back. Driving an initiative is an essential skill for advancing one’s data science career, and I hope this article is helpful.

— — —

The same content is also published on Medium and can be accessed here. If you enjoyed this article, please help spread the word by liking, sharing, and commenting.

Kutay Aydin

Senior Project Manager | Scrum Master | Data Scientist

1 year ago

Dear Pan. Having recently completed a data science bootcamp, I found your article on driving data science initiatives within an organization, particularly within the project management lifecycle, incredibly insightful. Your breakdown of the stages and the practical insights from your case study provide valuable guidance as I embark on my journey to gain data scientist experience. Your emphasis on passion, communication, and delegation aligns with the skills I aim to develop, and I would greatly appreciate any advice you could offer on how to gain the relevant experience needed to work at companies like Meta or LinkedIn.

You are a very good technical writer, I will have to read more of your work and snag some tips.

Xiang(Stacy) Li

Data Science @ LinkedIn

2 years ago

Thanks for sharing Pan! I found this four-stage model tremendously helpful.

Heidi Kjær

Data Engineer at Tryg

2 years ago

The process of alignment repeatedly (ongoing) is so important. People might have unexpected expectations of the outcome - and the whole project might end up in the grave after a lot of effort has already been put into the project.

Muhamad Ikhwan Fauzan

trying to adapt

2 years ago

People and objects that need big data applied to them
