The Truth Behind Healthcare ML Project Life Cycle

Everything you wanted to know but never asked is here :)

While the healthcare ML field is booming these days, there are not many well-prepared managers who can run hard ML projects in this field. But as you know, everything is hard only until you break it down into its smallest pieces and see that there is no magic behind it, just hard work and the passion to make your project happen.

Today we will talk about the ML Project Life Cycle in general and how it differs when applied in the healthcare field. Starting from the ground up, it needs to be said that all ML projects share a fairly standardized Project Life Cycle (PLC), which consists of the following stages:

  1. Planning and Project Setup Stage. Here you will think about the project as your guiding star for the next couple of months, years, or even decades, and answer a few basic questions like "why should we spend our time on it?", "what goals do we set for the distant future, and which of them can we achieve in the next year or two?", "who are our stakeholders and why should they care about this type of tool?" and so on. If you are lucky enough to find all the answers and are eager to go further, you will need to think about allocating the resources the project needs (software and hardware, as well as the people who will help you make the project come true).
  2. Data Collection and Labelling the Data. This is when the show starts, and you will have to invest some of your time, network connections, and luck, because data is the fuel for any ML project and you need to work very hard to gather as much good data as you can. Publicly available resources can be good enough for the R&D part; however, going to market with a valuable solution will require more specific and relevant data in high volume. Think about how much raw real-life data you will be able to gather in the next year, who will help you label it with the highest level of precision, and where you will find additional data quickly if it is needed. A good network and the ability to find partners can play a big role at this stage.
  3. Training and Debugging. Finally, you feel like you are "changing this world with ML technology", because you get into real coding mode. At this stage, you will need to find the baseline for your project, learn about the state of the art in your field and try to reproduce the published results, debug your model, and get into the never-ending process of improving it.
  4. Deployment and Testing. Hurray, the birds are singing for you and you can truly say that ML luck was on your side, because you have reached the stage where you can make your project happen in the real world. And while it will need continuous improvement, testing to prevent regressions, and rethinking of some concepts, this stage reflects the enormous work done by the team and great progress on the go-to-market strategy.
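The "find the baseline" step from stage 3 can be made concrete with a very small sketch. The simplest possible baseline is to always predict the most common label; any real model has to beat it to be worth deploying. The label names and counts below are hypothetical and serve only as an illustration.

```python
from collections import Counter


def majority_baseline_accuracy(train_labels, test_labels):
    """Return the most common training label and the accuracy
    obtained by predicting that label for every test example."""
    if not train_labels or not test_labels:
        raise ValueError("both label lists must be non-empty")
    majority_label, _ = Counter(train_labels).most_common(1)[0]
    correct = sum(1 for label in test_labels if label == majority_label)
    return majority_label, correct / len(test_labels)


# Hypothetical, made-up labels (not real data).
train = ["healthy"] * 80 + ["disease"] * 20
test = ["healthy"] * 17 + ["disease"] * 3

label, acc = majority_baseline_accuracy(train, test)
print(f"Always predicting '{label}' gives accuracy {acc:.2f}")
```

On imbalanced data such a trivial baseline can look surprisingly strong, which is one reason to agree on it before comparing models. If you already use scikit-learn, `sklearn.dummy.DummyClassifier` offers the same idea out of the box.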

Now you might think, "this is really easy, why does everyone say that running ML projects is a magic trick?" And here is a bit of bitter truth: in reality, all these stages are irregularly interconnected, and moving from one to another won't look like a waterfall. Rather, it will look like the web of a drunk spider.

The high level of uncertainty in ML projects will affect how you move through the PLC, and you have to be flexible enough not to stop this natural process, but structured enough not to lose control of what's going on in the project. There is no silver bullet that will let you be 100% prepared for the possible ups and downs, but generally speaking, well-prepared and well-labeled data, properly allocated resources, and the ability to accept possible mistakes on the go can double your chances of success.

Now let's come back to healthcare ML projects. What's the difference? In general, the cycle is much the same, with some additional points to consider:

  1. Planning and Project Setup Stage. Here you will need to be really close to the main stakeholders: doctors and patients. The closer you are, the better the results of your project will be, because this is maybe the only case when Google won't help you solve all your issues. You will be in Sherlock Holmes mode for the next few months to gather all the details of your possible project, define the metrics of your success, and find a way to deliver them if luck is on your side. Double-checking your ideas and consulting additional advisors from a variety of related fields will be your North Star at this stage, so pay very close attention to them.
  2. Data Collection and Labelling the Data. It's no secret that personal healthcare data is what patients and hospitals are least willing to share with anybody, even with someone who wants to help both of them. There are publicly available datasets like the ones from Stanford University, but they are too sporadic and heterogeneous to use for a real-life tool. Your work here will be to persuade clinics or the private doctors on your team to help you find and label images for training and testing purposes. But as you know, doctors are pretty busy people, so be prepared to work for quite a long time before getting the first thousand data samples to train your model. One more note here: even if you are lucky enough to find a top expert doctor to label your images, think twice if that doctor agrees to do it only once. It's better to find someone less experienced who will label the whole dataset than someone who can help you with only 100 samples. Consistency in labeling is more than welcome here.
  3. Training and Debugging. You will spend a lot of time with your team reviewing the latest articles in this space to find the best results achieved and the methods used. Just remember: the statistics reported in articles in most cases have little to do with real-life tools. And while they can be a good guide in developing your model, think twice about the baseline you will use and make it more grounded. Also remember a simple truth: going from 70% to 90% accuracy is simpler than moving from 90% to 91%. So be precise and transparent about the time spent on each stage, so that your team does not burn out in a matter of months because you gave them unrealistic timelines.
  4. Deployment and Testing. Testing, testing, and testing. Even if everything looks perfect, you have to test it one more time. You are building something that can affect quality of life, so if you feel that some pieces are missing and you need to train your model further, do it before going into the real world.
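The labeling consistency mentioned in stage 2 can be checked with numbers rather than impressions. One common way, used here as an illustration and not something prescribed above, is Cohen's kappa: it compares how often two labeling passes agree with how often they would agree by chance. The sketch below is a minimal pure-Python version; the label names and values are hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two label sequences over the same items.

    1.0 means perfect agreement, 0.0 means agreement no better than chance.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label lists must be non-empty and of equal length")
    n = len(labels_a)
    # Fraction of items on which the two passes agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each pass's label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        # Both passes used one identical label everywhere.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical labels from two passes over the same 10 images.
pass_1 = ["tumor", "tumor", "normal", "normal", "tumor",
          "normal", "normal", "normal", "tumor", "normal"]
pass_2 = ["tumor", "tumor", "normal", "normal", "normal",
          "normal", "normal", "normal", "tumor", "tumor"]

kappa = cohen_kappa(pass_1, pass_2)
print(f"Cohen's kappa: {kappa:.2f}")
```

The two passes can come from two different doctors, or from the same doctor labeling a small sample twice a few weeks apart. A low value is a signal to tighten the labeling guidelines before collecting more data. scikit-learn ships an equivalent as `sklearn.metrics.cohen_kappa_score`.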

As you can see, this simplicity hides very difficult work. But you can do it if you are curious enough to beat the problem and if you remember that you are a real hero, because you took on a problem whose solution can save a lot of human lives. I believe in you! :)
