Machine Learning Workflow

Machine learning is in the spotlight now, but it has actually been around since the 1950s and 1960s. Over time, a series of steps has emerged that defines the machine-learning workflow. Our first step is to define the question we want the solution to answer. We need to define this question in a way that guides the remaining steps in the correct direction. This requires thinking carefully about the goals we want to achieve, the data we need, and the processing we can perform.

Once the question is defined, we can gather the data we need to answer it. This can be tricky, because often we need data from many sources. Fortunately, Azure has tools that make it easier to process data from different sources. With data, we often have issues with quality and cleanliness. That is, data is often incomplete, inaccurate, or conflicting. So before we use the data in machine learning, we need to clean it. Azure Machine Learning has tools that can aid with data cleaning and transformation. But even with these tools, do not be surprised if you need to spend a considerable amount of time massaging your data into the format you need.

When you have the question defined and the data the way you need it, you can consider which algorithm to use. This is not an easy task, as there are many algorithms available. But if we properly define the question, the question will help us select the proper algorithm. As we will see when we build models, Azure Machine Learning supports a wide variety of algorithms that are optimized to work in the Azure environment. These are grouped according to the type of learning being performed and the type of results we want.

Once we have the algorithm selected, we need to use a subset of the data we have to train the algorithm. This training process results in a trained model that predicts results on similar data, and as with the rest of Azure Machine Learning, setting up the training is a drag-and-drop operation. Once we have a trained model, we need to test its accuracy on new data that was not used to train it. We do this in Azure Machine Learning with built-in modules that provide both graphical and numeric information on the performance of our algorithms.

Evaluating the results generates statistics that we can use to determine whether the model meets our requirements or needs further refinement. If refinement is needed, it is often necessary to rework earlier steps in the workflow. We may need to alter the data or get more of it, change to a different algorithm, adjust parameters, or, often, some combination of all of these.
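To make the steps above a bit more concrete, here is a minimal sketch in plain Python. It is not Azure Machine Learning code (there, the same steps are drag-and-drop modules). The tiny dataset, the function names (clean, split, train, predict, evaluate) and the choice of a nearest-centroid classifier are all assumptions made up for illustration: drop incomplete rows, hold back part of the data, train on the rest, then measure accuracy on the held-back rows.

```python
import random

# Hypothetical toy data, invented for this sketch: (feature_1, feature_2, label).
# None marks a missing value that has to be cleaned out before training.
RAW_ROWS = [
    (1.0, 1.2, "low"), (0.8, 1.0, "low"), (1.2, 0.9, "low"), (0.9, 1.1, "low"),
    (3.0, 3.2, "high"), (3.1, 2.9, "high"), (2.8, 3.0, "high"), (3.2, 3.1, "high"),
    (None, 3.0, "high"), (1.1, None, "low"),
]


def clean(rows):
    """Data cleaning: keep only rows that have no missing values."""
    return [row for row in rows if None not in row]


def split(rows, test_fraction=0.25, seed=0):
    """Shuffle the rows and hold back a fraction of them for testing."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]  # (training rows, test rows)


def train(rows):
    """Training: compute the mean feature vector (centroid) for each label."""
    sums, counts = {}, {}
    for *features, label in rows:
        totals = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            totals[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [t / counts[label] for t in totals]
            for label, totals in sums.items()}


def predict(model, features):
    """Prediction: return the label whose centroid is closest to the features."""
    def squared_distance(centroid):
        return sum((f - c) ** 2 for f, c in zip(features, centroid))
    return min(model, key=lambda label: squared_distance(model[label]))


def evaluate(model, rows):
    """Evaluation: fraction of rows whose predicted label matches the true one."""
    hits = sum(predict(model, row[:-1]) == row[-1] for row in rows)
    return hits / len(rows)


cleaned = clean(RAW_ROWS)
train_rows, test_rows = split(cleaned)
model = train(train_rows)
print(f"kept {len(cleaned)} of {len(RAW_ROWS)} rows after cleaning")
print(f"accuracy on held-out rows: {evaluate(model, test_rows):.2f}")
```

The important design choice is that evaluate only ever sees rows that train did not, which mirrors the rule of testing on data that was not used for training. If the accuracy were not good enough, the refinement loop would mean going back to change clean, the data itself, or the algorithm inside train.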

When using the machine-learning workflow, there are some important things to keep in mind. First, there is an inherent hierarchy in the steps: the earliest steps are the most important, since the later steps depend on them. That is, you need to correctly define the question for which you are creating a solution. Then you have to get the correct data, which will allow you to train your algorithm to come up with a prediction, and only when you have a trained model can you evaluate its accuracy.

When moving through the workflow, it's not unusual to have to return to a previous step. For example, as you work with the data it may become apparent that you are asking the question incorrectly. As for the data itself, what you find will almost never be in the format you need, so expect to spend a considerable amount of time locating and transforming it into a structure that you can use. Also, within reason, more data is usually better. Remember that the mathematical equations needed to model your data may be complex, with strange quirks. The more corner cases you can cover with your data, the better the model will be trained and the more accurate your results will be.

Finally, try not to push a bad solution. It's easy to fall into the trap of thinking that with just a few more tweaks, the stars will align and your model will start performing correctly. If you do find yourself in this situation, it's better to take a step back and ask yourself: do I have the right data, do I need more data, do I need to do more pre-processing, or do I even have enough information to continue?
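The point above that, within reason, more data is usually better can be checked with a small experiment. The sketch below is only an illustration under my own assumptions: the data is synthetic (two noisy clusters drawn with Python's random module), the model is a simple nearest-centroid classifier, and the names make_rows, accuracy and learning_curve are made up for this example. It trains on larger and larger slices of the training data and reports accuracy on the same held-out rows each time.

```python
import random


def make_rows(n, rng):
    """Synthetic two-class data: class 0 centred at (0, 0), class 1 at (2, 2)."""
    rows = []
    for i in range(n):
        label = i % 2  # alternate labels so any slice of 2+ rows has both classes
        centre = 2.0 * label
        rows.append((rng.gauss(centre, 1.0), rng.gauss(centre, 1.0), label))
    return rows


def train(rows):
    """Compute the mean feature vector (centroid) for each label."""
    sums, counts = {}, {}
    for *features, label in rows:
        totals = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            totals[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [t / counts[label] for t in totals]
            for label, totals in sums.items()}


def predict(model, features):
    """Return the label whose centroid is closest to the given features."""
    def squared_distance(centroid):
        return sum((f - c) ** 2 for f, c in zip(features, centroid))
    return min(model, key=lambda label: squared_distance(model[label]))


def accuracy(model, rows):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(predict(model, row[:-1]) == row[-1] for row in rows) / len(rows)


def learning_curve(train_rows, test_rows, sizes):
    """Train on the first n training rows for each n; score on the same test rows."""
    return [(n, accuracy(train(train_rows[:n]), test_rows)) for n in sizes]


rng = random.Random(42)          # fixed seed so the run is repeatable
train_rows = make_rows(256, rng)
test_rows = make_rows(200, rng)  # held-out rows, never used for training

for n, acc in learning_curve(train_rows, test_rows, [2, 8, 32, 128, 256]):
    print(f"{n:>4} training rows -> held-out accuracy {acc:.2f}")
```

On data like this, accuracy with only a couple of training rows depends heavily on which rows you happened to get, and it tends to settle down as more rows are added. It will not necessarily improve at every step, which fits the "within reason" caveat: once the model has seen enough of the data's quirks, extra rows add little.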

See you in my next blog, which will be about Azure and Machine Learning. :-)
