What are the top challenges around working with machine learning algorithms?

This article was an early beta test.

The goal of machine learning is to identify relationships and patterns in data and then apply that understanding to new data. This is done by building models, which can then be used to make predictions about new data points. Working with machine learning algorithms involves several challenges, however. Here are some of the most common issues that people face.

1. Data preprocessing: Machine learning algorithms require data in a specific format before they can learn from it. If the data is not already in that format, preprocessing it can be time-consuming and difficult. Large volumes of data can also strain an algorithm, making training slow or impractical.

2. Choosing the right algorithm: There are many machine learning algorithms, each with its own strengths and weaknesses, so it can be difficult to know which one fits your needs. At times, the algorithm you hope to use can also be extremely costly.

3. Tuning parameters: The performance of an algorithm depends on the values of its parameters, so outcomes may vary with the settings you choose. It can be challenging to know which values to use, and time-consuming to test many combinations.

“Simplifying procedures, such as dimensionality reduction and feature engineering, will allow you to get better results under the right conditions [...] Creating a very complex, intricate and detailed model can make that model fragile and unwieldy, and end up not generalizing well to unseen data.”

Josep Ruiz has been a data scientist at Nasdaq for the past two years. He holds a master of engineering degree from Johns Hopkins University.

4. Evaluating the algorithm: It is important to know how well the algorithm is performing once it is running. This can be difficult when there are many evaluation metrics and you are unsure which to choose. In addition, if the dataset is not representative of the real world, the evaluation results can be misleading and the algorithm's predictions may not be accurate in practice.

“ML solutions are no different than engineering simulations, FMEA reviews, or other technical studies/analyses -- it's not enough to do the task correctly, you have to ask the right questions to have a shot at getting the right answers, otherwise you are providing the correct answer to the wrong question.”

Bart Kemper has been the principal engineer at Kemper Engineering Services, LLC for the past 16 years.
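The four steps above can be sketched in a few dozen lines of code. The example below is an illustration only, not a recommended workflow: the synthetic dataset, the choice of a k-nearest-neighbors classifier, the candidate values of k, and every function name are assumptions made for this sketch. It uses only the Python standard library. It rescales the features (preprocessing), picks k by cross-validation on the training data (tuning), and then scores the model once on held-out data against a trivial baseline (evaluation).

```python
import math
import random
from statistics import mean, pstdev


def make_data(n=200, seed=0):
    """Synthetic two-class data (an assumption for this sketch).
    Feature 1 is informative; feature 2 is noise on a much larger scale."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        label = rng.randint(0, 1)
        x1 = rng.gauss(3.0 * label, 1.0)
        x2 = rng.gauss(0.0, 1000.0)
        rows.append(([x1, x2], label))
    return rows


def fit_scaler(rows):
    """Preprocessing: learn per-feature mean and standard deviation."""
    cols = list(zip(*(x for x, _ in rows)))
    return [(mean(c), pstdev(c) or 1.0) for c in cols]


def scale(rows, stats):
    """Apply previously learned statistics to put features on a common scale."""
    return [([(v - m) / s for v, (m, s) in zip(x, stats)], y) for x, y in rows]


def knn_predict(train, x, k):
    """Majority vote among the k nearest training points (Euclidean distance)."""
    nearest = sorted(train, key=lambda row: math.dist(row[0], x))[:k]
    votes = sum(y for _, y in nearest)
    return 1 if votes * 2 > k else 0


def accuracy(train, test, k):
    """Fraction of test rows whose label is predicted correctly."""
    hits = sum(knn_predict(train, x, k) == y for x, y in test)
    return hits / len(test)


def cross_val_accuracy(rows, k, folds=5):
    """Tuning: average validation accuracy of k-NN over `folds` splits."""
    scores = []
    for i in range(folds):
        val = rows[i::folds]
        train = [r for j, r in enumerate(rows) if j % folds != i]
        stats = fit_scaler(train)  # refit inside each fold to avoid leakage
        scores.append(accuracy(scale(train, stats), scale(val, stats), k))
    return sum(scores) / len(scores)


data = make_data()
train_rows, test_rows = data[:150], data[150:]

# Tuning: choose k using the training set only (odd values avoid tied votes).
candidates = [1, 3, 5, 9, 15]
cv_scores = {k: cross_val_accuracy(train_rows, k) for k in candidates}
best_k = max(cv_scores, key=cv_scores.get)

# Evaluation: score once on held-out data and compare with simple references.
stats = fit_scaler(train_rows)
test_acc = accuracy(scale(train_rows, stats), scale(test_rows, stats), best_k)
unscaled_acc = accuracy(train_rows, test_rows, best_k)  # same model, no preprocessing
majority = int(sum(y for _, y in train_rows) * 2 >= len(train_rows))
baseline_acc = sum(y == majority for _, y in test_rows) / len(test_rows)

print(f"best k: {best_k}")
print(f"held-out accuracy (scaled):   {test_acc:.2f}")
print(f"held-out accuracy (unscaled): {unscaled_acc:.2f}")
print(f"majority-class baseline:      {baseline_acc:.2f}")
```

Comparing the scaled and unscaled scores shows how much preprocessing can matter for a distance-based method, and the baseline gives the accuracy figure something to be measured against. Accuracy is only one of many possible metrics; which one is appropriate depends on the question being asked.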


How this article was made: An AI generated an initial answer to the question addressed in this article. The response was then fact-checked, corrected, and amended by editor Felicia Hou. Any errors or additions? Please let us know in the comments.

Scott Beach

AURELIUS Operations Advisory | Director Technology - North America | Private Equity | Business Transformation | Value Creation

2 years ago

It all comes down to data. If we want to accelerate the value of machine learning, we have to feed the beast. Without historical data (including video and outcomes/decisions), we must accept that decision quality will be less than desired for an extended period of time. My advice is this: if you have data that can be used to train your algorithms, find it and store it for training even if you do not have an ML program in place today. After all, unless you have a Time Machine, it is extremely difficult to back-generate training datasets.

Nimish Sanghi

Founder & Partner - AI & Data Science at Cloudcraftz | Founder & Board Member at SOAIS | Co-founder & CTO at zipperHQ | Author | Mentor

2 years ago

As someone consulting clients to make use of data/ML in their organizations, one thing I see clients/decision makers struggle with is the way to get started - build data pipelines first or do small pilots to get a confidence before making significant investment. I always try to steer them towards somewhere in the middle. Both ends in extreme do not work. Building data pipeline first can be long and fruitless till you know what data you need and why. Doing pilots with small disjoint datasets has high chances of failure with early wrong conclusions. Neither success nor failure of pilot is indicative of the possible outcome of actual initiative in a larger, practical context. The other thing I see with many customers is to expect a ML solution to progress like a standard software project – “just call all these great AI/ML commercial APIs from big commercial cloud providers and get done with my ML initiative…”.

Michael A. Covington, Ph.D.

Director of Research at FormFree

2 years ago

It's all in the choice of features. There are many different algorithms that will "learn" more or less the same thing from the same data. The important thing is to give good data and ask important questions.

Bart Kemper, P.E.

F.ASME, F.NSPE, F.NAFE, DFE, Int.PE

2 years ago

Why is my name used here. I was not interviewed for this.
