Artificial Intelligence #8 - insights from simple algorithms like k-NN
Hello all
Welcome to edition #8
As expected, we crossed 10K members
Also, thanks for the insights and engagement on these posts
Before we begin, we are looking for an intern (my company feynlabs).
This will be a paid, online position working with me. It's a difficult topic, so you need to have an interest in maths and AI. This is an internship – we are not looking for an expert. You can be based anywhere in the world.
Last week, we had an interesting discussion in class about instance-based algorithms.
- K-nearest neighbours (k-NN) is an example of an instance-based algorithm (not to be confused with k-means, which is a clustering algorithm)
- k-NN is used for both classification and regression
- The input for k-NN consists of the k closest training examples in the dataset
- For classification, the output is a class membership (the majority class among the k neighbours). For regression, the output is the average of the k nearest neighbours' values
- You can also assign weights so that nearer neighbours contribute more than distant ones (a minimal code sketch follows this list)
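To make the above concrete, here is a minimal sketch of k-NN written from scratch in Python. This is an illustration, not the exact code from our class discussion: classification takes a majority vote of the k closest points, regression averages their target values, and the toy arrays X and y are made up for the example.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=3, mode="classification"):
    # Distance from the query point x to every stored training example
    distances = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(distances)[:k]      # indices of the k closest points
    neighbours = y_train[nearest]
    if mode == "classification":
        # Class membership = majority vote among the k neighbours
        return Counter(neighbours).most_common(1)[0][0]
    # Regression = average of the k neighbours' target values
    return neighbours.mean()

# Toy data: two classes in 2-D (illustrative values only)
X = np.array([[1, 1], [1, 2], [5, 5], [6, 5]])
y = np.array([0, 0, 1, 1])
print(knn_predict(X, y, np.array([1.5, 1.5])))  # -> 0 (nearest points are class 0)
```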
A diagram of k-NN (along with other algorithm families, as below) – source
So far so good
k-NN is a simple algorithm: you look at the nearest neighbours of a query point and derive either a class (for classification) or a predicted value (for regression).
However, k-NN has some interesting characteristics:
- There is no explicit training step. The work of finding and weighing neighbours happens at prediction time, which is a 'training' of sorts
- For the same reason, k-NN is sensitive to the local structure of the data.
- Finally, k-NN is an example of a non-parametric algorithm. This needs some explanation.
- Parametric algorithms have a fixed number of parameters and make stronger assumptions about the underlying data. Linear regression is an example of a parametric algorithm (e.g. the gradient m and offset c in y = mx + c are its parameters)
- In the case of non-parametric algorithms, the number of parameters is not fixed – it grows with the data. Non-parametric algorithms make fewer assumptions about the data. k-NN is a non-parametric algorithm, and decision trees can also be thought of as non-parametric.
- As a side note, the term 'non-parametric' is a bit of a misnomer. It's not that we have no parameters, as the term implies; rather, the number of parameters is not fixed (see the small illustration after this list).
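Here is a small illustration of the distinction, assuming numpy (the data is synthetic and made up for the example): a fitted line compresses any amount of data into two numbers, whereas k-NN's 'model' is the stored training set itself, so it grows with the data.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(0, 10, size=100)
y = 2.0 * X + 1.0 + rng.normal(0, 0.5, size=100)  # noisy line: y = 2x + 1

# Parametric: linear regression keeps a fixed number of parameters,
# however large the dataset gets (here: gradient m and offset c)
m, c = np.polyfit(X, y, deg=1)
print(f"linear regression stores 2 numbers: m={m:.2f}, c={c:.2f}")

# Non-parametric: k-NN has no such fixed summary - it keeps every example,
# so its effective size grows with the data
print(f"k-NN stores all {X.size} training points; double the data, double the model")
```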
So, when would you use k-NN?
One of the interesting uses of k-NN is when you have small datasets which are homogeneous and noise-free. k-NN learns 'from memory', i.e. it predicts directly from previously seen examples.
This simple technique can be used for complex problems where these characteristics exist (small but clean, noise-free data), e.g. in bioinformatics: Prediction of protein subcellular locations using fuzzy k-NN method (a short scikit-learn sketch follows).
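As a sketch of this use case, here is k-NN on a small, clean dataset using scikit-learn. The iris data is my stand-in for the 'small but noise-free' situation, not the data from the paper above (fuzzy k-NN is a more elaborate variant of the same idea). Note that weights="distance" implements the neighbour-weighting mentioned earlier.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)  # 150 examples: small, clean, well separated
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# weights="distance" makes nearer neighbours contribute more than distant ones
clf = KNeighborsClassifier(n_neighbors=5, weights="distance")
clf.fit(X_train, y_train)  # "fitting" just stores the data - no explicit training
print(f"test accuracy: {clf.score(X_test, y_test):.2f}")
```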
PS: this is another good link I found while researching this: https://machinelearningmastery.com/types-of-classification-in-machine-learning/
What this shows is:
a) Studying even seemingly simple algorithms can help us gain important insights and find potential applications (e.g. in small-data situations where the data is clean)
b) Such algorithms can be used to explain more complex concepts, such as parametric vs non-parametric models
One final point: as in many real-life cases, we do not apply individual algorithms in isolation – rather, we apply algorithms in an ensemble. That's true here as well (a brief sketch below).
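For instance, here is a minimal sketch of one way to do this with scikit-learn. VotingClassifier is just one of several ensembling approaches, and the choice of partner models here is purely for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

# k-NN votes alongside two other models instead of predicting on its own
ensemble = VotingClassifier([
    ("knn", KNeighborsClassifier(n_neighbors=5)),
    ("tree", DecisionTreeClassifier(max_depth=3)),
    ("logreg", LogisticRegression(max_iter=1000)),
])
print(f"ensemble accuracy: {cross_val_score(ensemble, X, y, cv=5).mean():.2f}")
```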
Some interesting jobs this week:
DevOps engineers at CARIAD (VW) – CARIAD is the automotive software company within the Volkswagen Group that bundles and further expands the Group's software competencies to transform automotive mobility https://www.dhirubhai.net/posts/jan-zawadzki_devopsmlops-engineer-activity-6810460697906966528-Iqd9 via Jan Zawadzki. We are happy to work with Jan and his team at VW at Oxford, so this comes very much recommended.
University of Plymouth – fully funded doctoral scholarship via Edward Meinert https://www.dhirubhai.net/posts/activity-6809362405932462080-dXVK
Via Thanos Mourikis - Senior Expert I, Oncology Data Science at Novartis Institutes for BioMedical Research (NIBR) https://www.dhirubhai.net/posts/activity-6809122372264710144-30Cv
Senior front-end/full-stack developers via Dr Fabio Ricciato. I worked with Fabio in his previous job and very much recommend his work. https://www.dhirubhai.net/posts/fabioricciato_hiring-digital-customerexperience-activity-6808491556656422912-CqeD
Finally, this week I reread one of my favourite books – Joseph Campbell's The Hero with a Thousand Faces – in an attempt to get Daliana Liu to see Star Wars https://www.dhirubhai.net/posts/ajitjaokar_after-working-in-tech-for-7-years-i-have-activity-6809825850603659264-lfYu
I very much recommend following Daliana for her motivation and insights, especially for people who want to be a part of the AI industry. She is a great mentor to all and is always ready to share her insights in her own special way.
In this edition: Daliana Liu, Fabio Ricciato, Jan Zawadzki, Edward Meinert, Thanos Mourikis