Regional and Online Learnable Fields

Gerald Gibson

Principal Engineer @ Salesforce | Hyperscale, Machine Learning, Patented Inventor

Published: June 7, 2022

Regional and Online Learnable Fields is a data clustering algorithm invented in the early 2000s and presented at the IEEE International Joint Conference on Neural Networks in 2004. It can be thought of as a combination of K-Means, Nearest Neighbors, and Network Graphing. Like DBSCAN, it can find clusters of various shapes.

Here I present my first implementation of this algorithm in Python. In future articles I will present more advanced versions that add support for other features such as categorical data clustering.

This first implementation supports the Scikit-learn API (fit, partial_fit, and predict) and adds some features not included in the original IEEE presentation. These additions are cluster id stability between partial fits, a configurable inclusion distance for adding perceptive fields to the final cluster graph, and pruning of perceptive fields to reduce the complexity of the final model.
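
To make "cluster id stability between partial fits" concrete, here is a minimal sketch of one way it could work. It is not taken from my class or from the IEEE paper: the function name `stabilize_ids`, its arguments, and the rule "a cluster keeps the smallest id its fields had before" are all assumptions made for illustration.

```python
def stabilize_ids(components, previous_ids, next_id):
    """Assign cluster ids to freshly built graphs while reusing old ids.

    Hypothetical helper, for illustration only.
    components   : list of lists of field indices (one list per graph)
    previous_ids : dict mapping field index -> cluster id from the last fit
    next_id      : first id that has never been handed out
    Returns (dict field index -> cluster id, updated next_id).
    """
    new_ids = {}
    used = set()  # ids already claimed by an earlier graph in this pass
    for fields_in_graph in components:
        # Ids these fields carried before, minus ids already claimed.
        inherited = sorted(
            {previous_ids[f] for f in fields_in_graph if f in previous_ids} - used
        )
        if inherited:
            cluster_id = inherited[0]   # keep the oldest (smallest) id
        else:
            cluster_id = next_id        # brand new cluster, or a split-off part
            next_id += 1
        used.add(cluster_id)
        for f in fields_in_graph:
            new_ids[f] = cluster_id
    return new_ids, next_id


# After an earlier fit, fields 0 and 1 formed cluster 0 and field 2 formed cluster 1.
previous = {0: 0, 1: 0, 2: 1}
# A partial fit added fields 3 and 4; field 3 joined the first graph.
ids, next_free = stabilize_ids([[0, 1, 3], [2], [4]], previous, next_id=2)
print(ids, next_free)
```

With this rule, points that were labeled cluster 0 before the partial fit are still labeled cluster 0 afterwards, and only genuinely new graphs receive new ids.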

In the first step, the algorithm iterates over the data points and measures the distance from each point to the existing perceptive fields (the circles in the plots). Based on that distance it either creates a new perceptive field, adjusts the center of an existing field, or adjusts (expands or shrinks) an existing field's radius. In the second step, it builds a graph connecting all perceptive fields that are close enough to overlap. A perceptive field that is too far away to join an existing graph starts a new graph. Each graph represents a cluster, and a cluster indicates a type, category, or class of data point (similar data means similar objects in the population).
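
The two steps can be sketched in a few lines of plain Python. This is a simplified illustration, not the code from my notebook and not the exact update rules from the IEEE paper. The parameter names (`init_radius`, `reach`, `lr_center`, `lr_radius`), their default values, and the overlap test (center distance no larger than the sum of the two radii) are assumptions.

```python
import math


def build_fields(points, init_radius=1.0, reach=1.5, lr_center=0.1, lr_radius=0.05):
    """Step 1 (simplified): create or adapt perceptive fields one point at a time."""
    fields = []  # each field is {"center": [...], "radius": float}
    for p in points:
        # Find the nearest existing field.
        best, best_d = None, math.inf
        for f in fields:
            d = math.dist(p, f["center"])
            if d < best_d:
                best, best_d = f, d
        if best is None or best_d > reach * best["radius"]:
            # Too far from every field: start a new one at this point.
            fields.append({"center": list(p), "radius": init_radius})
        else:
            # Pull the center toward the point, and move the radius toward the
            # observed distance (this expands or shrinks the field).
            best["center"] = [c + lr_center * (x - c) for c, x in zip(best["center"], p)]
            best["radius"] += lr_radius * (best_d - best["radius"])
    return fields


def link_fields(fields):
    """Step 2 (simplified): one cluster label per field via overlapping circles."""
    parent = list(range(len(fields)))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(fields)):
        for j in range(i + 1, len(fields)):
            gap = math.dist(fields[i]["center"], fields[j]["center"])
            if gap <= fields[i]["radius"] + fields[j]["radius"]:
                parent[find(i)] = find(j)  # overlapping fields share a graph

    roots = {}
    return [roots.setdefault(find(i), len(roots)) for i in range(len(fields))]


points = [(0, 0), (0.3, 0), (1.8, 0), (3.6, 0), (10, 10), (11.8, 10)]
fields = build_fields(points)
labels = link_fields(fields)
print(len(fields), labels)  # 5 [0, 0, 0, 1, 1]
```

In this toy data the second point adapts the first field instead of creating a new one, the three fields along the x axis chain together into one graph, and the two fields near (10, 10) form a second graph. The chaining is what lets the method follow elongated or curved clusters.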

Click the link below to see the Jupyter notebook of the code and plots for this implementation.

Datacamp.com Workspace of the Python code for the RegionalOnlineLearnableFields class and plots of sample data.

Wayne Mehl

2 years ago

I will take a look. TBH, I still use intuition for most of my work, and visual tools like Tableau. I'll see if I can apply these methods to my work. thanks!


