Regional and Online Learnable Fields
Gerald Gibson
Principal Engineer @ Salesforce | Hyperscale, Machine Learning, Patented Inventor
Regional and Online Learnable Fields is a data clustering algorithm invented in the early 2000s and presented at the IEEE International Joint Conference on Neural Networks in 2004. It can be thought of as a combination of K-Means, Nearest Neighbors, and network graphing. Like DBSCAN, it can find clusters of arbitrary shape.
Here I present my first implementation of this algorithm in Python. In future articles I will present more advanced versions that add support for other features such as categorical data clustering.
This first implementation supports the scikit-learn API (fit, partial_fit, and predict) and adds some features not included in the original IEEE presentation linked above: cluster ID stability between partial fits, a configurable inclusion distance for perceptive fields in the final cluster graph, and pruning of perceptive fields to reduce the complexity of the final model.
In the first step, the algorithm iterates over the data points and measures the distance from each point to the existing perceptive fields (the circles in the plots). Depending on that distance, it either creates a new perceptive field, adjusts the center of an existing field, or adjusts (expands or shrinks) an existing field's radius. In the second step, it builds a graph connecting all the perceptive fields that are close enough to overlap. A perceptive field that is too far away to connect to an existing graph starts a new graph. Each graph represents a cluster, and a cluster indicates a type, category, or class of data point (similar data points represent similar objects in the population).
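To make the two steps concrete, here is a minimal sketch in plain Python. It is not the code from the notebook: the class name SimpleROLF, its parameters (init_sigma, rho, lr_center, lr_sigma), and the exact update rules are assumptions chosen for illustration. It also leaves out the extras described earlier (cluster ID stability between partial fits, configurable inclusion distance, and pruning), so treat it as a rough picture of the idea rather than a reference implementation.

```python
import math


class SimpleROLF:
    """Illustrative sketch of perceptive-field clustering (hypothetical names)."""

    def __init__(self, init_sigma=1.0, rho=2.0, lr_center=0.05, lr_sigma=0.05):
        self.init_sigma = init_sigma  # assumed starting width of a new field
        self.rho = rho                # assumed multiplier: radius = rho * sigma
        self.lr_center = lr_center    # assumed learning rate for the center
        self.lr_sigma = lr_sigma      # assumed learning rate for the width
        self.centers, self.sigmas, self.labels = [], [], []

    def _radius(self, i):
        return self.rho * self.sigmas[i]

    def fit(self, X):
        # Start from scratch, then reuse the incremental update.
        self.centers, self.sigmas, self.labels = [], [], []
        return self.partial_fit(X)

    def partial_fit(self, X):
        # Step 1: create or adapt perceptive fields, one point at a time.
        for x in X:
            best, best_d = None, math.inf
            for i, c in enumerate(self.centers):
                d = math.dist(x, c)
                if d <= self._radius(i) and d < best_d:
                    best, best_d = i, d
            if best is None:
                # No field covers the point, so it seeds a new field.
                self.centers.append([float(v) for v in x])
                self.sigmas.append(self.init_sigma)
            else:
                # Pull the center toward the point and move the width
                # toward the observed distance (expands or shrinks the field).
                c = self.centers[best]
                self.centers[best] = [
                    cj + self.lr_center * (xj - cj) for cj, xj in zip(c, x)
                ]
                self.sigmas[best] += self.lr_sigma * (best_d - self.sigmas[best])
        self._build_graph()
        return self

    def _build_graph(self):
        # Step 2: link overlapping fields; each connected component is a cluster.
        n = len(self.centers)
        self.labels = [-1] * n
        cluster = 0
        for start in range(n):
            if self.labels[start] != -1:
                continue
            self.labels[start] = cluster
            stack = [start]
            while stack:
                i = stack.pop()
                for j in range(n):
                    if self.labels[j] != -1:
                        continue
                    gap = math.dist(self.centers[i], self.centers[j])
                    if gap <= self._radius(i) + self._radius(j):
                        self.labels[j] = cluster
                        stack.append(j)
            cluster += 1

    def predict(self, X):
        # A point takes the cluster label of the field with the nearest center.
        out = []
        for x in X:
            i = min(
                range(len(self.centers)),
                key=lambda k: math.dist(x, self.centers[k]),
            )
            out.append(self.labels[i])
        return out


# Two well-separated groups of points end up in two separate graphs.
points = [(0, 0), (0.5, 0), (0, 0.5), (10, 10), (10.5, 10), (10, 10.5)]
model = SimpleROLF().fit(points)
print(model.predict([(0.2, 0.2), (10.2, 10.1)]))
```

Because clusters are connected components of overlapping fields, a chain of fields can trace out an elongated or curved cluster, which is where the DBSCAN-like flexibility comes from. In this sketch the graph is rebuilt from scratch after every partial_fit call, so cluster labels may change between calls.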
Click the link below to see the Jupyter notebook of the code and plots for this implementation.