Text Classification with BERT and .Net
Transformer-based models are currently the state of the art for text classification and other natural-language machine learning tasks. One popular model type is BERT from Google - described in this paper. Google has released the weights of several pre-trained BERT models, which has enabled researchers and practitioners to easily build models that perform very well on a range of text/language tasks. I have read that BERT-based models are now an integral part of Google search.
Training language models is expensive and time-consuming. Training a 'base' BERT model from scratch can take several weeks, with a compute cost of around $500 on the Google cloud. The availability of pre-trained weights is therefore a huge boon for the rest of us: we can "fine-tune" BERT (or other language models) to perform new text/language tasks - often quite easily.
It is still a challenge to tightly integrate such models into the applications that need them. Often the model is deployed as a service, and the consuming application has to make remote calls to access its functionality. This gives rise to concerns around cost, latency, security, integration complexity and reliability.
What if we need to embed this functionality into the application itself? To address this need for .Net applications, I have constructed a notebook that outlines how a BERT model may be defined, trained for text classification and consumed from .Net. The code is pure .Net/F# (no Python needed).
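To make the idea concrete, here is a minimal sketch of what consuming such a classifier from F# could look like. It assumes the fine-tuned model has been exported to ONNX and is scored with the Microsoft.ML.OnnxRuntime package; the notebook itself may take a different route, and the file name, input names and `tokenize` helper below are placeholders.

```fsharp
// In a .Net Interactive notebook or .fsx script:
#r "nuget: Microsoft.ML.OnnxRuntime"

open System
open Microsoft.ML.OnnxRuntime
open Microsoft.ML.OnnxRuntime.Tensors

// Placeholder: a BERT-compatible WordPiece tokenizer that turns raw text into
// token ids (with [CLS]/[SEP] added). Any .Net implementation can be plugged in.
let tokenize (text: string) : int64[] =
    failwith "supply a BERT WordPiece tokenizer"

// Load a fine-tuned BERT classifier previously exported to ONNX
// ("bert-classifier.onnx" is an assumed file name).
let session = new InferenceSession("bert-classifier.onnx")

/// Score one piece of text and return the index of the highest-scoring class.
let classify (text: string) : int =
    let ids  = tokenize text
    let dims = [| 1; ids.Length |]                    // batch of 1, seq_len tokens

    let inputIds = DenseTensor<int64>(Memory<int64>(ids), ReadOnlySpan<int>(dims))
    let ones     = Array.create ids.Length 1L
    let attnMask = DenseTensor<int64>(Memory<int64>(ones), ReadOnlySpan<int>(dims))

    // Input names must match those used when the model was exported;
    // some exports also expect "token_type_ids".
    let inputs =
        [ NamedOnnxValue.CreateFromTensor("input_ids", inputIds)
          NamedOnnxValue.CreateFromTensor("attention_mask", attnMask) ]

    use results = session.Run(inputs)
    let logits  = (Seq.head results).AsTensor<float32>()   // shape [1; num_classes]

    [ 0 .. int logits.Length - 1 ]
    |> List.maxBy (fun i -> logits.GetValue i)
```

Whatever the exact scoring route, the tokenizer vocabulary and maximum sequence length must match those used when the model was fine-tuned.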
The code and notebook output can be viewed here. The code was developed using the new .Net Interactive notebooks now available in VS Code.