Saving and Loading Trained ML Models
Carlos Salas, CFA, CQF
Portfolio Manager | Investment Research Consultant | Lecturer Data Science & ML
Training machine learning models can be a tedious task. Unless we need to train our model in real time (for high-frequency trading or strategies with a short-term investment horizon, for example), the most productive approach is to train the model once, save it, and load it later when needed, for tasks such as comparing it against other models or simply predicting new results from newly released predictor data, e.g. earnings data.
Moreover, we may want to reuse the same model's results in another project or strategy. Hence, saving a trained model and loading it at our convenience delivers significant time savings in our production pipeline. These "saving" and "loading" processes are also known as serialization and deserialization.
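As a minimal sketch of this save-once, load-later cycle, the snippet below uses Python's standard-library pickle module. The return figures are made up for illustration, and `statistics.NormalDist` only stands in for a trained model (fitting it to samples plays the role of training); a real estimator would be saved and loaded the same way.

```python
import pickle
import tempfile
from pathlib import Path
from statistics import NormalDist

# Hypothetical daily returns; fitting a normal distribution stands in for training.
returns = [0.012, -0.004, 0.007, -0.011, 0.003, 0.009, -0.002]
trained_model = NormalDist.from_samples(returns)

# Illustrative file location inside a fresh temporary directory.
model_path = Path(tempfile.mkdtemp()) / "model.pkl"

# Serialization: write the in-memory object to disk as bytes.
with open(model_path, "wb") as f:
    pickle.dump(trained_model, f)

# Deserialization: rebuild the object from the file, with no retraining.
with open(model_path, "rb") as f:
    restored_model = pickle.load(f)

# The restored object compares equal to the original and is ready to use.
same_model = restored_model == trained_model
prob_loss = restored_model.cdf(0.0)  # estimated probability of a negative return
```

Note that `pickle.load` can execute arbitrary code contained in the file, so it should only be used on model files from a trusted source.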
The good news is that Python offers multiple approaches to serializing and deserializing our ML models: the pickle module, the joblib module, and proprietary development. The next table provides a quick comparison, showing the apparent trade-offs between flexibility and complexity when comparing pickle and joblib against a more in-house/proprietary approach.
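To give a feel for the in-house end of that trade-off, here is a rough sketch that stores only a model's fitted parameters as JSON. Everything in it is an assumption made for illustration: the linear model, the parameter names, the `format_version` field and the helper functions `save_params`, `load_params` and `predict` are hypothetical, not part of any library.

```python
import json
import tempfile
from pathlib import Path


def predict(params, features):
    """Linear prediction: intercept plus the weighted sum of the named features."""
    return params["intercept"] + sum(
        params["coefficients"][name] * value for name, value in features.items()
    )


def save_params(params, path):
    """Write the fitted parameters to a human-readable JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(params, f, indent=2)


def load_params(path):
    """Read the parameters back and reject file versions we do not understand."""
    with open(path, "r", encoding="utf-8") as f:
        params = json.load(f)
    if params.get("format_version") != 1:
        raise ValueError("unsupported model file version")
    return params


# Hypothetical output of a training run.
fitted_params = {
    "format_version": 1,
    "intercept": 0.5,
    "coefficients": {"earnings_surprise": 2.0, "momentum": -1.0},
}

params_path = Path(tempfile.mkdtemp()) / "model.json"
save_params(fitted_params, params_path)
loaded_params = load_params(params_path)

# Score newly released predictor data with the reloaded parameters.
new_data = {"earnings_surprise": 1.5, "momentum": 0.25}
forecast = predict(loaded_params, new_data)
```

The extra code buys a file that is readable, portable and independent of Python object internals, at the cost of writing and maintaining the format ourselves. At the other end, joblib exposes a `joblib.dump(model, path)` and `joblib.load(path)` pair that is used much like pickle's `dump` and `load`.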
For more details and Python code examples showing how to implement these serialization and deserialization approaches, you can read the full article: