Course: Google Cloud Professional Machine Learning Engineer Cert Prep
Extracting features from public datasets
A very common way to build machine learning systems is to use public datasets. Let's talk through a few of the common public datasets that are available. A very popular emerging source of public datasets is Hugging Face Datasets, and you can use it to fine-tune a model. So let's say you get a pre-trained model from Hugging Face and you use an environment that has a GPU enabled, like GitHub Codespaces or an Amazon SageMaker environment. You can then take that Hugging Face dataset, fine-tune the pre-trained model on the new data that's available, and then create a new model and put it either into production or back onto Hugging Face. Likewise, with Amazon S3, it's a very common scenario to have a big public dataset. You can pull that dataset into, let's say, a Jupyter Notebook, do exploratory data analysis on it, figure out what it is you're trying to build, and then create a model based on that S3 dataset. Another common public…
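As a rough illustration of the exploratory data analysis step mentioned for the S3 scenario, here is a minimal sketch in Python using only the standard library. It assumes the dataset has already been pulled into the notebook as CSV text. The `SAMPLE_CSV` contents, the column names, and the `summarize` helper are all hypothetical and invented for this example; they do not come from any real public dataset.

```python
import csv
import io
import statistics

# Hypothetical stand-in for a CSV file pulled from a public S3 bucket.
# In a real notebook the text would be downloaded first (for example with
# boto3 or pandas); the columns and values here are invented for illustration.
SAMPLE_CSV = """trip_id,distance_km,fare_usd,payment_type
1,2.5,9.00,card
2,10.0,31.50,cash
3,,12.25,card
4,4.0,14.00,card
"""


def summarize(csv_text):
    """Return a per-column summary: missing count plus numeric or categorical stats."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    summary = {}
    for column in rows[0].keys():
        values = [row[column] for row in rows]
        present = [v for v in values if v != ""]  # empty string means missing
        info = {"missing": len(values) - len(present)}
        try:
            # Treat the column as numeric if every present value parses as a float.
            numbers = [float(v) for v in present]
            info["mean"] = statistics.mean(numbers)
            info["min"] = min(numbers)
            info["max"] = max(numbers)
        except ValueError:
            # Otherwise treat it as categorical and list the distinct values.
            info["distinct"] = sorted(set(present))
        summary[column] = info
    return summary


summary = summarize(SAMPLE_CSV)
for column, info in summary.items():
    print(column, info)
```

A summary like this is one way to spot missing values and decide which columns could become features before any model is trained. On a dataset of realistic size, a library such as pandas would likely be a better fit than hand-rolled loops.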