In machine learning, a seed is the initial state of a pseudo-random number generator. If you use the same seed, you will get exactly the same sequence of numbers.
This means that whether you're making a train/test split, generating a NumPy array from some random distribution, or even fitting an ML model, setting the seed will give you the same results time and again (assuming the same code, data, and library versions).
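As a minimal sketch of the point above, the snippet below draws numbers and builds a train/test split from a seeded NumPy generator. The helper names `draw` and `split_indices` are made up for illustration and are not part of any library; only `np.random.default_rng`, `normal`, and `permutation` are real NumPy calls.

```python
import numpy as np


def draw(seed, n=5):
    """Draw n samples from a standard normal using a generator seeded with `seed`."""
    rng = np.random.default_rng(seed)
    return rng.normal(size=n)


def split_indices(n_samples, test_fraction, seed):
    """Hypothetical helper: shuffle row indices with a seeded generator and cut them in two."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_test = int(round(n_samples * test_fraction))
    return order[n_test:], order[:n_test]  # (train indices, test indices)


a = draw(42)
b = draw(42)  # same seed as `a`
c = draw(7)   # different seed

print("same seed, same numbers:", np.array_equal(a, b))
print("different seed, same numbers:", np.array_equal(a, c))

train_a, test_a = split_indices(10, 0.3, seed=42)
train_b, test_b = split_indices(10, 0.3, seed=42)
print("same seed, same split:", np.array_equal(test_a, test_b))
```

Libraries such as scikit-learn expose the same idea through a `random_state` argument, so the pattern carries over even if the exact calls differ.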
- It is used for reproducibility.
- We set the seed in multiple places, but the purpose is always the same: reproducibility.
- Once the train and test data are ready, we repeatedly train and validate the model, tuning hyperparameters until it neither underfits nor overfits. While tuning, keep the seed fixed so that the random steps (such as the data split) stay the same; that way, a change in model performance can be attributed to the hyperparameter we changed and not to a change of seed.
- If you get a very high-accuracy model with one specific seed but not with a different seed, the model is probably not reliable: its score depends on a lucky split or initialization rather than on what it has learned.
- So we use cross-validation to overcome that, by training and testing the model on different subsets of the data.
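The last three points can be sketched roughly as follows. This is an illustration under stated assumptions, not a recommended pipeline: the data is synthetic, the model is a hand-written ridge regression, and `make_data`, `ridge_fit`, and `kfold_mse` are hypothetical helpers written for this example. It first compares two hyperparameter values on identical folds (fixed seed), then repeats the evaluation over several seeds to see how much the score moves.

```python
import numpy as np


def make_data(seed, n=200, d=5):
    """Synthetic regression data (an assumption made only for this example)."""
    rng = np.random.default_rng(seed)
    X = rng.normal(size=(n, d))
    w = rng.normal(size=d)
    y = X @ w + rng.normal(scale=0.5, size=n)
    return X, y


def ridge_fit(X, y, alpha):
    """Closed-form ridge regression; alpha is the hyperparameter being tuned."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ y)


def kfold_mse(X, y, alpha, k, seed):
    """Mean squared error averaged over k folds; the seed controls the shuffling."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(y)), k)
    scores = []
    for i in range(k):
        test = folds[i]
        train = np.concatenate([folds[j] for j in range(k) if j != i])
        w = ridge_fit(X[train], y[train], alpha)
        scores.append(np.mean((X[test] @ w - y[test]) ** 2))
    return float(np.mean(scores))


X, y = make_data(seed=0)

# Same seed for both runs, so both alphas are scored on identical folds
# and any difference comes from the hyperparameter alone.
fixed = {alpha: kfold_mse(X, y, alpha, k=5, seed=42) for alpha in (0.1, 10.0)}
print("same folds, different alpha:", fixed)

# Repeat with several seeds: a large spread would suggest the score
# depends on the particular split rather than on the model.
per_seed = [kfold_mse(X, y, 0.1, k=5, seed=s) for s in range(5)]
print("mean over seeds:", np.mean(per_seed), "std over seeds:", np.std(per_seed))
```

Reporting the mean and spread across folds and seeds, instead of the single best run, is one common way to avoid trusting a score that only one seed produces.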
Senior Data Scientist @ Citi | GenAI | Kaggle Competition Expert | PHD research scholar in Data Science
2y
https://en.wikipedia.org/wiki/Phrases_from_The_Hitchhiker%27s_Guide_to_the_Galaxy#Answer_to_the_Ultimate_Question_of_Life.2C_the_Universe_and_Everything_.2842.29