Some common probability concepts for machine learning are the probability distribution, likelihood, Bayes' theorem, maximum likelihood estimation, and Bayesian inference. A probability distribution is a function that describes how likely different values or outcomes are for a random variable. For instance, a normal distribution can model the heights of people, and a binomial distribution can model the number of heads in a series of coin tosses. Likelihood is the probability of observing the data given a model and its parameters; for example, the likelihood of seeing exactly 10 heads in 20 tosses of a fair coin is about 0.176. Bayes' theorem is a formula that computes the probability of a hypothesis given some evidence from the likelihood of the evidence under the hypothesis and the prior probability of the hypothesis: P(H|E) = P(E|H)P(H) / P(E). Maximum likelihood estimation is a method of finding the model parameters that maximize the likelihood of the observed data. Lastly, Bayesian inference is a method of updating the probability distribution over the model parameters as new data arrives, combining the likelihood of that data with prior knowledge.
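
To make these definitions concrete, here is a minimal Python sketch built around the coin-toss example from the text: it computes the binomial likelihood of 10 heads in 20 tosses (about 0.176 for a fair coin), finds the maximum likelihood estimate of the heads probability by a simple grid search (the closed form is k/n), and performs a Bayesian update of a Beta prior over that probability. The Beta(2, 2) prior and the grid resolution are illustrative assumptions, not something specified above.

```python
import math

# Likelihood of observing k heads in n tosses for a coin whose
# heads-probability is p, under a binomial model.
def binomial_likelihood(p: float, k: int, n: int) -> float:
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

k, n = 10, 20
print(f"L(p = 0.5) = {binomial_likelihood(0.5, k, n):.3f}")  # ~0.176, as in the text

# Maximum likelihood estimation: search candidate values of p for the one
# that maximizes the likelihood of the observed data. (The closed-form
# answer is k/n = 0.5; the grid is just an illustrative brute-force search.)
grid = [i / 1000 for i in range(1, 1000)]
p_mle = max(grid, key=lambda p: binomial_likelihood(p, k, n))
print(f"MLE of p: {p_mle:.3f}")  # 0.500

# Bayesian inference: update a prior distribution over p with the data.
# The Beta distribution is conjugate to the binomial, so a Beta(a, b) prior
# yields a Beta(a + k, b + n - k) posterior. The Beta(2, 2) prior here
# (a mild belief that the coin is roughly fair) is an assumed choice.
a, b = 2.0, 2.0
a_post, b_post = a + k, b + n - k
print(f"Posterior mean of p: {a_post / (a_post + b_post):.3f}")  # 0.500
```

With a symmetric prior and balanced data, the posterior mean coincides with the maximum likelihood estimate; with a skewed prior or less data, the two would differ, which is exactly the role prior knowledge plays in Bayesian inference.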