Hello, I'm Daniel Egger. Welcome to the probability portion of Data Science Math Skills. I'll start by providing a definition of probability and of a probability distribution. Probability is the degree of belief in the truth or falsity of a statement. Whenever there is uncertainty, the value assigned to a statement is greater than 0 and less than 1, so we can define this as the range of uncertainty. When I am certain that a statement is true, that statement is assigned probability 1, and if I'm certain the statement is false, then it's assigned probability 0. So let's say I'm sitting in my office, which has no window to the outside world, and I don't know whether it's raining or not. I'm going to assign the statement it is raining some probability within this range. Then, when I learn the true state of the weather outside, I'm going to have either certainty that this statement is true, so it gets probability 1, or certainty that the statement is false, so it gets probability 0. Okay. The notation that we use is, we write P of x, meaning the probability of the statement x or the probability of the outcome represented by the statement x. We also use the tilde to indicate the negation of a statement, and whenever we have a statement and its negation, we have a simple binary probability distribution. In other words, any time we have a statement like it is raining, and the negation of that statement, it is not raining, those statements together will form a probability distribution. Exactly one of those statements must be true when we have complete information about the situation. Note also that even before we have complete information about a situation, the probabilities for those two statements must still sum to 1. So for example, if I think there's a 3 out of 4 chance, so a 75% chance, that it's raining outside right now, then I must think that there's a 1 in 4, or 25%, chance that it is not raining outside right now.
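To make the rain example concrete, here is a minimal Python sketch of the rule that a statement and its negation have probabilities that sum to 1. The function name `negation_probability` and the variable names are illustrative assumptions, not notation from the course.

```python
from fractions import Fraction


def negation_probability(p):
    """Return P(~x) given P(x), assuming the two probabilities sum to 1.

    The name is hypothetical; it is used here only for illustration.
    """
    if not 0 <= p <= 1:
        raise ValueError("a probability must lie between 0 and 1")
    return 1 - p


# The example from the lecture: a 3 out of 4 chance that it is raining.
p_rain = Fraction(3, 4)
p_not_rain = negation_probability(p_rain)

print(p_rain, p_not_rain, p_rain + p_not_rain)  # prints: 3/4 1/4 1
```

Exact fractions are used instead of floating-point numbers so that the two probabilities add up to exactly 1.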
All right, the principle that either a statement or its negation must be true is known as the law of the excluded middle, and this illustrates the basic rule of probability distributions, which is that all of the outcomes that make up a distribution must have probabilities that sum to 1. In fact, what defines a probability distribution is that it is a collection of statements, two or more, where those statements are exclusive and exhaustive. Exclusive means that, given complete information, no more than one of the statements can be true. So it should be obvious that when we have the statements it is raining and it is not raining, only one of those statements can be true at a time. In addition, the statements that make up a probability distribution must be exhaustive. And that means that at least one of the statements must be true when we have complete information, okay? So I'm sitting inside, and I don't know whether it's raining or not. I assign a degree of probability that reflects my uncertainty to both the statement it is raining and the statement it is not raining, but when my information is complete, exactly one of those statements must be true. Now, it's often the case that we have more than two statements that form a probability distribution. These statements are still exclusive and exhaustive. In many situations, we have a large number of statements, call it n, and we have no real basis to choose one outcome as more probable than another. In that case, the principle of indifference says that we assign each of the n outcomes the same probability, 1/n. So for example, we might have a deck of 52 cards, in which case n = 52, and I'm wondering what the probability is that I would draw one particular card, let's say the ace of spades, from a well-shuffled deck of 52 cards. There's nothing special about the ace of spades as far as I know. We assume the deck is well-shuffled. So under the principle of indifference, the probability that I would assign would be 1/52. It follows from the principle of indifference that we can calculate many, many probabilities as follows.
The probability of a certain event, an event being a collection of individual outcomes, is defined as the number of outcomes that are in the event divided by the total number of possible outcomes in the universe. So in our deck of cards example, we might say that our event is drawing a queen, and there are four queens in the deck of 52 cards. So, we have four outcomes, queen of hearts, queen of diamonds, queen of spades, queen of clubs, that are within the definition of the event. And we have a total number of outcomes that's equal to the total number of cards, 52, and so our probability of drawing a queen would be 4 in 52, or 1 in 13. Let me give you another example. What is the probability that a six-sided die will come up even? The event is that the die comes up even. So we look for the outcomes that meet the definition of even, and they are two, four and six. There are three of those outcomes, and the universe of possible outcomes contains six outcomes, and so the probability that the die comes up even is three over six, or one half. And this simple concept allows us to solve a very large number of probability problems.
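The counting rule can be sketched in a few lines of Python. This is only an illustration under the assumption that every outcome in the universe is equally likely; the function name `event_probability` and the way the deck is represented are hypothetical choices, not part of the course material.

```python
from fractions import Fraction
from itertools import product


def event_probability(universe, in_event):
    """Count the outcomes in the event and divide by the size of the universe.

    Assumes every outcome is equally likely (the principle of indifference).
    """
    outcomes = list(universe)
    favorable = sum(1 for outcome in outcomes if in_event(outcome))
    return Fraction(favorable, len(outcomes))


# A standard deck: 13 ranks times 4 suits gives 52 (rank, suit) pairs.
RANKS = ["2", "3", "4", "5", "6", "7", "8", "9", "10", "J", "Q", "K", "A"]
SUITS = ["hearts", "diamonds", "spades", "clubs"]
deck = list(product(RANKS, SUITS))

# One outcome out of 52: the ace of spades.
p_ace_of_spades = event_probability(deck, lambda card: card == ("A", "spades"))

# Four outcomes out of 52: any queen.
p_queen = event_probability(deck, lambda card: card[0] == "Q")

# Three outcomes out of six: a six-sided die comes up even.
p_even = event_probability(range(1, 7), lambda face: face % 2 == 0)

print(p_ace_of_spades, p_queen, p_even)  # prints: 1/52 1/13 1/2
```

The same function covers all three examples from the lecture because only the universe and the definition of the event change.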