This lecture is about ROC curves, or Receiver Operating Characteristic curves. These are very commonly used techniques to measure the quality, or goodness, of a prediction algorithm. In binary classification, you're usually predicting one of two categories. So, again, you might be predicting whether someone's alive or dead, or sick or healthy. You might also be predicting whether they will click on an ad or they won't click on that ad. But your predictions often come out to be quantitative. In other words, almost all the modeling algorithms that we have won't just assign you to one class or the other; they might give you, say, a probability of being alive, or a prediction on a scale of 1 to 10 about whether you will click on the ad or not.

The cutoff you choose gives different results. In other words, if we predict that the probability you're going to be alive is 0.7, and we say all the people above 0.5 will be assigned to be alive, then that's one prediction algorithm. Alternatively, if you say only people with a probability above 0.8 will be predicted to be alive, then that will give you a different prediction for that same person. And the different cutoffs will have different properties: they will be better at finding people that are alive, or better at finding people that are dead.

So people use what's called an ROC curve. The idea is that on the x-axis what they usually plot is one minus the specificity, in other words the probability of a false positive, and on the y-axis they plot the sensitivity, or the probability of a true positive. What they're trying to do is make a curve where every single point along the curve corresponds to exactly one cutoff. So for a particular cutoff, you get a certain probability of a false positive, or a certain specificity, and one minus the specificity is the point on the x-axis. At the same time, you get a certain sensitivity, and that's the point on the y-axis. And so you can use these ROC curves to decide whether an algorithm is good or bad by plotting a different point for every single cutoff that you might choose, and then drawing a curve through those points.

So these are examples of what real ROC curves look like for a couple of prediction algorithms. For example, if you look at the NetChop algorithm where one minus the specificity is equal to 0.2, in other words the specificity is quite high, it's 0.8, you can trace that number up to the red curve, and it corresponds to a sensitivity of only about 0.4. So it's very highly specific, but not very sensitive. And similarly, if you move out to the right on the x-axis, you'll get less and less specificity and more and more sensitivity. So this curve tells you a little bit about the tradeoffs.

Now, although the curve tells you a little bit about the tradeoffs, you may actually want to know which of these prediction algorithms is better. One way that people quantify one curve versus the other is they calculate the area under the curve. In other words, they follow, say, the red curve, and they trace the entire area underneath that curve, and that area quantifies how good that particular prediction algorithm is. So the higher the area, the better the predictor is, and there are some standard values that make sense to pay attention to.
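To make the cutoff-sweeping idea concrete, here's a minimal sketch in Python. The toy data and the `roc_points` helper are made up for illustration, not something from the lecture: each cutoff produces one (1 - specificity, sensitivity) point, and the area under the traced points approximates the AUC.

```python
import numpy as np

def roc_points(y_true, scores, cutoffs):
    """Return one (1 - specificity, sensitivity) point per cutoff."""
    points = []
    for c in cutoffs:
        pred = scores >= c                      # call "alive" anything at or above the cutoff
        tp = np.sum(pred & (y_true == 1))       # true positives
        fp = np.sum(pred & (y_true == 0))       # false positives
        sensitivity = tp / np.sum(y_true == 1)  # probability of a true positive (y-axis)
        fpr = fp / np.sum(y_true == 0)          # 1 - specificity (x-axis)
        points.append((fpr, sensitivity))
    return points

# Toy data: predicted probabilities of being alive, plus the true outcomes.
y_true = np.array([1, 1, 0, 1, 0, 0, 1, 0])
scores = np.array([0.9, 0.8, 0.7, 0.6, 0.55, 0.4, 0.3, 0.1])

# Sweep cutoffs from strict to permissive so the curve runs left to right,
# from (0, 0) up toward (1, 1).
cutoffs = np.linspace(1.01, 0.0, 6)
pts = roc_points(y_true, scores, cutoffs)
for c, (fpr, tpr) in zip(cutoffs, pts):
    print(f"cutoff={c:.2f}  1-specificity={fpr:.2f}  sensitivity={tpr:.2f}")

# The AUC is just the area traced under those points (trapezoid rule).
auc = sum((x2 - x1) * (y1 + y2) / 2
          for (x1, y1), (x2, y2) in zip(pts, pts[1:]))
print(f"AUC ~ {auc:.2f}")
```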
So, if the area under the curve, or AUC, is equal to 0.5, that's equivalent to a prediction algorithm that sits on the 45-degree line, and that's equivalent to basically randomly guessing whether you're going to be a true positive or a false positive. So 0.5 is actually quite bad, and anything less than 0.5 is actually worse than random guessing. It's worse than flipping a coin. An AUC of 1 is a perfect classifier; in other words, you get perfect sensitivity and specificity for a certain cutoff of the prediction algorithm. In general, it depends on the field and it depends on the problem you're looking at, but people often think of an AUC above 0.8 as a pretty good AUC for a particular prediction algorithm.

In general, something to pay attention to is how you can just look at an ROC curve and decide whether it's a good ROC curve or a bad ROC curve. Remember, the 45-degree line corresponds to just guessing: the probability of a true positive matches the probability of a false positive everywhere along that line. A perfect classifier starts at one minus the specificity equal to zero, which means perfect specificity, jumps straight up to perfect sensitivity, and then goes straight over. So a curve that goes straight up to that top left corner represents a perfect classifier. The further you are toward the upper left-hand corner of the plot, the better the ROC curve is, and the farther you are down toward the bottom right-hand corner of the plot, the worse it is. Curves along the 45-degree line or below it are considered pretty bad. So you want your curve to go as straight up toward that top left-hand corner, and as far over, as you possibly can. That's how you interpret an ROC curve, which we'll use later when we're picking out particular cutoffs for predictors of binary outcomes that output quantitative scores.
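As a quick sanity check on those benchmark values, here's another small illustrative sketch, again not from the lecture. It computes the AUC through its standard rank interpretation, the fraction of (positive, negative) pairs where the positive example gets the higher score, which for a step-function ROC curve equals the geometric area. It confirms that scores unrelated to the outcome land near 0.5, while a perfectly separating score hits 1.

```python
import numpy as np

def auc(y_true, scores):
    """AUC via its rank interpretation: the fraction of (positive, negative)
    pairs where the positive example scores higher (ties count half)."""
    pos = scores[y_true == 1]
    neg = scores[y_true == 0]
    wins = (pos[:, None] > neg[None, :]).sum()   # positive outranks negative
    ties = (pos[:, None] == neg[None, :]).sum()
    return (wins + 0.5 * ties) / (len(pos) * len(neg))

rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=2000)          # true outcomes, half and half on average

coin_flip = rng.random(2000)               # scores with no relation to the outcome
separated = y + 0.1 * rng.random(2000)     # every positive scores above every negative

print(f"random guessing:   AUC ~ {auc(y, coin_flip):.2f}")   # close to 0.5
print(f"perfect separator: AUC = {auc(y, separated):.2f}")   # exactly 1.00
```

The pairwise comparison is quadratic in the sample size, so it's only a teaching device here, but it makes the benchmarks transparent: random guessing wins about half of the positive-versus-negative comparisons, and a perfect classifier wins all of them.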