Now let's slightly extend our previous example, where we have an outcome y that's n x 1, and we have a single predictor x that's also an n x 1 vector. And we want to minimize that quantity right there, the sum of the squared differences, sum over i of (y_i minus x_i beta) squared. Now what I'd like you to do is go through the same arguments we used in the last lecture to discover that the beta hat we want is the inner product of y and x over the inner product of x with itself. So the insight is something like this. Imagine n was 3, so our outcome is three dimensional. Then our y is some vector in R3. So this is R3 that I'm plotting right here, and here is our y vector, okay? Our x vector lives, let's say, here, and then the space of x times beta just lives along that line. And so what is beta hat, this quantity right here? It's the multiplier of x that we need so that beta hat times x is the projection of y, right? The minimizer of the distance between y and the collection of vectors x beta, the projection of y onto this red line, okay? So beta hat is the particular multiplier of x that gets us to that point, and beta hat times x is this specific purple vector right there, okay? So what least squares is doing is finding the projection of our point y onto the space of constant multiples of x, which we're labeling x times beta. So this is a fairly famous, interesting old equation. I would also note that in the specific case where x is equal to J_n, just a constant vector of ones, the inner product of y and x is just n y bar, whereas the inner product of x with itself is just n. So beta hat works out to be y bar in that special case. So we haven't changed anything from our previous lecture. If x happens to be a constant vector of ones, we still arrive at the same answer, of course. So commit this formula to memory, because it's a very useful one.
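The formula from this part of the lecture, beta hat equal to the inner product of y and x over the inner product of x with itself, is easy to check numerically. Here is a minimal sketch in NumPy; the data values are made up for illustration and are not from the lecture:

```python
import numpy as np

# Hypothetical small dataset (illustrative only)
x = np.array([1.0, 2.0, 3.0])
y = np.array([2.1, 3.9, 6.2])

# Regression through the origin: beta_hat = <y, x> / <x, x>
beta_hat = np.dot(y, x) / np.dot(x, x)

# Special case from the lecture: when x is J_n, a vector of ones,
# <y, x> = n * ybar and <x, x> = n, so beta_hat reduces to the mean of y.
ones = np.ones_like(y)
beta_ones = np.dot(y, ones) / np.dot(ones, ones)

print(beta_hat)
print(np.isclose(beta_ones, y.mean()))
```

The second computation confirms the special case discussed above: replacing x by a vector of ones recovers the sample mean of y.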
I would also say there's this startling fact, which is that you can develop all of multivariable regression by considering only this one example, regression through the origin. Before we finish this lecture, let's just draw a picture of what we're doing in more familiar data space. So remember, this picture right here was the outcome y plotted in three-dimensional space when we happened to have a three-dimensional vector. But imagine if instead we plotted our data something like this, where for every (x, y) pair we plotted the y value versus the x value, okay? Then we would get a bunch of points like this, let's say, if they're nice and well behaved. What is regression through the origin doing in this sense, if we're thinking about this plot? Okay, so we have a series of potential lines, right? Because the equation y equal to beta x is a line through the origin. So we have a series of potential lines, and what our least squares equation says is: take the vertical distances between the data points and the line, that is, the distance between the y_i points and x_i times beta. Square those vertical distances and add them up, and find the line that has the smallest sum of squared vertical distances. Okay, so if we picked this line right here, we'd do a lot better than if we picked this line right there. Okay?
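The data-space picture described above can also be sketched numerically: among all lines through the origin, the least squares slope gives the smallest sum of squared vertical distances. Here is a small check, again with made-up data rather than anything from the lecture:

```python
import numpy as np

# Hypothetical data points (illustrative only)
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([1.2, 2.3, 2.8, 4.1])

def sse(beta):
    """Sum of squared vertical distances between y_i and beta * x_i."""
    return np.sum((y - beta * x) ** 2)

# Least squares slope for regression through the origin
beta_hat = np.dot(y, x) / np.dot(x, x)

# beta_hat should do at least as well as any other candidate slope
candidates = np.linspace(-2.0, 4.0, 301)
assert all(sse(beta_hat) <= sse(b) for b in candidates)

print(beta_hat)
```

The assertion just spot-checks the minimization over a grid of candidate slopes; the exact argument is the projection picture given earlier in the lecture.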