Okay. So now we're going to talk about how we abstract this information and put it into an algorithm. The main question is: how do we use a computer to solve this problem? Our main input is video, so we want to understand what a video is. A video is basically a sequence of images: you project a lot of images, about 24 frames per second, and that's what we see as video and that's what a video is to a computer. So a computer stores a video as a sequence of images, and an image is simply an array of numbers that represent the colors we see. Which means that to manipulate an image with a computer, we have to manipulate these numbers. So what can we do with the numbers that represent a video? Well, if you think about the problem we want to solve, there are three things. First, we would like to separate out the background, so we know what is moving. Second, we would like to identify what is a package, what is a person, and what is actually part of the surroundings. Last of all, we want to be able to track these packages and people, right? If they're moving, where they're going in the frame, and whether it's relevant to what we want to observe or identify. So the very first step is separating the background. There are many algorithms out there you can use to separate the background, but in this case one important requirement is that you can do it in real time. A University of Michigan Professor of Electrical Engineering and Computer Science, Laura Balzano, and her colleagues developed an algorithm called GRASTA that can separate fixed objects, meaning the background, from moving objects in real time. That means that if you're filming something, you can instantly detect who's moving and who's not, or what belongs to the landscape and what is just transient. If you're wondering where the name GRASTA comes from, it's an acronym for Grassmannian Robust Adaptive Subspace Tracking Algorithm. Okay. 
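Before going on, the claim that a video is just numbers can be made concrete. Here is a minimal sketch in Python with NumPy of how a computer might store one second of grayscale video; the frame size and the random pixel values are made up for illustration.

```python
import numpy as np

# A hypothetical one-second grayscale clip at 24 frames per second:
# each frame is a 2-D grid of numbers (pixel intensities from 0 to 255).
frames_per_second = 24
height, width = 120, 160
video = np.random.randint(0, 256,
                          size=(frames_per_second, height, width),
                          dtype=np.uint8)

print(video.shape)     # (24, 120, 160): frames x rows x columns
print(video[0, 0, 0])  # one pixel of the first frame -- just a number
```

A color video would simply add one more dimension (three numbers per pixel, for red, green, and blue), but the idea is the same: manipulating the video means manipulating this array of numbers.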
So here is how GRASTA works. You have the original video here; this is just the check-in area of an airport, and the original video includes people moving, people at the counter, and, right now, a couple of packages over here that are not moving. This is what appears in the background: people and objects that are not moving, or barely moving, so the algorithm is not detecting movement in these elements of the video. And in the foreground you have the things that are moving, and you can even see how much they're moving. That's actually very useful, because it's not just a moving-or-not decision: you can tell how much something is moving by how strongly it appears in the foreground video. A nice example for us: you see here a person carrying his luggage, he's clearly moving, and that's what the algorithm is detecting. Now that we have an algorithm that can identify what is moving and what's not in a video, we want to be able to identify, or characterize, those moving elements. So how can we identify objects and people? The answer, in very general terms, is what we call artificial intelligence, and in particular a subset of it called machine learning: the field of study that gives computers the ability to learn without being explicitly programmed. Pretty much what you do is train your computer with a lot of information, and then the computer is able to do tasks based on what it has learned. So to identify and characterize the moving elements, we'll use machine learning. Nowadays there are a lot of databases and a lot of programs that can do this. For example, to tell the difference between dogs and cats, they have loaded millions and millions of pictures and frames of animals. So when you ask Google to identify, for example, a dog, it can run the image against this database of pictures and say, oh, it looks enough like the dogs I have seen before. 
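GRASTA itself is much more sophisticated than anything we can show here (it tracks a low-dimensional subspace of the video robustly and in real time), but the background/foreground split it produces can be sketched with a much cruder stand-in: keep a slowly adapting average background, and call any pixel that differs a lot from it "foreground". Everything in this sketch, including the threshold of 25 and the tiny synthetic scene, is an illustrative assumption, not the actual algorithm.

```python
import numpy as np

def separate_background(frames, alpha=0.05, threshold=25):
    """Crude background subtraction: keep a running-average background
    and call anything far from it foreground. GRASTA does this far more
    robustly, but the outputs are analogous: a background model plus a
    per-frame mask of what is moving."""
    background = frames[0].astype(float)
    masks = []
    for frame in frames:
        diff = np.abs(frame.astype(float) - background)
        masks.append(diff > threshold)      # "is this pixel moving?"
        # slowly fold the current frame into the background model
        background = (1 - alpha) * background + alpha * frame
    return background, masks

# Tiny synthetic example: a static scene with one moving bright "object"
# that travels along the diagonal, one pixel per frame.
frames = [np.zeros((8, 8)) for _ in range(5)]
for t, f in enumerate(frames):
    f[t, t] = 255

background, masks = separate_background(frames)
print(masks[-1][4, 4])  # True: the moving object shows up as foreground
```

Note that the raw difference `diff`, before thresholding, carries exactly the "how much is it moving" information the transcript mentions: larger values appear more strongly in the foreground.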
So that is a dog; that's how they can identify it. Pretty much, you teach them what a dog looks like, or what a cat looks like, or what a car looks like, and they're able to tell one from the other. So: you can separate the background from the moving pieces, and you can identify what is an object, what is a person, what is an animal. The last thing you need to do is track the objects and track the people. As we explained before, videos are images and images are numbers, so to track objects and people you are tracking numbers in a matrix. GRASTA can identify what is moving, as we said before, and the machine learning algorithm can identify and track each object and person as they enter and exit the frame. So the last thing to do is put it all together and use this for the surveillance problem we have. To put it all together, we'll use a flowchart. A flowchart is very useful for representing our algorithm, or workflow, so we can implement it on a computer. In our case, we have gone through the steps: our input is video, we feed the video into the computer, an algorithm does something to that input, and finally our output is a warning, or no warning, depending on whether an object was left alone at the airport. As we explained before, a video is a series of projected frames, and the frames are numbers. So in the computer we'll take each of these frames. The first step is to separate the background from the foreground objects. After separating the background, we make a list of the objects in the frame, right? We need some account of what objects and what people are in the frame. From those objects, I want to know if an object that was in my previous frame is not seen in this frame. In that case there is no warning, right? It just means the object entered and left, and there's nothing suspicious about that. 
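The loop described so far (take each frame, list the objects in it, and compare against the previous frame's list) can be sketched in a few lines. Here `frames_objects` stands in for the per-frame output of the background-separation and machine-learning steps; treating it as a ready-made set of object identifiers is an assumption made to keep the sketch small.

```python
def scan_video(frames_objects):
    """Skeleton of the flowchart's outer loop. frames_objects holds,
    for each frame, the set of object ids detected in that frame
    (i.e. the output of the GRASTA + machine-learning steps)."""
    previous = set()
    departures = []
    for current in frames_objects:
        # an object seen in the previous frame but not in this one
        # just entered and left: no warning needed for it
        departures.append(previous - current)
        previous = current
    return departures

# A bag and a person appear for two frames, then the bag leaves:
frames = [{"bag", "person"}, {"bag", "person"}, {"person"}]
print(scan_video(frames))  # [set(), set(), {'bag'}]
```

The remaining branches of the flowchart, for objects that stay in the frame, are described next.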
But if there's an object from a previous frame that is moving, there's no warning either, because it hasn't been left alone; obviously something is moving it. The third case is an object left in the frame that is not moving, and then you have to check a few things. If the object is accompanied by the person who brought it in, there's still no warning, because it hasn't been left alone. But if not, that's when you have to check: how long has the object been there? Has it been a long time or not? If it's been a long time, then you have to give a warning. Once you give a warning, you check the other objects, right? You go through this for the length of your video, and once you've finished with all the frames in your video, then you're done; your algorithm is done. I mean, in a real-life situation this should never be done, because the video in an airport never ends. Okay. So just to quickly illustrate our flowchart, and this is a still image, so it will be just one frame. Look at all these objects that are not moving, like here and here, and where else? This one. These are all objects that are not moving. The first two are actually luggage that's there for a reason, and this one seems to be part of the landscape; somebody put it there and it's not suspicious. Also, some of the people are not moving, so they don't show up here in the foreground frame. So you have to be able to track all this information; you should be able to track the people. Your algorithm will count how many people are in your frame, whether they're still there, and whether they're carrying anything. Like this person carrying a bag here, and this person carrying his bag here. This person we probably saw carrying a bag, and then, when everybody's moving, you see this person here is actually moving with their bag. 
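The branches just described can be written as a single decision function: one object's pass through the flowchart. The function name, its arguments, and the 30-frame time limit are all hypothetical placeholders for whatever the real system would use.

```python
def check_object(moving, accompanied, frames_still, time_limit=30):
    """One object's pass through the flowchart branches: an object only
    earns a warning if it is not moving, is not accompanied by the
    person who brought it in, and has sat there past the time limit.
    time_limit is an arbitrary placeholder value."""
    if moving:
        return "no warning"    # something moving hasn't been left alone
    if accompanied:
        return "no warning"    # its owner is still with it
    if frames_still > time_limit:
        return "warning"       # unattended and still for too long
    return "keep watching"     # not moving, but not there long enough yet

print(check_object(moving=False, accompanied=False, frames_still=120))
# warning
```

In the full system this check would run once per object, per frame, for as long as the video keeps coming in.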
So your algorithm, for each frame, will check each of these people: first eliminate everything that's not moving, and then check what is moving. If somebody left a piece of luggage like this, the algorithm would probably give a warning, because you notice it's sitting all by itself, but then a human will be able to decide whether that was suspicious behavior or not.