If you look at Internet traffic patterns (I'm talking about the wide-area Internet here), there are multiple independent data flows between the entities on the Internet. So for instance, if you're doing a file transfer, you're using BitTorrent or something to download files and things like that. That's a data flow, and several people could be doing several different things: one could be watching a movie, another could be downloading a file, another could be using Gmail. All of these things can be going on simultaneously. It is also the case that there are well-behaved flows. TCP is an example of that. What I mean by well-behaved is a characteristic of the transport protocol: it has self-regulation, monitoring the state of the network to see if it is congested, deciding whether it should back off, and so on. There are not-so-well-behaved flows as well. An example is UDP, where you're just dumping packets into the network without regard to what's going on. Both of those types of customers are injecting flows into the network. Of course, I'm not even mentioning malicious traffic, which is probably the dominant traffic today, and which is sometimes indistinguishable at the packet level until it hits the host and is detected at higher levels of the protocol stack. Let's not worry about that. We'll just talk about the good flows, that is, TCP and UDP flows. So if you look at these data flows, there are packet-level issues that you have to contend with on the Internet. The first is out-of-order delivery. That is, you take a message, and the message is broken up into packets and sent over the Internet. These packets, because they may travel through several different intermediate routes to get to the destination, can arrive out of order. That's something that the protocols have to worry about; a small sketch of how a receiver might do this reordering follows below. The other thing that protocols have to worry about is packet loss. Mostly this packet loss happens because packets get dropped due to congestion. That is, a packet hits a router and the router says, "I don't have the capacity, my buffer is too full to hold yet another packet," so it drops it. Another thing that can happen is packet corruption, where bits in a packet get flipped. That can happen for environmental reasons, but it turns out to be very rare in practice, affecting less than 0.1 percent of traffic. So the most important thing is packet loss. Out-of-order delivery is simply in the nature of the network, and the protocols can handle it, no problem. But packet loss is important to consider because it has a snowballing effect: when there is a packet loss at one level, it continues to snowball through all the levels. So that is something that can be catastrophic. So, network congestion. The term network congestion is used to indicate that the offered load on the network exceeds the network capacity. There is a very simple road traffic analogy that we're all used to: you're putting cars on the highway from surface streets, and cars are getting off the highway onto the surface streets. So long as the rate at which cars are getting onto the highway doesn't severely exceed the rate at which they're getting off, you can sustain a good traffic flow. So there is a capacity associated with the highway, and you want to make sure that you don't exceed that capacity. The same problem occurs in network traffic as well, and the problem is common to both the Wide Area Network and the Cloud.
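To make the reordering idea concrete, here is a minimal sketch in Python of how a receiver might buffer packets by sequence number and deliver them in order. The class and packet representation are hypothetical, purely for illustration; real transport protocols do this inside the protocol stack.

```python
# Minimal sketch (hypothetical names): reassembling out-of-order packets.
# A receiver buffers packets by sequence number and delivers them in order.

class ReorderBuffer:
    def __init__(self):
        self.next_seq = 0     # next sequence number we expect to deliver
        self.pending = {}     # out-of-order packets, keyed by sequence number

    def receive(self, seq, payload):
        """Accept a packet that may arrive out of order; return in-order data."""
        self.pending[seq] = payload
        delivered = []
        # Drain the buffer as long as the next expected packet is present.
        while self.next_seq in self.pending:
            delivered.append(self.pending.pop(self.next_seq))
            self.next_seq += 1
        return delivered

buf = ReorderBuffer()
print(buf.receive(1, "B"))   # [] -- packet 0 hasn't arrived yet
print(buf.receive(0, "A"))   # ['A', 'B'] -- now both can be delivered in order
print(buf.receive(2, "C"))   # ['C']
```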
The reason is quite simple: the Cloud network fabric is built from the same commodity switching elements that are used in the Wide Area Network, and therefore this problem of traffic congestion is common to both. Of course, TCP is the dominant transport protocol in both of these environments, the Cloud environment as well as the Wide Area environment. What happens with TCP is that it's a well-behaved protocol, and what each sender does is dynamically size its window to decide when to inject a new packet into the network. It locally observes the congestion to make sure that this dynamic window sizing doesn't lead to congestion. So that's the general philosophy. Generally, the strategies that are used for congestion control are the following. First of all, there's the principle of conservation of packets. What that is saying, very simply, is: don't put a new packet into the network until an existing packet exits the network. In other words, you want a stable set of data packets in transit at any point of time, and that's what is called the equilibrium state of the network. This is similar to what happens today on our highways: there are metered on-ramps, and the metering rate may dynamically change commensurate with the observed traffic congestion. That's the same thing that needs to happen in the Wide Area Network. But there is a problem with conservation of packets, and it comes about because a sender may start injecting a new packet before an old one leaves. Even if all the flows are well-behaved flows, they're monitoring only the local state; they don't know the global state. So they may start injecting packets before old packets leave, and because of resource constraints along the path, you may never reach this equilibrium state. So these are issues with congestion control, and several congestion avoidance schemes have been proposed; I'll give you some flavors of them. The first is what is called the slow-start congestion avoidance scheme. The sender starts with a small send window. For example, let's say that the send window is exactly one packet. When the sender receives an ACK, that indicates the packet went through safely, so it increases the send window. This is also sometimes referred to as the additive increase property. The problem is that, because we are dealing with local state only, the send window can potentially increase exponentially despite the slow start, and that's something that can happen. The other thing that has been proposed is something called conservation at equilibrium. What you're doing here is saying: let me figure out what the round-trip time is for the packets that I've sent. If I have a good estimate of the round-trip time, then I can set a retransmission timer. You need that in order to make sure that you resend packets, because a packet may have gotten lost. If you don't receive an ACK in a timely manner, you have to assume that the packet has been lost, so it can be retransmitted. So if you have a good estimate of the round-trip time, you can set a retransmission timer, and when the retransmission timer goes off, the protocol stack decides the packet must have gotten lost and retransmits it.
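To illustrate the round-trip time estimation idea, here is a small sketch assuming the classic smoothed-RTT scheme that TCP standardized in RFC 6298, where the retransmission timeout is the smoothed RTT plus four times the RTT variance. The class and variable names are my own, for illustration only.

```python
# Sketch of RTT estimation and retransmission-timeout (RTO) calculation,
# following the smoothed-RTT scheme used by TCP (RFC 6298). Names are illustrative.

class RttEstimator:
    ALPHA = 1 / 8   # weight for the smoothed-RTT update
    BETA = 1 / 4    # weight for the RTT-variance update

    def __init__(self, first_sample):
        self.srtt = first_sample         # smoothed round-trip time
        self.rttvar = first_sample / 2   # round-trip time variance

    def update(self, sample):
        """Fold a new RTT measurement into the running estimates."""
        self.rttvar = (1 - self.BETA) * self.rttvar + self.BETA * abs(self.srtt - sample)
        self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * sample

    @property
    def rto(self):
        """Retransmission timeout: if no ACK arrives within this time, resend."""
        return self.srtt + 4 * self.rttvar

est = RttEstimator(first_sample=0.100)   # seconds
for s in (0.110, 0.095, 0.250):          # a congestion spike inflates the RTO
    est.update(s)
print(f"SRTT={est.srtt:.3f}s  RTO={est.rto:.3f}s")
```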
So when you retransmit, that's an indication that most likely the retransmission is happening because of congestion, because as I said earlier, packet losses usually happen because of congestion: routers are dropping packets. So what you're going to do when you have a retransmission is halve the window size. This is also sometimes referred to as exponential backoff, or multiplicative decrease of the window size. So this is another tactic that is used, and if you combine the two together, the scheme for congestion avoidance is a combination of slow start and conservation at equilibrium. Slow start says that you start with a small window size, and upon an ACK, you increment the window size. Every once in a while, if your retransmission timer goes off and you have to retransmit a packet, then you say, well, there's congestion observed, I'm going to halve my window size. So this is the way you can rapidly choke off congestion that may have occurred. Slowly you're going to increase the window size, but when you observe congestion, you do something drastic, and that's why you reduce it by half; there's a small sketch of this combined window logic below. So this is the congestion avoidance scheme that has been suggested. It is difficult to use in the context of the Wide Area Network; it is used, but it is difficult to get very good properties out of it, because there is so much variance when you think about the Wide Area Network. You don't control the entire network, the switches all run at different speeds, and all of these things make it very difficult to have a very robust congestion avoidance scheme.
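To tie the two tactics together, here is a toy sketch of the additive-increase/multiplicative-decrease window logic just described: grow the window by one packet on each ACK, halve it when the retransmission timer fires. The event sequence and names are made up for illustration; a real TCP implementation tracks considerably more state.

```python
# Toy sketch of the combined scheme: additive increase on each ACK,
# multiplicative decrease (halving) when the retransmission timer fires.

class CongestionWindow:
    def __init__(self):
        self.cwnd = 1.0        # start small: one packet (slow start)

    def on_ack(self):
        self.cwnd += 1.0       # an ACK means a packet got through: grow gently

    def on_timeout(self):
        self.cwnd = max(1.0, self.cwnd / 2)   # presumed congestion: halve, drastically

win = CongestionWindow()
for event in ["ack", "ack", "ack", "ack", "timeout", "ack", "ack"]:
    win.on_ack() if event == "ack" else win.on_timeout()
    print(f"{event:7s} -> cwnd = {win.cwnd:.1f}")
```

The asymmetry is the point: growth is gentle, but the reaction to presumed congestion is drastic, which is what lets the scheme choke off congestion quickly.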