Professor: Uzma Maroof | Term: Fall 2025

Lecture 1

Question

What is a computer network?

A system that enables computers to exchange information.

The information exchange is governed by protocols.

Question

What is a protocol?

A system of rules that explain the correct conduct and procedures to be followed in formal situations.

Humans follow protocols.

Network protocols are between computers (devices) rather than humans.

A protocol defines:

  • The format and order of messages sent and received among network entities, and
  • Actions taken on message transmission and receipt

It is preferred to transmit data in smaller chunks; otherwise, if any part of one large transfer fails, the whole transfer fails.

Question

How does the data get transmitted?

Sender breaks data into smaller chunks. Over the internet, these chunks are known as packets. “Packetize”

Receiver receives the chunks and reassembles the data.
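The packetize/reassemble steps above can be sketched in a few lines (the chunk size and sequence-numbering scheme are illustrative toys, not any real protocol):

```python
# Toy sketch: packetizing data into fixed-size chunks and reassembling.
# MTU and the sequence-number scheme are illustrative assumptions.

MTU = 4  # bytes per packet payload (tiny, for illustration)

def packetize(data: bytes) -> list[tuple[int, bytes]]:
    """Split data into (sequence_number, chunk) packets."""
    return [(i, data[i * MTU:(i + 1) * MTU])
            for i in range((len(data) + MTU - 1) // MTU)]

def reassemble(packets: list[tuple[int, bytes]]) -> bytes:
    """Reorder packets by sequence number and concatenate payloads."""
    return b"".join(chunk for _, chunk in sorted(packets))

packets = packetize(b"hello, network!")
# Packets may arrive out of order; reassembly still recovers the data.
assert reassemble(list(reversed(packets))) == b"hello, network!"
```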

When you want to connect many computers, things get complicated very fast.

In principle, if there are N computers, we can use a dedicated link between each pair.

But this does not scale well: it requires N(N−1)/2 links, and there could be thousands or millions of computers.

Success

Use a collection of shared network devices and links.

A single-”hop” becomes multi-”hop”

The job of switches/routers is to relay the data.

Multiple pairs of senders and receivers will use the shared devices and links simultaneously.

Ideally, we’d like

  • data transfer between each pair to be done within a reasonable time
  • the network to be utilized well - i.e., not to be overwhelmed with too many simultaneous requests or be left idle for too long

To make that happen, networking people have to solve several challenging problems:

  • How to decide when a sender gets to transmit data?
  • How to pick good paths for getting data from its source to its destination?
  • How to adapt when a switch/router or a link fails?

Solution 1: Reserve before sending.

Say A, B, and C are network devices.

Suppose that if A is connected to B, A can send 4 data units to B every second.

In the diagram, each link has four “slots”.

When A wants to communicate with C:

  • A tells the network how many data units per second it needs to send (1 in this example)
  • A asks the network to reserve resources along a path between A and C accordingly.

To reserve resources, each device along the path records that it needs to reserve 1/4th of the capacity of its link to the next device for the data from A to C.

  • The call gets the 2nd slot in the top link and the 1st slot in the right link.

Once the network confirms the reservation, A can start sending.

Once A is done sending, the network can end the reservation.

Reservations are great. While A is sending data to C, 1/4 of the capacity of the top and right links is dedicated to that data transfer.

Pro: No one else can use it

Con: No one else can use it

Question

Why is this a pro and a con?

No sharing means dedicated resources and circuit-like (guaranteed) performance.

No sharing also means that if A and C are not using the circuit, it sits unused.

Reserve before sending = Circuit Switching

Circuit: The reserved path over the network

Switching: moving data from one switch “port” to another (more on this later)

Circuit switching is commonly used in traditional telephone networks.

There were some circuit-switching-like proposals in the early days of the Internet.

Question

What are those slots?

Frequency Division Multiplexing (FDM)

  • Optical/electromagnetic frequency spectrum divided into narrow frequency bands
    • each call is allocated its own band, can transmit at max rate of that narrow band

Time Division Multiplexing (TDM)

  • Time divided into slots
    • Each call allocated periodic slot(s), can transmit at maximum rate of (wider) frequency band (only) during its time slot(s)
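A toy sketch of TDM slot ownership: calls take turns in a repeating frame of time slots (the schedule below is invented for illustration):

```python
# Toy sketch of time-division multiplexing: three calls share one link
# by taking turns in a repeating frame of four time slots. The slot
# assignments are made up for illustration.

FRAME_SLOTS = 4
# call -> slot index within each frame (slot 3 is left idle here)
schedule = {"call A": 0, "call B": 1, "call C": 2}

def who_transmits(slot_time: int):
    """Return which call owns the link at a given global slot time."""
    slot = slot_time % FRAME_SLOTS
    for call, owned in schedule.items():
        if owned == slot:
            return call
    return None  # idle slot

timeline = [who_transmits(t) for t in range(8)]
print(timeline)
```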

Lecture 2

Reserve before sending can result in underutilization. Send on demand can fix this.

When there are no reservations, we have to decide which sender gets the link.

We can still accommodate both, but one has to wait longer: it has to sit in a queue on a network device, and if the network device runs out of memory, these packets are lost.

Send on demand = Packet switching. No reservation is needed, and allows more senders to send data simultaneously over the network.

As long as the transmission pattern is bursty enough, the probability that all senders transmit at the same time is quite low, so the network should be able to handle the load (statistical multiplexing).

This is the approach used on the internet.

Circuit Switching vs Packet Switching

Circuit switching

  • Network resources divided into pieces
  • Dedicated end-to-end resources
  • Circuit-like guaranteed performance
  • No packet loss and no queues: data that is sent will definitely be delivered
  • Used for things like telephone switchboards

Packet Switching

  • Each packet uses full link capacity
  • Packets from different users share network resources
  • Resource contention: Congestion - packets queue in network devices, and wait for link use

From now on we assume packet-switched network.

Question

How to pick good paths for getting data from its source to its destination?

Many different potential paths to the destination. We want to pick the “best one”.

There are many factors that can make one route better than another: who owns the routers and whether they charge, speed, real-time availability, and the number of shared paths.

Routing protocols determine what path a packet takes. Network devices implement variants of shortest-path algorithms such as Bellman-Ford and Dijkstra (CS341) in a distributed manner.

Through routing protocols, network devices exchange messages about their routes to different destinations. Coordinate on a (good) end-to-end path.

Routing protocols are difficult to design since there are many factors to consider:

  • Message overhead
  • Convergence time
  • Metrics to decide “best” route

A problem with packets taking different routes is that they can arrive out of order.

Question

How do we know what we have designed is a good network?

Network performance metrics

End-to-End delay

  • How long does it take for a piece of data to go from one end of the network to the other?
    • Live call / video should be fast, e-mail can afford to be slower.

Loss

  • How often does the network lose data during transfer?
    • E-mail shouldn’t lose data, but live call / video we can afford some loss.

Throughput

  • How much data per second can the network transfer?

End-to-End Delay

A packet's end-to-end delay is the sum of the delay over each link and within each network device along the path.

d_link: time it takes for a packet to go from one network device to the next.

d_device: time it takes for the packet to go through a network device.

Network delay over a link (remember the Internet is serial: bits are sent one at a time):

d_trans: The time it takes for all the bits in the packet to go from the device onto the link (it depends on the packet's length).

d_prop: The time it takes for one bit to travel from one device to the next. Influenced by the length of the link and the type of link (for fibre optic, close to the speed of light).

d_trans = L/R (transmission delay):

  • L: packet length (bits)
  • R: link transmission rate (bps)

d_prop = d/s (propagation delay):

  • d: length of physical link
  • s: propagation speed
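With the two formulas above, a quick worked example (the link numbers below are made up for illustration):

```python
# Worked example of the two per-link delay components, using made-up
# numbers: a 1,000-byte packet on a 1 Mbps link that is 100 km of fibre.

L = 1_000 * 8   # packet length in bits
R = 1_000_000   # link transmission rate in bits per second
d = 100_000     # link length in metres
s = 2e8         # propagation speed in fibre, roughly 2/3 the speed of light

d_trans = L / R  # time to push all bits onto the link
d_prop = d / s   # time for one bit to cross the link

print(f"d_trans = {d_trans * 1000:.1f} ms")  # 8.0 ms
print(f"d_prop  = {d_prop * 1000:.1f} ms")   # 0.5 ms
```

Note that the two components scale differently: d_trans shrinks with a faster link, while d_prop only shrinks with a shorter link.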

Caravan Analogy #1

  • Car ~ bit
  • Caravan ~ packet
  • Toll service ~ link transmission

A ten-car caravan travels between two toll booths 100 km apart.

The toll booth takes 12 seconds to service a car (bit transmission time).

Cars propagate at 100 km/hr.

Question

How long until the caravan is lined up before the 2nd toll booth?

Time to “push” the entire caravan through the toll booth onto the highway = 12 × 10 = 120 seconds = 2 minutes.

Time for the last car to propagate from the 1st to the 2nd toll booth: 100 km ÷ (100 km/hr) = 1 hour.

A: 62 minutes.

Caravan Analogy #2

Suppose cars now “propagate” at 1000 km/hr.

And suppose toll booth now takes one minute to service a car.

Question

Will cars arrive to 2nd booth before all cars serviced at first booth?

Yes. After 7 minutes, the first car arrives at the second booth; three cars are still at the first booth.

The first bit of a packet can arrive at the 2nd router before the packet is fully transmitted at the first router. (With store-and-forward, the 2nd router still waits for the entire packet before forwarding it.)

Network delay within a node

d_proc: processing delay

Time it takes to process the packet

  • Check bit errors
    • What are some possible causes of bit errors?
  • Determine output link

Typically microseconds.

Question

What experiences more bit flips (errors): wireless or wired?

Wireless. But we still prefer it.

Packets queue up in network device buffers (queueing delay), waiting for their turn for transmission.

Queue length grows when the arrival rate to the link (temporarily) exceeds the output link capacity. When could this happen?

Example

Queue build-up in a network device

Suppose each link can transfer 4 packets per second in each direction.

Suppose two hosts each send 4 packets on their respective incoming links in one second, and all 8 packets are supposed to go out the same outgoing link.

That outgoing link can only send 4 packets out in one second.

The other 4 must wait in the queue another second before they are transmitted.

d_queue: queueing delay

  • Time waiting at output link for transmission
  • Depends on how “congested” that link is

R = link bandwidth (bps)

a = average packet arrival rate (packets/sec)

L = packet length (bits)

L·a is the average bit arrival rate (bps)

Traffic intensity = L·a/R

  • L·a/R ≈ 0: average queueing delay small
  • L·a/R → 1: delays become large
  • L·a/R > 1: more work arriving than can be serviced, average delay infinite
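The three regimes above can be illustrated numerically. This sketch uses the simple I/(1−I) growth estimate for queueing delay; the link and packet numbers are made up:

```python
# Sketch of how average queueing delay explodes as traffic intensity
# I = L*a/R approaches 1, using a simple I/(1-I) growth estimate.
# All numbers are illustrative.

L = 750_000    # average packet length (bits)
R = 1_500_000  # link bandwidth (bps)

def traffic_intensity(a: float) -> float:
    """a = average packet arrival rate (packets/sec)."""
    return L * a / R

for a in (0.2, 1.0, 1.8, 1.9):
    I = traffic_intensity(a)
    if I < 1:
        delay = I / (1 - I) * (L / R)  # queueing delay estimate (seconds)
        print(f"I = {I:.2f}: avg queueing delay ~ {delay:.2f} s")
    else:
        print(f"I = {I:.2f}: arrival exceeds capacity, delay unbounded")
```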

Lecture 3

  • Processing delay is typically a few microsecs
  • Queueing delay depends on congestion
  • Transmission delay is significant for low-speed links
  • Propagation delay ranges from a few microsecs to hundreds of msecs

d_link = transmission delay (d_trans) + propagation delay (d_prop)

d_device = processing delay (d_proc) + queueing delay (d_queue)

Loss

Packet loss/drop happens when a packet’s transmission over the network is aborted before it reaches the destination.

Common causes are corruption and network congestion.

Loss due to network congestion: Queues in network devices have finite capacity. Packet loss occurs when memory to hold queued packets fills up. Packets arriving to full queue are dropped.

Question

How do we detect flipped bits?

We use error-detecting codes such as checksums (and sometimes error-correcting codes).

Question

Who retransmits a lost packet (sender or next hop)? And who figures out that it was dropped?

The sender retransmits the lost packet (so that routers don't need to store packets in memory). Having the sender retransmit is easier to manage, but it takes longer.

Question

How does the sender know that it needs to retransmit?

The final destination sends back an ACK. If the sender doesn’t receive the ACK, then it is assumed the packet is lost.

Sometimes we don’t retransmit a packet at all. Sometimes it’s not worth it. Speed and order of packets matter more than a missing packet for live streaming video.

Throughput

Throughput is the rate (bits/time unit) at which data is transferred between sender and receiver.

Suppose we abstract a segment of a network as a data pipe.

To find the throughput of that segment, we measure how many bits exit the end of the pipe within one time unit.

Suppose we have one pipe that admits/accepts data at rate R_s feeding into another pipe that admits/accepts data at rate R_c. The total throughput is min(R_s, R_c): the slower pipe is the bottleneck.

Throughput: network scenario

Per-connection end-end throughput is min(R_c, R_s, R/10). We need R/10 since the link is being shared by 10 connections: if everything is fair, each connection gets 1/10 of the link.

In practice: R_c or R_s is often smaller than R/10, so the access links are the bottleneck.
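The bottleneck computation is just a minimum over the per-link rates available to the connection (the link rates below are illustrative):

```python
# Bottleneck throughput: end-to-end throughput is the minimum rate
# along the path. All link rates below are illustrative.

def end_to_end_throughput(rates_bps: list[float]) -> float:
    return min(rates_bps)

Rs = 10_000_000        # server's access link, 10 Mbps
Rc = 2_000_000         # client's access link, 2 Mbps
R_shared = 50_000_000  # shared core link, 50 Mbps, split fairly 10 ways

per_connection = end_to_end_throughput([Rs, Rc, R_shared / 10])
print(per_connection)  # 2000000.0: the client's access link is the bottleneck
```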

Throughput units: bps (bits per second)

  • 1 byte = 8 bits or 1B = 8b

In networking, K/M/G are powers of 10 ( respectively).

1 MBps = 8 Mbps = 8 × 10^6 bps

Recap: networks are complex

  • They have many pieces
  • They can get large
  • They are often shared among many traffic flows
  • Very hard to design efficient and effective networks

Question

What is the internet?

Billions of connected computing devices running applications. Switches/routers that forward packets (chunks of data).

There are mobile networks, home networks, national ISP, enterprise networks, content provider networks, datacenter networks, etc.

A packet passes through many networks: sender → access network → local/regional network → global network → local/regional network → access network → receiver.

Network Edge

End systems are hosts

  • Connected computing devices
  • Running applications that communicate with applications on other end systems

Access networks

  • Access networks connect end systems to the edge routers
  • Edge router = entry point to the Internet
  • Wired or wireless

There are different kind of access networks:

  • Residential access networks
  • Wireless access networks

Wireless access networks: Shared wireless access network connects end system to router (via base station aka “access point”)

Wireless local area networks (WLANs), typically within or around building.

Wide-area cellular access networks, provided by mobile, cellular network operator (10’s km)

Datacenter networks are high-bandwidth links (10s to 100s Gbps) connect hundreds to thousands of servers together, and to Internet.

Network Core

Connecting access networks. Hosts connect to Internet via access networks. Access networks in turn must be interconnected. So that any two hosts (anywhere) can send packets to each other.

Resulting network of networks is very complex (evolution driven by economics, national policies).

Question

Given millions of access networks, how do we connect them together?

Connecting each access network to every other directly doesn't scale: O(N²) connections.

Option: connect each access network to one global transit ISP?

Customer and provider have an economic agreement.

There are multiple global ISPs (and internet exchange points between them, as well as peering links).

Regional ISPs provide connection to access ISPs. Regional ISPs are connected to Tier 1 ISPs (Sprint, AT&T, NTT) and content provider networks (Google, Meta): private network that connects its data centres to the Internet, bypassing Tier 1 and regional ISPs.

Google: one of the leading examples of a content-provider network. Google's private network only carries traffic to/from Google servers. All Google data centres are interconnected via Google's private TCP/IP network, which spans the entire globe.

The Internet: a “services” view:

Infrastructure that provides services to applications:

  • Web, streaming video, multimedia, teleconferencing, etc.

Provides programming interface to distributed applications:

  • Hooks allowing sending/receiving apps to connect to and use Internet transport service

Computer networks are complex systems that have many pieces, are large, and are often shared among many traffic flows.

Question

Is there any hope of organizing all the functionality a network should provide?

Well yes.

Example

Organization of air travel

End-to-end transfer of person plus baggage

  • Purchase ticket, check baggage, load at gate, takeoff, route airplane, land, unload at gate, claim baggage, [ticket complaint]
flowchart LR
a(ticket purchase)
b(ticket complaint)
c(baggage check)
d(baggage claim)
e(gates load)
f(gates unload)
g(runway takeoff)
h(runway landing)
i(airplane routing)
j(airplane routing)
a-- ticketing service ---b
c-- baggage service ---d
e-- gate service ---f
g-- runway service ---h
i-- routing service ---j

Question

Why layering?

Approach to designing/discussing complex systems:

  • Explicit structure allows identification of system’s pieces and their relationships
    • Layered reference model for discussion
  • Modularization
    • Eases maintenance and updating of system (since there is an interface). Change in layer’s service implementation is transparent to the rest of the system (abstractions)

Layered internet protocol stack

  • Application: Supporting network applications (HTTP, IMAP, SMTP, DNS)
  • Transport: Process-process data transfer (TCP, UDP)
  • Network: Routing of datagrams from source to destination (IP, routing protocols)
  • Data link: Data transfer between neighbouring network elements (Ethernet, WiFi, PPP)
  • Physical: Bits “on the wire”

Lecture 4

Application exchanges messages to implement some application service using services of transport layer.

Transport-layer protocol transfers from one process to another, using services of network layer.

Transport-layer protocol encapsulates the application-layer message, M, with transport-layer header H_t to create a transport-layer segment. H_t is used by the transport-layer protocol to implement its service.

Network-layer protocol transfers the transport-layer segment from one host to another, using link-layer services; it encapsulates the segment with network-layer header H_n to create a network-layer datagram.

Link-layer protocol transfers datagram from host to neighbouring host, using physical-layer services.

Link-layer protocol encapsulates the network datagram with link-layer header H_l to create a link-layer frame.

Think of the encapsulation as Matryoshka dolls: the message (application) goes inside a segment (transport), which goes inside a datagram (network), which goes inside a frame (link); the frame is handed to the physical layer, sent to the physical layer of the destination, and unwrapped layer by layer.
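The nesting can be sketched with invented placeholder headers (real headers are binary and far richer than these 4-byte tags):

```python
# Toy sketch of layered encapsulation as nested headers. The header
# contents are invented for illustration.

message = b"GET /index.html"   # application-layer message M

segment = b"[Ht]" + message    # transport layer adds header Ht
datagram = b"[Hn]" + segment   # network layer adds header Hn
frame = b"[Hl]" + datagram     # link layer adds header Hl

# The receiver unwraps in reverse order, one layer at a time.
def strip_header(pdu: bytes) -> bytes:
    return pdu[4:]  # all demo headers are 4 bytes long

assert strip_header(strip_header(strip_header(frame))) == message
```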

The end-hosts typically implement all layers of the stack. Depending on their functionality, devices in the network implement all or a subset of the layers.

We will start with the application layer.

Application Layer

Network apps are programs that:

  • Run on different end systems
  • Communicate over a network

No need to write software for network-core devices

  • Network-core devices do not run user applications
  • Applications on end systems allows for rapid app development and propagation

Client-server paradigm

Server:

  • Always-on host
  • Permanent network address
  • Often in data centres, for scaling

Clients:

  • Contact, communicate with server
    • Client always initiates the communication
  • May be intermittently connected
  • May have dynamic network addresses
  • Do not communicate directly with each other

This has a single point of failure. There are also privacy concerns.

Peer-to-peer (P2P) architecture

  • No always-on server
  • Arbitrary end systems directly communicate
  • Peers request service from other peers, provide service in return to other peers
    • Self-scalability - new peers bring new service capacity, as well as new service demands
  • Peers are intermittently connected and change network address
    • Complex management

The client-server model is easier to maintain.

A hybrid paradigm is also possible (between client-server and P2P).

An application-layer protocol defines:

  • Types of messages exchanged
  • Message syntax
  • Message semantics
  • Rules

Open protocols

  • Defined in public standards (RFCs)
  • Everyone has access to protocol definition
  • Allows for interoperability
  • e.g. HTTP, SMTP

Proprietary protocols

  • e.g. Skype, Zoom

The application layer relies on the transport layer. The application is controlled by the app developer, but the other layers are controlled by the OS.

Question

What transport service may an app need?

Data integrity

  • Some apps require 100% reliable data transfer
  • Other apps can tolerate some loss

Throughput

  • Some apps require a minimum amount of throughput to be effective
  • Other apps make use of whatever throughput they get (elastic)

Timing

  • Some apps require low delay to be “effective”

Security

  • Encryption
| Application | Data Loss | Throughput | Time Sensitive? |
| --- | --- | --- | --- |
| File transfer/download | No loss | Elastic | No |
| E-mail | No loss | Elastic | No |
| Web documents | No loss | Elastic | No |
| Real-time audio/video | Loss-tolerant | Audio: 5 Kbps – 1 Mbps; Video: 10 Kbps – 5 Mbps | Yes: 10 msec |
| Streaming audio/video | Loss-tolerant | Same as above | Yes: few seconds |
| Interactive games | Loss-tolerant | Kbps+ | Yes: 10 msec |
| Text messaging | No loss | Elastic | Yes and no |

Internet applications rely on Internet transport protocols. The application has an interface (socket) to the transport layer. There is reliable data transfer (TCP) or unreliable data transfer (UDP).

Internet transport protocol services

Reliable connection-based service (e.g. TCP):

  • Reliable transport?
    • Between sending and receiving process
  • Flow control?
    • Sender won’t overwhelm receiver
  • Congestion control?
    • Throttle sender when network overloaded
  • Connection oriented?
    • Setup required between client and server processes
  • Does not provide?
    • Timing, minimum throughput guarantee, security

Unreliable connection-less service:

  • Unreliable data transfer?
    • Between sending and receiving process. Data loss is possible
  • Does not provide?
    • Reliability, flow control, congestion control, timing, throughput guarantee, security, connection setup

Question

Why do we need unreliable data transfer?

Unreliable data transfer has lower overhead, so it is faster. Good for real-time apps.

| Application | Application-Layer Protocol | Transport Protocol |
| --- | --- | --- |
| File transfer/download | FTP [RFC 959] | TCP |
| E-mail | SMTP [RFC 5321] | TCP |
| Web documents | HTTP [RFC 7230, 9110] | TCP |
| Internet telephony | SIP [RFC 3261], RTP [RFC 3550], or proprietary | TCP or UDP |
| Streaming audio/video | HTTP [RFC 7230], DASH | TCP |
| Interactive games | WOW, FPS (proprietary) | UDP or TCP |

Example

Web applications

Web page consists of objects, each of which can be stored on different Web servers

An object can be an HTML file, JPEG image, Java applet, audio file, etc.

Web page consists of base HTML-file which includes several referenced objects, each addressable by a URL.

Suppose you want to implement a simple web server and a browser. The user will enter the URL to the object they want to access (say the HTML file for cs.uwaterloo.ca). The file is stored in a server in the CS department, where your web server is also running.

Your browser should retrieve the file and display it

Question

How do you have the browser and server coordinate to retrieve the file?

HTTP Overview

HTTP: Hypertext transfer protocol

  • Web’s application-layer protocol
  • Client/server model:
    • Client: Web browser that requests, receives (using HTTP) and displays Web objects.
    • Server: Web server that sends (using HTTP protocol) objects in response to requests.

User enters URL: `www.someSchool.edu/someDepartment/home.index`

    1. HTTP client initiates connection to HTTP server (process) at www.someSchool.edu
    2. HTTP server at host www.someSchool.edu “accepts” connection, notifying client

We will learn more about connections later during the transport layer section.

  3. HTTP client sends HTTP request message (containing URL) on the connection. Message indicates that client wants object someDepartment/home.index

  4. HTTP server receives request message, forms response message containing requested object, and sends message to the client

  5. HTTP server closes connection

  6. HTTP client receives response message containing HTML file, displays HTML

Two types of HTTP message: request, response

HTTP connection: built on top of a reliable connection-based transport service.

HTTP is stateless: Server maintains no information about past client requests.

Protocols that maintain state are complex.

HTTP request message

  • ASCII (human-readable format)
Request Line

GET /index.html HTTP/1.1\r\n

Header Lines

Host: www-net.cs.umass.edu\r\n
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X) Gecko/20100101 Firefox/80.0 \r\n
Accept: text/html, application/xhtml+xml\r\n
Accept-Language: en-us,en;q=0.5\r\n
Accept-Encoding: gzip,deflate\r\n
Connection: keep-alive\r\n
\r\n
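The request format shown above can be produced programmatically: request line, header lines, then a blank line, each terminated by `\r\n`. This sketch builds the raw ASCII bytes (the header values are illustrative):

```python
# Assembling an HTTP/1.1 GET request as raw ASCII: request line,
# header lines, then a blank line ends the header section.
# Host and path values follow the example in the notes.

def build_get_request(host: str, path: str) -> bytes:
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: keep-alive",
        "",  # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")

request = build_get_request("www-net.cs.umass.edu", "/index.html")
print(request.decode())
```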

The general format of a HTTP request message is:

  • Request line
  • Header lines
  • Body

Question

Other HTTP request message? What can a client request?

GET method

  • Requests the object at the specified URL

POST method

  • Web page often includes form input
  • User input sent from client to server in entity body of HTTP POST request message

HEAD method

  • Requests headers (only) that would be returned if specified URL were requested with an HTTP GET method

PUT method:

  • Uploads new file (object) to server
  • Completely replaces file that exists at specified URL with content in entity body of PUT HTTP request message

Lecture 5

The status code appears in the first line of the server-to-client response message. Sample codes:

  • 200 OK
  • 301 Moved Permanently
  • 400 Bad Request
  • 404 Not Found
  • 505 HTTP Version Not Supported

So HTTP messages are either request or response.

HTTP Connection: Built on top of a reliable connection-based transport service.

Non-Persistent HTTP

  1. Connection opened
  2. At most one object sent over connection
  3. Connection closed

Downloading multiple objects requires multiple connections.

Persistent HTTP

  1. Connection opened
  2. Multiple objects can be sent over a single connection between client and server
  3. Connection closed

Example

Non-Persistent HTTP: Example

  1. HTTP client initiates connection to HTTP server at URL

  2. HTTP server at host “accepts” connection, notifying client

  3. HTTP client sends HTTP request message (containing URL) into connection. Message indicates that client wants the object

  4. HTTP server receives request message, forms response message containing requested object, and sends message to the client.

  5. HTTP server closes TCP connection

  6. HTTP client receives response containing HTML file, displays HTML.

  7. Steps 1-5 repeated for each of 10 JPEG objects

Non-Persistent HTTP: Response Time

Definition

RTT is the time for a packet to travel from client to server and back

HTTP Response time (per object):

  • One RTT to initiate connection
  • One RTT for HTTP request and first few bytes of HTTP response to return
  • Object/file transmission time

Non-Persistent HTTP response time = 2·RTT + file transmission time

Non-persistent HTTP issues

  • A separate connection for each object
  • Higher response time
    • One object: 2·RTT + file transmission time
    • N objects: 2N·RTT + N × (file transmission time)
    • Browsers often open multiple parallel TCP connections to fetch referenced objects in parallel to improve response time
  • Higher resource overhead - the end host operating system incurs overhead for maintaining each connection

Persistent HTTP (HTTP 1.1)

Server leaves the connection open after sending a response. Subsequent HTTP messages between the same client/server are sent over the already established connection. The client sends requests as soon as it encounters a referenced object. This has a lower response time.

  • Response time for the first object: 2·RTT + file transmission time
  • Response time for each subsequent object: 1·RTT + file transmission time

As little as one RTT for almost all the referenced objects, cutting the overall response time roughly in half, with lower resource overhead.
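The two formulas can be compared numerically (the RTT and per-object transmission time below are made-up values):

```python
# Comparing total response time for one base file plus N referenced
# objects under non-persistent vs persistent HTTP, using the formulas
# above. RTT and per-object transmission time are made-up numbers.

RTT = 0.1      # seconds
t_file = 0.02  # transmission time per object, seconds
N = 10         # referenced objects (plus the base file)

# Non-persistent: every object costs connection setup + request/response.
non_persistent = (N + 1) * (2 * RTT + t_file)

# Persistent: setup is paid once; each later object costs one RTT.
persistent = (2 * RTT + t_file) + N * (RTT + t_file)

print(f"non-persistent: {non_persistent:.2f} s")  # 2.42 s
print(f"persistent:     {persistent:.2f} s")      # 1.42 s
```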

Question

Why not just always use Persistent HTTP?

Persistent HTTP can be expensive, since maintaining open connections consumes server resources.

HTTP is stateless, the server maintains no information about past client requests.

No notion of multi-step exchanges of HTTP messages to complete a Web “transaction”. All HTTP requests are independent of each other. No need for client/server to recover from a partially completed transaction.

Question

What happens if the network connection or client crashes partway through a multi-step exchange in a stateful protocol?

The server is stuck in a broken intermediary state.

Web sites and client browser use cookies to maintain some state between transactions.

Four components:

  1. Cookie header line of HTTP response message
  2. Cookie header line in next HTTP request message
  3. Cookie file kept on user’s host, managed by user’s browser
  4. Back-end database at website

Example

Susan uses a browser on her laptop and visits an e-commerce site for the first time. When the initial HTTP request arrives at the site, the site creates:

  • unique ID (aka “cookie”)
  • Entry in backend database for ID

Subsequent HTTP requests from Susan to this site will contain cookie ID value, allowing site to “identify” Susan.

Cookies can be used for authorization, shopping carts, recommendations, etc.

Question

How do we keep state?

At protocol endpoints: Maintain state at sender/receiver over multiple transactions

In messages: Cookies in HTTP messages carry state

Cookies permit sites to learn a lot about you on their site. Third-party persistent cookies (tracking cookies) allow a common identity to be tracked across multiple websites.

A first-party cookie is from a website that you chose to visit (it provides the base HTML file). A third-party cookie is from a website you did not choose to visit.

For example, the user opens NYT (for the first time) and receives a cookie in the HTTP reply. The page then triggers an HTTP GET for an ad, and that HTTP reply comes with its own cookie. This third-party cookie is from the advertisement company.

Ad company will track web browsing over sites with those ads. Can return targeted ads based on browsing history.

Back to the previous example: the user then visits socks.com, which also has an ad from the same ad company. The same ad-company cookie from before is sent in the HTTP GET to the advertisement company (and it notes that the ad was seen on socks.com).

A day later, the user visits NYT again, using the same cookie in the HTTP GET. The HTTP GET to the ad company uses the same ad cookie as before; the company notices that this cookie was also used on socks.com and displays a sock advertisement.

Tracking may be invisible to the user. Rather than displayed ad triggering HTTP GET to the tracker, could be an invisible link.

So while HTTP itself is stateless, state can be layered on top of it via mechanisms such as cookies.

Cookies can help improve performance but so can caching.

Web Caching

Goal: Satisfy client requests without involving origin server. User configures browser to point to a local web cache. Browser sends all HTTP requests to cache. If the object is not in the cache, cache requests object from origin server, cache received object, returns the object to the client. Otherwise the cache returns object to client immediately.

Web cache acts as both client and server. Server for original requesting client, and client to origin server. Server tells cache about object’s allowable caching in response header.

Question

Why web caching?

Reduce response time for client request. The cache is closer to the client. Reduce traffic on an institution’s access link.

Caching

  • Access link rate: 1.5 Mbps
  • RTT from institutional router to server: 2 seconds
  • Average web object size: 750k bits
  • Average request rate from browsers to origin servers: 1.8/sec

Question

What is the average delay for a web object crossing the access link?

Mostly affected by queueing delay.

Average queueing delay ≈ I/(1−I) × (L/R), where a is the number of objects arriving per second, L/R is the average transmission time of each object, and I = a·(L/R) is the traffic intensity.

Question

What is the average response time?

Response time = Internet delay + access link delay + LAN delay (negligible).

In the above example, the traffic intensity is I = 1.8 × (750,000/1,500,000) = 0.9, so the access link delay is about 0.9/0.1 × 0.5 + 0.5 = 5 seconds.

This means that the average response time is about 2 + 5 = 7 seconds.

Option 1 is to buy a faster access link (rate goes from 1.5 Mbps to 15 Mbps).

The access link delay drops to about 0.055 seconds, and the average response time is now about 2.055 seconds.

A faster access link is expensive.

Option 2 is to install a web cache. Web cache hit ratio is 0.6 (60% requests served by cache, with negligible delay, and 40% requests served by origin servers).

The average number of objects crossing the access link is 0.4 × 1.8 = 0.72/sec, i.e., 0.72 × 750 Kb = 0.54 Mbps.

The traffic intensity drops to 0.54/1.5 = 0.36, so the average queueing delay ≈ 0.36/0.64 × 0.5 ≈ 0.28 seconds.

Average response time ≈ 0.6 × (≈0, delay when satisfied at cache) + 0.4 × (2 + 0.28 + 0.5, delay from origin servers) ≈ 1.1 seconds.

Lower average end-end delay than with 15 Mbps link.
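The arithmetic in this example can be checked in a few lines, using the simple I/(1−I) queueing estimate (all numbers follow the example):

```python
# Reproducing the web-cache arithmetic from the example: 1.5 Mbps access
# link, 2 s RTT to origin servers, 750 Kb objects, 1.8 requests/sec.
# Queueing delay uses the simple I/(1-I) estimate.

R = 1.5e6         # access link rate, bps
RTT_origin = 2.0  # seconds to origin servers and back
L = 750e3         # average object size, bits
a = 1.8           # requests/sec

def access_link_delay(arrival_rate: float) -> float:
    I = arrival_rate * L / R              # traffic intensity
    return I / (1 - I) * (L / R) + L / R  # queueing + transmission

# No cache: I = 0.9, access delay ~ 5 s, response ~ 7 s
no_cache = RTT_origin + access_link_delay(a)

# 60% hit ratio: only 40% of requests cross the access link
hit = 0.6
cached = hit * 0.0 + (1 - hit) * (RTT_origin + access_link_delay(a * (1 - hit)))

print(f"no cache: {no_cache:.2f} s")  # 7.00 s
print(f"cache:    {cached:.2f} s")    # ~1.11 s
```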

Browser caching: Conditional GET

Goal: don’t send object if browser has up-to-date cached version. No object transmission delay.

Client: Specify date of browser-cached copy in HTTP request.

Server: Response contains no object if browser-cached copy is up-to-date.
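The server-side decision can be sketched as follows (the handler and its arguments are hypothetical; a real server parses the `If-Modified-Since` header out of the request):

```python
# Sketch of a conditional GET decision: the client includes an
# If-Modified-Since header; the server replies 304 with no body when
# the browser-cached copy is still current. The handler is hypothetical.

from email.utils import parsedate_to_datetime

def handle_get(if_modified_since, last_modified: str):
    """Return (status, body) for a request against one object."""
    if if_modified_since is not None:
        cached = parsedate_to_datetime(if_modified_since)
        current = parsedate_to_datetime(last_modified)
        if cached >= current:
            return 304, b""                 # Not Modified: no object sent
    return 200, b"<html>fresh copy</html>"  # full object

status, body = handle_get("Wed, 01 Jan 2025 00:00:00 GMT",
                          "Wed, 01 Jan 2025 00:00:00 GMT")
print(status)  # 304
```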

HTTP/2 key goal: Decreased delay in multi-object HTTP requests

HTTP/1.1 introduced multiple, pipelined GETs over a single TCP connection. The server responds in order (FCFS: first-come-first-served scheduling) to GET requests. With FCFS, a small object may have to wait for transmission behind large object(s) (head-of-line (HOL) blocking).

HTTP/2: Increased flexibility at server in sending objects to client:

  • Methods, status codes, most header fields unchanged from HTTP 1.1
  • Transmission order of requested objects based on client-specified object priority (not necessarily FCFS)
  • Push unrequested objects to client
  • Divide objects into frames, schedule frames to mitigate HOL blocking

HTTP/2 over single connection means:

  • Recovery from packet loss still stalls all object transmissions. Browsers have incentive to open multiple parallel TCP connections to reduce stalling, increase overall throughput
  • No security over vanilla TCP connection

HTTP/3 adds security, plus per-object error and congestion control, over UDP.

Back to the layers.

Question

How do applications communicate with the transport layer?

The application itself should specify the destination that will receive the data, the type of transport service it wants (e.g., TCP vs. UDP), and the data that should be sent.

Communication endpoints are processes.

Process: Program running within a host. Within same host, two processes communicate using inter-process communication (defined by operating system)

Processes in different hosts communicate by exchanging messages over the network.

To receive messages, a process must have an identifier. The host device has a unique 32-bit IP address.

Question

How do we find the IP address?

We find the address via DNS.

Question

Does the IP address of the host on which the process runs suffice for identifying the process?

No. Many processes could be running on the same host.

The identifier includes both the IP address and the port number associated with the process on the host.

To send HTTP message to a specific web server, need to use the IP address and the port number.

For applications using Internet protocols, the common interface to the transport layer is the socket interface.

Lecture 6

Process sends/receives messages to/from its socket. Sockets are analogous to a door. Sending process shoves message out the door. Sending process relies on transport infrastructure on other side of the door to deliver message to socket at receiving process. Two sockets involved: one on each side.

Socket programming

Goal: Learn how to build client/server applications that communicate using sockets

Socket: Door between application process and end-to-end transport protocol.

Two socket types for two transport services:

  • UDP: unreliable datagram
  • TCP: reliable, byte stream-oriented

Example:

  • Client reads a line of characters from its keyboard and sends data to server
  • Server receives the data and converts characters to uppercase
  • Server sends modified data to client
  • Client receives modified data and displays line on its screen

Socket programming with UDP

UDP: No “connection” between client and server:

  • No handshaking before sending data
  • Sender attaches IP destination address and port number to each packet
  • Receiver extracts sender IP address and port number from received packet

UDP: transmitted data may be lost or received out-of-order

Application viewpoint:

  • UDP provides unreliable transfer of groups of bytes between client and server processes

UDP Interaction

Create a socket on the client and the server. The client creates a datagram with the server IP address and port=x, and sends the datagram via clientSocket. The server reads the datagram from serverSocket. The server writes a reply to serverSocket, specifying the client address and port number. The client reads the datagram from clientSocket, then closes clientSocket.

Example app: UDP client

client.py
from socket import *
 
serverName = 'hostname'
serverPort = 12000
clientSocket = socket(AF_INET, SOCK_DGRAM)
 
message = input("Input lowercase sentence")
 
#sending the message
clientSocket.sendto(message.encode(), (serverName, serverPort))
 
#reading the result
modifiedMessage, serverAddress = clientSocket.recvfrom(1024)
print(modifiedMessage.decode())
clientSocket.close()
server.py
from socket import *
 
serverPort = 12000
serverSocket = socket(AF_INET, SOCK_DGRAM)
serverSocket.bind(("", serverPort))
print("The server is ready to receive")
 
while True:
	message, clientAddress = serverSocket.recvfrom(2048)
	modifiedMessage = message.decode().upper()
	serverSocket.sendto(modifiedMessage.encode(), clientAddress)

Socket programming with TCP

Client must contact the server. To establish a connection, server must have created a socket that welcomes client’s contact. Client creates TCP socket, specifying IP address, port number and server process. When the client creates socket: client TCP establishes connection to server TCP.

When contacted by client, server TCP creates new socket for server process to communicate with that particular client. Allows server to establish connections with multiple clients. Client source port # and IP address used to distinguish clients.

TCP provides reliable, in-order byte-stream transfer between client and server processes.

TCP Interaction

Server creates socket, port=x, for incoming requests: serverSocket = socket(). Wait for incoming connection request connectionSocket = serverSocket.accept(). This is where the TCP connection setup happens with the client. The client creates a socket, connects to hostid, port=x, clientSocket = socket(). The client sends a request using clientSocket. The server reads the request from connectionSocket. The server writes a reply to connectionSocket. The client reads the reply from clientSocket, and closes clientSocket. The server also closes connectionSocket at this point.

Example app: TCP client

client.py
from socket import *
 
serverName = 'servername'
serverPort = 12000
 
#create TCP socket for server
clientSocket = socket(AF_INET, SOCK_STREAM)
 
clientSocket.connect((serverName, serverPort))
 
#after the handshake, all communication is only between these two processes
sentence = input("input sentence")
clientSocket.send(sentence.encode())
 
modifiedSentence = clientSocket.recv(1024)
print("From server: ", modifiedSentence.decode())
clientSocket.close()

In the UDP client, sendto was sending the message along with the serverName, and serverPort.

In TCP client, after you connect to the serverName, serverPort, you can only send to this serverName and serverPort, there is no option to change this.

server.py
from socket import *
 
serverPort = 12000
serverSocket = socket(AF_INET, SOCK_STREAM)
serverSocket.bind(("", serverPort))
 
#number of simultaneous connections
serverSocket.listen(1)
 
print("The server is ready to receive")
 
while True:
	connectionSocket, addr = serverSocket.accept()
	
	sentence = connectionSocket.recv(1024).decode()
	capitalizedSentence = sentence.upper()
	connectionSocket.send(capitalizedSentence.encode())
	connectionSocket.close()

The socket interface is not the only interface.

Video streaming: client-server

Video is a sequence of images displayed at a constant rate. A digital image is an array of pixels. Each pixel is represented by a number of bits.

There are many ways to optimize this. We use redundancy within and between images to decrease the number of bits used to encode an image.

Instead of sending N values of the same colour, send only two values: the colour value and the repetition count N. This is spatial coding.

Instead of sending the complete frame at time i, send only the differences from frame i−1. This is temporal coding.

CBR (constant bit rate): Video encoding rate fixed.

VBR (variable bit rate): Video encoding rate changes as amount of spatial, temporal coding changes.

Streaming stored video. The simple scenario is: video server → Internet → client.

Main challenges:

  • Server-to-client bandwidth varies over time due to congestion.
  • Packet loss, delay due to congestion will delay playout, or result in poor video quality.

The gap between the time the server sends the data and the time the client receives it is the network delay. The client plays out an earlier part of the video while the server is still sending a later part.

Challenges:

  • Continuous playout constraint: During client video playout, playout timing must match original timing. Network delays are variable, so will need client-side buffer to match continuous playout constraint.
  • Pause, fast-forward, rewind, jump through video

Client-side buffering and playout delay compensate for network-added delay and delay jitter.

Video stream traffic is a major consumer of Internet bandwidth.

Question

How to efficiently get content to millions of users?

Idea 1: Content distribution networks (CDNs)

Option 1: Single, large “mega-servers”

  • Single point of failure
  • Point of network congestion
  • Long path to distant clients

This doesn’t scale.

Option 2: Store/serve multiple copies of videos at multiple geographically distributed sites (CDN)

  • Push CDN servers deep into many access networks.
  • Smaller number of larger clusters on Internet exchange points

Idea 2: DASH (Dynamic Adaptive Streaming over HTTP)

Intelligence at client. The client determines

  • When to request chunk, what encoding rate to request, where to request the chunk.

Streaming video = encoding + DASH + playout buffering.

Lecture 7

Example

Video Streaming: Netflix

Netflix stores copies of content at its worldwide OpenConnect CDN nodes. Subscriber requests content, service provider returns manifest. Using the manifest, client retrieves content at highest supportable rate. May choose different rate or copy if network path is congested.

Some interesting design decisions services like Netflix need to make:

  • What content to place in which CDN nodes?
  • From which CDN node to retrieve content? At which rate?
  1. Bob manages Netflix account
  2. Bob browses Netflix videos (to cloud)
  3. Manifest file returned for requested video (from cloud)
    1. Cloud uploads multiple versions of video to CDN servers.
  4. DASH streaming from CDN server to Bob

P2P File Distribution: Peer-to-Peer

There is no always-on server. Arbitrary end systems directly communicate. Peers request service from other peers, provide service in return to other peers. Self scalability - new peers bring new service capacity, and new service demands.

Peers are intermittently connected and change network addresses.

A server distributes one copy of a large file to each of the N hosts.

  • u_s: upload rate of the server’s access link
  • u_i: upload rate of the i-th peer’s access link
  • d_i: download rate of the i-th peer’s access link.

Question

How much time does it take to distribute a file (size F) from one server to N peers?

File distribution time: client-server

Server transmission: must sequentially send (upload) N file copies.

  • No one else helps in uploading.
  • Time to send one copy: F/u_s
  • Time to send N copies: NF/u_s

Client: Each client must download a file copy

  • d_min = minimum client download rate
  • Minimum client download time: F/d_min

Time to distribute F to N clients using the client-server approach is D_cs ≥ max{NF/u_s, F/d_min}. This is the lower bound (not the actual time).

File distribution time: P2P

Server transmission: must upload at least one copy

  • Time to send one copy: F/u_s

Client: each client must download a file copy

  • Minimum client download time: F/d_min

Server and clients: as a whole, the system must deliver (upload) a total of NF bits (F bits to each of the N peers). The maximum upload rate is u_s + Σ u_i.

Time to distribute F to N clients using the P2P approach is D_P2P ≥ max{F/u_s, F/d_min, NF/(u_s + Σ u_i)}. This is the lower bound (not the actual time).

Consider distributing a file of F Mbits to N peers. The server has an upload rate of u_s Mbps, and each peer has an upload rate of u kbps and a download rate of d Mbps. What is the minimum file distribution time for the client-server and P2P approaches, respectively? Apply the two bounds above.
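A sketch applying both lower bounds; all numeric values here are illustrative assumptions, not the lecture's original numbers:

```python
# Lower bounds on distribution time from the formulas above. All numeric
# values are illustrative assumptions.

def d_client_server(F, N, u_s, d_min):
    return max(N * F / u_s, F / d_min)

def d_p2p(F, N, u_s, d_min, u_peer):
    return max(F / u_s, F / d_min, N * F / (u_s + N * u_peer))

F = 100.0    # file size, Mbits       (assumed)
N = 10       # number of peers        (assumed)
u_s = 10.0   # server upload, Mbps    (assumed)
u = 0.5      # peer upload, Mbps      (assumed, i.e. 500 kbps)
d = 2.0      # peer download, Mbps    (assumed)

print(d_client_server(F, N, u_s, d))     # 100.0 seconds
print(round(d_p2p(F, N, u_s, d, u), 1))  # 66.7 seconds
```

With these numbers, P2P wins because the aggregate peer upload capacity (u_s + N·u = 15 Mbps) exceeds the server's upload rate alone.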

P2P file distribution: BitTorrent

File divided into 256Kb chunks. Peers in torrent send/receive file chunks.

Tracker: tracks peers participating in torrent.

Torrent: group of peers exchanging chunks of a file.

Peer joining torrent:

  • Has no chunks, but will accumulate them over time from other peers
  • Registers with tracker to get list of peers, connects to subset of peers

While downloading, peer uploads to other peers

Peer may change peers with whom it exchanges chunks.

Peers may come and go (this is churn)

Once peer has entire file, it may selfishly leave, or altruistically remain in the torrent.

Requesting chunk:

  • At any given time, different peers have different subsets of file chunks
  • Periodically, Alice asks each peer for list of chunks they have
  • Alice requests missing chunks from peers, rarest first

Sending chunks: tit-for-tat

  • Alice sends chunks to those four peers currently sending her chunks at the highest rate
    • Other peers are choked by Alice (do not receive chunks from her)
    • Re-evaluate top 4 every 10 seconds
  • Every 30 seconds, randomly select another peer and start sending chunks.
  • Optimistically ‘unchoke’ this peer. Newly chosen peer may join top 4.

Alice “optimistically unchokes” Bob. Alice becomes one of Bob’s top-four providers; Bob reciprocates. Bob becomes one of Alice’s top-four providers.

Transport layer: Overview

Provide service to the application layer.

  • Transport to Application: “If you give me some data and the ID of the other communication endpoints, I will get that data to that communication endpoint”. Using the services of the network layer.
  • Network to transport: “If you give me some data and the ID of the computer (host) that is the destination, I will get that data to the destination host”

Application on host A: Send 3000B through its socket to a process on host B.

The network layer does its best to send packets of size 1500B from A to B, but packets may get lost or corrupted.

Transport-layer protocol:

  • How can I distinguish between traffic from different sockets?
    • Port numbers, multiplexing, demultiplexing
  • How do I break data into packets and put it back together?
    • Segmentation and reassembly
  • How do I make sure all bytes are delivered reliably?
    • Reliable data transfer

Transport layer (sender)

  • Is passed an application-layer message
  • Attaches its own metadata to help with (de)mux segmentation and reassembly, and reliable data delivery
  • Creates segment (transport header + data)
  • Passes segment to the network layer

Transport layer (Receiver)

  • Receives segment (transport header + data) from network layer
  • Use header values to reassemble bytes if needed
  • Extracts application-layer message
  • Use header values to demultiplex the message up to the application via the socket.

Multiplexing/Demultiplexing

Multiplexing as a sender: Handle data from multiple sockets, add transport header. Using the same network for multiple apps. (Different lanes merging onto same highway)

Demultiplexing as a receiver: Use header info to deliver received segments to correct socket. (Exits off the highway)

Host receives datagrams. Each datagram has a source network address, destination network address. Each datagram carries one transport-layer segment. Each segment has source, destination port number.

Host uses network addresses & port numbers to direct segments to appropriate sockets.

When creating a socket, must specify host-local port # serverSocket.bind(("", 12000))

When sending data into UDP socket, must specify

  • Destination IP address
  • Destination port #

A UDP socket is identified by a two-tuple: destination IP address and destination port number. When the receiving host receives a UDP segment, it checks the destination port # in the segment and directs the UDP segment to the socket with that port #.

IP/UDP datagrams with the same destination port #, but different source IP addresses and/or source port numbers, will be directed to the same socket at the receiving host.

Example

Connectionless demultiplexing (UDP only needs to bind to source port)

mySocket2.bind(("",9157))
 
serverSocket.bind(("",6428))
 
mySocket1.bind(("",5775))

TCP socket is identified by 4-tuples:

  • Source IP address
  • Source port number
  • Destination IP address
  • Destination port number

Demux: Receiver uses all four values to direct segment to appropriate socket.

Server may support many simultaneous TCP sockets:

  • Each socket identified by its own 4-tuples
  • Each socket associated with a different connecting client

Lecture 8

Multiplexing, demultiplexing: based on transport segment and network datagram header field values.

UDP: demultiplexing at the destination host using destination IP and port number

TCP: demultiplexing at the destination host using 4-tuple: source and destination IP addresses, and port numbers.

UDP - User Datagram Protocol

Question

How does UDP break data into packets and put it back together?

It doesn’t. You can only put as much data into a UDP segment that will fit into a single packet. UDP does not make sure all bytes are delivered reliably.

UDP use:

  • Streaming multimedia apps
  • HTTP/3
  • Other network apps or protocols like DNS

If reliable transfer is needed over UDP, then add needed reliability at application layer.

UDP: Transport Layer Actions

Sender actions

  • Is passed an application-layer message
  • Determines UDP segment header fields values
  • Creates UDP segment
  • Passes segment to IP

Receiver actions:

  • Receives segment from IP
  • Checks UDP checksum header value
  • Extracts application-layer message
  • Demultiplexes messages up to application via socket

UDP segment header contains source port #, destination port #, length (length in bytes of UDP segment, including header), checksum, and application data.

Question

What does the checksum do?

The goal is to detect errors in transmitted segment.

(Diagram: the transmitted segment vs. the received segment, in which some bits may have been flipped in transit.)

The sender treats the contents of the UDP segment as a sequence of 16-bit integers. The checksum is the one’s complement sum of the segment contents. The checksum value is put into the UDP checksum field.

The receiver computes the checksum of the received segment. Checks if the computed checksum equals the checksum field value.
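A sketch of the Internet checksum computation (one's complement sum of 16-bit words), which is the algorithm UDP uses:

```python
# Sketch of the Internet checksum: one's complement sum of 16-bit words,
# as used for the UDP checksum field.

def checksum16(data: bytes) -> int:
    if len(data) % 2:                    # pad odd-length data with a zero byte
        data += b"\x00"
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
        total = (total & 0xFFFF) + (total >> 16)   # wrap the carry back in
    return ~total & 0xFFFF               # one's complement of the sum

seg = b"networks"                        # even-length example payload
ck = checksum16(seg)
# Receiver check: recomputing over data + checksum yields 0 (no error seen).
assert checksum16(seg + ck.to_bytes(2, "big")) == 0
```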

Reliable data transfer (rdt) is one of the most important services a transport protocol can provide over an unreliable network layer.

rdt interfaces:

  • rdt_send() is called from above, passed data to deliver to receiver upper layer
  • udt_send() is called by rdt to transfer packet over bi-directional unreliable channel to receiver
  • rdt_rcv() is called when the packet arrives on receiver side of the channel
  • deliver_data() is called by rdt to deliver data to the upper layer (receiving process)

The complexity of reliable data transfer protocol will depend (strongly) on characteristics of unreliable channel.

The requirements of rdt are:

  • No corrupted bits
  • All bits delivered
  • No duplicates
  • Data is received in the order sent

Sender, receiver do not know the “state” of each other, unless communicated via a message.

We will use finite state machines to specify sender, receiver.

FSM refresher (turnstile)

State | Event | Action | Next State
Locked | Push | None | Locked
Locked | Ticket | Unlock | Open
Open | Push | Lock | Locked
Open | Ticket | None | Open
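The turnstile FSM can be encoded directly as a transition table:

```python
# The turnstile FSM above as a transition table:
# (state, event) -> (action, next_state).

TRANSITIONS = {
    ("Locked", "Push"):   (None,     "Locked"),
    ("Locked", "Ticket"): ("Unlock", "Open"),
    ("Open",   "Push"):   ("Lock",   "Locked"),
    ("Open",   "Ticket"): (None,     "Open"),
}

def step(state, event):
    action, next_state = TRANSITIONS[(state, event)]
    return action, next_state

state = "Locked"
for event in ["Push", "Ticket", "Push"]:  # push (no-op), ticket, push
    _, state = step(state, event)
assert state == "Locked"  # the ticket unlocked it, the push re-locked it
```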

Question

What if the underlying channel is reliable?

No bit errors and no loss of packets. We use separate FSMs to show the logic of the sender and receiver side.

Sending Side

State | Event | Action | Next State
Wait for call from above (app) | App layer says to send data | Send to network layer | Wait for call from above (app)

Receiving side

State | Event | Action | Next State
Wait for call from below (network) | Network-layer packet arrives | Send to app layer | Wait for call from below (network)

Unreliable channel v1: Channel with bit errors

Underlying channel may flip bits in packet. We use checksum to detect bit errors.

Question

How do we recover from errors?

Acknowledgements (ACKs): receiver explicitly tells sender that packet received OK

Negative acknowledgements (NAKs): receiver explicitly tells sender that packet had errors

Sender retransmits packet on receipt of NAK.

Example

Sender

  • Sends a packet
  • Waits to get an ACK/NAK
    • If NAK, resend the packet
      • go back to waiting
    • If ACK, proceed with sending next packet

Receiver

  • When packet is received
    • Examine checksum
    • If correct packet, send ACK
      • deliver data to app layer
    • If corrupted packet, send NAK

We can also have corrupted feedback.

Question

What happens if ACK/NAK corrupted?

Sender doesn’t know what happened at the receiver, so it should retransmit. But how does the receiver distinguish between the new and retransmitted data.

To handle duplicates, the sender adds a sequence number to each packet. The receiver discards the duplicate packet.

Stop-and-wait Protocol (v2)

Sender

  • Send a packet
    • Sequence # = 1 - last sequence #
  • Wait to get an ACK/NAK
    • If NAK or corrupted, resend
      • go back to waiting
    • If ACK, proceed with next packet

Receiver

  • When packet is received
    • If correct packet, send ACK
      • If Sequence # ≠ last Sequence #, deliver data to app-layer
    • If corrupted packet, send NAK

Stop-and-wait protocol (v2)

Sender:

  • Sequence number added to packet
    • Only two sequence numbers (0,1) will suffice
  • Must check if received ACK/NAK is corrupted
  • There are now twice as many states

Receiver:

  • Must check if received packet is duplicate
    • State indicates whether 0 or 1 is expected packet sequence
  • The receiver does not have a way of knowing if its last ACK/NAK received OK at sender

Stop-and-wait protocol (v2+): NAK-free

Same functionality, using ACKs only. Instead of NAK, receiver sends ACK for last packet correctly received. Receiver must explicitly include sequence # of packet being ACKed.

Duplicate ACK at sender results in same action as NAK: retransmit current packet.

Stop-and-wait Protocol (v2+): NAK-free

Sender

  • Send a packet
    • Sequence # = 1 - last sequence #
  • Wait to get an ACK/NAK
    • If ACK (& last Sequence #) or corrupted, resend
      • go back to waiting
    • If ACK (& Sequence #), proceed with next packet

Receiver

  • When packet is received
    • If correct packet, send ACK (& Sequence #)
      • If Sequence # ≠ last Sequence #, deliver data to app-layer
    • If corrupted packet, send ACK (& last Sequence #)

Lecture 9

However, packets can be delayed and lost.

Unreliable channel v2: Channel with errors and loss.

New channel assumption: the underlying channel can also lose packets (data or ACKs). Checksums, sequence #s, ACKs, and retransmissions will help, but are not enough.

Question

What is the difference between data corruption and data loss?

Data loss: packets are lost. Data corruption: bits are flipped (detected using the checksum). In both cases, we need to retransmit.

Question

How do humans handle lost sender-to-receiver words in conversation?

Approach: sender waits “reasonable” amount of time for ACK. Retransmits if no ACK received in this time.

If packet (or ACK) just delayed (not lost), then retransmission will be duplicate, but the sequence number handles this. The receiver must specify sequence number of packet being ACKed.

Use a countdown timer to interrupt after a “reasonable” amount of time.

Stop-and-wait Protocol (v3)

Sender

  • Send a packet
    • Sequence # = 1 - last sequence #
    • Set timer
  • Wait to get an ACK/NAK
    • If ACK (& last Sequence #) or corrupted, resend packet and reset timer
      • go back to waiting
    • If ACK (& Sequence #), remove timer and proceed with next packet
    • If timer goes off, resend packet and reset timer

Receiver

  • When packet is received
    • If correct packet, send ACK (& Sequence #)
      • If Sequence # ≠ last Sequence #, deliver data to app-layer
    • If corrupted packet, send ACK (& last Sequence #)

Tools: Checksum, ACK, retransmission, 1-bit sequence number, timer

The sender can’t distinguish whether the packet or the ACK was lost. So even if the ACK was lost, after the timeout the sender resends the packet (even though it was already received). The receiver receives the packet, sees that it is a duplicate, discards it, and sends the ACK back (the same ACK that was just sent).

Sender sends packet 0; receiver receives packet 0 and sends ack0. Sender receives ack0 and sends packet 1. Receiver receives packet 1 and sends ack1, but this ack1 is delayed. The sender times out and resends packet 1 (the receiver receives it, detects the duplicate, and sends ack1 again). Just before that duplicate ack1 arrives, the sender receives the delayed ack1 and sends packet 0 (the receiver receives packet 0 and sends ack0). The sender then receives the duplicate ack1 and resends packet 0; the receiver detects it as a duplicate, sends ack0 back, and the cycle of duplicates continues.

Stop-and-wait Protocol (v3)

Sender

  • Send a packet
    • Sequence # = 1 - last sequence #
    • Set timer
  • Wait to get an ACK/NAK
    • If ACK (& last Sequence #) or corrupted, do nothing
      • go back to waiting
    • If ACK (& Sequence #), remove timer and proceed with next packet
    • If timer goes off, resend packet and reset timer

Receiver

  • When packet is received
    • If correct packet, send ACK (& Sequence #)
      • If Sequence # ≠ last Sequence #, deliver data to app-layer
    • If corrupted packet, send ACK (& last Sequence #)

Updated:

Sender sends packet 0; receiver receives packet 0 and sends ack0. Sender receives ack0 and sends packet 1. Receiver receives packet 1 and sends ack1, but it is delayed. Meanwhile, the sender times out and resends packet 1. The receiver receives packet 1, detects that it is a duplicate, and sends ack1 back. In between, the sender receives the delayed ack1 and sends packet 0. The receiver receives packet 0 and sends back ack0. The sender then receives the duplicate ack1 (from the earlier retransmission) and ignores it.

Stop-and-wait protocol has a performance problem.

U_sender: utilization, the fraction of time the sender is busy sending. U_sender = (L/R) / (RTT + L/R).

Example

1 Gbps link, 15 ms prop delay, 8000 bit packet

Time to transmit packet into channel: D_trans = L/R = 8000 bits / 10^9 bits/sec = 8 microseconds. Then U_sender = (L/R) / (RTT + L/R) = 0.008 ms / 30.008 ms ≈ 0.00027, so the sender is busy only 0.027% of the time.
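The same utilization calculation in code:

```python
# Stop-and-wait utilization for the example above: U = (L/R) / (RTT + L/R).

L = 8000        # packet size, bits
R = 1e9         # link rate, bits/sec
RTT = 0.030     # 2 x 15 ms propagation delay, seconds

d_trans = L / R                       # 8 microseconds to push the packet out
u_sender = d_trans / (RTT + d_trans)  # fraction of time the sender is busy
print(d_trans, round(u_sender, 5))    # 8e-06 0.00027
```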

Pipelining: sender allows multiple, in-flight, yet-to-be-acknowledged packets.

Range of sequence numbers must be increased. Buffering at sender and/or receiver.

First packet bit transmitted at t = 0. Last bit transmitted at t = L/R. After one RTT, the ACK arrives; send the next packet at t = RTT + L/R.

3-packet pipelining increases utilization by a factor of 3.

Go-Back-N: Sender

Sender: “window” of up to N consecutive transmitted but unACKed packets.

k-bit sequence # in the packet header. Keep a window that tracks packets sent but not yet ACKed, as well as usable packets not yet sent.

We don’t keep packets that have already been acked in our window.

Cumulative ACK

ACK(n): ACKs all packets up to, and including, sequence # n

On receiving ACK(n): move window forward to begin at n + 1

Single timer for oldest in-flight packet.

Go-Back-N: Receiver

ACK-only: always send ACK for correctly-received packet so far, with highest in-order sequence #. May generate duplicate ACKs, need only remember rcv_base

On receipt of out-of-order packets:

  • Can discard (don’t buffer) or buffer: an implementation decision
  • Re-ACK packet with highest in-order sequence #

Receiver view of sequence number space: Received and ACKed, received but not ACKed (out of order), not received.

Extended FSM

ACK-only: always send an ACK for correctly-received packets so far, with the highest in-order sequence #. May generate duplicate ACKs; need only remember expectedseqnum.

Out of order packet:

  • Discard (don’t buffer)
  • Re-ACK packet with highest in-order sequence #
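A minimal sketch of this receiver logic; names like `gbn_receive` are illustrative:

```python
# Minimal sketch of the Go-Back-N receiver described above: deliver only
# in-order packets, and always (re-)ACK the highest in-order sequence #.

def gbn_receive(packets):
    """packets: iterable of (seqnum, data); returns (delivered, acks)."""
    expectedseqnum = 0
    delivered, acks = [], []
    for seq, data in packets:
        if seq == expectedseqnum:
            delivered.append(data)     # in order: deliver to the app layer
            expectedseqnum += 1
        # In order or not, ACK the highest in-order sequence # so far.
        acks.append(expectedseqnum - 1)
    return delivered, acks

# Packet 2 arrives out of order: it is discarded and ACK 0 is repeated.
d, a = gbn_receive([(0, "a"), (2, "c"), (1, "b"), (2, "c")])
assert d == ["a", "b", "c"] and a == [0, 0, 1, 2]
```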

Lecture 10

Selective Repeat:

Pipelining: Multiple packets in flight

Receiver individually ACKs all correctly received packets. Buffers packets, as needed, for in-order delivery to upper layer.

Sender maintains a timer for each unACKed packet. A timeout retransmits the single unACKed packet associated with that timer. The sender maintains a window of N consecutive sequence numbers, limiting pipelined, in-flight packets to be within this window.

Sender:

  • Data from above:
    • If next available sequence number is in window, send packet
  • Timeout(n):
    • Resend packet n, restart timer
  • ACK(n) in [sendbase, sendbase+N−1]:
    • Mark packet n as received
    • If n is the smallest unACKed packet, advance window base to next unACKed sequence number

Receiver:

  • Packet n in [rcvbase, rcvbase+N−1]:
    • Send ACK(n)
    • Out-of-order: Buffer
    • In-order: Deliver (also deliver buffered, in-order packets), advance window to next not-yet-received packet
  • Packet n in [rcvbase−N, rcvbase−1]:
    • ACK(n)
  • Otherwise:
    • Ignore

The receiver stores out of place packets in a buffer.
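A minimal sketch of the Selective Repeat receiver with buffering; the names and fixed window size are illustrative:

```python
# Sketch of the Selective Repeat receiver: ACK each packet individually,
# buffer out-of-order arrivals, deliver once the gap fills.

def sr_receive(packets, window=4):
    rcv_base = 0
    buffer, delivered, acks = {}, [], []
    for seq, data in packets:
        if rcv_base <= seq < rcv_base + window:
            acks.append(seq)              # individual ACK for this packet
            buffer[seq] = data
            while rcv_base in buffer:     # deliver the contiguous prefix
                delivered.append(buffer.pop(rcv_base))
                rcv_base += 1
        elif rcv_base - window <= seq < rcv_base:
            acks.append(seq)              # duplicate: re-ACK anyway
    return delivered, acks

d, a = sr_receive([(0, "a"), (2, "c"), (1, "b")])
assert d == ["a", "b", "c"]  # packet 2 was buffered until packet 1 arrived
assert a == [0, 2, 1]
```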

The sender does not know what is happening at the receiver end. The same sequence number appearing on a different packet breaks the protocol: sequence numbers that can be in flight at the same time must not conflict. Therefore the window size should be at most half the size of the sequence number space.

TCP guarantees reliable, in-order byte stream, but with no message boundaries.

Full duplex data: possible to send data in both directions at once, after the two processes establish a connection. (In half-duplex, each side can be sender and receiver, but not at the same time.)

Uses the pipelining approach to reliable data transfer. A combination of techniques from Go-Back-N (cumulative acks) and Selective Repeat (only retransmitting presumably lost segment). Performance optimizations like fast retransmit and delayed acks.

The interface between a sending process and TCP is a byte stream. TCP assigns a sequence number to every byte (as opposed to every segment).

It keeps track of the status of every byte.

The first byte has sequence number init_seq, and the next byte has sequence number init_seq + 1. The k-th byte has sequence number init_seq + k − 1.

TCP ACKs

  • Cumulative ACK
    • Has sequence number of next expected in-order byte
  • ACK(n)
    • All bytes in [init_seq, n − 1] are received
    • The receiver is expecting byte n next

Difference from Go-Back-N: both use cumulative ACKs, but here we buffer out-of-order packets.

  • Host A - user types ‘C’
    • Sequence = 42, ACK = 79, data = ‘C’
  • Host B - ACKs receipt of ‘C’, echoes back ‘C’
    • Sequence = 79, ACK = 43, data = ‘C’
  • Host A - ACKs receipt of echoed ‘C’
    • Sequence = 43, ACK = 80

This is bi-directional communication. Two data streams, one in each direction, each with its own sequence number space.

TCP Sender

Event: data received from application

Create segment with sequence #. Sequence # is byte-stream offset of first data byte in segment. Start timer if not already running. Think of timer as for oldest unACKed segment. Expiration interval: TimeOutInterval

Event: Timeout

  • Retransmit segment that caused timeout
  • Restart timer

Event: ACK received

  • If ACK acknowledges previously unACKed segments: update what is known to be ACKed, and restart the timer if there are still unACKed segments

Question

How to set TCP timeout value?

Longer than RTT, but RTT varies.

Too short: premature timeout, unnecessary retransmission

Too long: slow reaction to segment loss

Timeout value shouldn’t be constant, it should adapt to network conditions. Longer timer during peak hours.

Question

How to estimate RTT?

SampleRTT: measured time from segment transmission until ACK receipt. Ignore retransmissions

SampleRTT will vary, want estimated RTT “smoother”. Average several recent measurements, not just current SampleRTT.

EstimatedRTT = (1 − α) × EstimatedRTT + α × SampleRTT

Exponentially weighted moving average (EWMA). The influence of a past sample decreases exponentially fast. Typical value: α = 0.125.

Timeout interval: Estimated RTT plus “safety margin”

Large variation in EstimatedRTT: want a larger safety margin.

TimeoutInterval = EstimatedRTT + 4 × DevRTT (safety margin)

Dev RTT: EWMA of Sample RTT deviation from Estimated RTT:

DevRTT = (1 − β) × DevRTT + β × |SampleRTT − EstimatedRTT| (typically, β = 0.25)
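The RTT estimation formulas can be sketched as follows; the order of the two updates follows the slide formulas, and real implementations differ slightly:

```python
# EWMA RTT estimation and timeout computation, following the formulas above
# (alpha = 0.125, beta = 0.25 are the typical values).

def update_rtt(est, dev, sample, alpha=0.125, beta=0.25):
    est = (1 - alpha) * est + alpha * sample           # EstimatedRTT
    dev = (1 - beta) * dev + beta * abs(sample - est)  # DevRTT
    return est, dev, est + 4 * dev                     # TimeoutInterval

est, dev = 0.100, 0.005               # seconds, illustrative starting values
for sample in [0.110, 0.090, 0.250]:  # the last sample is a delayed segment
    est, dev, timeout = update_rtt(est, dev, sample)
print(round(est, 4), round(timeout, 4))
```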

Optimization 1: Fast Retransmit

If the sender receives 3 additional ACKs for the same data, it resends the unACKed segment with the smallest sequence # without waiting for the timeout: it is likely that the unACKed segment was lost.

Optimization 2: Delayed ACKs

Instead of generating an ACK in response to every segment the moment it arrives, wait for some time to see if another segment arrives right afterwards, and create one ACK for both.

Lecture 11

(lec 7 38 to lec 8 25)

TCP congestion control: AIMD

Approach: Senders can increase sending rate until packet loss (congestion) occurs, then decrease sending rate on loss event.

Additive Increase: Increase sending rate by 1 maximum segment size every RTT until loss detected.

Multiplicative Decrease: Cut sending rate in half at each loss detected by triple duplicate ACK (TCP Reno). Cut to 1 MSS (maximum segment size) when loss is detected by timeout (TCP Tahoe).

AIMD sawtooth behaviour: probing for bandwidth. It is a distributed, asynchronous algorithm that has been shown to optimize congested flow rates network-wide and to have desirable stability properties.
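The sawtooth can be seen in a toy simulation; the loss threshold here is an artificial stand-in for the bottleneck capacity:

```python
def aimd(rounds, loss_threshold, cwnd=1.0):
    """Trace cwnd over `rounds` RTTs: +1 MSS per RTT, halved on loss."""
    trace = []
    for _ in range(rounds):
        trace.append(cwnd)
        if cwnd >= loss_threshold:   # loss detected (triple duplicate ACK)
            cwnd /= 2                # multiplicative decrease
        else:
            cwnd += 1                # additive increase
    return trace

print(aimd(8, loss_threshold=4))     # sawtooth: climbs to 4, halves, repeats
```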

Lecture 12

Question

Is there a better way than AIMD to probe for usable bandwidth?

W_max: sending rate at which congestion loss was detected. The congestion state of the bottleneck link probably hasn't changed much. After cutting rate/window in half on loss, initially ramp to W_max faster, but then approach W_max more slowly. The shape is more cubic (hence TCP CUBIC).

K: point in time when the TCP window size will reach W_max. K itself is tunable.

Increase W as a function of the cube of the distance between the current time t and K: larger increases when further away from K, smaller increases when nearer K.

TCP CUBIC is the default in Linux and the most popular TCP among popular Web servers.
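The growth rule above can be sketched as the cubic function W(t) = C · (t − K)³ + W_max. The constant C = 0.4 and decrease factor β = 0.7 used here are common defaults, but treat them as illustrative:

```python
def cubic_window(t, w_max, c=0.4, beta=0.7):
    """CUBIC window at time t since the last loss: C*(t - K)^3 + W_max."""
    # K: time at which the window climbs back to W_max, given that the
    # window was cut to beta * W_max on loss.
    k = ((w_max * (1 - beta)) / c) ** (1 / 3.0)
    return c * (t - k) ** 3 + w_max

# Fast growth right after the cut, flattening out as t approaches K:
print([round(cubic_window(t, 100.0), 1) for t in range(6)])
```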

TCP (classic, CUBIC) increases TCP's sending rate until packet loss occurs at some router's output: the bottleneck link.

It is useful to focus on the congested bottleneck link. Senders should decrease their sending rates earlier, hopefully before packet loss, thus avoiding costly packet loss and retransmission.

Delay-based TCP congestion control

Keep sender-to-receiver pipe “just full enough, but no fuller”: keep bottleneck link busy transmitting, but avoid high delays/buffering.

Delay-based approach:

  • RTT_min: minimum observed RTT (uncongested path)
  • Uncongested throughput with congestion window cwnd is cwnd/RTT_min.

Measured throughput = bytes sent in last RTT interval / RTT

  • If measured throughput is very close to the uncongested throughput: increase cwnd linearly.
  • If measured throughput is far below the uncongested throughput: decrease cwnd linearly.

A number of deployed TCPs take a delay-based approach. Bottleneck Bandwidth and Round-trip propagation time (BBR) is deployed on Google's internal backbone network.
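A sketch of the delay-based rule above; the 0.9 "close enough" threshold and the unit step are assumptions for illustration, not part of any specific protocol:

```python
def adjust_cwnd(cwnd, bytes_last_rtt, rtt, rtt_min, close=0.9, step=1.0):
    """One delay-based adjustment: measured vs uncongested throughput."""
    measured = bytes_last_rtt / rtt    # throughput actually achieved
    uncongested = cwnd / rtt_min       # throughput if path had no queueing
    if measured >= close * uncongested:
        return cwnd + step             # pipe not full yet: increase linearly
    return max(step, cwnd - step)      # delays building up: decrease linearly

print(adjust_cwnd(cwnd=10.0, bytes_last_rtt=9.5, rtt=1.0, rtt_min=1.0))
```

The point is that the sender backs off when RTT grows (queues building) rather than waiting for an actual loss.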

Explicit Congestion Notification (ECN)

TCP deployments often implement network-assisted congestion control. Two bits in the IP header (ToS field) are marked by a network router to indicate congestion; the marking policy is chosen by the network operator. The congestion indication is carried to the destination, which sets the ECE bit on ACK segments to notify the sender of congestion. This involves both IP (IP-header ECN bit marking) and TCP (TCP-header C, E bit marking).

  • C, E: Congestion notification
  • ECE: Explicit Congestion Notification Echo
  • CWR: Congestion Window Reduced

TCP fairness goal: If K TCP sessions share the same bottleneck link of bandwidth R, each should have an average rate of R/K.

flowchart LR
a[TCP connection 1]
b[TCP connection 2]
c[Bottleneck router Capacity R]
a ---> c
b ---> c

Question

Is TCP Fair?

Assumptions:

  • Suppose each connection is transferring a large file
  • There is no UDP traffic passing through the bottleneck link
  • Only two TCP connections sharing a single link with transmission rate R
  • Assume that the two connections have the same MSS and RTT
  • Ignore the slow-start phase of TCP
  • Assume the TCP connections are operating in CA mode (AIMD) at all times

Example

Two competing TCP sessions

Additive increase gives slope of 1, as throughput increases. Multiplicative decrease decreases throughput proportionally.

Question

Is TCP fair?

Yes, under idealized assumptions (same RTT and fixed number of sessions only in congestion avoidance).
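Under those idealized assumptions, a toy simulation shows why AIMD drives two flows toward the fair share: additive increase leaves the gap between their rates unchanged, while each multiplicative decrease halves it. The starting rates below are arbitrary.

```python
def aimd_two_flows(r, x1, x2, rounds):
    """Two AIMD flows sharing a bottleneck of capacity r."""
    for _ in range(rounds):
        if x1 + x2 >= r:               # bottleneck overloaded: both see loss
            x1, x2 = x1 / 2, x2 / 2    # gap between flows is halved
        else:
            x1, x2 = x1 + 1, x2 + 1    # gap between flows is unchanged
    return x1, x2

x1, x2 = aimd_two_flows(r=100, x1=10.0, x2=70.0, rounds=200)
print(round(x1, 2), round(x2, 2))      # the two rates end up nearly equal
```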

Do all networks need to be fair?

Multimedia apps often do not use TCP. Do not want rate throttled by congestion control. Instead, use UDP to send audio/video at a constant rate, tolerate packet loss.

There is no “internet police” policing use of congestion control.

Applications can open multiple parallel connections between two hosts.

Evolving transport-layer functionality. TCP, UDP: principal transport protocols for 40 years.

Moving transport-layer functions to application layer, on top of UDP (HTTP/3: QUIC).

QUIC: Quick UDP Internet Connections

Application-layer protocol, on top of UDP. Increase performance of HTTP, deployed on many Google servers, apps (Chrome, mobile YouTube app).

In the normal case, we have one RTT for the TCP handshake (transport layer), then another RTT for the TLS handshake (security), then data. This is 2 serial handshakes.

QUIC allows for connection to be established and secure keys to be exchanged in one RTT. This is a single handshake.

QUIC streams (parallelism), no HOL blocking.

Network-Layer Services

The transport layer does not concern itself with how data gets from sending host to the receiving host. It assumes there is a direct channel between the two. But in reality, there is rarely a direct channel between every pair of hosts in a network.

Networks provide the illusion of a direct channel between the sending and receiving host to the transport layer over a shared network.

On the end hosts:

  • Sender encapsulates transport segments into datagrams and passes them to the link layer
  • Receiver extracts segments from datagrams and delivers them to the transport layer.

All routers run network layer protocols and transmit datagrams from source to destination.

Question

What kinds of guarantees can this channel provide?

Examples for individual datagrams:

  • Guaranteed delivery
  • Guaranteed delivery with less than 40 msec delay

Example for a flow of datagrams:

  • In-order datagram delivery
  • Guaranteed minimum bandwidth to flow
  • Restrictions on changes in inter-packet spacing

Internet is a best effort service model. There are no guarantees on

  1. Successful datagram delivery to destination
  2. Timing or order of delivery
  3. Bandwidth available to end-end flow

Lecture 13

Addressing:

  • Each host needs an identifier so the network can distinguish between traffic from/to different hosts

Routing:

  • Determine the best path from the source to the destination host

Forwarding:

  • Move packet along the path, i.e., on each router, move packets from the input link to the appropriate output link

Assume each host has a special identifier. Over the Internet, these are IP addresses. The destination address is included in the arriving packet.

Determining the best path from the source to the destination host is done by the routing algorithms.

  • # of hops
  • Congestion
  • Distance

Every router has a local forwarding table that maps each destination to an output link.

Routing vs Forwarding

Routing (Global action):

  • Determining paths to get packets to their destination
  • Populate forwarding tables of routers/switches

Forwarding (Local action):

  • Moving arriving packets from router’s input link to appropriate router output link

Routing protocol goal: Determine “good” paths (equivalently, routes), from sending hosts to receiving hosts, through network of routers.

Path: Sequence of routers packets traverse from given initial source host to final destination host.

Good: Least “cost”, “fastest”, “least congested”

Routing: a “top-10” networking challenge.

Graph abstraction: Link costs

c(x, y): cost of the direct link connecting x and y.

Example

Cost is defined by network operator. Could always be 1, or inversely related to bandwidth, or related to congestion.

Graph: G = (N, E)

N: set of routers, or "end points" connected to a router

E: set of links

If you are reading this and have read my other notes, hopefully you’ve also come to the conclusion that everything is a graph.

Routing algorithm approaches

Link state: All routers have complete topology and link cost info.

Distance vector: Routers only know about the state of their attached neighbours. Gets updated through an iterative process of computation and exchanging info with neighbours.

Link-state routing

There are two components; information propagation and path computation.

Information propagation: each node lets the other nodes know about what it is directly connected to and at what cost. That way, each node can build a picture of what the network graph looks like.

Path computation: Each node can use its local view of the complete network graph to compute the least-cost path to each destination.

Lecture 14

Link-state routing - Information Propagation

Each router periodically generates a message including its ID (not the same as network-layer identifiers such as IP), all the nodes it is directly connected to, and the cost of the links connecting it to its direct neighbours.

Information is propagated to all routers. The message is sent to the neighbours, they will send it to their neighbours etc.

Eventually all routers have the same information (topology, link costs, which routers are connected to which end hosts).

Each node generates messages both periodically and when something changes. Each node forwards the updates it receives to its neighbours.

Link-state routing - Path computation

Each router computes least cost paths from itself to all other nodes so it will know the least-cost path to reach each end host. When it gets a datagram destined to a certain end host, it knows which link it should forward the datagram to.

Since every router will eventually have the same local version of the topology, the path they pick independently will be consistent with each other.

Question

How does the router compute the least-cost path?

Dijkstra’s algorithm (CS341) runs on each router and computes the least-cost path from that router to all other nodes. Iterative: after k iterations, it knows the least-cost paths to k destinations; it iterates until it knows the least-cost paths to all destinations.

Notation:

  • c(x, y): cost of the direct link from node x to y; c(x, y) = ∞ if x and y are not direct neighbours
  • D(v): current estimate of the cost of the least-cost path from the source to destination v
  • N′: set of nodes whose least-cost path is definitively known
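With that notation, the computation can be sketched compactly; the 5-node topology and link costs below are made up for illustration:

```python
import heapq

def dijkstra(graph, source):
    """Least cost from source to every node; graph: {node: {nbr: cost}}."""
    dist = {source: 0}
    done = set()                          # N': least cost definitively known
    pq = [(0, source)]
    while pq:
        d, u = heapq.heappop(pq)
        if u in done:
            continue
        done.add(u)
        for v, cost in graph[u].items():  # relax edges out of u
            if v not in done and d + cost < dist.get(v, float("inf")):
                dist[v] = d + cost
                heapq.heappush(pq, (dist[v], v))
    return dist

graph = {
    "u": {"v": 2, "w": 5, "x": 1},
    "v": {"u": 2, "w": 3, "x": 2},
    "w": {"u": 5, "v": 3, "x": 3, "y": 1},
    "x": {"u": 1, "v": 2, "w": 3, "y": 1},
    "y": {"w": 1, "x": 1},
}
print(dijkstra(graph, "u"))
```

Note how w is reached at cost 3 via x and y, not at its direct-link cost of 5.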

Routing based on least-cost: Oscillations are possible

When link costs depend on traffic volume, route oscillations possible.

Example

Routing to a given destination, with traffic entering at three nodes at rates 1, e (e < 1), and 1. Link costs are directional and volume-dependent.

Distance Vector routing algorithms.

Suppose node x has neighbours v1, …, vn. The least-cost path from node x to node y will pass through one of x's neighbours.

To find its least-cost path to y, x doesn't necessarily need to build the entire network graph.

It only needs to know

  • D_v(y): the distance from each neighbour v to y
  • c(x, v): the cost of the direct link from x to each neighbour v

Based on Bellman-Ford (dynamic programming, CS341):

Let D_x(y) be the cost of the least-cost path from x to y. Then, D_x(y) = min_v { c(x, v) + D_v(y) }, where the minimum is taken over all neighbours v of x.

A link-state packet has info about direct neighbours only, and it is sent to all nodes in the network.

Distance vector - only direct neighbours communicate with node. The information in that communication is the distance to every single node.

Key idea: from time-to-time, each node sends its own distance vector estimate to neighbours.

When x receives a new DV estimate from any neighbour, it updates its own DV using the B-F equation: D_x(y) ← min_v { c(x, v) + D_v(y) } for each node y in N.

Under minor, natural conditions, the estimates D_x(y) converge to the actual least costs d_x(y).
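One such update step can be sketched directly from the B-F equation; the link costs and neighbour DVs below are hypothetical:

```python
def update_dv(link_cost, neighbour_dvs, destinations):
    """Recompute x's distance vector from its neighbours' DVs:
    D_x(y) = min over neighbours v of { c(x, v) + D_v(y) }."""
    dv = {}
    for y in destinations:
        dv[y] = min(link_cost[v] + neighbour_dvs[v].get(y, float("inf"))
                    for v in link_cost)
    return dv

link_cost = {"y": 2, "z": 7}                 # c(x, y), c(x, z)
neighbour_dvs = {
    "y": {"x": 2, "y": 0, "z": 1},           # D_y(.), as reported by y
    "z": {"x": 7, "y": 1, "z": 0},           # D_z(.), as reported by z
}
print(update_dv(link_cost, neighbour_dvs, ["y", "z"]))
```

Here x reaches z at cost 3 via y (2 + 1) rather than 7 over the direct link.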

Each node: waits for a change in local link cost or a message from a neighbour; recomputes DV estimates using the DVs received from neighbours; if its DV to any destination has changed, notifies its neighbours.

Iterative, asynchronous: Each local iteration is caused by:

  • Local link cost change
  • DV update message from neighbour

Distributed, self-stopping: Each node notifies neighbours only when its DV changes

  • Neighbours then notify their neighbours, only if necessary
  • No notification received: no actions taken

Distance vector is asynchronous. Routers are not all synchronized with each other.

Link cost changes:

  • Node detects local link cost change
  • Updates routing info, recalculates local DV
  • If DV changes, notify neighbours

Node y detects the link-cost change, updates its DV, and informs its neighbours.

Node z receives the update from y, updates its DV, computes a new least cost to x, and sends its neighbours its DV.

Node y receives z's update and updates its DV. y's least costs do not change, so y does not send a message to z.

Good news travels fast, but bad news travels slow (count-to-infinity problem).

Comparison of LS and DV

Messages:

  • LS: Each router’s Advertisement will have to be propagated to all the other routers
  • DV: Several messages exchanged between neighbours until we converge to the least cost paths; convergence time varies

Speed of convergence:

If you change the costs, how long until routes are stable again?

  • LS: Converges when messages about the change propagate. Dijkstra’s algorithm for least-cost path computation has to run.
  • DV: May have routing loops. Count-to-infinity problem

Lecture 15

For each destination we would need a row in the forwarding table. This table would be too large, since it would need billions of entries, and routing algorithms would have to run across millions of routers.

A flat network will not scale. Internet and its best-effort network layer are designed with hierarchy in mind. Addresses are hierarchical, hosts that are close together have addresses that can be combined into one group identifier. Routing is also hierarchical - routers learn how to reach groups of addresses.

IP: Internet’s best effort network-layer protocol.

Datagram format:

IPv4 is the first and most widely deployed Internet protocol, despite many problems in today's Internet.

IPv6 is its replacement, but it has taken a long time to deploy and use.

IPv4 Datagram format. There is an overhead of 20 bytes of TCP + 20 bytes of IP = 40 bytes, plus app-layer overhead, with every TCP + IP segment.

Addressing:

IP address: 32-bit identifier associated with each host or router interface.

Interface: Connection between host/router and physical link. Routers typically have multiple interfaces. A host typically has one or two interfaces (wired and wireless).

Question

How are interfaces (without an intervening router) actually connected?

This is a question for later.

IP addressing properties

Global uniqueness: Identifies host

Hierarchical structure:

  • Consists of subnet part + host part
  • Is necessary for Internet to scale to large number of hosts
  • Aids Internet routing

Question

What is a subnet?

Device interfaces that can physically reach each other without passing through an intervening router.

IP addresses have structure:

  • Subnet part (prefix): Devices in same subnet have common high order bits
  • Host part: Remaining low order bits

Subnets

Recipe for defining subnets: detach each interface from its host or router, creating “islands” of isolated networks. Each isolated network is called a subnet.

IP addressing: CIDR

Classless InterDomain Routing:

Subnet portion of address is of arbitrary length. Address format: a.b.c.d/x, where x is the number of bits in the subnet portion of the address.

Question

What is the range of subnet 200.23.16.64/26?

This leaves 6 free bits, i.e., 64 (2^6) addresses. The last 8 bits are of the format 01xx xxxx. The range is 200.23.16.64 - 200.23.16.127.
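The worked example can be checked with Python's standard ipaddress module:

```python
import ipaddress

net = ipaddress.ip_network("200.23.16.64/26")
print(net.num_addresses)                            # 64 addresses (2^6)
print(net.network_address, net.broadcast_address)   # first and last address
```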

Computers represent subnet masks in binary only.

Question

What is the subnet mask for /24?

11000? No! It would be 11111111.11111111.11111111.00000000 (i.e., 255.255.255.0). Each mask bit indicates whether the corresponding address bit belongs to the subnet part or not.
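The same conversion in code: build x one-bits followed by (32 − x) zero-bits and print the result in dotted decimal:

```python
def prefix_to_mask(prefix):
    """Dotted-decimal subnet mask for a /prefix CIDR length."""
    mask = (0xFFFFFFFF << (32 - prefix)) & 0xFFFFFFFF   # prefix ones, rest zeros
    return ".".join(str((mask >> s) & 0xFF) for s in (24, 16, 8, 0))

print(prefix_to_mask(24))   # 255.255.255.0
print(prefix_to_mask(26))   # 255.255.255.192
```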

Lecture 16