Professor: Arne Storjohann | Term: Fall 2024

Module 1

Much of Computer Science is problem solving: Write a program that converts the given input to the expected output.

When first learning to program, we emphasize correctness. With this course, we will also be concerned with efficiency. We will study efficient methods of storing, accessing, and organizing large collections of data.

Typical operations include: Inserting new data items, deleting data items, searching for specific data items, sorting.

We consider various abstract data types (ADTs), and how to realize them efficiently using appropriate data structures. There is a strong emphasis on mathematical analysis in this course.

Useful Logarithm Facts:

  • $c = \log_b a$ means $b^c = a$
  • $\log n$ in this course means $\log_2 n$
  • $\log(ab) = \log a + \log b$, $\log(a^c) = c \log a$, $\log_b a = \frac{\log_c a}{\log_c b}$
  • $a^{\log_b c} = c^{\log_b a}$, $\frac{d}{dx} \ln x = \frac{1}{x}$

Useful Sums:

  • $\sum_{i=1}^{n} i = \frac{n(n+1)}{2} \in \Theta(n^2)$
  • $\sum_{i=1}^{n} i^k \in \Theta(n^{k+1})$ if $k \ge 0$.

Geometric Sequences: $\sum_{i=0}^{n-1} r^i = \frac{r^n - 1}{r - 1}$ for $r \neq 1$; this is $\Theta(r^{n-1})$ if $r > 1$, $\Theta(n)$ if $r = 1$, and $\Theta(1)$ if $0 < r < 1$.

Harmonic Sequence: $H_n = \sum_{i=1}^{n} \frac{1}{i} \in \Theta(\log n)$ (more precisely, $\ln n \le H_n \le \ln n + 1$).

Algorithms and Problems

Problem: Description of possible input and desired output.

Problem Instance: One possible input for the specified problem.

Algorithm: Step-by-step process for carrying out a series of computations, given an arbitrary problem instance $I$.

Solving a problem: An algorithm $\mathcal{A}$ solves a problem $\Pi$ if, for every instance $I$ of $\Pi$, $\mathcal{A}$ computes a valid output for $I$ in finite time.

Program: A program is an implementation of an algorithm using a specified computer language.

Pseudocode: Communicate an algorithm to another person.

insertion-sort(A, n)
A: array of size n
for (i <- 1; i < n; i++) do
	for (j <- i; j > 0 and A[j-1] > A[j]; j--) do
		swap A[j] and A[j-1]

Pseudocode omits obvious details, but should be precise about exit conditions.
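As a sanity check, the pseudocode above translates directly into Python (a sketch only; the course's pseudocode is language-agnostic):

```python
def insertion_sort(A):
    """In-place insertion sort, mirroring the pseudocode above."""
    n = len(A)
    for i in range(1, n):                    # A[0..i-1] is already sorted
        j = i
        while j > 0 and A[j - 1] > A[j]:
            A[j - 1], A[j] = A[j], A[j - 1]  # swap A[j] and A[j-1]
            j -= 1
    return A
```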

From a problem $\Pi$ to a program that solves it:

  • Design an algorithm $\mathcal{A}$ that solves $\Pi$ (Algorithm Design).
  • Assess the correctness and efficiency of $\mathcal{A}$ (Algorithm Analysis). Correctness is covered in CS245.
  • If acceptable, implement the algorithm.
  • If there are multiple acceptable algorithms, run experiments to determine the best solution.

This course will focus on the first two points.

Question

What do we mean by efficiency?

In this course, we are primarily concerned with the amount of time a program takes to run (run time).

We also may be interested in the amount of additional memory the program requires (auxiliary space).

The amount of time and or memory required by a program will usually depend on the given problem instance.

RAM model:

  • Each memory cell stores one datum, typically a number, character, or reference.
  • Any access to a memory location takes constant time.
  • Any primitive operation takes constant time.

We compare algorithms by considering their growth rate: what is the behaviour of an algorithm's run-time as the instance size $n$ gets large?

We use order notation (big-$O$ and friends): ignore constants and lower-order terms.

We study relationships between functions.

Definition

$O$-notation: $f(n) \in O(g(n))$ ($f(n)$ is asymptotically upper-bounded by $g(n)$) if there exist constants $c > 0$ and $n_0 > 0$ such that $|f(n)| \le c\,|g(n)|$ for all $n \ge n_0$.

In order to prove that $f(n) \in O(g(n))$ from first principles, we need to find constants $c$ and $n_0$ such that $|f(n)| \le c\,|g(n)|$ for all $n \ge n_0$.

Many, but not all, choices of $c$ and $n_0$ will work.

Note that $O$ gives only an upper bound: for example, we also have $n \in O(n^2)$. We want a tight asymptotic bound.
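As a worked instance of a first-principles proof, consider showing $2n^2 + 3n + 11 \in O(n^2)$ (the constants chosen here are one of many valid choices):

```latex
2n^2 + 3n + 11 \;\le\; 2n^2 + 3n^2 + 11n^2 \;=\; 16n^2 \quad \text{for all } n \ge 1,
```

so $c = 16$ and $n_0 = 1$ satisfy the definition.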

Definition

$\Omega$-notation: $f(n) \in \Omega(g(n))$ ($f(n)$ is asymptotically lower-bounded by $g(n)$) if there exist constants $c > 0$ and $n_0 > 0$ such that $|f(n)| \ge c\,|g(n)|$ for all $n \ge n_0$.

Definition

$\Theta$-notation: $f(n) \in \Theta(g(n))$ ($f(n)$ is asymptotically tightly-bounded by $g(n)$) if there exist constants $c_1 > 0$, $c_2 > 0$ and $n_0 > 0$ such that $c_1\,|g(n)| \le |f(n)| \le c_2\,|g(n)|$ for all $n \ge n_0$.

Equivalently: $f(n) \in O(g(n))$ and $f(n) \in \Omega(g(n))$.

We also say that the growth rates of $f(n)$ and $g(n)$ are the same.

How growth rates affect running time: it is instructive to see how the running time changes when the size of the problem instance doubles.

For example, if $T(n) = cn$ then $T(2n) = 2\,T(n)$; if $T(n) = cn^2$ then $T(2n) = 4\,T(n)$; and if $T(n) = c \log n$ then $T(2n) = T(n) + c$.

Question

How do we express that $f(n)$ grows strictly slower than $g(n)$?

Definition

$o$-notation: $f(n) \in o(g(n))$ ($f(n)$ is asymptotically strictly smaller than $g(n)$) if for all constants $c > 0$, there exists a constant $n_0 > 0$ such that $|f(n)| \le c\,|g(n)|$ for all $n \ge n_0$.

Definition

$\omega$-notation: $f(n) \in \omega(g(n))$ ($f(n)$ is asymptotically strictly larger than $g(n)$) if for all constants $c > 0$, there exists a constant $n_0 > 0$ such that $|f(n)| \ge c\,|g(n)|$ for all $n \ge n_0$.

Suppose that $f(n) > 0$ and $g(n) > 0$ for all $n \ge n_0$. Suppose that

$L = \lim_{n \to \infty} \frac{f(n)}{g(n)}$ exists (possibly $\infty$). Then:

$f(n) \in o(g(n))$ if $L = 0$; $\quad f(n) \in \Theta(g(n))$ if $0 < L < \infty$; $\quad f(n) \in \omega(g(n))$ if $L = \infty$.

The limit can often be computed using l'Hôpital's rule. This result gives sufficient but not necessary conditions for the stated conclusions to hold.
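For instance, to compare $n \log n$ and $n^2$ via the limit rule (using l'Hôpital's rule on the remaining quotient, with $\log$ base 2):

```latex
\lim_{n \to \infty} \frac{n \log n}{n^2}
  = \lim_{n \to \infty} \frac{\log n}{n}
  = \lim_{n \to \infty} \frac{1 / (n \ln 2)}{1}
  = 0,
```

hence $n \log n \in o(n^2)$.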

Many rules are easily proved from first principle.

Identity rule: $f(n) \in \Theta(f(n))$.

Transitivity:

  • If $f(n) \in O(g(n))$ and $g(n) \in O(h(n))$ then $f(n) \in O(h(n))$.
  • If $f(n) \in \Omega(g(n))$ and $g(n) \in \Omega(h(n))$ then $f(n) \in \Omega(h(n))$.
  • If $f(n) \in \Theta(g(n))$ and $g(n) \in \Theta(h(n))$ then $f(n) \in \Theta(h(n))$.

Maximum rule: Suppose that $f(n) > 0$ and $g(n) > 0$ for all $n \ge n_0$. Then $f(n) + g(n) \in \Theta(\max\{f(n), g(n)\})$.

Relationships between order notations:

  • $f(n) \in \Theta(g(n)) \iff g(n) \in \Theta(f(n))$
  • $f(n) \in O(g(n)) \iff g(n) \in \Omega(f(n))$
  • $f(n) \in o(g(n)) \iff g(n) \in \omega(f(n))$
  • $f(n) \in o(g(n)) \implies f(n) \in O(g(n))$ and $f(n) \notin \Omega(g(n))$

Avoid doing arithmetic with asymptotic notations; for example, do not write expressions like $O(n) + O(n) = O(n)$ as if the $O$-terms were numbers.

Techniques for run-time analysis:

  • Identify primitive operations that require $\Theta(1)$ time.
  • The complexity of a loop is expressed as the sum of the complexities of each iteration of the loop.
  • Nested loops: start with the innermost loop and proceed outwards. This gives nested summations.

Let $T_{\mathcal{A}}(I)$ denote the running time of an algorithm $\mathcal{A}$ on instance $I$.

Worst-case (best-case) complexity of an algorithm: The worst-case (best-case) running time of an algorithm $\mathcal{A}$ is a function mapping $n$ to the longest (shortest) running time for any input instance of size $n$: $T_{\mathcal{A}}(n) = \max\{ T_{\mathcal{A}}(I) : I \text{ is an instance of size } n \}$ (with $\max$ replaced by $\min$ for the best case).

Average-case complexity of an algorithm: The average-case running time of an algorithm $\mathcal{A}$ is a function mapping $n$ to the average running time of $\mathcal{A}$ over all instances of size $n$: $T^{avg}_{\mathcal{A}}(n) = \frac{1}{|\{I : \mathrm{size}(I) = n\}|} \sum_{\mathrm{size}(I) = n} T_{\mathcal{A}}(I)$.

Question

Suppose algorithm $\mathcal{A}_1$ has worst-case run-time $O(n^3)$ and algorithm $\mathcal{A}_2$ has worst-case run-time $O(n^2)$, and both solve the same problem. Is $\mathcal{A}_2$ more efficient?

No! $O$-notation is only an upper bound. $\mathcal{A}_1$ may well have worst-case run-time $\Theta(n)$. We should always give $\Theta$-bounds.

Question

Suppose the worst-case run-times are $\Theta(n^3)$ and $\Theta(n^2)$, respectively. We consider $\mathcal{A}_2$ to be better. But is it always more efficient?

No! The worst-case run-time of $\mathcal{A}_1$ may only be achieved on some instances. Possibly $\mathcal{A}_1$ is better on most instances.

Merge-sort

Step 1: Describe the overall idea

Input: Array $A$ of $n$ integers.

  1. We split $A$ into subarrays $A_L$ and $A_R$ that are roughly half as big.
  2. Recursively sort $A_L$ and $A_R$.
  3. After $A_L$ and $A_R$ have been sorted, use a function merge to merge them into a single sorted array.

Step 2: Give pseudo-code or detailed description

merge-sort(A, n)
if (n <= 1) then return
else
	m <- (n-1)/2
	merge-sort(A[0..m], m+1)
	merge-sort(A[m+1..n-1], n-m-1)
	merge(A[0..m], A[m+1..n-1])

Do not pass the array by value; instead, indicate the range of the array that needs to be sorted. merge uses an auxiliary array $S$. Allocate $S$ only once.

merge(A,l,m,r,S)
A[0..n-1] is an array, A[l..m] is sorted, A[m+1..r] is sorted
S[0..n-1] is an array
copy A[l..r] into S[l..r]
(i_L, i_R) <- (l, m+1)
for (k <- l; k <= r; k++) do
	if (i_L > m) A[k] <- S[i_R ++]
	else if (i_R > r) A[k] <- S[i_L ++]
	else if (S[i_L] <= S[i_R]) A[k] <- S[i_L ++]
	else A[k] <- S[i_R ++]
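A runnable sketch of the two routines in Python (index ranges are passed explicitly and the auxiliary array S is allocated once, as the notes recommend; the default-argument handling is an implementation convenience, not part of the pseudocode):

```python
def merge(A, l, m, r, S):
    """Merge sorted runs A[l..m] and A[m+1..r] using auxiliary array S."""
    S[l:r + 1] = A[l:r + 1]          # copy A[l..r] into S[l..r]
    i_L, i_R = l, m + 1
    for k in range(l, r + 1):
        if i_L > m:                  # left run exhausted
            A[k] = S[i_R]; i_R += 1
        elif i_R > r:                # right run exhausted
            A[k] = S[i_L]; i_L += 1
        elif S[i_L] <= S[i_R]:
            A[k] = S[i_L]; i_L += 1
        else:
            A[k] = S[i_R]; i_R += 1

def merge_sort(A, l=0, r=None, S=None):
    """Sort A[l..r] in place; S is allocated once and reused."""
    if r is None:
        r = len(A) - 1
        S = [None] * len(A)          # auxiliary array, allocated only once
    if l >= r:
        return
    m = (l + r) // 2
    merge_sort(A, l, m, S)
    merge_sort(A, m + 1, r, S)
    merge(A, l, m, r, S)
```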

Step 3: Argue correctness. Typically state loop invariants. Sometimes this is obvious enough from the idea description and comments.

Step 4: Analyze the run-time

  • First analyze work done outside recursion
  • If applicable, analyze subroutines separately. If there are recursions: how big are the subproblems? The run-time then becomes a recursive function.

The recurrence relation for the run-time $T(n)$ is as follows:

$T(n) = T(\lceil n/2 \rceil) + T(\lfloor n/2 \rfloor) + \Theta(n)$ for $n > 1$, with $T(1) \in \Theta(1)$.

The following is the corresponding sloppy recurrence (exact when $n$ is a power of 2):

$T(n) = 2\,T(n/2) + \Theta(n)$

This resolves to $T(n) \in \Theta(n \log n)$.

Module 2

Definition

Abstract Data Type (ADT): A description of information and a collection of operations on that information. (Covered in CS135).

The information is accessed only through the operations.

We can have various realizations of an ADT, which specify how the information is stored and how the operations are performed.

Recall:

Stack: An ADT consisting of a collection of items with operations:

  • push: Add an item to the stack.
  • pop: Remove and return the most recently added item.

Items are removed in LIFO order. We can have extra operations size, is-empty, and top.

Queue: An ADT consisting of a collection of items with operations:

  • enqueue: Add an item to the queue
  • dequeue: Remove and return the least recently inserted item.

Items are removed in FIFO order.

We can have extra operations size, is-empty, and peek/front.

Priority Queue generalizes both ADT Stack and ADT Queue.

It is a collection of items (each with a priority or key) with operations:

  • insert: Inserting an item tagged with a priority
  • delete-max: Removing and returning an item of highest priority.

We can have extra operations: size, is-empty, get-max.

This is a maximum-oriented priority queue. A minimum-oriented priority queue replaces operation delete-max by delete-min.

Using a priority queue to sort:

PQ-Sort(A[0..n-1])
for i <- 0 to n-1 do
	PQ.insert (an item with priority A[i])
for i <- n-1 down to 0 do
	A[i] <- priority of PQ.delete-max()

Run-time: $O(\text{initialization} + n \cdot \text{insert} + n \cdot \text{delete-max})$.

Realizations of Priority Queues

Realization 1: Unsorted arrays

Run-time of operations:

  • insert: $O(1)$
  • delete-max: $O(n)$

We assume dynamic arrays, expanding by doubling as needed. Amortized over all insertions, this takes $O(1)$ extra time per insertion.

PQ-sort with this realization yields selection-sort.

Realization 2: Sorted arrays

Run-time of operations:

  • insert: $O(n)$
  • delete-max: $O(1)$

PQ-sort with this realization yields insertion-sort. insert can be implemented to have $O(1)$ best-case run-time; insertion-sort then has $O(n)$ best-case run-time.

Realization 3: Heaps

A binary heap is a certain type of binary tree. A binary tree is either empty, or consists of a node and two binary trees (left subtree and right subtree).

Level all nodes with distance from the root. Root is on level .

Height maximum number for which level contains nodes. Single-node tree has height , empty tree has height .

Any binary tree with height has nodes. So height .

A heap is a binary tree with the following two properties:

  1. Structural property: All the levels of a heap are completely filled, except possibly for the last level. The filled items in the last level are left-justified.
  2. Heap-order property: For any node $v$, the key of the parent of $v$ is larger than or equal to the key of $v$.

The full name for this is max-oriented binary heap.

Lemma: The height of a heap with $n$ nodes is $\Theta(\log n)$.

Heaps need not be stored as binary trees. Let $H$ be a heap of $n$ items and let $A$ be an array of size $n$. Store the root in $A[0]$ and continue with the elements level-by-level from top to bottom, within each level left-to-right.

It is easy to navigate using this array representation:

  • The root node is at index 0.
  • The last node is at index $n - 1$.
  • The left child of node $i$ is node $2i + 1$.
  • The right child of node $i$ is node $2i + 2$.
  • The parent of node $i$ is node $\lfloor (i-1)/2 \rfloor$.
  • These nodes exist if the index falls in the range $\{0, \dots, n-1\}$.
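The navigation rules above translate into one-line helpers (a sketch; `n` is the number of nodes, and an index is valid iff it lies in {0, ..., n-1}):

```python
def root():
    return 0

def last(n):
    """Index of the last node in a heap of n items."""
    return n - 1

def left(i):
    """Index of the left child of node i."""
    return 2 * i + 1

def right(i):
    """Index of the right child of node i."""
    return 2 * i + 2

def parent(i):
    """Index of the parent of node i (i > 0)."""
    return (i - 1) // 2
```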

We hide implementation details using helper-functions.

insert in heaps: Place the new key at the first free leaf. Use fix-up to restore heap order.

insert(x)
l <- last() + 1
A[l] <- x
increase size
fix-up(A, l)

fix-up(A, i)
while parent(i) exists and A[parent(i)].key < A[i].key do
	swap A[i] and A[parent(i)]
	i <- parent(i)

Time: $O(\text{height of heap}) = O(\log n)$ (and this is tight).

delete-max in heaps

The maximum item of a heap is just the root node. We replace the root by the last leaf (the last leaf is taken out). The heap-order property might now be violated, so we perform a fix-down.

fix-down(A,i)
while i is not a leaf do
	j <- left child of i
	if (i has right child and A[right child of i].key > A[j].key)
		j <- right child of i
	if A[i].key >= A[j].key break
	swap A[j] and A[i]
	i <- j

Time: $O(\text{height of heap}) = O(\log n)$ (and this is tight).

delete-max()
l <- last()
swap A[root()] and A[l]
decrease size
fix-down(A, root(), size)
return A[l]

Time: $O(\text{height of heap}) = O(\log n)$ (and this is tight).
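The insert and delete-max operations can be sketched in Python over a plain list (keys are stored directly rather than as items with a .key field, an assumption to keep the sketch short):

```python
def fix_up(A, i):
    """Swap A[i] upward until the heap-order property holds."""
    while i > 0 and A[(i - 1) // 2] < A[i]:
        p = (i - 1) // 2
        A[i], A[p] = A[p], A[i]
        i = p

def fix_down(A, i, n):
    """Swap A[i] downward within A[0..n-1] until heap order holds."""
    while 2 * i + 1 < n:                  # while i is not a leaf
        j = 2 * i + 1                     # left child of i
        if j + 1 < n and A[j + 1] > A[j]:
            j += 1                        # right child is larger
        if A[i] >= A[j]:
            break
        A[i], A[j] = A[j], A[i]
        i = j

def heap_insert(A, x):
    A.append(x)                           # place x at the first free leaf
    fix_up(A, len(A) - 1)

def heap_delete_max(A):
    A[0], A[-1] = A[-1], A[0]             # swap root with the last leaf
    m = A.pop()                           # take out the old root
    fix_down(A, 0, len(A))
    return m
```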

Binary heaps are a realization of priority queues where the operations insert and delete-max take $O(\log n)$ time.

Any priority queue can be used to sort in time $O(\text{initialization} + n \cdot \text{insert} + n \cdot \text{delete-max})$.

Using the binary-heaps implementation of PQs, we obtain:

PQ-sort-with-heaps(A)
for i <- 0 to n-1 do
	H.insert(A[i])
for i <- n-1 down to 0 do
	A[i] <- H.delete-max()

Both operations run in $O(\log n)$ time for heaps, so PQ-sort using heaps takes $O(n \log n)$ time.

Can improve this with two simple tricks heap-sort.

  1. Can use the same array for input and heap auxiliary space.
  2. Heaps can be built faster if we know all input in advance.

Problem: Given $n$ items all at once (in $A[0..n-1]$), build a heap containing all of them.

Solution 1: Start with an empty heap $H$ and insert the items one at a time.

simple-heap-building(A)
for i <- 0 to A.size() - 1 do
	H.insert(A[i])

This corresponds to doing $n$ fix-up's. Worst-case running time: $\Theta(n \log n)$.

We can use fix-down’s instead.

heapify(A)
n <- A.size()
for i <- parent(last()) downto root() do
	fix-down(A, i, n)

This yields a worst-case complexity of $\Theta(n)$. A heap can thus be built in linear time.
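The bottom-up heapify above can be sketched in Python (keys stored directly in a list; fix_down repeated here so the sketch is self-contained):

```python
def fix_down(A, i, n):
    """Swap A[i] downward within A[0..n-1] until heap order holds."""
    while 2 * i + 1 < n:
        j = 2 * i + 1
        if j + 1 < n and A[j + 1] > A[j]:
            j += 1
        if A[i] >= A[j]:
            break
        A[i], A[j] = A[j], A[i]
        i = j

def heapify(A):
    """Build a max-heap in place: fix-down from parent(last()) to the root."""
    n = len(A)
    for i in range((n - 2) // 2, -1, -1):
        fix_down(A, i, n)
```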

Efficient sorting with heaps: Idea is to use PQ-sort with heaps.

heap-sort(A)
n <- A.size()
for i <- parent(last()) downto 0 do
	fix-down(A, i, n)

while n > 1 do
	swap items at A[root()] and A[last()]
	decrease n
	fix-down(A, root(), n)

The for-loop takes $O(n)$ time and the while-loop takes $O(n \log n)$ time, so heap-sort runs in $O(n \log n)$ time overall.
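The whole of heap-sort fits in a few lines of Python (a self-contained sketch mirroring the pseudocode; fix_down is repeated so the block runs on its own):

```python
def fix_down(A, i, n):
    """Swap A[i] downward within A[0..n-1] until heap order holds."""
    while 2 * i + 1 < n:
        j = 2 * i + 1
        if j + 1 < n and A[j + 1] > A[j]:
            j += 1
        if A[i] >= A[j]:
            break
        A[i], A[j] = A[j], A[i]
        i = j

def heap_sort(A):
    """In-place heap-sort: heapify, then repeatedly move the max to the end."""
    n = len(A)
    for i in range((n - 2) // 2, -1, -1):   # heapify, O(n) total
        fix_down(A, i, n)
    while n > 1:
        A[0], A[n - 1] = A[n - 1], A[0]     # max goes to its final position
        n -= 1
        fix_down(A, 0, n)
```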

Problem: Find the $k$th smallest item in an array $A$ of $n$ numbers.

Solution 1: Make $k$ passes through the array, deleting the minimum number each time. Complexity: $\Theta(kn)$.

Solution 2: Sort $A$, then return $A[k-1]$. Complexity: $\Theta(n \log n)$.

Solution 3: Create a min-heap with heapify(A). Call delete-min $k$ times. Complexity: $\Theta(n + k \log n)$.
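Solution 3 can be sketched with Python's standard `heapq` module, which provides exactly the min-heap operations needed (`heapify` runs in linear time); `select_with_heap` is a hypothetical helper name:

```python
import heapq

def select_with_heap(A, k):
    """Return the element at index k of sorted(A), for 0 <= k < len(A)."""
    H = list(A)              # copy so the input is left untouched
    heapq.heapify(H)         # O(n)
    for _ in range(k):
        heapq.heappop(H)     # k deletions, O(log n) each
    return H[0]
```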

Question

We can achieve $O(n \log n)$ worst-case time easily, but can we do better?

Yes.

Module 3

We introduce (and solve) a new problem, and then analyze the average-case run-time of our algorithm.

Recall the definition of average-case run-time:

Assume that the set of size-$n$ instances is finite, and that all instances occur equally frequently. Then we can use the following simplified formula: $T^{avg}(n) = \frac{1}{|\mathcal{I}_n|} \sum_{I \in \mathcal{I}_n} T(I)$, where $\mathcal{I}_n$ denotes the set of size-$n$ instances.

To learn how to analyze this, we will do simpler examples first.

silly-test(π, n)
π: a permutation of {0, ..., n-1}, stored as an array
if π[0] = 0 then for j = 1 to n do print 'bad case'
else print 'good case'

The $(n-1)!$ permutations with $\pi[0] = 0$ have run-time $\Theta(n)$. The remaining $n! - (n-1)!$ permutations have run-time $\Theta(1)$. The average-case run-time is therefore $\Theta(1)$, since only a $\frac{1}{n}$ fraction of instances hit the expensive case.

Another not-so-contrived example

all-0-test(w,n)
// test whether all entries of bitstring w[0..n-1] are 0
if (n=0) return true
if (w[n-1] = 1) return false
return all-0-test(w,n-1)

Define $T(w)$ to be the number of bit-comparisons of all-0-test on input $w$. This is asymptotically the same as the run-time.

Worst-case run-time: always go into the recursion, until $n = 0$. $T^{worst}(n) \in \Theta(n)$.

Best-case run-time: return immediately. $T^{best}(n) \in \Theta(1)$.

Average-case run-time?

Recall: $T^{avg}(n) = \frac{1}{2^n} \sum_{w \in \{0,1\}^n} T(w)$.

Recursive formula for one non-empty bitstring $w$ of length $n$:

$T(w) = 1$ if $w[n-1] = 1$, and $T(w) = 1 + T(w[0..n-2])$ if $w[n-1] = 0$.

Natural guess for the recursive formula for $T^{???}(n)$:

$T^{???}(n) \le 1 + T^{???}(n-1)$

This holds if '???' is worst, but is then useless (it only shows $T^{worst}(n) \in O(n)$). It is not obvious whether it holds if '???' is avg.

Observe: $T(w)$ is uniquely determined by $n$ and the number of trailing 0s in $w$. Half of the length-$n$ bitstrings end in 1 (cost 1), and half end in 0 (cost $1 + T$ of the length-$(n-1)$ prefix), giving $T^{avg}(n) = 1 + \frac{1}{2}\,T^{avg}(n-1)$.

This recursion resolves to $T^{avg}(n) \le 2$, so $T^{avg}(n) \in \Theta(1)$.

Question

Why can’t we always write ‘avg’ for ‘???’ in a guessed recursion of this form?

Consider

silly-all-0-test(w,n)
if (n = 0) then return true
if (w[n-1] = 1) then return false
if (n>1) then w[n-2] <- 0
return silly-all-0-test(w,n-1)

Only one more line of code runs in each recursive call, so the same recursive formula for $T(w)$ applies, up to constants.

But observe that now, once the algorithm recurses, every subsequent test finds a 0 and the recursion runs all the way down: $T(w) = 1$ if $w$ ends in 1, and $T(w) \in \Theta(n)$ otherwise.

So $T^{avg}(n) = \frac{1}{2} \cdot 1 + \frac{1}{2} \cdot \Theta(n) \in \Theta(n)$. The ‘obvious’ recursion did not hold. Average-case analysis is highly non-trivial for recursive algorithms.
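We can check the first average exhaustively for small $n$ by enumerating all $2^n$ bitstrings and counting bit comparisons (a quick empirical verification, not part of the notes; it confirms $T^{avg}(n) \le 2$):

```python
from itertools import product

def comparisons_all_0_test(w):
    """Bit comparisons made by all-0-test on bitstring w (a tuple of 0/1)."""
    n = len(w)
    if n == 0:
        return 0
    if w[n - 1] == 1:
        return 1                                  # one comparison, then stop
    return 1 + comparisons_all_0_test(w[:n - 1])  # one comparison, then recurse

def avg_comparisons(n):
    """Average of T(w) over all 2^n bitstrings of length n."""
    strings = list(product([0, 1], repeat=n))
    return sum(comparisons_all_0_test(w) for w in strings) / len(strings)
```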

We saw SELECTION: Given an array $A$ of $n$ numbers and $0 \le k < n$, find the element that would be at position $k$ of the sorted array.

SELECTION can be done with heaps in time .

Special case: MEDIANFINDING = SELECTION with . With previous approaches, this takes time , no better than sorting.

Question

Can we do selection in linear time?

Yes, the algorithm quick-select.

quick-select and the related quick-sort rely on two subroutines:

  • choose-pivot(A): Return an index $p$ in $A$. We will use the pivot-value $v \leftarrow A[p]$ to rearrange the array.
  • partition(A,p): Rearrange $A$ and return the pivot-index $i$ so that
    • the pivot-value $v$ is in $A[i]$,
    • all items in $A[0..i-1]$ are $\le v$, and
    • all items in $A[i+1..n-1]$ are $\ge v$.

$p$: index of the pivot-value before partition (we choose it); $i$: index of the pivot-value after partition (we have no control over it).
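One way to implement partition is the Lomuto scheme, sketched here in Python (the notes do not fix a particular scheme, so this is an illustrative choice, not the course's implementation):

```python
def partition(A, p):
    """Partition A around pivot-value A[p]; return the pivot's new index i."""
    n = len(A)
    A[p], A[n - 1] = A[n - 1], A[p]   # move the pivot-value out of the way
    v = A[n - 1]
    i = 0                             # boundary of the "<= v" block
    for k in range(n - 1):
        if A[k] <= v:                 # items <= v go to the left block
            A[k], A[i] = A[i], A[k]
            i += 1
    A[i], A[n - 1] = A[n - 1], A[i]   # place the pivot-value at index i
    return i
```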