Professor: Pascal Poupart | Term: Winter 2026

Lecture 1

History of AI

  • Philosophy (reasoning, planning, learning, science, automation)
  • Mathematics (logic, probability, optimization)
  • Neuroscience (neurons, adaptation)
  • Economics

An agent is an entity that perceives and acts.

A rational agent selects actions that maximize its expected utility. Characteristics of the sensors, actuators, and environment dictate techniques for selecting rational actions.

This course is about:

  • General AI techniques for many problem types
  • Learning to choose and apply the technique

A rational agent maps percepts to actions.

Caveat: Computational limitations and environmental constraints mean we do not have perfect rationality.

To design a rational agent, the task must be defined:

  • Performance measures
  • Environment
  • Actuators
  • Sensors

Properties of the task environment determine which AI approach is most suitable.

  • Fully Observable vs Partially Observable
    • sensors give access to complete state of environment?
  • Deterministic vs Stochastic
    • is the next state completely determined by the current state and the action executed?
  • Episodic vs Sequential
    • does the current decision / action influence future decisions or actions?
  • Discrete vs Continuous
    • How are the states, time, actions modelled?
  • Static vs Dynamic
    • Is the environment changing as the agent is planning what to do next?
  • Single Agent vs Multiagent

Topics covered:

  • Search
    • Uninformed and Heuristic Search
    • Constraint Satisfaction
  • Reasoning Under Uncertainty
    • Probability Theory and Decision Theory
    • Probabilistic Inference, Causal Inference
    • Bayesian networks, Markov decision processes
  • Learning
    • Decision Trees, Statistical Learning, Neural Networks
    • Bandits, Reinforcement Learning
  • Multiagent systems
    • Game-tree search, Game theory, Multiagent Reinforcement Learning

Lecture 2

Planning agents decide by evaluating future action sequences. They usually have a definite goal, and the optimal solution achieves the goal at the least cost.

A search problem consists of:

  • A state space
  • An initial state
  • Actions in each state
  • Successor function (what is the next state we will be in based on some state and action)
  • A goal test
    • e.g., Pac-Man: has no dots left
  • Action cost
    • Pac-Man: +1 per step; −10 food; −500 win; +500 die; −200 eat ghost (rewards count as negative costs)

A solution is an action sequence that reaches a goal state.

An optimal solution has least cost among all solutions.

Search problem is a model of the real world, not the real world itself.

Question

What is in a State Space?

The world state includes every last detail of the environment

A search state keeps only the details needed for planning (abstraction)

Problem: Pathing (= path finding)

  • States: (x, y) location
  • Actions: NSEW
  • Transition: update the (x, y) value
  • Goal test: is (x, y) = destination?

Problem: Eat-All-Dots

  • States: Pac-man location, boolean for each food
  • Actions: NSEW
  • Transition: update location and possibly a dot boolean
  • Goal test: dots all false
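As a concrete sketch (a hypothetical representation, not prescribed by the notes), the eat-all-dots search state can be an immutable record; freezing it makes states hashable, which is useful later for detecting repeated states:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class State:
    pos: tuple    # Pac-Man's (x, y) location
    dots: tuple   # one boolean per dot: is it still there?

    def is_goal(self):
        return not any(self.dots)   # goal test: all dots eaten
```

Note the state deliberately omits irrelevant world details (ghost animations, Pac-Man's facing), which is exactly the abstraction step described above.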

World state:

  • Agent positions: 120 (a 10 × 12 grid)
  • Food count: 30
  • Agent facing: NSEW

Question

How many world states?

120 × 2^30 × 4 ≈ 5 × 10^11 (positions × food configurations × facing directions).

Question

States for pathing?

120 (agent positions only).

Question

States for eat-all-dots?

120 × 2^30 (agent position × dot booleans).

State space graph: A mathematical representation of a search problem

  • Nodes are (abstracted) world configurations
  • Arcs represent successors (action results)
  • The goal test is a set of goal nodes (maybe only one)

Each state occurs only once in a state space graph.

We can rarely build this full graph in memory.

We can also construct a search tree. Each node in the search tree is an entire path in the state space graph.

We construct the tree on demand, and we construct as little as possible.

Consider a rectangular grid where the agent moves NSEW.

Question

How many unique states are within k steps of the start?

2k² + 2k + 1 = O(k²) (the cells within Manhattan distance k).

Question

How many states in a search tree of depth k?

O(4^k): each node has up to 4 successors, and the tree repeats states along different paths.

General Tree Search

function Tree-Search(problem, strategy) returns a solution, or failure

initialize the search tree using the initial state of problem

loop do
	if there are no candidates for expansion then return failure
	
	choose a leaf node for expansion according to strategy
	
	if the node contains a goal state then return the corresponding solution
	
	else expand the node and add the resulting nodes to the search tree
end
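The pseudocode above can be sketched in Python, assuming a hypothetical problem object with `initial_state`, `is_goal`, and `successors`. The choice of `pop` function is the search strategy:

```python
def tree_search(problem, pop):
    """Generic tree search; `pop` decides which leaf node to expand next."""
    frontier = [(problem.initial_state, [])]   # node = (state, path of actions to it)
    while frontier:                            # no candidates left -> failure
        state, path = pop(frontier)            # choose a leaf according to strategy
        if problem.is_goal(state):
            return path                        # solution: action sequence to a goal
        for action, nxt in problem.successors(state):
            frontier.append((nxt, path + [action]))
    return None                                # failure
```

Passing `lambda f: f.pop()` (LIFO) gives DFS; `lambda f: f.pop(0)` (FIFO) gives BFS, matching the implementations below.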

DFS

  • Strategy: Expand a deepest node first
  • Implementation: Frontier is a LIFO stack

Search Algorithm Properties

  • Complete: Guaranteed to find a solution if one exists?
  • Optimal: Guaranteed to find the least cost path?
  • Time complexity?
  • Space complexity?

Cartoon of search tree:

  • b is the branching factor
  • m is the maximum depth
  • solutions at various depths

Question

Number of nodes in entire tree?

1 + b + b² + … + b^m = O(b^m)

Question

What nodes does DFS expand?

  • Some left prefix of the tree, down to depth m
  • Could process the whole tree
  • If m is finite, takes O(b^m) time

Question

How much space does the frontier take?

Only has siblings on path to root, so O(bm).

Question

Is it complete?

m could be infinite. Preventing cycles may help.

Question

Is it optimal?

No, it finds the “leftmost” solution, regardless of depth or cost.

BFS

  • Strategy: Expand a shallowest node first
  • Implementation: FIFO Queue
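A minimal sketch contrasting the two frontiers on a hypothetical binary tree where node i has children 2i+1 and 2i+2 (not an example from the notes):

```python
from collections import deque

def expansion_order(lifo, max_node=6):
    """Return the order nodes are expanded with a LIFO (DFS) or FIFO (BFS) frontier."""
    frontier = deque([0])
    order = []
    while frontier:
        n = frontier.pop() if lifo else frontier.popleft()  # LIFO = DFS, FIFO = BFS
        order.append(n)
        frontier.extend(c for c in (2 * n + 1, 2 * n + 2) if c <= max_node)
    return order
```

BFS expands tier by tier (shallowest first), while DFS dives down one branch before backtracking; only the frontier discipline differs.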

Question

What nodes does BFS expand?

Processes all nodes above the shallowest solution. Let the depth of the shallowest solution be s. Search takes O(b^s) time.

Question

How much space does the frontier take?

Has roughly the last tier, so O(b^s).

Question

Is it complete?

s must be finite if a solution exists, so yes.

Question

Is it optimal?

Yes, if costs are all equal (e.g., every action costs 1).

DFS vs BFS

In terms of s, the depth of the shallowest solution, and m, the max depth.

Question

When will BFS outperform DFS?

When solutions are shallow (s ≪ m), so DFS would wander down deep fruitless branches.

Question

When will DFS outperform BFS?

When there are many solutions deep in the tree, or when memory is the binding constraint.

Idea: Get DFS’s space advantage with BFS’s time / shallow-solution advantages.

Run a DFS with depth limit 1. If no solution, run a DFS with depth limit 2, etc.

Question

Isn’t this wasteful?

Generally most work happens in the lowest level searched, so not so bad.
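The restarting scheme above (iterative deepening) can be sketched as follows, assuming hypothetical `successors` and `is_goal` callables:

```python
def depth_limited(state, is_goal, successors, limit):
    """DFS that refuses to go deeper than `limit`; returns a state path or None."""
    if is_goal(state):
        return [state]
    if limit == 0:
        return None
    for nxt in successors(state):
        found = depth_limited(nxt, is_goal, successors, limit - 1)
        if found:
            return [state] + found
    return None

def iterative_deepening(start, is_goal, successors, max_limit=50):
    """Run DFS with limit 0, 1, 2, ... until a solution appears."""
    for limit in range(max_limit + 1):
        path = depth_limited(start, is_goal, successors, limit)
        if path:
            return path
    return None
```

Each restart redoes the shallow levels, but since the lowest level searched dominates the work, the overhead is a constant factor.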

Cost-Sensitive Search

BFS finds the shortest path in terms of number of actions. It does not find the least-cost path.

Uniform Cost Search

  • g(n) = cost from the root to node n
  • Strategy: Expand the node with lowest g(n)
  • Frontier is a priority queue sorted by g(n)

Traversal proceeds along cost contours rather than depth levels.
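A sketch of UCS using Python's heapq, assuming the graph is given as a dict mapping each state to a list of (cost, next_state) arcs (a hypothetical interface):

```python
import heapq

def uniform_cost_search(graph, start, goal):
    """Expand the lowest-g node first; returns (cost, path) or None."""
    frontier = [(0, start, [start])]          # priority queue ordered by g(n)
    expanded_g = {}                           # cheapest g at which a state was expanded
    while frontier:
        g, state, path = heapq.heappop(frontier)
        if state == goal:
            return g, path                    # popped => this is the cheapest path
        if state in expanded_g and expanded_g[state] <= g:
            continue                          # already expanded more cheaply
        expanded_g[state] = g
        for cost, nxt in graph.get(state, []):
            heapq.heappush(frontier, (g + cost, nxt, path + [nxt]))
    return None
```

Note the goal test happens when a node is popped, not when it is generated; otherwise a cheaper path still sitting in the queue could be missed.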

Question

What nodes does UCS expand?

Processes all nodes with cost less than cheapest solution.

If the solution costs C* and each arc costs at least ε, then C*/ε is the effective depth (an upper bound on the depth of the solution). Takes time O(b^(C*/ε)).

Question

How much space does the frontier take?

Roughly the last tier, so O(b^(C*/ε)).

Question

Is it complete?

Assuming C* is finite and ε > 0, yes.

Question

Is it optimal?

Yes.

Lecture 3

Uninformed search methods expand nodes based on distance from start node. But we often have some additional knowledge about the problem.

Our knowledge is often about the merit of nodes.

If we are concerned about the cost of the solution, we might want a notion of how expensive it is to get from a state to a goal.

If we are concerned with minimizing computation, we might want a notion of how easy it is to get from a state to a goal.

We focus on the cost of solution.

We need to develop a domain-specific heuristic function, h(n).

h(n) guesses the cost of reaching the goal from node n.

If h(n1) < h(n2), then we guess that it is cheaper to reach the goal from n1 than it is from n2.

We require h(n) = 0 when n is a goal node, and h(n) ≥ 0 for all other nodes.

Greedy best-first search

Use the heuristic function, h(n), to rank the nodes in the fringe.

Search strategy: Expand the node with the lowest h-value.

Greedily trying to find the least-cost solution.

Greedy best-first is not optimal. It is also not complete.

It has exponential space in the worst case, since it may need to keep all nodes in memory, and exponential worst-case time O(b^m), where m is the maximum depth of the tree.

A* search

Greedy best-first search is too greedy. It doesn’t take into account the cost of the path so far.

Define f(n) = g(n) + h(n).

g(n) is the cost of the path to node n. h(n) is the heuristic estimate of the cost of reaching the goal from node n.

A* search: Expand the node in the fringe with the lowest f(n) value.

Question

When should A* terminate?

A* terminates only when a goal state is popped from the queue.
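A sketch of A* on a hypothetical 4-connected grid of unit-cost moves, using the Manhattan-distance heuristic and terminating only when the goal is popped:

```python
import heapq

def a_star(start, goal, walls, width, height):
    """Find a shortest path on a grid; returns the list of cells or None."""
    h = lambda p: abs(p[0] - goal[0]) + abs(p[1] - goal[1])  # Manhattan heuristic
    frontier = [(h(start), 0, start, [start])]               # (f, g, state, path)
    expanded = set()
    while frontier:
        f, g, pos, path = heapq.heappop(frontier)
        if pos == goal:
            return path               # goal popped: optimal (h is consistent here)
        if pos in expanded:
            continue
        expanded.add(pos)
        x, y = pos
        for nx, ny in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if 0 <= nx < width and 0 <= ny < height and (nx, ny) not in walls:
                step = ((nx, ny))
                heapq.heappush(frontier, (g + 1 + h(step), g + 1, step, path + [step]))
    return None
```

The priority is f(n) = g(n) + h(n): swap in h ≡ 0 and this degenerates to uniform cost search.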

Question

Is A* optimal?

No.

Let h*(n) denote the true minimal cost to the goal node from n.

A heuristic, h, is admissible if h(n) ≤ h*(n) for all n.

Admissible heuristics never overestimate the cost to the goal. They are optimistic.

If the heuristic is admissible, then A* with tree-search is optimal.

Question

What if we revisit a state that was already expanded?

We might be reaching that state via a cheaper path, so discarding the revisit could discard a better solution.

To search graphs, we need something stronger than admissibility.

Consistency (monotonicity): h(n) ≤ c(n, n′) + h(n′) for every successor n′ of n, where c(n, n′) is the cost of the arc from n to n′. Almost any admissible heuristic function will also be consistent.
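The consistency condition can be checked mechanically over every arc of a small graph (hypothetical dict-of-arcs representation, heuristic given as a dict):

```python
def is_consistent(graph, h):
    """Check h(n) <= c(n, n') + h(n') for every arc (n, n') in the graph."""
    return all(h[n] <= cost + h[n2]
               for n, arcs in graph.items()
               for cost, n2 in arcs)
```

This is the triangle inequality along each arc: the heuristic estimate may never drop by more than the arc cost.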

A* graph-search with a consistent heuristic is optimal.

A* is complete (assuming a finite branching factor and positive arc costs).

It has exponential time complexity in the worst case. A good heuristic will help: A* expands only nodes on an optimal path if the heuristic is perfect.

It has exponential space complexity.

A* keeps most generated nodes in memory. On many problems, A* will run out of memory.

Iterative deepening A* (IDA*)

  • Like IDS, but increase the f-cost limit rather than the depth at each iteration

SMA* (Simplified Memory-Bounded A*)

  • Uses all available memory
  • Proceeds like A*, but when it runs out of memory it drops the worst leaf node (the one with the highest f-value)
  • If all leaf nodes have the same f-value, then it drops the oldest and expands the newest
  • Optimal and complete if the depth of the shallowest goal node is less than the memory size

A good heuristic function can make all the difference.

Question

How do we get heuristics?

One approach is to think of an easier (relaxed) problem and let h(n) be the cost of reaching the goal in the easier problem.

Example

8-puzzle game

Relax the game:

  1. Can move a tile from square A to square B if A is adjacent to B (ignore whether or not B is blank).
  2. Can move a tile from square A to square B if B is blank (ignore adjacency).
  3. Can move a tile from square A to square B unconditionally.

Relaxation 3 leads to the misplaced-tile heuristic

  • To solve this relaxed problem, we need to move each misplaced tile into its final position
  • Number of moves = number of misplaced tiles
  • Admissible

Relaxation 1 leads to the Manhattan-distance heuristic

  • To solve this relaxed problem, we need to slide each tile into its final position
  • Admissible

Note that Manhattan distance dominates misplaced tiles: h_manhattan(n) ≥ h_misplaced(n) for all n.
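Both relaxed-problem heuristics can be sketched for the 8-puzzle, representing a state as a 9-tuple read row by row (0 = blank) and assuming the goal (1, …, 8, 0):

```python
GOAL = (1, 2, 3, 4, 5, 6, 7, 8, 0)

def misplaced(state):
    """Relaxation 3: any tile can jump anywhere, so cost = tiles out of place."""
    return sum(1 for tile, want in zip(state, GOAL) if tile != 0 and tile != want)

def manhattan(state):
    """Relaxation 1: tiles slide through occupied squares, so cost = grid distance."""
    total = 0
    for i, tile in enumerate(state):
        if tile == 0:
            continue
        j = GOAL.index(tile)   # where this tile belongs
        total += abs(i // 3 - j // 3) + abs(i % 3 - j % 3)
    return total
```

Since every misplaced tile is at Manhattan distance at least 1 from its home, manhattan(n) ≥ misplaced(n) on every state, which is the dominance noted above.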