
Econ 121b: Intermediate Microeconomics
Dirk Bergemann, Spring 2012
1 Introduction
1.1 What’s Economics?
This is an exciting time to study economics, even though it may not be so exciting to be part of this economy. We have faced the largest financial crisis since the Great Depression. $787 billion has been pumped into the economy in the form of a stimulus package by the US Government. $700 billion has been spent on the Troubled Asset Relief Program for the banks. The unemployment rate has been high for a long time; the August unemployment rate is 9.7%. At the same time there have been big debates on health care reform, government deficits, climate change, etc. We need answers to all of these big questions and many others, and all of them come under the purview of the discipline of economics (along with other fields of study). But then how do we define this field of study? In terms of subject matter it can be defined as the study of the allocation of scarce resources. A more pragmatic definition might be: economics is what economists do! In terms of methodology, optimization theory, statistical analysis, game theory, etc. characterize the study of economics. One of the primary goals of economics is to explain human behavior in various contexts: humans as consumers of commodities, decision makers in firms, heads of families, or politicians holding office. The areas of research extend from international trade, taxes, economic growth, and antitrust to crime, marriage, war, law, media, corruption, etc. There are a lot of opportunities for us to bring our way of thinking to these issues. Indeed, one of the most active areas of the subject is to push this frontier.
Economists like to think that the discipline follows Popperian methods, moving from stylized facts to hypothesis formation to hypothesis testing. The Popperian tradition tells us that hypotheses can only be proven false empirically, never proven true. Hence an integral part of economics is to gather information about the real world in the form of data and to test whatever hypotheses economists propose. What this course builds up, however, is how to come up with sensible hypotheses that can be tested. Thus economic theory is the exercise of hypothesis formation, using the language of mathematics to formalize assumptions (about certain fundamentals of human behavior, market organization, the distribution of information among individuals, etc.). Some critics of economics say our models are too simplistic. We leave too many things out. Of course this is true: we do leave many, many things out, but for a useful purpose. It is better to be clear about an argument, and focusing on specific things in one model helps us achieve that. Failing to formalize a theory does not necessarily mean that the argument is generic and holistic; it just means that the requirement of specificity in the argument is not as high.
Historically most economists have relied on maximization as a core tool of economics, and it is a matter of good practice. Most of what we will discuss in this course follows this tradition: maximization is much easier to work with than the alternatives. But philosophically I don't think that maximization is necessary for work to be considered part of economics. You will have to decide on your own. My own view is that there are three core tools:
• The principle that people respond to incentives
• An equilibrium concept that assumes the absence of free lunches
• A welfare criterion saying that more choices are better
A last methodological point: Milton Friedman drew a distinction between positive and normative economics:
• Positive economics – why the world is the way it is and looks the way it does
• Normative economics – how the world can be improved
Both areas are necessary and sometimes merge perfectly. But there are often tensions. We will return to this throughout the rest of the class. What I hope you will get out of the course is the following:
• Ability to understand basic microeconomic mechanisms
• Ability to evaluate and challenge economic arguments
• Appreciation for the economic way of looking at the world
We now describe a very simple form of human interaction in an economic context, namely trade: the voluntary exchange of goods or objects between two people, one called the seller, the current owner of the object, and the other the buyer, someone who has a demand or want for that object. This is referred to as bilateral trading.
1.2 Gains from Trade
1.2.1 Bilateral Trading
Suppose that a seller values a single, homogeneous object at c (opportunity cost),
and a potential buyer values the same object at v (willingness to pay). Trade could
occur at a price p, in which case the payoff to the seller is p − c and to the buyer
is v − p. We assume for now that there is only one buyer and one seller, and only
one object that can potentially be traded. If no trade occurs, both agents receive
a payoff of 0.
Whenever v > c there is the possibility for a mutually beneficial trade at some
price c ≤ p ≤ v. Any such allocation results in both players receiving non-negative
returns from trading and so both are willing to participate (p − c and v − p are
non-negative).
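To make this concrete, here is a small numerical illustration (the numbers are our own): suppose c = 4 and v = 10. Any price p between 4 and 10 supports trade. At p = 7 the seller's payoff is p − c = 3 and the buyer's payoff is v − p = 3, so the total gain from trade of v − c = 6 is split equally; at p = 9 the same total gain of 6 is split 5 for the seller and 1 for the buyer.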
There are many prices at which trade is possible. And each of these allocations,
consisting of whether the buyer gets the object and the price paid, is efficient in
the following sense:
Definition 1. An allocation is Pareto efficient if there is no other allocation that
makes at least one agent strictly better off, without making any other agent worse
off.
1.2.2 Experimental Evidence
This framework can be extended to consider many buyers and sellers, and to allow
for production. One of the most striking examples comes from international trade.
We are interested, not only in how specific markets function, but also in how
markets should be organized or designed.
There are many examples of markets, such as the NYSE, NASDAQ, eBay, and Google. The last two are markets that were created recently, where none existed before. So we want to consider not just existing markets, but also the creation of new markets.
Before elaborating on the theory, we will consider three experiments that illustrate how these markets function. We can then interpret the results in relation
to the theory. Two types of cards (red and black) with numbers between 2 and
10 are handed out to the students. If the student receives a red card they are a
seller, and the number reflects their cost. If the student receives a black card they
are a buyer, and this reflects their valuation. The number on the card is private
information. Trade then takes place according to the following three protocols.
1. Bilateral Trading: One seller and one buyer are matched before receiving
their cards. The buyer and seller can only trade with the individual they are
matched with. They have 5 minutes to make offers and counter offers and
then agree (or not) on the price.
2. Pit Market: Buyer and seller cards are handed out to all students at the
beginning. Buyers and sellers then have 5 minutes to find someone to trade
with and agree on the price to trade.
3. Double Auction: Buyer and seller cards are handed out to all students at
the beginning. The initial price is set at 6 (the middle valuation). All buyers
and sellers who are willing to trade at this price can trade. If there is a
surplus of sellers the price is decreased, and if there is a surplus of buyers
then the price is increased. This continues for 5 minutes until there are no
more trades taking place.
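To see how the double-auction protocol narrows in on a market-clearing price, here is a minimal simulation sketch; the card values, the starting price, and the one-unit adjustment rule are illustrative assumptions rather than the exact classroom procedure.

```python
# Illustrative sketch of the double-auction price adjustment described above.
# A seller is willing to trade when the price is at least her cost,
# a buyer when the price is at most his valuation. Card values are hypothetical.
seller_costs = [2, 3, 5, 6, 8, 9]   # red cards (costs)
buyer_values = [3, 4, 6, 7, 9, 10]  # black cards (valuations)

price = 8  # start above the clearing range to show the downward adjustment
for _ in range(20):
    willing_sellers = sum(1 for c in seller_costs if c <= price)
    willing_buyers = sum(1 for v in buyer_values if v >= price)
    print(f"price={price}: {willing_sellers} sellers, {willing_buyers} buyers")
    if willing_sellers > willing_buyers:
        price -= 1   # surplus of sellers: lower the price
    elif willing_buyers > willing_sellers:
        price += 1   # surplus of buyers: raise the price
    else:
        break        # no surplus on either side: stop adjusting
```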
2 Choice
In the decision problem in the previous section, the agents had a binary decision: whether to buy (sell) the object. However, there are usually more than two
alternatives. The price at which trade could occur, for example, could take on
a continuum of values. In this section we will look more closely at preferences,
and determine when it is possible to represent preferences by “something handy,”
which is a utility function.
Suppose there is a set of alternatives X = {x1, x2, . . . , xn} for some individual
decision maker. We are going to assume, in a manner made precise below, that
two features of preferences are true.
• There is a complete ranking of alternatives.
• “Framing” does not affect decisions.
We refer to X as a choice set consisting of n alternatives, and each alternative
x ∈ X is a consumption bundle of k different items. For example, the first element
of the bundle could be food, the second element could be shelter and so on. We
will denote preferences by ⪰, where x ⪰ y means that "x is weakly preferred to y." All this means is that when a decision maker is asked to choose between x and y she will choose x. Similarly, x ≻ y means that "x is strictly preferred to y," and x ∼ y indicates that the decision maker is "indifferent between x and y." The preference relation ⪰ defines an ordering on X × X. We make the following three assumptions about preferences.
Axiom 1. Completeness. For all x, y ∈ X either x ⪰ y, y ⪰ x, or both.
This first axiom simply says that, given two alternatives the decision maker
can compare the alternatives, and will weakly prefer one of the alternatives to the
other, or will be indifferent, in case both are weakly preferred to each other.
Axiom 2. Transitivity. For all triples x, y, z ∈ X, if x ⪰ y and y ⪰ z then x ⪰ z.
Very simply, this axiom imposes some level of consistency on choices. For example, suppose there are three potential travel locations, Tokyo (T), Beijing (B), and Seoul (S). If a decision maker, when offered the choice between Tokyo and Beijing, weakly prefers to go to Tokyo, and when given the choice between Beijing and Seoul weakly prefers to go to Beijing, then this axiom simply says that if she were offered a choice between a trip to Tokyo and a trip to Seoul, she would weakly prefer to go to Tokyo. This is because she has already demonstrated that she weakly prefers Tokyo to Beijing, and Beijing to Seoul, so weakly preferring Seoul to Tokyo would make her preferences inconsistent.

[Figure 1: Indifference curve through bundle b, with good 1 and good 2 on the axes and the set of bundles preferred to b indicated.]
But it is conceivable that people might violate transitivity in certain circumstances. One of them is the "framing effect": the idea that the way the choice alternatives are framed may affect decisions and hence may in turn lead to violations of transitivity. The idea was made explicit in an experiment due to Daniel Kahneman and Amos Tversky (1984). In the experiment, students visiting the MIT Coop to purchase a stereo for $125 and a calculator for $5 were informed that the calculator is on sale for 5 dollars less at the Harvard Coop. The question is: would the students make the trip? Suppose instead the students were informed that the stereo is 5 dollars less at the Harvard Coop. Kahneman and Tversky found that the fraction of respondents who would travel for the cheaper calculator is much higher than for the cheaper stereo. But when students were instead told that there is a stockout, that they have to go to the Harvard Coop anyway, and that they will get 5 dollars off either item as compensation, and were then asked which item they would like the money off, most of them said that they were indifferent. Let x = go to Harvard and get 5 dollars off the calculator, y = go to Harvard and get 5 dollars off the stereo, and z = get both items at MIT. We have x ≻ z and z ≻ y, but the last question implies x ∼ y. Transitivity would imply x ≻ y, which is a contradiction.
[Figure 2: Convex preferences. For two indifferent bundles y and y′, the mixture αy + (1 − α)y′ is weakly preferred to both.]
For the purposes of this course we will assume away any such framing effects in the mind of the decision maker.
Axiom 3. Reflexivity. For all x ∈ X, x ⪰ x (equivalently, x ∼ x).
The final axiom is made for technical reasons, and simply says that a bundle
cannot be strictly preferred to itself. Such preferences would not make sense.
These three axioms allow for bundles to be ordered in terms of preference. In
fact, these three conditions are sufficient to allow preferences to be represented by
a utility function.
Before elaborating on this, we consider an example. Suppose there are two goods, wine and cheese, and four consumption bundles z = (2, 2), y = (1, 1), a = (2, 1), b = (1, 2), where the two elements of the vector represent the amounts of wine and cheese. Most likely z ≻ y, since z provides more of everything (i.e., wine and cheese are "goods"). It is not clear how to compare a and b. What we can do is consider which bundles are indifferent to b. This is an indifference curve (see Figure 1). We can define it as
I_b = {x ∈ X | b ∼ x}.
We can then (if we assume that more is better) compare a and b by considering which side of the indifference curve a lies on: bundles above and to the right are more preferred, bundles below and to the left are less preferred. This reduces the dimensionality of the problem. We can speak of the "better than b" set as the set of points weakly preferred to b. These preferences are "ordinal": we can ask whether x is in the better-than set, but this does not tell us how much x is preferred to b. It is common to assume that preferences are monotone: more of a good is better.

[Figure 3: Perfect substitutes (left) and perfect complements (right), each drawn in the (good 1, good 2) plane.]
Definition 2. The preferences ⪰ are said to be (strictly) monotone if x ≥ y ⇒ x ⪰ y (respectively, x ≥ y, x ≠ y ⇒ x ≻ y for strict monotonicity).¹
Suppose I want to increase my consumption of good 1 without changing my level of well-being. The amount by which I must change x2 to keep utility constant, dx2/dx1, is the marginal rate of substitution. Most of the time we believe that individuals like moderation. This desire for moderation is reflected in convex preferences: a mixture of two bundles, between which the agent is indifferent, is preferred to either of the initial bundles (see Figure 2).
Definition 3. A preference relation is convex if for all y and y′ with y ∼ y′ and all α ∈ [0, 1] we have αy + (1 − α)y′ ⪰ y ∼ y′.
While convex preferences are usually assumed, there could be instances where
preferences are not convex. For example, there could be returns to scale for some
good.
Examples: perfect substitutes, perfect complements (see Figure 3). Both of
these preferences are convex.
Notice that indifference curves cannot intersect. If they did, we could take two points x and y, each lying to the right of the indifference curve the other lies on. We would then have x ≻ y ≻ x, and by transitivity x ≻ x, which contradicts reflexivity. So every bundle is associated with one, and only one, welfare level.
Another important property of a preference relation is continuity.
Definition 4. The preference relation ⪰ is continuous if, for any two sequences of choices {xn}, {yn} with xn ⪰ yn for all n, xn → x, and yn → y, we have x ⪰ y.
¹ If x = (x1, . . . , xN) and y = (y1, . . . , yN) are vectors of the same dimension, then x ≥ y if and only if xi ≥ yi for all i; x ≠ y means that xi ≠ yi for at least one i.
This property guarantees that there is no jump in preferences. When X is no
longer finite, we need continuity to ensure a utility representation.
2.1 Utility Functions
What we want to consider now is whether we can take preferences and map them
to some sort of utility index. If we can somehow represent preferences by such a
function we can apply mathematical techniques to make the consumer’s problem
more tractable. Working with preferences directly requires comparing each of
a possibly infinite number of choices to determine which one is most preferred.
Maximizing an associated utility function is often just a simple application of
calculus. If we take a consumption bundle x ∈ R^N_+, we can take a utility function to be a mapping from R^N_+ into R.
Definition 5. A utility function (index) u : X → R represents a preference relation ⪰ if and only if, for all x, y ∈ X: x ⪰ y ⇔ u(x) ≥ u(y).
We can think about a utility function as an “as if”-concept: the agent acts “as
if” she has a utility function in mind when making decisions.
Is it always possible to find such a function? The following result shows that
such a function exists under the three assumptions about preferences we made
above.
Proposition 1. Suppose that X is finite. Then the assumptions of completeness, transitivity, and reflexivity imply that there is a utility function u such that u(x) ≥ u(y) if and only if x ⪰ y.
Proof. We define an explicit utility function. Let us introduce some notation:
B(x) = {z ∈ X | x ⪰ z}.
Thus B(x) is the set of "all items below x." Let the utility function be defined as
u(x) = |B(x)|,
where |B(x)| is the cardinality of the set B(x), i.e. the number of elements in the set B(x). There are two steps to the argument:
First part: u(x) ≥ u(y) ⇒ x ⪰ y.
Second part: x ⪰ y ⇒ u(x) ≥ u(y).
First part of proof:
By definition, u(x) ≥ u(y) ⇒ |B(x)| ≥ |B(y)|. If y ∈ B(x), then x ⪰ y by definition of B(x) and we are done. Otherwise, y ∉ B(x), and we will work towards a contradiction.
Since y ∉ B(x), we have
|B(x) − {y}| = |B(x)|.
Since y ∈ B(y) (by reflexivity), we have
|B(y)| − 1 = |B(y) − {y}|.
Since |B(x)| ≥ |B(y)|, we have |B(x)| > |B(y)| − 1 and hence
|B(x) − {y}| > |B(y) − {y}|.
Therefore there must be some z ∈ X − {y} such that x ⪰ z but not y ⪰ z. By completeness, z ⪰ y. By transitivity, x ⪰ y. But this implies that y ∈ B(x), a contradiction.
Second part of proof:
We want to show x ⪰ y ⇒ u(x) ≥ u(y). Suppose x ⪰ y and z ∈ B(y). Then x ⪰ y and y ⪰ z, so by transitivity x ⪰ z. Hence z ∈ B(x). This shows that when x ⪰ y, anything in B(y) must also be in B(x). Therefore
B(y) ⊆ B(x) ⇒ |B(x)| ≥ |B(y)| ⇒ u(x) ≥ u(y).
This completes the proof.
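As a small computational illustration of the construction in the proof, the sketch below builds u(x) = |B(x)| on a finite choice set; the alternatives and the preference relation are a made-up example, assumed complete, transitive, and reflexive.

```python
# Sketch: the utility index u(x) = |B(x)| on a finite choice set.
X = ["apple", "banana", "cherry"]
rank = {"apple": 3, "banana": 2, "cherry": 1}  # hypothetical ordering

def weakly_prefers(x, y):
    """x is weakly preferred to y under the hypothetical relation above."""
    return rank[x] >= rank[y]

def utility(x):
    """u(x) = |B(x)|: the number of alternatives weakly below x."""
    return len([z for z in X if weakly_prefers(x, z)])

# Representation property: u(x) >= u(y) exactly when x is weakly preferred to y.
assert all((utility(x) >= utility(y)) == weakly_prefers(x, y)
           for x in X for y in X)
print({x: utility(x) for x in X})  # {'apple': 3, 'banana': 2, 'cherry': 1}
```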
In general the following proposition holds:
Proposition 2. Every (continuous) preference ranking can be represented by a
(continuous) utility function.
This result can be extended to environments with uncertainty, as was shown
by Leonard Savage. Consequently, we can say that individuals behave as if they
are maximizing utility functions, which allows for marginal and calculus arguments. There is, however, one qualification. The utility function that represents
the preferences is not unique.
Remark 1. If u represents the preferences, then for any strictly increasing function f : R → R, f(u(x)) also represents the same preference ranking.
In the previous section, we claimed that preferences usually reflect the idea
that “more is better,” or that preferences are monotone.
Definition 6. The utility function (preferences) are monotone increasing if x ≥ y
implies that u(x) ≥ u(y) and x > y implies that u(x) > u(y).
One feature that monotone preferences rule out is (local) satiation, where one
point is preferred to all other points nearby. For economics the relevant decision is maximizing utility subject to limited resources. This leads us to consider
constrained optimization.
3 Maximization
Now we take a look at the mathematical tool that will be used most intensively in this course. Let x = (x1, x2, . . . , xn) be an n-dimensional vector where each component xi, i = 1, 2, . . . , n, is a non-negative real number. In mathematical notation we write x ∈ R^n_+. We can think of x as a description of the different characteristics of a choice that the decision maker faces. For example, while choosing which college to attend (among the ones that have offered admission) a decision maker, who is a student in this case, looks at different aspects of a university, namely the quality of instruction, the diversity of courses, the location of the campus, etc. The components of the vector x can be thought of as each of these characteristics when the choice problem faced by the decision maker (i.e. the student) is which university to attend. Usually when people go to the grocery store they face the problem of buying not just one commodity but a bundle of commodities, so it is the combination of quantities of different commodities that needs to be decided; again the components of x can be thought of as the quantities of each commodity purchased. Whatever the specific context, utility is defined over the set of such bundles. Since x ∈ R^n_+, we take X = R^n_+, so the utility function is a mapping u : R^n_+ → R.
For the time being, let x be one-dimensional, i.e. x ∈ R. Let f : R → R be a continuous and differentiable function that takes a real number and maps it to another real number. Continuity is assumed to avoid any jumps in the function and differentiability is assumed to avoid kinks. The slope of the function f is defined as the first derivative of the function and the curvature of the function is defined as the second derivative of the function. So the slope of f at x is formally defined as
df(x)/dx, also written f′(x),
and the curvature of f at x is formally defined as
d²f(x)/dx², also written f″(x).
In order to find the maximum of f we must first look at the slope of f. If the slope is positive then raising the value of x increases the value of f, so to find the maximum we should keep increasing x. Similarly, if the slope is negative then reducing the value of x increases the value of f, and therefore to find the maximum we should reduce the value of x. Therefore the maximum is reached when the slope is exactly equal to 0. This condition is referred to as the First Order Condition (F.O.C.) or the necessary condition:
df(x)/dx = 0.
But this in itself does not guarantee that a maximum is reached, as a perfectly flat slope may also mean that we are at a trough, i.e. at a minimum. The F.O.C. therefore finds the extremum points of the function. We need to look at the curvature to make sure whether the extremum is actually a maximum. If the second derivative is negative then, starting from the extremum point, moving x a little in either direction makes f(x) fall, and therefore the extremum is a maximum. If the second derivative is positive then, by a similar argument, it is a minimum. This condition is referred to as the Second Order Condition (S.O.C.) or the sufficient condition:
d²f(x)/dx² ≤ 0.
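As a quick worked illustration of the F.O.C. and S.O.C. (the function here is an example of our own choosing): let f(x) = −(x − 2)². The F.O.C. gives df(x)/dx = −2(x − 2) = 0, so x = 2. The S.O.C. holds since d²f(x)/dx² = −2 < 0, so x = 2 is indeed a maximum.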
Now we look at the definitions of two important kinds of functions:
Definition 7. (i) A continuous and twice differentiable function f : R → R is (strictly) concave if d²f(x)/dx² ≤ (<) 0. (ii) f is (strictly) convex if d²f(x)/dx² ≥ (>) 0.
Therefore, for a concave function the F.O.C. is both a necessary and a sufficient condition for maximization. We can also define concavity or convexity of functions with the help of convex combinations.
Definition 8. A convex combination of any two points x′, x″ ∈ R^n is defined as x_λ = λx′ + (1 − λ)x″ for any λ ∈ (0, 1).
A convex combination of two points represents a point on the straight line joining those two points. We now define concavity and convexity of functions using this concept.
Definition 9. f is concave if for any two points x′, x″ we have f(x_λ) ≥ λf(x′) + (1 − λ)f(x″), where x_λ is a convex combination of x′ and x″ for λ ∈ (0, 1). f is strictly concave if the inequality is strict.
Definition 10. Similarly, f is convex if f(x_λ) ≤ λf(x′) + (1 − λ)f(x″). f is strictly convex if the inequality is strict.
If an individual's utility function is concave then, given this definition, we can see that she would prefer a certain consumption of x_λ to an uncertain prospect of consuming either x′ or x″. Such individuals are called risk averse. We shall explore these concepts in full detail later in the course, where we will need these definitions of concavity and convexity.
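A quick numerical illustration of this point (with numbers of our own choosing): take the concave function f(x) = √x and the points x′ = 4, x″ = 16 with λ = 1/2. Then f(x_λ) = f(10) ≈ 3.16, while λf(x′) + (1 − λ)f(x″) = (2 + 4)/2 = 3, so the certain intermediate point yields a higher value than the average of the endpoint values, consistent with the definition of concavity.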
4 Utility Maximization
4.1 Multivariate Function Maximization
Let x = (x1, x2, . . . , xn) ∈ R^n_+ be a consumption bundle and f : R^n_+ → R be a multivariate function. The multivariate function that we are interested in here is the utility function u : R^n_+ → R, where u(x) is the utility of the consumption bundle x.
The F.O.C. for the maximization of f is given by:
∂f(x1, x2, . . . , xn)/∂xi = 0   for all i = 1, 2, . . . , n.
This is a direct extension of the F.O.C. for univariate functions as explained in Lecture 3. The S.O.C., however, is a little different from the single-variable case. Let us look at a bivariate function f : R² → R and first define the following notation:
fi(x) ≜ ∂f(x)/∂xi,   fii(x) ≜ ∂²f(x)/∂xi²,   i = 1, 2,
fij(x) ≜ ∂²f(x)/∂xi∂xj,   i ≠ j.
The S.O.C. for the maximization of f is then given by
(i) f11 < 0,
(ii) the determinant
| f11  f12 |
| f21  f22 |
is positive.
The first of the S.O.C.s is analogous to the S.O.C. for the univariate case. If we write out the second one we get
f11 f22 − f12 f21 > 0.
But we know that f12 = f21, so
f11 f22 > (f12)² ≥ 0
⇒ f22 < 0 (since f11 < 0).
Therefore the S.O.C. for the bivariate case is stronger than the analogous condition from the univariate case. This is because, in the bivariate case, to make sure we are at the peak of the function it is not enough to check that the function is concave in the x1 and x2 directions: it might fail to be concave along a diagonal direction, and this is why the cross derivatives enter the condition. For the purposes of this class we will assume that the S.O.C. is satisfied for the given utility function, unless you are asked specifically to check it.
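As an added worked check of these conditions: for f(x1, x2) = −x1² − x2² + x1x2 we have f11 = −2 < 0 and f11f22 − f12f21 = (−2)(−2) − (1)(1) = 3 > 0, so the unique critical point (from f1 = −2x1 + x2 = 0 and f2 = −2x2 + x1 = 0, namely x1 = x2 = 0) is a maximum.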
4.2 Budget Constraint
A budget constraint is a constraint on how much money (income, wealth) an agent can spend on goods. We denote the amount of available income by I ≥ 0, let x1, . . . , xN be the quantities of the goods purchased, and let p1, . . . , pN be the corresponding prices. Then the budget constraint is
p1x1 + p2x2 + · · · + pNxN ≤ I.
As an example, we consider the case with two goods, for which the budget constraint is p1x1 + p2x2 ≤ I. On the budget line the agent spends her entire income on the two goods, p1x1 + p2x2 = I. The points where the budget line intersects the axes are x1 = I/p1 and x2 = I/p2, since these are the points where the agent spends her income on only one good. Solving for x2, we can express the budget line as a function of x1:
x2(x1) = I/p2 − (p1/p2) x1,
where the slope of the budget line is given by
dx2/dx1 = −p1/p2.
The budget line is thus the equation in x1 and x2 along which the decision maker exhausts all her income. The set of consumption bundles (x1, x2) that are feasible given the income, i.e. the (x1, x2) for which p1x1 + p2x2 ≤ I holds, is called the budget set.
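As a concrete illustration (the numbers are hypothetical): with I = 12, p1 = 3, and p2 = 2, the budget line is x2 = 6 − (3/2)x1, the intercepts are x1 = I/p1 = 4 and x2 = I/p2 = 6, and the slope is −p1/p2 = −3/2.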
4.3 Indifference Curve
An indifference curve (IC) is defined as the locus of consumption bundles (x1, x2) along which utility is held fixed at some level. Therefore the equation of the IC is given by
u(x1, x2) = ū.
To get the slope of the IC we differentiate this equation with respect to x1:
∂u(x)/∂x1 + (∂u(x)/∂x2)·(dx2/dx1) = 0
⇒ dx2/dx1 = − (∂u(x)/∂x1)/(∂u(x)/∂x2) = − MU1/MU2,
where MUi refers to the marginal utility of good i. So the slope of the IC is the negative of the ratio of the marginal utilities of goods 1 and 2. This ratio of marginal utilities is referred to as the Marginal Rate of Substitution, or MRS. It tells us the rate at which the consumer is willing to substitute between goods 1 and 2 while remaining at the same utility level.
4.4 Constrained Optimization
Consumers are typically endowed with money I, which determines which consumption bundles are affordable. The budget set consists of all consumption bundles such that p1x1 + · · · + pNxN ≤ I. The consumer's problem is then to find the point on the highest indifference curve that is in the budget set. At this point the indifference curve must be tangent to the budget line. The slope of the budget line is given by
dx2/dx1 = −p1/p2,
which says how much x2 must decrease, if the consumption of good 1 is increased by dx1, for the bundle to remain affordable. It reflects the opportunity cost, as money spent on good 1 cannot be used to purchase good 2 (see Figure 4).
The marginal rate of substitution, on the other hand, reflects the relative benefit from consuming different goods. The slope of the indifference curve is −MRS. So the relevant optimality condition, where the slope of the indifference curve equals the slope of the budget line, is
p1/p2 = (∂u(x)/∂x1)/(∂u(x)/∂x2).
[Figure 4: Indifference curve and budget set; the optimal choice is at the tangency of the budget line with the highest attainable indifference curve.]
We could equivalently talk about equating marginal utility per dollar. If
(∂u(x)/∂x2)/p2 > (∂u(x)/∂x1)/p1,
then one dollar spent on good 2 generates more utility than one dollar spent on good 1, so shifting consumption from good 1 to good 2 would result in higher utility. So, to be at an optimum, the marginal utility per dollar must be equated across goods.
Does this mean that we must have ∂u(x)/∂xi = pi at the optimum? No. Such a condition would not make sense, since we could rescale the utility function. We can instead rescale the equation by a factor λ ≥ 0 that converts "money" into "utility," and write ∂u(x)/∂xi = λpi. Here λ reflects the marginal utility of money. More on this in the subsection on optimization using the Lagrange approach.
4.4.1 Optimization by Substitution
The consumer’s problem is to maximize utility subject to a budget constraint.
There are two ways to approach this problem. The first approach involves writing
the last good as a function of the previous goods, and then proceeding with an
unconstrained maximization. Consider the two good case. The budget set consists
of the constraint that p1x1 + p2x2 ≤ I. So the problem is
max
x1,x2
u(x1, x2) subject to p1x1 + p2x2 ≤ I
But notice that whenever u is (locally) non-satiated then the budget constraint
holds with equality since there in no reason to hold money that could have been
15
used for additional valued consumption. So, p1x1 + p2x2 = I, and so we can write
x2 as a function of x1 from the budget equation in the following way
x2 =
I − p1x1
p2
Now we can treat the maximization of u(x1, (I − p1x1)/p2) as a standard single-variable maximization problem. The maximization problem becomes
max_{x1} u(x1, (I − p1x1)/p2).
The F.O.C. is then given by
∂u/∂x1 + (∂u/∂x2)·(dx2(x1)/dx1) = 0
⇒ ∂u/∂x1 − (p1/p2)·(∂u/∂x2) = 0.
The second equation substitutes −p1/p2 for dx2(x1)/dx1, using the budget line equation. We can rearrange terms further to get
(∂u/∂x1)/p1 = (∂u/∂x2)/p2
⇒ (∂u/∂x1)/(∂u/∂x2) = p1/p2.
This is exactly the condition we obtained by arguing in terms of the budget line and indifference curves. In the following lecture we shall look at a specific example in which we maximize a particular utility function using this substitution method, and then move on to the Lagrange approach.
5 Utility Maximization Continued
5.1 Application of Substitution Method
Example 1. We consider a consumer with Cobb-Douglas preferences. Cobb-Douglas preferences are easy to use and therefore commonly used. The utility function is defined as (with two goods)
u(x1, x2) = x1^α x2^(1−α),   0 < α < 1.
The goods' prices are p1, p2 and the consumer is endowed with income I. Hence the constrained optimization problem is
max_{x1,x2} x1^α x2^(1−α)   subject to   p1x1 + p2x2 = I.
We solve this maximization by substituting the budget constraint into the utility function, so that the problem becomes an unconstrained optimization with one choice variable:
u(x1) = x1^α ((I − p1x1)/p2)^(1−α).    (1)
In general, we take the total derivative of the utility function:
du(x1, x2(x1))/dx1 = ∂u/∂x1 + (∂u/∂x2)·(dx2/dx1) = 0,
which gives us the condition for optimal demand
dx2/dx1 = − (∂u/∂x1)/(∂u/∂x2).
The right-hand side is the negative of the marginal rate of substitution (MRS).
In order to calculate the demand for both goods, we go back to our example. Taking the derivative of the utility function (1),
u′(x1) = α x1^(α−1) ((I − p1x1)/p2)^(1−α) + (1 − α) x1^α ((I − p1x1)/p2)^(−α) · (−p1/p2)
       = x1^(α−1) ((I − p1x1)/p2)^(−α) [ α (I − p1x1)/p2 − (1 − α) x1 (p1/p2) ],
so the F.O.C. is satisfied when
α(I − p1x1) − (1 − α) x1 p1 = 0,
which holds when
x1*(p1, p2, I) = αI/p1.    (2)
Hence we see that the budget spent on good 1, p1x1*, equals the budget share αI, where α is the preference parameter associated with good 1.
Plugging (2) into the budget constraint yields
x2*(p1, p2, I) = (I − p1x1*)/p2 = (1 − α)I/p2.
These are referred to as the Marshallian demand or uncompensated demand.
Several important features of this example are worth noting. First of all, x1 does not depend on p2, and vice versa. Also, the share of income spent on each good, pixi/I, does not depend on prices or wealth. What is going on here? When the price of one good, p2, increases there are two effects. First, the price increase makes good 1 relatively cheaper (p1/p2 decreases). This will cause consumers to "substitute" toward the relatively cheaper good. There is also another effect: when the price increases the individual becomes poorer in real terms, as the set of affordable consumption bundles becomes strictly smaller. The Cobb-Douglas utility function is a special case in which this "income effect" exactly cancels the substitution effect, so the consumption of one good is independent of the prices of the other goods.
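As a quick numerical sanity check of these demand formulas, here is a small sketch (the parameter values are illustrative assumptions) that maximizes the Cobb-Douglas utility subject to the budget constraint and compares the result with x1* = αI/p1 and x2* = (1 − α)I/p2:

```python
# Sketch: check the Cobb-Douglas demand formulas numerically.
from scipy.optimize import minimize

alpha, p1, p2, I = 0.3, 2.0, 5.0, 100.0   # illustrative parameter values

def neg_utility(x):
    x1, x2 = x
    return -(x1 ** alpha) * (x2 ** (1 - alpha))

budget = {"type": "eq", "fun": lambda x: I - p1 * x[0] - p2 * x[1]}
res = minimize(neg_utility, x0=[1.0, 1.0], constraints=[budget],
               bounds=[(1e-6, None), (1e-6, None)])

print("numerical demand:", res.x)                        # approximately (15, 14)
print("closed form:     ", (alpha * I / p1, (1 - alpha) * I / p2))
```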
5.2 Elasticity
When calculating price or income effects, the result depends on the units used.
For example, when considering the own-price effect for gasoline, we might express
quantity demanded in gallons or liters and the price in dollars or euros. The
own-price effects would differ even if consumers in the U.S. and Europe had the
same underlying preferences. In order to make price or income effects comparable
across different units, we need to normalize them. This is the reason why we use
the concept of elasticity. The own-price elasticity of demand is defined as the percentage change in demand for each percentage change in its own price and is denoted by εi:
εi = − (∂xi/∂pi)/(xi/pi) = − (∂xi/∂pi)·(pi/xi).
It is common to multiply the price effect by −1 so that ε is a positive number, since the price effect is usually negative. The cross-price elasticity of demand is defined similarly:
εij = − (∂xi/∂pj)/(xi/pj) = − (∂xi/∂pj)·(pj/xi).
Similarly, the income elasticity of demand is defined as
εI = (∂xi/∂I)/(xi/I) = (∂xi/∂I)·(I/xi).
5.2.1 Constant Elasticity of Substitution
The elasticity of substitution for a utility function is defined as the elasticity of the ratio of consumption of the two goods with respect to the MRS. It is therefore a measure of how easily the two goods can be substituted along an indifference curve. In mathematical terms it is defined as
S = [d(x2/x1)/dMRS] · [MRS/(x2/x1)].
For a class of utility functions this value is constant for all (x1, x2). These utility functions are called Constant Elasticity of Substitution (CES) utility functions. The general form is
u(x1, x2) = (α1 x1^(−ρ) + α2 x2^(−ρ))^(−1/ρ).
It is easy to show that for CES utility functions
S = 1/(ρ + 1).
The following utility functions are special cases of the general CES utility function:
Linear utility: linear utility is of the form
U(x1, x2) = a x1 + b x2,   a, b constants,
which is CES utility with ρ = −1.
Leontief utility: Leontief utility is of the form
U(x1, x2) = min{x1/a, x2/b},   a, b > 0,
and this is also a CES utility function, in the limit ρ → ∞.
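As a brief added check of the elasticity-of-substitution formula above: for the Cobb-Douglas case u(x1, x2) = x1^α x2^(1−α) (which corresponds to ρ → 0), the MRS is (α/(1−α))(x2/x1), so x2/x1 = ((1−α)/α)·MRS. The ratio x2/x1 is proportional to the MRS, so S = 1, consistent with S = 1/(ρ + 1) at ρ = 0.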
5.3 Optimization Using the Lagrange Approach
While the substitution approach is simple enough, there are situations where it is difficult to apply. The procedure requires that we know, before the calculation, that the budget constraint actually binds. In many situations there may be other constraints (such as a non-negativity constraint on the consumption of each good) and we may not know whether they bind before demands are calculated. Consequently, we will consider the more general approach of Lagrange multipliers. Again, we consider the (two-good) problem
max_{x1,x2} u(x1, x2)   s.t.   p1x1 + p2x2 ≤ I.
Let’s think about this problem as a game. The first player, let’s call him the
kid, wants to maximize his utility, u(x1, x2), whereas the other player (the parent)
is concerned that the kid violates the budget constraint, p1x1 + p2x2 ≤ I, by
spending too much on goods 1 and 2. In order to induce the kid to stay within
the budget constraint, the parent can punish him by an amount λ for every dollar
the kid exceeds his income. Hence, the total punishment is
λ(I − p1x1 − p2x2).
Adding the kid’s utility from consumption and the punishment, we get
L(x1, x2, λ) = u(x1, x2) + λ(I − p1x1 − p2x2). (3)
Since, for any function f, we have max f = − min(−f), this game is a zero-sum game: the payoff of the kid is L and the parent's payoff is −L, so that the total payoff is always 0. Now the kid maximizes expression (3) by choosing optimal levels of x1 and x2, whereas the parent minimizes (3) by choosing an optimal level of λ:
min_λ max_{x1,x2} L(x1, x2, λ) = u(x1, x2) + λ(I − p1x1 − p2x2).
In equilibrium, the optimally chosen level of consumption, x*, has to be the best response to the optimal level λ*, and vice versa. In other words, when we fix the level x*, the parent chooses an optimal λ*, and when we fix the level λ*, the kid chooses an optimal x*. In equilibrium, no one wants to deviate from their optimal choice. Could it be an equilibrium for the parent to choose a very large λ? No, because then the kid would not spend any money on consumption, but would rather have the maximized expression (3) equal λI.
Since the first-order conditions for minima and maxima are the same, we have
the following first-order conditions for problem (3):
∂L/∂x1 = ∂u/∂x1 − λp1 = 0    (4)
∂L/∂x2 = ∂u/∂x2 − λp2 = 0    (5)
∂L/∂λ = I − p1x1 − p2x2 = 0.
Here we have three equations in three unknowns, which we can solve for the optimal choice x*, λ*.
Before solving this problem for an example, we can think about it in more formal terms. The basic idea is as follows: just as a necessary condition for a maximum in a one-variable maximization problem is that the derivative equals 0 (f′(x) = 0), a necessary condition for a maximum in several variables is that all partial derivatives equal 0 (∂f(x)/∂xi = 0). To see why, recall that the partial derivative reflects the change as xi increases while the other variables are held constant. If any partial derivative were positive, then holding all other variables constant while increasing xi would increase the objective function (similarly, if the partial derivative were negative we could decrease xi). We also need to ensure that the solution is in the budget set, which typically won't happen if we just try to maximize u. Basically, we impose a "cost" on consumption (the punishment in the game above), proceed with unconstrained maximization for the induced problem, and set this cost so that the maximum lies in the budget set.
Notice that the first-order conditions (4) and (5) imply that
(∂u/∂x1)/p1 = λ = (∂u/∂x2)/p2,
or
(∂u/∂x1)/(∂u/∂x2) = p1/p2,
which is precisely the "MRS = price ratio" condition for optimality that we saw before.
Finally, it should be noted that the FOCs are necessary for optimality, but
they are not, in general, sufficient for the solution to be a maximum. However,
whenever u(x) is a concave function the FOCs are also sufficient to ensure that the
solution is a maximum. In most situations, the utility function will be concave.
Example 2. We can consider the problem of deriving demands for a Cobb-Douglas utility function using the Lagrange approach. The associated Lagrangian is
L(x1, x2, λ) = x1^α x2^(1−α) + λ(I − p1x1 − p2x2),
which yields the associated FOCs
∂L/∂x1 = α x1^(α−1) x2^(1−α) − λp1 = α (x2/x1)^(1−α) − λp1 = 0    (6)
∂L/∂x2 = (1 − α) x1^α x2^(−α) − λp2 = (1 − α) (x1/x2)^α − λp2 = 0    (7)
∂L/∂λ = I − p1x1 − p2x2 = 0.    (8)
We have three equations in three unknowns (x1, x2, λ), so this system should be solvable. Notice that since x2/x1 and x1/x2 cannot both be 0, we cannot have a solution to equations (6) and (7) with λ = 0. Consequently we must have p1x1 + p2x2 = I in order to satisfy equation (8). Solving for λ in the above equations tells us that
λ = (α/p1)(x2/x1)^(1−α) = ((1 − α)/p2)(x1/x2)^α,
and so
p2x2 = ((1 − α)/α) p1x1.
Combining this with the budget constraint gives
p1x1 + ((1 − α)/α) p1x1 = (1/α) p1x1 = I.
So the Marshallian² demand functions are
x1* = αI/p1
and
x2* = (1 − α)I/p2.
So we see that the result of the Lagrangian approach is the same as that of the substitution approach. Using equation (6) or (7) together with the optimal demands x1* and x2* gives the following expression for λ:
λ* = (α/p1)^α ((1 − α)/p2)^(1−α).
(For the logarithmic version of these preferences, u = α ln x1 + (1 − α) ln x2, the same steps give λ* = 1/I, which is the case worked out in Section 6.2 below.) In either case, λ* equals the derivative of the Lagrangian L with respect to income I. We call this derivative, ∂L/∂I, the marginal utility of money.
2After the British economist Alfred Marshall.
6 Value Function and Comparative Statics
6.1 Indirect Utility Function
The indirect utility function is defined as
V(p1, p2, I) ≜ u(x1*(p1, p2, I), x2*(p1, p2, I)).
Thus V is the maximum utility that can be achieved given the prices and the income level. We shall show later that λ is the same as
∂V(p1, p2, I)/∂I.
6.2 Interpretation of λ
From the F.O.C.s of the maximization we get
∂L/∂x1 = ∂u/∂x1 − λp1 = 0
∂L/∂x2 = ∂u/∂x2 − λp2 = 0
∂L/∂λ = I − p1x1 − p2x2 = 0.
From the first two equations we get
λ = (∂u/∂xi)/pi.
This means that λ can be interpreted as the per-dollar marginal utility from any good. It also implies, as we have argued before, that the benefit-to-cost ratio is equalized across goods. We can also interpret λ as the shadow value of money, but we explain this concept later. Before that, let us solve an example and find the value of λ for that problem.
Let us work with the utility function
u(x1, x2) = α ln x1 + (1 − α) ln x2.
The F.O.C.s are then given by
∂L/∂x1 = α/x1 − λp1 = 0    (9)
∂L/∂x2 = (1 − α)/x2 − λp2 = 0    (10)
∂L/∂λ = I − p1x1 − p2x2 = 0.    (11)
From the first two equations, (9) and (10), we get
x1 = α/(λp1)   and   x2 = (1 − α)/(λp2).
Plugging these into the F.O.C. (11) we get
I = α/λ + (1 − α)/λ  ⇒  λ* = 1/I.
6.3 Comparative Statics
Let f : R × R → R be a function that depends on an endogenous variable, say x, and an exogenous variable a, so we write f(x, a). Define the value function as the maximized value of f with respect to x, i.e.
v(a) ≜ max_x f(x, a).
Let x*(a) be the value of x that maximizes f given the value of a. Therefore
v(a) = f(x*(a), a).
To find the effect of changing the value of the exogenous variable a on the maximized value of f, we differentiate v with respect to a:
v′(a) = (∂f/∂x)·(dx*/da) + ∂f/∂a.
But from the F.O.C. of the maximization of f we know that
∂f/∂x (x*(a), a) = 0.
Therefore we get
v′(a) = ∂f/∂a (x*(a), a).
Thus the effect of a change in the exogenous variable on the value function is only its direct effect on the objective function. This is referred to as the Envelope Theorem.
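A short worked illustration of the Envelope Theorem (the function here is an example of our own choosing): let f(x, a) = −x² + ax. The F.O.C. gives x*(a) = a/2, so v(a) = f(x*(a), a) = a²/4 and hence v′(a) = a/2. The Envelope Theorem gives the same answer directly: v′(a) = ∂f/∂a evaluated at x*(a), which is x = a/2.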
In the case of utility maximization the value function is the indirect utility function. We can also write the indirect utility function as
V(p1, p2, I) ≜ u(x1*, x2*) + λ*[I − p1x1* − p2x2*] = L(x1*, x2*, λ*).
Therefore
∂V/∂I = (∂u/∂x1)(∂x1*/∂I) + (∂u/∂x2)(∂x2*/∂I) − λ*p1(∂x1*/∂I) − λ*p2(∂x2*/∂I) + [I − p1x1* − p2x2*](∂λ*/∂I) + λ*
      = λ*   (by the Envelope Theorem).
Therefore we see that λ* is the marginal value of money at the optimum. If the income constraint is relaxed by a dollar, the maximum utility of the consumer increases by λ*, and hence λ* is interpreted as the shadow value of money.
7 Expenditure Minimization
Instead of maximizing utility subject to a given income, we can also minimize expenditure subject to achieving a given level of utility ū. In this case the consumer wants to spend as little money as possible to enjoy a certain utility. Formally, we write
min_x p1x1 + p2x2   s.t.   u(x) ≥ ū.    (12)
We can set up the Lagrange expression for this problem as follows:
L(x1, x2, λ) = p1x1 + p2x2 + λ[ū − u(x1, x2)].
The F.O.C.s are now
∂L/∂x1 = p1 − λ ∂u/∂x1 = 0
∂L/∂x2 = p2 − λ ∂u/∂x2 = 0
∂L/∂λ = ū − u(x1, x2) = 0.
Comparing the first two equations we get
(∂u/∂x1)/(∂u/∂x2) = p1/p2.
This is exactly the relation we obtained in the utility maximization program; therefore the two programs are equivalent exercises. In the language of mathematics this is called duality. But the values of x1 and x2 that minimize expenditure are functions of the utility level ū instead of income, as in the case of utility maximization. The result of this optimization problem is again a demand function, but in general it is different from x*(p1, p2, I). We call the demand function derived from problem (12) the compensated demand or Hicksian demand.³ We denote it by
h1(p1, p2, ū) and h2(p1, p2, ū).
Note that compensated demand is a function of prices and the utility level, whereas uncompensated demand is a function of prices and income. Plugging compensated demand into the objective function (p1x1 + p2x2) yields the expenditure function as a function of prices and ū:
E(p1, p2, ū) = p1 h1(p1, p2, ū) + p2 h2(p1, p2, ū).
Hence the expenditure function measures the minimal amount of money required to buy a bundle that yields a utility of ū.
Uncompensated and compensated demand functions usually differ from each other, which is immediately clear from the fact that they have different arguments. There is a special case in which they are identical. First, note that the indirect utility and expenditure functions are related by the following relationships:
V(p1, p2, E(p1, p2, ū)) = ū
E(p1, p2, V(p1, p2, I)) = I.
That is, if income is exactly equal to the expenditure necessary to achieve utility level ū, then the resulting indirect utility is equal to ū. Similarly, if the required utility level is set equal to the indirect utility when income is I, then minimized expenditure will be equal to I. Using these relationships, uncompensated and compensated demand are equal in the following two cases:
xi*(p1, p2, I) = hi*(p1, p2, V(p1, p2, I))
xi*(p1, p2, E(p1, p2, ū)) = hi*(p1, p2, ū)   for i = 1, 2.    (13)
Now we can express income and substitution effects analytically. Start with one component of equation (13):
hi*(p1, p2, ū) = xi*(p1, p2, E(p1, p2, ū))
3After the British economist Sir John Hicks, co-recipient of the 1972 Nobel Prize in Economic
Sciences.
and take the derivative with respect to pj using the chain rule:
∂hi*/∂pj = ∂xi*/∂pj + (∂xi*/∂I)·(∂E/∂pj).    (14)
Now we have to find an expression for ∂E/∂pj. Start with the Lagrangian associated with problem (12), evaluated at the optimal solution (h*(p1, p2, ū), λ*(p1, p2, ū)):
L(h*(p1, p2, ū), λ*(p1, p2, ū)) = p1 h1*(p1, p2, ū) + p2 h2*(p1, p2, ū) + λ*(p1, p2, ū)[ū − u(h*(p1, p2, ū))].
Taking the derivative with respect to any price pj, and noting that ū = u(h*(p, ū)) at the optimum, we get
∂L(h*(p, ū), λ*(p, ū))/∂pj = hj* + Σi pi (∂hi*/∂pj) − λ* Σi (∂u/∂xi)(∂hi*/∂pj)
                           = hj* + Σi [ pi − λ* (∂u/∂xi) ] (∂hi*/∂pj).
But the first-order conditions for this Lagrangian are
pi − λ (∂u/∂xi) = 0   for all i.
Hence
∂E/∂pj = ∂L/∂pj = hj*(p1, p2, ū).
This result also follows from the Envelope Theorem. Moreover, from equation (13) it follows that hj* = xj*. Hence, using these two facts and moving the second term on the RHS to the LHS, we can rewrite equation (14) as
∂xi*/∂pj = ∂hi*/∂pj − xj* (∂xi*/∂I),
where the first term on the right is the substitution effect (SE) and the second term is the income effect (IE). This equation is known as the Slutsky Equation⁴ and shows formally that the price effect can be separated into a substitution effect and an income effect.
4After the Russian statistician and economist Eugen Slutsky.
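As an added check of the Slutsky equation in the Cobb-Douglas example: with x1* = αI/p1 we have ∂x1*/∂p1 = −αI/p1² and ∂x1*/∂I = α/p1, so the income effect term is −x1*·(∂x1*/∂I) = −α²I/p1². The substitution effect is therefore ∂h1*/∂p1 = ∂x1*/∂p1 + x1*·(∂x1*/∂I) = −αI/p1² + α²I/p1² = −α(1 − α)I/p1² < 0, as the theory requires.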
8 Categories of Goods and Elasticities
Definition 11. A normal good is a commodity whose Marshallian demand is positively related to income, i.e. as income goes up the uncompensated demand for that good goes up as well. Therefore good i is normal if
∂xi*/∂I > 0.
Definition 12. An inferior good is a commodity whose Marshallian demand is negatively related to income, i.e. as income goes up the uncompensated demand for that good goes down. Therefore good i is inferior if
∂xi*/∂I < 0.
Definition 13. Two goods are gross substitutes if a rise in the price of one good raises the uncompensated demand for the other good. Therefore goods i and j are gross substitutes if
∂xi*/∂pj > 0.
Definition 14. Two goods are net substitutes if a rise in the price of one good raises the compensated (Hicksian) demand for the other good. Therefore goods i and j are net substitutes if
∂hi*/∂pj > 0.
Definition 15. Two goods are net complements if a rise in the price of one good reduces the compensated (Hicksian) demand for the other good. Therefore goods i and j are net complements if
∂hi*/∂pj < 0.
8.1 Shape of Expenditure Function
The expenditure function in the n-commodity case is given by
E(p1, p2, . . . , pn, ū) ≜ Σ_{i=1}^{n} pi hi*(p1, p2, . . . , pn, ū).
Now let us look at the effect of changing the price pi on expenditure. By the Envelope Theorem we get
∂E/∂pi = hi*(p1, p2, . . . , pn, ū) > 0.
Therefore the expenditure function is positively sloped: when prices go up, the minimum expenditure required to reach a certain utility level also goes up. To find the curvature of the expenditure function we take the second derivative:
∂²E/∂pi² = ∂hi*/∂pi < 0.
This implies that the expenditure function is concave in prices.
Definition 16. A Giffen good is one whose Marshallian demand is positively related to its own price. Therefore good i is Giffen if
∂xi*/∂pi > 0.
But from the Hicksian demand we know that
∂hi*/∂pi < 0.
Hence, from the Slutsky equation,
∂xi*/∂pi = ∂hi*/∂pi − xi* (∂xi*/∂I),
we get that for a good to be Giffen we must have
∂xi*/∂I < 0,
and xi* needs to be large enough for the income effect to overcome the substitution effect.
Definition 17. A luxury good is defined as one for which the income elasticity of demand is greater than one. Therefore, for a luxury good i,
εi,I = (∂xi*/∂I) / (xi*/I) > 1.
We can also define a luxury good in the following alternative way.
Definition 18. If the budget share of a good is increasing in income then it is a luxury good.
Before we explain the equivalence of the two definitions, let us first define the concept of a budget share. The budget share of good i, denoted by si(I), is the fraction of income I that is devoted to expenditure on that good. Therefore
si(I) = pi xi*(p, I) / I.
To see how the two definitions are related we take the derivative of si(I) with respect to I:
dsi(I)/dI = [ (dxi*/dI) pi I − pi xi* ] / I².
Now, if good i is a luxury then we know that
dsi(I)/dI > 0
⇔ (dxi*/dI) pi I − pi xi* > 0
⇔ (dxi*/dI) · I / xi* > 1
⇔ εi,I > 1.
Therefore we see that the two definitions of a luxury good are equivalent. Hence a luxury good is one on which a consumer spends proportionally more as her income goes up.
8.2 Elasticities
Definition 19. Revenue from a good i is defined as
Ri(pi) = pi xi*.
Differentiating Ri(pi) with respect to pi we get
Ri′(pi) = xi* + pi (dxi*/dpi) = xi* [ 1 + (pi/xi*)(dxi*/dpi) ] = xi* [1 + εi,i],
where here εi,i denotes the own-price elasticity without the sign change, so that it is typically negative. We say that:
Demand is inelastic if Ri′(pi) > 0, i.e. εi,i ∈ (−1, 0).
Demand is elastic if Ri′(pi) < 0, i.e. εi,i ∈ (−∞, −1).
Demand is unit-elastic if Ri′(pi) = 0, i.e. εi,i = −1.
9 Welfare Measurement
In order to make welfare comparisons across different price situations it is important that we move out of utility space and deal with money, since money gives us an objective measure that we can compare across individuals, unlike utility. Let the initial price vector be
p^0 = (p_1^0, p_2^0, . . . , p_n^0)
and the new price vector be
p^1 = (p_1^1, p_2^1, . . . , p_n^1).
9.1 Compensating Variation
The notion of compensating variation asks how much additional income is required to maintain the initial level of utility under the new prices:
CV = E(p^1, u_0) − E(p^0, u_0),
where E(p^0, u_0) is the expenditure function evaluated at prices p^0 and utility level u_0. This gives us a measure of the loss or gain of welfare of one individual, in terms of money, due to the change in prices.
9.2 Equivalent Variation
The notion of equivalent variation asks how much additional income is required to raise the level of utility from the initial level to the specified new level at the original prices:
EV = E(p^0, u_1) − E(p^0, u_0).
Suppose the change in prices from p^0 to p^1 occurs only through a change in the price of commodity 1, with p_1^1 > p_1^0 and p_i^1 = p_i^0 for all other i = 2, 3, . . . , n. Then we can write
CV = E(p^1, u_0) − E(p^0, u_0)
   = ∫ from p_1^0 to p_1^1 of [∂E(p_1, u_0)/∂p_1] dp_1
   = ∫ from p_1^0 to p_1^1 of h_1*(p_1, u_0) dp_1.
9.3 Introduction of New Product
Let us think of a scenario where a new product is introduced; call it commodity k. This can be thought of as a reduction of the price of that product from p_k = ∞ to p_k = p̄, where p̄ is the price of the new product. Then one can measure the welfare gain of introducing the new product by calculating the CV associated with the change in the price of the new product from infinity to p̄:
CV = − ∫ from p̄ to ∞ of h_k*(p_k, u_0) dp_k.
9.4 Inflation Measurement
Let the reference consumption bundle be denoted by
x^0 = (x_1^0, x_2^0, . . . , x_n^0)
and the reference prices by
p^0 = (p_1^0, p_2^0, . . . , p_n^0).
Then one measure of inflation is the Laspeyres price index,
I_L = (p^1 · x^0) / (p^0 · x^0).
The other measure is the Paasche price index,
I_P = (p^1 · x^1) / (p^0 · x^1),
where x^1 is the consumption bundle purchased at the new prices p^1. Here the reference bundle x^0 is the optimal bundle under the price situation p^0. Therefore we can say
p^0 · x^0 = E(p^0, u_0),
where u_0 is the utility level achieved with prices p^0 and income p^0 · x^0. Now, given the new price situation p^1, we know that
p^1 · x^0 ≥ E(p^1, u_0)
⇒ I_L = (p^1 · x^0) / (p^0 · x^0) ≥ E(p^1, u_0) / E(p^0, u_0).
Hence we see that the Laspeyres price index is an overestimate of the true price change.
10 Pareto Efficiency and Competitive Equilibrium
We now consider a model with many agents where we make prices endogenous (and later incomes as well). Let there be I individuals, indexed by i = 1, 2, . . . , I, and K commodities, indexed by k = 1, 2, . . . , K. Let the consumption bundle of agent i be denoted by
x^i = (x_1^i, x_2^i, . . . , x_K^i),
let the utility function of individual i be
u^i : R^K_+ → R,
and let society's endowment of commodities be denoted by
e = (e_1, e_2, . . . , e_K).
A social allocation is a vector of consumption bundles for all the individuals,
x = (x^1, x^2, . . . , x^i, . . . , x^I).
The total consumption of commodity k by all the individuals cannot exceed the endowment of that commodity; this is referred to as the feasibility constraint. We say a social allocation is feasible if
Σ_{i=1}^{I} x_k^i ≤ e_k   for all k = 1, 2, . . . , K,
which represents the K feasibility constraints.
[Figure: Edgeworth box, with F's bananas and F's coconuts measured from Friday's origin and R's bananas and R's coconuts from Robinson's origin; the dotted locus of tangencies marks the Pareto efficient allocations.]
Definition 20. An allocation x is Pareto efficient if it is feasible and there exists no other feasible allocation y such that nobody is worse off and at least one individual is strictly better off, i.e. there is no feasible y such that for all i
u^i(y^i) ≥ u^i(x^i)
and for some i′
u^{i′}(y^{i′}) > u^{i′}(x^{i′}).
We say that an allocation y is Pareto superior to another allocation x if for all i
u^i(y^i) ≥ u^i(x^i)
and for some i′
u^{i′}(y^{i′}) > u^{i′}(x^{i′}),
and we say that y Pareto dominates x if for all i
u^i(y^i) > u^i(x^i).
In a two-agent (say, Robinson and Friday), two-good (say, coconuts and bananas) economy we can represent the allocations in an Edgeworth box. Note that we have a total of four axes in the Edgeworth box. The origin for Friday is in the south-west corner: the amount of bananas he consumes is measured along the lower horizontal axis and his amount of coconuts along the left vertical axis. For Robinson, the origin is in the north-east corner; the upper horizontal axis depicts Robinson's banana consumption, and the right vertical axis measures his coconut consumption. The height and width of the Edgeworth box are one each, since there are one banana and one coconut in this economy. Hence the endowment bundle is the south-east corner, where the amount of Friday's bananas and the amount of Robinson's coconuts are both equal to one. This also implies that Friday's utility increases as he moves up and to the right in the Edgeworth box, whereas Robinson is better off the further down and to the left he gets. Any point inside the lens formed by the two ICs is an allocation that gives both Robinson and Friday higher utility; hence any such point is a Pareto superior allocation to the initial one. A point where the two ICs are tangent to each other is a Pareto efficient point: starting from that allocation, it is not possible to raise one individual's utility without reducing the other's. Hence the set of Pareto efficient allocations in this economy is the set of points in the Edgeworth box where the two ICs are tangent to each other. This is depicted as the dotted line in the box. It is evident from the picture that there can be many Pareto efficient allocations. In particular, allocations that give all of society's endowment to either Robinson or Friday are also Pareto efficient, as any other allocation would reduce that person's utility.
10.1 Competitive Equilibrium
A competitive equilibrium is the pair (p, x), where p is the price vector for the K
commodities:
$$p = (p_1, \ldots, p_k, \ldots, p_K),$$
and $x$ is the allocation:
$$x = (x^1, x^2, \ldots, x^i, \ldots, x^I),$$
such that markets clear for all commodities $k$:
$$\sum_{i=1}^{I} x^i_k \le e_k,$$
the allocation is affordable for each individual $i$:
$$p \cdot x^i \le p \cdot e^i,$$
and for each individual $i$ there is no $y^i$ such that
$$p \cdot y^i \le p \cdot e^i \quad \text{and} \quad u^i(y^i) > u^i(x^i).$$
11 Social Welfare
Here we are trying to formalize the problem from the point of view of a social planner. The social planner has endowments given by the endowment vector $e = (e_1, e_2, \ldots, e_K)$ and attaches weight $\alpha^i$ to individual $i$'s utility. So for him the
optimization problem is given by:
$$\max_{x^1, x^2, \ldots, x^I} \; \sum_{i=1}^{I} \alpha^i u^i(x^i), \qquad \alpha^i \ge 0, \quad \sum_{i=1}^{I} \alpha^i = 1,$$
subject to
$$\sum_{i=1}^{I} x^i_k \le e_k \quad \forall\, k = 1, 2, \ldots, K.$$
The Lagrangian is given by
$$\mathcal{L}(x, \lambda) = \sum_{i=1}^{I} \alpha^i u^i(x^i) + \sum_{k=1}^{K} \lambda_k \left( e_k - \sum_{i=1}^{I} x^i_k \right).$$
The first order conditions for individual i and for any two goods k and l are:
$$x^i_k: \quad \alpha^i \frac{\partial u^i(x^i)}{\partial x^i_k} - \lambda_k = 0,$$
$$x^i_l: \quad \alpha^i \frac{\partial u^i(x^i)}{\partial x^i_l} - \lambda_l = 0.$$
If we consider the ratio for any two commodities, we get for all $i$ and for any pair $k, l$ of commodities:
$$\frac{\alpha^i \, \partial u^i(x^i)/\partial x^i_k}{\alpha^i \, \partial u^i(x^i)/\partial x^i_l} = \frac{\lambda_k}{\lambda_l}.$$
This means that the MRS between any two goods $k$ and $l$ is the same across individuals,
which is the condition for Pareto Optimality. Hence a specific profile of weights
$\alpha = (\alpha^1, \alpha^2, \ldots, \alpha^I)$ will give us a specific allocation among the set of Pareto
efficient allocations. Therefore we have the following powerful theorem:
Theorem 1. The set of Pareto efficient allocations and the set of welfare maximizing allocations across all possible vectors of weights are identical.
Below we solve a particular example with Cobb-Douglas preferences.
Example 3. Let Ann and Bob have the following preferences:
$$u^A(x^A_1, x^A_2) = \alpha \ln x^A_1 + (1 - \alpha) \ln x^A_2,$$
$$u^B(x^B_1, x^B_2) = \beta \ln x^B_1 + (1 - \beta) \ln x^B_2.$$
Let the weight on Ann's utility function be $\gamma$ and therefore the weight on Bob's utility function is $(1 - \gamma)$. The Lagrangian is then given by
$$\mathcal{L}(x, \lambda) = \gamma u^A + (1 - \gamma) u^B + \lambda_1 [e_1 - x^A_1 - x^B_1] + \lambda_2 [e_2 - x^A_2 - x^B_2].$$
The F.O.C.s are then given by
$$\frac{\partial \mathcal{L}(x, \lambda)}{\partial x^A_1} = \frac{\gamma \alpha}{x^A_1} - \lambda_1 = 0,$$
$$\frac{\partial \mathcal{L}(x, \lambda)}{\partial x^A_2} = \frac{\gamma (1 - \alpha)}{x^A_2} - \lambda_2 = 0,$$
$$\frac{\partial \mathcal{L}(x, \lambda)}{\partial x^B_1} = \frac{(1 - \gamma) \beta}{x^B_1} - \lambda_1 = 0,$$
$$\frac{\partial \mathcal{L}(x, \lambda)}{\partial x^B_2} = \frac{(1 - \gamma)(1 - \beta)}{x^B_2} - \lambda_2 = 0.$$
Hence we get
$$(1 - \gamma)\beta \, x^A_1 = \gamma \alpha \, x^B_1 \quad \text{and} \quad (1 - \gamma)(1 - \beta) \, x^A_2 = \gamma (1 - \alpha) \, x^B_2.$$
Thus from the feasibility conditions we get that the allocations are
$$x^A_1 = \frac{\gamma \alpha}{\gamma \alpha + (1 - \gamma)\beta}\, e_1, \qquad x^A_2 = \frac{\gamma (1 - \alpha)}{\gamma (1 - \alpha) + (1 - \gamma)(1 - \beta)}\, e_2,$$
$$x^B_1 = \frac{(1 - \gamma)\beta}{\gamma \alpha + (1 - \gamma)\beta}\, e_1, \qquad x^B_2 = \frac{(1 - \gamma)(1 - \beta)}{\gamma (1 - \alpha) + (1 - \gamma)(1 - \beta)}\, e_2.$$
Here by varying the value of γ in its range [0,1] we can generate the whole set of
Pareto efficient allocations.
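A quick numerical check of these closed-form allocations is possible by plugging in arbitrary parameter values. The sketch below is only an illustration (the values of α, β, γ and the endowments are not from the notes); it verifies that the allocation exhausts the endowments and that the two agents' marginal rates of substitution are equalized, which is the Pareto efficiency condition derived above.

# Numerical check of the planner's allocation with Cobb-Douglas (log) preferences.
# Parameter values are arbitrary, chosen only for illustration.
alpha, beta, gamma = 0.4, 0.7, 0.3   # Ann's weight on good 1, Bob's weight on good 1, planner weight on Ann
e1, e2 = 10.0, 5.0                   # social endowments of goods 1 and 2

den1 = gamma * alpha + (1 - gamma) * beta
den2 = gamma * (1 - alpha) + (1 - gamma) * (1 - beta)

xA1 = gamma * alpha / den1 * e1
xB1 = (1 - gamma) * beta / den1 * e1
xA2 = gamma * (1 - alpha) / den2 * e2
xB2 = (1 - gamma) * (1 - beta) / den2 * e2

# Feasibility: the allocation exhausts the endowment of each good.
assert abs(xA1 + xB1 - e1) < 1e-9 and abs(xA2 + xB2 - e2) < 1e-9

# Pareto efficiency: the MRS between the two goods is equalized across agents.
mrs_A = (alpha / xA1) / ((1 - alpha) / xA2)
mrs_B = (beta / xB1) / ((1 - beta) / xB2)
assert abs(mrs_A - mrs_B) < 1e-9
print(xA1, xA2, xB1, xB2, mrs_A)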

12 Competitive Equilibrium Continued
Example 4. We now consider a simple example, where Friday is endowed with the
only (perfectly divisible) banana and Robinson is endowed with the only coconut.
That is $e^F = (1, 0)$ and $e^R = (0, 1)$. To keep things simple suppose that both
agents have the same utility function
$$u(x_B, x_C) = \alpha \sqrt{x_B} + \sqrt{x_C},$$
and we consider the case where α > 1, so there is a preference for bananas over
coconuts that both agents share. We can determine the indifference curves for
both Robinson and Friday that correspond to the same utility level that the initial
endowments provide. The indifference curves are given by
$$u^F(e^F_B, e^F_C) = \alpha \sqrt{e^F_B} + \sqrt{e^F_C} = \alpha = u^F(1, 0),$$
$$u^R(e^R_B, e^R_C) = \alpha \sqrt{e^R_B} + \sqrt{e^R_C} = 1 = u^R(0, 1).$$
All the allocations between these two indifference curves are Pareto superior to
the initial endowment.
We can define the net trade for Friday (and similarly for Robinson) by
$$z^F_B = x^F_B - e^F_B, \qquad z^F_C = x^F_C - e^F_C.$$
Notice that since initially Friday had all the bananas and none of the coconuts,
$$z^F_B \le 0, \qquad z^F_C \ge 0.$$
There could be many Pareto efficient allocations (e.g., Friday gets everything,
Robinson gets everything, etc.), but we can calculate which allocations are Pareto
optimal. If the indifference curves at an allocation are tangent then the marginal
rates of substitution must be equated. In this case, the resulting condition is
$$\frac{\partial u^F/\partial x^F_B}{\partial u^F/\partial x^F_C} = \frac{\alpha/(2\sqrt{x^F_B})}{1/(2\sqrt{x^F_C})} = \frac{\alpha/(2\sqrt{x^R_B})}{1/(2\sqrt{x^R_C})} = \frac{\partial u^R/\partial x^R_B}{\partial u^R/\partial x^R_C},$$
which simplifies to
$$\frac{\sqrt{x^F_C}}{\sqrt{x^F_B}} = \frac{\sqrt{x^R_C}}{\sqrt{x^R_B}},$$
and, of course, since there is a total of one unit of each commodity, for market clearing we must have
$$x^R_C = 1 - x^F_C, \qquad x^R_B = 1 - x^F_B,$$
so
$$\frac{\sqrt{x^F_C}}{\sqrt{x^F_B}} = \frac{\sqrt{1 - x^F_C}}{\sqrt{1 - x^F_B}},$$
and squaring both sides
$$\frac{x^F_C}{x^F_B} = \frac{1 - x^F_C}{1 - x^F_B},$$
which implies that
$$x^F_C - x^F_C x^F_B = x^F_B - x^F_C x^F_B,$$
and so
$$x^F_C = x^F_B, \qquad x^R_C = x^R_B.$$
What are the conditions necessary for an equilibrium? First we need the conditions
for Friday to be optimizing. We can write Robinson’s and Friday’s optimization
problems as the corresponding Lagrangian, where we generalize the endowments
to any $e^R = (e^R_B, e^R_C)$ and $e^F = (e^F_B, e^F_C)$:
$$\mathcal{L}(x^F_B, x^F_C, \lambda^F) = \alpha \sqrt{x^F_B} + \sqrt{x^F_C} + \lambda^F \left( p_B e^F_B + e^F_C - p_B x^F_B - x^F_C \right), \qquad (15)$$
where we normalize pC = 1 without loss of generality. A similar Lagrangian can
be set up for Robinson’s optimization problem. The first-order conditions for (15)
are
$$\frac{\partial \mathcal{L}}{\partial x^F_B} = \frac{\alpha}{2\sqrt{x^F_B}} - \lambda^F p_B = 0, \qquad (16)$$
$$\frac{\partial \mathcal{L}}{\partial x^F_C} = \frac{1}{2\sqrt{x^F_C}} - \lambda^F = 0, \qquad (17)$$
$$\frac{\partial \mathcal{L}}{\partial \lambda^F} = p_B e^F_B + e^F_C - p_B x^F_B - x^F_C = 0. \qquad (18)$$
Solving as usual by taking the ratio of equations (16) and (17) we get the following
expression for the relative (to coconuts) price of bananas
$$p_B = \alpha \frac{\sqrt{x^F_C}}{\sqrt{x^F_B}},$$
so that we can solve for $x^F_C$ as a function of $x^F_B$:
$$x^F_C = \left( \frac{p_B}{\alpha} \right)^2 x^F_B.$$
Plugging this into the budget constraint from equation (18) we get
$$p_B x^F_B + \left( \frac{p_B}{\alpha} \right)^2 x^F_B = p_B e^F_B + e^F_C.$$
Then we can solve for Friday’s demand for bananas
$$x^F_B = \frac{p_B e^F_B + e^F_C}{p_B + (p_B/\alpha)^2},$$
and for coconuts
$$x^F_C = \left( \frac{p_B}{\alpha} \right)^2 \frac{p_B e^F_B + e^F_C}{p_B + (p_B/\alpha)^2}.$$
The same applies to Robinson’s demand functions, of course.
Now we have to solve for the equilibrium price pB. To do that we use the
market clearing condition for bananas, which says that demand has to equal supply
(endowment):
$$x^F_B + x^R_B = e^F_B + e^R_B.$$
Inserting the demand functions yields
$$\frac{p_B e^F_B + e^F_C}{p_B + (p_B/\alpha)^2} + \frac{p_B e^R_B + e^R_C}{p_B + (p_B/\alpha)^2} = e^F_B + e^R_B = e_B,$$
where $e_B$ is the social endowment of bananas and we define $e_C = e^F_C + e^R_C$. We solve this equation to get the equilibrium price of bananas in the economy:
$$p^*_B = \alpha \sqrt{\frac{e_C}{e_B}}.$$
So we have solved for the equilibrium price in terms of the primitives of the economy. This price makes sense intuitively. It reflects relative scarcity in the economy
(when there are relatively more bananas than coconuts, bananas are cheaper) and
preferences (when consumers value bananas more, i.e., when α is larger, they cost
more). We can then plug this price back into the demand functions found above for both agents to obtain an expression for consumption in terms of the primitives.
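As a sanity check, the equilibrium price can be verified numerically: compute both agents' banana demands at $p_B = \alpha\sqrt{e_C/e_B}$ and confirm that the banana market clears. The sketch below is only an illustration; the parameter values follow the example (α > 1, Friday owns the banana, Robinson the coconut), but any endowments could be used.

import math

alpha = 2.0
eFB, eFC = 1.0, 0.0   # Friday's endowments of bananas and coconuts
eRB, eRC = 0.0, 1.0   # Robinson's endowments
eB, eC = eFB + eRB, eFC + eRC

def banana_demand(pB, eiB, eiC):
    # Demand for bananas derived from the FOCs, with p_C normalized to 1.
    return (pB * eiB + eiC) / (pB + (pB / alpha) ** 2)

pB_star = alpha * math.sqrt(eC / eB)   # closed-form equilibrium price

total_demand = banana_demand(pB_star, eFB, eFC) + banana_demand(pB_star, eRB, eRC)
assert abs(total_demand - eB) < 1e-9   # the banana market clears at p_B*
print(pB_star, total_demand)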
Now we mention the two fundamental welfare theorems which lay the foundation for taking competitive markets as the benchmark for any study of markets
and prices. The first one states that competitive equilibrium allocations are always
Pareto efficient and the second one states that any Pareto efficient allocation can
be achieved as an outcome of competitive equilibrium.
Theorem 2. (First Welfare Theorem) Every competitive equilibrium allocation $x^*$ is Pareto efficient.
Proof. Suppose not. Then there exists another allocation $y$, which is feasible, such that
for all $i$: $u^i(y^i) \ge u^i(x^{*i})$, and for some $i'$: $u^{i'}(y^{i'}) > u^{i'}(x^{*i'})$.
If $u^i(y^i) \ge u^i(x^{*i})$, then the budget constraint (and monotone preferences) implies that
$$\sum_{k=1}^{K} p_k y^i_k \ge \sum_{k=1}^{K} p_k x^{*i}_k, \qquad (19)$$
and for some $i'$
$$\sum_{k=1}^{K} p_k y^{i'}_k > \sum_{k=1}^{K} p_k x^{*i'}_k. \qquad (20)$$
Equations (19) and (20) imply that
$$\sum_{i=1}^{I} \sum_{k=1}^{K} p_k y^i_k > \sum_{i=1}^{I} \sum_{k=1}^{K} p_k x^{*i}_k = \sum_{k=1}^{K} p_k e_k,$$
where the left-most term is the aggregate expenditure and the right-most term is the value of the social endowment. This is a contradiction, because feasibility of $y$ means that
$$\sum_{i=1}^{I} y^i_k \le \sum_{i=1}^{I} e^i_k = e_k$$
for every $k$, and hence
$$\sum_{i=1}^{I} \sum_{k=1}^{K} p_k y^i_k \le \sum_{k=1}^{K} p_k e_k.$$
Theorem 3. (Second Welfare Theorem) Every Pareto efficient allocation can be
decentralized as a competitive equilibrium. That is, every Pareto efficient allocation
is the equilibrium for some endowments.
13 Decision Making under Uncertainty
So far, we have assumed that decision makers have all the needed information. This
is not the case in real life. In many situations, individuals or firms make decisions
before knowing what the consequences will be. For example, in financial markets
investors buy stocks without knowing future returns. Insurance contracts exist
because there is uncertainty. If individuals were not uncertain about the possibility of having an accident in the future, there would be no need for car insurance.
Definition 21. $\pi = (\pi_1, \pi_2, \ldots, \pi_N)$ represents a probability distribution if
$$\pi_n \ge 0 \quad \forall\, n = 1, 2, \ldots, N, \quad \text{and} \quad \sum_{n=1}^{N} \pi_n = 1.$$
Now to conceptualize uncertainty we define the concept of lottery.
Definition 22. A lottery L is defined as follows:
L = (x; π) = (x1, x2, . . . , xN ; π1, π2, . . . , πN )
where $x = (x_1, x_2, \ldots, x_N) \in \mathbb{R}^N$ is a profile of money awards (positive or negative)
to be gained in N different states and π = (π1, π2, . . . , πN ) is the probability
distribution over the N states.
13.1 St. Petersburg Paradox
Here we talk about a well-known lottery known as the St. Petersburg paradox
which was proposed by Daniel Bernoulli in 1736. A fair coin is tossed until head
comes up for the first time. Then the reward paid out is equal to $2^{n-1}$, where $n$ is the number of coin tosses that were necessary for heads to come up once. This
lottery is described formally as
$$L_{SP} = \left( 1, 2, 4, \ldots, 2^{n-1}, \ldots ;\; \frac{1}{2}, \frac{1}{4}, \frac{1}{8}, \ldots, \frac{1}{2^n}, \ldots \right).$$
Its expected value is
$$E[L_{SP}] = \sum_{n=1}^{\infty} \pi_n x_n = \sum_{n=1}^{\infty} \frac{1}{2^n}\, 2^{n-1} = \sum_{n=1}^{\infty} \frac{1}{2} = \infty.$$
Hence the expected payoff from this lottery is infinitely large and an individual
offered this lottery should be willing to pay an infinitely large amount for the right
to play this lottery. This is not, however, what people do and hence the paradox.
13.2 Expected Utility
The St. Petersburg paradox emphasizes that expected value may not be the right way to describe an individual's preferences over lotteries. In general, utility over lotteries is a function
$$U : \mathbb{R}^N \times [0, 1]^N \to \mathbb{R}.$$
Expected Utility is a particular formulation that says that there is another utility function defined over money,
$$u : \mathbb{R} \to \mathbb{R},$$
such that the utility over the lottery is of the following form:
$$U(x_1, x_2, \ldots, x_N; \pi_1, \pi_2, \ldots, \pi_N) = \sum_{n=1}^{N} u(x_n)\, \pi_n.$$
Definition 23. A decision maker is called risk averse if the utility function u :
R −→ R+ is concave and she is called risk loving or a risk seeker if u is convex.
Suppose a lottery is given by
$$L = (x_1, x_2; \pi_1, \pi_2).$$
Then the individual is risk averse if
$$\pi_1 u(x_1) + \pi_2 u(x_2) < u(\pi_1 x_1 + \pi_2 x_2),$$
risk loving if
$$\pi_1 u(x_1) + \pi_2 u(x_2) > u(\pi_1 x_1 + \pi_2 x_2),$$
and risk neutral if
$$\pi_1 u(x_1) + \pi_2 u(x_2) = u(\pi_1 x_1 + \pi_2 x_2).$$
13.3 Risky Investment
Consider an individual with wealth $w$ deciding how much to invest in a risky asset which pays return $r_1$ in state 1 and return $r_2$ in state 2, such that
$$(1 + r_1) < 1, \qquad (1 + r_2) > 1.$$
Therefore state 1 is the bad state, which gives a negative return, while state 2 is the good state, which gives a positive return. If $z$ is the amount that is invested in this risky asset then the individual's expected utility is given by
$$U(z) = \pi_1 u((1 + r_1)z - z + w) + \pi_2 u((1 + r_2)z - z + w).$$
So the individual solves the following problem:
$$\max_{z} \; \pi_1 u((1 + r_1)z - z + w) + \pi_2 u((1 + r_2)z - z + w).$$
The marginal utility of investment is given by
$$\frac{dU(z)}{dz} = \pi_1 u'((1 + r_1)z - z + w)\, r_1 + \pi_2 u'((1 + r_2)z - z + w)\, r_2.$$
Therefore the marginal utility at $z = 0$ is given by
$$\left. \frac{dU(z)}{dz} \right|_{z=0} = \pi_1 r_1 u'(w) + \pi_2 r_2 u'(w) = [\pi_1 r_1 + \pi_2 r_2]\, u'(w).$$
Therefore,
$$\left. \frac{dU(z)}{dz} \right|_{z=0} \gtrless 0 \quad \text{according as} \quad \pi_1 r_1 + \pi_2 r_2 \gtrless 0.$$
Hence whether the individual will invest anything in the asset at all depends on the expected return of the asset. If the expected return is positive he will invest
a positive amount irrespective of his degree of risk aversion. The actual value of z
will of course depend on the concavity of his utility function.
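This can be checked numerically. The sketch below is only an illustration (the wealth, returns, probabilities, and the choice of logarithmic utility are assumptions, not from the notes); it searches over a grid of investment levels and reports the expected-utility-maximizing z, which is positive because the expected return is positive.

import math

w = 100.0
r1, r2 = -0.5, 0.6
pi1, pi2 = 0.4, 0.6            # expected return pi1*r1 + pi2*r2 = 0.16 > 0

def expected_utility(z):
    # Final wealth is w + r_s * z in state s; log utility makes the agent risk averse.
    return pi1 * math.log(w + r1 * z) + pi2 * math.log(w + r2 * z)

# Grid search over investment levels that keep wealth positive in the bad state.
grid = [i * 0.01 for i in range(int(w / abs(r1) / 0.01))]
z_star = max(grid, key=expected_utility)
print("optimal investment:", round(z_star, 2))  # positive, but finite because of risk aversion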
14 Theory of Production
We can use tools similar to those we used in the consumer theory section of the class
to study firm behaviour. In that section we assumed that individuals maximize
utility subject to some budget constraint. In this section we assume that firms
will attempt to maximize their profits given a demand schedule and production
technology.
Firms use inputs or commodities x1, . . . , xI to produce an output y. The
amount of output produced is related to the inputs by the production function
y = f(x1, . . . , xI ), which is formally defined as follows:
Definition 24. A production function is a mapping $f : \mathbb{R}^I_+ \to \mathbb{R}_+$.
The prices of the inputs/commodities are p1, . . . , pI and the output price is py.
The firm takes prices as given and independent of its decisions.
Firms maximize their profits by choosing the optimal amount and combination
of inputs.
$$\max_{x_1, \ldots, x_I} \; p_y f(x_1, \ldots, x_I) - \sum_{i=1}^{I} p_i x_i. \qquad (21)$$
Another way to describe firms' decision making is by minimizing the cost necessary to produce an output quantity $\bar{y}$:
$$\min_{x_1, \ldots, x_I} \; \sum_{i=1}^{I} p_i x_i \quad \text{s.t.} \quad f(x_1, \ldots, x_I) \ge \bar{y}.$$
The minimized cost of production, $C(\bar{y})$, is called the cost function.
We make the following assumption for the production function: positive marginal product,
$$\frac{\partial f}{\partial x_i} \ge 0,$$
and declining marginal product,
$$\frac{\partial^2 f}{\partial x_i^2} \le 0.$$
The optimality conditions for the profit maximization problem (21) are the first-order conditions, for all $i$:
$$p_y \frac{\partial f}{\partial x_i} - p_i = 0.$$
In other words, optimal production requires equality between the marginal benefit and the marginal cost of production. The solution to the profit maximization problem then is
$$x^*_i(p_1, \ldots, p_I, p_y), \quad i = 1, \ldots, I, \qquad y^*(p_1, \ldots, p_I, p_y),$$
i.e., optimal demand for inputs and optimal output/supply. The solution of the cost minimization problem, on the other hand, is
$$x^*_i(p_1, \ldots, p_I, \bar{y}), \quad i = 1, \ldots, I,$$
where $\bar{y}$ is the firm's production target.
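To make the profit maximization problem (21) concrete, the sketch below numerically maximizes profit for a two-input Cobb-Douglas technology with decreasing returns (exponents summing to less than one, so a finite optimum exists) and checks the first-order conditions. All numbers and the specific functional form are purely illustrative assumptions.

# Profit maximization with f(x1, x2) = x1**a1 * x2**a2, a1 + a2 < 1 (decreasing returns).
a1, a2 = 0.3, 0.5
py, p1, p2 = 10.0, 2.0, 3.0

def f(x1, x2):
    return x1 ** a1 * x2 ** a2

def profit(x1, x2):
    return py * f(x1, x2) - p1 * x1 - p2 * x2

# Coarse grid search for the profit-maximizing input bundle.
best = max(((x1 * 0.1, x2 * 0.1) for x1 in range(1, 400) for x2 in range(1, 400)),
           key=lambda x: profit(*x))
x1_star, x2_star = best

# At the optimum the FOCs  py * df/dxi = pi  should hold approximately.
mp1 = py * a1 * f(x1_star, x2_star) / x1_star
mp2 = py * a2 * f(x1_star, x2_star) / x2_star
print(x1_star, x2_star, round(mp1, 2), round(mp2, 2))   # mp1 close to p1, mp2 close to p2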
Example 5. One commonly used production function is the Cobb-Douglas production function, where
$$f(K, L) = K^{\alpha} L^{1-\alpha}.$$
The interpretation is the same as before, with $\alpha$ reflecting the relative importance of capital in production. The marginal product of capital is $\partial f / \partial K$ and the marginal product of labor is $\partial f / \partial L$.
In general, we can change the scale of a firm by multiplying both inputs by a
common factor: f(tK, tL) and compare the new output to tf(K, L). The firm is
said to have constant returns to scale if
tf(K, L) = f(tK, tL),
it has decreasing returns to scale if
tf(K, L) > f(tK, tL),
and increasing returns to scale if
tf(K, L) < f(tK, tL).
Example 6. The Cobb-Douglas function in our example has constant returns to
scale since
$$f(tK, tL) = (tK)^{\alpha} (tL)^{1-\alpha} = t K^{\alpha} L^{1-\alpha} = t f(K, L).$$
Returns to scale have an impact on market structure. With decreasing returns
to scale we expect to find many small firms. With increasing returns to scale, on
the other hand, there will be only a few large firms (or a single one). No clear prediction
can be made in the case of constant returns to scale. Since increasing returns to
scale limit the number of firms in the market, the assumption that firms are price
takers only makes sense with decreasing or constant returns to scale.
15 Imperfect Competition
15.1 Pricing Power
So far, we have considered market environments where a single agent cannot control prices. Instead, each agent was infinitesimally small and firms acted as price takers. This was the case in competitive equilibrium. There are many markets with few firms (oligopoly) or a single firm (monopoly), however. In that case firms can control
prices to some extent. Moreover, when there are a few firms in a market, firms
make interactive decisions. In other words, they take their competitors’ actions
into account. In Section 17, we will use game theory to analyse this type of market
structure. First, we cover monopolies, i.e., markets with a single producer.
15.2 Monopoly
If a firm produces a non-negligible amount of the overall market then the price at
which the good sells will depend on the quantity sold. Examples for firms that
control the overall market include the East India Trading Company, Microsoft
(software in general because of network externalities and increasing returns to
scale), telecommunications and utilities (natural monopolies), Standard Oil, and
De Beers.
For any given price there will be some quantity demanded by consumers, and
this is known as the demand curve x : R+ −→ R+ or simply x(p). We assume that
consumers demand less as the price increases: the demand function is downward
sloping, or $x'(p) < 0$. We can invert this relationship to get the inverse demand
function p(x) which reveals the price that will prevail in the market if the output
is x.
If the firm is a monopolist that takes the demand data p(x) as given then its
goal is to maximize
π(x) = p(x)x − c(x) (22)
by choosing the optimal production level. For the cost function we assume $c'(x) > 0$ and $c''(x) \ge 0$, i.e., we have positive and weakly increasing marginal costs. For example, $c(x) = cx$ satisfies these assumptions (a Cobb-Douglas production function provides this, for example). The monopolist maximizes its profit function
(22) over x, which leads to the following FOC:
$$p(x) + x p'(x) - c'(x) = 0. \qquad (23)$$
Here, in addition to the familiar p(x), which is the marginal return from the
marginal consumer, the monopolist also has to take the term $x p'(x)$ into account, because a change in quantity also affects the inframarginal consumers. For example, when it increases the quantity supplied, the monopolist gets positive revenue from
the marginal consumer, but the inframarginal consumers pay less due to the downward sloping demand function. At the optimum, the monopolist equates marginal
revenue and marginal cost.
Example 7. A simple example used frequently is p(q) = a − bq, and we will
also assume that a > c since otherwise the cost of producing is higher than any
consumer’s valuation so it will never be profitable for the firm to produce and the
market will cease to exist. Then the firm wants to maximize the objective
π(x) = (a − bx − c)x.
The efficient quantity is produced when p(x) = a−bx = c because then a consumer
buys an object if and only if they value it more than the cost of producing, resulting
in the highest possible total surplus.
[Figure 5: Monopoly. Linear demand p(x), marginal revenue MR, and marginal cost MC, with the monopoly outcome (x_M, p_M), the efficient outcome (x*, p*), and surplus areas A, B, and C.]
So the efficient quantity is
$$x^* = \frac{a - c}{b}.$$
The monopolist's maximization problem, however, has FOC
$$a - 2bx - c = 0,$$
where $a - 2bx$ is the marginal revenue and $c$ is the marginal cost. So the quantity set by the monopolist is
$$x^M = \frac{a - c}{2b} < x^*.$$
The price under monopoly can easily be found since
$$p^M = a - b x^M = a - \frac{a - c}{2} = \frac{a + c}{2} > c.$$
Figure 5 illustrates this.
A monopoly has different welfare implications than perfect competition. In
Figure 5, consumers in a monopoly lose the areas A and B compared to perfect
competition. The monopolist loses area C and wins area A. Hence, there are
distributional implications (consumers lose and the producer gains) as well as
efficiency implications (overall welfare decreases).
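For the linear example the monopoly distortion can be quantified directly. The sketch below (with purely illustrative parameter values) computes the monopoly and efficient quantities and prices and the total surplus lost when output falls from x* to x_M, i.e., the areas B and C in Figure 5.

# Linear inverse demand p(x) = a - b*x with constant marginal cost c, a > c.
a, b, c = 10.0, 1.0, 2.0

x_eff = (a - c) / b              # efficient quantity: price equals marginal cost
x_mon = (a - c) / (2 * b)        # monopoly quantity from a - 2bx - c = 0
p_mon = a - b * x_mon            # monopoly price (a + c) / 2

def total_surplus(x):
    # Integral of p(t) - c from 0 to x.
    return (a - c) * x - 0.5 * b * x ** 2

welfare_loss = total_surplus(x_eff) - total_surplus(x_mon)
print("x*:", x_eff, "x_M:", x_mon, "p_M:", p_mon)
print("total surplus lost under monopoly (areas B + C):", welfare_loss)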
We can write the monopolist’s FOC (23) in terms of the demand elasticity
introduced earlier, as follows:
$$p(x^*) + x^* p'(x^*) = c'(x^*) \iff p(x^*)\left[ 1 + \frac{x^* p'(x^*)}{p(x^*)} \right] = c'(x^*) \iff p(x^*) = \frac{c'(x^*)}{1 + \epsilon_p^{-1}}.$$
Since $\epsilon_p < 0$, we have that $p(x^*) > c'(x^*)$; in other words, the monopolist charges more than the marginal cost. This also means that if demand is very elastic, $\epsilon_p \to -\infty$, then $p(x^*) \approx c'(x^*)$. On the other hand, if demand is very inelastic, $\epsilon_p \approx -1$, then $p(x^*) \gg c'(x^*)$.
16 Imperfectly Competitive Market
16.1 Price Discrimination
In the previous section we saw that the monopolist sets an inefficient quantity and
total welfare is decreased. Is there a mechanism, which allows the monopolist to
offer the efficient quantity and reap the entire possible welfare in the market? The
answer is yes if the monopolist can set a two-part tariff, for example. In general,
the monopolist can extract consumer rents by using price discrimination.
First degree price discrimination (perfect price discrimination) means discrimination by the identity of the person or the quantity ordered (non-linear pricing). It
will result in an efficient allocation. Suppose there is a single buyer and a monopoly
seller where the inverse demand is given by p = a − bx. If the monopolist were
to set a single price it would set the monopoly price. As we saw in the previous
section, however, this does not maximize the joint surplus, so the monopolist can
do better. Suppose instead that the monopolist charges a fixed fee F that the
consumer has to pay to be allowed to buy any positive amount at all, and then
sells the good at a price p, and suppose the monopolist sets the price p = c. The
fixed fee will not affect the quantity that a participating consumer will choose, so
if the consumer participates then they will choose quantity equal to $x^*$. The firm
can then set the entry fee to extract all the consumer surplus and the consumer
will still be willing to participate. This maximizes the joint surplus, and gives the
entire surplus to the firm, so the firm is doing as well as it could under any other
mechanism. Specifically, using the functional form from Example 7 the firm sets
$$F = \frac{(a - c)\, x^*}{2} = \frac{(a - c)^2}{2b}.$$
In integral notation this is
$$F = \int_{0}^{x^*} \left( p(x) - c \right) dx.$$
This pricing mechanism is called a two-part tariff, and was famously used at Disneyland (entry fee followed by a fee per ride), greatly increasing revenues.
Now, let’s assume that there are two different classes of consumers, type A
with utility function u(x) and type B with βu(x), β > 1, so that the second class
of consumers has a higher valuation of the good. If the monopolist structures a
two-part tariff (F, p = c) to extract all surplus from type B consumers, type A
consumers would not pay the fixed fee F since they could not recover the utility
lost from using the service. On the other hand, if the firm offers two two-part
tariffs (FA, p = c) and (FB, p = c) with FA < FB, all consumers would pick the
cheaper contract (FA, p = c). A solution to this problem would be to offer the
contracts (FA, pA > c) and (FB, p = c). Type A consumers pick the first contract
and consume less of the good and type B consumers pick the second contract,
which allows them to consume the efficient quantity. This is an example of second
degree price discrimination, which means that the firm varies the price by quantity
or quality only. It offers a menu of choices and lets the consumers self-select into
their preferred contract.
In addition, there is third degree price discrimination, in which the firm varies
the price by market or identity of the consumers. For example, Disneyland can
charge different prices in different parks. Let’s assume there are two markets,
i = 1, 2. The firm is a monopolist in both markets and its profit maximization
problem is
max
x1,x2
x1p1(x1) + x2p2(x2) − c(x1 + x2).
The FOC for each market is
$$p_i(x_i) + x_i p_i'(x_i) = c'(x_1 + x_2),$$
which leads to the optimal solution
$$p_i(x^*_i) = \frac{c'(x^*_1 + x^*_2)}{1 + 1/\epsilon^i_p} \quad \text{for } i = 1, 2.$$
Hence, the solution depends on the demand elasticity in market i. The price will
be different as long as the structure of demand differs.
16.2 Oligopoly
Oligopoly refers to environments where there are few large firms. These firms are
large enough that their quantity influences the price and so impacts their rivals.
Consequently each firm must condition its behavior on the behavior of the other
firms. This strategic interaction is modeled with game theory. The most important
model of oligopoly is the Cournot model or the model of quantity competition. The
general model is described as follows: Let there be I firms denoted by,
i = 1, 2, . . . , I
each producing one homogenous good, where each firm produces qi amount of that
good. Each firm i has cost function
ci(qi)
The total production is given by
$$q = \sum_{i=1}^{I} q_i.$$
We also denote total production by firms other than $i$ by
$$q_{-i} = \sum_{j \ne i} q_j.$$
The profit of firm $i$ is given by
$$\pi_i(q_i, q_{-i}) = p(q_i, q_{-i})\, q_i - c_i(q_i).$$
So firm $i$ solves
$$\max_{q_i} \; \pi_i(q_i, q_{-i}),$$
and hence the F.O.C. is given by
$$p(q_i, q_{-i}) + \frac{\partial p(q_i, q_{-i})}{\partial q_i}\, q_i - c_i'(q_i) = 0.$$
It is important to note that the optimal production of firm $i$ depends on the production of the other firms, i.e. on $q_{-i}$. This is the strategic aspect of this model.
Therefore in order to produce any amount of the good firm i must anticipate
what others might be doing and every firm thinks the same way. So we need an
equilibrium concept here which would tell us that given the production level of
every firm no firm wants to move away from its current production.
16.2.1 Example
We here consider the duopoly case, where there are only two firms. Suppose the
inverse demand function is given by p(q) = a − bq, and the cost of producing is
constant and the same for both firms ci(q) = cq. The quantity produced in the
market is the sum of what both firms produce q = q1 + q2. The profits for each
firm is then a function of the market price and their own quantity,
$$\pi_i(q_i, q_j) = q_i \left( p(q_i + q_j) - c \right).$$
The strategic variable that the firm is choosing is the quantity to produce, $q_i$. Suppose that the firms' objective was to maximize their joint profit
$$\pi_1(q_1, q_2) + \pi_2(q_1, q_2) = (q_1 + q_2)\left( p(q_1 + q_2) - c \right);$$
then we know from before that this is maximized when $q_1 + q_2 = q^M$. We could refer to this as the collusive outcome. One way the two firms could split production would be $q_1 = q_2 = q^M/2$.
If the firms could write binding contracts then they could agree on this outcome.
However, that is typically not possible (such an agreement would be price fixing),
so we would not expect this outcome to occur unless it is stable/self-enforcing. If
either firm could increase its profits by setting another quantity, then they would
have an incentive to deviate from this outcome. We will see below that both firms
would in fact have an incentive to deviate and increase their output.
Suppose now that firm i is trying to choose qi to maximize its own profits,
taking the other firm’s output as given. Then firm i’s optimization problem is
$$\max_{q_i} \; \pi_i(q_i, q_j) = q_i \left( a - b(q_i + q_j) - c \right),$$
which has the associated FOC
$$\frac{\partial \pi_i(q_i, q_j)}{\partial q_i} = a - b(2 q_i + q_j) - c = 0.$$
Then the optimal level $q^*_i$ given any level of $q_j$ is
$$q^*_i(q_j) = \frac{a - b q_j - c}{2b}.$$
This is firm i’s best response to whatever firm j plays. In the special case when
qj = 0 firm i is a monopolist, and the observed quantity qi corresponds to the
monopoly case. In general, when the rival has produced qj we can treat the firm
as a monopolist facing a “residual demand curve” with intercept $a - b q_j$. We can write firm $i$'s best response function as
$$q^*_i(q_j) = \frac{a - c}{2b} - \frac{1}{2} q_j.$$
[Figure 6: Cournot equilibrium. The best response functions q1(q2) and q2(q1) in (q1, q2) space; their intersection is the Nash equilibrium.]
Hence,
$$\frac{d q_i}{d q_j} = -\frac{1}{2}.$$
This has two important implications. First, the quantity player i chooses is
decreasing in its rival’s quantity. This means that quantities are strategic substitutes. Second, if player j increases their quantity player i decreases their quantity
by less than player j increased their quantity (player i decreases his quantity by exactly 1/2 of a unit for every unit increase in player j's quantity). So we would expect that
the output in a duopoly would be higher than in a monopoly.
We can depict the best response functions graphically. Setting $a = b = 1$ and $c = 0$, Figure 6 shows the best response functions. Here, the best response functions are
$$q^*_i(q_j) = \frac{1 - q_j}{2}.$$
We are at a “stable” outcome if both firms are producing a best response to
their rivals’ production. We refer to such an outcome as an equilibrium. That is,
when
$$q_i = \frac{a - b q_j - c}{2b}, \qquad (24)$$
$$q_j = \frac{a - b q_i - c}{2b}. \qquad (25)$$
Since the best responses are symmetric we will have $q_i = q_j$ and so we can calculate the equilibrium quantities from the equation
$$q_i = \frac{a - b q_i - c}{2b},$$
and so
$$q_i = q_j = \frac{a - c}{3b},$$
and hence
$$q = q_i + q_j = \frac{2(a - c)}{3b} > \frac{a - c}{2b} = q^M.$$
There is a higher output (and hence a lower price) in a duopoly than in a monopoly. More generally, both firms are playing a best response to their rival's action because for all $i$
$$\pi_i(q^*_i, q^*_j) \ge \pi_i(q_i, q^*_j) \quad \text{for all } q_i.$$
That is, the profits from the equilibrium quantity are (weakly) higher than the profits from any other output. This motivates the following definition for an equilibrium in a
strategic setting.
Definition 25. A Nash Equilibrium in the duopoly game is a pair $(q^*_i, q^*_j)$ such that for all $i$
$$\pi_i(q^*_i, q^*_j) \ge \pi_i(q_i, q^*_j) \quad \text{for all } q_i.$$
This definition implicitly assumes that agents hold (correct) expectations or
beliefs about the other agents’ strategies.
A Nash Equilibrium is ultimately a stability property. There is no profitable
deviation for any of the players. In order to be at equilibrium we must have that
$$q_i = q^*_i(q_j), \qquad q_j = q^*_j(q_i),$$
and so we must have that
$$q_i = q^*_i(q^*_j(q_i)),$$
so equilibrium corresponds to a fixed point of the mapping $q^*_1(q^*_2(\cdot))$. This idea can
also be illustrated graphically. In Figure 6, firm 1 initially sets $q_1 = \frac{1}{2}$, which is not the equilibrium quantity. Firm 2 then optimally picks $q_2 = q^*_2(\frac{1}{2}) = \frac{1}{4}$ according to its best response function. Firm 1, in turn, chooses a new quantity according to its best response function: $q_1 = q^*_1(\frac{1}{4}) = \frac{3}{8}$. This process goes on and ultimately converges to $q_1 = q_2 = \frac{1}{3}$.
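This adjustment process can be reproduced in a few lines. The sketch below is only an illustration; it iterates the two best response functions (with a = b = 1 and c = 0, as in Figure 6) starting from q1 = 1/2 and shows the quantities converging to the Nash equilibrium 1/3.

# Best response iteration in the Cournot duopoly with a = b = 1 and c = 0.
def best_response(q_other):
    return (1.0 - q_other) / 2.0

q1, q2 = 0.5, 0.0
for step in range(20):
    q2 = best_response(q1)   # firm 2 responds to firm 1's current quantity
    q1 = best_response(q2)   # firm 1 responds in turn
    print(step, round(q1, 6), round(q2, 6))
# Both quantities converge to the fixed point q = (1 - q)/2, i.e. q = 1/3.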
16.3 Oligopoly: General Case
Now, we consider the case with I competitors. The inverse demand function
(setting a = b = 1) is
$$p(q) = 1 - \sum_{i=1}^{I} q_i,$$
and firm $i$'s profit function is
$$\pi(q_i, q_{-i}) = \left( 1 - \sum_{i=1}^{I} q_i - c \right) q_i, \qquad (26)$$
where the vector $q_{-i}$ is defined as $q_{-i} = (q_1, \ldots, q_{i-1}, q_{i+1}, \ldots, q_I)$, i.e., all quantities excluding $q_i$.
Again, we can define an equilibrium in this market as follows:
Definition 26. A Nash Equilibrium in the oligopoly game is a vector $q^* = (q^*_1, \ldots, q^*_I)$ such that for all $i$
$$\pi_i(q^*_i, q^*_{-i}) \ge \pi_i(q_i, q^*_{-i}) \quad \text{for all } q_i.$$
We simply replaced the quantity $q_j$ by the vector $q_{-i}$.
Definition 27. A Nash equilibrium is called symmetric if $q^*_i = q^*_j$ for all $i$ and $j$.
The FOC for maximizing the profit function (26) is
$$1 - \sum_{j \ne i} q_j - 2 q_i - c = 0,$$
and the best response function for all $i$ is
$$q_i = \frac{1 - \sum_{j \ne i} q_j - c}{2}. \qquad (27)$$
Here, only the aggregate supply of firm i’s competitors matters, but not the specific
amount individual firms supply. It would be difficult to solve for $I$ separate values of $q_i$, but due to symmetry of the profit function we get that $q^*_i = q^*_j$ for all $i$ and $j$, so that equation (27) simplifies to
$$q^*_i = \frac{1 - (I - 1)\, q^*_i - c}{2},$$
which leads to the solution
$$q^*_i = \frac{1 - c}{I + 1}.$$
As I increases (more firms), the market becomes more competitive. Market
supply is equal to
$$\sum_{i=1}^{I} q^*_i = I q^*_i = \frac{I}{I + 1} (1 - c).$$
As the number of firms becomes larger, $I \to \infty$, $q^*_i \to 0$ and
$$\sum_{i=1}^{I} q^*_i \to 1 - c,$$
which is the supply in a competitive market. Consequently,
$$p^* \to c.$$
As each player plays a less important strategic role in the market, the oligopoly
outcome converges to the competitive market outcome.
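The convergence to the competitive outcome is easy to see numerically. The sketch below is only an illustration (the value of c is arbitrary); it tabulates the individual quantity, market supply, and price for an increasing number of firms.

# Symmetric Cournot equilibrium with inverse demand p = 1 - Q and common marginal cost c.
c = 0.2

for I in (1, 2, 5, 10, 100, 1000):
    q_i = (1.0 - c) / (I + 1)          # individual equilibrium quantity
    Q = I * q_i                        # market supply
    p = 1.0 - Q                        # market price
    print(I, round(q_i, 4), round(Q, 4), round(p, 4))
# As I grows, Q approaches 1 - c and p approaches c (the competitive outcome).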
Note that we used symmetry in deriving the market outcome from the firms’
best response function. We cannot invoke symmetry when deriving the FOC. One
might think that instead of writing the profit function as (26) one could simplify
it to
$$\pi(q_i, q_{-i}) = (1 - I q_i - c)\, q_i.$$
This is wrong, however, because it implies that firm i controls the entire market
supply (acts as a monopolist). Instead, in an oligopoly market, firm i takes the
other firms’ output as given.
17 Game Theory
17.1 Basics
Game theory is the study of behavior of individuals in a strategic scenario, where
a strategic scenario is defined as one where the actions of one individual affect
the payoff or utility of other individuals. In the previous section we introduced
game theory in the context of firm competition. In this section, we will generalize
the methods used above and introduce some specific language. The specification
of a (static) game consists of three elements:
1. The players, indexed by i = 1, . . . , I. In the duopoly games, for example,
the players were the two firms.
2. The strategies available: each player chooses a strategy $a_i$ from the available strategy set $A_i$. We can write $a_{-i} = (a_1, \ldots, a_{i-1}, a_{i+1}, \ldots, a_I)$ to represent the strategies of the other $I - 1$ players. Then, a strategy profile of all players is defined by $a = (a_1, \ldots, a_I) = (a_i, a_{-i})$. In the Cournot game, the players' strategies were the quantities chosen, hence $A_i = \mathbb{R}_+$.
3. The payoffs for each player as a function of the strategies of the players. We
use game theory to analyze situations where there is strategic interaction so
the payoff function will typically depend on the strategies of other players
as well. We write the payoff function for player $i$ as $u_i(a_i, a_{-i})$. The payoff function is the mapping
$$u_i : A_1 \times \cdots \times A_I \to \mathbb{R}.$$
Therefore we can define a game in the following way:
Definition 28. A game (in normal form) is a triple
$$\Gamma = \left\{ \{1, 2, \ldots, I\},\; \{A_i\}_{i=1}^{I},\; \{u_i(\cdot)\}_{i=1}^{I} \right\}.$$
We now define the concept of best response, i.e. the action for a player i which
is best for him (in the sense of maximizing the payoff function). But since we are
studying a strategic scenario, what is best for player i potentially depends on what
others are playing, or what player i believes others might be playing.
Definition 29. An action $a_i$ is a best response for player $i$ against a profile of actions of the others $a_{-i}$ if
$$u_i(a_i, a_{-i}) \ge u_i(a_i', a_{-i}) \quad \forall\, a_i' \in A_i.$$
We then write
$$a_i \in BR_i(a_{-i}).$$
Now we define the concept of Nash Equilibrium for a general game.
Definition 30. An action profile
$$a^* = (a^*_1, a^*_2, \ldots, a^*_I)$$
is a Nash equilibrium if,
$$\text{for all } i, \quad u_i(a^*_i, a^*_{-i}) \ge u_i(a_i, a^*_{-i}) \quad \forall\, a_i \in A_i,$$
or, stated otherwise,
$$\text{for all } i, \quad a^*_i \in BR_i(a^*_{-i}).$$
We know that
$$BR_i : \times_{j \ne i} A_j \to A_i.$$
Now let's define the function
$$BR : \times_{i=1}^{I} A_i \to \times_{i=1}^{I} A_i$$
as
$$BR = (BR_1, BR_2, \ldots, BR_I).$$
Then we can redefine Nash equilibrium as follows:
Definition 31. An action profile
$$a^* = (a^*_1, a^*_2, \ldots, a^*_I)$$
is a Nash equilibrium if
$$a^* \in BR(a^*).$$
17.2 Pure Strategies
We can represent games (at least those with a finite choice set) in normal form.
A normal form game consists of the matrix of payoffs for each player from each
possible strategy. If there are two players, 1 and 2, then the normal form game
consists of a matrix where the (i, j)th entry consists of the tuple (player 1’s payoff,
player 2's payoff) when player 1 plays their ith strategy and player 2 plays their
jth strategy. We will now consider the most famous examples of games.
Example 8. (Prisoner's Dilemma) Suppose two suspects, Bob and Rob, are arrested for a crime and questioned separately. The police can prove they committed
a minor crime, and suspect they have committed a more serious crime but can’t
prove it. The police offer each suspect that they will let them off for the minor
crime if they confess and testify against their partner for the more serious crime. Of
course, if the other criminal also confesses the police won’t need his testimony but
will give him a slightly reduced sentence for cooperating. Each player then has two
possible strategies: Stay Quiet (Q) or Confess (C) and they decide simultaneously.
We can represent the game with the following payoff matrix:
Rob
Q C
Bob Q 3, 3 −1, 4
C 4, −1 0, 0
Each entry represents (Bob, Rob)’s payoff from each of the two strategies. For
example, if Rob stays quiet while Bob confesses Bob’s payoff is 4 and Rob’s is
−1. Notice that both players have what is known as a dominant strategy; they
should confess regardless of what the other player has done. If we consider Bob,
if Rob is Quiet then confessing gives payoff 4 > 3, the payoff from staying quiet.
If Rob confesses, then Bob should confess since 0 > −1. The analysis is the same
for Rob. So the only stable outcome is for both players to confess. So the only
Nash Equilibrium is (Confess, Confess). Notice that, from the perspective of the
prisoners this is a bad outcome. In fact it is Pareto dominated by both players
staying quiet, which is not a Nash equilibrium.
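Finding the pure strategy Nash equilibria of a small normal form game amounts to checking, cell by cell, whether each player is best responding. The sketch below is only an illustration (the list representation and the helper function are mine, not from the notes); applied to the Prisoner's Dilemma payoff matrix above, it returns (C, C) as the unique equilibrium.

# Pure strategy Nash equilibria of a 2-player normal form game by brute force.
strategies = ["Q", "C"]
# payoffs[i][j] = (Bob's payoff, Rob's payoff) when Bob plays strategies[i] and Rob plays strategies[j]
payoffs = [[(3, 3), (-1, 4)],
           [(4, -1), (0, 0)]]

def is_nash(i, j):
    bob_best = all(payoffs[i][j][0] >= payoffs[k][j][0] for k in range(len(strategies)))
    rob_best = all(payoffs[i][j][1] >= payoffs[i][k][1] for k in range(len(strategies)))
    return bob_best and rob_best

equilibria = [(strategies[i], strategies[j])
              for i in range(len(strategies))
              for j in range(len(strategies)) if is_nash(i, j)]
print(equilibria)   # [('C', 'C')]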
The above example has a dominant strategy equilibrium, where both players
have a unique dominant strategy.
Definition 32. A strategy $a_i$ is dominant if
$$u_i(a_i, a_{-i}) > u_i(a_i', a_{-i}) \quad \text{for all } a_i' \in A_i,\; a_{-i} \in A_{-i}.$$
If each player has a dominant strategy, then the only rational thing for them
to do is to play that strategy no matter what the other players do. Hence, if a
dominant strategy equilibrium exists it is a relatively uncontroversial prediction of
what will happen in the game. In most strategic situations, however, dominant strategies do not exist. Consequently, the most commonly used solution
concept is Nash Equilibrium, which does not require dominant strategies.
Note the difference between Definitions 30 and 32: a Nash Equilibrium only requires a best response to the other players' equilibrium strategies, $a^*_{-i}$, whereas a dominant strategy must be a best response to every $a_{-i} \in A_{-i}$. A strategy profile is a Nash Equilibrium if
each player is playing a best response to the other players’ strategies. So a Nash
Equilibrium is a stable outcome where no player could profitably deviate. Clearly
when dominant strategies exist it is a Nash Equilibrium for all players to play
a dominant strategy. However, as we see from the Prisoner’s Dilemma example
the outcome is not necessarily efficient. The next example shows that the Nash
Equilibrium may not be unique.
Example 9. (Coordination Game) We could represent a coordination game where
Bob and Ann are two researchers, both of whose input is necessary for a project.
They decide simultaneously whether to do research (R) or not (N).
Bob
R N
Ann R 3, 3 −1, 0
N 0, −1 1, 1
Here (R,R) and (N,N) are both equilibria. Notice that the equilibria in this
game are Pareto ranked with both players preferring to coordinate on doing research. Both players not doing research is also an equilibrium, since if both players
think the other will play N they will play N as well.
A famous example of a coordination game is from traffic control. It doesn’t
really matter if everyone drives on the left or right, as long as everyone drives on
the same side.
Example 10. Another example of a game is a “beauty contest.” Everyone in the class picks a number in the interval [1, 100]. The goal is to guess as close as possible to 2/3 of the class average. An equilibrium of this game is for everyone to guess 1. This is in fact the only equilibrium. Since no one can guess more than 100, 2/3 of the mean cannot be higher than 66 2/3, so all guesses above this are dominated. But since no one will guess more than 66 2/3, the mean cannot be higher than (2/3)(66 2/3) = 44 4/9, so no one should guess higher than 44 4/9. Repeating this $n$ times, no one should guess higher than $(2/3)^n \cdot 100$, and taking $n \to \infty$ all players should guess 1. Of course, this isn't necessarily what will happen in practice if people solve the game incorrectly or expect others to do so. Running this experiment
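The iterated dominance argument can be tabulated directly: each round multiplies the largest undominated guess by 2/3. The short sketch below is only an illustration of how quickly the bound collapses.

# Upper bound on undominated guesses after n rounds of iterated elimination.
bound = 100.0
for n in range(1, 16):
    bound *= 2.0 / 3.0
    print(n, round(bound, 3))
# The bound shrinks toward 0; since guesses must be at least 1, everyone guessing 1
# is the equilibrium prediction.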
17.3 Mixed Strategy
So far we have considered only pure strategies: strategies where the players do
not randomize over which action they take. In other words, a pure strategy is
a deterministic choice. The following simple example demonstrates that a pure
strategy Nash Equilibrium may not always exist.
Example 11. (Matching Pennies) Consider the following payoff matrix:
Bob
H T
Ann H 1, −1 −1, 1
T −1, 1 1, −1
Here Ann wins if both players play the same strategy, and Bob wins if they
play different ones. Clearly there cannot be a pure strategy equilibrium, since Bob would have an incentive to deviate whenever they play the same strategy and Ann would have an incentive to deviate whenever they play different ones. Intuitively, the only equilibrium is for each player to randomize between H and T with probability 1/2 each.
While the idea of a matching pennies game may seem contrived, it is merely
the simplest example of a general class of zero-sum games, where the total payoff
of the players is constant regardless of the outcome. Consequently gains for one
player can only come from losses of the other. For this reason, zero-sum games
will rarely have a pure strategy Nash equilibrium. Examples would be chess, or
more relevantly, competition between two candidates or political parties. Cold
War power politics between the US and USSR was famously (although probably
not accurately) modelled as a zero-sum game. Most economic situations are not
zero-sum since resources can be used inefficiently.
Example 12. A slight variation is the game of Rock-Paper-Scissors.
Bob
R P S
R 0, 0 −1, 1 1, −1
Ann P 1, −1 0, 0 −1, 1
S −1, 1 1, −1 0, 0
Definition 33. A mixed strategy of player $i$ is a probability distribution $\sigma_i = \left( \sigma_i(s^1_i), \ldots, \sigma_i(s^K_i) \right)$ such that
$$\sigma_i(s^k_i) \ge 0, \qquad \sum_{k=1}^{K} \sigma_i(s^k_i) = 1.$$
Here we refer to $s_i$ as an action and to $\sigma_i$ as a strategy, which in this case is a probability distribution over actions. The action space is $S_i = \{ s^1_i, \ldots, s^K_i \}$.
The expected utility from playing action $s_i$ when the other player plays strategy $\sigma_j$ is
$$u_i(s_i, \sigma_j) = \sum_{k=1}^{K} \sigma_j(s^k_j)\, u_i(s_i, s^k_j).$$
Example 13. Consider a coordination game (also known as “battle of the sexes”) similar to the one in Example 9 but with different payoffs:
Bob
σB 1 − σB
O C
Ann σA O 1, 2 0, 0
1 − σA C 0, 0 2, 1
Hence Bob prefers to go to the opera (O) and Ann prefers to go to a cricket match (C), but both players would rather go to an event together than alone. There
are two pure strategy Nash Equilibria: (O, O) and (C, C). We cannot make a
prediction as to which equilibrium the players will pick. Moreover, it could be the case
that there is a third Nash Equilibrium, in which the players randomize.
Suppose that Ann plays O with probability σA and C with probability 1 − σA.
Then Bob’s expected payoff from playing O is
$$2\sigma_A + 0 \cdot (1 - \sigma_A), \qquad (28)$$
and his expected payoff from playing C is
$$0 \cdot \sigma_A + 1 \cdot (1 - \sigma_A). \qquad (29)$$
Bob is only willing to randomize between his two pure strategies if he gets the same expected payoff from both. Otherwise he would play the pure strategy that yields the higher expected payoff for sure. Equating (28) and (29) we get that
$$\sigma^*_A = \frac{1}{3}.$$
In other words, Ann has to play O with probability 1/3 to induce Bob to play a mixed strategy as well. We can calculate Bob's mixed strategy similarly to get
$$\sigma^*_B = \frac{2}{3}.$$
[Figure 7: Three Nash Equilibria in the battle of the sexes game. Ann's and Bob's best response functions in (σ_A, σ_B) space intersect at σ_A = 1/3, σ_B = 2/3 and at the two pure strategy equilibria.]
Graphically, we can depict Ann’s and Bob’s best response function in Figure 7.
The three Nash Equilibria of this game are the three intersections of the best
response functions.
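The mixing probabilities can be recovered mechanically from the two indifference conditions. The sketch below is only an illustration; it solves the 2×2 battle of the sexes game above for the mixed strategy equilibrium and verifies that each player is indifferent between their two actions.

# Mixed strategy equilibrium of the battle of the sexes game.
# Rows are Ann's actions (O, C), columns are Bob's actions (O, C).
ann = [[1, 0],
       [0, 2]]   # Ann's payoffs
bob = [[2, 0],
       [0, 1]]   # Bob's payoffs

# Bob is indifferent between his columns when Ann plays O with probability sigma_A.
sigma_A = (bob[1][1] - bob[1][0]) / (bob[0][0] - bob[1][0] - bob[0][1] + bob[1][1])
# Ann is indifferent between her rows when Bob plays O with probability sigma_B.
sigma_B = (ann[1][1] - ann[0][1]) / (ann[0][0] - ann[0][1] - ann[1][0] + ann[1][1])
print(sigma_A, sigma_B)   # 1/3 and 2/3

# Check the indifference explicitly.
bob_O = sigma_A * bob[0][0] + (1 - sigma_A) * bob[1][0]
bob_C = sigma_A * bob[0][1] + (1 - sigma_A) * bob[1][1]
ann_O = sigma_B * ann[0][0] + (1 - sigma_B) * ann[0][1]
ann_C = sigma_B * ann[1][0] + (1 - sigma_B) * ann[1][1]
assert abs(bob_O - bob_C) < 1e-12 and abs(ann_O - ann_C) < 1e-12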
18 Asymmetric Information: Adverse Selection
and Moral Hazard
Asymmetric information simply refers to situations where some of the players
have relevant information that other players do not. We consider two types of
asymmetric information: adverse selection, also known as hidden information,
and moral hazard or hidden action.
A leading example for adverse selection occurs in life or health insurance. If
an insurance company offers actuarially fair insurance it attracts insurees with
above average risk whereas those with below average risk decline the insurance.
(This assumes that individuals have private information about their risk.) In
other words, individuals select themselves into insurance based on their private
information. Since only the higher risks are in the risk pool the insurance company
will make a loss. In consequence of this adverse selection the insurance market
breaks down. Solutions to this problem include denying or mandating insurance
and offering a menu of contracts to let insurees self-select thereby revealing their
risk type.
Moral hazard is also present in insurance markets when insurees’ actions depend
on having insurance. For example, they might exercise less care when being covered
by fire or automobile insurance. This undermines the goal of such insurance,
which is to provide risk sharing in the case of property loss. With moral hazard,
property loss becomes more likely because insurees do not install smoke detectors,
for example. Possible solutions to this problem are copayments and punishment
for negligence.
18.1 Adverse Selection
The following model goes back to George Akerlof’s 1970 paper on “The market
for lemons.” The used car market is a good example for adverse selection because
there is variation in product quality and this variation is observed by sellers, but
not by buyers.
Suppose there is a potential buyer and a potential seller for a car. Suppose that
the quality of the car is denoted by θ ∈ [0, 1]. Buyers and sellers have different
valuations/willingness to pay vb and vs, so that the value of the car is vbθ to the
buyer and vsθ to the seller. Assume that vb > vs so that the buyer always values
the car more highly then the seller. So we know that trade is always efficient.
Suppose that both the buyer and seller know θ, then we have seen in the bilateral
trading section that trade can occur at any price p ∈ [vsθ, vbθ] and at that price
the efficient allocation (buyer gets the car) is realized (the buyer has a net payoff
of vbθ − p and the seller gets p − vsθ, and the total surplus is vbθ − vsθ).
The assumption that the buyer knows the quality of the car may be reasonable
in some situations (new car), but in many situations the seller will be much better
informed about the car’s quality. The buyer of a used car can observe the age,
mileage, etc. of a car and so have a rough idea as to quality, but the seller has
presumably been driving the car and will know more about it. In such a situation
we could consider the quality θ as a random variable, where the buyer knows
only the distribution but the seller knows the realization. We could consider a
situation where the buyer knows the car is of a high quality with some probability,
and low quality otherwise, whereas the seller knows whether the car is high quality.
Obviously the car could have a more complicated range of potential qualities. If
the seller values a high quality car more, then their decision to participate in the
market potentially reveals negative information about the quality, hence the term
adverse selection. This is because if the car had higher quality the seller would
be less willing to sell it at any given price. How does this type of asymmetric
information change the outcome?
Suppose instead that the buyer only knows that θ ∼ U[0, 1]. That is that the
quality is uniformly distributed between 0 and 1. Then the seller is willing to trade if
p − vsθ ≥ 0 (30)
and the buyer, who does not know θ, but forms its expected value, is willing to
trade if
E[θ]vb − p ≥ 0. (31)
However, the buyer can infer the car’s quality from the price the seller is asking.
Using condition (30), the buyer knows that
$$\theta \le \frac{p}{v_s},$$
so that condition (31) becomes
$$E\left[ \theta \,\middle|\, \theta \le \frac{p}{v_s} \right] v_b - p = \frac{p}{2 v_s}\, v_b - p \ge 0, \qquad (32)$$
where we use the conditional expectation of a uniform distribution: $E[\theta \mid \theta \le a] = a/2$.
Hence, simplifying condition (32), the buyer is only willing to trade if
vb ≥ 2vs.
In other words, the buyer’s valuation has to exceed twice the seller’s valuation for
a trade to take place. If
2vs > vb > vs
trade is efficient, but does not take place if there is asymmetric information.
In order to reduce the amount of private information the seller can offer a warranty or have a third party certify the car's quality.
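The breakdown condition can also be illustrated numerically: for a candidate price p, only qualities up to p/v_s are offered, so the buyer's expected value of a traded car is v_b·p/(2v_s). The sketch below is only an illustration (the scan over candidate prices and the parameter values are mine); it contrasts a case with v_b > 2v_s, where trade is possible, with a case where 2v_s > v_b > v_s and no price supports trade.

# Akerlof lemons market with quality uniform on [0, 1]: a seller accepts p if p >= vs * theta,
# so conditional on sale the buyer's expected quality is min(p/vs, 1) / 2.
def trade_possible(vb, vs, n_prices=1000):
    for k in range(1, n_prices + 1):
        p = vs * k / n_prices               # candidate prices up to vs (enough to attract every seller)
        expected_quality = min(p / vs, 1.0) / 2.0
        if vb * expected_quality - p >= 0:  # buyer's expected surplus is non-negative
            return True
    return False

print(trade_possible(vb=3.0, vs=1.0))   # True:  vb >= 2*vs, trade can occur
print(trade_possible(vb=1.5, vs=1.0))   # False: 2*vs > vb > vs, the market breaks down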
If we instead assumed that neither the buyer or the seller know the realization
of θ then the high quality cars would not be taken out of the market (sellers
cannot condition their actions on information they do not have) and so we could
have trade. This indicates that it is not the incompleteness of information that
causes the problems, but the asymmetry.
18.2 Moral Hazard
Moral hazard is similar to adverse selection except that instead of hidden information, it deals with hidden action. The distinction between the
two concepts can be seen in an insurance example. Those who have pre-existing
conditions that make them more risky (that are unknown to the insurer) are more
likely, all else being equal, to buy insurance. This is adverse selection. An individual who has purchased insurance may become less cautious since the costs of
any damage are covered by insurance company. This is moral hazard. There is a
large literature in economics on how to structure incentives to mitigate moral hazard. In the insurance example these incentives often take the form of deductibles
and partial insurance, or the threat of higher premiums in response to accidents.
Similarly an employer may structure a contract to include a bonus/commission
rather then a fixed wage to induce an employee to work hard. Below we consider
an example of moral hazard, and show that a high price may signal an ability to
commit to providing a high quality product.
Suppose a cook can choose between producing a high quality meal (q = 1) and
a low quality meal (q = 0). Assume that the cost of producing a high quality meal
is strictly higher than a low quality meal (c1 > c0 > 0). For a meal of quality q,
and price p the benefit to the customer is q − p and to the cook is p − ci
. So the
total social welfare is
q − p + p − ci = q − ci
and assume that 1−c1 > 0 > −c0 so that the high quality meal is socially efficient.
We assume that the price is set beforehand, and the cook’s choice variable is the
quality of the meal. Assume that fraction α of the consumers are repeat clients
who are informed about the meal’s quality, whereas 1 − α of the consumers are
uninformed (visitors to the city perhaps) and don't know the meal's quality. The
informed customers will only go to the restaurant if the meal is good (assume
p ∈ (0, 1)). These informed customers allow us to consider a notion of reputation
even though the model is static.
Now consider the decision of the cook as to what quality of meal to produce.
If they produce a high quality meal then they sell to the entire market so their
profits (per customer) are
p − c1
Conversely, by producing the low quality meal, and selling to only 1 − α of the
market they earn profit
(1 − α)(p − c0)
and so the cook will provide the high quality meal if
p − c1 ≥ (1 − α)(p − c0)
or
αp ≥ c1 − (1 − α)c0
where the LHS is the additional revenue from producing a high quality instead of
a low quality meal and the RHS is the associated cost. This corresponds to the
case
$$\alpha \ge \frac{c_1 - c_0}{p - c_0}.$$
So the cook will provide the high quality meal if the fraction of the informed
consumers is high enough. So informed consumers provide a positive externality
on the uninformed, since the informed consumers will monitor the quality of the
meal, inducing the chef to make a good meal.
Finally notice that price signals quality here: the higher the price the smaller
the fraction of informed consumers necessary to ensure the high quality meal. If the
price is low (p ≈ c1) then the cook knows he will lose p − c1 from each informed
consumer by producing a low quality meal instead, but gains c1 − c0 from each
uninformed consumer (since the cost is lower). So only if almost every consumer is
informed will the cook have an incentive to produce the good meal. As p increases
so does p − c1, so the more is lost for each meal not sold to an informed consumer,
and hence the lower the fraction of informed consumers necessary to ensure that
the good meal will be provided. An uninformed consumer, who also may not know
α, could then consider a high price a signal of high quality since it is more likely
that the fraction of informed consumers is high enough to support the good meal
the higher the price.
18.3 Second Degree Price Discrimination
In Section 16.1 we considered first and third degree price discrimination, where the seller can identify the type of potential buyers. In contrast, second degree price discrimination occurs when the firm cannot observe the consumers' willingness to pay directly. Consequently it elicits these preferences by offering different quantities or qualities at different prices. The consumer's type is revealed through
which option they choose. This is known as screening.
Suppose there are two types of consumers. One with high valuation of the good
$\theta_h$, and one with low valuation $\theta_l$. $\theta$ is also called the buyer's marginal willingness to pay: it tells us how much a buyer would be willing to pay for an additional unit of the good. Each buyer's type is his private information. That means the seller does not know ex ante what type of buyer he is facing. Let $\alpha$ denote the fraction of consumers who have the high valuation. Suppose that the firm can produce a product of quality $q$ at cost $c(q)$ and assume that $c'(q) > 0$ and $c''(q) > 0$.
First, we consider the efficient or first best solution, i.e., the case where the firm
can observe the buyers’ types. If the firm knew the type of each consumer they
could offer a different quality to each consumer. The condition for a consumer of
type i = h, l buying an object of quality q for price p voluntarily is
θiq − p(q) ≥ 0
and for the firm to participate in the trade we need
p(q) − c(q) ≥ 0.
Hence maximizing joint payoff is equivalent to
$$\max_{q} \; \theta_i q - p(q) + p(q) - c(q)$$
or
$$\max_{q} \; \theta_i q - c(q).$$
The FOC for each quality level is
$$\theta_i - c'(q) = 0,$$
from which we can calculate the optimal level of quality for each type, $q^*(\theta_i)$. Since marginal cost is increasing by assumption we get that
$$q^*(\theta_l) < q^*(\theta_h),$$
i.e., the firm offers a higher quality to buyers who have a higher willingness to pay in the first best case. In the case of complete information we are back to first degree price discrimination and the firm sets the following prices to extract the entire gross utility from both types of buyers:
$$p^*_h = \theta_h q^*(\theta_h) \quad \text{and} \quad p^*_l = \theta_l q^*(\theta_l),$$
so that buyers' net utility is zero. In Figure 8, the buyers' gross utility, which is equal to the price charged, is indicated by the rectangles $\theta_i q^*_i$.
In many situations, the firm will not be able to observe the valuation/willingness
to pay of the consumers. That is, the buyers' type is their private information.
[Figure 8: Price discrimination when types are known to the firm. First best qualities q*_l and q*_h for the valuations θ_l and θ_h; the rectangles indicate each type's gross utility and the high type's information rent from the low type's contract.]
In
such a situation the firm offers a schedule of price-quality pairs and lets the consumers self-select into contracts. Thereby, the consumers reveal their type. Since
there are two types of consumers the firm will offer two different quality levels, one
for the high valuation consumers and one for the low valuation consumers. Hence
there will be a choice of two contracts $(p_h, q_h)$ and $(p_l, q_l)$ (also called a menu of
choices). The firm wants high valuation consumers to buy the first contract and
low valuation consumers to buy the second contract. Does buyers’ private information matter, i.e., do buyers just buy the first best contract intended for them?
High type buyers get zero net utility from buying the high quality contract, but
positive net utility of $\theta_h q^*(\theta_l) - p_l > 0$ from buying the low quality contract. Hence, high type consumers have an incentive to pose as the
low type. This is indicated in Figure 8 as “information rent,” i.e., an increase in
high type buyers’ net utility due to asymmetric information.
The firm, not knowing the consumers’ type, however, can make the low quality
bundle less attractive to high type buyers by decreasing ql or make the high quality contract more attractive by increasing qh or decreasing ph. The firm’s profit
maximization problem now becomes
$$\max_{p_h, p_l, q_h, q_l} \; \alpha \left( p_h - c(q_h) \right) + (1 - \alpha) \left( p_l - c(q_l) \right). \qquad (33)$$
There are two types of constraints. The consumers have the option of walking away,
so the firm cannot demand payment higher than the value of the object. That is,
we must have
θhqh − ph ≥ 0 (34)
θlql − pl ≥ 0. (35)
These are known as the individual rationality (IR) or participation constraints that
guarantee that the consumers are willing to participate in the trade. The other
type of constraints are the self-selection or incentive compatibility (IC) constraints
θhqh − ph ≥ θhql − pl (36)
θlql − pl ≥ θlqh − ph, (37)
which state that each consumer type prefers the menu choice intended for him to
the other contract. Not all of these four constraints can be binding, because that
would fully determine the prices and quality levels. The IC constraint for the low type (37) will not be binding because low types have no incentive to pretend to be
high types: they would pay a high price for quality they do not value highly. On
the other hand high type consumers’ IR (34) will not be binding either because
we argued above that the firm has to incentivize them to pick the high quality
contract. This leaves constraints (35) and (36) as binding and we can solve for the
optimal prices
p_l = θ_l q_l
using constraint (35) and
p_h = θ_h(q_h − q_l) + θ_l q_l
using constraints (35) and (36). Substituting the prices into the profit function
(33) yields
max_{q_h, q_l}  α [θ_h(q_h − q_l) + θ_l q_l − c(q_h)] + (1 − α) (θ_l q_l − c(q_l)).
The FOC for q_h is simply

α (θ_h − c'(q_h)) = 0,

which is identical to the FOC in the first best case. Hence, the firm offers the high type buyers their first best quality level q*_R(θ_h) = q*(θ_h). The FOC for q_l is

α (θ_l − θ_h) + (1 − α) (θ_l − c'(q_l)) = 0,

which can be rewritten as

θ_l − c'(q_l) − [α/(1 − α)] (θ_h − θ_l) = 0.
The last term on the LHS, [α/(1 − α)] (θ_h − θ_l), which is positive and enters with a minus sign, is an additional cost that arises because the firm has to make the low quality contract less attractive for high type buyers. Because of this additional cost we get that q*_R(θ_l) < q*(θ_l): the quality level for low types is lower than in the first best situation. This is depicted in Figure . Lowering q_l decreases the low type consumers' gross utility as well as the high type buyers' information rent. The optimal level of quality offered to low type buyers is decreasing in the fraction α of high type consumers:

dq*_R(θ_l)/dα < 0,

since the more high types there are, the more the firm has to make the low quality contract unattractive to them.
This analysis indicates some important results about second degree price discrimination:
1. The low type receives no surplus.
2. The high type receives a positive surplus of q_l(θ_h − θ_l). This is known as an information rent, which the consumer can extract because the seller does not know his type.
3. The firm should set the efficient quality for the high valuation type.
4. The firm will degrade the quality for the low type in order to lower the rents
the high type consumers can extract.
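These four results can be checked numerically. The sketch below uses the same hypothetical primitives as before (c(q) = q^2/2, θ_l = 1, θ_h = 2, which are illustrative assumptions only) and solves the second best menu for two values of α.

    # Second best (screening) menu under the hypothetical primitives above.
    theta_l, theta_h = 1.0, 2.0

    def second_best(alpha):
        q_h = theta_h                                   # efficient quality at the top
        # Distorted low quality from theta_l - c'(q_l) - alpha/(1-alpha)*(theta_h - theta_l) = 0,
        # truncated at zero (for large alpha the firm would exclude the low type).
        q_l = max(theta_l - alpha / (1 - alpha) * (theta_h - theta_l), 0.0)
        p_l = theta_l * q_l                             # low type's IR binds
        p_h = theta_h * (q_h - q_l) + theta_l * q_l     # high type's IC binds
        rent = q_l * (theta_h - theta_l)                # high type's information rent
        profit = alpha * (p_h - q_h**2 / 2) + (1 - alpha) * (p_l - q_l**2 / 2)
        return q_l, q_h, p_l, p_h, rent, profit

    for alpha in (0.25, 0.40):
        q_l, q_h, p_l, p_h, rent, profit = second_best(alpha)
        print(f"alpha={alpha}: q_l={q_l:.3f} q_h={q_h:.3f} "
              f"p_l={p_l:.3f} p_h={p_h:.3f} rent={rent:.3f} profit={profit:.3f}")

In this example the high type keeps the first best quality q_h = 2 throughout, the low type's quality falls from 2/3 to 1/3 as α rises from 0.25 to 0.40 (consistent with dq*_R(θ_l)/dα < 0), and the high type's information rent shrinks accordingly.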
19 Auctions
Auctions are an important application of games of incomplete information. There
are many markets where goods are allocated by auctions. Besides obvious examples, such as auctions of antique furniture, there are many recent applications. A leading example is Google's sponsored search auctions: Google matches advertisers to readers of websites and auctions advertising space according to complicated
rules.
Consider a standard auction with I bidders, where each bidder i from 1 to I has a valuation v_i for a single object which is sold by the seller or auctioneer. If the bidder wins the object at price p_i then he receives utility v_i − p_i. Losing bidders receive a payoff of zero. The valuation is often the bidder's private information, so we have to analyze the uncertainty inherent in such auctions. This uncertainty is captured by modelling the bidders' valuations as draws from a probability distribution:

v_i ∼ F(v_i).
We assume that bidders are symmetric, i.e., their valuations come from the same
distribution, and we let bi denote the bid of player i.
There are many possible rules for auctions. They can be either sealed bid or
open bid. Examples of sealed bid auctions are the first price auction (where the
winner is the bidder with the highest bid and they pay their bid), and the second
price auction (where the bidder with the highest bid wins the object and pays
the second highest bid as a price). Open bid auctions include English auctions
(the auctioneer sets a low price and keeps increasing the price until all but one
player has dropped out) and the Dutch auction (a high price is set and the price is
gradually lowered until someone accepts the offered price). Another type of auction
is the Japanese button auction, which resembles an open bid ascending auction,
but every time the price is raised all bidders have to signal their willingness to stay in at the new price. Sometimes, bidders hold down a button for as long as they are willing to keep bidding and release it when they want to exit the auction.
Let’s think about the optimal bidding strategy in a Japanese button auction,
denoted by b_i(v_i) = t_i, where t_i = p_i is the price the winning bidder pays for the good. At any time, the distribution of valuations F and the number of remaining bidders are known to all players. As long as the price has not reached a bidder's
valuation it is optimal for him to keep the button pressed because he gets a positive
payoff if all other players exit before the price reaches his valuation. In particular,
the bidder with the highest valuation will wait longest and therefore receive the
good. He will only have to pay the second highest bidder’s valuation, however,
because he should release the button as soon as he is the only one left. At that
time the price will have exactly reached the second highest valuation. Hence, it
is optimal for all bidders to bid their true valuation. If the price exceeds vi they
release the button and get 0 and the highest valuation bidder gets a positive payoff.
In other words, the optimal strategy is
b*_i(v_i) = v_i.
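The logic of the button auction can be simulated directly. The sketch below is a rough illustration (valuations drawn uniformly at random is an assumption made only for the example): a price clock rises in small steps, every bidder stays in until the price reaches his valuation, and the auction stops when at most one bidder remains. The winner is the highest-valuation bidder and the price is, up to the step size, the second highest valuation.

    import random

    # Japanese/button auction with truthful exit: stay in until the price reaches
    # your valuation. Valuations ~ U[0,1] is an assumption for illustration.
    def button_auction(valuations, step=0.001):
        price = 0.0
        active = list(range(len(valuations)))
        while len(active) > 1:
            price += step                                           # clock ticks up
            active = [i for i in active if valuations[i] >= price]  # exit at valuation
        winner = active[0] if active else max(range(len(valuations)),
                                              key=lambda i: valuations[i])
        return winner, price

    vals = [round(random.random(), 3) for _ in range(4)]
    winner, price = button_auction(vals)
    print("valuations:", vals, "winner:", winner, "price paid:", round(price, 3))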
What if the button auction is played as a descending auction instead? Then it
is no longer optimal to bid one's own valuation. Instead, b*_i(v_i) < v_i: accepting as soon as the price reaches one's own valuation would yield a payoff of zero even when winning, forgoing the chance of a strictly positive payoff from letting the price fall further before accepting.
In many situations (specifically, when the other players' valuations do not affect your own valuation) the optimal behavior in a second price auction is equivalent to that in an English auction, and the optimal behavior in a first price auction is equivalent to that in a Dutch auction. Since the English auction is commonly used in practice, this provides a motivation for considering the second price auction, which is strategically very simple. The English auction is the mechanism used in auction houses, and is a good first approximation of how auctions are run on eBay.
How should people bid in a second price auction? Typically a given bidder will
not know the bids/valuations of the other bidders. A nice feature of the second
price auction is that the optimal strategy is very simple and does not depend on
this information: each bidder should bid their true valuation.
Proposition 3. In a second price auction it is a Nash Equilibrium for all players
to bid their valuations. That is, b*_i = v_i for all i is a Nash Equilibrium.
Proof. Without loss of generality, we can assume that player 1 has the highest
valuation. That is, we can assume v1 = maxi{vi}. Similarly, we can assume
without loss of generality that the second highest valuation is v2 = maxi>1{vi}.
Define
µ_i(v_i, b_i, b_{−i}) = { v_i − p_i, if b_i = max_j{b_j};  0, otherwise }

to be the surplus generated from the auction for each player i. Then under the given strategies (b = v)

µ_i(v_i, v_i, v_{−i}) = { v_1 − v_2, if i = 1;  0, otherwise }.
So we want to show that no bidder has an incentive to deviate.
First we consider player 1. The payoff from bidding b1 is
µ_1(v_1, b_1, v_{−1}) = { v_1 − v_2, if b_1 > v_2;  0, otherwise }  ≤  v_1 − v_2 = µ_1(v_1, v_1, v_{−1}),
so player 1 cannot benefit from deviating.
Now consider any other player i > 1. They win the object only if they bid more
than v_1 and would then pay v_1. So the payoff from bidding b_i is

µ_i(v_i, b_i, v_{−i}) = { v_i − v_1, if b_i > v_1;  0, otherwise }  ≤  0 = µ_i(v_i, v_i, v_{−i}),

since v_i − v_1 ≤ 0. So player i has no incentive to deviate either.
We have thus verified that all players are choosing a best response, and so the
strategies are a Nash Equilibrium.
Note that this allocation is efficient. The bidder with the highest valuation
gets the good.
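The proposition can also be illustrated by simulation. In the sketch below (valuations drawn i.i.d. uniform on [0, 1] is an assumption made only for the example), the opponents bid truthfully and we compare bidder 0's average payoff from truthful bidding with two deviations; as the proof shows, the deviations can never do better.

    import random

    # Second price auction: average payoff of bidder 0 under different bid rules,
    # holding the other bidders at truthful bidding. Valuations ~ U[0,1] (assumed).
    def avg_payoff(bid_rule, n_bidders=3, trials=20000, seed=0):
        rng = random.Random(seed)      # same draws for every rule, so results are comparable
        total = 0.0
        for _ in range(trials):
            vals = [rng.random() for _ in range(n_bidders)]
            own_bid = bid_rule(vals[0])
            others = vals[1:]                      # truthful opponents
            if own_bid > max(others):              # bidder 0 wins ...
                total += vals[0] - max(others)     # ... and pays the second highest bid
        return total / trials

    print("truthful bid  :", round(avg_payoff(lambda v: v), 4))
    print("shade by 20%  :", round(avg_payoff(lambda v: 0.8 * v), 4))
    print("overbid by 20%:", round(avg_payoff(lambda v: 1.2 * v), 4))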
Finally, we consider a first price sealed bid auction. There, we will see that it
is optimal for bidders to bid below their valuation, b*_i(v_i) < v_i, a strategy called bid shading. Bidder i's expected payoff is

max_{b_i}  (v_i − b_i) Pr(b_i > b_j for all j ≠ i) + 0 · Pr(b_i < max_{j≠i}{b_j}).        (38)
Consider the bidding strategy

b_i(v_i) = c v_i,

i.e., bidders bid a fraction c of their true valuation, and suppose for simplicity that there are two bidders, so bidder i faces a single opponent j. Then, if both players play this strategy,

Pr(b_i > b_j) = Pr(b_i > c v_j) = Pr(v_j < b_i / c).        (39)
With valuations having a uniform distribution on [0, 1], (39) becomes
Pr(v_j < b_i / c) = b_i / c,

and (38) becomes

max_{b_i}  (v_i − b_i) (b_i / c),

with FOC

(v_i − 2 b_i) / c = 0,

or

b*_i = v_i / 2.

Hence, we have verified that the optimal strategy is to bid a fraction of one's valuation, in particular, c = 1/2.
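As a quick check on this derivation (again under the assumed two-bidder uniform setup, and purely as an illustration), the sketch below holds the opponent at the strategy b(v) = v/2 and searches over a grid of own bids for a given valuation; the payoff-maximizing bid comes out close to v_i/2, as the FOC predicts.

    import random

    # First price auction, two bidders, valuations ~ U[0,1], opponent bids v/2 (assumed setup).
    def avg_payoff(own_value, own_bid, trials=100000, seed=1):
        rng = random.Random(seed)          # same opponent draws for every candidate bid
        total = 0.0
        for _ in range(trials):
            opp_bid = rng.random() / 2     # opponent plays b(v) = v/2
            if own_bid > opp_bid:
                total += own_value - own_bid   # winner pays his own bid
        return total / trials

    v = 0.8
    best_bid = max((b / 100 for b in range(81)), key=lambda b: avg_payoff(v, b))
    print("approximate best bid for v = 0.8:", best_bid, "(theory: v/2 = 0.4)")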