Kreps 3 choice under uncertainty
Choice under uncertainty
Up to this point, we've been thinking of the bundles or objects among
which our consumer has been choosing as "sure things" - so many bottles
of wine, so many cans of beer, so many shots of whisky. Many important
consumption decisions concern choices the consequences of which are
uncertain at the time the choice is made. For example, when you choose
to buy a car (new or used), you aren't sure what the quality is. When you
choose an education, you aren't sure about your abilities, later opportunities,
the skills of your instructors, etc. Both in financial and real markets,
commodities of risky or uncertain character are traded all the time.
Nothing in the theory up to this point precludes such commodities. A
can of Olympia beer is a definite thing - a share of GM is another - and
we could simply proceed by taking as given the consumer's preferences
for bundles that contain so many cans of beer, so many shares of GM, and
so on. But because there is special structure to the commodity a share of GM
(or, rather, because we can model it as having a special structure), we are
able to (a) make further assumptions about the nature of our consumer's
preferences for such things and, thereby, (b) get out something more concrete
about the consumer's demand for these kinds of commodities. This
is the basic plot of this chapter.
We begin in section 3.1 with the theory of von Neumann-Morgenstern
expected utility. In this theory, uncertain prospects are modeled as probability
distributions over a given set of prizes. That is, the probabilities of
various prizes are given as part of the description of the object - probabilities
are objective. Section 3.2 takes up the special case where the prizes
in section 3.1 are amounts of money; then one is able to say a bit more
about the nature of the utility function that represents preferences. In section
3.3, we briefly discuss a few applications of this theory to the topic of
market demand. In section 3.4, we turn to a richer theory, where uncertain
prospects are functions from "states of nature" to prizes, and where
probabilities arise (if at all) subjectively, as part of the representation of
a consumer's preferences. In section 3.5 we will explore theoretical and
empirical problems with these models. Finally, we will turn briefly to a
normative (re)interpretation of this development. While our emphasis is
on the use of these models as descriptive models of choice, they are also
used as normative guides to aid consumers who must make choices. This
is sketched in section 3.6.
3.1. Von Neumann-Morgenstern expected utility
Setup
To begin, think of a basic set of "prizes" X - these are just like
our commodity bundles from before - and then a larger set of probability
distributions over the prizes. Let P denote the set of probability
distributions with prizes in X. For the time being, we will assume
that P consists only of probability distributions that have a finite number
of possible outcomes - such probability distributions are called simple.
Formally,
Definition. A simple probability distribution p on X is specified by
(a) a finite subset of X, called the support of p and denoted by supp(p), and
(b) for each $x \in \mathrm{supp}(p)$ a number $p(x) > 0$, with $\sum_{x \in \mathrm{supp}(p)} p(x) = 1$.
The set of simple probability distributions on X will be denoted by P.
To take an example, suppose that X is the positive orthant in $R^2$,
where $x = (x_1, x_2)$ represents $x_1$ cans of beer and $x_2$ bottles of wine. A
typical simple probability distribution is one with support {(10, 2), (4, 4)},
p((10, 2)) = 1/3, and p((4, 4)) = 2/3. This represents a one-third chance of
receiving 10 cans of beer and 2 bottles of wine and a two-thirds chance of
getting 4 cans and 4 bottles. We will depict simple probability distributions
by chance nodes; the example just given is depicted in figure 3.1. Note that
the numbers on branches are the probabilities, with the prizes written out
at the end of the branches.
Some terminology goes along with this: members x of X will be
called prizes or outcomes. Members p of P will be called lotteries,
gambles, and probability distributions, all interchangeably.
And some notation follows: The lottery that gives the prize x with
probability one will be written $\delta_x$.
How does this do as a model of the commodity space that we imagine
is "out there?" We can say three things here: First, if you thought of commodity
bundles as so many cans of beer, so many shares of GM, etc., then
you might have thought in terms of a vector of probability distributions
instead of a probability distribution on vectors. That is, in terms of our
example, we would write a typical commodity bundle as a vector $(p_1, p_2)$
where $p_1$ is a probability distribution over the number of cans of beer and
$p_2$ is a probability distribution over the number of bottles of wine. But
the way we are doing things is superior. You will learn why this is in
section 3.3, but for now (if it makes sense) note that if you have a vector
of probability distributions, you won't know about any correlations in the
prizes they give. A probability distribution over vector prizes tells you not
only the marginal distribution on each component but also all conditional
and joint probabilities.
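To make the point about correlations concrete, here is a minimal sketch in Python. The two lotteries and the helper `marginal` are illustrative assumptions, not taken from the text: both lotteries have the same marginal distribution over beer and the same marginal distribution over wine, yet they are different probability distributions over vectors.

```python
from fractions import Fraction

half = Fraction(1, 2)

# Two made-up lotteries over (cans of beer, bottles of wine).
p_matched = {(10, 4): half, (4, 2): half}   # lots of both, or little of both
p_opposed = {(10, 2): half, (4, 4): half}   # lots of one, little of the other

def marginal(lottery, component):
    """Marginal distribution of one component of a vector-valued lottery."""
    out = {}
    for prize, prob in lottery.items():
        key = prize[component]
        out[key] = out.get(key, 0) + prob
    return out

same_beer = marginal(p_matched, 0) == marginal(p_opposed, 0)
same_wine = marginal(p_matched, 1) == marginal(p_opposed, 1)
same_joint = p_matched == p_opposed
print(same_beer, same_wine, same_joint)
```

A vector of marginals could not tell these two lotteries apart, while a consumer who cares about having beer and wine together might well rank them differently.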
Second, the assumption that every distribution has finite support may
seem rather limiting. For one thing, we can't in this framework represent
a gamble constructed as follows: I flip a coin until the first time it is tails
and give you as many dollars as times I flipped the coin, which would
give you $1 with probability 1/2, $2 with probability 1/4, and so on. Or
if the prize space X was R, representing an amount of money, and it
was analytically convenient to have Normal probability distributions, our
formalism wouldn't be adequate. We will address this shortcoming at the
end of this section.
Third, the probabilities all come as part of the description of the object
- probabilities are "objective" instead of "subjective." But in real world
applications, there may be no objective probability for a random event.
For example, suppose an entrepreneur is considering a venture that will
earn her a given amount of money if a certain technique for gene splicing
works. To use our model in describing her choice problem, we need
to know the probability that this technique will work, something about
which well-informed individuals might disagree. We'll deal with subjective
probabilities in section 3.4.
To complete the setup, we need one more concept. Suppose we have
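two simple probability distributions p and q and a number $\alpha$ that lies
between zero and one, inclusive. Then we can form a new probability
distribution, written $\alpha p + (1 - \alpha)q$, in two steps:
(a) The support of this new probability distribution is the union of the
supports of p and q.
(b) If x is in the union of the supports of p and q, then the probability
given to x by $\alpha p + (1 - \alpha)q$ is $\alpha p(x) + (1 - \alpha)q(x)$, where p(x) is set
equal to zero if x is not in the support of p, and q(x) is set equal to
zero if x is not in the support of q.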
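The construction of a mixture can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a simple lottery is stored as a dictionary from prizes to probabilities, the function name `mix` is hypothetical, and the weight given to each prize is the usual convex combination of the two original probabilities, with a probability of zero for any prize outside a lottery's support.

```python
from fractions import Fraction

def mix(alpha, p, q):
    """Return alpha*p + (1 - alpha)*q for simple lotteries stored as dicts."""
    support = set(p) | set(q)                      # union of the two supports
    mixed = {x: alpha * p.get(x, 0) + (1 - alpha) * q.get(x, 0)
             for x in support}
    # Keep only prizes with strictly positive probability.
    return {x: prob for x, prob in mixed.items() if prob > 0}

# The beer-and-wine lottery from the example, and (4, 4) for sure.
p = {(10, 2): Fraction(1, 3), (4, 4): Fraction(2, 3)}
q = {(4, 4): Fraction(1)}

r = mix(Fraction(1, 2), p, q)
print(r)
```

Exact fractions are used so that the probabilities of the mixture visibly sum to one.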
Axioms for preference
Now assume that our consumer has preferences over the set P of all
simple probability distributions on X, given, as before, by a relation $\succ$ that
expresses strict preference. We insist on two properties immediately.
Assumption 1. $\succ$ must be asymmetric and negatively transitive.
This is just as before. Also as before, we construct from $\succ$ a weak preference
relation $\succeq$ and an indifference relation $\sim$.
We add to these two properties some properties that exploit the fact
that our objects of choice are probability distributions. Consider first
Assumption 2. Suppose p and q are two probability distributions such that
$p \succ q$. Suppose $\alpha$ is a number from the open interval (0, 1), and r is some other
probability distribution. Then $\alpha p + (1 - \alpha)r \succ \alpha q + (1 - \alpha)r$.
This is called the substitution axiom. The idea is that in both of the two final
probability distributions, the consumer is getting r with probability $1 - \alpha$,
hence this "same part" won't affect the consumer's preferences. Overall
preference depends on how the consumer feels about the differences between
the two, that is, on p vs. q. Since we suppose that the consumer
prefers p to q, and since $\alpha > 0$ implies that there is some chance that this
difference matters, we conclude that the consumer prefers $\alpha p + (1 - \alpha)r$ to
$\alpha q + (1 - \alpha)r$.
Assumption 3. Suppose that p, q, and r are three probability distributions
such that $p \succ q \succ r$. Then numbers $\alpha$ and $\beta$ exist, both from the open interval
(0, 1), such that $\alpha p + (1 - \alpha)r \succ q \succ \beta p + (1 - \beta)r$.
This is called (for obscure reasons) the Archimedean axiom. Think of it this
way: Since p is strictly better than q, then no matter how bad r is, we
can find some "mixture" of p and r, which we'll write $\alpha p + (1 - \alpha)r$,
with weight on p close enough to one so this mixture is better than q.
And, similarly, no matter how much better p is than q, we can find a $\beta$
sufficiently close to zero so $\beta p + (1 - \beta)r$ is worse than q. To help you
understand this, consider an example in which you might think it is false.
Suppose p gives you $100 for sure, q gives you $10 for sure, and r consists
of your death. You might then say that r is so much worse than q that
no probability $\alpha$ however close to one makes $\alpha p + (1 - \alpha)r$ better than q.
But if you think so, think again. Imagine that you are told that you can
have $10 right now or, if you choose to drive to some nearby location, a
check for $100 is waiting for you. If you are like most people, you will
probably get into your car to get the $100. But this, to a minute degree,
increases the chances of your death.
The representation
Does consumer choice conform to these axioms? A lot of experimental
evidence suggests that the answer to this is no; see section 3.5 below.
Despite this, vast quantities of economic theory are based on the assumption
that consumer choice does conform to these axioms, which leads to
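the following representation of preferences:
Proposition 3.1. A preference relation $\succ$ on the set P of simple probability
distributions on X satisfies assumptions 1, 2, and 3 if and only if there is
a function $u : X \to R$ such that $p \succ q$ if and only if
$\sum_{x \in \mathrm{supp}(p)} u(x)p(x) > \sum_{x \in \mathrm{supp}(q)} u(x)q(x)$.
Moreover, if u provides a representation of $\succ$ in this sense, then v does as
well if and only if constants $a > 0$ and $b$ exist such that $v = au + b$.
In words, $\succ$ has an expected utility representation. Each possible prize has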
a corresponding utility level, and the value of a probability distribution is
measured by the expected level of utility that it provides. Moreover, this
utility function is unique up to a positive affine transformation (which is
a fancy way of putting the last statement).
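As a small illustration of what the representation says, the following Python sketch computes expected utility for two simple lotteries and checks that a positive affine transformation of the utility function ranks them the same way. The utility function `u`, the transformation `v`, and the lottery `q` are made-up assumptions for this example; only the lottery `p` comes from the text.

```python
from fractions import Fraction

def expected_utility(lottery, utility):
    """Sum of utility(x) * probability over the support of a simple lottery."""
    return sum(prob * utility(x) for x, prob in lottery.items())

def u(x):
    # Hypothetical utility over (cans of beer, bottles of wine).
    return x[0] + 2 * x[1]

def v(x):
    # A positive affine transformation of u: v = 2u + 5.
    return 2 * u(x) + 5

p = {(10, 2): Fraction(1, 3), (4, 4): Fraction(2, 3)}
q = {(6, 3): Fraction(1)}

u_p, u_q = expected_utility(p, u), expected_utility(q, u)
v_p, v_q = expected_utility(p, v), expected_utility(q, v)
print(u_p > u_q, v_p > v_q)
```

Because expectation is linear, the expected utility under `v` is just twice the expected utility under `u` plus five, so the comparison between any two lotteries is unchanged.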
This is sometimes called a von Neumann-Morgenstern expected utility
representation, since one of the original modern developments of this
theory appears in von Neumann and Morgenstern's Theory of Games and
Economic Behavior. But the form goes back a good deal further, to the
eighteenth century and Daniel Bernoulli.
Note that the proposition establishes the existence of a numerical representation
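for preferences on P. That is, there is a function $U : P \to R$
such that $p \succ q$ if and only if $U(p) > U(q)$. Moreover, this function U
takes the form of expected utility of prizes: $U(p) = \sum_x u(x)p(x)$ for some
function $u : X \to R$.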
How is proposition 3.1 proven? We will sketch the proof here; if you
are ambitious and know a bit of real analysis, problem 1 will give you
a few hints on how to fill in the details. (It isn't very hard.) First, we'll
add one assumption (to be taken away in problem 1): In X are a best
prize b and a worst prize w; the consumer in question at least weakly
prefers b for sure to any other probability distribution over X, and any
other probability distribution over X is at least as good as w for sure. If
the consumer is indifferent between b for sure and w for sure, then the
representation is trivial, so we'll assume as well that $\delta_b \succ \delta_w$. Now we
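can use assumptions 1, 2, and 3 to obtain three lemmas.
Lemma 1. Suppose that $\alpha$ and $\beta$ are numbers from the interval [0, 1] with
$\alpha > \beta$. Then $\alpha\delta_b + (1 - \alpha)\delta_w \succ \beta\delta_b + (1 - \beta)\delta_w$.
Lemma 2. For any lottery p, there is a number $\alpha$ in [0, 1] such that
$\alpha\delta_b + (1 - \alpha)\delta_w \sim p$.
This result, which is sometimes simply assumed, is called the calibration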
property. It says that we can calibrate the consumer's preference for any
lottery in terms of a lottery that involves only the best and worst prizes.
Note that by virtue of lemma 1, we know that there is exactly one value
$\alpha$ that will do in lemma 2; if p were indifferent to two different mixtures
of the best and worst prizes, it would be indifferent to two things, one of
which is strictly preferred to the other.
Lemma 3. If $p \sim q$, r is any third lottery, and $\alpha$ is any number from the closed
interval [0, 1], then $\alpha p + (1 - \alpha)r \sim \alpha q + (1 - \alpha)r$.
This is just like the substitution axiom, except that $\succ$ is replaced by $\sim$ here.
This is sometimes assumed as an axiom, and it itself is then sometimes
called the substitution axiom.
The rest is very easy. For every prize x, define u(x) as that number
between zero and one (inclusive) such that
$\delta_x \sim u(x)\delta_b + (1 - u(x))\delta_w$.
This number u(x) will be the utility of the prize x. We know that one such
number exists by lemma 2, and this number is unique by lemma 1. Take
any lottery p:
Lemma 4. For $u : X \to R$ defined as above, any lottery p is indifferent
to the lottery that gives prize b with probability $\sum_x u(x)p(x)$ and w with the
complementary probability, where the sum is over all x in the support of p.
Once we have this result, we can use lemma 1 to finish the main
part of the proof: Compare any two lotteries, p and q. The lottery p
is indifferent to the lottery that gives b with probability $\sum_x u(x)p(x)$ and
w with the complementary probability, and q is indifferent to the lottery
that gives b with probability $\sum_x u(x)q(x)$ and w with the complementary
probability. We know by lemma 1 how to compare the two lotteries over
b and w: Whichever gives a higher probability of b is better. But this is
precisely the representation.
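The last step of this argument can be sketched in Python. The numbers are illustrative assumptions: prizes are dollar amounts with best prize 100 and worst prize 0, and the calibrated utilities in the table `u` are invented. The hypothetical helper `reduce_to_best_worst` returns the lottery over the best and worst prizes to which, by lemma 4, a given lottery is indifferent.

```python
from fractions import Fraction as F

BEST, WORST = 100, 0

# Hypothetical calibrated utilities: u[x] is the probability of BEST that
# makes the consumer indifferent between x for sure and the BEST/WORST lottery.
u = {0: F(0), 25: F(1, 2), 100: F(1)}

def reduce_to_best_worst(p):
    """Lottery over {BEST, WORST} with probability sum of u(x)p(x) on BEST."""
    prob_best = sum(prob * u[x] for x, prob in p.items())
    return {BEST: prob_best, WORST: 1 - prob_best}

p = {0: F(1, 4), 25: F(1, 2), 100: F(1, 4)}
q = {25: F(1, 3), 100: F(2, 3)}

p_reduced = reduce_to_best_worst(p)
q_reduced = reduce_to_best_worst(q)

# Lemma 1: between two BEST/WORST lotteries, more weight on BEST is better.
q_preferred = q_reduced[BEST] > p_reduced[BEST]
print(p_reduced, q_reduced, q_preferred)
```

Comparing the two reduced lotteries by their probability of the best prize is the same as comparing the expected utilities of p and q.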
Proving lemma 4 is fairly simple in concept but rather cumbersome
notationally. The idea is that we take each prize x in the support of p
and substitute for it the lottery that gives b with probability u(x) and
w with probability 1 - u(x). By lemma 3, each time we make such a
substitution we will have a new lottery that is indifferent to p. And
when we are done with all these substitutions, we will have the lottery
that gives prize b with probability $\sum_x u(x)p(x)$ and w with probability
$1 - \sum_x u(x)p(x)$. (Recall what lemma 1 says in words: if we look at
lotteries involving only the two prizes b and w, the consumer always
strictly prefers a higher probability of winning the better prize.)
Of course, this is only a demonstration of lemma 4 on a particular example;
it is not a proof. The exact proof is left as an exercise (problem 1(d));
if you try this, you should use induction on the size of the support of p.
All this was based on our added assumption that there was a best
and worst prize. But this assumption can be done away with; you can
either consult one of the standard reference books or try your hand at
problem 1(e).
To complete the proof of the proposition, we need to show that if
preferences have an expected utility numerical representation, then those
preferences satisfy assumptions 1 through 3, and we need to show that
a representing utility function u is unique up to positive affine transformations.
(That is, any other function v that gives an expected utility
representation for fixed preferences $\succ$ satisfies $v = au + b$ for constants
$a > 0$ and $b$.) These parts of the proof are left to you.
To summarize: We began by taking a choice space that had some
special structure - a set of probability distributions. We used that structure
to pose some axioms for the consumer's preferences that exploited
this structure. Then, using those axioms, we showed that a numerical
representation for consumer preferences can be created that exploits this
structure; viz., the representing function U on P takes the form of expected
utility for some function u defined on X. We continue this in the
next section, by making further assumptions about X and $\succ$.
But first we address a concern voiced earlier: In this development we
only got the representation for probability distributions with finite support.
Can this sort of representation be extended to probability distributions such
as, say, the Normal distribution? Yes, it certainly can be. We sketch one
way to proceed, which requires that you know some advanced probability
theory. Assume that X is $R^n$ or the positive orthant of $R^n$. (If you know
enough math, take X to be a compact subset of a complete separable metric
space.) Let P be the space of all Borel probability measures on X and assume
that preferences are continuous in the weak topology on measures. If assumptions
1, 2, and 3 hold for all probability distributions in P, they hold
for the subspace of simple probability distributions, so we produce an expected
utility representation for some function $u : X \to R$ that works for
simple probability distributions. We can then use the continuity assumption
to show that u must be bounded. (If u were unbounded [say] above, we
could find a sequence of prizes $\{x_n\}$ with $u(x_n) > 2^n$. Construct the lottery
that gives prize $x_n$ with probability $1/2^n$. This has "expected utility"
infinity. If you are clever, you should be able to show that this poses real
problems for the Archimedean axiom and weak continuity.) And then use
the fact that simple probability distributions are dense in the Borel probabilities
in the weak topology to extend the expected utility representation to all
of P. You can go on to show that u must be continuous (on X) as well as
bounded.
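The arithmetic behind the parenthetical remark on unbounded utility can be written out as follows. This is a sketch under the assumptions stated in the text: a sequence of prizes with $u(x_n) > 2^n$, and the lottery that gives $x_n$ with probability $1/2^n$ for $n = 1, 2, \ldots$. The probabilities sum to one, but every term of the "expected utility" sum exceeds one.

```latex
\sum_{n=1}^{\infty} \frac{1}{2^n} = 1,
\qquad
\sum_{n=1}^{\infty} \frac{1}{2^n}\, u(x_n)
\;\ge\; \sum_{n=1}^{\infty} \frac{1}{2^n}\, 2^n
\;=\; \sum_{n=1}^{\infty} 1
\;=\; \infty .
```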
The boundedness of u is somewhat limiting in applications. For example,
for reasons to be explored later, a very nice utility function in applications
is exponential, or $u(x) = -e^{-\lambda x}$, where X is the real line. This utility function
is unbounded below. So we would want a way to obtain an expected utility
representation for more than simple probability distributions that didn't require
bounded u. This can be done as well. Essentially, we obtained bounded
u because we took P to be all Borel probability distributions and we assumed
continuity over all of P. If one restricts P, say to distributions that meet certain
tail conditions, then the set of compatible utility functions becomes those
u that don't grow "too quickly" relative to the assumed rate at which tail
probabilities must die off. For further details, see Fishburn (1970) or Kreps
(1988).
3.2. On utility for money
Now we specialize even further to the case where the prizes are dollar
values. That is, X is the real line, or some interval thereof. We'll continue
to refer to the probability distributions p as lotteries, and we'll continue
to restrict attention to simple probability distributions. Throughout, we
assume that we have the three assumptions of the previous section and,
therefore, an expected utility representation.
In this case, it seems reasonable to suppose that our consumer prefers
more money to less. This has a straightforward consequence for the representation,
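which you should have no difficulty proving.
Proposition 3.2. Suppose that for all x and y in X such that $x > y$, $\delta_x \succ \delta_y$.
This is true if and only if the utility function u is strictly increasing.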
Risk aversion
More subtle is the property of risk aversion. First we need a piece of
notation. For a lottery p, let Ep represent the expected value of p, or
$Ep = \sum_x x\,p(x)$.
Proposition 3.3. Suppose that for all lotteries p, $\delta_{Ep} \succeq p$. This is true if and
only if the utility function u is concave.
A consumer who prefers to get the expected value of a gamble for sure
instead of taking the risky gamble (and whose utility function, in consequence,
is concave) is said to be risk averse. We could also define a
risk-seeking consumer as one for whom $p \succeq \delta_{Ep}$ for all p; this sort of
behavior goes with a convex utility function u. And a consumer is risk
neutral if $p \sim \delta_{Ep}$ for all p, which goes with a linear utility function. In economic
theory, risk aversion, which includes risk neutrality as a special case, is
typically assumed.
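The link between the shape of u and attitudes toward risk can be checked on one lottery. The sketch below is illustrative only: the lottery and the three utility functions (square root, identity, and square) are assumptions chosen for this example, and `gap` is a hypothetical helper that returns the utility of the expected value minus the expected utility.

```python
import math

def expected_value(p):
    return sum(x * prob for x, prob in p.items())

def expected_utility(p, u):
    return sum(prob * u(x) for x, prob in p.items())

def gap(p, u):
    """u(Ep) minus expected utility; positive means Ep for sure is preferred."""
    return u(expected_value(p)) - expected_utility(p, u)

p = {0: 0.5, 100: 0.5}                 # a fifty-fifty gamble over $0 and $100

concave_gap = gap(p, math.sqrt)        # concave u: risk averse
linear_gap = gap(p, lambda x: x)       # linear u: risk neutral
convex_gap = gap(p, lambda x: x * x)   # convex u: risk seeking
print(concave_gap, linear_gap, convex_gap)
```

The sign of the gap is positive for the concave function, zero for the linear one, and negative for the convex one, matching the three cases in the text.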
We have defined risk aversion in terms of comparisons of a lottery with
its expected value. We can generalize this sort of thing as follows. It is
possible to partially order lotteries according to their riskiness. That is, for
some pairs of lotteries p and p', it makes sense to say that p is riskier than p'.
For example, suppose that p' gives prizes 10 and 20 with probabilities 2/3
and 1/3, and p gives prizes 5, 15, and 30 with probabilities 1/3, 5/9, and
1/9, respectively. I claim that p is riskier than p'. Among the many ways to
justify this claim, one is that p is p' "plus noise." Imagine conducting p'. If
the outcome is 10, then either take 5 away (leaving 5) or give 5 more (leaving
15), each with probability 1/2. If the outcome is 20, either take away 5 with
probability 2/3 (leaving 15) or give 10 more with probability 1/3 (leaving 30).
Think of this as a compound lottery, and you will see that first I do p' and then,
conditional on the outcome, I conduct a lottery with expected value zero. But
this compound lottery reduces to p. Since p is gotten from p' in this fashion,
if a consumer was risk averse, she would (weakly) prefer p' to p. (Why?) In
general, if p and p' are related in this fashion (p is p' plus zero-conditional-mean
noise), then any risk averse decision maker will (weakly) prefer p' to
p. This will be true also if p is $p_1$ plus zero-conditional-mean noise, $p_1$ is $p_2$
plus such noise, ..., and $p_n$ is p' plus such noise.
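The claim that the compound lottery reduces to p can be verified directly. In this Python sketch the numbers are those of the example in the text; the dictionary layout and the helper name `add_noise` are assumptions made for illustration.

```python
from fractions import Fraction as F

# The less risky lottery: 10 with probability 2/3, 20 with probability 1/3.
p_prime = {10: F(2, 3), 20: F(1, 3)}

# Noise added conditional on each outcome of p_prime (amount: probability).
noise = {
    10: {-5: F(1, 2), 5: F(1, 2)},
    20: {-5: F(2, 3), 10: F(1, 3)},
}

def add_noise(base, noise):
    """Reduce the compound lottery 'base, then conditional noise' to one stage."""
    out = {}
    for x, px in base.items():
        for e, pe in noise[x].items():
            out[x + e] = out.get(x + e, 0) + px * pe
    return out

p = add_noise(p_prime, noise)

# Conditional mean of the noise at each outcome of p_prime.
noise_means = {x: sum(e * pe for e, pe in dist.items())
               for x, dist in noise.items()}
print(p, noise_means)
```

The reduced lottery gives 5, 15, and 30 with probabilities 1/3, 5/9, and 1/9, and the noise has mean zero conditional on each outcome, so the two lotteries also share the same expected value.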
As something of a converse, we can define "riskier than" by p is riskier
than p' if p and p' have the same expected value and every risk averse consumer
(weakly) prefers p' to p. Then an interesting question is, What is the
relationship between this definition of "riskier than" and the characterizations
that involve zero-conditional-mean noise? The general notion of more or less
risky gambles is an important one in the literature of the economics of uncertainty,
although we will not develop it here; see Rothschild and Stiglitz (1970,
1971, 1973), Diamond and Stiglitz (1974), or Machina (forthcoming).
Definition. A certainty equivalent for a lottery p is any prize x such that
$\delta_x \sim p$.
Proposition 3.3. If X is an interval of R and u is continuous, then every
lottery p has at least one certainty equivalent. If u is strictly increasing, every
p has at most one certainty equivalent.
(The proof is left as an exercise.) We henceforth assume that the utility
function under consideration is continuous, strictly increasing, and concave,
the latter two reflecting increasing and risk averse preferences. Hence
every p has a unique certainty equivalent, which we denote C(p). Note
that risk aversion, in this setting, can be characterized by $C(p) \le Ep$.
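When u is continuous and strictly increasing, the certainty equivalent can be found numerically by searching for the prize whose utility equals the lottery's expected utility. The following Python sketch uses bisection; the function names, the tolerance, the square-root utility, and the example lottery are all assumptions made for illustration.

```python
import math

def expected_utility(p, u):
    return sum(prob * u(x) for x, prob in p.items())

def certainty_equivalent(p, u, tol=1e-9):
    """Bisection for the prize x with u(x) equal to the expected utility of p.

    Assumes u is continuous and strictly increasing, so the answer lies
    between the smallest and largest prize in the support of p.
    """
    lo, hi = min(p), max(p)
    target = expected_utility(p, u)
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if u(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

p = {0: 0.5, 100: 0.5}
ce = certainty_equivalent(p, math.sqrt)
mean = sum(x * prob for x, prob in p.items())
print(ce, mean)
```

For this concave utility function the certainty equivalent is about 25, well below the expected value of 50.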
What right have we to assume that u is continuous? It turns out that
concavity of u implies that u is continuous on the interior of the interval X,
although perhaps not at any endpoints of the interval. (Can you show this?
Does the assumption that u is strictly increasing help about the endpoints in
any way?) But we will want u to be continuous even at the endpoints, and
we might want u to be continuous even if u didn't represent risk averse preferences:
One can prove that u must be continuous if and only if preferences
are continuous in the weak topology, relative to the space of simple probability
distributions. So if you know about the (relative) weak topology, you
know what is necessary and sufficient for this often very useful assumption
about u.
Absolute (and relative) risk aversion
Assume henceforth that any von Neumann-Morgenstern utility function
u we consider is strictly increasing, concave, and has at least two
continuous derivatives. [c] This implies, of course, that u is continuous,
and so each lottery p has a unique certainty equivalent C(p) that is less
than or equal to Ep. We will call the difference $Ep - C(p)$ the risk premium
of p, written R(p).
[c] What justification is there for the assumption that u is twice differentiable? We can
justify continuity and concavity. This then implies that u will have left- and right-hand
derivatives everywhere. The properties of nondecreasing or nonincreasing risk aversion are
then sufficient to show that u is continuously differentiable. (This makes a good homework
exercise.) Concavity again implies that u' is decreasing, and we could, by complicating matters
substantially, get away without assuming a continuous second derivative. But unless
one is being very much more exact about this than we are, it isn't worth the difficulty entailed.
Now consider a lottery p and a (dollar) amount z. Write $p \oplus z$ for the
lottery that gives prize $x + z$ with probability p(x). That is, $p \oplus z$ is just the
lottery constructed from p by increasing each prize of p by the amount
z. By thinking of these prizes as the "after the gamble" wealth level of
our consumer, as we increase z in $p \oplus z$, we increase the consumer's general
wealth level. It seems somewhat natural to suppose that as someone
becomes richer, she cares less and less about risks that she takes in given
gambles. In symbols, this would say that as z increases, $R(p \oplus z)$ should
not increase; the consumer's risk premium for a fixed gamble should not
increase as the consumer becomes wealthier. We formalize this notion and
two related to it as follows:
Definition. For a fixed consumer with utility function u, if $R(p \oplus z)$ is nonincreasing
in z, the consumer is said to be nonincreasingly risk averse. If
$R(p \oplus z)$ is constant in z, we say that the consumer has constant risk aversion.
If $R(p \oplus z)$ is nondecreasing in z, then we say the consumer is nondecreasingly
risk averse.
Our assertion about what seems "somewhat natural" is that consumers
are nonincreasingly or perhaps constantly risk averse.
What does all this portend for the utility function u? We give a
definition and then the result.
Definition. Given a (twice continuously differentiable, concave, strictly increasing)
utility function u, let $\lambda(x) = -u''(x)/u'(x)$, and call the function $\lambda$ the
coefficient of absolute risk aversion for the consumer.
Since u is concave and strictly increasing, $\lambda(x) \ge 0$.
Proposition 3.4. A consumer is nonincreasingly risk averse if and only if $\lambda$ (defined
from her von Neumann-Morgenstern utility function u) is a nonincreasing
function of x. The consumer is nondecreasingly risk averse if and only if $\lambda$ is
a nondecreasing function of x. And the consumer has constant risk aversion if
and only if $\lambda$ is a constant function, in which case the utility function u is a
positive affine translate of the utility function $-e^{-\lambda x}$. (If $\lambda$ is the constant zero,
then u(x) is a positive affine translate of the function x; the consumer is risk
neutral.)
We do not attempt to prove this or the subsequent proposition in this
subsection. References will be supplied at the end of the chapter.
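The proposition can be illustrated numerically, though of course this is no proof. The Python sketch below computes the risk premium of a fixed gamble shifted by several wealth levels z, once for an exponential utility function (constant coefficient of absolute risk aversion) and once for the square-root utility function, whose coefficient 1/(2x) falls as x rises. The gamble, the value 0.1 for the coefficient, the wealth levels, and all function names are assumptions chosen for this example.

```python
import math

def shift(p, z):
    """The lottery obtained from p by adding z to every prize."""
    return {x + z: prob for x, prob in p.items()}

def mean(p):
    return sum(x * prob for x, prob in p.items())

def risk_premium(p, u, u_inverse):
    """Expected value minus certainty equivalent, with the inverse of u given."""
    expected_u = sum(prob * u(x) for x, prob in p.items())
    return mean(p) - u_inverse(expected_u)

lam = 0.1  # illustrative constant coefficient of absolute risk aversion
cara = (lambda x: -math.exp(-lam * x), lambda y: -math.log(-y) / lam)
root = (math.sqrt, lambda y: y * y)

p = {0: 0.5, 10: 0.5}
wealth_levels = (0, 10, 20)

cara_premia = [risk_premium(shift(p, z), *cara) for z in wealth_levels]
root_premia = [risk_premium(shift(p, z), *root) for z in wealth_levels]
print(cara_premia)
print(root_premia)
```

Up to floating-point error, the exponential premia are the same at every wealth level, while the square-root premia shrink as z grows.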
We can use this general concept of the coefficient of risk aversion to
compare the risk averseness of two different consumers. Imagine we have
two consumers who conform to the model we are investigating. Let u
be the first consumer's von Neumann-Morgenstern utility function, with
$\lambda(x) = -u''(x)/u'(x)$, and let v be the second consumer's von Neumann-Morgenstern
utility function, with $\eta(x) = -v''(x)/v'(x)$. We wish to capture
formally the idea that the first consumer is at least as risk averse as the
second. A natural definition is:
Definition. The first consumer is at least as risk averse as the second if, for
every lottery p and sure thing x such that the first consumer weakly prefers the
lottery p to x for sure, the second consumer prefers the lottery as well.
Put another way, any time the at-least-as-risk-averse consumer is willing
to take the risk, so is the at-most-as-risk-averse consumer.
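The definition can be checked on a single example. In the Python sketch below, both consumers are assumed to have exponential utility functions, with coefficients 0.2 for the first and 0.05 for the second; these values, the gamble, the grid of sure things, and the helper `cara_ce` are all illustrative assumptions. A consumer weakly prefers the lottery to x for sure exactly when her certainty equivalent is at least x.

```python
import math

def cara_ce(p, coeff):
    """Certainty equivalent of p under the utility function -exp(-coeff * x)."""
    expected = sum(prob * math.exp(-coeff * x) for x, prob in p.items())
    return -math.log(expected) / coeff

p = {0: 0.5, 10: 0.5}
ce_first = cara_ce(p, 0.2)     # first consumer
ce_second = cara_ce(p, 0.05)   # second consumer

# Sure things 0, 0.5, 1, ..., 10. Whenever the first consumer accepts the
# lottery over x for sure, check that the second consumer accepts it too.
sure_things = [k / 2 for k in range(21)]
implication_holds = all(ce_second >= x for x in sure_things if ce_first >= x)
print(ce_first, ce_second, implication_holds)
```

In this example the first consumer's certainty equivalent is the smaller of the two, so every sure thing she would give up for the gamble is one the second consumer would give up as well.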
Up to this point, we've been thinking of the bundles or objects among
which our consumer has been choosing as "sure things"- so many bottles
of wine, so many cans of bee~ so many shots of whisky. Many important
consumption decisions concern choices the consequences of which are
uncertain at the time the choice is made. For example, when you choose
to buy a car (new or used), you aren't sure what the quality is. When you
choose an education, you aren't sure about your abilities, later opportunities,
the skills of your instructors, etc. Both in financial and real markets,
commodities of risky or uncertain character are traded all the time.
Nothing in the theory up to this point precludes such commodities. A
can of Olympia beer is a definite thing - a share of GM is another - and
we could simply proceed by taking as given the consumer's preferences
for bundles that contain so many cans of beer, so many shares of GM; and
so on. But because there is special structure to the commodity a share of GM
(or, rather, because we can model it as having a special structure), we are
able to (a) make further assumptions about the nature of our consumer's
preferences for such things and, thereby, (b) get out something more concrete
about the consumer's demand for these kinds of commodities. This
is the basic plot of this chapter.
We begin in section 3.1 with the theory of von Neumann-Morgenstem
expected utility. In this theory, uncertain prospects are modeled as probability
distributions over a given set of prizes. That is, the probabilities of
various prizes are given as part of the description of the object - probabilities
are objective. Section 3.2 takes up the special case where the prizes
in section 3.1 are amounts of money; then one is able to say a bit more
about the nature of the utility function that represents preferences. In section
3.3, we briefly discuss a few applications of this theory to the topic of
market demand. In section 3.4, we turn to a richer theory, where uncertain
prospects are functions from "states of nature" to prizes, and where
probabilities arise (if at all) subjectively, as part of the representation of
a consumer's preferences. In section 3.5 we will explore theoretical and
empirical problems with these models. Finally, we will turn briefly to a nonnative (re)interpretation of this development. While our emphasis is
on the use of these models as descriptive models of choice, they are also
used as normative guides to aid consumers who must make choices. This
is sketched in section 3.6.
3.1. Von Neumann-Morgenstern expected utility
Setup
To begin, think of a basic set of "prizes" X - these are just like
our commodity bundles from before - and then a larger set of probability
distributions over the prizes. Let P denote the set of probability
distributions with prizes in X For the time being, we will assume
that p' c9nsists only of probability distributions that have a finite number
of possible outcomes - such probability distributions are called simple.
Formally,
Definition. A simple probability distibution p on X is specified by
(a) a fiY~ite subset of X, called the support of p and denoted by supp(p), and
(b) for iach x E supp(p) a ~umber p(x) > 0, with :Z::xEsupp<plp(x) = 1.
The set of simple probability distributions on X will be denoted by P.
To take an example, suppose that X· is the positive orthant in R2
,
where x = (x1, x2) represents x1 cans of beer and X2 bottles of wine. A
typical simple probability distribution is one with support {(10, 2), (4,4)},
p((l0,2)) = 1/3, and p((4,4)) = 2/3. This represents a one-third chance of
receiving 10 cans of beer and 2 bottles of wine and a two-thirds chance of
getting 4 cans and 4 bottles. We will depict simple probability distributions
by chance nodes; the example just given is depicted in figure 3.1. Note that
the numbers on branches are the probabilities, with the prizes written out
at the end of the branches.
Some terminology goes along with this; members x of X will be
called prizes or outcomes. Members p of P will be called lotteries, gambles, and probability distributions, all interchangeably. a
And some notation follows: The lottery that gives the prize x with
probability one will be written Ox.
How does this do as a model of the commodity space that we imagine
is "out there?" We can say three things here: First, if you thought of commodity
bundles as so many cans of beer, so many shares of GM, etc., then
you might have thought in terms of a vector of probability distributions
instead of a probability distribution on vectors. That is, in terms of our
example, we would write a typical commodity bundle as a vector (p1, Pz)
where P1 is a probability distribution over the number of cans of beer and
Pz is a probability distribution over the number of bottles of wine. But
the way we are doing things is superior. You will learn why this is in
section 3.3, but for now (if it makes sense) note that if you have a vector
of probability distributions, you won't know about any correlations in the
prizes they give. A probability distribution over vector prizes tells you not
only the marginal distribution on each component but also all conditional
and joint probabilities.
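To see what is lost with a vector of marginal distributions, here is a minimal sketch with two hypothetical lotteries over (beer, wine) bundles. They have identical marginals on each component but are different joint distributions, because the correlation between beer and wine differs. The helper `marginal` is an illustrative name.

```python
from fractions import Fraction
from collections import defaultdict

def marginal(p, i):
    """Marginal distribution of component i of a lottery over vector prizes."""
    m = defaultdict(Fraction)
    for prize, prob in p.items():
        m[prize[i]] += prob
    return dict(m)

half = Fraction(1, 2)
# Hypothetical lotteries: beer and wine move in opposite directions in p, together in q.
p = {(10, 2): half, (4, 4): half}
q = {(10, 4): half, (4, 2): half}

same_marginals = all(marginal(p, i) == marginal(q, i) for i in (0, 1))
print(same_marginals, p == q)   # True False
```

A consumer who cares about the combination of beer and wine received could rank p and q differently, which the vector-of-marginals description cannot express.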
Second, the assumption that every distribution has finite support may
seem rather limiting. For one thing, we can't in this framework represent
a gamble constructed as follows: I flip a coin until the first time it is tails
and give you as many dollars as times I flipped the coin, which would
give you $1 with probability 1/2, $2 with probability 1/4, and so on. Or
if the prize space X was R, representing an amount of money, and it
was analytically convenient to have Normal probability distributions, our
formalism wouldn't be adequate. We will address this shortcoming at the
end of this section.
Third, the probabilities all come as part of the description of the object
- probabilities are "objective" instead of "subjective." But in real world
applications, there may be no objective probability for a random event.
For example, suppose an entrepreneur is considering a venture that will
earn her a given amount of money if a certain technique for gene splicing
works. To use our model in describing her choice problem, we need
to know the probability that this technique will work, something about
which well-informed individuals might disagree. We'll deal with subjective
probabilities in section 3.4.
To complete the setup, we need one more concept. Suppose we have
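two simple probability distributions p and q and a number $\alpha$ that lies
between zero and one, inclusive. Then we can form a new probability
distribution, written $\alpha p + (1 - \alpha) q$, in two steps:
(a) The support of this new probability distribution is the union of the
supports of p and q.
(b) Each x in this union is assigned probability $\alpha p(x) + (1 - \alpha) q(x)$,
where p(x) is taken to be zero if x is not in the support of p, and
similarly for q.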
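The mixing operation can be sketched in a few lines. This is an illustrative encoding only: lotteries are dicts from prizes to probabilities, the function name `mix` and the second lottery `q` are made up for the example, and a prize missing from one lottery is treated as having probability zero there.

```python
from fractions import Fraction

def mix(alpha, p, q):
    """Return alpha*p + (1 - alpha)*q for lotteries stored as {prize: probability}."""
    support = set(p) | set(q)                                   # union of the supports
    r = {x: alpha * p.get(x, 0) + (1 - alpha) * q.get(x, 0) for x in support}
    # Drop prizes that end up with weight zero (this only happens when alpha is 0 or 1).
    return {x: prob for x, prob in r.items() if prob > 0}

p = {(10, 2): Fraction(1, 3), (4, 4): Fraction(2, 3)}
q = {(4, 4): Fraction(1, 2), (0, 0): Fraction(1, 2)}            # hypothetical second lottery
r = mix(Fraction(1, 4), p, q)

print(r[(4, 4)])          # 13/24
print(sum(r.values()))    # 1
```

Here the prize (4, 4) gets weight (1/4)(2/3) + (3/4)(1/2) = 13/24, and the mixture is again a simple probability distribution.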
Axioms for preference
Now assume that our consumer has preferences over the set P of all
simple probability distributions on X, given, as before, by a relation $\succ$ that
expresses strict preference. We insist on two properties immediately.
Assumption 1. $\succ$ must be asymmetric and negatively transitive.
This is just as before. Also as before, we construct from $\succ$ a weak preference
relation $\succeq$ and an indifference relation $\sim$.
We add to these two properties some properties that exploit the fact
that our objects of choice are probability distributions. Consider first
Assumption 2. Suppose p and q are two probability distributions such that
$p \succ q$. Suppose $\alpha$ is a number from the open interval (0, 1), and r is some other
probability distribution. Then $\alpha p + (1 - \alpha) r \succ \alpha q + (1 - \alpha) r$.
This is called the substitution axiom. The idea is that in both of the two final
probability distributions, the consumer is getting r with probability $1 - \alpha$,
hence this "same part" won't affect the consumer's preferences. Overall
preference depends on how the consumer feels about the differences between
the two, that is, on p vs. q. Since we suppose that the consumer
prefers p to q, and since $\alpha > 0$ implies that there is some chance that this
difference matters, we conclude that the consumer prefers $\alpha p + (1 - \alpha) r$ to
$\alpha q + (1 - \alpha) r$.
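For a consumer who ranks lotteries by expected utility, the substitution axiom can be checked directly: mixing both p and q with a common r scales the utility gap by the mixing weight but never reverses its sign. The sketch below uses hypothetical prizes and utility levels; `mix` and `expected_utility` are illustrative helper names.

```python
from fractions import Fraction

def mix(alpha, p, q):
    """alpha*p + (1 - alpha)*q for lotteries stored as {prize: probability}."""
    return {x: alpha * p.get(x, 0) + (1 - alpha) * q.get(x, 0) for x in set(p) | set(q)}

def expected_utility(p, u):
    return sum(prob * u[x] for x, prob in p.items())

# Hypothetical utility levels for three prizes.
u = {"low": Fraction(0), "mid": Fraction(6), "high": Fraction(10)}
p = {"high": Fraction(1)}                               # strictly preferred to q
q = {"mid": Fraction(1, 2), "low": Fraction(1, 2)}
r = {"low": Fraction(1, 3), "high": Fraction(2, 3)}     # the common "same part"

gap = expected_utility(p, u) - expected_utility(q, u)
for alpha in (Fraction(1, 10), Fraction(1, 2), Fraction(9, 10)):
    mixed_gap = (expected_utility(mix(alpha, p, r), u)
                 - expected_utility(mix(alpha, q, r), u))
    # The gap between the two mixtures is exactly alpha times the original gap.
    assert mixed_gap == alpha * gap > 0
```

This only shows that expected utility preferences satisfy the axiom in one example; it says nothing about whether actual choice behavior does.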
Assumption 3. Suppose that p, q, and r are three probability distributions
such that $p \succ q \succ r$. Then numbers $\alpha$ and $\beta$ exist, both from the open interval
(0, 1), such that $\alpha p + (1 - \alpha) r \succ q \succ \beta p + (1 - \beta) r$.
This is called (for obscure reasons) the Archimedean axiom. Think of it this
way: Since p is strictly better than q, then no matter how bad r is, we
can find some "mixture" of p and r, which we'll write $\alpha p + (1 - \alpha) r$,
with weight on p close enough to one so this mixture is better than q.
And, similarly, no matter how much better p is than q, we can find a $\beta$
sufficiently close to zero so $\beta p + (1 - \beta) r$ is worse than q. To help you
understand this, consider an example in which you might think it is false.
Suppose p gives you $100 for sure, q gives you $10 for sure, and r consists
of your death. You might then say that r is so much worse than q that
no probability $\alpha$ however close to one makes $\alpha p + (1 - \alpha) r$ better than q.
But if you think so, think again. Imagine that you are told that you can
have $10 right now or, if you choose to drive to some nearby location, a
check for $100 is waiting for you. If you are like most people, you will
probably get into your car to get the $100. But this, to a minute degree,
increases the chances of your death.
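The example can be made concrete under expected utility. The utility numbers below are invented purely for illustration (in particular, the very large negative utility assigned to death is an assumption); the point is only that as long as r has a finite utility, weights strictly between zero and one exist on either side of q.

```python
from fractions import Fraction

# Hypothetical utility levels: p = 100 dollars for sure, q = 10 dollars for sure, r = death.
U_p, U_q, U_r = Fraction(100), Fraction(10), Fraction(-1_000_000)

def U_mix(a):
    """Expected utility of the mixture a*p + (1 - a)*r."""
    return a * U_p + (1 - a) * U_r

# Weight on p at which the mixture is exactly as good as q.
threshold = (U_q - U_r) / (U_p - U_r)
alpha = (threshold + 1) / 2      # strictly between threshold and 1
beta = threshold / 2             # strictly between 0 and threshold

assert 0 < beta < threshold < alpha < 1
assert U_mix(alpha) > U_q > U_mix(beta)
print(float(threshold))          # very close to, but below, one
```

With these numbers the required weight on p is roughly 0.99991: the chance of death has to be tiny, but it need not be zero.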
The representation
Does consumer choice conform to these axioms? A lot of experimental
evidence suggests that the answer to this is no; see section 3.5 below.
Despite this, vast quantities of economic theory are based on the assumption
that consumer choice does conform to these axioms, which leads to
the following representation of preferences:
In words, $\succ$ has an expected utility representation. Each possible prize has
a corresponding utility level, and the value of a probability distribution is
measured by the expected level of utility that it provides. Moreover, this
utility function is unique up to a positive affine transformation (which is
a fancy way of putting the last statement).
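A quick numerical sketch of both statements: lotteries are ranked by expected utility, and the ranking survives a positive affine transformation of u but not an arbitrary increasing one. The lotteries and the utility function (square root) are assumptions chosen for the example.

```python
import math

def expected_utility(p, u):
    return sum(prob * u(x) for x, prob in p.items())

# Hypothetical lotteries over dollar prizes.
p = {0: 0.5, 100: 0.5}
q = {36: 1.0}

u = math.sqrt                      # assumed utility function
v = lambda x: 3 * u(x) + 7         # positive affine transformation of u
w = lambda x: u(x) ** 4            # increasing in u, but not affine

prefers_q = lambda f: expected_utility(q, f) > expected_utility(p, f)
print(prefers_q(u), prefers_q(v), prefers_q(w))   # True True False
```

Under u the expected utilities are 5 for p and 6 for q; under v they are 22 and 25, so the ranking is unchanged. Under w they are 5000 and 1296, so w represents different preferences even though it orders sure prizes the same way.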
This is sometimes called a von Neumann-Morgenstern expected utility
representation, since one of the original modern developments of this
theory appears in von Neumann and Morgenstern's Theory of Games and
Economic Behavior. But the form goes back a good deal further, to the
eighteenth century and Daniel Bernoulli.
Note that the proposition establishes the existence of a numerical representation
for preferences on P. That is, there is a function $U : P \to R$
such that $p \succ q$ if and only if $U(p) > U(q)$.
How is proposition 3.1 proven? We will sketch the proof here; if you
are ambitious and know a bit of real analysis, problem 1 will give you
a few hints on how to fill in the details. (It isn't very hard.) First, we'll
add one assumption (to be taken away in problem 1): In X are a best
prize b and a worst prize w; the consumer in question at least weakly
prefers b for sure to any other probability distribution over X, and any
other probability distribution over X is at least as good as w for sure. If
the consumer is indifferent between b for sure and w for sure, then the
representation is trivial, so we'll assume as well that $\delta_b \succ \delta_w$. Now we
can use assumptions 1, 2, and 3 to obtain three lemmas.
This result, which is sometimes simply assumed, is called the calibration
property. It says that we can calibrate the consumer's preference for any
lottery in terms of a lottery that involves only the best and worst prizes.
Note that by virtue of lemma 1, we know that there is exactly one value
$\alpha$ that will do in lemma 2; if p were indifferent to two different mixtures
of the best and worst prizes, it would be indifferent to two things, one of
which is strictly preferred to the other.
Lemma 3. If $p \sim q$, r is any third lottery, and $\alpha$ is any number from the closed
interval [0, 1], then $\alpha p + (1 - \alpha) r \sim \alpha q + (1 - \alpha) r$.
This is just like the substitution axiom, except that $\succ$ is replaced by $\sim$ here.
This is sometimes assumed as an axiom, and it itself is then sometimes
called the substitution axiom.
The rest is very easy. For every prize x, define u(x) as that number
between zero and one (inclusive) such that
$\delta_x \sim u(x)\,\delta_b + (1 - u(x))\,\delta_w$.
This number u(x) will be the utility of the prize x. We know that one such
number exists by lemma 2, and this number is unique by lemma 1. Take
any lottery p:
Lemma 4. For $u : X \to R$ defined as above, any lottery p is indifferent
to the lottery that gives prize b with probability $\sum_x u(x)p(x)$ and w with the
complementary probability, where the sum is over all x in the support of p.
Once we have this result, we can use lemma 1 to finish the main
part of the proof: Compare any two lotteries, p and q. The lottery p
is indifferent to the lottery that gives b with probability $\sum_x u(x)p(x)$ and
w with the complementary probability, and q is indifferent to the lottery
that gives b with probability $\sum_x u(x)q(x)$ and w with the complementary
probability. We know by lemma 1 how to compare the two lotteries over
b and w: Whichever gives a higher probability of b is better. But this is
precisely the representation.
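The last step of the argument can be sketched numerically. The utilities below are hypothetical calibration values, already scaled so that the worst prize w has utility 0 and the best prize b has utility 1; each lottery is then reduced to "probability of b" and the two are compared.

```python
from fractions import Fraction

# Hypothetical calibrated utilities: u(x) is the weight on b that makes the
# b/w lottery indifferent to x for sure.
u = {"w": Fraction(0), "x": Fraction(2, 5), "y": Fraction(3, 4), "b": Fraction(1)}

def prob_of_best(p):
    """Probability of b in the b/w lottery that is indifferent to p (as in lemma 4)."""
    return sum(u[prize] * prob for prize, prob in p.items())

p = {"x": Fraction(1, 2), "b": Fraction(1, 2)}
q = {"y": Fraction(2, 3), "w": Fraction(1, 3)}

print(prob_of_best(p), prob_of_best(q))    # 7/10 1/2
print(prob_of_best(p) > prob_of_best(q))   # True: p is preferred to q
```

The comparison of the two "probability of b" numbers is exactly a comparison of expected utilities.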
Proving lemma 4 is fairly simple in concept but rather cumbersome
notationally. The idea is that we take each prize x in the support of p
and substitute for it the lottery that gives b with probability u(x) and
w with probability 1 - u(x). By lemma 3, each time we make such a
substitution we will have a new lottery that is indifferent to p. And
when we are done with all these substitutions, we will have the lottery
that gives prize b with probability $\sum_x u(x)p(x)$ and w with the complementary probability.
Or, in words, if we look at lotteries involving only the two prizes b and
w, the consumer always (strictly) prefers a higher probability of winning
the better prize.
Of course, this is only a demonstration of lemma 4 on a particular example;
it is not a proof. The exact proof is left as an exercise (problem 1(d));
if you try this, you should use induction on the size of the support of p.
All this was based on our added assumption that there was a best
and worst prize. But this assumption can be done away with; you can
either consult one of the standard reference books or try your hand at
problem 1(e).
To complete the proof of the proposition, we need to show that if
preferences have an expected utility numerical representation, then those
preferences satisfy assumptions 1 through 3, and we need to show that
a representing utility function u is unique up to positive affine transformations.
(That is, any other function v that gives an expected utility
representation for fixed preferences $\succ$ satisfies v = au + b for constants
a > 0 and b.) These parts of the proof are left to you.
To summarize: We began by taking a choice space that had some
special structure - a set of probability distributions. We used that structure
to pose some axioms for the consumer's preferences that exploited
this structure. Then, using those axioms, we showed that a numerical
representation for consumer preferences can be created that exploits this
structure; viz., the representing function U on P takes the form of expected
utility for some function u defined on X. We continue this in the
next section, by making further assumptions about X and $\succ$.
But first we address a concern voiced earlier: In this development we
only got the representation for probability distributions with finite support.
Can this sort of representation be extended to probability distributions such
as, say, the Normal distribution? Yes, it certainly can be.
We sketch one way to proceed, which requires that you know some advanced probability
theory. Assume that X is $R^n$ or the positive orthant of $R^n$. (If you know
enough math, take X to be a compact subset of a complete separable metric
space.) Let P be the space of all Borel probability measures on X and assume
that preferences are continuous in the weak topology on measures. If assumptions
1, 2, and 3 hold for all probability distributions in P, they hold
for the subspace of simple probability distributions, so we produce an expected
utility representation for some function $u : X \to R$ that works for
simple probability distributions. We can then use the continuity assumption
to show that u must be bounded. (If u were unbounded [say] above, we
could find a sequence of prizes $\{x_n\}$ with $u(x_n) > 2^n$. Construct the lottery
that gives prize $x_n$ with probability $1/2^n$. This has "expected utility"
infinity. If you are clever, you should be able to show that this poses real
problems for the Archimedean axiom and weak continuity.) And then use
the fact that simple probability distributions are dense in the Borel probabilities
in the weak topology to extend the expected utility representation to all
of P. You can go on to show that u must be continuous (on X) as well as
bounded.
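The divergence in the parenthetical argument is easy to see numerically. The sketch assumes, purely for illustration, utilities $u(x_n) = 2^n + 1$ (any values exceeding $2^n$ would do) and sums the first N terms of the "expected utility" of the lottery that gives $x_n$ with probability $1/2^n$.

```python
from fractions import Fraction

def partial_expected_utility(N):
    """Sum of the first N terms of sum_n u(x_n) / 2**n, assuming u(x_n) = 2**n + 1."""
    return sum(Fraction(2**n + 1, 2**n) for n in range(1, N + 1))

for N in (10, 100, 1000):
    # Every term exceeds one, so the partial sums grow without bound.
    assert partial_expected_utility(N) > N

print(float(partial_expected_utility(10)))
```

Since each term contributes more than one, no finite number can serve as the expected utility of this lottery.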
The boundedness of u is somewhat limiting in applications. For example,
for reasons to be explored later, a very nice utility function in applications
is exponential, or $u(x) = -e^{-\lambda x}$, where X is the real line. This utility function
is unbounded below. So we would want a way to obtain an expected utility
representation for more than simple probability distributions that didn't require
bounded u. This can be done as well. Essentially, we obtained bounded
u because we took P to be all Borel probability distributions and we assumed
continuity over all of P. If one restricts P, say to distributions that meet certain
tail conditions, then the set of compatible utility functions becomes those
u that don't grow "too quickly" relative to the assumed rate at which tail
probabilities must die off. For further details, see Fishburn (1970) or Kreps
(1988).
3.2. On utility for money
Now we specialize even further to the case where the prizes are dollar
values. That is, X is the real line, or some interval thereof. We'll continue
to refer to the probability distributions p as lotteries, and we'll continue
to restrict attention to simple probability distributions. Throughout, we
assume that we have the three assumptions of the previous section and,
therefore, an expected utility representation.
In this case, it seems reasonable to suppose that our consumer prefers
more money to less. This has a straightforward consequence for the representation,
which you should have no difficulty proving.
Risk aversion
More subtle is the property of risk aversion. First we need a piece of
notation. For a lottery p, let Ep represent the expected value of p, or
$Ep = \sum_x x\,p(x)$.
Proposition 3.3. For all lotteries p, $\delta_{Ep} \succeq p$ if and
only if the utility function u is concave.
A consumer who prefers to get the expected value of a gamble for sure
instead of taking the risky gamble (and whose utility function, in consequence,
is concave) is said to be risk averse. We could also define a
risk-seeking consumer as one for whom $p \succeq \delta_{Ep}$ for all p; this sort of
behavior goes with a convex utility function u. And a consumer is risk
neutral if $p \sim \delta_{Ep}$ for all p, which goes with a linear utility function. In economic
theory, risk aversion, which includes risk neutrality as a special case, is
typically assumed.
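The three cases can be illustrated with one hypothetical lottery and three assumed utility functions (concave, convex, and linear). The sketch compares the utility of the expected value for sure with the expected utility of the lottery.

```python
import math

def expected_value(p):
    return sum(prob * x for x, prob in p.items())

def expected_utility(p, u):
    return sum(prob * u(x) for x, prob in p.items())

p = {0: 0.5, 100: 0.5}               # hypothetical lottery, Ep = 50
Ep = expected_value(p)

concave = math.sqrt                  # risk averse
convex = lambda x: x ** 2            # risk seeking
linear = lambda x: 2 * x + 1         # risk neutral

print(concave(Ep) > expected_utility(p, concave))   # True: Ep for sure is preferred
print(convex(Ep) < expected_utility(p, convex))     # True: the gamble is preferred
print(linear(Ep) == expected_utility(p, linear))    # True: indifference
```

One lottery cannot prove concavity, of course; the proposition requires the comparison to hold for all lotteries.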
We have defined risk aversion in terms of comparisons of a lottery with
its expected value. We can generalize this sort of thing as follows. It is
possible to partially order lotteries according to their riskiness. That is, for
some pairs of lotteries p and p', it makes sense to say that p is riskier than p'.
For example, suppose that p' gives prizes 10 and 20 with probabilities 2/3
and 1/3, and p gives prizes 5, 15, and 30 with probabilities 1/3, 5/9, and
1/9, respectively. I claim that p is riskier than p'. Among the many ways to
justify this claim, one is that p is p' "plus noise." Imagine conducting p'. If
the outcome is 10, then either take 5 away (leaving 5) or give 5 more (leaving
15), each with probability 1/2. If the outcome is 20, either take away 5 with
probability 2/3 (leaving 15) or give 10 more with probability 1/3 (leaving 30). Think
of this as a compound lottery, and you will see that first I do p' and then,
conditional on the outcome, I conduct a lottery with expected value zero. But
this compound lottery reduces to p. Since p is gotten from p' in this fashion,
if a consumer was risk averse, she would (weakly) prefer p' to p. (Why?) In
general, if p and p' are related in this fashion (p is p' plus zero-conditional-mean
noise), then any risk averse decision maker will (weakly) prefer p' to
p. This will be true also if p is $p_1$ plus zero-conditional-mean noise, $p_1$ is $p_2$
plus such noise, ..., and $p_n$ is p' plus such noise.
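The arithmetic of this example can be verified mechanically. The sketch below encodes the first-stage lottery and the conditional noise from the text, reduces the compound lottery, and checks that the noise has conditional mean zero. The variable and function names are illustrative.

```python
from fractions import Fraction as F
from collections import defaultdict

p_prime = {10: F(2, 3), 20: F(1, 3)}
# Noise added conditional on the first-stage outcome: {outcome: {increment: probability}}.
noise = {10: {-5: F(1, 2), 5: F(1, 2)},
         20: {-5: F(2, 3), 10: F(1, 3)}}

def reduce_compound(first, noise):
    """Collapse the two-stage lottery into a single distribution over final prizes."""
    out = defaultdict(F)
    for x, px in first.items():
        for e, pe in noise[x].items():
            out[x + e] += px * pe
    return dict(out)

p = reduce_compound(p_prime, noise)
print(p == {5: F(1, 3), 15: F(5, 9), 30: F(1, 9)})                         # True

# The noise has mean zero conditional on each first-stage outcome.
print(all(sum(e * pe for e, pe in d.items()) == 0 for d in noise.values()))  # True
```

Because the noise is mean zero, p and the first-stage lottery also share the same expected value, 40/3.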
As something of a converse, we can define "riskier than" by p is riskier
than p' if p and p' have the same expected value and every risk averse consumer
(weakly) prefers p' to p. Then an interesting question is, What is the
relationship between this definition of "riskier than" and the characterizations
that involve zero-conditional-mean noise? The general notion of more or less
risky gambles is an important one in the literature of the economics of uncertainty,
although we will not develop it here; see Rothschild and Stiglitz (1970,
1971, 1973), Diamond and Stiglitz (1974), or Machina (forthcoming).
Definition. A certainty equivalent for a lottery p is any prize x such that
$\delta_x \sim p$.
Proposition 3.3. If X is an interval of R and u is continuous, then every
lottery p has at least one certainty equivalent. If u is strictly increasing, every
p has at most one certainty equivalent.
(The proof is left as an exercise.) We henceforth assume that the utility
function under consideration is continuous, strictly increasing, and concave,
the latter two reflecting increasing and risk averse preferences. Hence
every p has a unique certainty equivalent, which we denote C(p). Note
that risk aversion, in this setting, can be characterized by $C(p) \le Ep$.
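When u is continuous and strictly increasing, the certainty equivalent can be found numerically by bisection between the smallest and largest prizes in the support. This is a minimal sketch under those assumptions; the function name, the tolerance, and the square-root utility are choices made for the example.

```python
import math

def certainty_equivalent(p, u, tol=1e-9):
    """Find x with u(x) = E[u] by bisection; assumes u continuous and strictly increasing."""
    target = sum(prob * u(x) for x, prob in p.items())
    lo, hi = min(p), max(p)            # the certainty equivalent lies between these prizes
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if u(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

p = {0: 0.5, 100: 0.5}                 # hypothetical lottery with Ep = 50
ce = certainty_equivalent(p, math.sqrt)
print(round(ce, 6))                    # 25.0
```

With square-root utility the expected utility is 5, so the certainty equivalent is 25, which is below the expected value of 50, as risk aversion requires.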
What right have we to assume that u is continuous? It turns out that
concavity of u implies that u is continuous on the interior of the interval X,
although perhaps not at any endpoints of the interval. (Can you show this?
Does the assumption that u is strictly increasing help about the endpoints in
any way?) But we will want u to be continuous even at the endpoints, and
we might want u to be continuous even if u didn't represent risk averse preferences:
One can prove that u must be continuous if and only if preferences
are continuous in the weak topology, relative to the space of simple probability
distributions. So if you know about the (relative) weak topology, you
know what is necessary and sufficient for this often very useful assumption
about u.
Absolute (and relative) risk aversion
Assume henceforth that any von Neumann-Morgenstern utility function
u we consider is strictly increasing, concave, and has at least two
continuous derivatives (see note c). This implies, of course, that u is continuous,
and so each lottery p has a unique certainty equivalent C(p) that is less
than or equal to Ep. We will call the difference Ep - C(p) the risk premium
of p, written R(p).
c What justification is there for the assumption that u is twice differentiable? We can
justify continuity and concavity. This then implies that u will have left- and right-hand
derivatives everywhere. The properties of nondecreasing or nonincreasing risk aversion are
then sufficient to show that u is continuously differentiable. (This makes a good homework
exercise.) Concavity again implies that u' is decreasing, and we could, by complicating matters
substantially, get away without assuming a continuous second derivative. But unless
one is being very much more exact about this than we are, it isn't worth the difficulty entailed.
Now consider a lottery p and a (dollar) amount z. Write $p \oplus z$ for the
lottery that gives prize x + z with probability p(x). That is, $p \oplus z$ is just the
lottery constructed from p by increasing each prize of p by the amount
z. By thinking of these prizes as the "after the gamble" wealth level of
our consumer, as we increase z in $p \oplus z$, we increase the consumer's general
wealth level. It seems somewhat natural to suppose that as someone
becomes richer, she cares less and less about risks that she takes in given
gambles. In symbols, this would say that as z increases, $R(p \oplus z)$ should
not increase; the consumer's risk premium for a fixed gamble should not
increase as the consumer becomes wealthier. We formalize this notion and
two related to it as follows:
Definition. For a fixed consumer with utility function u, if $R(p \oplus z)$ is nonincreasing
in z, the consumer is said to be nonincreasingly risk averse. If
$R(p \oplus z)$ is constant in z, we say that the consumer has constant risk aversion.
If $R(p \oplus z)$ is nondecreasing in z, then we say the consumer is nondecreasingly
risk averse.
Our assertion about what seems "somewhat natural" is that consumers
are nonincreasingly or perhaps constantly risk averse.
What does all this portend for the utility function u? We give a
definition and then the result.
Definition. Given a (twice continuously differentiable, concave, strictly increasing)
utility function u, let $\lambda(x) = -u''(x)/u'(x)$, and call the function $\lambda$ the
coefficient of absolute risk aversion for the consumer.
Since u is concave and strictly increasing, $\lambda(x) \ge 0$.
Proposition 3.4. A consumer is nonincreasingly risk averse if and only if $\lambda$ (defined
from her von Neumann-Morgenstern utility function u) is a nonincreasing
function of x. The consumer is nondecreasingly risk averse if and only if $\lambda$ is
a nondecreasing function of x. And the consumer has constant risk aversion if
and only if $\lambda$ is a constant function, in which case the utility function u is a
positive affine translate of the utility function $-e^{-\lambda x}$. (If $\lambda$ is the constant zero,
then u(x) is a positive affine translate of the function x; the consumer is risk
neutral.)
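A numerical sketch of the proposition for two assumed utility functions: the exponential $-e^{-\lambda x}$, whose coefficient of absolute risk aversion is the constant $\lambda$, and the logarithm, whose coefficient is 1/x and hence decreasing. The lottery, the value of `lam`, and the wealth shifts z are hypothetical; the risk premium is computed as the expected value minus the certainty equivalent, using the inverse of u.

```python
import math

def risk_premium(p, z, u, u_inv):
    """R(p + z) = E(p + z) - C(p + z), where C is obtained from the inverse of u."""
    shifted = {x + z: prob for x, prob in p.items()}
    mean = sum(prob * x for x, prob in shifted.items())
    eu = sum(prob * u(x) for x, prob in shifted.items())
    return mean - u_inv(eu)

p = {10: 0.5, 30: 0.5}     # hypothetical lottery
lam = 0.1                  # assumed constant coefficient of absolute risk aversion

cara = (lambda x: -math.exp(-lam * x), lambda y: -math.log(-y) / lam)
log_u = (math.log, math.exp)

for z in (0, 50, 100):
    print(z, round(risk_premium(p, z, *cara), 6), round(risk_premium(p, z, *log_u), 6))
```

The exponential column stays the same for every z (constant risk aversion), while the logarithmic column falls as z grows (nonincreasing risk aversion).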
We do not attempt to prove this or the subsequent proposition in this
subsection. References will be supplied at the end of the chapter.
We can use this general concept of the coefficient of risk aversion to
compare the risk averseness of two different consumers. Imagine we have
two consumers who conform to the model we are investigating. Let u
be the first consumer's von Neumann-Morgenstern utility function, with
$\lambda(x) = -u''(x)/u'(x)$, and let v be the second consumer's von Neumann-Morgenstern
utility function, with $\eta(x) = -v''(x)/v'(x)$. We wish to capture
formally the idea that the first consumer is at least as risk averse as the
second. A natural definition is:
Definition. The first consumer is at least as risk averse as the second if, for
every lottery p and sure thing x such that the first consumer weakly prefers the
lottery p to x for sure, the second consumer prefers the lottery as well.
Put another way, any time the at-least-as-risk-averse consumer is willing
to take the risk, so is the at-most-as-risk-averse consumer.
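The definition can be checked on examples. The sketch assumes both consumers have exponential utility $-e^{-\lambda x}$ with hypothetical constant coefficients 0.2 and 0.05, and uses a handful of made-up lotteries. A consumer weakly prefers a lottery to x for sure exactly when x is no larger than her certainty equivalent, so it suffices to compare certainty equivalents.

```python
import math

def cara_certainty_equivalent(p, lam):
    """Certainty equivalent of lottery p under u(x) = -exp(-lam * x)."""
    return -math.log(sum(prob * math.exp(-lam * x) for x, prob in p.items())) / lam

lam_first, lam_second = 0.2, 0.05     # hypothetical coefficients, first larger than second
lotteries = [{0: 0.5, 20: 0.5},
             {5: 0.25, 10: 0.5, 40: 0.25},
             {-10: 0.9, 100: 0.1}]

for p in lotteries:
    c1 = cara_certainty_equivalent(p, lam_first)
    c2 = cara_certainty_equivalent(p, lam_second)
    # Any sure amount x the first consumer would give up for p satisfies x <= c1 <= c2,
    # so the second consumer prefers p to x as well.
    assert c1 <= c2
    print(round(c1, 3), round(c2, 3))
```

In these examples the consumer with the larger coefficient always has the smaller certainty equivalent, so the definition is satisfied for the lotteries tried; this is an illustration on a few cases, not a proof.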