Before the dynamic programming lectures

To get the mindset: 

Imagine that from a stock of (positive) size x, you get a proportional yield bx which can be reinvested for higher stock next period, or consumed now. b>0 is a constant.

You reinvest a fraction u (in the unit interval [0,1]) of bx, and consume (1-u)bx, from which you derive log utility. There is capital depreciation at a constant rate µ, so the stock tomorrow is g(x,u) = (1-µ)x + bux = (1-µ+bu)x. There is no discounting.

To get a solution, we shall assume b>1-µ. (And, µ in [0,1].)
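For reference, the primitives can be written down in a few lines of Python. The sketch below picks illustrative values b = 1.5 and µ = 0.1 (my choice, not part of the model) that satisfy b > 0, µ in [0,1] and b > 1-µ:

```python
import math

# Illustrative parameter values (not from the notes), chosen to satisfy
# b > 0, mu in [0,1], and b > 1 - mu.
b, mu = 1.5, 0.1

def g(x, u):
    """Next period's stock: (1 - mu + b*u) * x."""
    return (1 - mu + b * u) * x

def utility(x, u):
    """Log utility of today's consumption (1 - u) * b * x."""
    return math.log((1 - u) * b * x)
```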

 

Starting with the case where there is no future looks over-simplistic, but still:

0. No more future

At the end of the horizon, there is no use for future stock. You reinvest nothing, because that maximizes utility ln((1-u)bx) = ln(bx) + ln(1-u). Maximized by u = 0.

Often, one would not even formulate this "optimization" step: the model would say that at this stage, there is no reinvestment. 

Anyway: the value is ln b + ln x, where x is whatever stock you might have at this stage.

 

1. Time ends tomorrow. 

Choosing u today yields ln((1-u)bx) today. But tomorrow, you get a payoff from tomorrow's stock g = (1-µ+bu)x: Namely, ln b + ln g.

Inserting, you get to maximize ln((1-u)bx) + ln b + ln((1-µ+bu)x).
In this particular case, because logs behave so nicely, we get 2 ln x + 2 ln b + ln((1-u)(1-µ+bu)) to maximize. Already here, we see that the maximizer will not depend on x, so we will end up with 2 ln x + some constant. (The maximizer u* equals (b+µ-1)/2b - an interior max, by the assumptions made. Insert this, and you get the constant determined.)
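If you would rather not trust the calculus, the interior maximizer u* = (b+µ-1)/2b can be sanity-checked by brute force. With the illustrative values b = 1.5 and µ = 0.1 (my choice), u* equals 0.2:

```python
import math

b, mu = 1.5, 0.1  # illustrative values satisfying b > 1 - mu
u_star = (b + mu - 1) / (2 * b)  # claimed maximizer; equals 0.2 here

def objective(u):
    # The u-dependent part of the two-period value: ln(1-u) + ln(1-mu+b*u)
    return math.log(1 - u) + math.log(1 - mu + b * u)

# Brute-force grid search over [0, 1)
u_best = max((i / 10000 for i in range(10000)), key=objective)
print(u_star, u_best)  # both 0.2
```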

 

What we just did.

We maximized to get what? Today's value = today's direct utility + tomorrow's value.
(If there were discounting: "present value of tomorrow's value".) 

Note, "tomorrow's value" is a function of tomorrow's state, which depends on today's state and today's choice: g(x,u).
(We could have had time-dependence too.)

So if we let f be today's running utility and V be tomorrow's value, then today's value is \(v(x) = \displaystyle\max_{u\in[0,1]} \Big\{f(x,u)+V(g(x,u))\Big\}.\)

 

General principle

Value depends on the horizon. Call the time of the "0" case T. So the value ln b + ln x should be indexed with time. It is not uncommon to use the letter "J" for value (why not? It isn't that much used for other things) - so we index it by time and write \(J_T(x) = \ln b + \ln x\).

Recursively, we then have \({J_{t-1}(x)= \displaystyle\max_{u\in[0,1]} \Big\{f(t,x,u)+J_t(g(t,x,u))\Big\}}\)
(Here we have allowed both running utility f and the dynamics g to depend on time.)


In words: To get the optimal value today, optimize the sum of 

  • today's direct utility, and
  • value from tomorrow.

Note: "value" from tomorrow assumes that we behave optimally from tomorrow on. A half-assed attempt at implementation here in Norwegian: https://www.youtube.com/watch?v=BERRCrSBNdk - with an English translation at http://www.jakobsande.no/?info=12&dikt=822 

 

Exercise: 

With time t left, call the value \(J_{T-t}(x)\). For the above problem: prove by induction that this value is of the form \(C_{T-t} + A_{T-t}\ln x\) (understood: where the A's and C's do not depend on x).

To note on the language: phrases like "T period model" could lead to a bit of confusion. Is a static model one of zero periods, or of one? I intended "with time t left" to mean that the "0" case above is t = 0, i.e. \(J_{T-t} = J_{T-0} = J_T\).
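This is no substitute for the induction proof, but the claimed form can be sanity-checked numerically: if \(J_{T-t}(x)\) is affine in ln x, then doubling x raises the value by the same amount no matter where you start. A quick check along those lines, with the same illustrative b, µ and u-grid as before:

```python
import math

b, mu = 1.5, 0.1  # illustrative values satisfying b > 1 - mu
U = [i / 200 for i in range(200)]  # grid of reinvestment fractions in [0, 1)

def J(x, n):
    """Value with n decision periods left (n = 0: the terminal case)."""
    if n == 0:
        return math.log(b * x)
    return max(math.log((1 - u) * b * x) + J((1 - mu + b * u) * x, n - 1)
               for u in U)

# If J is affine in ln x, then J(2x, n) - J(x, n) is the same for every x.
for n in range(3):
    d1 = J(2.0, n) - J(1.0, n)
    d2 = J(4.0, n) - J(2.0, n)
    print(n, abs(d1 - d2))  # the differences agree, up to rounding
```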

 

Published Mar. 18, 2019 11:51 AM - Last modified Mar. 18, 2019 11:51 AM