Executive Summary
One of the most common criticisms of the use of Monte Carlo in financial planning is its typical assumption that investment returns are normally distributed, when in reality the market appears to go through environments that may be more volatile than a normal distribution would predict, as highlighted by the events of the financial crisis. In the last four months of 2008 alone, the market experienced 20 high-volatility trading days with moves ranging from 3.5 to almost 10 standard deviations... each of which should not have occurred more frequently than once per millennium to once per several billion years. Yet when we look at those returns on an annual basis, we see a very different picture - the one-year decline to the bottom in March of 2009 was a "mere" 2.5 standard deviation event, which is uncommon but entirely probable under a normal distribution. Which raises the question - are "black swans" just a short-term phenomenon that averages out by the end of the year, and are we focusing too much on impossibly rare black swans instead of the rare-but-entirely-probable 2 standard deviation decline?
The inspiration for today's blog post is a continuation of the analysis of black swans I have been working on as a part of my recent series on Monte Carlo analysis in the January and pending February issues of The Kitces Report. In exploring the ongoing criticism of the so-called "black swans" that violate the normal distribution assumption in Monte Carlo planning, I discovered a very interesting fact: black swans seem to be entirely a short-term phenomenon.
For instance, the graph below shows the actual daily price volatility (red line) of the Dow Jones Industrial Average for roughly the past century (from 1900 through late 2009) against the frequency of those results that would be expected from a normal distribution (green line). As the results reveal, when looking at the actual historical data, there is a surprising frequency of very extreme results, at least relative to the "0" probability they were anticipated to have. To either side of the green bell curve, we see several small red peaks showing historical data points that could only be termed "extreme outliers" - in fact, the graph has to be widened so far to fit them, that the green line is condensed in the middle and looks more like a green bump than a green bell! (If we zoomed in to only look at the center of the graph, though, the familiar bell-shaped curve would be present.) Although many of the outliers shown by the red line literally represent no more than 1-10 occurrences out of over 27,000 daily data points, the fact that they exist at all is shocking; any result beyond 5 standard deviations should only occur once every several millennia, yet we see numerous instances in this barely-more-than-a-century data set; the most extreme results, at +/- 10 standard deviations (with two data points at almost -20 standard deviations), would not be expected to occur even once in several lifetimes of the universe!
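For readers who want to check how rare these daily moves should be, here is a minimal sketch of the arithmetic under a normal distribution. It is not the calculation behind the graph; it assumes roughly 252 trading days per year and counts moves in either direction, and the function names are made up for illustration.

```python
import math

TRADING_DAYS_PER_YEAR = 252  # assumption: typical number of US trading days


def two_sided_tail_probability(k: float) -> float:
    """P(|Z| >= k) for a standard normal Z, i.e. a move of k sd in either direction."""
    return math.erfc(k / math.sqrt(2.0))


def expected_years_between(k: float) -> float:
    """Average wait, in years, between daily moves of at least k standard deviations."""
    return 1.0 / (two_sided_tail_probability(k) * TRADING_DAYS_PER_YEAR)


if __name__ == "__main__":
    for k in (5, 10):
        years = expected_years_between(k)
        print(f"{k} sd daily move: about once every {years:.3g} years")
```

Under these assumptions a 5 standard deviation day works out to roughly once in several thousand years, and a 10 standard deviation day to a wait far longer than the age of the universe, which is why seeing any of them in a century of data is so striking.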
However, it is notable that these results are based on daily return volatility, while in practice financial planning clients are investing for months, years, or even decades at a time, and have relatively modest ongoing withdrawals (or may even still be contributing). It's not as though financial planning clients are invested with extensive leverage, where an extreme short-term event can render them insolvent before a recovery comes. After all, a client who is withdrawing less than 5% of the portfolio in any particular year might not be harmed at all by extreme daily volatility as long as it recovers by the end of the year, and in fact might not be harmed much by a severe one-year portfolio decline, as long as it is followed by an equally volatile recovery in the opposite direction in the subsequent year. And as the above graph highlights, the unexpected "black swans" do occur in the positive direction as well as the negative, creating the genuine potential that positive black swans could largely offset negative ones.
To look at this effect over time, the next graph below shows the distribution of historical market returns when viewed on an annual basis, instead of a daily basis. As the results reveal, in reality over the span of a year, daily negative black swans are in fact offset by positive black swans as mean reversion (and/or the impact of market valuation) takes hold, and consequently the market declines in the historical data on an annual basis are actually remarkably close to what is predicted by a normal distribution! Just as the normal distribution would predict no extreme outliers beyond approximately 3.5 standard deviations in "only" a century of data, the actual results confirm there weren't any!
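The averaging effect can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not a reconstruction of the Dow data: daily returns are drawn independently from a hypothetical mixture of calm and volatile days (the 5% mix and both volatilities are invented), and they are treated as log returns so that a year is simply the sum of 252 days. Excess kurtosis, which is about 0 for a normal distribution, serves as the measure of fat tails.

```python
import random
import statistics
from typing import List, Tuple

DAYS_PER_YEAR = 252  # assumption: trading days per year
YEARS = 2000         # assumption: many simulated years, for stable estimates


def fat_tailed_daily_return(rng: random.Random) -> float:
    """Hypothetical mixture: 95% calm days, 5% days with five times the volatility."""
    sigma = 0.04 if rng.random() < 0.05 else 0.008
    return rng.gauss(0.0003, sigma)


def excess_kurtosis(values: List[float]) -> float:
    """Sample excess kurtosis; close to 0 when the data look normal."""
    mean = statistics.fmean(values)
    variance = statistics.pvariance(values, mu=mean)
    fourth_moment = statistics.fmean((v - mean) ** 4 for v in values)
    return fourth_moment / variance ** 2 - 3.0


def simulate(seed: int = 1) -> Tuple[float, float]:
    """Return (daily, annual) excess kurtosis for one simulated history."""
    rng = random.Random(seed)
    daily = [fat_tailed_daily_return(rng) for _ in range(DAYS_PER_YEAR * YEARS)]
    # Non-overlapping years: each annual value is the sum of 252 daily log returns.
    annual = [sum(daily[i:i + DAYS_PER_YEAR])
              for i in range(0, len(daily), DAYS_PER_YEAR)]
    return excess_kurtosis(daily), excess_kurtosis(annual)


if __name__ == "__main__":
    daily_k, annual_k = simulate()
    print(f"daily excess kurtosis:  {daily_k:.2f}")
    print(f"annual excess kurtosis: {annual_k:.2f}")
```

Because the simulated days are independent, this shows only the pure averaging effect; it does not model mean reversion or valuation, and real markets cluster their volatile days together, so the smoothing in actual data may be weaker than in this toy example.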
To be fair, if you look closely, it is true that there's a slightly higher frequency of negative results ranging from -2.5 to -3.5 standard deviations in the annual data than a normal distribution would predict - they should be rarer. Yet on the other hand, there is also an unexpected frequency of positive results in the +2.5 to +3.5 standard deviation range, as well as additional results that extend up past 7 standard deviations! If we were to view all of these unexpected results as "black swans", then the reality is that there are almost as many positive black swans as there are negative ones, when viewed on an annual basis! In other words, the greatest surprise implied by the distribution of actual historical annual returns is an elevated "risk" that the client might turn out to get impossibly rare good returns that would not be predicted in a normal-distribution-based Monte Carlo! Of course, since the returns are positive, they perhaps would be better labeled as "gold" swans than black ones! (Note: Technically, as a series of normally distributed returns compounds, it is expected to produce a distribution with an increasingly long positive tail, because positive compounding stretches results to the right while negative returns are ultimately constrained by the zero bound [you can't have negative wealth]. The resulting distribution is approximately a lognormal distribution, and in fact some Monte Carlo software is built using a lognormal distribution instead of a normal one. Nonetheless, relative to modeling market returns in Monte Carlo using a normal distribution for annual returns, which is still the most common approach, the fact remains that the normal distribution is more likely to underestimate extremely good returns than extremely bad ones, when viewed on an annual basis.)
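The compounding point in the note can be sketched with a short simulation. The 7% mean, 18% standard deviation, and 30-year horizon are assumptions chosen only for illustration, and the function names are hypothetical. Each annual return is drawn from a symmetric normal distribution, yet the compounded wealth ends up with a long right tail: positive skewness, and a mean above the median.

```python
import random
import statistics
from typing import List


def terminal_wealth(rng: random.Random, years: int = 30,
                    mean: float = 0.07, sd: float = 0.18) -> float:
    """Compound $1 through `years` normally distributed annual returns (assumed inputs)."""
    wealth = 1.0
    for _ in range(years):
        # Floor the return at -100%: the normal draw is unbounded, wealth is not.
        wealth *= 1.0 + max(rng.gauss(mean, sd), -1.0)
    return wealth


def skewness(values: List[float]) -> float:
    """Sample skewness; 0 for a symmetric distribution, positive for a long right tail."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values, mu=mean)
    return statistics.fmean(((v - mean) / sd) ** 3 for v in values)


def simulate_terminal_wealth(trials: int = 20000, seed: int = 3) -> List[float]:
    rng = random.Random(seed)
    return [terminal_wealth(rng) for _ in range(trials)]


if __name__ == "__main__":
    results = simulate_terminal_wealth()
    print(f"mean:     {statistics.fmean(results):.2f}")
    print(f"median:   {statistics.median(results):.2f}")
    print(f"skewness: {skewness(results):.2f}")
```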
Of course, this is not to say that the negative returns on the left side of the annual results can't lead to a disaster in the client's financial plan. A negative 2 standard deviation event is still a pretty bad one-year price decline of roughly 35%. But the point is that a 35% bear market decline in a year is not a black swan - it's a rare but entirely probable event already anticipated by a normal distribution. The only black swans - the events that are predicted to never happen with a normal distribution, yet do - are actually the gold swans.
Which means the real challenge in financial planning, where there is no leverage and spending even over the span of a year is fairly modest relative to the total portfolio, is not dealing with the impact of negative "black swans" but being prepared for the rare but entirely probable 2-standard-deviation bear market that is likely to occur at some point even under a normal distribution. Which in turn suggests, as highlighted in last week's blog, that effective retirement advice is less about dealing with the risk of impossibly rare black swan declines and more about how you plan and prepare for any kind of decline, including the rare but probable ones.
So what do you think? Are we overreacting to "black swans" in the financial planning context, where there is no leverage and clients have only modest withdrawals at best? Have we become too focused on the extreme black swans, and ignored the rare but entirely probable "normal" bear market decline already predicted by normal distributions? Would you change your planning discussion with clients if the reality was that the only true black swans that occur annually are actually golden?
Meg Bartelt says
Wow. What a delightfully stats-tastic topic. I found myself first reveling in the discussion of the bell curve distortions, etc., and how clearly yes! we shouldn’t have been talking “black swan” at all, just a wholly statistically predictable “really bad year.”
Then I started thinking about communications with clients. All media outlets were screaming “black swan” for months on end in 2008+, so we had to deal with that particular term and the Armageddon-like panic that accompanied it. But how easy/worthwhile would it be to talk with a client about how it’s not *really* a black swan, not the end of the world, etc.? How does the statistical reality you reveal above translate into communicating with clients? After all, that communication is at the base of making the suggested plans to deal with the bad-but-not-end-of-world drops in the market.
Perhaps I’m just restating your question in a different way, but I was just struck by the undefined gap between your statistical analysis and what clients think, hear, need to know, can be expected to understand viscerally, etc.
Eric Bruskin says
Methodology question: what’s the number of data points summarized in annual returns distribution (your 2nd graph)? Did your years include Jan 1 to Jan 1, Jan 2 to Jan 2, etc, or just calendar years? In the former case, N would be (I presume) the number of daily measurements minus 365; in the latter case, N would be just the number of years you have.
Michael Kitces says
Eric,
The years are overlapping rolling periods – starting Jan 1 to Jan 1, Jan 2 to Jan 2, etc.
In point of fact – although it was too much detail to cover in the original blog post – I believe that’s why you see a disproportionate number of -2.5 standard deviation events – because this kind of rolling period approach essentially “oversamples” four precipitous declines (crashes of 1929, 1973, 2001, and 2008) into several hundred overlapping periods that include the decline.
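To make the oversampling concrete, here is a small sketch using an invented price series rather than the actual Dow data: ten flat years with a single one-day 30% crash, assuming 252 trading days per year. It counts how many one-year windows contain the crash when the windows roll forward one day at a time versus when they do not overlap.

```python
from typing import List

DAYS_PER_YEAR = 252  # assumption: trading days per year


def rolling_annual_returns(prices: List[float], window: int = DAYS_PER_YEAR) -> List[float]:
    """One-year returns starting on every day (overlapping windows)."""
    return [prices[i + window] / prices[i] - 1.0
            for i in range(len(prices) - window)]


def calendar_annual_returns(prices: List[float], window: int = DAYS_PER_YEAR) -> List[float]:
    """One-year returns from back-to-back, non-overlapping windows."""
    return [prices[i + window] / prices[i] - 1.0
            for i in range(0, len(prices) - window, window)]


# Hypothetical series: flat at 100 for ten years, with one 30% drop in year six.
prices = [100.0] * (10 * DAYS_PER_YEAR + 1)
crash_day = 5 * DAYS_PER_YEAR + 100
for day in range(crash_day, len(prices)):
    prices[day] = 70.0

rolling = rolling_annual_returns(prices)
calendar = calendar_annual_returns(prices)

rolling_hits = sum(1 for r in rolling if r < -0.25)
calendar_hits = sum(1 for r in calendar if r < -0.25)

print(f"rolling windows:  {rolling_hits} of {len(rolling)} show the crash")
print(f"calendar windows: {calendar_hits} of {len(calendar)} show the crash")
```

In this toy series the single crash shows up in 252 of the rolling windows but in only 1 of the 10 non-overlapping ones. The rolling results are heavily correlated with one another, so the number of independent observations is much closer to the number of years than to the number of windows.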
I hope that helps to clarify a little?
Respectfully,
– Michael
Bill says
This kind of negates your estimate of standard deviation, doesn’t it?? All that overlapping data is going to play hell with your stats…
Dick Purcell says
Michael –
Great blog! I hope there’s more coming in your probing of Monte Carlo simulation.
I agree with the thrust of your message, and suggest that the thing to worry about is not shape of the annual return-rate distribution but uncertainty about its mean and standard deviation – first and foremost its mean.
Sensitivity testing in multi-year Monte Carlo simulations demonstrates the point.
As you point out, there are two ways in which swans are ironed out over longer terms. Daily swans are largely noise that is ironed out in annual returns, beautifully shown in your graph. And as pointed out in your text, the shape of the annual distribution matters less and less for longer periods (thanks to the Central Limit Theorem). As the multi-year time horizon gets longer, the distribution of the result gets closer and closer to the same lognormal produced with the simple normal-shaped annual-rate distribution.
By contrast, differences in return-rate mean are amplified and compounded over longer horizons. And we have no basis for knowing with precision what return-rate mean to assume.
For the stock market annual return-rate mean, our historical sample size is so inadequate that the “true” historical mean could well be 2% or 3% lower. And that’s for the past – in the future, return-rate mean could be 1% or 2% or 3% lower than whatever the “true” mean has been in our history.
Well – if we test multi-year simulations with return-rate means that differ by 1% or 2% or more, the differences in probabilities for long-term results are very very big. Way way bigger than differences produced by “reasonable” changes in the annual return rate distribution shape.
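A rough sketch of that kind of sensitivity test follows. Every input is an assumption made up for illustration (normally distributed real annual returns with an 18% standard deviation, a 30-year horizon, a fixed withdrawal of 4% of the starting balance taken at the start of each year), and the function name is hypothetical. The only thing varied between runs is the assumed mean return.

```python
import random


def success_probability(mean_return: float, sd: float = 0.18, years: int = 30,
                        withdrawal_rate: float = 0.04, trials: int = 5000,
                        seed: int = 7) -> float:
    """Share of simulated retirements in which the portfolio outlasts the horizon."""
    rng = random.Random(seed)
    successes = 0
    for _ in range(trials):
        balance = 1.0
        for _ in range(years):
            balance -= withdrawal_rate  # fixed real withdrawal at the start of the year
            if balance <= 0:
                break  # portfolio exhausted before the horizon ended
            # Floor the return at -100% so the balance cannot go negative from returns.
            balance *= 1.0 + max(rng.gauss(mean_return, sd), -1.0)
        else:
            successes += 1  # loop finished without running out of money
    return successes / trials


if __name__ == "__main__":
    for mean in (0.07, 0.06, 0.05, 0.04):
        print(f"assumed mean {mean:.0%}: success rate {success_probability(mean):.1%}")
```

Shifting the assumed mean by a percentage point or two should move the success rate noticeably, which is the comparison to set against whatever change comes from swapping the shape of the annual distribution.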
Of course, there’s always threat of a MEGA-swan. Yellowstone could go off, cover most of the USA in feet of volcanic ash . . .
Dick Purcell
Alex says
People who blame Monte Carlo for the assumption that investment returns are normally distributed simply are not educated enough. Monte Carlo can generate any type of probability distribution, and there are multiple techniques for doing this. If people do not like the results of Monte Carlo, in reality this means that they do not like the model of returns that was used as input to Monte Carlo, or the particular random number generator, or simply that the software implementing the model has bugs. Unfortunately, many people working in the financial industry confuse Monte Carlo with probabilistic modeling and with the particular software (which can be very bad) implementing it.
Monte Carlo is a great tool used in many areas. The tool is not responsible if it is used wrongly, or for solving a problem that cannot be solved by this tool. Does Monte Carlo modeling have real problems? Yes, of course, but they are definitely not limitations related to modeling a normal distribution. There are areas in financial planning where Monte Carlo modeling is extremely slow, because its correct usage requires generating hundreds of thousands of scenarios. In particular, Monte Carlo is not good for solving complex or even mid-sized optimization problems, like retirement planning with annually changing asset allocation and multiple asset classes or mutual funds. In these cases, an analytical framework or a combination of an analytical framework and Monte Carlo is much better. Trying to use Monte Carlo for such problems has little chance of success. You can read more about this in the “Monte Carlo trap and alternatives” paper.