Wednesday, March 31, 2010

What Is the "Best" Statistical Measure of Home Prices?

In my last post I discussed various statistical methods used to track housing prices. I suggested that the Case-Shiller Index, which uses the repeat-sales method, is the current gold standard.

Case-Shiller's utility is limited by the relatively few metro areas it covers--twenty to be precise.  I happen to live and work in the Harrisburg, PA metro area. The nearest city included in the Case-Shiller Index is Washington, DC, one hundred miles south and a far different housing market than Harrisburg (trust me).

So...if you live in a smaller metro area you probably depend on your local multi-list service to periodically report sales statistics, usually the mean (average) and median home price.  Personally I like to present this data to home sellers when I talk about the market they are selling into.  As my local multi-list reports both the mean and median, I wondered which stat better represents central tendency for my marketplace. 

To this end I collected a year's worth of home sales data for the West Shore of the Harrisburg metro area (the collection of municipalities west of the Susquehanna River). For 2009 there were 1,962 residential units sold. The histogram below shows the distribution of home sales. It has a Bell Curvish shape skewed to the high end. Thirteen sales over $700,000 largely account for the long "tail" on the distribution. I suspect this is a characteristic of many real estate markets that include very high-end communities.



Superimpose the mean and median on this distribution (see below) and you "see" a $20,000 gap between the two measures of central tendency.  Visually the median appears closer than the mean to what most people would consider the typical house price.  Also, in this case, the median corresponds with the mode, which is the most frequent value in a distribution.

The conclusion of my simple test is that the median seems the more appropriate measure of central tendency or typical-ness in the housing market.  As it is less sensitive than the mean to outlying values, i.e., the relatively few high-priced home sales, it should also be less volatile quarter-to-quarter or year-to-year, and therefore a better statistic with which to trend housing prices.

Sunday, March 28, 2010

Trending House Prices

The topic is trending house prices, and specifically, house prices in the Harrisburg marketplace. For blogging purposes I break the topic into three parts:

• How do we trend house prices?

• What is the “best” statistical measure for house prices in the Harrisburg market?

• How have Harrisburg house prices trended since 1993 (as far back as I have data)?

How do we trend house prices?

We understand financial markets through the statistics we employ to track those markets.

Take the Dow Jones Industrial Average (DJIA), which for many people is synonymous with the U.S. stock market. The DJIA is a weighted and scaled average of the stock prices of thirty large corporations, the so-called Blue Chips. Considering there are over 3,100 U.S. companies listed on the New York Stock Exchange, it is a narrowly defined index. Yet it’s simple and easy to understand. Sophisticated research goes into the selection of the companies in the index (only General Electric remains from the original 12 companies Charles Dow included in the index in 1896). Hardcore investors may rely on more complex stock indices; however, for those of us whose primary exposure to the market is our 401(k), the DJIA does a good job of tracking the performance of the large-cap sector of the stock market.

It would be nice to have something comparable to the DJIA for the housing market. Given the nature of real estate—housing prices highly dependent on local conditions, no two homes identical, etc.—it’s a tall order.

In the mid-1980s two Boston-area economists, Karl Case and Robert Shiller, developed an index to measure the change in housing prices using repeat sales of the same house. I attempted a fuller understanding of Case and Shiller’s statistical methodology, but encountered terms like multivariate regression and heteroscedastic sampling error and figured it wasn’t worth it.
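The core repeat-sales idea can be sketched without the heavy statistics. The Python below is a toy, two-period illustration only, not the actual Case-Shiller methodology (which handles many time periods and weights the sale pairs). It assumes every house sold once in period 0 and once in period 1; the function name and the sale prices are made up for the example.

```python
import math

def two_period_repeat_sales_index(pairs, base=100.0):
    """Toy repeat-sales index for houses sold once in period 0 and once in period 1.

    pairs: list of (first_sale_price, second_sale_price) for the same house.
    Returns the period-1 index level, with period 0 set to `base`.
    """
    # Average the log price ratios, then exponentiate. This is the geometric
    # mean of each house's own appreciation, so differences between houses
    # (size, location, condition) cancel out of the comparison.
    log_ratios = [math.log(second / first) for first, second in pairs]
    return base * math.exp(sum(log_ratios) / len(log_ratios))

# Hypothetical sale pairs, invented for illustration only.
pairs = [(150_000, 165_000), (200_000, 210_000), (90_000, 99_000)]
index = two_period_repeat_sales_index(pairs)
print(f"period-1 index: {index:.1f}")
```

Because each house is compared only with its own earlier sale, the mix of homes that happen to sell in a given period matters less than it does for a plain mean or median of sale prices.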

Today the Case-Shiller Index is the gold standard for tracking housing prices. Apparently there are options and futures contracts based on this index, although how they work is a mystery to me. Unfortunately the Case-Shiller Index only covers twenty large U.S. metro areas, of which Harrisburg isn’t one, so it’s of limited value in my marketplace.

The Case-Shiller Index peaked in the 2nd quarter of 2006 and has declined in every successive quarter.

The Federal Housing Finance Agency (FHFA) developed its own housing price index using the repeat-sales method, the FHFA HPI. It collects repeat-sales data exclusively from transactions that are financed by conventional/conforming mortgages purchased by Fannie Mae and Freddie Mac. The FHFA index covers more metro areas than Case-Shiller, including Harrisburg-Carlisle. It doesn’t include home sales that are financed by FHA, VA, and Dept. of Agriculture (rural housing) mortgages. As these government-backed mortgage programs finance a large percentage of home sales today, it limits this index’s usefulness.

So...what if the Case-Shiller Index doesn’t cover your local housing market? Most likely you must depend on the mean or median price of home sales published by the local multi-list service to track housing prices (in Harrisburg the Central PA Multi-List publishes sales stats quarterly).

The mean and the median attempt to measure the central tendency, or typical value, of a population. Which statistic works better in an application depends on the nature of the population under consideration. In case you don’t remember the definitions…

Arithmetic mean (also known as average)—sum all values in a sample and divide by number of values

Median—the middle value

For a population that resembles the classic Bell Curve, there is little or no difference between the mean and median. In the case of 2009 house prices in the Harrisburg metro area the mean was $180,228 and the median was $161,900. The roughly $18,000 difference between the two measures of central tendency suggests that the distribution of house prices in Harrisburg is skewed towards the upper price range. As it happens, the mean is more sensitive to outlying (rare) values than the median. In other words, a $1 million house sale may significantly move the mean, but it hardly influences the median.
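The effect of a single high-priced sale can be sketched in a few lines of Python. The ten sale prices are made up for illustration (they are not from the Harrisburg data), and the helper name `summarize` is an assumption of this example.

```python
from statistics import mean, median

def summarize(prices):
    """Return (mean, median) of a list of sale prices."""
    return mean(prices), median(prices)

# Hypothetical sale prices, invented for illustration only.
sales = [120_000, 135_000, 150_000, 155_000, 160_000,
         165_000, 170_000, 185_000, 200_000, 260_000]

before_mean, before_median = summarize(sales)

# Add a single $1 million sale and recompute both statistics.
after_mean, after_median = summarize(sales + [1_000_000])

print(f"mean:   {before_mean:,.0f} -> {after_mean:,.0f}")
print(f"median: {before_median:,.0f} -> {after_median:,.0f}")
```

With these made-up numbers the one extra sale lifts the mean from $170,000 to about $245,000, while the median moves only from $162,500 to $165,000.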

Integrated Asset Services (IAS), an REO services company, publishes a house price index, the IAS360™, that trends the median price for detached single-family houses in 360 U.S. counties as well as nationally and regionally. It claims proprietary “next generation” technology, and takes pains to differentiate its methodology from the repeat-sales indices, but doesn’t explain it in great detail (it being, you know, proprietary). Likewise it doesn’t discuss how it selects the 360 counties; presumably population and importance to the REO industry come into play. Its key attribute is speed: it reports monthly with only a 1-month lag.

Tuesday, March 16, 2010

The CPI and U.S. Housing Prices

In the early years of the 21st century, as home prices started taking off, I went to the font of real estate knowledge, i.e., http://www.realtor.org/, to seek enlightenment. What I read, or what I remember reading, is that based on historical behavior home prices over time should not increase much faster than the rate of inflation, or else, and brace yourself here, eventually no one could afford to buy a house. This seemed a keep-it-simple-stupid (KISS) explanation of the way things ought to work.

Several years later home prices continued an upward trajectory. No longer were real estate gurus talking about home prices eventually falling back to earth (OK, there was some talk about housing coming in for a “soft landing”). More typically anyone with a master’s degree in economics and 5 minutes of air time was postulating reasons why the housing boom was likely to continue indefinitely: immigration was fueling demand, people were buying more 2nd homes, Wall Street had devised can’t-fail investment products (can you spell collateralized mortgage obligations?) funneling money into mortgages, etc.

There was undoubtedly an element of truth to all this speculation, but it glossed over the cold hard fact that housing prices were increasing faster than people’s income and housing inflation was disproportionately high relative to all the other things measured by the Consumer Price Index (CPI). If you buy the KISS explanation above, something had to give; eventually it did, and we’re still working through the economic fallout.

With the luxury of hindsight I decided to see for myself how home prices behaved relative to the CPI before, during, and after the housing bubble years. Arbitrarily I started my comparison in 1990, using that year’s median U.S. house price as a basis and applying the annual CPI measure of inflation to obtain a CPI-predicted price from 1991 to the present. The graph shows actual median house price (blue) and CPI-predicted median house price (red). From 1990 to 2000 actual home prices lagged behind the inflation rate. Starting around 2000 house prices caught up to the CPI and then surged ahead, reaching a peak about 2007. Since then, as every real estate agent knows, house prices crashed. Today, for practical purposes, the median house price is back on the CPI-predicted line, suggesting much of the air is out of the housing bubble.
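A minimal sketch of how a CPI-predicted price line can be computed, assuming the CPI is available as an annual index level for each year (the exact calculation behind the graph is not spelled out here). The function name, the base price, and the CPI values are placeholders invented for illustration, not actual data.

```python
def cpi_predicted_prices(base_year, base_price, cpi_by_year):
    """Scale a base-year price by the change in the CPI since that year."""
    base_cpi = cpi_by_year[base_year]
    return {year: base_price * cpi / base_cpi
            for year, cpi in sorted(cpi_by_year.items())}

# Placeholder numbers, invented for illustration; not actual CPI or price data.
cpi = {1990: 100.0, 1991: 104.0, 1992: 107.0}
predicted = cpi_predicted_prices(1990, 100_000, cpi)

for year, price in predicted.items():
    print(year, round(price))
```

Plotting these predicted values next to the actual median price for each year gives the two lines being compared: where the actual price sits above the predicted one, housing has outpaced general inflation since the base year.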

OK, this may not be the most sophisticated analysis of the housing market; however, when we rely on pundits to interpret the world for us, a simple model sometimes serves as a reality check, sort of like the little boy who cried the emperor has no clothes.

Notes: As it turns out the CPI is easily found on the web. Surprisingly, median U.S. house prices are not as readily obtained, at least not if you want 20 years of data. Nevertheless I found several sites that listed historical median house prices. The data was presented as quarterly, not annual, median prices, so I had to do a little massaging. I don’t believe anything I did compromised the validity of the data, at least not for the purposes employed here.