Avoiding Costly Mistakes with Uncertainty Quantification for Algorithmic Home Valuations

The simple tricks for using AVMU, or Automated Valuation Model Uncertainty, to make your home buying decisions more confident and less risky! The post Avoiding Costly Mistakes with Uncertainty Quantification for Algorithmic Home Valuations appeared first on Towards Data Science.

Apr 8, 2025 - 02:03

When you’re about to buy a home, whether you’re an everyday buyer looking for your dream house or a seasoned property investor, there’s a good chance you’ve encountered automated valuation models, or AVMs. These clever tools use massive datasets filled with past property transactions to predict the value of your potential new home. By considering features like location, number of bedrooms, bathrooms, property age, and more, AVMs use AI to learn associations with sales prices. A rapid and low-cost appraisal of any home sounds great on paper, and in many cases it is great. However, with every price prediction comes a level of uncertainty, and failing to consider this uncertainty can be a costly mistake. In this post, I illustrate the application of AI-uncertainty quantification for AVMs through the AVMU methodology.

Price Prediction Uncertainty?

Let’s start off simple. Imagine you’re looking for a two-story, four-bedroom house in a cozy neighborhood in Virginia Beach, VA. You’ve downloaded some local housing data and used it to train your own AVM (you’re tech-savvy like that!).

Case 1: Lucky you, several almost identical homes in the neighborhood have sold for around $500,000 in the past year. Your AVM confidently suggests the home you’re interested in will also likely be worth around the same price. Easy enough, right?

But here’s where it gets trickier:

Case 2: This time, no similar two-story, four-bedroom homes have sold recently. Instead, your dataset shows smaller, one-story homes selling at $400,000, and larger, three-story homes going for $600,000. Your AVM averages things out and again suggests $500,000. That makes sense: your target house is bigger than the cheaper homes and smaller than the pricier ones.

Both scenarios gave you the same $500,000 valuation. However, there’s a catch: The first scenario is backed by solid data (similar homes selling recently), making the price prediction quite reliable. In the second scenario, on the other hand, trusting the price prediction might be a bit riskier. With fewer comparable sales, the AVM had to make “an educated guess”, leading to a less certain price prediction.
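To make the contrast concrete, here is a minimal sketch using made-up comparable sales (the exact figures are assumptions for illustration, loosely based on the two cases above). It is not what AVMU does, but it shows how the same average can hide very different amounts of spread.

```python
from statistics import mean, pstdev

# Hypothetical comparable sales in dollars (illustrative numbers only)
case_1_sales = [495_000, 500_000, 505_000]  # near-identical homes
case_2_sales = [400_000, 600_000]           # smaller and larger homes

for label, sales in (("Case 1", case_1_sales), ("Case 2", case_2_sales)):
    # Same average in both cases, but very different spread around it
    print(f"{label}: mean = {mean(sales):,.0f}, spread = {pstdev(sales):,.0f}")
```

Both cases average out to $500,000, yet the sales behind Case 2 sit much further from that number than the sales behind Case 1.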

The solid AVM in Case 1 is a very helpful decision support tool for purchasing a home, but the shaky AVM in Case 2 can give you a totally wrong idea of the home’s market value. Here’s the big question:

How can you tell whether your AVM prediction is solid or shaky?

AVMU—An Uncertainty Quantification Technique for AVMs

This is exactly why we need AVMU, or Automated Valuation Model Uncertainty. AVMU is a recent methodological framework that helps us quantify exactly how reliable (or uncertain) these AVM predictions are. Think of it as a confidence meter for your house price prediction, helping you make smarter decisions instead of blindly trusting an algorithm.

Let’s return to our Virginia Beach example. You’ve browsed listings extensively and narrowed your choices down to two fantastic homes: let’s call them Home A and Home B.

Image by Author, made partly with DALL-E.

Of course, the first thing you want to know is their market values. Knowing the market value ensures you don’t overpay, potentially saving you from future financial headaches and having to resell the home at a loss. Unfortunately, you don’t have much knowledge about house prices in Virginia Beach, as you’re originally from [insert name of the place you grew up]. Fortunately, you recall the data science skills you picked up in grad school and confidently decide to build your own AVM to get a grasp of the market values of your two candidate homes.

To ensure your AVM predictions are as accurate as possible, you train the model using Mean Squared Error (MSE) as your loss function:

\[\text{MSE} = \frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2\]

Here, $ n $ is the number of homes in your training dataset, $ \hat{y}_i $ represents the AVM’s price prediction for home $ i $, and $ y_i $ is the actual price at which home $ i $ was sold.
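If you prefer code to formulas, the MSE can be sketched in a few lines of plain Python. The function name and the three example homes are assumptions made up for this illustration.

```python
def mean_squared_error(actual, predicted):
    """Average squared difference between sale prices and AVM predictions."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((y - y_hat) ** 2 for y, y_hat in zip(actual, predicted)) / len(actual)

# Hypothetical sale prices and AVM predictions, in thousands of dollars
sale_prices = [480.0, 510.0, 620.0]
avm_predictions = [500.0, 500.0, 600.0]

mse = mean_squared_error(sale_prices, avm_predictions)
print(mse)  # (20**2 + 10**2 + 20**2) / 3 = 300.0
```

Because the errors are squared, a single large miss weighs much more heavily than several small ones.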

After training the model, you eagerly apply your AVM to Homes A and B. To your surprise (or perhaps excitement?), both homes are valued at exactly $500,000 by the algorithm. Very well, but just as you’re about to place an offer on Home B, a thought strikes: these predictions aren’t absolute certainties. They’re “point predictions”, essentially the AVM’s best guess at the most likely market value. In fact, the true market value is probably somewhat higher or lower, and it’s rather unlikely that the AVM prediction nailed the market value down to the exact dollar.

So, how do we measure this uncertainty? This is where the AVMU methodology comes into play, with a straightforward but powerful approach:

First, you use cross-validation (e.g., 5-fold CV) to generate out-of-fold price predictions, $ \hat{y}_i $, for all the $ n $ homes in your dataset.
Next, for each home, you calculate how far off the prediction was from the actual sales price. This difference is called the absolute deviation, $ |\hat{y}_i - y_i| $, between the price prediction, $ \hat{y}_i $, and the actual sales price, $ y_i $.
Then, instead of predicting sales prices, you train a separate “uncertainty model”, $ F(\hat{y}_i, x_i) $, using these absolute deviations, $ |\hat{y}_i - y_i| $, as the target. This special model learns patterns indicating when the AVM predictions are typically accurate or uncertain.
Finally, you apply this uncertainty model to estimate how uncertain the price predictions are for Homes A and B (i.e., your test set), by predicting their absolute price deviations. You now have simple uncertainty estimates for both of the homes.
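The four steps above can be sketched in plain Python. Everything here is an assumption made for illustration: the synthetic data, the "oddness" feature, the tiny k-nearest-neighbors regressor standing in for both the AVM and the uncertainty model, and the choices of k. In practice you would more likely use a library such as scikit-learn with whatever regression algorithm you prefer.

```python
import random

def knn_predict(train_x, train_y, query, k):
    """Mean target of the k training rows nearest to `query` (Euclidean distance)."""
    by_distance = sorted(
        range(len(train_x)),
        key=lambda j: sum((a - b) ** 2 for a, b in zip(train_x[j], query)),
    )
    nearest = by_distance[:k]
    return sum(train_y[j] for j in nearest) / len(nearest)

def out_of_fold_predictions(x, y, k, n_folds=5):
    """Predict every home with a model that never saw that home."""
    preds = [0.0] * len(x)
    for fold in range(n_folds):
        train = [i for i in range(len(x)) if i % n_folds != fold]
        fold_x = [x[i] for i in train]
        fold_y = [y[i] for i in train]
        for i in range(fold, len(x), n_folds):  # homes held out in this fold
            preds[i] = knn_predict(fold_x, fold_y, x[i], k)
    return preds

# Synthetic data (an assumption): prices are in units of $100,000, and homes
# with a higher "oddness" score sell at noisier prices.
rng = random.Random(0)
x, y = [], []
for _ in range(500):
    size, oddness = rng.uniform(0.0, 3.0), rng.uniform(0.0, 1.0)
    x.append((size, oddness))
    y.append(3.5 + size + rng.gauss(0.0, 0.05 + 0.8 * oddness))

# Step 1: out-of-fold price predictions for all n homes
y_hat = out_of_fold_predictions(x, y, k=5)

# Step 2: absolute deviations between predictions and sale prices
abs_dev = [abs(p - actual) for p, actual in zip(y_hat, y)]

# Step 3: inputs of the uncertainty model F(y_hat, x); the targets are abs_dev
unc_x = [(p, *features) for p, features in zip(y_hat, x)]

# Step 4: price prediction and AVMU estimate for the two candidate homes
homes = {"Home A": (1.5, 0.1), "Home B": (1.5, 0.9)}
avmu = {}
for name, features in homes.items():
    price = knn_predict(x, y, features, k=5)  # AVM fitted on all homes
    avmu[name] = knn_predict(unc_x, abs_dev, (price, *features), k=25)
    print(f"{name}: price = {price:.2f}, AVMU = {avmu[name]:.2f}")
```

The out-of-fold predictions matter: if the absolute deviations came from predictions on homes the AVM had already seen during training, they would look smaller than they really are, and the uncertainty model would learn to be overconfident.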

Now, I know exactly what some of you might be thinking about the third step:

“Wait a second, you can’t just put a regression on top of another regression to explain why the first one is off!”

And you’d be absolutely right. Well, sort of. If there were clear, predictable data patterns showing that certain homes were consistently overpriced or underpriced by your AVM, that would mean your AVM wasn’t very good in the first place. Ideally, a good AVM should capture all meaningful patterns in the data. But here’s the clever twist: instead of predicting if a home is specifically overpriced or underpriced (what we call the signed deviation), we focus on absolute deviations. By doing this, we sidestep the issue of explaining if a home is valued too high or too low. Instead, we let the uncertainty model focus on identifying which types of homes the AVM tends to predict accurately and which ones it struggles with, no matter the direction of the error.
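A small made-up example shows why the absolute deviations still carry a signal when the signed ones don't. The two segments and their errors below are hypothetical numbers chosen for illustration.

```python
from statistics import mean

# Hypothetical out-of-fold errors (prediction minus sale price), in dollars
signed_errors = {
    "standard homes": [5_000, -4_000, 3_000, -4_000],
    "unusual homes": [60_000, -55_000, 45_000, -50_000],
}

for segment, errors in signed_errors.items():
    mean_signed = mean(errors)                    # direction of the error
    mean_absolute = mean(abs(e) for e in errors)  # size of the error
    print(f"{segment}: mean signed = {mean_signed:,.0f}, "
          f"mean absolute = {mean_absolute:,.0f}")
```

In both segments the signed errors cancel out, so there is no bias left for a second model to exploit. The absolute errors, however, differ by more than a factor of ten, and that difference is exactly what the uncertainty model is meant to pick up.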

From a homebuyer’s perspective, you’re naturally more worried about overpaying. Imagine buying a home for $500,000 only to discover it’s actually worth just $400,000! But in practice, underestimating the value of a home is also more problematic than you’d think. Make an offer that’s too low, and you might just lose your dream home to another buyer. That’s why, as a savvy buyer equipped with AVM predictions, your goal isn’t just to chase the highest or lowest price prediction. Instead, your priority should be robust, reliable valuations that closely match the true market value. And thanks to the AVMU uncertainty estimates, you can now more confidently pinpoint exactly which predictions to trust.

Mathematically, the process described above can be written like this:

\[|\hat{y}_i - y_i| = F(\hat{y}_i, x_i) + \varepsilon_i \quad \text{for } 1 \leq i \leq n\]

and:

\[\text{AVMU}_i = F(\hat{y}_i, x_i)\]

The uncertainty model, $ F(\hat{y}_i, x_i) $, can be based on any regression algorithm (even the same one as your AVM). The difference is that, for your uncertainty model, you’re not necessarily interested in achieving perfect predictions for the absolute deviations. Instead, you’re interested in ranking the homes based on prediction uncertainty, and thereby learning which of Home A’s and Home B’s price predictions you can trust the most. The MSE loss function used for the AVM (see first equation) might therefore not be the ideal choice.

Rather than using MSE, you therefore fit your uncertainty model, $ F(\hat{y}_i, x_i) $, to optimize an objective better suited to ranking. One such objective is to maximize the rank correlation (i.e., Spearman’s $ \rho $) between the actual absolute deviations and the predicted uncertainties, given by:

\[\rho = 1 - \frac{6 \sum_{i=1}^{n} D_i^2}{n(n^2 - 1)}\]

Here, a higher $ \rho $ means your model ranks homes better regarding prediction uncertainty. $ D_i $ represents the difference in ranks between the actual absolute deviation, $ |\hat{y}_i - y_i| $, and the predicted uncertainty, $ \text{AVMU}_i = F(\hat{y}_i, x_i) $, for home $ i $. Note that this simplified formula assumes no tied ranks.
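As a quick sanity check, the rank correlation formula can be computed by hand in plain Python. The helper names and the five homes below are assumptions made up for this sketch, and it only works when there are no tied values.

```python
def ranks(values):
    """Rank values from 1 (smallest) to n; assumes there are no ties."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        result[i] = rank
    return result

def spearman_rho(actual, predicted):
    """Spearman's rho from the squared rank differences D_i."""
    n = len(actual)
    d_squared = sum(
        (ra - rp) ** 2 for ra, rp in zip(ranks(actual), ranks(predicted))
    )
    return 1 - 6 * d_squared / (n * (n ** 2 - 1))

# Hypothetical absolute deviations and AVMU estimates for five homes (dollars)
abs_deviations = [12_000, 45_000, 8_000, 30_000, 60_000]
avmu_estimates = [15_000, 35_000, 10_000, 40_000, 55_000]

rho = spearman_rho(abs_deviations, avmu_estimates)
print(round(rho, 3))
```

Only two of the five homes swap places in the ranking, so the squared rank differences sum to 2 and $ \rho $ works out to 0.9. Notice that the AVMU estimates don't need to match the absolute deviations in dollar terms to score well; they only need to put the homes in a similar order.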

So now you have, for both candidate homes, an AVM price prediction and a corresponding AVMU uncertainty estimate. By combining these two measures, you quickly notice something interesting: even if multiple homes share the same “most likely market value”, the reliability of those predictions can vary greatly. In your case, you see that Home B comes with a significantly higher AVMU uncertainty estimate, signaling that its actual market value could stray far from the $500,000 valuation.

To protect yourself from unnecessary risk, you wisely opt for purchasing Home A, whose AVM valuation of $500,000 is backed by stronger certainty. With confidence restored thanks to the AVMU, you happily finalize your purchase, knowing you’ve made a smart, data-informed choice, and celebrate your new home with a relaxing drink in your new front yard.

Ethics and Other Applications of AVMU

This simple introduction to AVM price uncertainty and how AVMU can guide you when buying a home is just one of its many potential applications. Homes aren’t the only assets that could benefit from quick, low-cost valuation tools. While AVMs are commonly associated with housing due to plentiful data and easily identifiable characteristics, these models, and their uncertainty quantification via AVMU, can apply to virtually anything with a market price. Think about used cars, collectibles, or even pro soccer players. As long as there’s uncertainty in predicting their prices, AVMU can be used to understand it.

Sticking with housing, purchasing decisions aren’t the only area where AVMU could be used. Mortgage lenders frequently use AVMs to estimate the collateral value of properties, yet often overlook how uneven the accuracy of these price predictions can be. Similarly, tax authorities can use AVMs to determine your property taxes but may accidentally set unfair valuations due to unacknowledged uncertainty. Recognizing uncertainty through AVMU can help make these valuations fairer and more accurate across the board.

However, despite its versatility, it’s essential to remember that AVMU isn’t perfect either. It’s still a statistical model relying on data quality and quantity. No model can completely eliminate uncertainty, especially the random aspects inherent in most markets, sometimes referred to as aleatoric or irreducible uncertainty. Imagine a newlywed couple falling head-over-heels for a particular kitchen, prompting them to bid way above the typical market value. Or perhaps bad weather negatively influencing someone’s perception of a house during a viewing. Such unpredictable scenarios will always exist, and AVMU can’t account for every outlier.

Remember, AVMU gives you uncertainty estimates, not fixed truths. A home with a higher AVMU uncertainty is more likely to experience price deviations, but this is not guaranteed. And if you find yourself thinking, “should I make a third model to predict the uncertainty of my uncertainty model?”, it’s probably time to accept that some uncertainty is simply unavoidable. So, armed with your AVMU-informed insights, relax, embrace the uncertainty, and enjoy your new home!

References

A. J. Pollestad, A. B. Næss and A. Oust, Towards a Better Uncertainty Quantification in Automated Valuation Models (2024), The Journal of Real Estate Finance and Economics.
A. J. Pollestad and A. Oust, Harnessing uncertainty: a new approach to real estate investment decision support (2025), Quantitative Finance.
