The Value of Subsurface Modeling

Physics-based models are rational “logic machines” that elucidate the mechanics of a problem, assist in identifying dominant parameters/processes, and provide forecasted results.

In 2008, TNO launched a modeling competition, the Brugge benchmark study. TNO used a model to create a data set comprising the first ten years of production history for 30 wells, along with the required geologic data (with uncertainty). TNO then invited nine modeling groups to construct, tune, and optimize simulation models against that data set. In phase one, competitors were tasked with optimizing NPV over the next 20 years of the field’s life. Each group’s optimized development plan was then tested in the original, ground-truth model. In phase two, competitors were able to re-optimize their models after each year of production over that 20-year forecast (i.e., competitors would create an optimized plan looking 20 years into the future, then update again with 19 years to go, again with 18 years to go, etc.) (Denney, 2009; Peters et al., 2009).

Two findings from the Brugge study are especially impressive. In phase one, the top competitor (working with imperfect information) created a 20-year field development plan that achieved only 3% less NPV than the organizers achieved with perfect information, demonstrating that even in the face of uncertainty, model-based optimization still provides meaningful performance uplift. In phase two, continual re-optimization (year after year) continued to yield incremental gains, demonstrating that performance improves with the frequency of model use.

Physics-based models provide a self-consistent description of dynamic processes: identifying inconsistent data, explaining complex phenomena, and optimizing decision making.

So why is the value of subsurface modeling challenged in the industry? I believe it’s the result of poor application of subsurface models, which gives rise to the ubiquitous phrase “garbage in, garbage out.” The reasoning behind “garbage in, garbage out” is that if the modeling inputs are uncertain, the modeling results are uncertain, and therefore models can’t add value. In oil and gas, our reservoir data are inherently uncertain because our rocks are two miles underground, so modeling skeptics discount or dismiss the value of modeling on the grounds that its results are necessarily uncertain.

However, we counter with another popular aphorism: “All models are wrong, but some are useful.” George Box’s famous words are widely quoted, but their context is less well known. Box goes on to use the ideal gas law (PV = RT) as an example of a model that is imprecise for a real gas, yet he extols the value and insight it provides. Because the relation is physics-based, it remains directionally correct even when slightly imprecise. Of course, for applications that need precision, or where the gas behavior deviates significantly from ideal, we can use a more detailed calculation that includes the Z-factor. Starfield and Cundall (1988) expand upon this modeling philosophy in their work, “Towards a Methodology for Rock Mechanics Modelling.”
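
As a rough illustration of the ideal gas law point, the following Python sketch compares molar volume from PV = RT with the Z-factor form PV = ZRT. The pressure, temperature, and Z value are assumed purely for illustration; they are not taken from Box or from any fluid study.

```python
# Toy comparison of the ideal gas law (PV = RT) with the Z-factor form (PV = ZRT).
# All input values below are illustrative assumptions.

R = 8.314462618  # universal gas constant, J/(mol*K)


def molar_volume(pressure_pa, temperature_k, z_factor=1.0):
    """Molar volume in m^3/mol; z_factor=1.0 recovers the ideal gas law."""
    return z_factor * R * temperature_k / pressure_pa


pressure_pa = 30.0e6    # assumed pressure
temperature_k = 380.0   # assumed temperature
assumed_z = 0.9         # assumed Z-factor; real values depend on P, T, composition

v_ideal = molar_volume(pressure_pa, temperature_k)
v_real = molar_volume(pressure_pa, temperature_k, assumed_z)

# Fractional overestimate of molar volume made by the ideal gas law.
relative_error = (v_ideal - v_real) / v_real

print(f"ideal: {v_ideal:.3e} m^3/mol, Z-corrected: {v_real:.3e} m^3/mol")
print(f"ideal gas law overestimates volume by {relative_error:.1%}")
```

In this sketch the ideal gas law overstates molar volume by (1 - Z)/Z, about 11% for the assumed Z of 0.9, while still capturing the trend that volume falls as pressure rises. In practice Z itself varies with pressure, temperature, and composition.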

Starfield and Cundall introduce Holling’s classification of modeling problems, as in Figure 1. Holling’s diagram separates modeling problems into four quadrants:

  1. Good data and lacking understanding
  2. Lacking data with good understanding
  3. Good data and good understanding
  4. Lacking data and lacking understanding

Figure 1: Holling's classification of modeling problems.

The appropriate modeling approach depends on the quadrant in which the modeling problem resides. Quadrant 1 is the quintessential application of data-driven modeling, such as machine learning and data analytics. Many surface modeling problems fall into this category, such as optimizing weight on bit or predicting component failure due to vibrations. In the subsurface, we occasionally receive data sets in Category 3 (think a highly diagnosed science pad), but most subsurface petroleum engineering problems fall into Category 2. Recognizing which quadrant your subsurface problem falls into, and which questions your model is trying to answer, is critical for the correct application of modeling workflows. Applying the correct modeling framework to Category 2 and 4 problems counters the “garbage in, garbage out” objection posed by skeptics.

In Category 2 and 4 problems, data are uncertain (“garbage in”) and the problems themselves are often ill-posed (the number of uncertainties exceeds the number of constraints). In these categories, data-driven models are inappropriate and can be prone to produce “garbage out.” Physics-based models, on the other hand, are slaves to the confines of physics, and all outputs must obey these confines (note: we need to make sure we are using the right physics). Applying physics-based models to data-limited problems allows for identification of data gaps, falsification of hypotheses, and forecasts with reasonable error bars. Quoting Starfield and Cundall, “a system of interacting parts often behaves in ways that are surprising to those who specified the rules of interaction” (Starfield and Cundall, 1988).
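
To make the notion of an ill-posed problem concrete, here is a minimal Python sketch with one constraint (an observed rate) and two uncertain inputs (permeability and thickness). The model form, the lumped constant, and every number are hypothetical and chosen only so the arithmetic is easy to follow.

```python
# Toy example of an ill-posed calibration: one constraint, two uncertain inputs.
# Hypothetical model: rate is proportional to permeability times thickness (k*h).


def model_rate(perm_md, thickness_ft, lumped_constant=1.0):
    """Hypothetical rate model; lumped_constant absorbs fluid and geometry terms."""
    return lumped_constant * perm_md * thickness_ft


def pore_volume_acre_ft(thickness_ft, area_acres=640.0, porosity=0.08):
    """Bulk pore volume for an assumed area and porosity."""
    return area_acres * thickness_ft * porosity


observed_rate = 500.0  # the only calibration constraint (assumed units)

# Candidate (permeability, thickness) pairs; all values are made up.
candidates = [(10.0, 50.0), (5.0, 100.0), (2.5, 200.0)]

# Keep every candidate that reproduces the observed rate.
matches = [
    (k, h) for k, h in candidates
    if abs(model_rate(k, h) - observed_rate) < 1e-9
]

# The matched cases agree on rate but disagree on pore volume.
pore_volumes = [pore_volume_acre_ft(h) for _, h in matches]

for (k, h), pv in zip(matches, pore_volumes):
    print(f"k={k} md, h={h} ft matches the rate; pore volume = {pv:.0f} acre-ft")
```

All three candidates honor the rate equally well, yet their pore volumes differ by a factor of four. The rate data alone cannot separate them, and the physics makes it clear which additional measurement (here, thickness) would close the gap.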

This is exactly what we’ve done with ResFrac. Integrating the physics of fracturing, production, and the wellbore, which often interact in complex and nuanced ways, provides insights not possible in siloed modeling regimes. A transparent model (one in which the causal relationships are clear and exposed to the user, as they are in ResFrac) allows the user to explain model predictions, thereby identifying combinations of complex processes that yield unexpected results.

“Modelling in a cautious and considered way leads to new knowledge or, at the least, fresh understanding” (Starfield and Cundall, 1988). Evaluations of the benefit of modeling often focus on predictive accuracy. This is appropriate for Category 3 problems. However, for Category 2 and 4 problems, in addition to evaluating forecast predictivity (and appreciating that the envelope of uncertainty is greater for Category 2 and 4 models), modelers should evaluate the model on its ability to inform the input data itself: which data are incongruent with each other, which assumptions are necessary to explain an observation, and which data would add value if acquired. So how do we apply this to ResFrac modeling?

  1. ResFrac is an interdisciplinary model, requiring input and reconciliation of geologic, petrophysical, geomechanical, and production data.
  2. Embrace differences/uncertainties in input data. Form hypotheses for the inferences supported by each data set. Test these in the model. You will quickly find that some data and hypotheses are not supported.
  3. Identify gaps in the data (sources of major uncertainty). If you were to collect more data, where would it be most valuable to do so?
  4. Remain focused on the questions being asked of the model.
    1. Don’t overfit beyond the constraints available. Category 3 models allow for higher resolution of parameter fitting than Category 2 models.
    2. The greater the uncertainty, the greater the focus should be on global parameters to match the calibration data.
  5. And finally, optimize in the presence of the uncertainties identified in #1-4. The presence of uncertainty does not nullify the value of optimization. Our modeling projects consistently identify 10-25% realizable improvements in NPV, and those improvements are borne out in the field.
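
The final step above, optimizing in the presence of uncertainty, can be sketched in a few lines of Python. This is a toy under stated assumptions, not the ResFrac workflow: `toy_npv`, the drainage-width realizations, and all economic constants are hypothetical stand-ins for calibrated simulations.

```python
# Toy optimization under uncertainty: choose the well spacing with the best
# average NPV across an ensemble of equally likely subsurface realizations.
# Every function and number here is a hypothetical placeholder.

SECTION_WIDTH_FT = 5280.0
VALUE_PER_DRAINED_FT = 0.02  # assumed $MM of discounted revenue per drained foot
WELL_COST = 6.0              # assumed $MM per well


def toy_npv(spacing_ft, drainage_width_ft):
    """NPV ($MM) for one spacing under one realization of drainage width."""
    n_wells = SECTION_WIDTH_FT / spacing_ft
    # A well cannot drain more than its spacing or its physical drainage width.
    drained_per_well = min(spacing_ft, drainage_width_ft)
    revenue = n_wells * VALUE_PER_DRAINED_FT * drained_per_well
    return revenue - n_wells * WELL_COST


candidate_spacings_ft = [330.0, 440.0, 660.0, 880.0]
drainage_realizations_ft = [400.0, 600.0, 800.0]  # the uncertain input


def mean_npv(spacing_ft):
    """Average NPV over all realizations (equal weights assumed)."""
    values = [toy_npv(spacing_ft, d) for d in drainage_realizations_ft]
    return sum(values) / len(values)


# Decision that maximizes expected NPV across the whole ensemble.
best_spacing = max(candidate_spacings_ft, key=mean_npv)

# For contrast: the decision if only the pessimistic realization were considered.
best_if_pessimistic = max(
    candidate_spacings_ft,
    key=lambda s: toy_npv(s, drainage_realizations_ft[0]),
)

print(f"best spacing across ensemble: {best_spacing:.0f} ft")
print(f"best spacing for pessimistic case only: {best_if_pessimistic:.0f} ft")
```

With these made-up numbers, 660 ft maximizes the average NPV, whereas planning only for the pessimistic case would select 440 ft. Uncertainty changes which decision is best, but it does not prevent a best decision from being identified.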

Further, some problems benefit from applying complementary Category 1 and Category 4 modeling mindsets. Addressing the same objective from two angles can greatly strengthen the insights obtained. For example, take the issue of well spacing in unconventionals. Tens of thousands of unconventional wells are drilled each year, generating a huge amount of data. Applying data-driven Category 1 workflows to these data can quickly synthesize them and identify the primary trends. Borrowing from the Data-Information-Knowledge-Wisdom hierarchy, this process high-grades Data into Knowledge, characterizing the patterns and trends in the data. Thinking critically about the resulting Knowledge (does optimal spacing vary by basin? by hydrocarbon type? by generation?) then allows us to apply a Category 4 modeling workflow to explain the WHY behind these observations (Wisdom). Understanding the WHY allows us to make changes. Maybe 500-foot well spacing was too close historically, but if we changed X or Y, 500 feet would become optimal.

A Category 1 data-driven mindset helps us formulate hypotheses, and a Category 4 physics-based approach lets us test and validate or refute them. Together, they allow us to construct a robust, self-consistent explanation of the processes at work and to prescriptively optimize future completions. In this manner, subsurface models can add tremendous value to organizations that adopt and practice deliberate and directed modeling.


Thanks to Mark McClure and Egor Dontsov for their contributions!



Box, G. E. P. (1979). Robustness in the Strategy of Scientific Model Building. In Launer, R. L., & Wilkinson, G. N. (Eds.), Robustness in Statistics, Academic Press, pp. 201–236. doi:10.1016/B978-0-12-438150-6.50018-2. ISBN 9781483263366.

Denney, D. (2009, July 1). Results of the Brugge Benchmark Study for Flooding Optimization and History Matching. Society of Petroleum Engineers. doi:10.2118/0709-0048-JPT

Holling, C. S. (Ed.) (1978). Adaptive Environmental Assessment and Management. Wiley, Chichester.

Peters, L., Arts, R., Brouwer, G., & Geel, C. (2009, January 1). Results of the Brugge Benchmark Study for Flooding Optimisation and History Matching. Society of Petroleum Engineers. doi:10.2118/119094-MS

Starfield, A. M., & Cundall, P. A. (1988). Towards a Methodology for Rock Mechanics Modelling. International Journal of Rock Mechanics and Mining Sciences & Geomechanics Abstracts, 25, 99-106.
