
EPCs will only become credible if model outputs are compared with real-world performance in a transparent and repeatable way.
Energy Performance Certificates should reflect how homes really perform if they are to be reliable tools for investment and improvement decisions. At scale, that goal is difficult to meet: EPCs rely on survey data and modelling assumptions, and real homes do not always behave as expected. The result is a performance gap, where EPCs do not reflect reality, which in turn undermines their credibility and usefulness.
H3 is an open-source project from Vulcan and Knauf Energy Solutions designed to help close the performance gap. H3 (short for HEM-HTC-Harness) wraps the Home Energy Model (HEM) in a workflow for comparing HEM predictions with real-world heat-loss measurements.
In simple terms, H3 lets users compare a HEM-derived HTC (Heat Transfer Coefficient) with a measured HTC, calibrate the model to reduce the performance gap, and review the result in an interactive report.
We are sharing H3 now, and have developed it under the open-source MIT License from day one, to invite industry and policy-market collaboration on how measured-performance workflows around HEM should work in practice. We look forward to adding codebase collaborators, and plan to make the full codebase publicly available after a short pre-release period.
Measurements show what a home is really doing, while models help explain why and predict what happens next. H3 is intended to connect those two perspectives in a transparent and repeatable way.
Why measurement matters
The methodology behind Energy Performance Certificates (EPCs) is changing from SAP (the 'Standard Assessment Procedure') to HEM (the 'Home Energy Model'). Compared with SAP's approximate monthly calculation, HEM's sub-hourly simulation provides a much richer representation of building performance, including estimates of internal temperatures, peak space-heating and cooling demand, heating-system efficiency, and unmet demand.
But even with HEM, survey-based assessments still rely on assumptions about real buildings. That is practical and scalable, but it can hide poor-performing homes that appear "typical" on paper.
That matters because better retrofit decisions, stronger quality assurance, and any future move toward measured EPCs all depend on being able to check model outputs against real performance. Measured EPCs can help asset owners better understand their building performance, and create conditions for better asset management.
This concern is also reflected in EPC reform consultations, with growing emphasis on fabric-focused metrics and increasing attention to heat-loss indicators such as HTC ('Heat Transfer Coefficient'). Fabric-focused measures matter because they say more about the thermal quality of the building itself, and less about short-term occupant behaviour, controls, or fuel choice.
HTC is the rate of heat loss per degree of temperature difference (W/K), in effect a single number that summarises how thermally leaky a home is. That makes it a useful basis for whole-home heat-loss comparison.
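As a minimal illustration of that definition, the sketch below divides a steady heat-loss rate by the indoor-outdoor temperature difference. The function name and the example figures are hypothetical and chosen only to make the arithmetic easy to follow; they are not taken from H3 or HEM.

```python
def htc_w_per_k(heat_loss_w: float, t_internal_c: float, t_external_c: float) -> float:
    """Steady-state HTC: heat-loss rate (W) per kelvin of indoor-outdoor difference."""
    delta_t = t_internal_c - t_external_c
    if delta_t == 0:
        # The ratio is undefined when there is no temperature difference.
        raise ValueError("HTC is undefined when internal and external temperatures are equal")
    return heat_loss_w / delta_t


# Illustrative numbers only: 3000 W lost with 20 C inside and 5 C outside.
example_htc = htc_w_per_k(heat_loss_w=3000.0, t_internal_c=20.0, t_external_c=5.0)
print(example_htc)  # 200.0 W/K
```

A lower value means a less leaky home: at the same 15 K difference, a dwelling with an HTC of 100 W/K would lose half as much heat.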
A growing range of methods can estimate whole-home HTC, from long-duration co-heating tests in vacant dwellings to scalable in-use smart-meter approaches such as SMETER. More rapid or in-use methods typically come at the cost of greater measurement uncertainty. Measurements can show what a home is really doing, but they are not self-explanatory: results still depend on method, timing, weather, and how the data is analysed.
SAP calculates an 'Inputs-Based' HTC from fabric and ventilation assumptions. HEM reports a static HTC in the results_static.csv output file, but that value is not directly interchangeable with SAP. Neither metric gives you the same thing as a dynamic output-based HTC under changing weather, solar gains, ventilation, and thermal mass.
H3 is designed to derive that output-based HTC from HEM in a way that is more comparable with in-use measurement. In practice, it wraps HEM in a controlled execution workflow so measured performance can challenge the model, while the calibrated model still helps explain and predict building behaviour.
How H3 calculates HTC
HEM exposes timestep heat-balance terms, but a naive per-timestep "heat loss divided by delta-T" is unstable when:
- the difference between internal and external temperatures approaches zero;
- dynamic effects such as solar gains, thermal mass, and ventilation behaviour shift the gradient in a seasonal way; and
- HEM models window opening to manage overheating risk in warmer periods.
H3 therefore uses a perturbation-regression approach. It reruns the same model with small external temperature offsets (default +/-1 K, plus a base run) and derives HTC from the gradient of the heat-loss response.
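The idea can be sketched in a few lines of Python. This is an illustration under stated assumptions, not H3's implementation: `run_model` is a hypothetical callable that stands in for a full HEM run and returns mean heat loss in watts for a given external temperature offset, and the sign convention (HTC as the negative of the fitted slope) is our assumption about how the gradient is read.

```python
from typing import Callable, Sequence


def htc_from_perturbations(
    run_model: Callable[[float], float],
    offsets_k: Sequence[float] = (-1.0, 0.0, 1.0),
) -> float:
    """Least-squares slope of mean heat loss (W) against external temperature offset (K).

    Heat loss falls as the outside gets warmer, so the HTC is the negative slope.
    """
    losses = [run_model(offset) for offset in offsets_k]
    n = len(offsets_k)
    mean_x = sum(offsets_k) / n
    mean_y = sum(losses) / n
    sxx = sum((x - mean_x) ** 2 for x in offsets_k)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(offsets_k, losses))
    return -(sxy / sxx)


def toy_model(offset_k: float) -> float:
    """Hypothetical stand-in for a HEM run: a 200 W/K dwelling at 20 C inside,
    5 C outside, with a constant 150 W gain that offsets part of the loss."""
    return 200.0 * (20.0 - (5.0 + offset_k)) - 150.0


htc = htc_from_perturbations(toy_model)
print(round(htc, 1))  # 200.0
```

In this toy case the constant gain term biases the naive ratio (2850 W / 15 K = 190 W/K) but drops out of the gradient, which recovers the underlying 200 W/K.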
To keep comparisons fair, H3 uses controlled heating assumptions by default and makes key analysis choices explicit, including masking and time-window controls when tighter comparability is needed.
The result is a stable output-based HTC metric intended to align more closely with measured in-use HTC.
Today, H3 runs against a dedicated environment with a minimally patched Rust HEM engine (a faster version of HEM used for Energy Calculations as a Service, MHCLG's building regulations API). The intent is not to maintain a permanent fork, but to upstream the small changes H3 needs over time.
Calibrating models with measurement
HEM offers useful advantages for calibration. Internal temperatures and other dynamic outputs give more evidence than trying to match a single number, reducing the risk of getting a good fit for the wrong reasons. And because HEM represents solar gains and thermal-mass behaviour directly, it can test whether measured HTC estimates remain consistent under stated boundary conditions.
In the initial release, H3 uses a single-scalar calibration against a single measured HTC figure. A two-scalar approach for fabric and ventilation was considered, but not pursued because one measurement does not provide enough information to identify those effects robustly and independently.
Instead, H3 solves for one shared heat-loss scalar, applied across both fabric and ventilation control points:
- Fabric control point: scaling relevant fabric and thermal-bridging inputs.
- Ventilation control point: scaling the aggregated ACH (Air Changes per Hour) term used in ventilation heat-loss calculations.
Calibration targets the same output-based HTC metric described above. The workflow runs in sequence: a baseline run, scalar calculation, calibration, and then optional iterative refinement, with the comparison window kept fixed throughout for consistency.
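One plausible reading of that sequence is sketched below. It is an assumption-laden illustration rather than H3's actual algorithm: `modelled_htc` is a hypothetical callable that reruns the model with a given heat-loss scalar and returns the output-based HTC, the first scalar estimate is simply the ratio of measured to baseline HTC, and refinement is a simple ratio update repeated until the modelled value is within a tolerance.

```python
from typing import Callable


def calibrate_scalar(
    modelled_htc: Callable[[float], float],
    measured_htc: float,
    tol_w_per_k: float = 0.5,
    max_iter: int = 10,
) -> float:
    """Find one shared heat-loss scalar that brings modelled HTC close to measured HTC."""
    baseline = modelled_htc(1.0)          # baseline run, no scaling
    scalar = measured_htc / baseline      # first scalar estimate
    for _ in range(max_iter):             # optional iterative refinement
        current = modelled_htc(scalar)
        if abs(current - measured_htc) <= tol_w_per_k:
            break
        scalar *= measured_htc / current
    # If max_iter is exhausted, the latest estimate is returned unconverged.
    return scalar


def toy_htc(scalar: float) -> float:
    """Hypothetical model response that is not perfectly proportional to the scalar."""
    return 180.0 * scalar + 20.0


scalar = calibrate_scalar(toy_htc, measured_htc=260.0)
print(abs(toy_htc(scalar) - 260.0) <= 0.5)  # True
```

Refinement is needed here because the toy response is not strictly proportional to the scalar, so the first ratio estimate lands close to the target but not on it.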
Fairly comparing measurement methods
The value of any measurement depends on two things: the boundary conditions under which it is generated, and the trade-offs each test design makes between disruption, duration, and confidence.
That means different methods can produce different HTC estimates for the same dwelling. H3 therefore encourages calibration inputs to carry structured metadata such as method, time window, provenance, and uncertainty, so results remain auditable and comparable over time.
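A record of that kind might look like the sketch below. The class name, field names, and example values are hypothetical and do not describe H3's actual input schema; they only show the four kinds of metadata named above travelling together with the measured value.

```python
from dataclasses import asdict, dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class MeasuredHTC:
    """Hypothetical container for one measured HTC and its context."""
    value_w_per_k: float
    method: str                      # e.g. "co-heating" or "SMETER"
    window_start: date               # start of the measurement window
    window_end: date                 # end of the measurement window
    provenance: str                  # who produced the figure, and how
    uncertainty_w_per_k: Optional[float] = None  # None when not reported

    def __post_init__(self) -> None:
        if self.value_w_per_k <= 0:
            raise ValueError("HTC must be positive")
        if self.window_end < self.window_start:
            raise ValueError("measurement window ends before it starts")


# Illustrative values only, not real measurements.
record = MeasuredHTC(
    value_w_per_k=260.0,
    method="SMETER",
    window_start=date(2024, 1, 1),
    window_end=date(2024, 2, 28),
    provenance="illustrative example, not a real provider",
    uncertainty_w_per_k=30.0,
)
print(asdict(record)["method"])  # SMETER
```

Keeping uncertainty optional reflects the point that not every method or provider reports it in a comparable form.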
Not all of that metadata is used directly in optimisation today, but it still matters for traceability and for future uncertainty-aware calibration, especially where uncertainty is higher or not directly comparable across methods or providers.
Using H3 on a case study
We expect H3 to be most useful when it is applied to real case studies rather than treated as a purely abstract exercise.
A typical workflow starts with an existing HEM model for a home, usually based on survey data. This can be developed using Vulcan tooling, MHCLG's front-end, or other software options. Users then add a measured HTC value, together with context about how that measurement was obtained. H3 derives the HTC from HEM, shows the gap between modelled and measured performance, and generates an interactive report for review.
In the early stages, we want to use H3 on real case studies so that the workflow, evidence, and limits of the approach can be tested in practice. If you want to try H3 on a case study, we can offer access to a free Vulcan trial to create inputs to support that work.
What comes next
Next, we want to make H3 easier to adopt, improve the sophistication of calibration, and build a reusable evidence base for comparing methods, conventions, providers, and calibrated outputs. That means clearer standalone access and documentation, richer calibration objectives, and a structured artefact database for transparent QA and cross-method comparison.
If you work on EPC reform, measured HTC methods, or HEM-based assessment workflows, we would be very interested to hear from you.
If you want to explore H3 on a real case study, contact Baz from Vulcan (baz@usevulcan.app) or Kate from Knauf Energy Solutions (catherine.crawford@knaufenergy.com).
