In partnership with researchers at Queen Mary University of London’s Clinical Effectiveness Group, Endeavour Health has developed an address-matching algorithm to link patient health records to geospatial information.
Linking people to places can help researchers understand how health is impacted by social and environmental factors, like the characteristics of a household, green space or air pollution. But patient addresses are entered into NHS records as free text so the same address can be written in different ways, making data linkage very difficult.
The algorithm, known as ASSIGN (AddreSS MatchInG to Unique Property Reference Numbers), allocates a Unique Property Reference Number (UPRN) to patient records. Every property in the UK already has a UPRN. They are allocated by local authorities and made nationally available by Ordnance Survey. A UPRN gives every address a standardised format, enabling pseudonymised linkage to other sources of data. ASSIGN compares addresses in the NHS and social care records with the Ordnance Survey’s Address Base Premium UPRN database, one element at a time, and decides whether there is a match. The algorithm mirrors human pattern recognition, so it allows for certain character swaps, spelling mistakes and abbreviations. After rigorous testing and adjustments, ASSIGN correctly matches 98.6% of patient addresses at 38,000 records per minute. It also includes patients’ past addresses, making it possible to study addresses across the life span.
Researchers at Queen Mary are using the pseudonymised UPRNs to study:
- How the health of household members impacts childhood obesity.
- Whether overcrowded or multi-generational households are at greater risk from Covid-19.
- How to support GPs to identify people living in care homes so they can provide more effective care.
As ASSIGN is open source, it is hoped the algorithm will also be used by other researchers to link data, inform policy and improve population health across England.