The many advances provided by large-scale genomic analyses have confirmed that whilst disease risk has both genetic and environmental components, it’s complicated. Although there is undoubtedly a genetic component to an individual’s disease risk, deeply technical statistical methods are needed to tease the signal apart from the noise. Partly this is due to the scale of the datasets, which can involve upwards of 10 million data points for each of tens of thousands of individuals. But it’s also down to the complicated nature of disease risk, which results from an interplay between the environment and several genetic variants. Whilst we are getting ever better at measuring genetic variation, measuring the environment is still incredibly difficult.
It was against this backdrop that the UK Biobank was developed. The UK Biobank is a unique and innovative resource. It is a prospective cohort study, which means that it follows a group of people over time as different diseases begin to affect them. The cohort contains 500,000 people who were aged between 40 and 69 when they were recruited between 2006 and 2010. At the beginning of the study it was unknown which diseases would affect which individuals, but because the cohort is so large, it’s likely that many of the most common diseases will affect a sizeable chunk of it.
The major innovation of this cohort is that a wide variety of measurements were taken at the beginning of the study and continue to be taken now. These include family history of disease, early life experiences, current and former lifestyle choices, and cognitive function, all captured through detailed questionnaires completed by participants. A wide variety of objective physical measurements were also taken, such as hand grip strength, height, weight, and biochemical measurements. Importantly, genome-wide genotype data have also been collected. Ongoing analyses linking electronic health records, death records, and hospital inpatient data mean that all of these measurements can be related to the onset of different diseases. This makes the resource the first of its kind: a large body of data on individuals that can be interrogated by any bona fide researcher or commercial organisation. This latter point is important to stress, as it has always been planned that the data would be made available to both academics and private companies.