Summary statistics
Data are available for both the Affymetrix 500K and Infinium 15K chips. All files are available on the Sanger Institute FTP site.
Affymetrix 500K
The result files for the analyses described in the consortium paper are available below. They cover (i) each case collection against the pooled control groups 58C and UKBS, (ii) analyses combining other case collections as controls and (iii) analyses combining phenotypically relevant case collections (RA/T1D; autoimmune: CD/RA/T1D; cardiovascular: HT/CAD/T2D). Data are split by chromosome. Chromosome X is split into 'chromosome 23' (the non-pseudoautosomal part) and 'chromosome 24' (the pseudoautosomal region, PAR).
Autosomes and pseudoautosomal region on chromosome X
Download the complete data set or subsets with just the data relevant to each disease (Bipolar Disorder, Coronary Artery Disease, Crohn's Disease, Hypertension, Rheumatoid Arthritis, Type 1 Diabetes, Type 2 Diabetes).
The files have columns:
- id [500K probe ID]
- rsid [dbSNP]
- position
- allele1
- allele2
- average_maximum_posterior_call [measure of certainty of calls]
- controls_AA
- controls_AB
- controls_BB
- controls_NULL
- cases_AA
- cases_AB
- cases_BB
- cases_NULL
- frequentist_add [p-value for additive model]
- frequentist_gen [p-value for general model]
- bayesian_add [-log10 Bayes Factor for Additive model]
- bayesian_gen [-log10 Bayes Factor for General model]
- sex_frequentist_add [p-value for additive model stratified by sex]
- sex_frequentist_gen [p-value for general model stratified by sex]
- good_clustering [1 = good clustering, 0 = bad clustering]
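As a rough illustration of how these columns might be consumed, the sketch below reads rows into dictionaries keyed by the column names above and keeps well-clustered SNPs below a p-value threshold. It assumes whitespace-separated fields in the listed order and an optional header line beginning with "id"; neither is confirmed here, so check a downloaded file first. The function names and the sample rows are hypothetical and synthetic.

```python
import io

# Column order as listed above. The row layout (whitespace-separated fields,
# optional header line starting with "id") is an assumption.
COLUMNS = [
    "id", "rsid", "position", "allele1", "allele2",
    "average_maximum_posterior_call",
    "controls_AA", "controls_AB", "controls_BB", "controls_NULL",
    "cases_AA", "cases_AB", "cases_BB", "cases_NULL",
    "frequentist_add", "frequentist_gen",
    "bayesian_add", "bayesian_gen",
    "sex_frequentist_add", "sex_frequentist_gen",
    "good_clustering",
]


def read_results(handle):
    """Yield one dict per SNP row, keyed by the column names listed above."""
    for line in handle:
        fields = line.split()
        if not fields or fields[0] == "id":
            continue  # skip blank lines and an optional header row
        if len(fields) != len(COLUMNS):
            raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
        yield dict(zip(COLUMNS, fields))


def well_clustered_hits(rows, threshold):
    """Keep rows with good_clustering == 1 and frequentist_add below threshold."""
    return [
        row for row in rows
        if row["good_clustering"] == "1"
        and float(row["frequentist_add"]) < threshold
    ]


# Synthetic rows for illustration only; these are not real results.
SAMPLE = "\n".join([
    " ".join(COLUMNS),
    "SNP_A-0000001 rs0000001 1000 A G 0.99 100 200 700 0 90 210 700 0 1e-6 2e-6 4.1 3.9 1e-6 2e-6 1",
    "SNP_A-0000002 rs0000002 2000 C T 0.91 300 400 300 5 310 390 300 2 1e-7 3e-7 5.0 4.8 1e-7 3e-7 0",
    "SNP_A-0000003 rs0000003 3000 A C 0.98 500 400 100 1 495 405 100 0 0.5 0.6 -0.3 -0.4 0.5 0.6 1",
]) + "\n"

hits = well_clustered_hits(read_results(io.StringIO(SAMPLE)), 1e-4)
print([row["rsid"] for row in hits])
```

Values are kept as strings until they are needed, so a placeholder in a column that is never converted does not cause a failure.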
Chromosome X (non-pseudoautosomal)
Download the complete data set. The files are named snptest_{case}_{control}_23.txt with the abbreviations:
- CTL (58C, NBS)
- RT1 (RA, T1D)
- AIM (CD, RA, T1D)
- CVD (HT, CAD, T2D)
- CL1 (58C, NBS, HT, CAD, T2D, BD)
- CL2 (58C, NBS, CD, RA, T1D, BD)
- CL3 (58C, NBS, T1D, CAD, HT, CD, T2D)
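The naming scheme can be decoded mechanically. The following is a minimal sketch that splits a file name into its case and control labels and expands the group abbreviations listed above. The helper names are hypothetical, and the example file name is constructed from the pattern rather than taken from the archive. Labels not in the list, such as a single-disease abbreviation, are passed through unchanged.

```python
import re

# Group abbreviations exactly as listed above.
GROUPS = {
    "CTL": ["58C", "NBS"],
    "RT1": ["RA", "T1D"],
    "AIM": ["CD", "RA", "T1D"],
    "CVD": ["HT", "CAD", "T2D"],
    "CL1": ["58C", "NBS", "HT", "CAD", "T2D", "BD"],
    "CL2": ["58C", "NBS", "CD", "RA", "T1D", "BD"],
    "CL3": ["58C", "NBS", "T1D", "CAD", "HT", "CD", "T2D"],
}

# snptest_{case}_{control}_23.txt; labels are assumed to be alphanumeric.
_NAME = re.compile(r"^snptest_([A-Za-z0-9]+)_([A-Za-z0-9]+)_23\.txt$")


def expand(label):
    """Return the collections behind a label, or the label itself if unlisted."""
    return GROUPS.get(label, [label])


def parse_name(filename):
    """Split a chromosome 23 result file name into (cases, controls) lists."""
    match = _NAME.match(filename)
    if match is None:
        raise ValueError(f"unexpected file name: {filename!r}")
    case, control = match.groups()
    return expand(case), expand(control)


# Hypothetical file name built from the pattern above.
print(parse_name("snptest_BD_CTL_23.txt"))
```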
These files include the columns listed above for the autosomes, with some additional columns:
- controls_A_male
- controls_B_male
- controls_NULL_male
- cases_A_male
- cases_B_male
- cases_NULL_male
- region_frequentist_add [p-value for additive model stratified by region]
- region_frequentist_gen [p-value for general model stratified by region]
The cases_ and controls_ columns without the _male suffix refer to females. Some of the columns listed above for autosomes are not relevant here, in which case a "." appears in place of data.
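To show how the female genotype counts and male allele counts might be combined, here is a minimal sketch that estimates the frequency of allele B on the non-pseudoautosomal X. It assumes the _male columns are counts of hemizygous males carrying each allele, so each male contributes one allele and each female two. That reading of the column names is an interpretation, not something stated above. The "." placeholder is mapped to None, and the helper names are hypothetical.

```python
def to_int(value):
    """Convert a count field to int, mapping the "." placeholder to None."""
    return None if value == "." else int(value)


def allele_b_frequency_x(row, group="controls"):
    """Estimate the allele B frequency on chromosome X for one group.

    Females contribute two alleles each (AA/AB/BB genotype counts); males are
    assumed to contribute one allele each (A_male/B_male counts).
    """
    female = [to_int(row[f"{group}_{g}"]) for g in ("AA", "AB", "BB")]
    male = [to_int(row[f"{group}_{a}_male"]) for a in ("A", "B")]
    if None in female or None in male:
        return None  # at least one count is not available
    aa, ab, bb = female
    a_male, b_male = male
    total_alleles = 2 * (aa + ab + bb) + a_male + b_male
    if total_alleles == 0:
        return None
    return (ab + 2 * bb + b_male) / total_alleles


# Synthetic row for illustration only.
row = {
    "controls_AA": "10", "controls_AB": "20", "controls_BB": "30",
    "controls_A_male": "15", "controls_B_male": "25",
}
print(allele_b_frequency_x(row))
```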
Imputed genotypes
These files contain summary statistics, as above, for HapMap SNPs that are not present on the Affymetrix 500K GeneChip and whose genotypes in the WTCCC samples were obtained by imputation, as described in the paper.
Download the complete data set or subsets, one per collection (1958 British Birth Cohort, UK Blood Service, Bipolar Disorder, Coronary Artery Disease, Crohn's Disease, Hypertension, Rheumatoid Arthritis, Type 1 Diabetes, Type 2 Diabetes).
The files have columns:
- id
- rsid
- pos
- allele_A
- allele_B
- average_maximum_posterior_call [measure of certainty of calls]
- info
- controls_AA_exp [expected AA genotype frequency from imputation]
- controls_AB_exp [expected AB genotype frequency from imputation]
- controls_BB_exp [expected BB genotype frequency from imputation]
Illumina Infinium 15K (non-synonymous)
There are four files. Download them all in a ZIP archive.
The file genotype_counts.txt contains genotype counts for each cohort used in the analysis. Counts for the homozygous genotype containing the minor allele are listed first, followed by counts for the heterozygote and the other homozygote. Poorly genotyped SNPs that were excluded from analyses are set to missing. The columns are:
- Chromosome
- SNP
- Location (in bp)
- Ankylosing Spondylitis Genotype Counts
- Auto-immune Thyroid Disease Genotype Counts
- Breast Cancer Genotype Counts
- Multiple Sclerosis Genotype Counts
- 1958 Birth Cohort Genotype Counts
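Because the counts are ordered minor homozygote, heterozygote, other homozygote, a minor allele frequency follows directly from them. The sketch below shows the arithmetic only. How the three counts are delimited within a column, and how missing values are written, is not described here, so the function takes three integers that are already parsed. The function name is hypothetical.

```python
def minor_allele_frequency(minor_hom, het, other_hom):
    """Minor allele frequency from counts in the order used by genotype_counts.txt.

    Each minor homozygote carries two minor alleles and each heterozygote one,
    out of two alleles per genotyped individual.
    """
    individuals = minor_hom + het + other_hom
    if individuals == 0:
        return None  # no genotyped individuals, frequency undefined
    return (2 * minor_hom + het) / (2 * individuals)


# Synthetic counts for illustration only: (2*10 + 80) / (2*200) = 0.25
print(minor_allele_frequency(10, 80, 110))
```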
The file original_analyses.txt documents the analyses performed on each case group versus the 1958 birth cohort control group:
- Chromosome
- SNP
- Location (in bp)
- Ankylosing Spondylitis Armitage Trend test Chi-square Statistic
- Ankylosing Spondylitis degrees of freedom
- Ankylosing Spondylitis p value
- Auto-immune Thyroid Disease Armitage Trend test Chi-square Statistic
- Auto-immune Thyroid Disease degrees of freedom
- Auto-immune Thyroid Disease p value
- Breast Cancer Armitage Trend test Chi-square Statistic
- Breast Cancer degrees of freedom
- Breast Cancer p value
- Multiple Sclerosis Armitage Trend test Chi-square Statistic
- Multiple Sclerosis degrees of freedom
- Multiple Sclerosis p value
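For readers who want to relate the chi-square statistics to genotype counts, the following is a minimal sketch of the standard Cochran-Armitage trend test with additive scores 0, 1, 2 and one degree of freedom. Whether the statistics in this file were computed with exactly this variant is an assumption. The p-value uses the identity that a chi-square variable with 1 degree of freedom has upper tail erfc(sqrt(x/2)).

```python
import math


def armitage_trend(cases, controls):
    """Cochran-Armitage trend test with scores 0, 1, 2.

    cases, controls: (n0, n1, n2) genotype counts indexed by the number of
    copies of the scored allele. Returns (chi_square, p_value) for 1 degree
    of freedom, or (None, None) when the statistic is undefined.
    """
    scores = (0, 1, 2)
    totals = [r + s for r, s in zip(cases, controls)]
    n_cases = sum(cases)
    n_all = sum(totals)
    n_controls = n_all - n_cases
    score_cases = sum(w * r for w, r in zip(scores, cases))
    score_all = sum(w * n for w, n in zip(scores, totals))
    score_sq_all = sum(w * w * n for w, n in zip(scores, totals))
    numerator = n_all * (n_all * score_cases - n_cases * score_all) ** 2
    denominator = n_cases * n_controls * (n_all * score_sq_all - score_all ** 2)
    if denominator == 0:
        return None, None  # e.g. no cases, no controls, or a monomorphic SNP
    chi_square = numerator / denominator
    p_value = math.erfc(math.sqrt(chi_square / 2))  # chi-square tail, 1 df
    return chi_square, p_value


# Synthetic counts for illustration only.
chi_square, p_value = armitage_trend((10, 20, 30), (30, 20, 10))
print(chi_square, p_value)
```

Note that reordering the genotype counts so that the other allele is scored leaves the statistic unchanged, which makes the choice of reference allele immaterial for this test.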
The file pooled_analyses.txt documents the analyses for each case group versus a combined control group consisting of each of the other disease groups and the 1958 Birth Cohort:
- Chromosome
- SNP
- Location (in bp)
- Ankylosing Spondylitis Armitage Trend test Chi-square Statistic
- Ankylosing Spondylitis degrees of freedom
- Ankylosing Spondylitis p value
- Auto-immune Thyroid Disease Armitage Trend test Chi-square Statistic
- Auto-immune Thyroid Disease degrees of freedom
- Auto-immune Thyroid Disease p value
- Breast Cancer Armitage Trend test Chi-square Statistic
- Breast Cancer degrees of freedom
- Breast Cancer p value
- Multiple Sclerosis Armitage Trend test Chi-square Statistic
- Multiple Sclerosis degrees of freedom
- Multiple Sclerosis p value
- Auto-Immune Diseases Armitage Trend test Chi-square Statistic
- Auto-Immune Diseases degrees of freedom
- Auto-Immune Diseases p value
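The pooling described above, in which each case group is compared against every other disease group plus the 1958 Birth Cohort, can be sketched as a sum over genotype counts. The group labels and counts below are hypothetical and synthetic. The sketch covers only the single-disease comparisons, since the composition of the combined Auto-Immune Diseases analysis is not specified here.

```python
def pooled_controls(counts, case_group):
    """Sum genotype counts over every group except the case group.

    counts: dict mapping a group label to a (minor_hom, het, other_hom) tuple.
    """
    if case_group not in counts:
        raise KeyError(f"unknown group: {case_group!r}")
    pooled = [0, 0, 0]
    for group, genotype_counts in counts.items():
        if group == case_group:
            continue  # the case group is never part of its own controls
        for i, n in enumerate(genotype_counts):
            pooled[i] += n
    return tuple(pooled)


# Synthetic counts with hypothetical group labels, for illustration only.
counts = {
    "AS": (1, 2, 3),
    "AITD": (4, 5, 6),
    "BC": (7, 8, 9),
    "MS": (10, 11, 12),
    "58C": (13, 14, 15),
}
print(pooled_controls(counts, "AS"))
```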
The file top_hits_pooled_analyses.txt contains results for SNPs that have p values less than 0.001 in the combined analyses:
- Disease
- SNP
- Chromosome
- Location (in bp)
- Minor Allele Frequency
- Odds ratio
- Armitage trend test Chi-square
- p value
- Gene
- Gene Description