Wellcome Trust Case Control Consortium

Access to WTCCC genotype data

The primary purpose of the WTCCC is to accelerate efforts to identify genome sequence variants influencing major causes of human morbidity and mortality, through implementation and analysis of large-scale genome-wide association studies. Additional objectives include the development and validation of informatics and analytical solutions appropriate to the scale and nature of the project, as well as use of the data generated to answer important methodological and biological questions relevant to association studies in general, and in the UK in particular (for example, issues of population substructure).

The Consortium anticipates that data generated from the project will be used by others, for example in developing new analytical methods, in understanding patterns of polymorphism, and in guiding selection of markers to map genes involved in specific diseases.

Access to summary data and individual-level genotype data is available by application to the Wellcome Trust Case Control Consortium Data Access Committee. Access to data will be granted to qualified investigators for appropriate use. Individual-level genotype data and summary genotype statistics for WTCCC1 collections are held within the European Genotype Archive, http://www.ebi.ac.uk/ega. For further information regarding EGA, please contact ega-admin@ebi.ac.uk.

Data available

Summary statistics and individual-level genotype data are available for the following:

  1. Data from the following samples using the 500K Affymetrix chip:

    • 1,500 samples from the 1958 British Birth Cohort
    • 1,500 samples from the UK Blood Service Control Group
    • 2,000 samples each from the following disease collections: type 1 diabetes, type 2 diabetes, rheumatoid arthritis, inflammatory bowel disease, bipolar disorder, hypertension, coronary artery disease.
    • 1,500 samples from a tuberculosis collection from The Gambia and 1,500 samples from a control collection from The Gambia

    Two sets of genotypes are available for each of these, called by different algorithms: Chiamo, as discussed and used in the analysis for the WTCCC papers, and the Affymetrix algorithm BRLMM.

  2. Data from the following samples using the Affymetrix v6.0 chip:

    • 3,000 samples from the 1958 British Birth Cohort
    • 3,000 samples from the UK Blood Service Control Group

    This data will now not be ready for release until late summer 2009.

  3. Data from the following samples using an Illumina 1.2M (custom) chip:

    • 3,000 samples from the 1958 British Birth Cohort
    • 3,000 samples from the UK Blood Service Control Group

    This data will now not be ready for release until late summer 2009.

  4. Data from 1,400 samples from the 1958 British Birth Cohort typed on the Illumina 550k chip. This dataset was generated by the Wellcome Trust Sanger Institute in collaboration with the 1958 BC, but is being distributed as part of the WTCCC.

  5. Data from the following samples using a custom Illumina chip with 15K non-synonymous SNPs:

    • 1,500 samples from the 1958 British Birth Cohort
    • 1,000 samples each from the following disease collections: multiple sclerosis, autoimmune thyroid disease, ankylosing spondylitis and breast cancer.

Genotypes from these samples were called with the Illumina algorithm GenCall.

Related data and samples

  1. 1958 BC Genotype Data: All individual-level genotype data generated on the 1958 BC samples will be distributed using the access process described below. If you require access to 1958 BC genotype data only, please follow the access process described below. Applications for access to 1958 BC genotype data and sample DNA, or genotype data and phenotype data, will be considered by the 1958 BC Oversight Committee. Details of the application process can be found here.
  2. Data from 1,059 samples from cases of severe malaria from The Gambia. One set of genotypes is available for these, called by Chiamo as discussed and used in the analysis for the WTCCC papers. These are available through MalariaGEN, http://www.malariagen.net/access. Data from the related controls, 1,496 cord blood samples from The Gambia, are available through either MalariaGEN or WTCCC.

How to apply

To apply for access to the WTCCC datasets please use the following link:

https://www.sanger.ac.uk/legal/DAA/MasterController

If you have any queries about the application process please contact

Publications

The WTCCC plans to publish several manuscripts based on this control genotype data which will focus on the following aspects:

The release of pre-publication data from large resource-generating scientific projects was the subject of a meeting held in January 2003, the "Fort Lauderdale meeting", sponsored by the Wellcome Trust. The report from that meeting can be viewed at wtd003207.pdf. The recommendations of the Fort Lauderdale meeting address the roles and responsibilities of data producers, data users, and funders of 'community resource projects', with the aim of establishing and maintaining an appropriate balance between the interests of data users in rapid access to data and the needs of data producers to receive recognition for their work. The WTCCC has agreed to follow these data-release principles and as such, these data are being released as a 'community resource project' as defined in the report of the Fort Lauderdale meeting.