B1460 - In-silico detection of deletions in KLK3 from the ALSPAC raw SNP data - 03/12/2012

B number: 
B1460
Principal applicant name: 
Santiago Rodriguez (University of Bristol, UK)
Co-applicants: 
Prof Ian Day (University of Bristol, UK), Dr Osama Al-Ghamdi (University of Bristol, UK)
Title of project: 
In-silico detection of deletions in KLK3 from the ALSPAC raw SNP data.
Proposal summary: 

Aim: To estimate the frequency of KLK3 copy number variants in the general population.

KLK3 encodes PSA, the most widely used biomarker in the early detection of prostate cancer. Genetic variants in KLK3 may influence serum PSA levels. In a pilot study of selected controls from the ProtecT biorepository, we identified three subjects with very low serum PSA levels (less than 1 ng/micro-L) and evidence of heterozygous deletions that entirely encompassed KLK3 (Rodriguez et al., Clinical Chemistry, in press).

Hypothesis: That deletions in the KLK3 region can be identified from SNP raw data available from ALSPAC.

Background: SNP arrays were originally designed to genotype SNPs; however, they have since been adapted for structural variant discovery. Several tools for analysing SNP array data have been developed recently, but they tend to be platform-specific (i.e. able to process either Affymetrix or Illumina array data, but not both).
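
As a rough illustration of how array intensities can point to a deletion, the sketch below computes the log R ratio (LRR), the log2 ratio of observed to expected total signal intensity at a probe, and flags a run of probes with a low median LRR. This is only a sketch: the function names, the threshold of -0.3 and the minimum of 5 probes are assumptions chosen for illustration and do not come from the proposal. The callers named in this proposal rely on statistical models rather than a fixed cut-off.

```python
import math
from statistics import median


def log_r_ratio(observed_r, expected_r):
    """Return log2(observed / expected) total intensity for one probe."""
    return math.log2(observed_r / expected_r)


def looks_like_deletion(lrr_values, threshold=-0.3, min_probes=5):
    """Flag a run of consecutive probes as a candidate deletion.

    threshold and min_probes are hypothetical values for illustration only.
    """
    if len(lrr_values) < min_probes:
        return False  # too few probes to support a call
    return median(lrr_values) < threshold
```

In theory, losing one of two copies halves the signal, so `log_r_ratio(0.5, 1.0)` gives -1.0; observed values on real arrays are usually less extreme than this.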

Algorithms that handle data from Affymetrix arrays include programmes such as Birdsuite (1) and ITALICS (2). Other algorithms were developed to analyse data from Illumina arrays, such as SCIMM (3) and TriTyper (4). Some software applications are "versatile", i.e. capable of handling data from both platforms, e.g. PennCNV (5) and QuantiSNP (6).

Approach: Access is required to the raw SNP fluorescence data for a genomic interval of ~18 Mb on chromosome 19 (Chromosome 19: [41,000,000 - 59,128,983], Human GRCh37 build). This interval includes the entire human kallikrein locus, which comprises 15 kallikrein genes arranged in tandem, including KLK3, which encodes prostate-specific antigen (PSA). Although serum PSA in healthy women is typically undetectable (around 1 pg/mL) (7), deletions in KLK3 could still be transmitted from mothers to children. KLK3 deletions in both the general population and prostate cancer cases will be considered. ALSPAC mothers will serve as a "control" group for comparison with data from the 1958 cohort and from the ProtecT cohort (males with prostate cancer and controls). High-confidence copy number variants will be assembled from multiple scans using several algorithms, with false-positive and discrepant signals filtered out in the process. The list of copy number variants called in this study will be compared across studies to estimate the frequency of KLK3 deletion events in males vs. females, and in male controls vs. prostate cancer cases. We will also request access to the SNP genotype data for the same genomic region in the children, which will enable us to study deletion transmission patterns from mothers to children.
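
The consensus step described above could be sketched as follows. This is a minimal illustration under stated assumptions, not the applicants' pipeline: the call record layout, the function names, the rule that at least two algorithms must agree, and the toy target coordinates are all hypothetical. Only the chromosome 19 interval is taken from the proposal. The toy target is a placeholder and is not the real position of KLK3.

```python
from collections import defaultdict

# Interval requested in the proposal (chromosome 19, GRCh37).
REGION = ("19", 41_000_000, 59_128_983)


def consensus_deletion_carriers(calls, target, min_algorithms=2):
    """Return samples with a deletion spanning `target` in >= min_algorithms callers.

    calls: iterable of dicts with keys sample, algorithm, chrom, start, end,
           copy_number (hypothetical record layout).
    target: (chrom, start, end) of the gene of interest.
    """
    chrom, t_start, t_end = target
    support = defaultdict(set)
    for call in calls:
        if call["chrom"] != chrom or call["copy_number"] >= 2:
            continue  # wrong chromosome, or not a copy number loss
        if call["start"] < REGION[1] or call["end"] > REGION[2]:
            continue  # outside the interval for which data were requested
        if call["start"] <= t_start and call["end"] >= t_end:
            support[call["sample"]].add(call["algorithm"])
    return {s for s, algos in support.items() if len(algos) >= min_algorithms}


def carrier_frequency(carriers, n_samples):
    """Proportion of scanned samples that carry a consensus deletion."""
    return len(carriers) / n_samples


# Toy data; coordinates are placeholders, NOT the real KLK3 position.
toy_target = ("19", 51_000_000, 51_010_000)
toy_calls = [
    {"sample": "S1", "algorithm": "A", "chrom": "19",
     "start": 50_900_000, "end": 51_100_000, "copy_number": 1},
    {"sample": "S1", "algorithm": "B", "chrom": "19",
     "start": 50_950_000, "end": 51_050_000, "copy_number": 1},
    {"sample": "S2", "algorithm": "A", "chrom": "19",
     "start": 50_900_000, "end": 51_100_000, "copy_number": 1},
    {"sample": "S3", "algorithm": "A", "chrom": "19",
     "start": 50_900_000, "end": 51_100_000, "copy_number": 3},
    {"sample": "S3", "algorithm": "B", "chrom": "19",
     "start": 51_005_000, "end": 51_100_000, "copy_number": 1},
]
toy_carriers = consensus_deletion_carriers(toy_calls, toy_target)
toy_frequency = carrier_frequency(toy_carriers, n_samples=4)
```

In the toy data only S1 is retained. S2 is supported by a single algorithm, and S3 has one duplication call and one deletion call that does not span the whole target. Requiring agreement between callers trades sensitivity for fewer false positives; the appropriate threshold would depend on the algorithms actually used.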

References:

Santiago Rodriguez, Osama A Al-Ghamdi, Kimberley Burrows, Philip A.I. Guthrie, J. Athene Lane, Michael Davis, Gemma Marsden, Khalid K Alharbi, Angela Cox, Freddie C Hamdy, David E Neal, Jenny L Donovan, and Ian N. M. Day (2012) Very low PSA levels and deletions of the KLK3 gene. Clin Chem, in press.

(1) Korn, J. M., F. G. Kuruvilla, S. A. McCarroll, A. Wysoker, J. Nemesh, S. Cawley, E. Hubbell, J. Veitch, P. J. Collins, K. Darvishi, C. Lee, M. M. Nizzari, S. B. Gabriel, S. Purcell, M. J. Daly & D. Altshuler (2008) Integrated genotype calling and association analysis of SNPs, common copy number polymorphisms and rare CNVs. Nat Genet, 40, 1253-60.

(2) Rigaill, G., P. Hupe, A. Almeida, P. La Rosa, J. P. Meyniel, C. Decraene & E. Barillot (2008) ITALICS: an algorithm for normalization and DNA copy number calling for Affymetrix SNP arrays. Bioinformatics, 24, 768-74.

(3) Kelley, D. R. & S. L. Salzberg (2010) Clustering metagenomic sequences with interpolated Markov models. BMC Bioinformatics, 11, 544.

(4) Franke, L., C. G. de Kovel, Y. S. Aulchenko, G. Trynka, A. Zhernakova, K. A. Hunt, H. M. Blauw, L. H. van den Berg, R. Ophoff, P. Deloukas, D. A. van Heel & C. Wijmenga (2008) Detection, imputation, and association analysis of small deletions and null alleles on oligonucleotide arrays. Am J Hum Genet, 82, 1316-33.

(5) Wang, K., M. Li, D. Hadley, R. Liu, J. Glessner, S. F. Grant, H. Hakonarson & M. Bucan (2007) PennCNV: an integrated hidden Markov model designed for high-resolution copy number variation detection in whole-genome SNP genotyping data. Genome Res, 17, 1665-74.

(6) Colella, S., C. Yau, J. M. Taylor, G. Mirza, H. Butler, P. Clouston, A. S. Bassett, A. Seller, C. C. Holmes & J. Ragoussis (2007) QuantiSNP: an Objective Bayes Hidden-Markov Model to detect and accurately map copy number variation using SNP genotyping data. Nucleic Acids Res, 35, 2013-25.

(7) Chang, Y. F., S. H. Hung, Y. J. Lee, R. C. Chen, L. C. Su, C. S. Lai & C. Chou (2011) Discrimination of breast cancer by measuring prostate-specific antigen levels in women's serum. Anal Chem, 83, 5324-8.

Date proposal received: 
Thursday, 8 November, 2012
Date proposal approved: 
Monday, 3 December, 2012
Keywords: 
Genetics, Cancer
Primary keyword: