Insight into genetic predisposition to chronic lymphocytic leukemia from integrative epigenomics

Authors: Helen E. Speedy, Renée Beekman, Vicente Chapaprieta, Giulia Orlando, Philip J. Law, David Martín-García, Jesús Gutiérrez-Abril, Daniel Catovsky, Sílvia Beà, Guillem Clot, Montserrat Puiggros, David Torrents, Xose S. Puente, James M. Allan, Carlos López-Otín, Elias Campo, Richard S. Houlston, José I. Martín-Subero

Nature Communications, volume 10, 3615 (2019)

doi:10.1038/s41467-019-11582-2

 

Abstract

Genome-wide association studies (GWAS) have provided evidence for inherited genetic predisposition to chronic lymphocytic leukemia (CLL). However, efforts to define the mechanisms mediating the, largely non-coding, signals revealed by GWAS have been constrained by a lack of integrated genome-wide data in large CLL series. We performed a detailed epigenomic characterization of the 42 known CLL risk loci by analysing chromatin accessibility, active regulatory elements marked by H3K27ac, and DNA methylation in up to 486 primary CLLs. Risk loci were significantly enriched for active chromatin in CLL with evidence of being CLL-specific or differentially regulated in normal B-cell development. We used in situ promoter capture Hi-C (CHi-C), in conjunction with gene expression data to identify likely target genes of the risk loci. Candidate target genes were enriched for pathways related to B-cell development such as MYC and BCL2 signalling. At 14 loci our analysis highlights 63 variants that potentially influence CLL risk. In summary, the integration of genetic and epigenetic information has provided insights into the relationship between inherited predisposition and the regulatory landscape of CLL.

UCSC tracks

Please use this link to access the CLL Reference Epigenome tracks in the UCSC Genome Browser. The tracks are briefly described in this document.

Data

All raw data for this study were mined from previous studies and have been deposited at the European Genome-Phenome Archive (EGA, https://ega-archive.org), which is hosted at the European Bioinformatics Institute (EBI), under accession numbers EGAS00000000092, EGAD00001004046, EGAS00001000272 and EGAS00001001911.

The normalized data matrices for the QTL analyses can be found in the following table.

 
| File | Details | File size | Download |
|---|---|---|---|
| H3K27ac peaks | Indicates which H3K27ac peaks were used for the QTL analysis. Per SNP (columns), only peaks assigned a 2 were used. 0 = peak present in less than 10% (with a minimum of two) of the patients in all of the following subgroups: homozygous non-risk, heterozygous risk or homozygous risk, based on the sentinel SNP genotype; 1 = peak present in at least 10% (with a minimum of two) of the patients in one or more of these subgroups, and located outside the linkage disequilibrium region (LD, r² ≥ 0.2) of the corresponding SNPs; 2 = peak present in at least 10% (with a minimum of two) of the patients in one or more of these subgroups, and located within the LD region (r² ≥ 0.2) of the corresponding SNPs. | 1.3M | Link |
| H3K27ac normalized values | Contains the normalized H3K27ac signals used for the QTL analysis. | 91M | Link |
| ATAC-seq peaks | Indicates which ATAC-seq peaks were used for the QTL analysis. Per SNP (columns), only peaks assigned a 2 were used. 0 = peak present in less than 10% (with a minimum of two) of the patients in all of the following subgroups: homozygous non-risk, heterozygous risk or homozygous risk, based on the sentinel SNP genotype; 1 = peak present in at least 10% (with a minimum of two) of the patients in one or more of these subgroups, and located outside the linkage disequilibrium region (LD, r² ≥ 0.2) of the corresponding SNPs; 2 = peak present in at least 10% (with a minimum of two) of the patients in one or more of these subgroups, and located within the LD region (r² ≥ 0.2) of the corresponding SNPs. | 2M | Link |
| ATAC-seq normalized values | Contains the normalized ATAC-seq signals used for the QTL analysis. | 132M | Link |
| DNA methylation CpGs | Lists, per SNP, the CpGs used for the QTL analysis. These CpGs are located within the linkage disequilibrium region (LD, r² ≥ 0.2) of the corresponding SNPs. | 14K | Link |
| DNA methylation normalized beta values | Contains the normalized beta values of the DNA methylation data, which were transformed to M-values for the QTL analysis. | 1.8G | Link |
| Gene expression probes | Indicates which probes were used for the RNA expression QTL analysis. Per SNP (columns), only probes assigned a 2 were used. 0 = probe expressed (GC-RMA levels > 4.5) in less than 10% (with a minimum of two) of the patients in all of the following subgroups: homozygous non-risk, heterozygous risk or homozygous risk, based on the sentinel SNP genotype; 1 = probe expressed (GC-RMA levels > 4.5) in at least 10% (with a minimum of two) of the patients in one or more of these subgroups, and located outside the linkage disequilibrium region (LD, r² ≥ 0.2) of the corresponding SNPs; 2 = probe expressed (GC-RMA levels > 4.5) in at least 10% (with a minimum of two) of the patients in one or more of these subgroups, and located within the LD region (r² ≥ 0.2) of the corresponding SNPs. | 6.4K | Link |
| Gene expression normalized data | Contains the GC-RMA normalized gene expression data used for the QTL analysis. | 79M | Link |
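As an illustration of how these files might be used, the following minimal Python sketch selects the features (peaks or probes) coded 2 for a given SNP and converts methylation beta values to M-values. It is not the authors' code. The file layout is an assumption: a tab-delimited table whose first column holds the feature ID, followed by one column per SNP. The function names, the toy peak and SNP identifiers, and the `eps` clamp are hypothetical. The M-value uses the standard definition M = log2(beta / (1 - beta)); the exact transformation used in the study is not stated here.

```python
import csv
import io
import math


def features_for_snp(table_text, snp, delimiter="\t"):
    """Return the feature IDs (peaks or probes) coded 2 for the given SNP column.

    Assumed layout (not confirmed by the source): first column is the
    feature ID, every other column is one SNP with values 0, 1 or 2.
    """
    reader = csv.DictReader(io.StringIO(table_text), delimiter=delimiter)
    id_column = reader.fieldnames[0]  # assumption: first column holds the ID
    return [row[id_column] for row in reader if row[snp].strip() == "2"]


def beta_to_m(beta, eps=1e-6):
    """Convert a beta value to an M-value, M = log2(beta / (1 - beta))."""
    # Clamp to (eps, 1 - eps) so beta values of exactly 0 or 1 stay finite.
    # The clamp is an illustrative choice, not taken from the study.
    b = min(max(beta, eps), 1.0 - eps)
    return math.log2(b / (1.0 - b))


# Toy indicator table; peak and SNP names are made up for illustration.
example = (
    "peak\trs0000001\trs0000002\n"
    "peak_1\t2\t0\n"
    "peak_2\t1\t2\n"
    "peak_3\t0\t2\n"
)

selected = features_for_snp(example, "rs0000002")  # peaks coded 2 for this SNP
m_value = beta_to_m(0.5)  # a beta of 0.5 corresponds to an M-value of 0
```

With real files, the table text would be read from the downloaded indicator file, and the selected IDs would then be used to subset the matching normalized-value matrix before testing each SNP.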

Code

The custom code is provided in the following table.

| File | Details | File size | Download |
|---|---|---|---|
| Chromatin states enrichment | Evaluation of chromatin states enrichment in 7 chronic lymphocytic leukemia (CLL) patients at CLL, breast cancer (BC) and colorectal cancer (CRC) risk loci. | 79K | Link |
| H3K27ac enrichment | Enrichment of H3K27ac on non-individual peaks within the linkage disequilibrium regions. | 10M | Link |
| Allelic imbalance | Statistical analysis evaluating allelic imbalance in 99 CLL patients. | 3.6M | Link |