Documentation


Citation

For all resources provided on the AlleleDB website, including the ASB, ASE, accessible SNVs, supplementary materials, personal genomes and scripts, please cite:
Chen J, Rozowsky J, Galeev TR, Harmanci A, Kitchen R, Bedford J, Abyzov A, Kong Y, Regan L, Gerstein M. A uniform survey of allele-specific binding and expression over 1000-Genomes-Project individuals (2016). Nat Commun. 7:11101

For more details on the original AlleleSeq pipeline and the vcf2diploid tool, please visit the AlleleSeq website here.

Querying the database

This database interfaces with the UCSC genome browser search engine; all coordinates refer to the human reference genome build hg19.

Three basic types of query can be entered into the search box on the 'Query' page.

1) A region of the genome.
E.g. chr15:25247000-25365000. This will directly query the database with that region.

2) Name of a gene (HGNC symbol or otherwise).
E.g. EEF2. This returns the UCSC genome annotations (with positional information) associated with that gene name; each annotation appears as a link that queries the database for the corresponding region.

3) Keywords.
E.g. tetratricopeptide repeat.
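
For scripted use, it can help to validate a region string (query option 1) before submitting it. The following is a minimal Python sketch, not part of AlleleDB; the names `Region` and `parse_region` are hypothetical, and accepting commas inside coordinates is an assumption made for convenience.

```python
import re
from typing import NamedTuple


class Region(NamedTuple):
    chrom: str
    start: int
    end: int


# Matches strings such as "chr15:25247000-25365000". Commas inside the
# coordinates (e.g. "25,247,000") are tolerated and stripped (an assumption).
_REGION_RE = re.compile(r"^(chr[0-9A-Za-z_]+):([\d,]+)-([\d,]+)$")


def parse_region(text: str) -> Region:
    """Parse a 'chrom:start-end' query string into a Region."""
    match = _REGION_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a region string: {text!r}")
    chrom, start, end = match.groups()
    start_i = int(start.replace(",", ""))
    end_i = int(end.replace(",", ""))
    if start_i > end_i:
        raise ValueError("start must not exceed end")
    return Region(chrom, start_i, end_i)


region = parse_region("chr15:25247000-25365000")
print(region.chrom, region.start, region.end)
```

A gene name or keyword such as EEF2 raises `ValueError` here, which is one simple way to tell a region query apart from the other two query types.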

Output of AlleleDB

A successful query of the database produces two files: "out.bed" and "view.txt".

Sample contents of "out.bed" are shown below.

chr3 10192471  10192472  NA18486_ASE  G/A 84  0 92  0 0 ASE
chr3 10192671  10192672  NA18486_ASE  G/A 98  0 73  0 0 ASE
chr3 10191942  10191943  NA20505_ASE  G/A 25  0 27  0 0 ASE
chr3 10192671  10192672  NA20505_ASE  G/A 30  0 16  0 0 ASE
chr3 10192671  10192672  NA12878_ASB-Pol2  G/A 7   0 4   0 0 ASB-Pol2
chr3 10193682  10193683  NA12878_ASB-Pol2  T/G 0   0 8   6 0 ASB-Pol2
chr3 10184955  10184956  NA12878_ASB-SRF   A/T 12  0 0   8 0 ASB-SRF

The columns are:
1. Chromosome of the SNV
2. Start position of the SNV (0-based)
3. End position of the SNV (1-based)
4. Identifier of the individual (as per the 1000 Genomes Project) with ASB/ASE annotation. For more information on the individuals, see here.
5. Reference allele / Alternate allele
6. Read counts for Adenine
7. Read counts for Cytosine
8. Read counts for Guanine
9. Read counts for Thymine
10. Was this SNV detected as allele-specific (AS)? (0:No | 1:Yes)
11. Additional annotation: ASB/ASE, followed by TF name if ASB.
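
The column layout above can be read programmatically. The following is a minimal Python sketch that assumes whitespace-separated fields, as in the sample; `AlleleRecord` and `parse_out_bed_line` are hypothetical names, not part of the AlleleDB scripts.

```python
from typing import Dict, NamedTuple


class AlleleRecord(NamedTuple):
    chrom: str
    start: int              # column 2, 0-based start
    end: int                # column 3
    sample: str             # column 4, e.g. "NA18486_ASE"
    ref: str                # column 5, reference allele
    alt: str                # column 5, alternate allele
    counts: Dict[str, int]  # columns 6-9, read counts keyed by A, C, G, T
    is_as: bool             # column 10, True when the flag is "1"
    annotation: str         # column 11, e.g. "ASE" or "ASB-Pol2"


def parse_out_bed_line(line: str) -> AlleleRecord:
    """Split one line of "out.bed" into its 11 documented columns."""
    fields = line.split()
    if len(fields) != 11:
        raise ValueError(f"expected 11 columns, got {len(fields)}: {line!r}")
    chrom, start, end, sample, alleles = fields[:5]
    ref, alt = alleles.split("/")
    counts = dict(zip("ACGT", (int(v) for v in fields[5:9])))
    return AlleleRecord(chrom, int(start), int(end), sample, ref, alt,
                        counts, fields[9] == "1", fields[10])


rec = parse_out_bed_line(
    "chr3 10192471  10192472  NA18486_ASE  G/A 84  0 92  0 0 ASE")
# Look up the read counts of the reference and alternate alleles.
print(rec.ref, rec.counts[rec.ref], rec.alt, rec.counts[rec.alt])
# prints: G 92 A 84
```

Keying the four count columns by base makes it straightforward to pull out the reference and alternate read counts regardless of which bases they are.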

Sample contents of "view.txt" are below.

browser position chr3:10183319-10195354
track name="AS SNVs" description="AlleleDB Output" itemRgb="On"
chr3  10192471  10192472  NA18486_ASE  0 + 10192471  10192472  0,0,0
chr3  10192671  10192672  NA18486_ASE  0 + 10192671  10192672  0,0,0
chr3  10191942  10191943  NA20505_ASE  0 + 10191942  10191943  0,0,0
chr3  10192671  10192672  NA20505_ASE  0 + 10192671  10192672  0,0,0
chr3  10193682  10193683  NA12878_ASB-Pol2  0 + 10193682  10193683  0,0,0

The "view.txt" file is meant to be viewed as a track in the UCSC genome browser (the output page provides a link to do this). The file can also be downloaded and saved as a custom track for later use.

On the track, ASB SNVs are colored red and ASE SNVs are black.
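
A downloaded "view.txt" can also be inspected outside the browser. The sketch below, in Python, assumes the layout shown in the sample: a "browser" line, a "track" line, then nine whitespace-separated fields per SNV. The function name `read_track_items` is hypothetical.

```python
from typing import Dict, Iterator


def read_track_items(text: str) -> Iterator[Dict[str, object]]:
    """Yield one dict per SNV line, skipping 'browser' and 'track' lines."""
    for line in text.splitlines():
        fields = line.split()
        if not fields or fields[0] in ("browser", "track"):
            continue
        if len(fields) != 9:
            raise ValueError(f"expected 9 fields, got {len(fields)}: {line!r}")
        chrom, start, end, name = fields[:4]
        # Item names look like "NA12878_ASB-Pol2": individual, then annotation.
        sample, _, annotation = name.partition("_")
        rgb = tuple(int(v) for v in fields[8].split(","))
        yield {"chrom": chrom, "start": int(start), "end": int(end),
               "sample": sample, "annotation": annotation, "rgb": rgb}


example = """browser position chr3:10183319-10195354
track name="AS SNVs" description="AlleleDB Output" itemRgb="On"
chr3  10192471  10192472  NA18486_ASE  0 + 10192471  10192472  0,0,0
chr3  10193682  10193683  NA12878_ASB-Pol2  0 + 10193682  10193683  0,0,0
"""
for item in read_track_items(example):
    print(item["sample"], item["annotation"], item["start"], item["rgb"])
```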

Downloading precompiled results

More complete data, annotated with TF (for ASB) or gene (for ASE) and with sample information, are available on our 'Download' page.

Data sources

DNA-seq:
1000 Genomes Project (Abecasis, G. et al., Nature, 2012); PMID: 23128226
RNA-seq:
gEUVADIS (Lappalainen, T. et al., Nature, 2013); PMID: 24037378
ENCODE (ENCODE Project Consortium, Nature, 2012); PMID: 22955616
Lalonde et al., Genome Res (2011); PMID: 21173033
Montgomery et al., Nature (2010); PMID: 20220756
Pickrell et al., Nature (2010); PMID: 20220758
Kilpinen et al., Science (2013); PMID: 24136355
Kasowski et al., Science (2013); PMID: 24136358
ChIP-seq:
ENCODE (ENCODE Project Consortium, Nature, 2012); PMID: 22955616
McVicker et al., Science (2013); PMID: 24136359
Kilpinen et al., Science (2013); PMID: 24136355
Kasowski et al., Science (2013); PMID: 24136358

Scripts

The AlleleDB pipeline uses the following scripts from GitHub to filter reads affected by ambiguous mapping bias and to detect allele-specific SNVs using beta-binomial calculations. The scripts are used in conjunction with the AlleleSeq pipeline (v1.2a; vcf2diploid tool v0.2.6 for personal genome construction):
  • alleleDB scripts v2.0
  • alleleDB scripts v1.0
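
To illustrate the kind of beta-binomial calculation mentioned above, here is a minimal Python sketch of a two-sided beta-binomial test on the two allele read counts. It is not the implementation in the alleleDB scripts. The function names, the parameterisation by an expected reference fraction `p` and an overdispersion `rho`, and the default `rho=0.1` are all assumptions chosen for illustration.

```python
from math import exp, lgamma


def _log_beta(a: float, b: float) -> float:
    """Natural log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def betabinom_pmf(k: int, n: int, a: float, b: float) -> float:
    """P(K = k) for a beta-binomial with n trials and shape parameters a, b."""
    log_choose = lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)
    return exp(log_choose + _log_beta(k + a, n - k + b) - _log_beta(a, b))


def two_sided_pvalue(ref_count: int, alt_count: int,
                     rho: float = 0.1, p: float = 0.5) -> float:
    """Two-sided p-value under a beta-binomial null with mean p.

    rho (0 < rho < 1) is the overdispersion; the default is an
    illustrative placeholder, not a value taken from AlleleDB.
    """
    n = ref_count + alt_count
    a = p * (1.0 - rho) / rho
    b = (1.0 - p) * (1.0 - rho) / rho
    observed = betabinom_pmf(ref_count, n, a, b)
    # Sum the probability of every outcome no more likely than the observed
    # one; the small tolerance guards against floating-point ties.
    total = sum(pmf for pmf in (betabinom_pmf(k, n, a, b) for k in range(n + 1))
                if pmf <= observed * (1.0 + 1e-9))
    return min(1.0, total)


# Compare a balanced pair of counts with a strongly skewed pair.
print(two_sided_pvalue(50, 50), two_sided_pvalue(95, 5))
```

Working in log space with `lgamma` keeps the calculation stable for large read counts, and a larger `rho` makes the null distribution wider, so the same allelic imbalance becomes less significant.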

Questions/Comments

Please contact J. Chen (jieming dot chen at yale dot edu) with questions, comments or feedback on AlleleDB.