Shoudan Liang Bioinformatics and Computational Biology


deep sequencing-based Prediction of Imprinted Genes (dsPIG)


Overview

dsPIG (deep sequencing-based Prediction of Imprinted Genes) is a Bayesian model that predicts imprinted genes from the mRNA-Seq data of multiple independent tissues (the tissue types may be the same or different). dsPIG is applicable to all mammals with genomic imprinting. For details of the model, please refer to “dsPIG: a useful tool to predict imprinted genes from the deep sequencing of transcriptomes”. On this website, you may upload your mapped mRNA-Seq data (in a fixed format, as shown below), and our server will run dsPIG to predict imprinted genes for you. The results will be emailed back to you.

Alternatively, you can download the R package from this website and run dsPIG locally. The web service and the R package produce the same predictions of imprinted genes.


Example

First, in Section A, upload your mapped mRNA-Seq data in either SAM format (tab-delimited) or Solexa Eland format (space-delimited). Both formats should contain at least the following information: the sequences of the reads, and the chromosomes, strands and positions to which the reads have been mapped. Remember to specify, in Section B, the human genome build you used for mapping. Alternatively, if you want to reduce the calculation time and obtain the results more quickly, you may upload a processed data file that contains the allelic counts for each SNP. Here is a brief example:

 

              A     C     G     T
rs11538691    0     0     3     5
rs178412     10     0     0    12
rs17094371    8     0    13     1
rs2596331     6     8     1     0
rs8110904     0     0    21     0
….



The row names are the SNP IDs, and the column names are the four nucleotides (A/C/G/T). The numbers in each row are the counts of the four nucleotides at that SNP site, obtained from the mRNA-Seq data of a tissue sample.

Please note that each uploaded text file is for one single individual.
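Before uploading a processed file, it can help to check that it parses as the count table described above. The following is a minimal Python sketch, not part of dsPIG. It assumes the file is whitespace-delimited and that the header line lists only the four nucleotides, which this page does not state explicitly; the function name read_allelic_counts is hypothetical.

```python
import io

NUCLEOTIDES = ("A", "C", "G", "T")

def read_allelic_counts(handle):
    """Parse an allelic count table into {snp_id: {nucleotide: count}}.

    Assumption: whitespace-delimited text whose header line holds only
    the four nucleotide names (the page does not specify the delimiter).
    """
    header = tuple(handle.readline().split())
    if header != NUCLEOTIDES:
        raise ValueError(f"unexpected header: {header}")
    counts = {}
    for line in handle:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) != 5:
            raise ValueError(f"expected a SNP ID plus four counts: {line!r}")
        snp_id = fields[0]
        values = [int(v) for v in fields[1:]]  # fails loudly on non-integers
        counts[snp_id] = dict(zip(NUCLEOTIDES, values))
    return counts

# Two rows taken from the example table on this page.
example = io.StringIO(
    "A C G T\n"
    "rs11538691 0 0 3 5\n"
    "rs178412 10 0 0 12\n"
)
table = read_allelic_counts(example)
print(table["rs178412"])  # {'A': 10, 'C': 0, 'G': 0, 'T': 12}
```

To check a real file, pass an open file object instead of the in-memory example.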

After you upload the text files, you may set the dsPIG parameters in Section B, such as the sequencing error and the cut-off for the posteriors; otherwise, dsPIG will use the default values (those used in our paper) to predict imprinted genes. You also need to specify the SNP database and the human genome build that dsPIG should use. The final output is a text file in the following format:

SNP          Chr    Location    Str  Posterior  GeneID  Symbol    SS
rs11538691   chr17  4789783     +    1          5216    PFN1      15
rs178412     chr7   73173272    -    1          3984    LIMK1     13
rs17094371   chr14  57677831    +    1          145407  C14orf37  12
rs2596331    chr1   143820905   -    0.999      9554    SEC22B    11
rs11555395   chr17  67629054    +    0.998      6662    SOX9      8
rs4015375    chr7   89628110    +    0.988      26872   STEAP1    12
rs10208923   chr2   141157767   +    0.955      53353   LRP1B     9
rs10800864   chr1   201003241   +    0.910      10765   KDM5B     11
rs10306      chr10  74437407    -    0.888      5033    P4HA1     9
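Once the result file arrives, you may want to list the genes above a posterior threshold of your own choosing. The Python sketch below is one possible way to do this and is not part of dsPIG. It assumes the output is tab-delimited with the header line shown above, and that a higher posterior means stronger support for imprinting; neither point is stated on this page. The function name genes_above is hypothetical.

```python
import csv
import io

EXPECTED_COLUMNS = ["SNP", "Chr", "Location", "Str",
                    "Posterior", "GeneID", "Symbol", "SS"]

def genes_above(handle, min_posterior):
    """Return the sorted gene symbols with Posterior >= min_posterior.

    Assumption: the output file is tab-delimited and starts with the
    header line listed in EXPECTED_COLUMNS.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    if reader.fieldnames != EXPECTED_COLUMNS:
        raise ValueError(f"unexpected columns: {reader.fieldnames}")
    symbols = {
        row["Symbol"]
        for row in reader
        if float(row["Posterior"]) >= min_posterior
    }
    return sorted(symbols)

# Three rows taken from the example output on this page.
sample = io.StringIO(
    "SNP\tChr\tLocation\tStr\tPosterior\tGeneID\tSymbol\tSS\n"
    "rs11538691\tchr17\t4789783\t+\t1\t5216\tPFN1\t15\n"
    "rs2596331\tchr1\t143820905\t-\t0.999\t9554\tSEC22B\t11\n"
    "rs10306\tchr10\t74437407\t-\t0.888\t5033\tP4HA1\t9\n"
)
selected = genes_above(sample, min_posterior=0.99)
print(selected)  # ['PFN1', 'SEC22B']
```

A set is used so that a gene covered by several SNPs is reported only once.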

You may request additional analyses that cannot be performed by the dsPIG R package, such as (i) suggesting tissues in which known imprinted genes most likely have biallelic expression, and (ii) identifying SNPs for which one specific allele has a higher transcript level than the other across various tissues and individuals. Mention this request in Section B, along with the parameters you set for dsPIG. Please note that the additional analyses take longer to complete.


Upload File and Choose Parameters

Section A:

1. Upload your file using the following LINK.

2. Send an email with the following information and the file link (please choose a unique file name so that your file is not overwritten).

Cut-off for the posterior = 0.2
Sequencing error = 0.02
Prior = 0.01
QS < 0.9
SNP database version: (default value = SNP129).
Human genome build version: (default value = hg18).
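If you submit several files, a small script can keep the parameter block consistent between emails. The Python sketch below only assembles text in the layout listed above; it is not an official submission client. The function name build_request and the "File link" line are hypothetical, and the "QS < 0.9" line is left out because this page does not define it (copy it by hand if you need it).

```python
# Default values as listed on this page.
DEFAULTS = {
    "Cut-off for the posterior": "0.2",
    "Sequencing error": "0.02",
    "Prior": "0.01",
    "SNP database version": "SNP129",
    "Human genome build version": "hg18",
}

def build_request(file_link, overrides=None):
    """Return the email text: the file link followed by one parameter per line.

    Unknown parameter names are rejected so that typos do not silently
    fall back to the defaults.
    """
    params = dict(DEFAULTS)
    for name, value in (overrides or {}).items():
        if name not in DEFAULTS:
            raise KeyError(f"unknown parameter: {name}")
        params[name] = str(value)
    lines = [f"File link: {file_link}"]  # hypothetical label for the link
    lines += [f"{name} = {value}" for name, value in params.items()]
    return "\n".join(lines)

# Placeholder link; replace it with the link to your uploaded file.
message = build_request("LINK_TO_YOUR_FILE",
                        {"Human genome build version": "hg19"})
print(message)
```

Only the parameters you override change; all others keep the defaults shown on this page.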


Download the R Package

The dsPIG R package (version 3.0) is available for download here: Windows version and Unix version. The instructions and sample files for dsPIG are provided here.

The annotated code for dsPIG used in our study (including R code and C code) is also available here.


Credit

If you have used this web service in your research, please cite the paper “dsPIG: a useful tool to predict imprinted genes from the deep sequencing of transcriptomes”.