Introduction

We trained an SVM classifier on a combination of Miwi CLIP-Seq-derived features and position-derived features to predict potential piRNA target sites on mRNAs. Applying the method to round spermatids from mouse testis, we predicted 3,781 mRNAs from 2,587 protein-coding genes as piRNA targets. With the source code provided below, the model can be trained and applied to other tissue types for which anti-Miwi CLIP-Seq data are available.

CITATION

Yuan, J., Zhang, P., Cui, Y., Wang, J., Skogerbo, G., Huang, D. W., Chen, R., He, S. Computational identification of piRNA targets on mouse mRNAs. Bioinformatics, 2016 Apr 15;32(8):1170-7.
PMID: 26677964

Run pirnaPre on Web Server


Click HERE to run pirnaPre on our web server.

Download

pirnaPre Version 1.0.0 is now available

pirnaPre 1.0.0 Download

piRNA_target_mRNA.lst Download

Installation

Before installing pirnaPre, make sure the following software is installed on your system (tested on Ubuntu 14.04):
1. Python 2.7.6
2. libx11-dev
3. gnuplot
4. libsvm

1. Install libx11-dev
$ sudo apt-get install libx11-dev

2. Install gnuplot
(1) Download gnuplot
http://sourceforge.net/projects/gnuplot/files/
(2) Build and install gnuplot
$ tar zxvf gnuplot-5.0.1.tar.gz
$ cd gnuplot-5.0.1/
$ ./configure
$ make
$ sudo make install

3. Install pirnaPre
(1) Download pirnaPre
pirnaPrev1.tar.gz
(2) Build the bundled LIBSVM
$ tar zxvf pirnaPrev1.tar.gz
$ cd pirnaPrev1
$ tar zxvf libsvm-3.20.tar.gz
$ cd libsvm-3.20
$ make
$ cd python
$ make
(3) Configure LIBSVM Tools
1. Run "which gnuplot" to get the gnuplot path, e.g. "/usr/local/bin/gnuplot".
2. Edit the gnuplot path in plotroc.py and grid.py if necessary.
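
The lookup in step 1 can also be done from Python, which may be convenient when checking several machines. The following is only a minimal sketch: the function name find_gnuplot is hypothetical and not part of pirnaPre or LIBSVM, and the script only reports the path; it does not edit plotroc.py or grid.py.

```python
# Locate the gnuplot executable on PATH (works on Python 2.7 and Python 3).
try:
    from shutil import which  # Python 3.3+
except ImportError:
    from distutils.spawn import find_executable as which  # Python 2.7


def find_gnuplot(name="gnuplot"):
    """Return the path of the named executable, or None if it is not on PATH.

    find_gnuplot is a hypothetical helper used for illustration only.
    """
    return which(name)


if __name__ == "__main__":
    path = find_gnuplot()
    if path is None:
        print("gnuplot was not found on PATH; install it first.")
    else:
        print("gnuplot found at: " + path)
```

If the reported path differs from the one written in plotroc.py or grid.py, edit those files by hand as described in step 2.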

4. pirnaPre Input Files Format Description

pirnaPre requires a positive and a negative data set (pre-defined piRNA-mediated cleavage sites on mRNAs, and sites most unlikely to be piRNA targets) to build the SVM model.
The format of the input file required for generating the model is:
   [label] [index1]:[value1] [index2]:[value2] ...

Each line represents an instance, with a label of -1 / +1 indicating a pre-defined false / true piRNA-mediated cleavage site on an mRNA. Each pair [index]:[value] gives a feature value: [index] is an integer starting from 1 and [value] is a real number.
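
As an illustration of this line format, the sketch below reads one instance back into a label and a feature dictionary. The function name parse_instance and the example line are hypothetical and are not taken from the pirnaPre source; the sketch only assumes the format described above.

```python
def parse_instance(line):
    """Split one '[label] [index]:[value] ...' line into (label, features).

    parse_instance is a hypothetical helper used for illustration only.
    """
    fields = line.split()
    label = int(fields[0])  # accepts "-1", "+1" and "1"
    if label not in (-1, 1):
        raise ValueError("label must be -1 or +1, got %r" % fields[0])
    features = {}
    for field in fields[1:]:
        index_text, value_text = field.split(":", 1)
        index = int(index_text)
        if index < 1:
            raise ValueError("feature indices start from 1, got %d" % index)
        features[index] = float(value_text)
    return label, features


if __name__ == "__main__":
    # Made-up example line showing only the first three features.
    print(parse_instance("+1 1:12 2:3 3:1"))
```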
Values for all 86 features are required:
   1 - quantitative value: the number of target fragments detected in the Miwi complex that map to the region from 150 nt upstream to 150 nt downstream of the given site (see Figure 1 for details);
   2 - quantitative value: the number of distinct piRNAs with 5’ ends located exactly 10 nt downstream of the given target site, based on base-pairing complementarity to the flanking sequence (see Figure 1 for details);
   3 - {0,1}, about conservation: 1 means that the given target is located in the 3’UTR of the host mRNA;
   4 - {0,1}, about conservation: 1 means that the given target is located in the 5’UTR of the host mRNA;
   5 - {0,1}, about conservation: 1 means that the given target is located in the coding region of the host mRNA;
   6 - {0,1}, about repeat: 1 means that the given target is located in a simple repeat;
   7 - {0,1}, about nucleotide usage: 1 means that the nucleotide at the 10th position upstream of the given cleavage target is A;
   8 - {0,1}, about nucleotide usage: 1 means that the nucleotide at the 10th position upstream of the given cleavage target is T;
   9 - {0,1}, about nucleotide usage: 1 means that the nucleotide at the 10th position upstream of the given cleavage target is C;
   10 - {0,1}, about nucleotide usage: 1 means that the nucleotide at the 10th position upstream of the given cleavage target is G;
   11~86 - {0,1}, about nucleotide usage: analogous to features 7~10, but for the following 19 downstream positions. As a whole, features 7~86 encode the nucleotide usage of the 20 nt sequence with the target site exactly in the middle.
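
To make the index layout concrete, the sketch below assembles one 86-feature line from already computed values. It is not the pirnaPre implementation: the function encode_site, its parameter names, and the region codes "3UTR", "5UTR" and "CDS" are hypothetical. It also assumes that features 11~86 repeat the A, T, C, G order of features 7~10 position by position, and that flank20 is the 20 nt sequence starting at the 10th position upstream of the site.

```python
NUCLEOTIDES = "ATCG"  # order of features 7-10; assumed to repeat for 11-86


def encode_site(label, clip_count, pirna_count, region, in_simple_repeat, flank20):
    """Return one '[label] [index]:[value] ...' line with 86 features.

    Hypothetical helper for illustration only.
    region is "3UTR", "5UTR", "CDS" or None (hypothetical codes).
    """
    if label not in (-1, 1):
        raise ValueError("label must be -1 or +1")
    if region not in ("3UTR", "5UTR", "CDS", None):
        raise ValueError("unknown region: %r" % (region,))
    flank20 = flank20.upper()
    if len(flank20) != 20 or set(flank20) - set(NUCLEOTIDES):
        raise ValueError("flank20 must be 20 characters from A, T, C, G")

    values = [
        clip_count,                # feature 1: CLIP-Seq fragments within +/-150 nt
        pirna_count,               # feature 2: distinct complementary piRNAs
        int(region == "3UTR"),     # feature 3
        int(region == "5UTR"),     # feature 4
        int(region == "CDS"),      # feature 5
        int(bool(in_simple_repeat)),  # feature 6
    ]
    # Features 7-86: four 0/1 indicators (A, T, C, G) per flanking position.
    for base in flank20:
        values.extend(int(base == nucleotide) for nucleotide in NUCLEOTIDES)

    pairs = " ".join("%d:%s" % (i, v) for i, v in enumerate(values, start=1))
    return "%+d %s" % (label, pairs)


if __name__ == "__main__":
    # Made-up values, shown only to illustrate the layout.
    print(encode_site(1, 12, 3, "3UTR", False, "ATCGATCGATCGATCGATCG"))
```

Writing all 86 pairs, including zeros, keeps each line easy to check by eye against the feature list.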

5. Run pirnaPre
$ cd ./pirnaPrev1/
$ sh run.sh

6. Output

---pirnaPre/
   ---tran_pirna_site.data-roc.png
   ---test_pirna_site.data
   ---tran_pirna_site.data

ANY QUESTIONS?  Contact Us: heshunmin@gmail.com