Helper Tools

sRNAtoolbox implements several ‘helper tools’ that help build a local sRNAbench database and/or prepare input data for some of the other tools. The six helper tools are described in more detail below.

Ensembl Parser

sRNAbench reads specially prepared annotations in which the transcript name and the classification are separated by ‘:’. The 'Ensembl Parser' reads an Ensembl fasta annotation file and converts it to the sRNAbench format. It accepts the cDNA and ncRNA fasta files provided by Ensembl.
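The header conversion can be sketched in Python as follows. This is only an illustration, not the parser's actual code: it assumes that the Ensembl header carries `transcript_biotype:` or `gene_biotype:` fields and that the biotype serves as the classification. The function names and the `unknown` fallback are hypothetical.

```python
import re


def ensembl_header_to_srnabench(header: str) -> str:
    """Build 'transcript:classification' from one Ensembl fasta header.

    Assumption: the classification is taken from the transcript_biotype
    field, falling back to gene_biotype, then to the label 'unknown'.
    """
    header = header.lstrip(">").strip()
    transcript_id = header.split()[0]
    match = re.search(r"\btranscript_biotype:(\S+)", header)
    if match is None:
        match = re.search(r"\bgene_biotype:(\S+)", header)
    biotype = match.group(1) if match else "unknown"
    return f"{transcript_id}:{biotype}"


def convert_ensembl_fasta(lines):
    """Rewrite header lines and pass sequence lines through unchanged."""
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            yield ">" + ensembl_header_to_srnabench(line)
        else:
            yield line
```

Because `convert_ensembl_fasta` works on any iterable of lines, it can be fed from an open file handle (for example one returned by `gzip.open(path, "rt")`) without loading the whole file into memory.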

NCBI Parser

sRNAbench reads specially prepared annotations in which the transcript name and the classification are separated by ‘:’. The 'NCBI Parser' reads an NCBI fasta annotation file and converts it to the sRNAbench format. It accepts the *.rna.fna.gz files provided by NCBI.

RNA central Parser

This parser extracts the non-coding sequences of a given species or taxonomic level from the RNA central database (http://rnacentral.org/), version 2 (February 2015), and prepares libraries in the sRNAbench format.

Genomic tRNA parser

tRNA annotations can be obtained by parsing the entries of a species out of the Genomic tRNA Database (http://gtrnadb.ucsc.edu/).

Remove Duplicates from a Fasta File and manipulate names

This tool can:

  • Detect and remove duplicated IDs
  • Detect and remove duplicated sequences
  • Detect and remove duplicated sequences and generate a new ID by joining the IDs of all entries that share the same sequence
  • Manipulate the sequence names (remove a given string)
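The four operations listed above can be sketched in Python as shown below. This is a minimal illustration under stated assumptions, not the tool's implementation: sequences are compared as exact strings, the first occurrence of a duplicate is kept, and merged IDs are joined with a `|` separator. All function names and the separator are hypothetical.

```python
def read_fasta(lines):
    """Yield (name, sequence) pairs from an iterable of fasta lines."""
    name, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            name, chunks = line[1:], []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def drop_duplicate_ids(records):
    """Keep only the first record seen for each ID."""
    seen, kept = set(), []
    for name, seq in records:
        if name not in seen:
            seen.add(name)
            kept.append((name, seq))
    return kept


def drop_duplicate_sequences(records, merge_ids=False, sep="|"):
    """Keep one record per sequence; optionally join the IDs that share it."""
    names_by_seq = {}  # dicts preserve insertion order (Python 3.7+)
    for name, seq in records:
        names_by_seq.setdefault(seq, []).append(name)
    return [
        (sep.join(names) if merge_ids else names[0], seq)
        for seq, names in names_by_seq.items()
    ]


def strip_from_names(records, text):
    """Remove every occurrence of `text` from the sequence names."""
    return [(name.replace(text, ""), seq) for name, seq in records]
```

Each function takes and returns a list of `(name, sequence)` pairs, so the steps can be chained in any order, for example stripping a string from the names before checking for duplicated IDs.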

Extract Sequences from a Fasta File

This parser lets you specify a search criterion for fasta sequence names. For example, the names of microRNA sequences in miRBase start with a species code such as 'hsa' (Homo sapiens). Given a miRBase fasta file and the criterion 'hsa', the program returns only the human (hsa) sequences from the file.
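A minimal Python sketch of this kind of filter is given below. It assumes the criterion is matched against the start of the sequence name, as in the 'hsa' example. The tool itself may match differently, and the function name is hypothetical.

```python
def extract_by_prefix(lines, prefix):
    """Yield the fasta records whose name starts with `prefix`.

    Assumption: prefix matching on the text right after '>'.
    """
    keep = False
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(">"):
            # Decide once per record; its sequence lines follow the header.
            keep = line[1:].startswith(prefix)
        if keep:
            yield line
```

Calling `extract_by_prefix(handle, "hsa")` on an open miRBase fasta file would yield the header and sequence lines of the human entries only.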