sRNAblast

Profiling of small RNAs of unknown origin

Description

This tool is intended for the analysis of reads that could not be mapped using sRNAbench or other profiling tools. The results can point either to contamination sources or to biologically meaningful information, such as the presence of unexpected viral or bacterial RNA molecules.

Input

Datasets must be provided either by uploading a file from a local computer or via a URL. sRNAblast accepts the same input formats as sRNAbench. In general, all formats can be compressed with gzip. Additionally, an sRNAbench ID can be used as input; in this case only the unmapped reads of that sRNAbench job are used, i.e. the tool determines the origin of those unmapped reads.
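
For local pre-processing of such input files, the gzip handling can be sketched as follows. This is a minimal illustration, not code from sRNAblast: the function name open_reads is hypothetical, and the check relies only on the standard two-byte gzip signature.

```python
import gzip


def open_reads(path):
    """Open a read file as text, decompressing it if it is gzip-compressed.

    Hypothetical helper for illustration; not part of sRNAblast.
    """
    # Every gzip stream starts with the two magic bytes 0x1f 0x8b.
    with open(path, "rb") as handle:
        magic = handle.read(2)
    if magic == b"\x1f\x8b":
        return gzip.open(path, "rt")
    return open(path, "r")
```

Checking the signature rather than the file extension means a compressed file is still read correctly when it lacks a .gz suffix.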

Adapter removal: sRNAblast can perform the adapter trimming. By default, the web-server version searches for the first 10 bases of the adapter, allowing a maximum of one mismatch. It is recommended to provide the adapter sequence or to select one of the options offered by the application, which are the most common adapters used in microRNA analysis: Illumina RA3, Illumina (alternative) or SOLiD (SREK). If the adapter is not known, the "guess the adapter sequence" option can be activated, although this is not recommended. In that case, sRNAblast aligns the first 250,000 reads to the genome using the bowtie seed functionality (so adapter bases do not count towards the mismatches). Across all aligned reads, the adapter sequence is then defined as the most frequent 10-mer starting at the first mismatch. Lastly, when the adapter starts at the very end of the read, the sequenced part of the adapter can be shorter than the length threshold; in that case it is searched for recursively, without applying the minimum length.
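
The default trimming rule described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the sRNAblast implementation: the function names are hypothetical, the leftmost seed hit is taken as the adapter start, and the "recursive" search is interpreted as trying progressively shorter adapter prefixes (matched exactly) at the 3' end of the read.

```python
def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    return sum(x != y for x, y in zip(a, b))


def find_adapter(read, adapter, seed_len=10, max_mismatches=1, min_tail=1):
    """Return the index where the adapter starts in the read, or None.

    Hypothetical helper: seed_len and max_mismatches mirror the defaults
    quoted in the text; min_tail is an assumed knob for the tail search.
    """
    seed = adapter[:seed_len]
    # Step 1: look for the full seed anywhere in the read (leftmost hit).
    for start in range(len(read) - seed_len + 1):
        if hamming(read[start:start + seed_len], seed) <= max_mismatches:
            return start
    # Step 2: the adapter may be cut off by the end of the read, so try
    # shorter and shorter seed prefixes against the read's 3' end.
    # Assumption: these truncated matches must be exact.
    for k in range(seed_len - 1, min_tail - 1, -1):
        if k > 0 and read.endswith(seed[:k]):
            return len(read) - k
    return None


def trim_read(read, adapter, **kwargs):
    """Return the read with the adapter (and everything after it) removed."""
    start = find_adapter(read, adapter, **kwargs)
    return read if start is None else read[:start]
```

With min_tail=1, a read that merely ends in the first base of the adapter loses that base; raising min_tail trades sensitivity for fewer chance trims.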

Results

  1. This table provides the number of reads detected in each taxonomic group:
     a. Taxonomy: Name of the taxonomic group
     b. Read Count: Number of reads detected in this taxonomic group
     c. Percentage Read Count: Percentage of reads detected in this taxonomic group

  2. This table provides the number of reads detected in each species:
     a. Species: Species name
     b. Read Count: Number of reads detected for this species
     c. Percentage Read Count: Percentage of reads detected for this species

  3. In this section, the full result can be downloaded as a zip file.

  4. This table contains the number of reads for each BLAST result:
     a. Query id: Identifier of the query sequence
     b. Read count: Number of reads for this subject sequence
     c. Subject id: Identifier of the subject sequence
     d. Evalue: BLAST e-value
     e. Percentage of Identity: BLAST percentage of identity
     f. Species Name: Species of the subject sequence
     g. Species Title: NCBI title of the subject sequence
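
To show how tables 1 and 2 relate to the per-hit table in 4, the following sketch sums read counts per group and converts them to percentages. It is an illustration only: the dictionary keys and the function name are hypothetical, and the percentages are assumed to be relative to the total number of reads with a BLAST hit, which the description above does not state.

```python
from collections import Counter


def summarize_by(hits, key):
    """Sum read counts per value of `key` and add each group's percentage.

    `hits` is a list of dicts with a "read_count" entry plus the grouping
    key (hypothetical layout, loosely following the columns of table 4).
    """
    totals = Counter()
    for hit in hits:
        totals[hit[key]] += hit["read_count"]
    grand_total = sum(totals.values())
    if grand_total == 0:
        return []
    # Assumption: percentages refer to all reads that produced a hit.
    return [
        (name, count, 100.0 * count / grand_total)
        for name, count in totals.most_common()
    ]


# Made-up example data, for illustration only.
example_hits = [
    {"query_id": "q1", "read_count": 60, "species": "Escherichia coli", "taxonomy": "Bacteria"},
    {"query_id": "q2", "read_count": 30, "species": "Homo sapiens", "taxonomy": "Eukaryota"},
    {"query_id": "q3", "read_count": 10, "species": "Escherichia coli", "taxonomy": "Bacteria"},
]
for row in summarize_by(example_hits, "species"):
    print(row)
```

Calling summarize_by(example_hits, "taxonomy") on the same data gives the per-group view corresponding to table 1.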