Parameters

The adjustable parameters are the minimum read count, whether to apply batch-effect correction and/or differential expression analysis, and the FDR threshold.

Optional parameters

Minimum RC

A feature is excluded from the analysis if its read count is below the given threshold. For example, a feature with a read count of 85 is excluded if the given threshold is 100.
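The filtering rule can be sketched in a few lines of Python. This is only an illustration, not NormSeq's actual implementation: the function name filter_by_min_rc and the feature names are hypothetical. The text does not say whether the threshold is compared per sample or against a feature total, so the sketch assumes the total across samples and exposes that choice as the aggregate parameter.

```python
def filter_by_min_rc(counts, min_rc, aggregate=sum):
    """Keep features whose aggregated read count reaches min_rc.

    counts: dict mapping feature name -> list of read counts (one per sample).
    aggregate: how per-sample counts are combined before the comparison.
               Using the total is an assumption; pass max or min to change it.
    """
    return {
        feature: values
        for feature, values in counts.items()
        if aggregate(values) >= min_rc
    }


# Hypothetical features and counts, chosen to mirror the example in the text.
counts = {
    "miR-21": [60, 70, 55],   # total 185, kept at a threshold of 100
    "miR-155": [30, 25, 30],  # total 85, excluded at a threshold of 100
}
kept = filter_by_min_rc(counts, min_rc=100)
print(sorted(kept))
```

Passing aggregate=max instead would keep a feature only if at least one sample reaches the threshold, which is a stricter rule for the data above.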

Batch effect correction

Batch effects can arise when a “batch” of samples is processed differently from other samples because of logistical or practical restrictions. The resulting differences between batches introduce technical variation into an experiment and can have an unfavorable impact on downstream biological analysis. Computational batch correction can be applied whenever experimental batch effects cannot be avoided. NormSeq applies the ComBat-Seq tool (Zhang et al, NAR Genomics and Bioinformatics, 2020, https://doi.org/10.1093/nargab/lqaa078).

If batch-effect correction is desired, simply provide a Batch Effect Annotation File. Again, the Batch Effect Annotation File can be uploaded directly from a computer or provided via a link/URL, in .txt/.csv/.tsv/.xls format. The Batch Effect Annotation File should contain one row for every sample in the raw read count table. The first column should be annotated as “sample” and the second column should be annotated as “batchEffect”. List the sample names under “sample” and assign the batch of each sample under “batchEffect”. A sample template can be downloaded for your convenience. An example can also be found under the header “Templates”.
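To make the expected layout concrete, the following Python sketch reads a tab-separated annotation file and checks it against the sample names of the count table. It is an illustration under stated assumptions, not the check NormSeq itself performs: the function name read_batch_annotation, the sample names S1 to S3 and the batch labels are all hypothetical, and the downloadable template may differ in detail.

```python
import csv
import io


def read_batch_annotation(handle, matrix_samples):
    """Parse a tab-separated annotation file into {sample: batch}."""
    reader = csv.DictReader(handle, delimiter="\t")
    header = reader.fieldnames or []
    if list(header[:2]) != ["sample", "batchEffect"]:
        raise ValueError("first two columns must be 'sample' and 'batchEffect'")
    batches = {row["sample"]: row["batchEffect"] for row in reader}
    # Every sample in the count table needs a batch assignment.
    missing = [s for s in matrix_samples if s not in batches]
    if missing:
        raise ValueError(f"samples without a batch assignment: {missing}")
    return batches


# Hypothetical file content: a header line, then one row per sample.
example = io.StringIO(
    "sample\tbatchEffect\n"
    "S1\tbatch1\n"
    "S2\tbatch1\n"
    "S3\tbatch2\n"
)
batches = read_batch_annotation(example, ["S1", "S2", "S3"])
print(batches)
```

For a comma-separated file, the same sketch works with delimiter="," in place of the tab.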

Differential Expression

Differential expression analysis is an optional module for the detection of differentially expressed RNA. NormSeq applies edgeR, DESeq2, NOISeq and the t-test for differential expression analysis. The analysis is performed on the sample input matrix that was provided. The False Discovery Rate (FDR) is used as the measure of statistical significance of the analysed set. The threshold is arbitrary but is generally set at 5% (0.05).
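As an illustration of how an FDR threshold is applied, the Python sketch below adjusts a set of p-values with the Benjamini-Hochberg procedure and keeps the features at or below 5%. The text does not state which FDR procedure each method uses, so Benjamini-Hochberg is an assumption here, and the feature names and p-values are made up for the example.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    n = len(p_values)
    # Indices of the p-values, sorted from smallest to largest.
    order = sorted(range(n), key=lambda i: p_values[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four features.
p_values = {"featureA": 0.001, "featureB": 0.02, "featureC": 0.04, "featureD": 0.3}
names = list(p_values)
fdr = benjamini_hochberg([p_values[name] for name in names])
significant = [name for name, q in zip(names, fdr) if q <= 0.05]
print(significant)
```

In this example featureC has a raw p-value below 0.05 but is not reported after adjustment, which is the kind of case the FDR threshold is meant to filter out.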