Tools4miRs

Tools4miRs is a database of publications on tools for the broadly defined analysis of miRNA molecules. The publications collected so far describe algorithms and methods for predicting miRNA precursors, predicting miRNA targets, predicting novel miRNAs, and analysing data from small RNA sequencing, as well as a variety of databases. In the future, a service bringing all of these tools together will also be available here.

[1]
M. Zorc, D. Jevsinek Skok, I. Godnic, G. A. Calin, S. Horvat, Z. Jiang, P. Dovc, and T. Kunej, “Catalog of MicroRNA Seed Polymorphisms in Vertebrates,” PLoS ONE, vol. 7, no. 1, p. e30737, Jan. 2012.

Abstract: MicroRNAs (miRNAs) are a class of non-coding RNA that plays an important role in posttranscriptional regulation of mRNA. Evidence has shown that miRNA gene variability might interfere with its function resulting in phenotypic variation and disease susceptibility. A major role in miRNA target recognition is ascribed to complementarity with the miRNA seed region that can be affected by polymorphisms. In the present study, we developed an online tool for the detection of miRNA polymorphisms (miRNA SNiPer) in vertebrates (http://www.integratomics-time.com/miRNA-SNiPer) and generated a catalog of miRNA seed region polymorphisms (miR-seed-SNPs) consisting of 149 SNPs in six species. Although a majority of detected polymorphisms were due to point mutations, two consecutive nucleotide substitutions (double nucleotide polymorphisms, DNPs) were also identified in nine miRNAs. We determined that miR-SNPs are frequently located within the quantitative trait loci (QTL), chromosome fragile sites, and cancer susceptibility loci, indicating their potential role in the genetic control of various complex traits. To test this further, we performed an association analysis between the mmu-miR-717 seed SNP rs30372501, which is polymorphic in a large number of standard inbred strains, and all phenotypic traits in these strains deposited in the Mouse Phenome Database. Analysis showed a significant association between the mmu-miR-717 seed SNP and a diverse array of traits including behavior, blood-clinical chemistry, body weight size and growth, and immune system suggesting that seed SNPs can indeed have major pleiotropic effects. The bioinformatics analyses, data and tools developed in the present study can serve researchers as a starting point in testing more targeted hypotheses and designing experiments using optimal species or strains for further mechanistic studies.
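Illustrative note: the idea of flagging a polymorphism that falls inside a miRNA seed region can be sketched in Python as below. This is not the miRNA SNiPer implementation. It assumes the commonly used definition of the seed as nucleotides 2-8 of the mature miRNA and 1-based inclusive genomic coordinates; the function and parameter names are hypothetical.

```python
def seed_interval(mature_start, mature_end, strand, seed_first=2, seed_last=8):
    """Return (start, end) genomic coordinates of the seed, 1-based inclusive.

    Assumption: the seed spans nucleotides seed_first..seed_last counted
    from the 5' end of the mature miRNA.
    """
    if strand == "+":
        # 5' end of the miRNA is at the lowest genomic coordinate.
        return mature_start + seed_first - 1, mature_start + seed_last - 1
    if strand == "-":
        # 5' end of the miRNA is at the highest genomic coordinate.
        return mature_end - seed_last + 1, mature_end - seed_first + 1
    raise ValueError("strand must be '+' or '-'")


def snp_in_seed(snp_pos, mature_start, mature_end, strand):
    """True if a single-nucleotide variant position lies within the seed."""
    start, end = seed_interval(mature_start, mature_end, strand)
    return start <= snp_pos <= end


# Hypothetical mature miRNA at positions 100-121 on each strand.
print(seed_interval(100, 121, "+"))
print(snp_in_seed(115, 100, 121, "-"))
```

A real catalog would also need to match chromosomes and genome assemblies between the miRNA annotation and the SNP table, which this sketch leaves out.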

1.
Liu, H., Yue, D., Chen, Y., Gao, S.-J. & Huang, Y. Improving performance of mammalian microRNA target prediction. BMC Bioinformatics 11, 476 (2010).

Abstract: PMID: 20860840

[1]
Y. Friedman, G. Naamati, and M. Linial, “MiRror: a combinatorial analysis web tool for ensembles of microRNAs and their targets,” Bioinformatics, vol. 26, no. 15, pp. 1920–1921, Aug. 2010.

Abstract: Summary: The miRror application provides insights on microRNA (miRNA) regulation. It is based on the notion of a combinatorial regulation by an ensemble of miRNAs or genes. miRror integrates predictions from a dozen of miRNA resources that are based on complementary algorithms into a unified statistical framework. For miRNAs set as input, the online tool provides a ranked list of targets, based on set of resources selected by the user, according to their significance of being coordinately regulated. Symmetrically, a set of genes can be used as input to suggest a set of miRNAs. The user can restrict the analysis for the preferred tissue or cell line. miRror is suitable for analyzing results from miRNAs profiling, proteomics and gene expression arrays. Availability: http://www.proto.cs.huji.ac.il/mirror Contact: michall@cc.huji.ac.il

[1]
A. Kozomara and S. Griffiths-Jones, “miRBase: annotating high confidence microRNAs using deep sequencing data,” Nucl. Acids Res., vol. 42, no. D1, pp. D68–D73, Jan. 2014.

Abstract: We describe an update of the miRBase database (http://www.mirbase.org/), the primary microRNA sequence repository. The latest miRBase release (v20, June 2013) contains 24 521 microRNA loci from 206 species, processed to produce 30 424 mature microRNA products. The rate of deposition of novel microRNAs and the number of researchers involved in their discovery continue to increase, driven largely by small RNA deep sequencing experiments. In the face of these increases, and a range of microRNA annotation methods and criteria, maintaining the quality of the microRNA sequence data set is a significant challenge. Here, we describe recent developments of the miRBase database to address this issue. In particular, we describe the collation and use of deep sequencing data sets to assign levels of confidence to miRBase entries. We now provide a high confidence subset of miRBase entries, based on the pattern of mapped reads. The high confidence microRNA data set is available alongside the complete microRNA collection at http://www.mirbase.org/. We also describe embedding microRNA-specific Wikipedia pages on the miRBase website to encourage the microRNA community to contribute and share textual and functional information.

[1]
C.-J. Chen, N. Servant, J. Toedling, A. Sarazin, A. Marchais, E. Duvernois-Berthet, V. Cognat, V. Colot, O. Voinnet, E. Heard, C. Ciaudo, and E. Barillot, “ncPRO-seq: a tool for annotation and profiling of ncRNAs in sRNA-seq data,” Bioinformatics, vol. 28, no. 23, pp. 3147–3149, Dec. 2012.

Abstract: Summary: Non-coding RNA (ncRNA) PROfiling in small RNA (sRNA)-seq (ncPRO-seq) is a stand-alone, comprehensive and flexible ncRNA analysis pipeline. It can interrogate and perform detailed profiling analysis on sRNAs derived from annotated non-coding regions in miRBase, Rfam and RepeatMasker, as well as specific regions defined by users. The ncPRO-seq pipeline performs both gene-based and family-based analyses of sRNAs. It also has a module to identify regions significantly enriched with short reads, which cannot be classified under known ncRNA families, thus enabling the discovery of previously unknown ncRNA- or small interfering RNA (siRNA)-producing regions. The ncPRO-seq pipeline supports input read sequences in fastq, fasta and color space format, as well as alignment results in BAM format, meaning that sRNA raw data from the three current major platforms (Roche-454, Illumina-Solexa and Life technologies-SOLiD) can be analyzed with this pipeline. The ncPRO-seq pipeline can be used to analyze read and alignment data, based on any sequenced genome, including mammals and plants. Availability: Source code, annotation files, manual and online version are available at http://ncpro.curie.fr/. Contact: bioinfo.ncproseq@curie.fr or cciaudo@ethz.ch Supplementary information: Supplementary data are available at Bioinformatics online.

[1]
C. E. Vejnar, M. Blum, and E. M. Zdobnov, “miRmap web: comprehensive microRNA target prediction online,” Nucl. Acids Res., vol. 41, no. W1, pp. W165–W168, Jul. 2013.

Abstract: MicroRNAs (miRNAs) posttranscriptionally repress the expression of protein-coding genes. Based on the partial complementarity between miRNA and messenger RNA pairs with a mandatory so-called ‘seed’ sequence, many thousands of potential targets can be identified. Our open-source software library, miRmap, ranks these potential targets with a biologically meaningful criterion, the repression strength. MiRmap combines thermodynamic, evolutionary, probabilistic and sequence-based features, which cover features from TargetScan, PITA, PACMIT and miRanda. Our miRmap web application offers a user-friendly and feature-rich resource for browsing precomputed miRNA target predictions for model organisms, as well as for predicting and ranking targets for user-submitted sequences. MiRmap web integrates sorting, filtering and exporting of results from multiple queries, as well as providing programmatic access, and is available at http://mirmap.ezlab.org.

[1]
C. Addo-Quaye, W. Miller, and M. J. Axtell, “CleaveLand: a pipeline for using degradome data to find cleaved small RNA targets,” Bioinformatics, vol. 25, no. 1, pp. 130–131, Jan. 2009.

Abstract: Summary: MicroRNAs (miRNAs) are ∼20- to 22-nt long endogenous RNA sequences that play a critical role in the regulation of gene expression in eukaryotic genomes. Confident identification of miRNA targets is vital to understand their functions. Currently available computational algorithms for miRNA target prediction have diverse degrees of sensitivity and specificity and as a consequence each predicted target generally requires experimental confirmation. miRNAs and other small RNAs that direct endonucleolytic cleavage of target mRNAs produce diagnostic uncapped, polyadenylated mRNA fragments. Degradome sequencing [also known as PARE (parallel analysis of RNA ends) and GMUCT (genome-wide mapping of uncapped transcripts)] samples the 5′-ends of uncapped mRNAs and can be used to discover in vivo miRNA targets independent of computational predictions. Here, we describe a generalizable computational pipeline, CleaveLand, for the detection of cleaved miRNA targets from degradome data. CleaveLand takes as input degradome sequences, small RNAs and an mRNA database and outputs small RNA targets. CleaveLand can thus be applied to degradome data from any species provided a set of mRNA transcripts and a set of query miRNAs or other small RNAs are available. Availability: The code and documentation for CleaveLand is freely available under a GNU license at http://www.bio.psu.edu/people/faculty/Axtell/AxtellLab/Software.html Contact: mja18@psu.edu

[1]
M. Kertesz, N. Iovino, U. Unnerstall, U. Gaul, and E. Segal, “The role of site accessibility in microRNA target recognition,” Nat Genet, vol. 39, no. 10, pp. 1278–1284, Oct. 2007.

Abstract: MicroRNAs are key regulators of gene expression, but the precise mechanisms underlying their interaction with their mRNA targets are still poorly understood. Here, we systematically investigate the role of target-site accessibility, as determined by base-pairing interactions within the mRNA, in microRNA target recognition. We experimentally show that mutations diminishing target accessibility substantially reduce microRNA-mediated translational repression, with effects comparable to those of mutations that disrupt sequence complementarity. We devise a parameter-free model for microRNA-target interaction that computes the difference between the free energy gained from the formation of the microRNA-target duplex and the energetic cost of unpairing the target to make it accessible to the microRNA. This model explains the variability in our experiments, predicts validated targets more accurately than existing algorithms, and shows that genomes accommodate site accessibility by preferentially positioning targets in highly accessible regions. Our study thus demonstrates that target accessibility is a critical factor in microRNA function.
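Illustrative note: the energy difference described in this abstract can be written as a small Python sketch. It takes the two energies as already-computed inputs (in practice they would come from RNA folding software) and treats both as non-negative magnitudes in kcal/mol, which is a simplifying convention chosen here and not necessarily the sign convention of the published model. All names and example values are hypothetical.

```python
def net_binding_gain(duplex_gain, unpairing_cost):
    """Free energy gained from duplex formation minus the cost of
    unpairing the target site, both given as non-negative magnitudes."""
    if duplex_gain < 0 or unpairing_cost < 0:
        raise ValueError("energies must be given as non-negative magnitudes")
    return duplex_gain - unpairing_cost


def rank_sites(sites):
    """Sort (site_id, duplex_gain, unpairing_cost) tuples so that the
    site with the largest net gain comes first."""
    return sorted(sites, key=lambda s: net_binding_gain(s[1], s[2]), reverse=True)


# Made-up example values: site_B pairs less strongly but is far more accessible.
sites = [("site_A", 22.0, 15.0), ("site_B", 18.0, 4.0)]
for site_id, gain, cost in rank_sites(sites):
    print(site_id, net_binding_gain(gain, cost))
```

The example shows why accessibility matters in this kind of score: a site with a weaker duplex can still rank higher when little energy is needed to open the surrounding mRNA structure.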

1.
Hsu, S.-D. et al. miRTarBase update 2014: an information resource for experimentally validated miRNA-target interactions. Nucl. Acids Res. 42, D78–D85 (2014).

Abstract: MicroRNAs (miRNAs) are small non-coding RNA molecules capable of negatively regulating gene expression to control many cellular mechanisms. The miRTarBase database (http://mirtarbase.mbc.nctu.edu.tw/) provides the most current and comprehensive information of experimentally validated miRNA-target interactions. The database was launched in 2010 with data sources for >100 published studies in the identification of miRNA targets, molecular networks of miRNA targets and systems biology, and the current release (2013, version 4) includes significant expansions and enhancements over the initial release (2010, version 1). This article reports the current status of and recent improvements to the database, including (i) a 14-fold increase to miRNA-target interaction entries, (ii) a miRNA-target network, (iii) expression profile of miRNA and its target gene, (iv) miRNA target-associated diseases and (v) additional utilities including an upgrade reminder and an error reporting/user feedback system.

1.
Paraskevopoulou, M. D. et al. DIANA-microT web server v5.0: service integration into miRNA functional analysis workflows. Nucl. Acids Res. 41, W169–W173 (2013).

Abstract: MicroRNAs (miRNAs) are small endogenous RNA molecules that regulate gene expression through mRNA degradation and/or translation repression, affecting many biological processes. DIANA-microT web server (http://www.microrna.gr/webServer) is dedicated to miRNA target prediction/functional analysis, and it is being widely used from the scientific community, since its initial launch in 2009. DIANA-microT v5.0, the new version of the microT server, has been significantly enhanced with an improved target prediction algorithm, DIANA-microT-CDS. It has been updated to incorporate miRBase version 18 and Ensembl version 69. The in silico-predicted miRNA–gene interactions in Homo sapiens, Mus musculus, Drosophila melanogaster and Caenorhabditis elegans exceed 11 million in total. The web server was completely redesigned, to host a series of sophisticated workflows, which can be used directly from the on-line web interface, enabling users without the necessary bioinformatics infrastructure to perform advanced multi-step functional miRNA analyses. For instance, one available pipeline performs miRNA target prediction using different thresholds and meta-analysis statistics, followed by pathway enrichment analysis. DIANA-microT web server v5.0 also supports a complete integration with the Taverna Workflow Management System (WMS), using the in-house developed DIANA-Taverna Plug-in. This plug-in provides ready-to-use modules for miRNA target prediction and functional analysis, which can be used to form advanced high-throughput analysis pipelines.

[1]
Q. Jiang, Y. Wang, Y. Hao, L. Juan, M. Teng, X. Zhang, M. Li, G. Wang, and Y. Liu, “miR2Disease: a manually curated database for microRNA deregulation in human disease,” Nucl. Acids Res., vol. 37, no. suppl 1, pp. D98–D104, Jan. 2009.

Abstract: ‘miR2Disease’, a manually curated database, aims at providing a comprehensive resource of microRNA deregulation in various human diseases. The current version of miR2Disease documents 1939 curated relationships between 299 human microRNAs and 94 human diseases by reviewing more than 600 published papers. Around one-seventh of the microRNA–disease relationships represent the pathogenic roles of deregulated microRNA in human disease. Each entry in the miR2Disease contains detailed information on a microRNA–disease relationship, including a microRNA ID, the disease name, a brief description of the microRNA–disease relationship, an expression pattern of the microRNA, the detection method for microRNA expression, experimentally verified target gene(s) of the microRNA and a literature reference. miR2Disease provides a user-friendly interface for a convenient retrieval of each entry by microRNA ID, disease name, or target gene. In addition, miR2Disease offers a submission page that allows researchers to submit established microRNA–disease relationships that are not documented. Once approved by the submission review committee, the submitted records will be included in the database. miR2Disease is freely available at http://www.miR2Disease.org.

1.
Zhang, Y. et al. CPSS: a computational platform for the analysis of small RNA deep sequencing data. Bioinformatics 28, 1925–1927 (2012).

Abstract: Summary: Next generation sequencing (NGS) techniques have been widely used to document the small ribonucleic acids (RNAs) implicated in a variety of biological, physiological and pathological processes. An integrated computational tool is needed for handling and analysing the enormous datasets from small RNA deep sequencing approach. Herein, we present a novel web server, CPSS (a computational platform for the analysis of small RNA deep sequencing data), designed to completely annotate and functionally analyse microRNAs (miRNAs) from NGS data on one platform with a single data submission. Small RNA NGS data can be submitted to this server with analysis results being returned in two parts: (i) annotation analysis, which provides the most comprehensive analysis for small RNA transcriptome, including length distribution and genome mapping of sequencing reads, small RNA quantification, prediction of novel miRNAs, identification of differentially expressed miRNAs, piwi-interacting RNAs and other non-coding small RNAs between paired samples and detection of miRNA editing and modifications and (ii) functional analysis, including prediction of miRNA targeted genes by multiple tools, enrichment of gene ontology terms, signalling pathway involvement and protein–protein interaction analysis for the predicted genes. CPSS, a ready-to-use web server that integrates most functions of currently available bioinformatics tools, provides all the information wanted by the majority of users from small RNA deep sequencing datasets. Availability: CPSS is implemented in PHP/PERL+MySQL+R and can be freely accessed at http://mcg.ustc.edu.cn/db/cpss/index.html or http://mcg.ustc.edu.cn/sdap1/cpss/index.html. Contact: xueyu@mail.hust.edu.cn or qshi@ustc.edu.cn Supplementary information: Supplementary data are available at Bioinformatics online.

1.
Milev, I., Yahubyan, G., Minkov, I. & Baev, V. miRTour: Plant miRNA and target prediction tool. Bioinformation 6, 248–249 (2011).

Abstract: MicroRNAs (miRNAs) are important negative regulators of gene expression in plant and animals, which are endogenously produced from their own genes. Computational comparative approach based on evolutionary conservation of mature miRNAs has revealed a number of orthologs of known miRNAs in different plant species. The homology-based plant miRNA discovery, followed by target prediction, comprises several steps, which have been done so far manually. Here, we present the bioinformatics pipeline miRTour which automates all the steps of miRNA similarity search, miRNA precursor selection, target prediction and annotation, each of them performed with the same set of input sequences.

[1]
S. Kadri, V. Hinman, and P. V. Benos, “HHMMiR: efficient de novo prediction of microRNAs using hierarchical hidden Markov models,” BMC Bioinformatics, vol. 10, no. Suppl 1, p. S35, Jan. 2009.

Abstract: Background: MicroRNAs (miRNAs) are small non-coding single-stranded RNAs (20–23 nts) that are known to act as post-transcriptional and translational regulators of gene expression. Although, they were initially overlooked, their role in many important biological processes, such as development, cell differentiation, and cancer has been established in recent times. In spite of their biological significance, the identification of miRNA genes in newly sequenced organisms is still based, to a large degree, on extensive use of evolutionary conservation, which is not always available. Results: We have developed HHMMiR, a novel approach for de novo miRNA hairpin prediction in the absence of evolutionary conservation. Our method implements a Hierarchical Hidden Markov Model (HHMM) that utilizes region-based structural as well as sequence information of miRNA precursors. We first established a template for the structure of a typical miRNA hairpin by summarizing data from publicly available databases. We then used this template to develop the HHMM topology. Conclusion: Our algorithm achieved average sensitivity of 84% and specificity of 88%, on 10-fold cross-validation of human miRNA precursor data. We also show that this model, trained on human sequences, works well on hairpins from other vertebrate as well as invertebrate species. Furthermore, the human trained model was able to correctly classify ~97% of plant miRNA precursors. The success of this approach in such a diverse set of species indicates that sequence conservation is not necessary for miRNA prediction. This may lead to efficient prediction of miRNA genes in virtually any organism.
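Illustrative note: sensitivity and specificity figures such as those quoted in this abstract are computed from a classifier's true and false calls on a labelled test set. The Python sketch below shows the standard definitions only; it is not the HHMMiR evaluation code, and the labels in the example are made up.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels.

    y_true, y_pred: equal-length sequences of truthy/falsy values, where a
    truthy value means 'is a miRNA precursor'.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("label sequences must have the same length")
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    tn = sum(1 for t, p in zip(y_true, y_pred) if not t and not p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative example")
    sensitivity = tp / (tp + fn)  # fraction of real precursors recovered
    specificity = tn / (tn + fp)  # fraction of negatives correctly rejected
    return sensitivity, specificity


# Made-up labels: 4 real precursors and 4 negative hairpins.
truth = [1, 1, 1, 1, 0, 0, 0, 0]
calls = [1, 1, 1, 0, 0, 0, 0, 1]
print(sensitivity_specificity(truth, calls))
```

In a k-fold cross-validation the same calculation would be repeated on each held-out fold and the results averaged.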

[1]
Z. Zhang, L. Jiang, J. Wang, P. Gu, and M. Chen, “MTide: an integrated tool for the identification of miRNA–target interaction in plants,” Bioinformatics, vol. 31, no. 2, pp. 290–291, Jan. 2015.

Abstract: Motivation: Small RNA sequencing and degradome sequencing (also known as parallel analysis of RNA ends) have provided rich information on the microRNA (miRNA) and its cleaved mRNA targets on a genome-wide scale in plants, but no computational tools have been developed to effectively and conveniently deconvolute the miRNA–target interaction (MTI). Results: A freely available package, MTide, was developed by combining modified miRDeep2 and CleaveLand4 with some other useful scripts to explore MTI in a comprehensive way. By searching for targets of a complete miRNAs, we can facilitate large-scale identification of miRNA targets, allowing us to discover regulatory interaction networks. Availability and implementation: http://bis.zju.edu.cn/MTide Contact: mchen@zju.edu.cn

1.
Dweep, H., Sticht, C., Pandey, P. & Gretz, N. miRWalk – Database: Prediction of possible miRNA binding sites by “walking” the genes of three genomes. Journal of Biomedical Informatics 44, 839–847 (2011).

Abstract: MicroRNAs are small, non-coding RNA molecules that can complementarily bind to the mRNA 3′-UTR region to regulate the gene expression by transcriptional repression or induction of mRNA degradation. Increasing evidence suggests a new mechanism by which miRNAs may regulate target gene expression by binding in promoter and amino acid coding regions. Most of the existing databases on miRNAs are restricted to mRNA 3′-UTR region. To address this issue, we present miRWalk, a comprehensive database on miRNAs, which hosts predicted as well as validated miRNA binding sites, information on all known genes of human, mouse and rat. All mRNAs, mitochondrial genes and 10 kb upstream flanking regions of all known genes of human, mouse and rat were analyzed by using a newly developed algorithm named ‘miRWalk’ as well as with eight already established programs for putative miRNA binding sites. An automated and extensive text-mining search was performed on PubMed database to extract validated information on miRNAs. Combined information was put into a MySQL database. miRWalk presents predicted and validated information on miRNA-target interaction. Such a resource enables researchers to validate new targets of miRNA not only on 3′-UTR, but also on the other regions of all known genes. The ‘Validated Target module’ is updated every month and the ‘Predicted Target module’ is updated every 6 months. miRWalk is freely available at http://mirwalk.uni-hd.de/.

1.
Sun, Z. et al. CAP-miRSeq: a comprehensive analysis pipeline for microRNA sequencing data. BMC Genomics 15, 423 (2014).

Abstract: PMID: 24894665

1.
Cho, S. et al. miRGator v3.0: a microRNA portal for deep sequencing, expression profiling and mRNA targeting. Nucl. Acids Res. 41, D252–D257 (2013).

Abstract: Biogenesis and molecular function are two key subjects in the field of microRNA (miRNA) research. Deep sequencing has become the principal technique in cataloging of miRNA repertoire and generating expression profiles in an unbiased manner. Here, we describe the miRGator v3.0 update (http://mirgator.kobic.re.kr) that compiled the deep sequencing miRNA data available in public and implemented several novel tools to facilitate exploration of massive data. The miR-seq browser supports users to examine short read alignment with the secondary structure and read count information available in concurrent windows. Features such as sequence editing, sorting, ordering, import and export of user data would be of great utility for studying iso-miRs, miRNA editing and modifications. miRNA–target relation is essential for understanding miRNA function. Coexpression analysis of miRNA and target mRNAs, based on miRNA-seq and RNA-seq data from the same sample, is visualized in the heat-map and network views where users can investigate the inverse correlation of gene expression and target relations, compiled from various databases of predicted and validated targets. By keeping datasets and analytic tools up-to-date, miRGator should continue to serve as an integrated resource for biogenesis and functional investigation of miRNAs.

1.
Li, Y. et al. HMDD v2.0: a database for experimentally supported human microRNA and disease associations. Nucl. Acids Res. 42, D1070–D1074 (2014).

Abstract: The Human microRNA Disease Database (HMDD; available via the Web site at http://cmbi.bjmu.edu.cn/hmdd and http://202.38.126.151/hmdd/tools/hmdd2.html) is a collection of experimentally supported human microRNA (miRNA) and disease associations. Here, we describe the HMDD v2.0 update that presented several novel options for users to facilitate exploration of the data in the database. In the updated database, miRNA–disease association data were annotated in more details. For example, miRNA–disease association data from genetics, epigenetics, circulating miRNAs and miRNA–target interactions were integrated into the database. In addition, HMDD v2.0 presented more data that were generated based on concepts derived from the miRNA–disease association data, including disease spectrum width of miRNAs and miRNA spectrum width of human diseases. Moreover, we provided users a link to download all the data in the HMDD v2.0 and a link to submit novel data into the database. Meanwhile, we also maintained the old version of HMDD. By keeping data sets up-to-date, HMDD should continue to serve as a valuable resource for investigating the roles of miRNAs in human disease.

1.
Bhattacharya, A., Ziebarth, J. D. & Cui, Y. SomamiR: a database for somatic mutations impacting microRNA function in cancer. Nucl. Acids Res. 41, D977–D982 (2013).

Abstract: Whole-genome sequencing of cancers has begun to identify thousands of somatic mutations that distinguish the genomes of normal tissues from cancers. While many germline mutations within microRNAs (miRNAs) and their targets have been shown to alter miRNA function in cancers and have been associated with cancer risk, the impact of somatic mutations on miRNA function has received relatively little attention. Here, we have created the SomamiR database (http://compbio.uthsc.edu/SomamiR/) to provide a comprehensive resource that integrates several types of data for use in investigating the impact of somatic and germline mutations on miRNA function in cancer. The database contains somatic mutations that may create or disrupt miRNA target sites and integrates these somatic mutations with germline mutations within the same target sites, genome-wide and candidate gene association studies of cancer and functional annotations that link genes containing mutations with cancer. Additionally, the database contains a collection of germline and somatic mutations in miRNAs and their targets that have been experimentally shown to impact miRNA function and have been associated with cancer.

1.
Lu, T.-P. et al. miRSystem: An Integrated System for Characterizing Enriched Functions and Pathways of MicroRNA Targets. PLoS ONE 7, e42390 (2012).

Abstract: Background: Many prediction tools for microRNA (miRNA) targets have been developed, but inconsistent predictions were observed across multiple algorithms, which can make further analysis difficult. Moreover, the nomenclature of human miRNAs changes rapidly. To address these issues, we developed a web-based system, miRSystem, for converting queried miRNAs to the latest annotation and predicting the function of miRNA by integrating miRNA target gene prediction and function/pathway analyses. Results: First, queried miRNA IDs were converted to the latest annotated version to prevent potential conflicts resulting from multiple aliases. Next, by combining seven algorithms and two validated databases, potential gene targets of miRNAs and their functions were predicted based on the consistency across independent algorithms and observed/expected ratios. Lastly, five pathway databases were included to characterize the enriched pathways of target genes through bootstrap approaches. Based on the enriched pathways of target genes, the functions of queried miRNAs could be predicted. Conclusions: MiRSystem is a user-friendly tool for predicting the target genes and their associated pathways for many miRNAs simultaneously. The web server and the documentation are freely available at http://mirsystem.cgm.ntu.edu.tw/.

1.
Oulas, A. et al. Prediction of novel microRNA genes in cancer-associated genomic regions—a combined computational and experimental approach. Nucl. Acids Res. 37, 3276–3287 (2009).

Abstract: The majority of existing computational tools rely on sequence homology and/or structural similarity to identify novel microRNA (miRNA) genes. Recently supervised algorithms are utilized to address this problem, taking into account sequence, structure and comparative genomics information. In most of these studies miRNA gene predictions are rarely supported by experimental evidence and prediction accuracy remains uncertain. In this work we present a new computational tool (SSCprofiler) utilizing a probabilistic method based on Profile Hidden Markov Models to predict novel miRNA precursors. Via the simultaneous integration of biological features such as sequence, structure and conservation, SSCprofiler achieves a performance accuracy of 88.95% sensitivity and 84.16% specificity on a large set of human miRNA genes. The trained classifier is used to identify novel miRNA gene candidates located within cancer-associated genomic regions and rank the resulting predictions using expression information from a full genome tiling array. Finally, four of the top scoring predictions are verified experimentally using northern blot analysis. Our work combines both analytical and experimental techniques to show that SSCprofiler is a highly accurate tool which can be used to identify novel miRNA gene candidates in the human genome. SSCprofiler is freely available as a web service at http://www.imbb.forth.gr/SSCprofiler.html.

1.
Ahmadi, H. et al. HomoTarget: A new algorithm for prediction of microRNA targets in Homo sapiens. Genomics 101, 94–100 (2013).

Abstract: MiRNAs play an essential role in the networks of gene regulation by inhibiting the translation of target mRNAs. Several computational approaches have been proposed for the prediction of miRNA target-genes. Reports reveal a large fraction of under-predicted or falsely predicted target genes. Thus, there is an imperative need to develop a computational method by which the target mRNAs of existing miRNAs can be correctly identified. In this study, combined pattern recognition neural network (PRNN) and principle component analysis (PCA) architecture has been proposed in order to model the complicated relationship between miRNAs and their target mRNAs in humans. The results of several types of intelligent classifiers and our proposed model were compared, showing that our algorithm outperformed them with higher sensitivity and specificity. Using the recent release of the mirBase database to find potential targets of miRNAs, this model incorporated twelve structural, thermodynamic and positional features of miRNA:mRNA binding sites to select target candidates.

1.
Dai, X. & Zhao, P. X. psRNATarget: a plant small RNA target analysis server. Nucl. Acids Res. gkr319 (2011) doi:10.1093/nar/gkr319.

Abstract: Plant endogenous non-coding short small RNAs (20–24 nt), including microRNAs (miRNAs) and a subset of small interfering RNAs (ta-siRNAs), play important role in gene expression regulatory networks (GRNs). For example, many transcription factors and development-related genes have been reported as targets of these regulatory small RNAs. Although a number of miRNA target prediction algorithms and programs have been developed, most of them were designed for animal miRNAs which are significantly different from plant miRNAs in the target recognition process. These differences demand the development of separate plant miRNA (and ta-siRNA) target analysis tool(s). We present psRNATarget, a plant small RNA target analysis server, which features two important analysis functions: (i) reverse complementary matching between small RNA and target transcript using a proven scoring schema, and (ii) target-site accessibility evaluation by calculating unpaired energy (UPE) required to ‘open’ secondary structure around small RNA’s target site on mRNA. The psRNATarget incorporates recent discoveries in plant miRNA target recognition, e.g. it distinguishes translational and post-transcriptional inhibition, and it reports the number of small RNA/target site pairs that may affect small RNA binding activity to target transcript. The psRNATarget server is designed for high-throughput analysis of next-generation data with an efficient distributed computing back-end pipeline that runs on a Linux cluster. The server front-end integrates three simplified user-friendly interfaces to accept user-submitted or preloaded small RNAs and transcript sequences; and outputs a comprehensive list of small RNA/target pairs along with the online tools for batch downloading, key word searching and results sorting. The psRNATarget server is freely available at http://plantgrn.noble.org/psRNATarget/.
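Illustrative note: reverse complementary matching between a small RNA and a transcript, the first analysis function named in this abstract, can be sketched in Python as a sliding-window mismatch count. This is a deliberately simplified stand-in and not psRNATarget's scoring schema: it gives G:U wobble pairs no special treatment, ignores gaps, and does not evaluate site accessibility. The function names and the mismatch threshold are hypothetical.

```python
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def _as_rna(seq):
    """Upper-case a sequence and write it in the RNA alphabet."""
    return seq.upper().replace("T", "U")


def reverse_complement(rna):
    """Reverse complement of an RNA (or DNA) sequence, returned as RNA."""
    return _as_rna(rna).translate(_COMPLEMENT)[::-1]


def count_mismatches(small_rna, target_site):
    """Number of positions where target_site differs from the perfect
    reverse complement of small_rna (equal lengths required)."""
    if len(small_rna) != len(target_site):
        raise ValueError("small RNA and target site must have equal length")
    expected = reverse_complement(small_rna)
    return sum(a != b for a, b in zip(expected, _as_rna(target_site)))


def find_sites(small_rna, transcript, max_mismatches=3):
    """Return (0-based start, mismatches) for every window of the transcript
    within max_mismatches of perfect complementarity."""
    n = len(small_rna)
    hits = []
    for i in range(len(transcript) - n + 1):
        mm = count_mismatches(small_rna, transcript[i:i + n])
        if mm <= max_mismatches:
            hits.append((i, mm))
    return hits


# Toy example: the perfect site for "AUGC" is "GCAU", found at index 2.
print(find_sites("AUGC", "AAGCAUAA", max_mismatches=0))
```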

1.
Bonnet, E., He, Y., Billiau, K. & Peer, Y. V. de. TAPIR, a web server for the prediction of plant microRNA targets, including target mimics. Bioinformatics 26, 1566–1568 (2010).

Abstract: Summary: We present a new web server called TAPIR, designed for the prediction of plant microRNA targets. The server offers the possibility to search for plant miRNA targets using a fast and a precise algorithm. The precise option is much slower but guarantees to find less perfectly paired miRNA-target duplexes. Furthermore, the precise option allows the prediction of target mimics, which are characterized by a miRNA-target duplex having a large loop, making them undetectable by traditional tools. Availability: The TAPIR web server can be accessed at: http://bioinformatics.psb.ugent.be/webtools/tapir Contact: yves.vandepeer@psb.vib-ugent.be Supplementary information: Supplementary data are available at Bioinformatics online.

1.
Nam, S. et al. MicroRNA and mRNA integrated analysis (MMIA): a web tool for examining biological functions of microRNA expression. Nucl. Acids Res. gkp294 (2009) doi:10.1093/nar/gkp294.

Abstract: MicroRNAs (miRNAs) are small (19–24 nt), nonprotein-coding nucleic acids that regulate specific ‘target’ gene products via hybridization to mRNA transcripts, resulting in translational blockade or transcript degradation. Although miRNAs have been implicated in numerous developmental and adult diseases, their specific impact on biological pathways and cellular phenotypes, in addition to miRNA gene promoter regulation, remain largely unknown. To improve and facilitate research of miRNA functions and regulation, we have developed MMIA (microRNA and mRNA integrated analysis), a versatile and user-friendly web server. By incorporating three commonly used and accurate miRNA prediction algorithms, TargetScan, PITA and PicTar, MMIA integrates miRNA and mRNA expression data with predicted miRNA target information for analyzing miRNA-associated phenotypes and biological functions by gene set analysis, in addition to analysis of miRNA primary transcript gene promoters. To assign biological relevance to the integrated miRNA/mRNA profiles, MMIA uses exhaustive human genome coverage, including classification into various disease-associated genes as well as conventional canonical pathways and Gene Ontology. In summary, this novel web server (cancer.informatics.indiana.edu/mmia) will provide life science researchers with a valuable tool for the study of the biological (and pathological) causes and effects of the expression of this class of interesting protein regulators.

1.
Friedländer, M. R. et al. Discovering microRNAs from deep sequencing data using miRDeep. Nat Biotech 26, 407–415 (2008).

Abstract: The capacity of highly parallel sequencing technologies to detect small RNAs at unprecedented depth suggests their value in systematically identifying microRNAs (miRNAs). However, the identification of miRNAs from the large pool of sequenced transcripts from a single deep sequencing run remains a major challenge. Here, we present an algorithm, miRDeep, which uses a probabilistic model of miRNA biogenesis to score compatibility of the position and frequency of sequenced RNA with the secondary structure of the miRNA precursor. We demonstrate its accuracy and robustness using published Caenorhabditis elegans data and data we generated by deep sequencing human and dog RNAs. miRDeep reports altogether 230 previously unannotated miRNAs, of which four novel C. elegans miRNAs are validated by northern blot analysis.

1.
Krek, A. et al. Combinatorial microRNA target predictions. Nat Genet 37, 495–500 (2005).

Abstract: MicroRNAs are small noncoding RNAs that recognize and bind to partially complementary sites in the 3' untranslated regions of target genes in animals and, by unknown mechanisms, regulate protein production of the target transcript. Different combinations of microRNAs are expressed in different cell types and may coordinately regulate cell-specific target genes. Here, we present PicTar, a computational method for identifying common targets of microRNAs. Statistical tests using genome-wide alignments of eight vertebrate genomes, PicTar's ability to specifically recover published microRNA targets, and experimental validation of seven predicted targets suggest that PicTar has an excellent success rate in predicting targets for single microRNAs and for combinations of microRNAs. We find that vertebrate microRNAs target, on average, roughly 200 transcripts each. Furthermore, our results suggest widespread coordinate control executed by microRNAs. In particular, we experimentally validate common regulation of Mtpn by miR-375, miR-124 and let-7b and thus provide evidence for coordinate microRNA control in mammals.

1.
Griffiths-Jones, S., Saini, H. K., Dongen, S. van & Enright, A. J. miRBase: tools for microRNA genomics. Nucl. Acids Res. 36, D154–D158 (2008).

Abstract: miRBase is the central online repository for microRNA (miRNA) nomenclature, sequence data, annotation and target prediction. The current release (10.0) contains 5071 miRNA loci from 58 species, expressing 5922 distinct mature miRNA sequences: a growth of over 2000 sequences in the past 2 years. miRBase provides a range of data to facilitate studies of miRNA genomics: all miRNAs are mapped to their genomic coordinates. Clusters of miRNA sequences in the genome are highlighted, and can be defined and retrieved with any inter-miRNA distance. The overlap of miRNA sequences with annotated transcripts, both protein- and non-coding, are described. Finally, graphical views of the locations of a wide range of genomic features in model organisms allow for the first time the prediction of the likely boundaries of many miRNA primary transcripts. miRBase is available at http://microrna.sanger.ac.uk/.
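
The idea of defining clusters by an inter-miRNA distance can be illustrated with a minimal sketch. It is not miRBase's own implementation: the tuple layout `(name, chrom, start, end)`, the function name `cluster_loci`, and the choice to ignore strand are all assumptions made for illustration.

```python
def cluster_loci(loci, max_distance):
    """Group loci on the same chromosome whose gaps do not exceed max_distance.

    `loci` is a list of (name, chrom, start, end) tuples (hypothetical layout).
    The gap is measured from the furthest end seen in the current cluster
    to the start of the next locus.
    """
    clusters = []
    current, current_chrom, current_end = [], None, None
    for name, chrom, start, end in sorted(loci, key=lambda l: (l[1], l[2])):
        # Close the running cluster when the chromosome changes or the gap is too large.
        if current and (chrom != current_chrom or start - current_end > max_distance):
            clusters.append(current)
            current = []
        if not current:
            current_chrom, current_end = chrom, end
        else:
            current_end = max(current_end, end)
        current.append(name)
    if current:
        clusters.append(current)
    return clusters


# Hypothetical loci, not taken from miRBase.
example_loci = [
    ("mir-a", "chr1", 100, 180),
    ("mir-b", "chr1", 500, 580),
    ("mir-c", "chr1", 20000, 20080),
    ("mir-d", "chr2", 150, 230),
]
print(cluster_loci(example_loci, 10000))
```

Lowering `max_distance` splits clusters apart, which mirrors the abstract's point that clusters can be retrieved with any chosen distance.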

1.
Giurato, G. et al. iMir: An integrated pipeline for high-throughput analysis of small non-coding RNA data obtained by smallRNA-Seq. BMC Bioinformatics 14, 362 (2013).

Abstract: PMID: 24330401

1.
Vergoulis, T. et al. mirPub: a database for searching microRNA publications. Bioinformatics btu819 (2014) doi:10.1093/bioinformatics/btu819.

Abstract: Summary: Identifying, amongst millions of publications available in MEDLINE, those that are relevant to specific microRNAs (miRNAs) of interest based on keyword search faces major obstacles. References to miRNA names in the literature often deviate from standard nomenclature for various reasons, since even the official nomenclature evolves. For instance, a single miRNA name may identify two completely different molecules or two different names may refer to the same molecule. mirPub is a database with a powerful and intuitive interface, which facilitates searching for miRNA literature, addressing the aforementioned issues. To provide effective search services, mirPub applies text mining techniques on MEDLINE, integrates data from several curated databases and exploits data from its user community following a crowdsourcing approach. Other key features include an interactive visualization service that illustrates intuitively the evolution of miRNA data, tag clouds summarizing the relevance of publications to particular diseases, cell types or tissues and access to TarBase 6.0 data to oversee genes related to miRNA publications. Availability and Implementation: mirPub is freely available at http://www.microrna.gr/mirpub/. Contact: vergoulis@imis.athena-innovation.gr or dalamag@imis.athena-innovation.gr Supplementary information: Supplementary data are available at Bioinformatics online.

1.
Liu, B., Fang, L., Liu, F., Wang, X. & Chou, K.-C. iMiRNA-PseDPC: microRNA precursor identification with a pseudo distance-pair composition approach. J. Biomol. Struct. Dyn. 1–13 (2015) doi:10.1080/07391102.2015.1014422.

Abstract: A microRNA (miRNA) is a small non-coding RNA molecule, functioning in transcriptional and post-transcriptional regulation of gene expression. The human genome may encode over 1000 miRNAs. Albeit poorly characterized, miRNAs are widely deemed as important regulators of biological processes. Aberrant expression of miRNAs has been observed in many cancers and other disease states, indicating that they are deeply implicated with these diseases, particularly in carcinogenesis. Therefore, it is important for both basic research and miRNA-based therapy to discriminate the real pre-miRNAs from the false ones (such as hairpin sequences with similar stem-loops). Particularly, with the avalanche of RNA sequences generated in the post-genomic age, it is highly desired to develop computational sequence-based methods for effectively identifying the human pre-miRNAs. Here, we propose a predictor called "iMiRNA-PseDPC", in which the RNA sequences are formulated by a novel feature vector called "pseudo distance-pair composition" (PseDPC) with 10 types of structure statuses. Rigorous cross-validations on a much larger and more stringent newly constructed benchmark data-set showed that our approach has remarkably outperformed the existing ones in either prediction accuracy or efficiency, indicating the new predictor is quite promising or at least may become a complementary tool to the existing predictors in this area. For the convenience of most experimental scientists, a user-friendly web server for the new predictor has been established at http://bioinformatics.hitsz.edu.cn/iMiRNA-PseDPC/, by which users can easily get their desired results without the need to go through the mathematical details. It is anticipated that the new predictor may become a useful high throughput tool for genome analysis particularly in dealing with large-scale data.

1.
Stark, A., Brennecke, J., Russell, R. B. & Cohen, S. M. Identification of Drosophila MicroRNA Targets. PLoS Biol 1, e60 (2003).

Abstract: A bioinformatic approach suggests many new target genes for Drosophila microRNAs. A number of them are validated experimentally.

1.
Tran, V. D. T., Tempel, S., Zerath, B., Zehraoui, F. & Tahi, F. miRBoost: boosting support vector machines for microRNA precursor classification. RNA (2015) doi:10.1261/rna.043612.113.

Abstract: Identification of microRNAs (miRNAs) is an important step toward understanding post-transcriptional gene regulation and miRNA-related pathology. Difficulties in identifying miRNAs through experimental techniques combined with the huge amount of data from new sequencing technologies have made in silico discrimination of bona fide miRNA precursors from non-miRNA hairpin-like structures an important topic in bioinformatics. Among various techniques developed for this classification problem, machine learning approaches have proved to be the most promising. However these approaches require the use of training data, which is problematic due to an imbalance in the number of miRNAs (positive data) and non-miRNAs (negative data), which leads to a degradation of their performance. In order to address this issue, we present an ensemble method that uses a boosting technique with support vector machine components to deal with imbalanced training data. Classification is performed following a feature selection on 187 novel and existing features. The algorithm, miRBoost, performed better in comparison with state-of-the-art methods on imbalanced human and cross-species data. It also showed the highest ability among the tested methods for discovering novel miRNA precursors. In addition, miRBoost was over 1400 times faster than the second most accurate tool tested and was significantly faster than most of the other tools. miRBoost thus provides a good compromise between prediction efficiency and execution time, making it highly suitable for use in genome-wide miRNA precursor prediction. The software miRBoost is available on our web server http://EvryRNA.ibisc.univ-evry.fr.

1.
Hsu, S.-D. et al. miRNAMap 2.0: genomic maps of microRNAs in metazoan genomes. Nucl. Acids Res. 36, D165–D169 (2008).

Abstract: MicroRNAs (miRNAs) are small non-coding RNA molecules that can negatively regulate gene expression and thus control numerous cellular mechanisms. This work develops a resource, miRNAMap 2.0, for collecting experimentally verified microRNAs and experimentally verified miRNA target genes in human, mouse, rat and other metazoan genomes. Three computational tools, miRanda, RNAhybrid and TargetScan, were employed to identify miRNA targets in 3′-UTR of genes as well as the known miRNA targets. Various criteria for filtering the putative miRNA targets are applied to reduce the false positive prediction rate of miRNA target sites. Additionally, miRNA expression profiles can provide valuable clues on the characteristics of miRNAs, including tissue specificity and differential expression in cancer/normal cell. Therefore, quantitative polymerase chain reaction experiments were performed to monitor the expression profiles of 224 human miRNAs in 18 major normal tissues in human. The negative correlation between the miRNA expression profile and the expression profiles of its target genes typically helps to elucidate the regulatory functions of the miRNA. The interface is also redesigned and enhanced. The miRNAMap 2.0 is now available at http://miRNAMap.mbc.nctu.edu.tw/.
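
The negative correlation between a miRNA's expression profile and that of a target gene can be checked with a plain Pearson coefficient. The sketch below is only an illustration of that calculation; the function name `pearson` and the expression values are made up, and the abstract does not state which correlation measure miRNAMap 2.0 uses.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


# Hypothetical expression levels across four tissues.
mirna_expression = [1.0, 2.0, 3.0, 4.0]
target_expression = [8.0, 6.0, 4.0, 2.0]
r = pearson(mirna_expression, target_expression)
print(r)  # a value close to -1 indicates a strongly anti-correlated pair
```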

1.
Zhang, S. et al. PASmiR: a literature-curated database for miRNA molecular regulation in plant response to abiotic stress. BMC Plant Biology 13, 33 (2013).

Abstract: PMID: 23448274

1.
Piriyapongsa, J., Bootchai, C., Ngamphiw, C. & Tongsima, S. microPIR2: a comprehensive database for human–mouse comparative study of microRNA–promoter interactions. Database 2014, bau115 (2014).

Abstract: microRNA (miRNA)–promoter interaction resource (microPIR) is a public database containing over 15 million predicted miRNA target sites located within human promoter sequences. These predicted targets are presented along with their related genomic and experimental data, making the microPIR database the most comprehensive repository of miRNA promoter target sites. Here, we describe major updates of the microPIR database including new target predictions in the mouse genome and revised human target predictions. The updated database (microPIR2) now provides ∼80 million human and 40 million mouse predicted target sites. In addition to being a reference database, microPIR2 is a tool for comparative analysis of target sites on the promoters of human–mouse orthologous genes. In particular, this new feature was designed to identify potential miRNA–promoter interactions conserved between species that could be stronger candidates for further experimental validation. We also incorporated additional supporting information to microPIR2 such as nuclear and cytoplasmic localization of miRNAs and miRNA–disease association. Extra search features were also implemented to enable various investigations of targets of interest. Database URL: http://www4a.biotec.or.th/micropir2

1.
Numnark, S., Mhuantong, W., Ingsriswang, S. & Wichadakul, D. C-mii: a tool for plant miRNA and target identification. BMC Genomics 13, S16 (2012).

Abstract: PMID: 23281648

1.
Bandyopadhyay, S. & Mitra, R. TargetMiner: microRNA target prediction with systematic identification of tissue-specific negative examples. Bioinformatics 25, 2625–2631 (2009).

Abstract: Motivation: Prediction of microRNA (miRNA) target mRNAs using machine learning approaches is an important area of research. However, most of the methods suffer from either high false positive or false negative rates. One reason for this is the marked deficiency of negative examples or miRNA non-target pairs. Systematic identification of non-target mRNAs is still not addressed properly, and therefore, current machine learning approaches are compelled to rely on artificially generated negative examples for training. Results: In this article, we have identified ∼300 tissue-specific negative examples using a novel approach that involves expression profiling of both miRNAs and mRNAs, miRNA–mRNA structural interactions and seed-site conservation. The newly generated negative examples are validated with a pSILAC dataset, which confirms that the identified non-targets are indeed non-targets. These high-throughput tissue-specific negative examples and a set of experimentally verified positive examples are then used to build a system called TargetMiner, a support vector machine (SVM)-based classifier. In addition to assessing the prediction accuracy on cross-validation experiments, TargetMiner has been validated with a completely independent experimental test dataset. Our method outperforms 10 existing target prediction algorithms and provides a good balance between sensitivity and specificity that is not reflected in the existing methods. We achieve a significantly higher sensitivity and specificity of 69% and 67.8% based on a pool of 90 features and 76.5% and 66.1% using a set of 30 selected features on the completely independent test dataset. In order to establish the effectiveness of the systematically generated negative examples, the SVM is trained using a different set of negative data generated using the method in Yousef et al. A significantly higher false positive rate (70.6%) is observed when tested on the independent set, while all other factors are kept the same. Again, when an existing method (NBmiRTar) is executed with our proposed negative data, we observe an improvement in its performance. These clearly establish the effectiveness of the proposed approach of selecting the negative examples systematically. Availability: TargetMiner is now available as an online tool at www.isical.ac.in/∼bioinfo_miu Contact: sanghami@isical.ac.in; rmitra_t@isical.ac.in Supplementary information: Supplementary data are available at Bioinformatics online.
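
For readers less familiar with the metrics quoted in this abstract, the following sketch shows how sensitivity, specificity and the false positive rate follow from confusion-matrix counts. The counts are hypothetical and chosen only so that the ratios match the 69% and 67.8% figures; they are not the paper's data, and the function name is an assumption.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts.

    sensitivity = TP / (TP + FN): fraction of true targets that are recovered.
    specificity = TN / (TN + FP): fraction of non-targets correctly rejected.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity


# Hypothetical counts, for illustration only.
sens, spec = sensitivity_specificity(tp=69, fn=31, tn=678, fp=322)
false_positive_rate = 1.0 - spec  # FP / (FP + TN)
print(sens, spec, false_positive_rate)
```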

1.
Jha, A. & Shankar, R. Employing machine learning for reliable miRNA target identification in plants. BMC Genomics 12, 636 (2011).

Abstract: PMID: 22206472

1.
Xie, F. & Zhang, B. Target-align: a tool for plant microRNA target identification. Bioinformatics 26, 3002–3003 (2010).

Abstract: Motivation: MicroRNAs (miRNAs) are important regulatory molecules. A critical step in elucidating miRNA function is identifying potential miRNA targets. However, few reliable tools have been developed for identifying miRNA targets in plants. Results: Here, we developed a Smith–Waterman-like alignment tool in order to accurately predict miRNA targets. Dynamic programming was used to build a score matrix based on the complementarity of nucleotides in order to trace the optimal local alignments. Important parameters, such as maximum mismatches and maximum consecutive mismatches between miRNAs and their targets, were also used for filtering the optimal local alignments. Almost all of the parameters in this alignment tool can be adjusted by users. Compared to other target prediction tools, Target-align exhibits strong sensitivity and accuracy for identifying miRNA targets. More importantly, Target-align can identify multi-target sites as well as potential non-cleaved target sites by changing the default settings. Windows, web and command-line versions were developed to better serve different users. Availability: http://www.leonxie.com/targetAlign.php. Contact: zhangb@ecu.edu Supplementary information: Supplementary data are available at Bioinformatics online.
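
The dynamic-programming idea described in this abstract can be sketched as a Smith–Waterman-style recurrence in which a "match" means a complementary base pair rather than an identical base. This is a minimal illustration, not Target-align's code: the scoring values (`pair=2`, `mismatch=-1`, `gap=-2`), the restriction to Watson–Crick pairs (no G:U wobble), and the function name are assumptions, and the mismatch filters mentioned in the abstract are omitted.

```python
def complementarity_local_score(mirna, target, pair=2, mismatch=-1, gap=-2):
    """Best local alignment score for miRNA/target base pairing.

    Both sequences are given 5'->3'. The miRNA is reversed so that it is
    compared antiparallel to the target, as in a real duplex.
    """
    wc_pairs = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
    query = mirna[::-1]
    rows, cols = len(query) + 1, len(target) + 1
    score = [[0] * cols for _ in range(rows)]
    best = 0
    for i in range(1, rows):
        for j in range(1, cols):
            step = pair if (query[i - 1], target[j - 1]) in wc_pairs else mismatch
            score[i][j] = max(
                0,                            # start a new local alignment
                score[i - 1][j - 1] + step,   # pair or mismatch
                score[i - 1][j] + gap,        # gap in the target
                score[i][j - 1] + gap,        # gap in the miRNA
            )
            best = max(best, score[i][j])
    return best


# Illustrative 8-nt miRNA fragment and a target carrying its perfect complement.
print(complementarity_local_score("UGAGGUAG", "GGGCUACCUCAGGG"))
```

A full tool would also trace back from the best cell to recover the aligned site and then apply filters such as the maximum number of mismatches.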

1.
Hackenberg, M., Rodríguez-Ezpeleta, N. & Aransay, A. M. miRanalyzer: an update on the detection and analysis of microRNAs in high-throughput sequencing experiments. Nucl. Acids Res. 39, W132–W138 (2011).

Abstract: We present a new version of miRanalyzer, a web server and stand-alone tool for the detection of known and prediction of new microRNAs in high-throughput sequencing experiments. The new version has been notably improved regarding speed, scope and available features. Alignments are now based on the ultrafast short-read aligner Bowtie (granting also colour space support, allowing mismatches and improving speed) and 31 genomes, including 6 plant genomes, can now be analysed (previous version contained only 7). Differences between plant and animal microRNAs have been taken into account for the prediction models and differential expression of both, known and predicted microRNAs, between two conditions can be calculated. Additionally, consensus sequences of predicted mature and precursor microRNAs can be obtained from multiple samples, which increases the reliability of the predicted microRNAs. Finally, a stand-alone version of the miRanalyzer that is based on a local and easily customized database is also available; this allows the user to have more control on certain parameters as well as to use specific data such as unpublished assemblies or other libraries that are not available in the web server. miRanalyzer is available at http://bioinfo2.ugr.es/miRanalyzer/miRanalyzer.php.

1.
Reyes-Herrera, P. H., Ficarra, E., Acquaviva, A. & Macii, E. miREE: miRNA recognition elements ensemble. BMC Bioinformatics 12, 454 (2011).

Abstract: PMID: 22115078

1.
An, J., Lai, J., Sajjanhar, A., Lehman, M. L. & Nelson, C. C. miRPlant: an integrated tool for identification of plant miRNA from RNA sequencing data. BMC Bioinformatics 15, 275 (2014).

Abstract: PMID: 25117656

1.
Jeggari, A., Marks, D. S. & Larsson, E. miRcode: a map of putative microRNA target sites in the long non-coding transcriptome. Bioinformatics 28, 2062–2063 (2012).

Abstract: Summary: Although small non-coding RNAs, such as microRNAs, have well-established functions in the cell, long non-coding RNAs (lncRNAs) have only recently started to emerge as abundant regulators of cell physiology, and their functions may be diverse. A small number of studies describe interactions between small and lncRNAs, with lncRNAs acting either as inhibitory decoys or as regulatory targets of microRNAs, but such interactions are still poorly explored. To facilitate the study of microRNA–lncRNA interactions, we implemented miRcode: a comprehensive searchable map of putative microRNA target sites across the complete GENCODE annotated transcriptome, including 10 419 lncRNA genes in the current version. Availability: http://www.mircode.org Contact: erik.larsson@gu.se Supplementary Information: Supplementary data are available at Bioinformatics online.

1.
Lewis, B. P., Burge, C. B. & Bartel, D. P. Conserved seed pairing, often flanked by adenosines, indicates that thousands of human genes are microRNA targets. Cell 120, 15–20 (2005).

Abstract: We predict regulatory targets of vertebrate microRNAs (miRNAs) by identifying mRNAs with conserved complementarity to the seed (nucleotides 2-7) of the miRNA. An overrepresentation of conserved adenosines flanking the seed complementary sites in mRNAs indicates that primary sequence determinants can supplement base pairing to specify miRNA target recognition. In a four-genome analysis of 3' UTRs, approximately 13,000 regulatory relationships were detected above the estimate of false-positive predictions, thereby implicating as miRNA targets more than 5300 human genes, which represented 30% of our gene set. Targeting was also detected in open reading frames. In sum, well over one third of human genes appear to be conserved miRNA targets.
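
The seed-matching step described in this abstract can be illustrated with a short sketch that looks for sites complementary to miRNA nucleotides 2-7. It covers only the sequence search; the conservation analysis and flanking-adenosine signal from the paper are not modelled, and the function names and example sequences are assumptions for illustration.

```python
def reverse_complement_rna(seq):
    """Return the reverse complement of an RNA sequence (Watson-Crick pairs only)."""
    pairs = {"A": "U", "U": "A", "G": "C", "C": "G"}
    return "".join(pairs[base] for base in reversed(seq))


def find_seed_sites(mirna, utr):
    """Return 0-based positions in `utr` that pair with miRNA nucleotides 2-7."""
    seed = mirna[1:7]                     # nucleotides 2-7 in 1-based numbering
    site = reverse_complement_rna(seed)   # sequence the mRNA must contain
    positions = []
    start = utr.find(site)
    while start != -1:
        positions.append(start)
        start = utr.find(site, start + 1)  # allow overlapping occurrences
    return positions


# Illustrative sequences (both written 5'->3').
print(find_seed_sites("UGAGGUAGUAGGUUGUAUAGUU", "AAACUACCUCAGGG"))
```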

1.
Qian, K., Auvinen, E., Greco, D. & Auvinen, P. miRSeqNovel: An R based workflow for analyzing miRNA sequencing data. Molecular and Cellular Probes 26, 208–211 (2012).

Abstract: We present miRSeqNovel, an R based workflow for miRNA sequencing data analysis. miRSeqNovel can process both colorspace (SOLiD) and basespace (Illumina/Solexa) data by different mapping algorithms. It finds differentially expressed miRNAs and gives conservative prediction of novel miRNA candidates with customized parameters. miRSeqNovel is freely available at http://sourceforge.net/projects/mirseq/files.

1.
Hammell, M. et al. mirWIP: microRNA target prediction based on microRNA-containing ribonucleoprotein–enriched transcripts. Nat Meth 5, 813–819 (2008).

Abstract: Target prediction for animal microRNAs (miRNAs) has been hindered by the small number of verified targets available to evaluate the accuracy of predicted miRNA-target interactions. Recently, a dataset of 3,404 miRNA-associated mRNA transcripts was identified by immunoprecipitation of the RNA-induced silencing complex components AIN-1 and AIN-2. Our analysis of this AIN-IP dataset revealed enrichment for defining characteristics of functional miRNA-target interactions, including structural accessibility of target sequences, total free energy of miRNA-target hybridization and topology of base-pairing to the 5' seed region of the miRNA. We used these enriched characteristics as the basis for a quantitative miRNA target prediction method, miRNA targets by weighting immunoprecipitation-enriched parameters (mirWIP), which optimizes sensitivity to verified miRNA-target interactions and specificity to the AIN-IP dataset. MirWIP can be used to capture all known conserved miRNA-mRNA target relationships in Caenorhabditis elegans at a lower false-positive rate than can the current standard methods.

1.
Rennie, W. et al. STarMir: a web server for prediction of microRNA binding sites. Nucl. Acids Res. 42, W114–W118 (2014).

Abstract: STarMir web server predicts microRNA (miRNA) binding sites on a target ribonucleic acid (RNA). STarMir is an implementation of logistic prediction models developed with miRNA binding data from crosslinking immunoprecipitation (CLIP) studies (Liu,C., Mallick, B., Long, D., Rennie, W.A., Wolenc, A., Carmack, C.S. and Ding, Y. (2013). CLIP-based prediction of mammalian microRNA binding sites. Nucleic Acids Res., 41(14), e138). In both intra-dataset and inter-dataset validations, the models showed major improvements over established algorithms in predictions of both seed and seedless sites. General applicability of the models was indicated by good performance in cross-species validations. The input data for STarMir is processed by the web server to perform prediction of miRNA binding sites, compute comprehensive sequence, thermodynamic and target structure features and a logistic probability as a measure of confidence for each predicted site. For each of seed and seedless sites and for all three regions of a mRNA (3′ UTR, CDS and 5′ UTR), STarMir output includes the computed binding site features, the logistic probability and a publication-quality diagram of the predicted miRNA:target hybrid. The prediction results are available through both an interactive viewer and downloadable text files. As an application module of the Sfold RNA package (http://sfold.wadsworth.org), STarMir is freely available to all at http://sfold.wadsworth.org/starmir.html.

1.
Noirot, C., Gaspin, C., Schiex, T. & Gouzy, J. LeARN: a platform for detecting, clustering and annotating non-coding RNAs. BMC Bioinformatics 9, 21 (2008).

Abstract: PMID: 18194551

1.
Fasold, M., Langenberger, D., Binder, H., Stadler, P. F. & Hoffmann, S. DARIO: a ncRNA detection and analysis tool for next-generation sequencing experiments. Nucl. Acids Res. 39, W112–W117 (2011).

Abstract: Small non-coding RNAs (ncRNAs) such as microRNAs, snoRNAs and tRNAs are a diverse collection of molecules with several important biological functions. Current methods for high-throughput sequencing for the first time offer the opportunity to investigate the entire ncRNAome in an essentially unbiased way. However, there is a substantial need for methods that allow a convenient analysis of these overwhelmingly large data sets. Here, we present DARIO, a free web service that allows the study of short read data from small RNA-seq experiments. It provides a wide range of analysis features, including quality control, read normalization, ncRNA quantification and prediction of putative ncRNA candidates. The DARIO web site can be accessed at http://dario.bioinf.uni-leipzig.de/.

1.
Gaidatzis, D., Nimwegen, E. van, Hausser, J. & Zavolan, M. Inference of miRNA targets using evolutionary conservation and pathway analysis. BMC Bioinformatics 8, 69 (2007).

Abstract: PMID: 17331257

1.
Xiao, F. et al. miRecords: an integrated resource for microRNA–target interactions. Nucl. Acids Res. 37, D105–D110 (2009).

Abstract: MicroRNAs (miRNAs) are an important class of small noncoding RNAs capable of regulating other genes’ expression. Much progress has been made in computational target prediction of miRNAs in recent years. More than 10 miRNA target prediction programs have been established, yet, the prediction of animal miRNA targets remains a challenging task. We have developed miRecords, an integrated resource for animal miRNA–target interactions. The Validated Targets component of this resource hosts a large, high-quality manually curated database of experimentally validated miRNA–target interactions with systematic documentation of experimental support for each interaction. The current release of this database includes 1135 records of validated miRNA–target interactions between 301 miRNAs and 902 target genes in seven animal species. The Predicted Targets component of miRecords stores predicted miRNA targets produced by 11 established miRNA target prediction programs. miRecords is expected to serve as a useful resource not only for experimental miRNA researchers, but also for informatics scientists developing the next-generation miRNA target prediction programs. The miRecords is available at http://miRecords.umn.edu/miRecords.

1.
Dai, E. et al. EpimiR: a database of curated mutual regulation between miRNAs and epigenetic modifications. Database 2014, bau023 (2014).

Abstract: As two kinds of important gene expression regulators, both epigenetic modification and microRNA (miRNA) can play significant roles in a wide range of human diseases. Recently, many studies have demonstrated that epigenetics and miRNA can affect each other in various ways. In this study, we established the EpimiR database, which collects 1974 regulations between 19 kinds of epigenetic modifications (such as DNA methylation, histone acetylation, H3K4me3, H3S10p) and 617 miRNAs across seven species (including Homo sapiens, Mus musculus, Rattus norvegicus, Gallus gallus, Epstein–Barr virus, Canis familiaris and Arabidopsis thaliana) from >300 references in the literature. These regulations can be divided into two parts: miR2Epi (103 entries describing how miRNA regulates epigenetic modification) and Epi2miR (1871 entries describing how epigenetic modification affects miRNA). Each entry of EpimiR not only contains basic descriptions of the validated experiment (method, species, reference and so on) but also clearly illuminates the regulatory pathway between epigenetics and miRNA. As a supplement to the curated information, the EpimiR extends to gather predicted epigenetic features (such as predicted transcription start site, upstream CpG island) associated with miRNA for users to guide their future biological experiments. Finally, EpimiR offers download and submission pages. Thus, EpimiR provides a fairly comprehensive repository about the mutual regulation between epigenetic modifications and miRNAs, which will promote the research on the regulatory mechanism of epigenetics and miRNA. Database URL: http://bioinfo.hrbmu.edu.cn/EpimiR/.

1.
Wong, N. & Wang, X. miRDB: an online resource for microRNA target prediction and functional annotations. Nucl. Acids Res. 43, D146–D152 (2015).

Abstract: MicroRNAs (miRNAs) are small non-coding RNAs that are extensively involved in many physiological and disease processes. One major challenge in miRNA studies is the identification of genes regulated by miRNAs. To this end, we have developed an online resource, miRDB (http://mirdb.org), for miRNA target prediction and functional annotations. Here, we describe recently updated features of miRDB, including 2.1 million predicted gene targets regulated by 6709 miRNAs. In addition to presenting precompiled prediction data, a new feature is the web server interface that allows submission of user-provided sequences for miRNA target prediction. In this way, users have the flexibility to study any custom miRNAs or target genes of interest. Another major update of miRDB is related to functional miRNA annotations. Although thousands of miRNAs have been identified, many of the reported miRNAs are not likely to play active functional roles or may even have been falsely identified as miRNAs from high-throughput studies. To address this issue, we have performed combined computational analyses and literature mining, and identified 568 and 452 functional miRNAs in humans and mice, respectively. These miRNAs, as well as associated functional annotations, are presented in the FuncMir Collection in miRDB.

1.
Sturm, M., Hackenberg, M., Langenberger, D. & Frishman, D. TargetSpy: a supervised machine learning approach for microRNA target prediction. BMC Bioinformatics 11, 292 (2010).

PMID: 20509939

1.
Ganguli, S., Mitra, S. & Datta, A. Antagomirbase- a putative antagomir database. Bioinformation 7, 41–43 (2011).

Abstract: The accurate prediction of a comprehensive set of messenger putative antagomirs against microRNAs (miRNAs) remains an open problem. In particular, a set of putative antagomirs against human miRNA is predicted in this current version of database. We have developed Antagomir database, based on putative antagomirs-miRNA heterodimers. In this work, the human miRNA dataset was used as template to design putative antagomirs, using GC content and secondary structures as parameters. The algorithm used predicted the free energy of unbound antagomirs. Although in its infancy the development of antagomirs, that can target cell specific genes or families of genes, may pave the way forward for the generation of a new class of therapeutics, to treat complex inflammatory diseases. Future versions need to incorporate further sequences from other mammalian homologues for designing of antagomirs for aid in research.

1.
Hansen, T. B., Venø, M. T., Kjems, J. & Damgaard, C. K. miRdentify: high stringency miRNA predictor identifies several novel animal miRNAs. Nucl. Acids Res. 42, e124–e124 (2014).

Abstract: During recent years, miRNAs have been shown to play important roles in the regulation of gene expression. Accordingly, much effort has been put into the discovery of novel uncharacterized miRNAs in various organisms. miRNAs are structurally defined by a hairpin-loop structure recognized by the two-step processing apparatus, Drosha and Dicer, necessary for the production of mature ∼22-nucleotide miRNA guide strands. With the emergence of high-throughput sequencing applications, tools have been developed to identify miRNAs and profile their expression based on sequencing reads. However, as the read depth increases, false-positive predictions increase using established algorithms, underscoring the need for more stringent approaches. Here we describe a transparent pipeline for confident miRNA identification in animals, termed miRdentify. We show that miRdentify confidently discloses more than 400 novel miRNAs in humans, including the first male-specific miRNA, which we successfully validate. Moreover, novel miRNAs are predicted in the mouse, the fruit fly and nematodes, suggesting that the pipeline applies to all animals. The entire software package is available at www.ncrnalab.dk/mirdentify.

1.
Gennarino, V. A. et al. HOCTAR database: A unique resource for microRNA target prediction. Gene 480, 51–58 (2011).

Abstract: microRNAs (miRNAs) are the most abundant class of small RNAs in mammals. They play an important role in regulation of gene expression by inducing mRNA cleavage or translational inhibition. Each miRNA targets an average of 100–200 genes by binding, preferentially, to their 3′ UTRs by means of partial sequence complementarity. Most miRNAs are localized within transcriptional units, termed host genes, and show similar expression behavior with respect to their corresponding host genes. Considering the impact of miRNA in the regulation of gene expression and their involvement in a growing number of human disorders, it is vital to develop sensitive computational approaches able to identify miRNA target genes. The HOCTAR database (db) is a publicly available resource collecting ranked list of predicted target genes for 290 intragenic miRNAs annotated in human. HOCTARdb is a unique resource that integrates miRNA target prediction genes and transcriptomic data to score putative miRNA targets looking at the expression behavior of their host genes. We demonstrated, by testing 135 known validated target genes (either at the translational or transcriptional level) for different miRNAs, that the miRNA target prediction lists present in HOCTARdb are highly reliable. Moreover, HOCTARdb associates biological roles to each miRNA-controlled transcriptional network by means of Gene Ontology analysis. This information is easily accessible through a user-friendly query page. The HOCTARdb is available at http://hoctar.tigem.it/. We believe that a detailed relationship between miRNAs and their target genes and a constant update of the information contained in HOCTARdb will provide an extremely valuable resource to assist the researcher in the discovery of miRNA target genes.

1.
Krüger, J. & Rehmsmeier, M. RNAhybrid: microRNA target prediction easy, fast and flexible. Nucl. Acids Res. 34, W451–W454 (2006).

Abstract: In the elucidation of the microRNA regulatory network, knowledge of potential targets is of highest importance. Among existing target prediction methods, RNAhybrid [M. Rehmsmeier, P. Steffen, M. Höchsmann and R. Giegerich (2004) RNA, 10, 1507–1517] is unique in offering a flexible online prediction. Recently, some useful features have been added, among these the possibility to disallow G:U base pairs in the seed region, and a seed-match speed-up, which accelerates the program by a factor of 8. In addition, the program can now be used as a webservice for remote calls from user-implemented programs. We demonstrate RNAhybrid's flexibility with the prediction of a non-canonical target site for Caenorhabditis elegans miR-241 in the 3′-untranslated region of lin-39. RNAhybrid is available at http://bibiserv.techfak.uni-bielefeld.de/rnahybrid.

1.
Kim, J. et al. MAGI: a Node.js web service for fast microRNA-Seq analysis in a GPU infrastructure. Bioinformatics 30, 2826–2827 (2014).

Abstract: Summary: MAGI is a web service for fast MicroRNA-Seq data analysis in a graphics processing unit (GPU) infrastructure. Using just a browser, users have access to results as web reports in just a few hours—>600% end-to-end performance improvement over state of the art. MAGI’s salient features are (i) transfer of large input files in native FASTA with Qualities (FASTQ) format through drag-and-drop operations, (ii) rapid prediction of microRNA target genes leveraging parallel computing with GPU devices, (iii) all-in-one analytics with novel feature extraction, statistical test for differential expression and diagnostic plot generation for quality control and (iv) interactive visualization and exploration of results in web reports that are readily available for publication. Availability and implementation: MAGI relies on the Node.js JavaScript framework, along with NVIDIA CUDA C, PHP: Hypertext Preprocessor (PHP), Perl and R. It is freely available at http://magi.ucsd.edu. Contact: j5kim@ucsd.edu Supplementary information: Supplementary data are available at Bioinformatics online.

1.
Ronen, R. et al. miRNAkey: a software for microRNA deep sequencing analysis. Bioinformatics 26, 2615–2616 (2010).

Abstract: Motivation: MicroRNAs (miRNAs) are short abundant non-coding RNAs critical for many cellular processes. Deep sequencing (next-generation sequencing) technologies are being readily used to receive a more accurate depiction of miRNA expression profiles in living cells. This type of analysis is a key step towards improving our understanding of the complexity and mode of miRNA regulation. Results: miRNAkey is a software package designed to be used as a base-station for the analysis of miRNA deep sequencing data. The package implements common steps taken in the analysis of such data, as well as adds unique features, such as data statistics and multiple read determination, generating a novel platform for the analysis of miRNA expression. A user-friendly graphical interface is applied to determine the analysis steps. The tabular and graphical output contains general and detailed reports on the sequence reads and provides an accurate picture of the differentially expressed miRNAs in paired samples. Availability and implementation: See http://ibis.tau.ac.il/miRNAkey Contact: nshomron@post.tau.ac.il Supplementary information: Supplementary data are available at Bioinformatics online.

1.
Liu, B., Fang, L., Chen, J., Liu, F. & Wang, X. miRNA-dis: microRNA precursor identification based on distance structure status pairs. Mol Biosyst 11, 1194–1204 (2015).

Abstract: MicroRNA precursor identification is an important task in bioinformatics. Support Vector Machine (SVM) is one of the most effective machine learning methods used in this field. The performance of SVM-based methods depends on the vector representations of RNAs. However, the discriminative power of the existing feature vectors is limited, and many methods lack an interpretable model for analysis of characteristic sequence features. Prior studies have demonstrated that sequence or structure order effects were relevant for discrimination, but little work has explored how to use this kind of information for human pre-microRNA identification. In this study, in order to incorporate the structure-order information into the prediction, a method called "miRNA-dis" was proposed, in which the feature vector was constructed by the occurrence frequency of the "distance structure status pair" or just the "distance-pair". Rigorous cross-validations on a much larger and more stringent newly constructed benchmark dataset showed that the miRNA-dis outperformed some state-of-the-art predictors in this area. Remarkably, miRNA-dis trained with human data can correctly predict 87.02% of the 4022 pre-miRNAs from 11 different species ranging from animals, plants and viruses. miRNA-dis would be a useful high throughput tool for large-scale analysis of microRNA precursors. In addition, the learnt model can be easily analyzed in terms of discriminative features, and some interesting patterns were discovered, which could reflect the characteristics of microRNAs. A user-friendly web-server of miRNA-dis was constructed, which is freely accessible to the public at the web-site on .

1.
Nielsen, C. B. et al. Determinants of targeting by endogenous and exogenous microRNAs and siRNAs. RNA 13, 1894–1910 (2007).

Abstract: Vertebrate mRNAs are frequently targeted for post-transcriptional repression by microRNAs (miRNAs) through mechanisms involving pairing of 3′ UTR seed matches to bases at the 5′ end of miRNAs. Through analysis of expression array data following miRNA or siRNA overexpression or inhibition, we found that mRNA fold change increases multiplicatively (i.e., log-additively) with seed match count and that a single 8 mer seed match mediates down-regulation comparable to two 7 mer seed matches. We identified several targeting determinants that enhance seed match-associated mRNA repression, including the presence of adenosine opposite miRNA base 1 and of adenosine or uridine opposite miRNA base 9, independent of complementarity to the siRNA/miRNA. Increased sequence conservation in the ∼50 bases 5′ and 3′ of the seed match and increased AU content 3′ of the seed match were each independently associated with increased mRNA down-regulation. All of these determinants are enriched in the vicinity of conserved miRNA seed matches, supporting their activity in endogenous miRNA targeting. Together, our results enable improved siRNA off-target prediction, allow integrated ranking of conserved and nonconserved miRNA targets, and show that targeting by endogenous and exogenous miRNAs/siRNAs involves similar or identical determinants.

1.
Hsu, P. W.-C., Lin, L.-Z., Hsu, S.-D., Hsu, J. B.-K. & Huang, H.-D. ViTa: prediction of host microRNAs targets on viruses. Nucl. Acids Res. 35, D381–D385 (2007).

Abstract: MicroRNAs (miRNAs) are involved in various biological processes by suppressing gene expression. A recent work has indicated that host miRNAs are also capable of regulating viral gene expression by targeting the virus genomes. To investigate regulatory relationships between host miRNAs and related viruses, we present a novel database, namely ViTa, to curate the known virus miRNA genes and the known/putative target sites of human, mice, rat and chicken miRNAs. Known miRNAs are obtained from miRBase. Virus data are collected and referred from ICTVdB, VBRC and VirGen. Experimentally validated miRNA targets on viruses were derived from literatures. Then, miRanda and TargetScan are utilized to predict miRNA targets within virus genomes. ViTa also provides the virus annotations, virus-infected tissues and tissue specificity of host miRNAs. This work also facilitates the comparisons between subtypes of viruses, such as influenza viruses, human liver viruses and the conserved regions between viruses. Both textual and graphical web interfaces are provided to facilitate the data retrieves in the ViTa database. The database is now freely available at http://vita.mbc.nctu.edu.tw/.

1.
Miranda, K. C. et al. A Pattern-Based Method for the Identification of MicroRNA Binding Sites and Their Corresponding Heteroduplexes. Cell 126, 1203–1217 (2006).

Abstract: Summary We present rna22, a method for identifying microRNA binding sites and their corresponding heteroduplexes. Rna22 does not rely upon cross-species conservation, is resilient to noise, and, unlike previous methods, it first finds putative microRNA binding sites in the sequence of interest, then identifies the targeting microRNA. Computationally, we show that rna22 identifies most of the currently known heteroduplexes. Experimentally, with luciferase assays, we demonstrate average repressions of 30% or more for 168 of 226 tested targets. The analysis suggests that some microRNAs may have as many as a few thousand targets, and that between 74% and 92% of the gene transcripts in four model genomes are likely under microRNA control through their untranslated and amino acid coding regions. We also extended the method's key idea to a low-error microRNA-precursor-discovery scheme; our studies suggest that the number of microRNA precursors in mammalian genomes likely ranges in the tens of thousands.

1.
Breakfield, N. W. et al. High-resolution experimental and computational profiling of tissue-specific known and novel miRNAs in Arabidopsis. Genome Res. 22, 163–176 (2012).

Abstract: Small non-coding RNAs (ncRNAs) are key regulators of plant development through modulation of the processing, stability, and translation of larger RNAs. We present small RNA data sets comprising more than 200 million aligned Illumina sequence reads covering all major cell types of the root as well as four distinct developmental zones. MicroRNAs (miRNAs) constitute a class of small ncRNAs that are particularly important for development. Of the 243 known miRNAs, 133 were found to be expressed in the root, and most showed tissue- or zone-specific expression patterns. We identified 66 new high-confidence miRNAs using a computational pipeline, PIPmiR, specifically developed for the identification of plant miRNAs. PIPmiR uses a probabilistic model that combines RNA structure and expression information to identify miRNAs with high precision. Knockdown of three of the newly identified miRNAs results in altered root growth phenotypes, confirming that novel miRNAs predicted by PIPmiR have functional relevance.

1.
Kery, M. B., Feldman, M., Livny, J. & Tjaden, B. TargetRNA2: identifying targets of small regulatory RNAs in bacteria. Nucl. Acids Res. 42, W124–W129 (2014).

Abstract: Many small, noncoding RNAs (sRNAs) in bacteria act as posttranscriptional regulators of messenger RNAs. TargetRNA2 is a web server that identifies mRNA targets of sRNA regulatory action in bacteria. As input, TargetRNA2 takes the sequence of an sRNA and the name of a sequenced bacterial replicon. When searching for targets of RNA regulation, TargetRNA2 uses a variety of features, including conservation of the sRNA in other bacteria, the secondary structure of the sRNA, the secondary structure of each candidate mRNA target and the hybridization energy between the sRNA and each candidate mRNA target. TargetRNA2 outputs a ranked list of likely regulatory targets for the input sRNA. When evaluated on a comprehensive set of sRNA-target interactions, TargetRNA2 was found to be both accurate and efficient in identifying targets of sRNA regulatory action. Furthermore, TargetRNA2 has the ability to integrate RNA-seq data, if available. If an sRNA is differentially expressed in two or more RNA-seq experiments, TargetRNA2 considers co-differential gene expression when searching for regulatory targets, significantly improving the accuracy of target identifications. The TargetRNA2 web server is freely available for use at http://cs.wellesley.edu/~btjaden/TargetRNA2.

1.
Zhang, Z. et al. PMRD: plant microRNA database. Nucl. Acids Res. 38, D806–D813 (2010).

Abstract: MicroRNAs (miRNA) are ∼21 nucleotide-long non-coding small RNAs, which function as post-transcriptional regulators in eukaryotes. miRNAs play essential roles in regulating plant growth and development. In recent years, research into the mechanism and consequences of miRNA action has made great progress. With whole genome sequence available in such plants as Arabidopsis thaliana, Oryza sativa, Populus trichocarpa, Glycine max, etc., it is desirable to develop a plant miRNA database through the integration of large amounts of information about publicly deposited miRNA data. The plant miRNA database (PMRD) integrates available plant miRNA data deposited in public databases, gleaned from the recent literature, and data generated in-house. This database contains sequence information, secondary structure, target genes, expression profiles and a genome browser. In total, there are 8433 miRNAs collected from 121 plant species in PMRD, including model plants and major crops such as Arabidopsis, rice, wheat, soybean, maize, sorghum, barley, etc. For Arabidopsis, rice, poplar, soybean, cotton, medicago and maize, we included the possible target genes for each miRNA with a predicted interaction site in the database. Furthermore, we provided miRNA expression profiles in the PMRD, including our local rice oxidative stress related microarray data (LC Sciences miRPlants_10.1) and the recently published microarray data for poplar, Arabidopsis, tomato, maize and rice. The PMRD database was constructed by open source technology utilizing a user-friendly web interface, and multiple search tools. The PMRD is freely available at http://bioinformatics.cau.edu.cn/PMRD. We expect PMRD to be a useful tool for scientists in the miRNA field in order to study the function of miRNAs and their target genes, especially in model plants and major crops.

1.
Rusinov, V., Baev, V., Minkov, I. N. & Tabler, M. MicroInspector: a web tool for detection of miRNA binding sites in an RNA sequence. Nucl. Acids Res. 33, W696–W700 (2005).

Abstract: Regulation of post-transcriptional gene expression by microRNAs (miRNA) has so far been validated for only a few mRNA targets. Based on the large number of miRNA genes and the possibility that one miRNA might influence gene expression of several targets simultaneously, the quantity of ribo-regulated genes is expected to be much higher. Here, we describe the web tool MicroInspector that will analyse a user-defined RNA sequence, which is typically an mRNA or a part of an mRNA, for the occurrence of binding sites for known and registered miRNAs. The program allows variation of temperature, the setting of energy values as well as the selection of different miRNA databases to identify miRNA-binding sites of different strength. MicroInspector could spot the correct sites for miRNA-interaction in known target mRNAs. Using other mRNAs, for which such an interaction has not yet been described, we discovered frequently potential miRNA binding sites of similar quality, which can now be analysed experimentally. The MicroInspector program is easy to use and does not require specific computer skills. The service can be accessed via the MicroInspector web server at http://www.imbb.forth.gr/microinspector.

1.
Sun, X. et al. PMTED: a plant microRNA target expression database. BMC Bioinformatics 14, 174 (2013).

PMID: 23725466

1.
Vlachos, I. S. et al. DIANA-TarBase v7.0: indexing more than half a million experimentally supported miRNA:mRNA interactions. Nucl. Acids Res. 43, D153–D159 (2015).

Abstract: microRNAs (miRNAs) are short non-coding RNA species, which act as potent gene expression regulators. Accurate identification of miRNA targets is crucial to understanding their function. Currently, hundreds of thousands of miRNA:gene interactions have been experimentally identified. However, this wealth of information is fragmented and hidden in thousands of manuscripts and raw next-generation sequencing data sets. DIANA-TarBase was initially released in 2006 and it was the first database aiming to catalog published experimentally validated miRNA:gene interactions. DIANA-TarBase v7.0 (http://www.microrna.gr/tarbase) aims to provide for the first time hundreds of thousands of high-quality manually curated experimentally validated miRNA:gene interactions, enhanced with detailed meta-data. DIANA-TarBase v7.0 enables users to easily identify positive or negative experimental results, the utilized experimental methodology, experimental conditions including cell/tissue type and treatment. The new interface provides also advanced information ranging from the binding site location, as identified experimentally as well as in silico, to the primer sequences used for cloning experiments. More than half a million miRNA:gene interactions have been curated from published experiments on 356 different cell types from 24 species, corresponding to 9- to 250-fold more entries than any other relevant database. DIANA-TarBase v7.0 is freely available.

1.
Friedman, R. C., Farh, K. K.-H., Burge, C. B. & Bartel, D. P. Most mammalian mRNAs are conserved targets of microRNAs. Genome Res. 19, 92–105 (2009).

Abstract: MicroRNAs (miRNAs) are small endogenous RNAs that pair to sites in mRNAs to direct post-transcriptional repression. Many sites that match the miRNA seed (nucleotides 2–7), particularly those in 3′ untranslated regions (3′UTRs), are preferentially conserved. Here, we overhauled our tool for finding preferential conservation of sequence motifs and applied it to the analysis of human 3′UTRs, increasing by nearly threefold the detected number of preferentially conserved miRNA target sites. The new tool more efficiently incorporates new genomes and more completely controls for background conservation by accounting for mutational biases, dinucleotide conservation rates, and the conservation rates of individual UTRs. The improved background model enabled preferential conservation of a new site type, the “offset 6mer,” to be detected. In total, >45,000 miRNA target sites within human 3′UTRs are conserved above background levels, and >60% of human protein-coding genes have been under selective pressure to maintain pairing to miRNAs. Mammalian-specific miRNAs have far fewer conserved targets than do the more broadly conserved miRNAs, even when considering only more recently emerged targets. Although pairing to the 3′ end of miRNAs can compensate for seed mismatches, this class of sites constitutes less than 2% of all preferentially conserved sites detected. The new tool enables statistically powerful analysis of individual miRNA target sites, with the probability of preferentially conserved targeting (PCT) correlating with experimental measurements of repression. Our expanded set of target predictions (including conserved 3′-compensatory sites), are available at the TargetScan website, which displays the PCT for each site and each predicted target.
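
Note: The seed definition in this abstract (miRNA nucleotides 2–7) can be illustrated with a minimal Python sketch that locates plain seed matches in a 3′UTR. This is only an illustration of seed pairing, not the TargetScan method: it ignores conservation, background models and the other site types, and the function names, the hsa-let-7a-5p input sequence and the toy UTR string are assumptions introduced here.

```python
# Watson-Crick complements for RNA bases.
COMPLEMENT = {"A": "U", "U": "A", "G": "C", "C": "G"}


def seed_match(mirna):
    """Return the mRNA sequence (5'->3') that pairs with miRNA nucleotides 2-7."""
    seed = mirna[1:7]  # nucleotides 2-7 in 1-based numbering
    # The target site is the reverse complement of the seed.
    return "".join(COMPLEMENT[base] for base in reversed(seed))


def find_seed_sites(mirna, utr):
    """Return 0-based start positions of every seed match in the UTR."""
    site = seed_match(mirna)
    positions = []
    start = utr.find(site)
    while start != -1:
        positions.append(start)
        start = utr.find(site, start + 1)  # allow overlapping hits
    return positions


# Illustrative input (assumption): mature hsa-let-7a-5p sequence.
LET7A = "UGAGGUAGUAGGUUGUAUAGUU"

print(seed_match(LET7A))                     # UACCUC
print(find_seed_sites(LET7A, "AAUACCUCAA"))  # [2]
```

A real target predictor would add the flanking-position checks and conservation scoring that the abstract describes; the sketch stops at the sequence match.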

1.
Muller, H., Marzi, M. J. & Nicassio, F. IsomiRage: from functional classification to differential expression of miRNA isoforms. Front. Bioeng. Biotechnol. 2, 38 (2014).

Abstract: As more small RNA sequencing libraries are becoming available, it clearly emerges that microRNAs (miRNAs) are highly heterogeneous both in length and sequence. In comparison to canonical miRNAs, miRNA isoforms (termed as “isomiRs”) might exhibit different biological properties, such as a different target repertoire, or enhanced/reduced stability. Nonetheless, this layer of information has remained largely unexplored due to the scarcity of small RNA NGS-datasets and the absence of proper analytical tools. Here, we present a workflow for the characterization and analysis of miRNAs and their variants in next-generation sequencing datasets. IsomiRs can originate from an alternative dicing event (“templated” forms) or from the addition of nucleotides through an enzymatic activity or target-dependent mechanisms (“non-templated” forms). Our pipeline allows distinguishing canonical miRNAs from templated and non-templated isomiRs by alignment to a custom database, which comprises all possible 3′-, 5′-, and trimmed variants. Functionally equivalent isomiRs can be grouped together according to the type of modification (e.g., uridylation, adenylation, trimming …) to assess which miRNAs are more intensively modified in a given biological context. When applied to the analysis of primary epithelial breast cancer cells, our methodology provided a 40% increase in the number of detected miRNA species and allowed to easily identify and classify more than 1000 variants. Most modifications were compatible with templated IsomiRs, as a consequence of imprecise Drosha or Dicer cleavage. However, some non-templated variants were consistently found either in the normal or in the cancer cells, with the 3′-end adenylation and uridylation as the most frequent events, suggesting that miRNA post-transcriptional modification frequently occurs. 
In conclusion, our analytical tool permits the deconvolution of miRNA heterogeneity and could be used to explore the functional role of miRNA isoforms.

1.
Leclercq, M., Diallo, A. B. & Blanchette, M. Computational prediction of the localization of microRNAs within their pre-miRNA. Nucl. Acids Res. 41, 7200–7211 (2013).

Abstract: MicroRNAs (miRNAs) are short RNA species derived from hairpin-forming miRNA precursors (pre-miRNA) and acting as key posttranscriptional regulators. Most computational tools labeled as miRNA predictors are in fact pre-miRNA predictors and provide no information about the putative miRNA location within the pre-miRNA. Sequence and structural features that determine the location of the miRNA, and the extent to which these properties vary from species to species, are poorly understood. We have developed miRdup, a computational predictor for the identification of the most likely miRNA location within a given pre-miRNA or the validation of a candidate miRNA. MiRdup is based on a random forest classifier trained with experimentally validated miRNAs from miRbase, with features that characterize the miRNA–miRNA* duplex. Because we observed that miRNAs have sequence and structural properties that differ between species, mostly in terms of duplex stability, we trained various clade-specific miRdup models and obtained increased accuracy. MiRdup self-trains on the most recent version of miRbase and is easy to use. Combined with existing pre-miRNA predictors, it will be valuable for both de novo mapping of miRNAs and filtering of large sets of candidate miRNAs obtained from transcriptome sequencing projects. MiRdup is open source under the GPLv3 and available at http://www.cs.mcgill.ca/~blanchem/mirdup/.

1.
Manyam, G., Ivan, C., Calin, G. A. & Coombes, K. R. targetHub: a programmable interface for miRNA–gene interactions. Bioinformatics 29, 2657–2658 (2013).

Abstract: Motivation: With the expansion of high-throughput technologies, understanding different kinds of genome-level data is a common task. MicroRNA (miRNA) is increasingly profiled using high-throughput technologies (microarrays or next-generation sequencing). The downstream analysis of miRNA targets can be difficult. Although there are many databases and algorithms to predict miRNA targets, there are few tools to integrate miRNA–gene interaction data into high-throughput genomic analyses. Results: We present targetHub, a CouchDB database of miRNA–gene interactions. TargetHub provides a programmer-friendly interface to access miRNA targets. The Web site provides RESTful access to miRNA–gene interactions with an assortment of gene and miRNA identifiers. It can be a useful tool to integrate miRNA target interaction data directly into high-throughput bioinformatics analyses. Availability: TargetHub is available on the web at http://app1.bioinformatics.mdanderson.org/tarhub/_design/basic/index.html. Contact: coombes.3@osu.edu

1.
Busch, A., Richter, A. S. & Backofen, R. IntaRNA: efficient prediction of bacterial sRNA targets incorporating target site accessibility and seed regions. Bioinformatics 24, 2849–2856 (2008).

Abstract: Motivation: During the last few years, several new small regulatory RNAs (sRNAs) have been discovered in bacteria. Most of them act as post-transcriptional regulators by base pairing to a target mRNA, causing translational repression or activation, or mRNA degradation. Numerous sRNAs have already been identified, but the number of experimentally verified targets is considerably lower. Consequently, computational target prediction is in great demand. Many existing target prediction programs neglect the accessibility of target sites and the existence of a seed, while other approaches are either specialized to certain types of RNAs or too slow for genome-wide searches. Results: We introduce INTARNA, a new general and fast approach to the prediction of RNA–RNA interactions incorporating accessibility of target sites as well as the existence of a user-definable seed. We successfully applied INTARNA to the prediction of bacterial sRNA targets and determined the exact locations of the interactions with a higher accuracy than competing programs. Availability: http://www.bioinf.uni-freiburg.de/Software/ Contact: IntaRNA@informatik.uni-freiburg.de Supplementary information: Supplementary data are available at Bioinformatics online.

1.
Brennecke, J., Stark, A., Russell, R. B. & Cohen, S. M. Principles of MicroRNA–Target Recognition. PLoS Biol 3, e85 (2005).

Abstract: MicroRNA target site recognition falls into two broad categories: 5' dominant sites that require little support from microRNA 3' end; and 3' compensatory sites that require strong 3' pairing to function.

1.
Gerlach, W. & Giegerich, R. GUUGle: a utility for fast exact matching under RNA complementary rules including G–U base pairing. Bioinformatics 22, 762–764 (2006).

Abstract: Motivation: RNA secondary structure analysis often requires searching for potential helices in large sequence data. Results: We present a utility program GUUGle that efficiently locates potential helical regions under RNA base pairing rules, which include Watson–Crick as well as G–U pairs. It accepts a positive and a negative set of sequences, and determines all exact matches under RNA rules between positive and negative sequences that exceed a specified length. The GUUGle algorithm can also be adapted to use a precomputed suffix array of the positive sequence set. We show how this program can be effectively used as a filter preceding a more computationally expensive task such as miRNA target prediction. Availability: GUUGle is available via the Bielefeld Bioinformatics Server at http://bibiserv.techfak.uni-bielefeld.de/guugle Contact: robert@TechFak.Uni-Bielefeld.DE
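
Note: The matching rule described in this abstract (exact matches under Watson–Crick plus G–U pairing, above a minimum length) can be sketched in a few lines of Python. This is a naive quadratic scan written for illustration only; it is not the GUUGle algorithm, which the abstract says can use a suffix array, and the function name and return format are assumptions introduced here.

```python
# Base pairs allowed under RNA rules: Watson-Crick plus the G-U wobble pair.
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}


def helix_matches(query, target, min_len):
    """Find maximal antiparallel runs of paired bases of at least min_len.

    Returns a list of (query_start, target_start, length) tuples, with
    0-based start positions in the original (unreversed) sequences.
    """
    rev = target[::-1]  # pairing is antiparallel, so scan the reversed target
    n, m = len(query), len(rev)
    hits = []
    # Walk every diagonal of the query x reversed-target comparison matrix.
    for d in range(-(m - 1), n):
        i, j = (d, 0) if d >= 0 else (0, -d)
        run = 0
        while i < n and j < m:
            if (query[i], rev[j]) in PAIRS:
                run += 1
            else:
                if run >= min_len:
                    hits.append((i - run, m - j, run))
                run = 0
            i += 1
            j += 1
        if run >= min_len:  # run that reaches the end of the diagonal
            hits.append((i - run, m - j, run))
    return hits


# G pairs with U only because the wobble pair is allowed.
print(helix_matches("GGGG", "UUUU", 4))  # [(0, 0, 4)]
```

Used as a prefilter, a scan like this would discard sequence pairs with no sufficiently long helix before a more expensive target-prediction step is run.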

1.
Sablok, G. et al. isomiRex: Web-based identification of microRNAs, isomiR variations and differential expression using next-generation sequencing datasets. FEBS Letters 587, 2629–2634 (2013).

Abstract: We present an open-access web platform isomiRex, to identify isomiRs and on the fly graphical visualization of the differentially expressed miRNAs in control as well as treated library. The open-access web-platform is not restricted only to NGS sequence dataset from animals and potentially analyzes a wider dataset for plants, animals and viral NGS dataset supporting miRBase (version 19 supporting 193 species). The platform can handle the bloated amount of the read counts and reports the annotated microRNAs from plant, animal and viral NGS datasets. isomiRex also provides an estimation of the isomiRs, of miRNAs with higher copy number relative to their mature reference sequences indexed in miRBase (version 19 supporting 193 species). Visually enhanced graphs potentially display differentially expressed isomiRs, which will help the user to demonstrate and correlate the abundance of the isomiR as a signature event to the specific condition. An additional module for estimating the differential expression has been implemented allowing the users to postulate the differential expression across the user input samples. The developed web-platform can be accessed at http://bioinfo1.uni-plovdiv.bg/isomiRex/.

1.
Luo, G.-Z., Yang, W., Ma, Y.-K. & Wang, X.-J. ISRNA: an integrative online toolkit for short reads from high-throughput sequencing data. Bioinformatics 30, 434–436 (2014).

Abstract: Summary: Integrative Short Reads NAvigator (ISRNA) is an online toolkit for analyzing high-throughput small RNA sequencing data. Besides the high-speed genome mapping function, ISRNA provides statistics for genomic location, length distribution and nucleotide composition bias analysis of sequence reads. Number of reads mapped to known microRNAs and other classes of short non-coding RNAs, coverage of short reads on genes, expression abundance of sequence reads as well as some other analysis functions are also supported. The versatile search functions enable users to select sequence reads according to their sub-sequences, expression abundance, genomic location, relationship to genes, etc. A specialized genome browser is integrated to visualize the genomic distribution of short reads. ISRNA also supports management and comparison among multiple datasets. Availability: ISRNA is implemented in Java/C++/Perl/MySQL and can be freely accessed at http://omicslab.genetics.ac.cn/ISRNA/. Contact: xjwang@genetics.ac.cn Supplementary information: Supplementary data are available at Bioinformatics online.

1.
Stocks, M. B. et al. The UEA sRNA workbench: a suite of tools for analysing and visualizing next generation sequencing microRNA and small RNA datasets. Bioinformatics 28, 2059–2061 (2012).

Abstract: Summary: RNA silencing is a complex, highly conserved mechanism mediated by small RNAs (sRNAs), such as microRNAs (miRNAs), that is known to be involved in a diverse set of biological functions including development, pathogen control, genome maintenance and response to environmental change. Advances in next generation sequencing technologies are producing increasingly large numbers of sRNA reads per sample at a fraction of the cost of previous methods. However, many bioinformatics tools do not scale accordingly, are cumbersome, or require extensive support from bioinformatics experts. Therefore, researchers need user-friendly, robust tools, capable of not only processing large sRNA datasets in a reasonable time frame but also presenting the results in an intuitive fashion and visualizing sRNA genomic features. Herein, we present the UEA sRNA workbench, a suite of tools that is a successor to the web-based UEA sRNA Toolkit, but in downloadable format and with several enhanced and additional features. Availability: The program and help pages are available at http://srna-workbench.cmp.uea.ac.uk. Contact: vincent.moulton@cmp.uea.ac.uk

1.
Isakov, O. et al. Novel insight into the non-coding repertoire through deep sequencing analysis. Nucl. Acids Res. 40, e86–e86 (2012).

Abstract: Non-coding RNAs (ncRNA) account for a large portion of the transcribed genomic output. This diverse family of untranslated RNA molecules play a crucial role in cellular function. The use of ‘deep sequencing’ technology (also known as ‘next generation sequencing’) to infer transcript expression levels in general, and ncRNA specifically, is becoming increasingly common in molecular and clinical laboratories. We developed a software termed ‘RandA’ (which stands for ncRNA Read-and-Analyze) that performs comprehensive ncRNA profiling and differential expression analysis on deep sequencing generated data through a graphical user interface running on a local personal computer. Using RandA, we reveal the complexity of the ncRNA repertoire in a given cell population. We further demonstrate the relevance of such an extensive ncRNA analysis by elucidating a multitude of characterizing features in pathogen infected mammalian cells. RandA is available for download at http://ibis.tau.ac.il/RandA.

1.
Xie, F., Xiao, P., Chen, D., Xu, L. & Zhang, B. miRDeepFinder: a miRNA analysis tool for deep sequencing of plant small RNAs. Plant Mol Biol 80, 75–84 (2012).

Abstract: miRDeepFinder is a software package developed to identify and functionally analyze plant microRNAs (miRNAs) and their targets from small RNA datasets obtained from deep sequencing. The functions available in miRDeepFinder include pre-processing of raw data, identifying conserved miRNAs, mining and classifying novel miRNAs, miRNA expression profiling, predicting miRNA targets, and gene pathway and gene network analysis involving miRNAs. The fundamental design of miRDeepFinder is based on miRNA biogenesis, miRNA-mediated gene regulation and target recognition, such as perfect or near perfect hairpin structures, different read abundances of miRNA and miRNA*, and targeting patterns of plant miRNAs. To test the accuracy and robustness of miRDeepFinder, we analyzed a small RNA deep sequencing dataset of Arabidopsis thaliana published in the GEO database of NCBI. Our test retrieved 128 of 131 (97.7%) known miRNAs that have a more than 3 read count in Arabidopsis. Because many known miRNAs are not associated with miRNA*s in small RNA datasets, miRDeepFinder was also designed to recover miRNA candidates without the presence of miRNA*. To mine as many miRNAs as possible, miRDeepFinder allows users to compare mature miRNAs and their miRNA*s with other small RNA datasets from the same species. CleaveLand software package was also incorporated into miRDeepFinder for miRNA target identification using degradome sequencing analysis. Using this new computational tool, we identified 13 novel miRNA candidates with miRNA*s from Arabidopsis and validated 12 of them experimentally. Interestingly, of the 12 verified novel miRNAs, a miRNA named AC1 spans the exons of two genes (UGT71C4 and UGT71C3). Both the mature AC1 miRNA and its miRNA* were also found in four other small RNA datasets. We also developed a tool, “miRNA primer designer” to design primers for any type of miRNAs.
miRDeepFinder provides a powerful tool for analyzing small RNA datasets from all species, with or without the availability of genome information. miRDeepFinder and miRNA primer designer are freely available at http://www.leonxie.com/DeepFinder.php and at http://www.leonxie.com/miRNAprimerDesigner.php, respectively. A program (called RefFinder: http://www.leonxie.com/referencegene.php) was also developed for assessing the reliable reference genes for gene expression analysis, including miRNAs.

1.
Betel, D., Wilson, M., Gabow, A., Marks, D. S. & Sander, C. The microRNA.org resource: targets and expression. Nucl. Acids Res. 36, D149–D153 (2008).

Abstract: MicroRNA.org (http://www.microrna.org) is a comprehensive resource of microRNA target predictions and expression profiles. Target predictions are based on a development of the miRanda algorithm which incorporates current biological knowledge on target rules and on the use of an up-to-date compendium of mammalian microRNAs. MicroRNA expression profiles are derived from a comprehensive sequencing project of a large set of mammalian tissues and cell lines of normal and disease origin. Using an improved graphical interface, a user can explore (i) the set of genes that are potentially regulated by a particular microRNA, (ii) the implied cooperativity of multiple microRNAs on a particular mRNA and (iii) microRNA expression profiles in various tissues. To facilitate future updates and development, the microRNA.org database structure and software architecture is flexibly designed to incorporate new expression and target discoveries. The web resource provides users with functional information about the growing number of microRNAs and their interaction with target genes in many species and facilitates novel discoveries in microRNA gene regulation.

1.
Thadani, R. & Tammi, M. T. MicroTar: predicting microRNA targets from RNA duplexes. BMC Bioinformatics 7, S20 (2006).

Abstract: PMID: 17254305

1.
Mitra, R. & Bandyopadhyay, S. MultiMiTar: A Novel Multi Objective Optimization based miRNA-Target Prediction Method. PLoS ONE 6, e24583 (2011).

Abstract: Background Machine learning based miRNA-target prediction algorithms often fail to obtain a balanced prediction accuracy in terms of both sensitivity and specificity due to lack of the gold standard of negative examples, miRNA-targeting site context specific relevant features and efficient feature selection process. Moreover, all the sequence, structure and machine learning based algorithms are unable to distribute the true positive predictions preferentially at the top of the ranked list; hence the algorithms become unreliable to the biologists. In addition, these algorithms fail to obtain considerable combination of precision and recall for the target transcripts that are translationally repressed at protein level. Methodology/Principal Finding In the proposed article, we introduce an efficient miRNA-target prediction system MultiMiTar, a Support Vector Machine (SVM) based classifier integrated with a multiobjective metaheuristic based feature selection technique. The robust performance of the proposed method is mainly the result of using high quality negative examples and selection of biologically relevant miRNA-targeting site context specific features. The features are selected by using a novel feature selection technique AMOSA-SVM, that integrates the multi objective optimization technique Archived Multi-Objective Simulated Annealing (AMOSA) and SVM. Conclusions/Significance MultiMiTar is found to achieve much higher Matthew’s correlation coefficient (MCC) of 0.583 and average class-wise accuracy (ACA) of 0.8 compared to the others target prediction methods for a completely independent test data set. The obtained MCC and ACA values of these algorithms range from −0.269 to 0.155 and 0.321 to 0.582, respectively. Moreover, it shows a more balanced result in terms of precision and sensitivity (recall) for the translationally repressed data set as compared to all the other existing methods. 
An important aspect is that the true positive predictions are distributed preferentially at the top of the ranked list that makes MultiMiTar reliable for the biologists. MultiMiTar is now available as an online tool at www.isical.ac.in/~bioinfo_miu/multimitar.htm. MultiMiTar software can be downloaded from www.isical.ac.in/~bioinfo_miu/multimitar-download.htm.
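The two evaluation measures quoted in this abstract can be computed from a binary confusion matrix. The sketch below uses the standard definition of Matthews correlation coefficient. It assumes that average class-wise accuracy means the mean of sensitivity and specificity, which the abstract does not spell out. The function names are illustrative.

```python
import math


def mcc(tp, fp, tn, fn):
    """Matthews correlation coefficient from confusion-matrix counts."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    # By convention, return 0 when any marginal total is zero.
    return (tp * tn - fp * fn) / denom if denom else 0.0


def average_classwise_accuracy(tp, fp, tn, fn):
    """Mean of sensitivity and specificity (assumed definition of ACA)."""
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2
```

With 8 true positives, 2 false negatives, 8 true negatives and 2 false positives, the MCC is 0.6 and the ACA is 0.8.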

1.
Gupta, V., Markmann, K., Pedersen, C. N. S., Stougaard, J. & Andersen, S. U. shortran: a pipeline for small RNA-seq data analysis. Bioinformatics 28, 2698–2700 (2012).

Abstract: Summary: High-throughput sequencing currently generates a wealth of small RNA (sRNA) data, making data mining a topical issue. Processing of these large data sets is inherently multidimensional as length, abundance, sequence composition, and genomic location all hold clues to sRNA function. Analysis can be challenging because the formulation and testing of complex hypotheses requires combined use of visualization, annotation and abundance profiling. To allow flexible generation and querying of these disparate types of information, we have developed the shortran pipeline for analysis of plant or animal short RNA sequencing data. It comprises nine modules and produces both graphical and MySQL format output. Availability: shortran is freely available and can be downloaded from http://users-mb.au.dk/pmgrp/shortran/ Contact: vgupta@cs.au.dk or sua@mb.au.dk Supplementary information: Supplementary data are available at Bioinformatics online.

1.
Barupal, J. K. et al. ExcellmiRDB for Translational Genomics: A Curated Online Resource for Extracellular MicroRNAs. OMICS: A Journal of Integrative Biology 19, 24–30 (2015).

Abstract: A large number of studies have suggested extracellular microRNAs (microRNAs in biofluids) as potential noninvasive biomarkers for pathophysiological conditions such as cancer. However, reported differentially expressed signatures of extracellular miRNAs in diseases are not uniformly consistent among studies. Here, we present “ExcellmiRDB”, a curated online database that provides integrated information about miRNAs levels in biofluids in a user-friendly way. Although many miRNA databases, including disease-oriented databases, have been launched before, the ExcellmiRDB is so far the only one specialized for storing curated data on miRNA levels in biofluid samples. At present, ExcellmiRDB has 2773 disease-extracellular miRNAs and 1108 biofluid-extracellular miRNAs relationships curated from 108 articles selected from more than 600 surveyed PubMed abstracts. Information about 992 miRNAs, 82 diseases, 21 biofluids, 8 species, 63 normalization reference genes, 5 techniques, 14 GEO profiles accession numbers, 7 human ethnic groups, and 18 compared clinical biomarkers have been provided in the database. A user can query ExcellmiRDB by selecting a disease or a miRNA or a biofluid. Additionally, the database provides two online network graphs to visualize and interact with the content of the database. The first network shows disease-extracellular miRNAs relationships, along with expression patterns and number of articles for a relationship. The second network visualizes biofluid-extracellular miRNAs relationships showing miRNAs spectrum across different types of biofluids. In conclusion, ExcellmiRDB is a new innovative resource for both academic and industrial researchers in translational omics who are developing miRNA biomarkers for noninvasive diagnostic or prognostic technologies. ExcellmiRDB is publicly available on www.excellmirdb.brfjaisalmer.com/.

1.
Tempel, S. & Tahi, F. A fast ab-initio method for predicting miRNA precursors in genomes. Nucl. Acids Res. 40, e80–e80 (2012).

Abstract: miRNAs are small non coding RNA structures which play important roles in biological processes. Finding miRNA precursors in genomes is therefore an important task, where computational methods are required. The goal of these methods is to select potential pre-miRNAs which could be validated by experimental methods. With the new generation of sequencing techniques, it is important to have fast algorithms that are able to treat whole genomes in acceptable times. We developed an algorithm based on an original method where an approximation of miRNA hairpins are first searched, before reconstituting the pre-miRNA structure. The approximation step allows a substantial decrease in the number of possibilities and thus the time required for searching. Our method was tested on different genomic sequences, and was compared with CID-miRNA, miRPara and VMir. It gives in almost all cases better sensitivity and selectivity. It is faster than CID-miRNA, miRPara and VMir: it takes ∼30 s to process a 1 MB sequence, when VMir takes 30 min, miRPara takes 20 h and CID-miRNA takes 55 h. We present here a fast ab-initio algorithm for searching for pre-miRNA precursors in genomes, called miRNAFold. miRNAFold is available at http://EvryRNA.ibisc.univ-evry.fr/.
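The running times quoted in this abstract for a 1 MB sequence can be turned into approximate speed-up factors. The sketch below simply converts the reported times to seconds and divides. The figures are the abstract's rounded values, so the ratios are rough.

```python
# Reported times for a 1 MB sequence, converted to seconds.
SECONDS = {
    "miRNAFold": 30,
    "VMir": 30 * 60,
    "miRPara": 20 * 3600,
    "CID-miRNA": 55 * 3600,
}


def speedups(times, baseline="miRNAFold"):
    """Ratio of each tool's running time to the baseline tool's time."""
    base = times[baseline]
    return {tool: t / base for tool, t in times.items() if tool != baseline}
```

On these figures miRNAFold is about 60 times faster than VMir, 2400 times faster than miRPara and 6600 times faster than CID-miRNA.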

1.
Mathelier, A. & Carbone, A. MIReNA: finding microRNAs with high accuracy and no learning at genome scale and from deep sequencing data. Bioinformatics 26, 2226–2234 (2010).

Abstract: Motivation: MicroRNAs (miRNAs) are a class of endogenes derived from a precursor (pre-miRNA) and involved in post-transcriptional regulation. Experimental identification of novel miRNAs is difficult because they are often transcribed under specific conditions and cell types. Several computational methods were developed to detect new miRNAs starting from known ones or from deep sequencing data, and to validate their pre-miRNAs. Results: We present a genome-wide search algorithm, called MIReNA, that looks for miRNA sequences by exploring a multidimensional space defined by only five (physical and combinatorial) parameters characterizing acceptable pre-miRNAs. MIReNA validates pre-miRNAs with high sensitivity and specificity, and detects new miRNAs by homology from known miRNAs or from deep sequencing data. A performance comparison between MIReNA and four available predictive systems has been done. MIReNA approach is strikingly simple but it turns out to be powerful at least as much as more sophisticated algorithmic methods. MIReNA obtains better results than three known algorithms that validate pre-miRNAs. It demonstrates that machine-learning is not a necessary algorithmic approach for pre-miRNAs computational validation. In particular, machine learning algorithms can only confirm pre-miRNAs that look alike known ones, this being a limitation while exploring species with no known pre-miRNAs. The possibility to adapt the search to specific species, possibly characterized by specific properties of their miRNAs and pre-miRNAs, is a major feature of MIReNA. A parameter adjustment calibrates specificity and sensitivity in MIReNA, a key feature for predictive systems, which is not present in machine learning approaches. Comparison of MIReNA with miRDeep using deep sequencing data to predict miRNAs highlights a highly specific predictive power of MIReNA. 
Availability: At the address http://www.ihes.fr/~carbone/data8/ Contact: alessandra.carbone@lip6.fr Supplementary information: Supplementary data are available at Bioinformatics online.

1.
Gudyś, A., Szcześniak, M. W., Sikora, M. & Makałowska, I. HuntMi: an efficient and taxon-specific approach in pre-miRNA identification. BMC Bioinformatics 14, 83 (2013).

Abstract: PMID: 23497112

1.
Wu, H.-J., Ma, Y.-K., Chen, T., Wang, M. & Wang, X.-J. PsRobot: a web-based plant small RNA meta-analysis toolbox. Nucl. Acids Res. 40, W22–W28 (2012).

Abstract: Small RNAs (smRNAs) in plants, mainly microRNAs and small interfering RNAs, play important roles in both transcriptional and post-transcriptional gene regulation. The broad application of high-throughput sequencing technology has made routinely generation of bulk smRNA sequences in laboratories possible, thus has significantly increased the need for batch analysis tools. PsRobot is a web-based easy-to-use tool dedicated to the identification of smRNAs with stem-loop shaped precursors (such as microRNAs and short hairpin RNAs) and their target genes/transcripts. It performs fast analysis to identify smRNAs with stem-loop shaped precursors among batch input data and predicts their targets using a modified Smith–Waterman algorithm. PsRobot integrates the expression data of smRNAs in major plant smRNA biogenesis gene mutants and smRNA-associated protein complexes to give clues to the smRNA generation and functional processes. Besides improved specificity, the reliability of smRNA target prediction results can also be evaluated by mRNA cleavage (degradome) data. The cross species conservation statuses and the multiplicity of smRNA target sites are also provided. PsRobot is freely accessible at http://omicslab.genetics.ac.cn/psRobot/.
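The abstract states that targets are predicted with a modified Smith–Waterman algorithm but does not describe the modification. For orientation only, here is a minimal sketch of the standard, unmodified Smith–Waterman local alignment score with a linear gap penalty. The scoring values and the function name are assumptions and do not reflect PsRobot's actual parameters.

```python
def smith_waterman_score(a, b, match=2, mismatch=-1, gap=-2):
    """Best local alignment score between strings a and b (linear gaps)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for i in range(1, len(a) + 1):
        cur = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            diag = prev[j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: scores are never allowed to drop below zero.
            cur[j] = max(0, diag, prev[j] + gap, cur[j - 1] + gap)
            best = max(best, cur[j])
        prev = cur
    return best
```

In a target-search setting one would typically align a small RNA against the reverse complement of a candidate transcript region. That step is omitted here.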

1.
Sewer, A. et al. Identification of clustered microRNAs using an ab initio prediction method. BMC Bioinformatics 6, 267 (2005).

Abstract: PMID: 16274478

1.
John, B. et al. Human MicroRNA Targets. PLoS Biol 2, e363 (2004).

Abstract: This computational analysis provides evidence that as many as 10% of human genes are targets for regulation by small RNA molecules called microRNAs.

1.
Szcześniak, M. W. & Makałowska, I. miRNEST 2.0: a database of plant and animal microRNAs. Nucleic Acids Res 42, D74–D77 (2014).

Abstract: Ever growing interest in microRNAs has immensely populated the number of resources and research papers devoted to the field and, as a result, it becomes more and more demanding to find miRNA data of interest. To mitigate this problem, we created miRNEST database (http://mirnest.amu.edu.pl), an integrative microRNAs resource. In its updated version, named miRNEST 2.0, the database is complemented with our extensive miRNA predictions from deep sequencing libraries, data from plant degradome analyses, results of pre-miRNA classification with HuntMi and miRNA splice sites information. We also added download and upload options and improved the user interface to make it easier to browse through miRNA records.

1.
Zhu, E. et al. mirTools: microRNA profiling and discovery based on high-throughput sequencing. Nucl. Acids Res. 38, W392–W397 (2010).

Abstract: miRNAs are small, non-coding RNA that negatively regulate gene expression at post-transcriptional level, which play crucial roles in various physiological and pathological processes, such as development and tumorigenesis. Although deep sequencing technologies have been applied to investigate various small RNA transcriptomes, their computational methods are far away from maturation as compared to microarray-based approaches. In this study, a comprehensive web server mirTools was developed to allow researchers to comprehensively characterize small RNA transcriptome. With the aid of mirTools, users can: (i) filter low-quality reads and 3′/5′ adapters from raw sequenced data; (ii) align large-scale short reads to the reference genome and explore their length distribution; (iii) classify small RNA candidates into known categories, such as known miRNAs, non-coding RNA, genomic repeats and coding sequences; (iv) provide detailed annotation information for known miRNAs, such as miRNA/miRNA*, absolute/relative reads count and the most abundant tag; (v) predict novel miRNAs that have not been characterized before; and (vi) identify differentially expressed miRNAs between samples based on two different counting strategies: total read tag counts and the most abundant tag counts. We believe that the integration of multiple computational approaches in mirTools will greatly facilitate current microRNA researches in multiple ways. mirTools can be accessed at http://centre.bioinformatics.zj.cn/mirtools/ and http://59.79.168.90/mirtools.
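The two counting strategies named in step (vi) can be illustrated as follows. This is only a sketch of the idea, assuming each miRNA comes with a mapping from its distinct read tags to read counts. The data layout and function name are hypothetical and are not taken from mirTools.

```python
def count_strategies(tags_by_mirna):
    """Summarise each miRNA by total tag count and most abundant tag count.

    tags_by_mirna maps a miRNA name to a dict of {tag sequence: read count}.
    """
    return {
        mirna: {
            "total": sum(tags.values()),
            "most_abundant": max(tags.values(), default=0),
        }
        for mirna, tags in tags_by_mirna.items()
    }
```

The total count pools all sequence variants of a miRNA. The most abundant tag count follows only the dominant variant, so the two strategies can disagree when variant proportions shift between samples.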

1.
Teune, J.-H. & Steger, G. NOVOMIR: De Novo Prediction of MicroRNA-Coding Regions in a Single Plant-Genome. J Nucleic Acids 2010, (2010).

Abstract: MicroRNAs (miRNA) are small regulatory, noncoding RNA molecules that are transcribed as primary miRNAs (pri-miRNA) from eukaryotic genomes. At least in plants, their regulatory activity is mediated through base-pairing with protein-coding messenger RNAs (mRNA) followed by mRNA degradation or translation repression. We describe NOVOMIR, a program for the identification of miRNA genes in plant genomes. It uses a series of filter steps and a statistical model to discriminate a pre-miRNA from other RNAs and does rely neither on prior knowledge of a miRNA target nor on comparative genomics. The sensitivity and specificity of NOVOMIR for detection of pre-miRNAs from Arabidopsis thaliana is ~0.83 and ~0.99, respectively. Plant pre-miRNAs are more heterogeneous with respect to size and structure than animal pre-miRNAs. Despite these difficulties, NOVOMIR is well suited to perform searches for pre-miRNAs on a genomic scale. NOVOMIR is written in Perl and relies on two additional, free programs for prediction of RNA secondary structure (RNALFOLD, RNASHAPES).

1.
Barenboim, M., Zoltick, B. J., Guo, Y. & Weinberger, D. R. MicroSNiPer: a web tool for prediction of SNP effects on putative microRNA targets. Hum. Mutat. 31, 1223–1232 (2010).

Abstract: MicroRNAs are short, approximately 22 nucleotide noncoding RNAs binding to partially complementary sites in the 3'UTR of target mRNAs. This process generally results in repression of multiple targets by a particular microRNA. There is substantial interest in methods designed to predict the microRNA targets and effect of single nucleotide polymorphisms (SNPs) on microRNA binding, given the impact of microRNA on posttranscriptional regulation and its potential relation to complex diseases. We developed a web-based application, MicroSNiPer, which predicts the impact of a SNP on putative microRNA targets. This application interrogates the 3'-untranslated region and predicts if a SNP within the target site will disrupt/eliminate or enhance/create a microRNA binding site. MicroSNiPer computes these sites and examines the effects of SNPs in real time. MicroSNiPer is a user-friendly Web-based tool. Its advantages include ease of use, flexibility, and straightforward graphical representation of the results. It is freely accessible at http://cbdb.nimh.nih.gov/microsniper.

1.
Lei, J. & Sun, Y. miR-PREFeR: an accurate, fast and easy-to-use plant miRNA prediction tool using small RNA-Seq data. Bioinformatics 30, 2837–2839 (2014).

Abstract: Summary: Plant microRNA prediction tools that use small RNA-sequencing data are emerging quickly. These existing tools have at least one of the following problems: (i) high false-positive rate; (ii) long running time; (iii) work only for genomes in their databases; (iv) hard to install or use. We developed miR-PREFeR (miRNA PREdiction From small RNA-Seq data), which uses expression patterns of miRNA and follows the criteria for plant microRNA annotation to accurately predict plant miRNAs from one or more small RNA-Seq data samples of the same species. We tested miR-PREFeR on several plant species. The results show that miR-PREFeR is sensitive, accurate, fast and has low-memory footprint. Availability and implementation: https://github.com/hangelwen/miR-PREFeR Contact: yannisun@msu.edu Supplementary information: Supplementary data are available at Bioinformatics online.

1.
Pantano, L., Estivill, X. & Martí, E. SeqBuster, a bioinformatic tool for the processing and analysis of small RNAs datasets, reveals ubiquitous miRNA modifications in human embryonic cells. Nucl. Acids Res. 38, e34–e34 (2010).

Abstract: High-throughput sequencing technologies enable direct approaches to catalog and analyze snapshots of the total small RNA content of living cells. Characterization of high-throughput sequencing data requires bioinformatic tools offering a wide perspective of the small RNA transcriptome. Here we present SeqBuster, a highly versatile and reliable web-based toolkit to process and analyze large-scale small RNA datasets. The high flexibility of this tool is illustrated by the multiple choices offered in the pre-analysis for mapping purposes and in the different analysis modules for data manipulation. To overcome the storage capacity limitations of the web-based tool, SeqBuster offers a stand-alone version that permits the annotation against any custom database. SeqBuster integrates multiple analyses modules in a unique platform and constitutes the first bioinformatic tool offering a deep characterization of miRNA variants (isomiRs). The application of SeqBuster to small-RNA datasets of human embryonic stem cells revealed that most miRNAs present different types of isomiRs, some of them being associated to stem cell differentiation. The exhaustive description of the isomiRs provided by SeqBuster could help to identify miRNA-variants that are relevant in physiological and pathological processes. SeqBuster is available at http://estivill_lab.crg.es/seqbuster.

1.
Zhao, W. et al. wapRNA: a web-based application for the processing of RNA sequences. Bioinformatics 27, 3076–3077 (2011).

Abstract: Summary: mRNA/miRNA-seq technology is becoming the leading technology to globally profile gene expression and elucidate the transcriptional regulation mechanisms in living cells. Although there are many tools available for analyzing RNA-seq data, few of them are available as easy accessible online web tools for processing both mRNA and miRNA data for the RNA-seq based user community. As such, we have developed a comprehensive web application tool for processing mRNA-seq and miRNA-seq data. Our web tool wapRNA includes four different modules: mRNA-seq and miRNA-seq sequenced from SOLiD or Solexa platform and all the modules were tested on previously published experimental data. We accept raw sequence data with an optional reads filter, followed by mapping and gene annotation or miRNA prediction. wapRNA also integrates downstream functional analyses such as Gene Ontology, KEGG pathway, miRNA targets prediction and comparison of gene's or miRNA's different expression in different samples. Moreover, we provide the executable packages for installation on user's local server. Availability: wapRNA is freely available for use at http://waprna.big.ac.cn. The executable packages and the instruction for installation can be downloaded from our web site. Contact: husn@big.ac.cn; songshh@big.ac.cn Supplementary Information: Supplementary data are available at Bioinformatics online.

1.
Coronnello, C. et al. Novel Modeling of Combinatorial miRNA Targeting Identifies SNP with Potential Role in Bone Density. PLoS Comput Biol 8, e1002830 (2012).

Abstract: Author Summary: MicroRNA genes (miRNAs) are small non-coding RNAs that regulate the expression levels of mRNAs post-transcriptionally. miRNAs are critical in many important biological processes, like development, and are important markers for many diseases. Identifying the targets of miRNAs is not an easy task. Recent developments of high-throughput data collection methods for identification of all miRNA targets in a cell are promising, but they still depend on computational algorithms to identify the exact miRNA:mRNA interactions. In this paper we present a novel algorithm, ComiR, which addresses a more general question, that is, whether a given mRNA is targeted by a set of miRNAs. ComiR uses miRNA expression to improve the targeting models of four target prediction algorithms. Then it combines their predicted targets using a support vector machine. By applying ComiR to single nucleotide polymorphism (SNP) data, we identified a SNP that is likely to be causally associated to osteoporosis in women.

1.
Bleazard, T., Lamb, J. A. & Griffiths-Jones, S. Bias in microRNA functional enrichment analysis. Bioinformatics btv023 (2015) doi:10.1093/bioinformatics/btv023.

Abstract: Motivation: Many studies have investigated the differential expression of microRNAs (miRNAs) in disease states and between different treatments, tissues and developmental stages. Given a list of perturbed miRNAs, it is common to predict the shared pathways on which they act. The standard test for functional enrichment typically yields dozens of significantly enriched functional categories, many of which appear frequently in the analysis of apparently unrelated diseases and conditions. Results: We show that the most commonly used functional enrichment test is inappropriate for the analysis of sets of genes targeted by miRNAs. The hypergeometric distribution used by the standard method consistently results in significant P-values for functional enrichment for targets of randomly selected miRNAs, reflecting an underlying bias in the predicted gene targets of miRNAs as a whole. We developed an algorithm to measure enrichment using an empirical sampling approach, and applied this in a reanalysis of the gene ontology classes of targets of miRNA lists from 44 published studies. The vast majority of the miRNA target sets were not significantly enriched in any functional category after correction for bias. We therefore argue against continued use of the standard functional enrichment method for miRNA targets. Availability and implementation: A Python script implementing the empirical algorithm is freely available at http://sgjlab.org/empirical-go/. Contact: sam.griffiths-jones@manchester.ac.uk or janine.lamb@manchester.ac.uk Supplementary information: Supplementary data are available at Bioinformatics online.
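The empirical sampling idea in this abstract can be sketched as a permutation-style test: compare the observed overlap between a functional category and the targets of the query miRNAs against the overlaps obtained for randomly drawn miRNA sets of the same size. This is not the authors' script. The function name, the input layout and the add-one p-value estimate are assumptions made for illustration.

```python
import random


def empirical_enrichment_p(query_mirnas, targets_by_mirna, category_genes,
                           n_samples=1000, seed=0):
    """Empirical p-value for enrichment of category_genes among miRNA targets.

    targets_by_mirna maps each miRNA to a set of predicted target genes;
    category_genes is the set of genes annotated to one functional category.
    """
    rng = random.Random(seed)
    universe = sorted(targets_by_mirna)

    def overlap(mirnas):
        genes = set()
        for m in mirnas:
            genes |= targets_by_mirna[m]
        return len(genes & category_genes)

    observed = overlap(query_mirnas)
    k = len(query_mirnas)
    # Count random miRNA sets whose targets hit the category at least as often.
    at_least = sum(
        1 for _ in range(n_samples)
        if overlap(rng.sample(universe, k)) >= observed
    )
    return (at_least + 1) / (n_samples + 1)
```

Because the null distribution is built from random miRNA sets rather than random gene sets, any bias shared by predicted miRNA targets as a whole affects the observed and sampled overlaps alike.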

1.
Hsu, J. B. et al. miRTar: an integrated system for identifying miRNA-target interactions in human. BMC Bioinformatics 12, 300 (2011).

Abstract: PMID: 21791068

1.
Yang, X. & Li, L. miRDeep-P: a computational tool for analyzing the microRNA transcriptome in plants. Bioinformatics 27, 2614–2615 (2011).

Abstract: Motivation: Ultra-deep sampling of small RNA libraries by next-generation sequencing has provided rich information on the microRNA (miRNA) transcriptome of various plant species. However, few computational tools have been developed to effectively deconvolute the complex information. Results: We sought to employ the signature distribution of small RNA reads along the miRNA precursor as a model in plants to profile expression of known miRNA genes and to identify novel ones. A freely available package, miRDeep-P, was developed by modifying miRDeep, which is based on a probabilistic model of miRNA biogenesis in animals, with a plant-specific scoring system and filtering criteria. We have tested miRDeep-P on eight small RNA libraries derived from three plants. Our results demonstrate miRDeep-P as an effective and easy-to-use tool for characterizing the miRNA transcriptome in plants. Availability: http://faculty.virginia.edu/lilab/miRDP/ Contact: ll4jn@virginia.edu Supplementary information: Supplementary data are available at Bioinformatics online.

1.
Chae, H., Rhee, S., Nephew, K. P. & Kim, S. BioVLAB-MMIA-NGS: microRNA–mRNA integrated analysis using high-throughput sequencing data. Bioinformatics 31, 265–267 (2015).

Abstract: Motivation: It is now well established that microRNAs (miRNAs) play a critical role in regulating gene expression in a sequence-specific manner, and genome-wide efforts are underway to predict known and novel miRNA targets. However, the integrated miRNA–mRNA analysis remains a major computational challenge, requiring powerful informatics systems and bioinformatics expertise. Results: The objective of this study was to modify our widely recognized Web server for the integrated mRNA–miRNA analysis (MMIA) and its subsequent deployment on the Amazon cloud (BioVLAB-MMIA) to be compatible with high-throughput platforms, including next-generation sequencing (NGS) data (e.g. RNA-seq). We developed a new version called BioVLAB-MMIA-NGS, deployed both on the Amazon cloud and on a high-performance publicly available server called MAHA. By using NGS data and integrating various bioinformatics tools and databases, BioVLAB-MMIA-NGS offers several advantages. First, sequencing data is more accurate than array-based methods for determining miRNA expression levels. Second, potential novel miRNAs can be detected by using various computational methods for characterizing miRNAs. Third, because miRNA-mediated gene regulation is due to hybridization of an miRNA to its target mRNA, sequencing data can be used to identify many-to-many relationships between miRNAs and target genes with high accuracy. Availability and implementation: http://epigenomics.snu.ac.kr/biovlab_mmia_ngs/ Contact: sunkim.bioinfo@snu.ac.kr, heechae@cs.indiana.edu

1.
Yue, D., Guo, M., Chen, Y. & Huang, Y. A Bayesian decision fusion approach for microRNA target prediction. BMC Genomics 13, S13 (2012).

PMID: 23282032

1.
Artzi, S., Kiezun, A. & Shomron, N. miRNAminer: A tool for homologous microRNA gene search. BMC Bioinformatics 9, 39 (2008).

PMID: 18215311

 
