Here we provide a set of command-line software tools written in Java to accompany the chapter "MicroRNA Promoter Analysis" by Molly Megraw and Artemis G. Hatzigeorgiou. The chapter is to appear in a volume of Methods in Molecular Biology dedicated to plant microRNAs (ISBN 978-1-60327-004-5, available November 22, 2009), edited by Blake Meyers and Pam Green and published by Humana Press, USA. The abstract for this chapter appears below.
In this chapter we present a brief overview of current knowledge about the promoters of plant miRNAs, and provide a step-by-step guide for predicting plant miRNA promoter elements using known transcription factor binding motifs. A key concept of this method is to use scoring thresholds for potential binding sites that are appropriate to each individual transcription factor. This technique allows the researcher to obtain a set of putative Transcription Factor Binding Sites (TFBS's) in noncoding upstream regions which are at least as likely as any trusted collection of sites within protein coding gene promoters. The method is conceptually simple and easy to use. While the procedure can be applied to search for TFBS's in any pol-II promoter region, it is particularly practical for the case of plant miRNA promoters where upstream sequence regions and binding sites are not readily available in existing databases.
This method was originally developed and implemented as part of the published research study cited below. All software tools below are made freely available under the GNU General Public License. If any of these tools are used for work that results in a publication, we would appreciate citation of the following article:
M. Megraw, V. Baev, V. Rusinov, S. Jensen, K. Kalantidis, and A.G. Hatzigeorgiou (2006). MicroRNA Promoter Element Discovery in Arabidopsis. RNA, 12:1612-1619.
We recommend first downloading the full package of code and instructions here.
Each of the steps 1-5 below corresponds to a section in the book chapter. Each step has its own directory that contains an explanatory README file, along with sample shell scripts and data for running the command-line Java tools.
1. Construction of PWM's
2. Construction of a background model
3. Computing thresholds for each TF
4. Promoter sequences
5. Scanning for TFBS's
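To make the last step concrete, the following is a minimal Java sketch of how a promoter sequence might be scanned with a PWM and a threshold specific to one transcription factor. It is not the code shipped in the package: the class name `PwmScanSketch`, the toy TATA-like matrix, the uniform zero-order background, and the threshold of 5.0 are all assumptions chosen for illustration only.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; names and numbers here are hypothetical,
// not taken from the accompanying tools.
class PwmScanSketch {
    private static final String BASES = "ACGT";

    /** Log-odds score of one window: sum over positions of log2(p_motif / p_background). */
    static double score(double[][] pwm, double[] background, String seq, int start) {
        double total = 0.0;
        for (int i = 0; i < pwm.length; i++) {
            int b = BASES.indexOf(Character.toUpperCase(seq.charAt(start + i)));
            if (b < 0) {
                // Windows containing N or any other non-ACGT symbol can never pass a threshold.
                return Double.NEGATIVE_INFINITY;
            }
            total += Math.log(pwm[i][b] / background[b]) / Math.log(2.0);
        }
        return total;
    }

    /** 0-based start positions of every window scoring at or above the given threshold. */
    static List<Integer> scan(double[][] pwm, double[] background, String seq, double threshold) {
        List<Integer> hits = new ArrayList<>();
        for (int s = 0; s + pwm.length <= seq.length(); s++) {
            if (score(pwm, background, seq, s) >= threshold) {
                hits.add(s);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Toy 4-column matrix favouring T-A-T-A; columns are in A, C, G, T order (assumed values).
        double[][] pwm = {
            {0.05, 0.05, 0.05, 0.85},
            {0.85, 0.05, 0.05, 0.05},
            {0.05, 0.05, 0.05, 0.85},
            {0.85, 0.05, 0.05, 0.05}
        };
        double[] background = {0.25, 0.25, 0.25, 0.25}; // assumed uniform zero-order background
        double threshold = 5.0;                          // assumed threshold for this one TF
        String promoter = "GGCTATAAGGCC";
        System.out.println(scan(pwm, background, promoter, threshold));
    }
}
```

In this sketch the threshold is passed in per matrix, which mirrors the idea of using a cutoff appropriate to each individual transcription factor rather than one global cutoff. The README files in each step directory describe how the actual tools build the matrices, the background model, and the thresholds.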
The Java Runtime Environment (JRE) and the latest Java Development Kit (JDK) are freely available from Sun Microsystems. Only the JRE is needed to use the command-line tools. The JDK is necessary only if the user wishes to modify or expand the tools and recompile them.
All accompanying shell scripts are written in the Bash scripting language, which is standard with any Unix or Linux computing environment. The scripts can also be run in a Windows/DOS environment, either by using the Cygwin package for Linux emulation (free) or by modifying the Bash shell scripts to conform to DOS scripting syntax.