SyntenyMiner is being developed as an application to visualize and interrogate comparisons among multiple complete genome sequences. The interface provides a navigable view of matches to a reference genome sequence. Sequence matches between chromosomes of different genomes can be further examined in the context of a dot plot, revealing both large- and small-scale genome rearrangements, including inversions, insertions/deletions, and translocations. The software is immature at this stage but can still be useful for research, and there are clear directions for future development. Developers are welcome to contribute.
Note: this tool is no longer being supported. The code will remain on sourceforge in case it is found useful. 03-31-2010 bhaas.
Download the latest version of SyntenyMiner-ultra-beta from Sourceforge.
SyntenyMiner uses as input a tab-delimited file that describes the coordinates of sequence matches found from pairwise comparisons among multiple complete genomes. The format of the input file is as follows:
organismA <tab> chromosomeA <tab> match_ID_A <tab> end5_A <tab> end3_A <tab> organismB <tab> chromosomeB <tab> match_ID_B <tab> end5_B <tab> end3_B <tab> match_score
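As an illustration of the format, here is a minimal Python sketch that reads one line into a typed record. The `Match` class and `parse_smine_line` function are hypothetical names and are not part of SyntenyMiner. The sketch assumes the coordinates are integers and the score is numeric.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Match:
    """One pairwise sequence match (one line of the input file)."""
    organism_a: str
    chromosome_a: str
    match_id_a: str
    end5_a: int
    end3_a: int
    organism_b: str
    chromosome_b: str
    match_id_b: str
    end5_b: int
    end3_b: int
    match_score: float


def parse_smine_line(line: str) -> Match:
    """Split one tab-delimited line into 11 fields and convert the numeric ones."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 11:
        raise ValueError(f"expected 11 tab-delimited fields, found {len(fields)}")
    (org_a, chrom_a, id_a, e5a, e3a,
     org_b, chrom_b, id_b, e5b, e3b, score) = fields
    return Match(org_a, chrom_a, id_a, int(e5a), int(e3a),
                 org_b, chrom_b, id_b, int(e5b), int(e3b), float(score))


# First record of the example file distributed with SyntenyMiner.
record = parse_smine_line(
    "fum\tAF12\t1\t4331\t5926\tnid\tlinkage_group_V\t2\t20575\t22155\t10"
)
print(record.chromosome_b, record.end5_b, record.end3_b)
```

Rejecting lines that do not have exactly 11 fields is a cheap way to catch files that were saved with spaces instead of tabs.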
An example input file is provided that describes matches found among the three genomes: Aspergillus fumigatus, Aspergillus oryzae, and Aspergillus nidulans. The filename and the first few lines are shown below:
% more testData/fum_nid_oryz.smine
fum AF12 1 4331 5926 nid linkage_group_V 2 20575 22155 10
fum AF12 3 122296 122364 nid linkage_group_V 4 1195727 1195795 10
fum AF12 5 122357 122274 nid linkage_group_V 6 1195788 1195705 10
fum AF12 7 5925 4420 nid linkage_group_V 8 22154 20658 10
fum AF6 9 2188611 2187259 nid linkage_group_V 10 24510 25865 10
fum AF6 11 2185721 2185002 nid linkage_group_V 12 27425 28147 10
In this example, the match score was set to 10 arbitrarily. The current version makes little use of the match_score field, so you can set it to any value for now.
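Some records in the example file list end5 greater than end3. A plausible reading, stated here as an assumption rather than documented behavior, is that such a match lies on the reverse strand. Under that assumption, the relative orientation of the two sides of a match can be sketched as follows (`relative_orientation` is a hypothetical helper):

```python
def relative_orientation(end5_a: int, end3_a: int, end5_b: int, end3_b: int) -> str:
    """Return 'same' if both sides run in the same direction, else 'inverted'.

    Assumption: end5 > end3 means the match lies on the reverse strand.
    """
    a_forward = end5_a <= end3_a
    b_forward = end5_b <= end3_b
    return "same" if a_forward == b_forward else "inverted"


# (end5_A, end3_A, end5_B, end3_B) taken from three records of the example
# file, keyed by their match_ID_A/match_ID_B values.
examples = {
    "1/2": (4331, 5926, 20575, 22155),
    "5/6": (122357, 122274, 1195788, 1195705),
    "9/10": (2188611, 2187259, 24510, 25865),
}
orientations = {ids: relative_orientation(*coords) for ids, coords in examples.items()}
for ids, orientation in orientations.items():
    print(ids, orientation)
```

With this reading, matches 1/2 and 5/6 have the same orientation on both genomes (5/6 is reversed on both sides), while match 9/10 is inverted between A. fumigatus AF6 and A. nidulans linkage_group_V.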
To generate all the matches between genomes, you can use a tool such as nucmer or promer from the MUMmer software suite. Whichever tool you use, be sure to convert the output to the tab-delimited format described above for use with SyntenyMiner.
To launch SyntenyMiner using this data set, run it as follows from the root installation directory:
% ./SyntenyMiner testData/fum_nid_oryz.smine
A simple (embarrassingly simple) application window will appear, with three menu options:
Multi-genome comparison
Single molecule match plot
Two molecule XY-plot
Below, examples will be shown for each.
The SyntenyMiner is not tied to any specific alignment program. As long as you can create the required tab-delimited input format described above, you should be able to benefit from using the SyntenyMiner software. If you decide to use MUMmer to perform your alignments, scripts are included in the SyntenyMiner distribution to provide output conversion facilities.
For example, given two FASTA files corresponding to GenomeA and GenomeB, you can run MUMmer's promer (six-frame genome translation and alignment) like so:
% promer GenomeA GenomeB
This will create, by default, an output file called out.delta. Convert the out.delta file to a more human-readable tab-delimited format using the MUMmer utility show-coords like so:
% show-coords -T out.delta > out.coords
The next two steps convert this output to a format compatible with SyntenyMiner. Both scripts are included in the SyntenyMiner distribution.
% mummer_coords_converter.pl Organism_Name_A Organism_Name_B out.coords > out.converted
The Organism_Name values should give the names of the organisms corresponding to GenomeA and GenomeB, respectively, and must NOT contain any spaces, since each name is treated as a single command-line parameter.
Convert this output file to the SyntenyMiner input format like so:
% input_to_smine_fmt.pl < out.converted > out.smine
The out.smine file can be used as input to SyntenyMiner. If you have more than two genomes to compare, repeat this process for each of the pairwise genome comparisons, and then combine all your separate out.smine files into a single output file that includes all genome comparisons.
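The combining step can be sketched in Python as a plain concatenation with a field-count check. This is an assumption-laden sketch: `combine_smine_files` and all file names are hypothetical, and whether match IDs must be unique across the combined file is not stated here, so the records are copied unchanged.

```python
import tempfile
from pathlib import Path
from typing import Iterable


def combine_smine_files(inputs: Iterable[str], output: str) -> int:
    """Concatenate input files into one file; return the number of records written."""
    written = 0
    with open(output, "w") as out:
        for name in inputs:
            for line in Path(name).read_text().splitlines():
                if not line.strip():
                    continue  # skip blank lines
                if len(line.split("\t")) != 11:
                    raise ValueError(f"{name}: expected 11 tab-delimited fields: {line!r}")
                out.write(line + "\n")
                written += 1
    return written


# Demonstration on two tiny files written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    first = Path(tmp) / "fum_nid.smine"    # hypothetical file names
    second = Path(tmp) / "fum_oryz.smine"
    # Record from the example file distributed with SyntenyMiner.
    first.write_text("fum\tAF12\t1\t4331\t5926\tnid\tlinkage_group_V\t2\t20575\t22155\t10\n")
    # Invented record, for illustration only (not real alignment data).
    second.write_text("fum\tAF12\t1\t100\t200\toryz\tchr1\t2\t300\t400\t10\n")
    combined = Path(tmp) / "all.smine"
    total = combine_smine_files([str(first), str(second)], str(combined))
    combined_lines = combined.read_text().splitlines()
print(total)
```

For real data, each pairwise comparison is best run in its own directory, because promer writes to out.delta by default and a second run in the same directory would overwrite the first.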
As in the included example described earlier, you can launch SyntenyMiner using the output file like so:
% ./SyntenyMiner out.smine
Select the menu option Plots->Multi-genome comparison. You are then prompted with another menu which requires the selection of the reference organism. The reference organism is drawn to scale with matches to the other genomes shown as colored blocks at corresponding match positions on the reference sequence. A screenshot for this example is shown below, with A. fumigatus chosen as the reference:
Each of the 13 A. fumigatus chromosomes is drawn with matches to A. oryzae shown in the first color bar to its right, followed by matches to the A. nidulans genome. Each match is colored according to the molecule it matches in the other genome, as shown by the color key.
Mousing over the colored regions reveals the match coordinates. The knobs at the top and right edges of the panel provide zooming capabilities for each axis.
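The data behind such a view can be approximated by grouping matches by reference molecule and by the other organism. The Python sketch below is not SyntenyMiner's actual implementation. `blocks_on_reference` is a hypothetical function, and it assumes the reference organism may appear on either side of a record.

```python
from collections import defaultdict


def blocks_on_reference(lines, reference):
    """Map (reference molecule, other organism) -> sorted (start, end, other molecule)."""
    blocks = defaultdict(list)
    for line in lines:
        f = line.rstrip("\n").split("\t")
        side_a = (f[0], f[1], int(f[3]), int(f[4]))
        side_b = (f[5], f[6], int(f[8]), int(f[9]))
        # Assumption: the reference may be organismA or organismB in a record.
        for ref, other in ((side_a, side_b), (side_b, side_a)):
            if ref[0] == reference and other[0] != reference:
                start, end = sorted(ref[2:4])  # normalize reversed coordinates
                blocks[(ref[1], other[0])].append((start, end, other[1]))
    for spans in blocks.values():
        spans.sort()
    return dict(blocks)


# Three records from the example file distributed with SyntenyMiner.
lines = [
    "fum\tAF12\t1\t4331\t5926\tnid\tlinkage_group_V\t2\t20575\t22155\t10",
    "fum\tAF12\t3\t122296\t122364\tnid\tlinkage_group_V\t4\t1195727\t1195795\t10",
    "fum\tAF6\t9\t2188611\t2187259\tnid\tlinkage_group_V\t10\t24510\t25865\t10",
]
blocks = blocks_on_reference(lines, "fum")
for key, spans in blocks.items():
    print(key, spans)
```

Each key corresponds to one color bar beside a reference molecule, and the third element of each span is the molecule that would determine the block's color.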
The single molecule match plot provides a view of all the matches found using a single molecule as the reference. The current view is quite primitive but can still provide valuable insights. To launch this view, select the menu option Plots->Single molecule match plot. The example below illustrates matches found to the reference A. nidulans linkage group I molecule:
From the image above, we can see a large region of matches between A. nidulans linkage group I and our A. fumigatus molecule AF5 (chromosome 5). In the context of an X-Y plot, we can further examine the matches and rearrangements that have taken place over the course of the evolutionary divergence between the two organisms. To launch the X-Y plot for these two molecules, select the menu option Plots->Two molecule XY-plot. The plot is shown below:
The plot is zoomable and navigable.
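To make the idea of the X-Y plot concrete, the Python sketch below extracts one line segment per match between two chosen molecules, running from (end5 on X, end5 on Y) to (end3 on X, end3 on Y). This is an assumption about how such a plot can be built, not a description of SyntenyMiner's code. `xy_segments` is a hypothetical name.

```python
def xy_segments(lines, molecule_x, molecule_y):
    """Return ((x5, y5), (x3, y3)) segments for matches between two molecules.

    Each molecule is given as an (organism, chromosome) pair.
    """
    segments = []
    for line in lines:
        f = line.rstrip("\n").split("\t")
        a = ((f[0], f[1]), int(f[3]), int(f[4]))
        b = ((f[5], f[6]), int(f[8]), int(f[9]))
        if a[0] == molecule_x and b[0] == molecule_y:
            x, y = a, b
        elif a[0] == molecule_y and b[0] == molecule_x:
            x, y = b, a  # record lists the molecules in the opposite order
        else:
            continue  # match involves some other pair of molecules
        segments.append(((x[1], y[1]), (x[2], y[2])))
    return segments


# Three records from the example file distributed with SyntenyMiner.
lines = [
    "fum\tAF12\t1\t4331\t5926\tnid\tlinkage_group_V\t2\t20575\t22155\t10",
    "fum\tAF6\t9\t2188611\t2187259\tnid\tlinkage_group_V\t10\t24510\t25865\t10",
    "fum\tAF6\t11\t2185721\t2185002\tnid\tlinkage_group_V\t12\t27425\t28147\t10",
]
segments = xy_segments(lines, ("fum", "AF6"), ("nid", "linkage_group_V"))
for segment in segments:
    print(segment)
```

In both segments returned here the X coordinate decreases while the Y coordinate increases, so they would be drawn along an anti-diagonal, the usual signature of an inverted region in a dot plot.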