Graphical output
As an illustration, here is the graphical output obtained using the
same GC-rich, noisy sequence provided as example 1 on the web
interface, using the correct "Ralstonia" model with default
parameters and the default frameshift penalty set to 8 (for a low
quality sequence).

- The horizontal axis is the sequence (here 1387 nuc. long).
- The vertical axis represents the 6 possible coding phases
(+1, +2, +3 for the direct strand, -1, -2, -3 for the reverse) and
possible non coding (IG stands for intergenic) regions. Each of
these possible predictions is called a track in the sequel.
- On the 6 coding tracks, small red bars represent occurrences of
STOP codons and blue bars represent START codons. The wider the blue
bar, the better the start is (according to the presence of an RBS,
the type of the codon and how degenerate the codon is in the
sequence, e.g. NNN is a possible start, but it is very degenerate).
- The fine black curve is a smoothed, normalized score of the
coding/non coding likelihoods over a sliding window. The window
width can be controlled in the "Output parameters" to increase or
lower the smoothing (this does not change the prediction).
- The light green curves indicate the GC% and the GC3% of the
sequence. The GC% is indicated in the non coding track (IG) and is
artificially amplified: the track contains variations from 25% to
75% GC. The GC3% indicates the percentage of GC bases in the 3rd
position of the codons in this phase. It is not amplified (the track
contains variation from 0% to 100%).
- The large red blocks indicate the prediction of FrameD. Thin red
lines that connect two blocks indicate predicted frameshifts.
- When activated, light gray and magenta lines will indicate a
mean (expected) prediction beyond the optimal one.
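The GC% and GC3% quantities described in the list above can be
illustrated with a minimal Python sketch. It is not FrameD's own
code: the function names, the phase convention (phase p codons start
at the p-th base of the direct strand), the handling of ambiguous
bases and the clipping to the track bounds are all assumptions made
for illustration. In practice the functions would be applied to a
window of the sequence rather than to the whole sequence.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G or C among the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    known = sum(seq.count(base) for base in "ACGT")
    if known == 0:
        # Assumption: a stretch made only of ambiguous bases counts as 0%.
        return 0.0
    return 100.0 * (seq.count("G") + seq.count("C")) / known


def gc3_percent(seq: str, phase: int) -> float:
    """GC percentage at third codon positions for direct phase 1, 2 or 3."""
    # Assumed convention: phase p codons start at 0-based index p - 1,
    # so their third bases sit at indices p + 1, p + 4, p + 7, ...
    return gc_percent(seq[phase + 1::3])


def track_height(value: float, low: float, high: float) -> float:
    """Map a percentage to a 0..1 height in a track spanning [low, high]."""
    # Assumption: values outside the track bounds are clipped to them.
    clipped = min(max(value, low), high)
    return (clipped - low) / (high - low)
```

With these definitions, a value of 60% sits at 0.7 of the height of
the amplified IG track (`track_height(60, 25, 75)`) but at 0.6 of the
height of a non amplified GC3 track (`track_height(60, 0, 100)`),
which is the amplification effect mentioned above.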
On the above image, one can observe, between 100 and 150, that
FrameD hesitates between the two possible start codons: a non
negligible part of the predictions would actually use the second
START, but the first is preferred. Another position of doubt is in
the 700 region, where the uncertainty on the coding phase is clear:
two frameshifts, roughly around 650 and 750, could explain the
increase in coding score in phase 2. Although many suboptimal
predictions actually predict these 2 frameshifts, the optimal
prediction does not.
These two possible frameshifts are also visible on the central
magenta line, which represents an amplified frameshift expectation
based on all possible predictions. We see a possible insertion
around 650 (up curve) and a possible deletion around 750 (down
curve).
A frameshift is actually predicted around 900, close to the N
stretch and to where we created it.
We can get back to the initial page, ask for a corrected sequence
and resubmit it for analysis, using the same frameshift penalty of 8
(for low quality).
Start   End  Ph.
  148  1182   1
 1255  1388   1 >
No frameshift is predicted anymore.
