References:
1. Borodovsky M. and Lukashin A. (unpublished)
2. Lomsadze A., Ter-Hovhannisyan V., Chernoff Y. and Borodovsky M., "Gene identification in novel eukaryotic genomes by self-training algorithm", Nucleic Acids Research, 2005, Vol. 33, No. 20, 6494-6506
GeneMark.hmm (Version 2.2a)
Sequence name: Tue Nov 22 13:31:16 EST 2011
Sequence length: 12239 bp
G+C content: 63.64%
Matrix: Homo sapiens
Tue Nov 22 13:31:16 2011

Predicted genes/exons

Gene  Exon  Strand  Exon      Exon Range     Exon    Start/End
   #     #          Type                     Length  Frame
   1    10    -     Internal    1396   1459      64  3 3
   1     9    -     Internal    2392   2456      65  2 1
   1     8    -     Internal    3127   3252     126  3 1
   1     7    -     Internal    6455   6547      93  3 1
   1     6    -     Internal    6587   6736     150  3 1
   1     5    -     Internal    8837   8945     109  3 3
   1     4    -     Internal    9176   9220      45  2 3
   1     3    -     Internal    9696   9861     166  2 2
   1     2    -     Internal   10548  10737     190  1 1
   1     1    -     Internal   11113  11228     116  3 2
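As a quick sanity check on the exon table above, the reported lengths can be recomputed from the exon ranges. The following Python sketch assumes the ranges are 1-based and inclusive, which is an assumption inferred from the numbers rather than something the report states; the names `EXONS` and `span_length` are illustrative only.

```python
# Rows copied from the GeneMark.hmm exon table:
# (exon number, range start, range end, reported length)
EXONS = [
    (10, 1396, 1459, 64),
    (9, 2392, 2456, 65),
    (8, 3127, 3252, 126),
    (7, 6455, 6547, 93),
    (6, 6587, 6736, 150),
    (5, 8837, 8945, 109),
    (4, 9176, 9220, 45),
    (3, 9696, 9861, 166),
    (2, 10548, 10737, 190),
    (1, 11113, 11228, 116),
]


def span_length(start: int, end: int) -> int:
    """Length of a range, assuming 1-based inclusive coordinates."""
    return end - start + 1


# Exons whose reported length disagrees with the recomputed one.
mismatches = [
    (exon, start, end, reported)
    for exon, start, end, reported in EXONS
    if span_length(start, end) != reported
]

# Total predicted coding length, and how many whole codons it holds.
total_coding = sum(reported for _, _, _, reported in EXONS)
whole_codons, leftover_bases = divmod(total_coding, 3)

print("mismatches:", mismatches)
print("total coding bp:", total_coding)
print("whole codons:", whole_codons, "leftover bases:", leftover_bases)
```

Under that assumption every reported length matches its range. The total of 1124 bp holds 374 whole codons, which is consistent with the `374_aa` tag in the FASTA header of the predicted protein.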
>Tue Nov 22 13:31:16 EST 2011_1|GeneMark.hmm|gene 1|374_aa
CEHVCGGLRVERVCTCVGACVWSVRARVWGPACGACVHGGTVIGSARCKAFTTREGRRAA
AYNLVQRGITNLCVIGGDGSLTGANIFRSEWGSLLEELVAEGKISESTARTYSHLNIAGL
VGSIDNDFCGTDMTIGTDSALHRIMEVIDAITTTAQSHQRTFVLEVMGRHCGYLALVSAL
ASGADWLFIPESPPEDGWENFMCERLGETRSRGSRLNIIIIAEGAIDRNGKPILSSYVKD
VRVGLGRPRGRHPARLGQLVVQRLGFDTRVTVLGHVQRGGTPSAFDRVLSSKMGMEAVMA
LLEATPDTPACVVSLSGNQSVRLPLMECVQMTKEVQKAMDDKRFDEAIQLRGRSFENNWN
IYKLLAHQKLSKEK
GENEMARK PREDICTIONS
Sequence: Tue Nov 22 13:31:16 EST 2011
Sequence file: gm_sequence
Sequence length: 12239
GC Content: 63.64%
Window length: 96
Window step: 12
Threshold value: 0.500
Matrix: human_62.mat
Matrix author: Nikolai Ivanov
Matrix order: 5
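The window length (96) and window step (12) listed above suggest a sliding-window scan; both values are multiples of 3, so 32 codons per window and a 4-codon step. The Python sketch below only illustrates how such windows could be enumerated over a 12239 bp sequence. How GeneMark itself treats sequence ends and partial windows is not stated in this report, so the function `window_starts` and its boundary behaviour are assumptions.

```python
from typing import List


def window_starts(seq_len: int, window: int = 96, step: int = 12) -> List[int]:
    """0-based start offsets of every full window that fits in the sequence.

    Assumption: only complete windows are kept; a trailing partial
    window is simply dropped.
    """
    if seq_len < window:
        return []
    return list(range(0, seq_len - window + 1, step))


starts = window_starts(12239)
print("number of windows:", len(starts))
print("first window (1-based):", starts[0] + 1, "-", starts[0] + 96)
print("last window (1-based):", starts[-1] + 1, "-", starts[-1] + 96)
```

With these assumptions the scan yields 1012 full windows, the last one covering positions 12133 to 12228.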
List of Regions of interest
(regions from stop to stop codon w/ a signal in between)
LEnd REnd Strand Frame
-------- -------- ----------- -----
123 452 complement fr 2
305 880 direct fr 2
1094 1669 complement fr 1
1282 1473 complement fr 3
1623 1934 complement fr 2
1847 2611 complement fr 1
2388 2561 complement fr 2
2986 3621 complement fr 3
3488 3904 complement fr 1
3624 4034 complement fr 2
6011 7120 complement fr 1
8687 9100 complement fr 1
9692 10093 complement fr 1
10234 11991 complement fr 3
10667 11932 direct fr 2
List of Protein-Coding Exons
(regions between acceptor and donor site w/ coding function >0.500000)
Left Right
End End Strand Frame Prob
------- ------- ----------- ----- ------
324 400 complement fr 2 0.6195
336 379 0.7173
444 588 direct fr 2 0.5444
461 505 0.7139
1354 1464 complement fr 3 0.6182
1463 1624 complement fr 1 0.5178
1486 1619 0.6082
1524 1624 complement fr 1 0.4823
1526 1668 complement fr 3 0.2942
1634 1825 complement fr 2 0.3791
1634 1825 complement fr 2 0.3791
1893 2055 complement fr 1 0.6093
1959 2026 0.9179
2392 2467 complement fr 2 0.8493
2392 2446 0.8013
3125 3276 complement fr 3 0.8187
3125 3227 0.8434
3166 3276 complement fr 3 0.8078
3182 3227 0.9453
3527 3589 complement fr 1 0.8417
3757 3903 complement fr 2 0.6273
3765 3894 0.6945
3867 3903 complement fr 2 0.4176
3870 3894 0.5067
3965 4024 complement fr 1 0.2622
3981 4012 0.2143
4030 4089 complement fr 3 0.2811
4046 4077 0.4583
4527 4581 direct fr 3 0.4962
4541 4565 0.5178
6463 6753 complement fr 1 0.6920
6487 6743 0.6794
8812 9025 complement fr 1 0.8005
8840 9008 0.9197
9589 9693 complement fr 2 0.3592
9608 9681 0.3864
9601 9704 direct fr 1 0.3993
9701 9909 complement fr 1 0.8221
9731 9860 0.9999
10551 10735 complement fr 3 0.9467
10564 10727 0.9941
10877 10972 direct fr 2 0.4679
10926 10965 0.5937
11652 11737 complement fr 2 0.2923
11679 11737 0.2564
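The exon list above mixes full rows (left end, right end, strand, frame, probability) with shorter follow-up rows that carry only two coordinates and a probability. A minimal Python sketch for reading such rows and comparing them against the 0.500 threshold is given below. The report does not label the shorter rows, so attaching them to the preceding exon as `inner` is an assumption, and the names `ExonCandidate` and `parse_exon_lines` are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class ExonCandidate:
    left: int
    right: int
    strand: str
    frame: int
    prob: float
    # Unlabeled follow-up row (left, right, prob), if one is present.
    inner: Optional[Tuple[int, int, float]] = None


def parse_exon_lines(lines: Iterable[str]) -> List[ExonCandidate]:
    """Parse rows such as '324 400 complement fr 2 0.6195'.

    Header, ruler, and blank lines are skipped because they match
    neither the 6-token nor the 3-token row shape.
    """
    exons: List[ExonCandidate] = []
    for line in lines:
        parts = line.split()
        if len(parts) == 6 and parts[3] == "fr" and parts[0].isdigit():
            exons.append(ExonCandidate(int(parts[0]), int(parts[1]),
                                       parts[2], int(parts[4]),
                                       float(parts[5])))
        elif len(parts) == 3 and parts[0].isdigit() and exons:
            exons[-1].inner = (int(parts[0]), int(parts[1]), float(parts[2]))
    return exons


# A few rows copied from the report, including its header lines.
SAMPLE = """\
End End Strand Frame Prob
------- ------- ----------- ----- ------
324 400 complement fr 2 0.6195
336 379 0.7173
1524 1624 complement fr 1 0.4823
9701 9909 complement fr 1 0.8221
9731 9860 0.9999
11652 11737 complement fr 2 0.2923
11679 11737 0.2564
""".splitlines()

exons = parse_exon_lines(SAMPLE)
above_threshold = [e for e in exons if e.prob > 0.5]

for e in above_threshold:
    print(e.left, e.right, e.strand, e.frame, e.prob, e.inner)
```

Filtering on `prob > 0.5` is one way to separate rows that meet the stated threshold from the lower-scoring rows that also appear in the list.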
ABOUT THE MATRIX USED:
Date: March 6, 2003