Model Internals

Each PhenDB model is trained on the presence or absence of bacterial ENOGs (orthologous groups from EggNOG 4.5) in the training genomes. Each ENOG is assigned a weight. The magnitude of the weight reflects how important that ENOG is for the final prediction, and the sign indicates whether the presence (positive weight) or absence (negative weight) of the ENOG is indicative of the trait.
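To make the role of the weights concrete, here is a minimal sketch of how signed weights could be combined for a single genome. It assumes a purely linear model over binary presence/absence features and ignores any bias term, scaling, or calibration the real classifier may apply. The function names and the example genome are hypothetical; the three weights are copied from the first rows of the table below.

```python
# Weights copied from the first three rows of the table (rank 1 to 3).
weights = {
    "COG3527": 0.065154,   # positive: presence is indicative of the trait
    "COG0643": -0.041826,  # negative: absence is indicative of the trait
    "COG3393": 0.040926,
}

def weighted_evidence(present_enogs, weights):
    """Sum the weights of all ENOGs found in a genome.

    Assumption: linear model with binary features, no bias term.
    """
    return sum(w for enog, w in weights.items() if enog in present_enogs)

def direction(weight):
    """Describe what the sign of a weight means, as stated in the text above."""
    if weight > 0:
        return "presence indicative of trait"
    if weight < 0:
        return "absence indicative of trait"
    return "uninformative"

# Hypothetical genome in which two of the three ENOGs were detected.
genome = {"COG3527", "COG0643"}
score = weighted_evidence(genome, weights)
```

In this toy case the positive weight of COG3527 outweighs the negative weight of COG0643, so the summed evidence stays slightly positive. The value is only a relative indicator under the stated assumptions, not the model's actual output.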

This table lists the 250 highest-ranking ENOGs of this model.

Rank in model  ENOG name  ENOG description  Weight in model
1 COG3527 Alpha-acetolactate decarboxylase 0.065154
2 COG0643 Histidine kinase -0.041826
3 COG3393 -acetyltransferase 0.040926
4 COG0395 ABC-type sugar transport system, permease component -0.039647
5 COG4213 ABC transporter substrate-binding protein -0.038247
6 COG1788 Acyl CoA acetate 3-ketoacid CoA transferase, alpha subunit -0.037968
7 COG2057 Acyl CoA acetate 3-ketoacid CoA transferase beta subunit -0.037968
8 COG3843 relaxase mobilization nuclease domain protein -0.035871
9 COG0738 Major facilitator superfamily 0.035779
10 COG0210 DNA helicase -0.034170
11 COG3250 beta-galactosidase activity -0.033565
12 COG3669 Alpha-L-fucosidase -0.032472
13 COG1199 ATP-dependent helicase activity 0.032190
14 COG1080 General (non sugar-specific) component of the phosphoenolpyruvate-dependent sugar phosphotransferase system (sugar PTS). This major carbohydrate active-transport system catalyzes the phosphorylation of incoming sugar substrates concomitantly with their translocation across the cell membrane. Enzyme I transfers the phosphoryl group from phosphoenolpyruvate (PEP) to the phosphoryl carrier protein (HPr) -0.031737
15 COG4758 membrane 0.031559
16 COG3247 response to pH -0.031490
17 COG0071 Belongs to the small heat shock protein (HSP20) family -0.030368
18 COG1228 amidohydrolase -0.030283
19 COG2524 Domain in cystathionine beta-synthase and other proteins. 0.030273
20 COG3587 Type III restriction enzyme, res subunit -0.030180
21 COG4857 Catalyzes the phosphorylation of methylthioribose into methylthioribose-1-phosphate 0.029968
22 33KDK Ribosomal protein L33 -0.029914
23 COG2409 Drug exporters of the RND superfamily 0.029820
24 COG2259 Doxx family 0.029567
25 COG4977 sequence-specific DNA binding -0.029503
26 32D2I 0.029218
27 COG1501 Belongs to the glycosyl hydrolase 31 family -0.029157
28 COG1134 teichoic acid transport 0.028955
29 COG4225 unsaturated rhamnogalacturonyl hydrolase activity 0.028372
30 2ZB5J Domain of unknown function (DUF4299) -0.028317
31 COG2918 glutathione biosynthetic process 0.028300
32 COG0348 domain, Protein -0.028253
33 COG2873 o-acetylhomoserine -0.028010
34 COG3837 Cupin domain -0.027694
35 2Z7S8 Bacterial cellulose synthase subunit 0.027607
36 COG3275 phosphorelay sensor kinase activity 0.027589
37 COG4732 ThiW protein -0.027583
38 COG0644 oxidoreductase -0.027354
39 33E1E Protein of unknown function (DUF3042) 0.027241
40 COG1182 Catalyzes the reductive cleavage of azo bond in aromatic azo compounds to the corresponding amines. Requires NADH, but not NADPH, as an electron donor for its activity 0.027181
41 COG0239 Important for reducing fluoride concentration in the cell, thus reducing its toxicity -0.027133
42 COG4099 phospholipase Carboxylesterase 0.026953
43 COG4409 exo-alpha-(2->6)-sialidase activity -0.026734
44 COG4894 glucomannan catabolic process 0.026437
45 COG3010 Converts N-acetylmannosamine-6-phosphate (ManNAc-6-P) to N-acetylglucosamine-6-phosphate (GlcNAc-6-P) -0.026334
46 COG4464 protein tyrosine phosphatase activity 0.026291
47 COG1077 Cell shape determining protein MreB Mrl -0.026047
48 COG3410 Uncharacterized conserved protein (DUF2075) -0.025800
49 COG0673 inositol 2-dehydrogenase activity -0.025748
50 COG2213 PTS system mannitol-specific 0.025532
51 COG1021 2,3-dihydroxybenzoate-AMP ligase 0.024915
52 COG0289 4-hydroxy-tetrahydrodipicolinate reductase 0.024863
53 COG3212 peptidase 0.024722
54 33F46 -0.024575
55 COG1061 Type III restriction enzyme res subunit 0.024565
56 345AV Cyclic nucleotide-monophosphate binding domain 0.024547
57 COG1797 cobyrinic acid a,c-diamide synthase activity -0.024405
58 COG2944 sequence-specific DNA binding 0.024398
59 COG3072 Adenylate cyclase 0.024308
60 COG1455 protein-N(PI)-phosphohistidine-lactose phosphotransferase system transporter activity 0.024250
61 COG3559 Exporter of polyketide antibiotics -0.024125
62 COG0610 Subunit R is required for both nuclease and ATPase activities, but not for modification -0.024105
63 2ZCE3 0.024085
64 COG3272 Protein of unknown function (DUF1722) -0.024052
65 COG3878 Protein conserved in bacteria 0.024035
66 32AVZ helix_turn_helix multiple antibiotic resistance protein 0.024014
67 COG4886 Leucine-rich repeat (LRR) protein -0.023955
68 COG4707 Protein conserved in bacteria 0.023916
69 COG1055 Involved in arsenical resistance. Thought to form the channel of an arsenite pump -0.023907
70 COG3214 Protein conserved in bacteria 0.023903
71 COG1443 Catalyzes the 1,3-allylic rearrangement of the homoallylic substrate isopentenyl (IPP) to its highly electrophilic allylic isomer, dimethylallyl diphosphate (DMAPP) -0.023884
72 COG3619 membrane -0.023788
73 COG3081 Nucleoid-associated protein 0.023591
74 2Z844 2-dehydro-3-deoxyphosphooctonate aldolase 0.023585
75 32YHS 0.023561
76 COG3246 L-lysine catabolic process to acetate -0.023542
77 33G6Z Heavy-metal-associated domain 0.023533
78 COG3957 D-xylulose 5-phosphate D-fructose 6-phosphate phosphoketolase -0.023513
79 COG0662 Cupin 2, conserved barrel domain protein 0.023449
80 COG2503 HAD superfamily, subfamily IIIB (Acid phosphatase) 0.023410
81 COG4550 Control of competence regulator ComK, YlbF/YmcA 0.023381
82 COG3768 UPF0283 membrane protein 0.023354
83 COG2456 Psort location CytoplasmicMembrane, score 0.023348
84 COG0655 NAD(P)H dehydrogenase (quinone) activity -0.023342
85 COG1092 Specifically methylates the guanine in position 2445 (m2G2445) and the guanine in position 2069 (m7G2069) of 23S rRNA 0.023340
86 COG3041 Addiction module toxin RelE StbE family -0.023324
87 COG1904 glucuronate isomerase 0.023256
88 COG0368 Joins adenosylcobinamide-GDP and alpha-ribazole to generate adenosylcobalamin (Ado-cobalamin). Also synthesizes adenosylcobalamin 5'-phosphate from adenosylcobinamide-GDP and alpha-ribazole 5'-phosphate -0.023201
89 COG2227 3-demethylubiquinone-9 3-O-methyltransferase activity -0.023189
90 COG0387 Pfam Sodium calcium exchanger 0.023122
91 32ESZ Domain of unknown function (DUF4352) 0.023095
92 30A8Q Bacterial regulatory proteins, tetR family 0.023057
93 COG1878 Catalyzes the hydrolysis of N-formyl-L-kynurenine to L-kynurenine, the second step in the kynurenine pathway of tryptophan degradation -0.023041
94 COG3564 Protein conserved in bacteria -0.023009
95 COG2185 Catalyzes the reversible interconversion of isobutyryl-CoA and n-butyryl-CoA, using radical chemistry. Also exhibits GTPase activity, associated with its G-protein domain (MeaI) that functions as a chaperone that assists cofactor delivery and proper holo-enzyme assembly -0.022938
96 COG0400 carboxylic ester hydrolase activity 0.022853
97 COG0501 Belongs to the peptidase M48B family -0.022758
98 COG0433 COG0433 Predicted ATPase -0.022714
99 COG2031 fatty acid transporter -0.022651
100 COG4529 Protein conserved in bacteria 0.022641
101 COG3404 Formiminotransferase-cyclodeaminase -0.022629
102 COG4300 Cadmium resistance transporter -0.022398
103 COG3507 Belongs to the glycosyl hydrolase 43 family 0.022384
104 COG0560 Phosphoserine phosphatase -0.022372
105 COG1621 beta-fructofuranosidase activity 0.022353
106 COG3378 Phage plasmid primase P4 family -0.022332
107 COG0715 COG0715 ABC-type nitrate sulfonate bicarbonate transport systems periplasmic components 0.022319
108 COG0716 FMN binding -0.022228
109 COG2129 metallophosphoesterase -0.022175
110 32NI4 Domain of unknown function (DUF4355) -0.022142
111 337J0 0.022133
112 COG3775 system Galactitol-specific IIC component -0.022091
113 COG4690 dipeptidase activity -0.022076
114 3399Y 0.022062
115 COG0697 spore germination -0.022060
116 COG4106 trans-aconitate 2-methyltransferase activity -0.021964
117 COG4633 Protein conserved in bacteria -0.021925
118 COG1479 Protein of unknown function DUF262 -0.021889
119 33GUG LPXTG cell wall anchor motif 0.021852
120 COG0253 Catalyzes the stereoinversion of LL-2,6-diaminoheptanedioate (L,L-DAP) to meso-diaminoheptanedioate (meso-DAP), a precursor of L-lysine and an essential component of the bacterial peptidoglycan 0.021751
121 COG1146 Ferredoxins are iron-sulfur proteins that transfer electrons in a wide variety of metabolic reactions -0.021713
122 COG4652 -0.021679
123 COG1554 hydrolase, family 65, central catalytic -0.021644
124 316PQ 0.021638
125 COG3464 Transposase -0.021635
126 COG3475 LICD family -0.021611
127 COG2513 Catalyzes the thermodynamically favored C-C bond cleavage of (2R,3S)-2-methylisocitrate to yield pyruvate and succinate -0.021591
128 COG2318 DinB family 0.021349
129 333U0 Lipoprotein 0.021333
130 339YM Domain of unknown function (DUF1707) 0.021276
131 COG0698 ribose 5-phosphate isomerase -0.021257
132 COG3708 Transcriptional regulator 0.021170
133 COG2150 ACT domain 0.021132
134 COG2087 Catalyzes ATP-dependent phosphorylation of adenosylcobinamide and addition of GMP to adenosylcobinamide phosphate -0.021111
135 COG2066 Belongs to the glutaminase family 0.021079
136 COG4627 Pfam Methyltransferase 0.021074
137 COG3458 cephalosporin-C deacetylase activity -0.021004
138 COG2128 Antioxidant protein with alkyl hydroperoxidase activity. Required for the reduction of the AhpC active site cysteine residues and for the regeneration of the AhpC enzyme activity 0.020989
139 COG3303 Catalyzes the reduction of nitrite to ammonia, consuming six electrons in the process -0.020900
140 COG0619 Transmembrane (T) component of an energy-coupling factor (ECF) ABC-transporter complex. Unlike classic ABC transporters this ECF transporter provides the energy necessary to transport a number of different substrates -0.020726
141 33Y3W 0.020695
142 COG0639 Hydrolyzes diadenosine 5',5'''-P1,P4-tetraphosphate to yield ADP -0.020615
143 COG1082 Xylose isomerase domain protein TIM barrel -0.020601
144 COG1022 Amp-dependent synthetase and ligase -0.020563
145 COG1131 (ABC) transporter 0.020551
146 COG1017 nitric oxide dioxygenase activity 0.020549
147 COG1329 Transcriptional regulator -0.020545
148 COG2764 glyoxalase bleomycin resistance protein dioxygenase -0.020542
149 COG3333 protein conserved in bacteria -0.020520
150 33TTG CAAX protease self-immunity 0.020442
151 COG1478 Catalyzes the GTP-dependent successive addition of multiple gamma-linked L-glutamates to the L-lactyl phosphodiester of 7,8-didemethyl-8-hydroxy-5-deazariboflavin (F420-0) to form polyglutamated F420 derivatives -0.020440
152 COG0019 diaminopimelate decarboxylase activity 0.020393
153 COG2816 NAD+ diphosphatase activity -0.020375
154 COG0460 homoserine dehydrogenase 0.020373
155 2ZUJH Protein of unknown function (DUF3042) -0.020315
156 33E5K An automated process has identified a potential problem with this gene model 0.020247
157 COG1705 Flagellar rod assembly protein muramidase FlgJ 0.020225
158 COG2146 nitrite reductase [NAD(P)H] activity 0.020212
159 COG2440 electron transfer flavoprotein-ubiquinone oxidoreductase -0.020204
160 COG1719 4-vinyl reductase, 4VR -0.020157
161 COG1252 NADH dehydrogenase 0.020148
162 COG1802 Transcriptional regulator 0.020058
163 COG0371 Dehydrogenase 0.020024
164 COG2032 superoxide dismutase activity -0.020023
165 329UX NUDIX domain 0.019985
166 33UC0 Prophage endopeptidase tail -0.019963
167 COG2520 Methyltransferase fkbm family -0.019949
168 COG4589 Belongs to the CDS family -0.019906
169 COG1292 Belongs to the BCCT transporter (TC 2.A.15) family 0.019877
170 COG2309 aminopeptidase activity 0.019850
171 COG4695 Portal protein 0.019820
172 COG1312 Catalyzes the dehydration of D-mannonate 0.019749
173 COG4111 nicotinate-nucleotide adenylyltransferase activity -0.019732
174 32Z18 Protein of unknown function (DUF1471) 0.019706
175 COG3325 Belongs to the glycosyl hydrolase 18 family 0.019688
176 342NQ 0.019670
177 COG1521 Catalyzes the phosphorylation of pantothenate (Pan), the first step in CoA biosynthesis -0.019669
178 2ZI99 Phosphopantetheine attachment site -0.019590
179 COG3684 Belongs to the aldolase LacD family 0.019565
180 COG3629 intracellular signal transduction -0.019536
181 COG1535 isochorismatase 0.019531
182 COG1454 alcohol dehydrogenase -0.019520
183 COG2272 Belongs to the type-B carboxylesterase lipase family 0.019461
184 2ZSWH 0.019454
185 COG4978 Transcriptional regulator -0.019449
186 COG1661 DNA-binding protein with PD1-like DNA-binding motif -0.019434
187 COG5551 CRISPR-associated protein Cas6 -0.019418
188 COG0466 ATP-dependent serine protease that mediates the selective degradation of mutant and abnormal proteins as well as certain short-lived regulatory proteins. Required for cellular homeostasis and for survival from DNA damage and developmental changes induced by stress. Degrades polypeptides processively to yield small peptide fragments that are 5 to 10 amino acids long. Binds to DNA in a double-stranded, site-specific manner -0.019405
189 COG4268 DNA restriction-modification system -0.019391
190 341F6 0.019389
191 COG5504 Zn-dependent protease 0.019375
192 346DU GDSL-like Lipase/Acylhydrolase -0.019328
193 346VS -0.019228
194 30BKD Bacterial regulatory proteins, lacI family 0.019220
195 349T2 Family of unknown function (DUF5322) 0.019210
196 32QV8 0.019178
197 COG4604 ABC transporter, ATP-binding protein 0.019173
198 2ZHSG 0.019169
199 34CA1 Phage uncharacterised protein (Phage_XkdX) -0.019167
200 COG0527 aspartate kinase activity 0.019145
201 COG3538 Metal-independent alpha-mannosidase (GH125) -0.019117
202 COG1539 Catalyzes the conversion of 7,8-dihydroneopterin to 6-hydroxymethyl-7,8-dihydropterin 0.019104
203 COG1296 Branched-chain amino acid permease (Azaleucine resistance) 0.019073
204 COG2866 Carboxypeptidase -0.019068
205 COG4960 Type IV leader peptidase family -0.019062
206 COG4833 Hydrolase 0.019044
207 COG2215 Belongs to the NiCoT transporter (TC 2.A.52) family 0.019018
208 34BY3 Transposase IS200 like 0.019010
209 COG2082 Precorrin-8x methylmutase -0.019010
210 COG1142 4fe-4S ferredoxin, iron-sulfur binding domain protein 0.018996
211 COG2021 Transfers an acetyl group from acetyl-CoA to L-homoserine, forming acetyl-L-homoserine -0.018966
212 COG1230 cation diffusion facilitator family transporter -0.018961
213 COG0363 glucosamine-6-phosphate deaminase activity 0.018928
214 COG1827 regulation of RNA biosynthetic process 0.018916
215 305F1 -0.018900
216 305FJ -0.018900
217 305GS -0.018900
218 305QY -0.018900
219 33A21 -0.018900
220 34BDF -0.018900
221 COG1285 pathogenesis -0.018865
222 COG1857 crispr-associated protein -0.018785
223 COG2910 NAD(P)H-binding 0.018724
224 COG1654 Acts both as a biotin--acetyl-CoA-carboxylase ligase and a biotin-operon repressor. In the presence of ATP, BirA activates biotin to form the BirA-biotinyl-5'-adenylate (BirA-bio-5'-AMP or holoBirA) complex. HoloBirA can either transfer the biotinyl moiety to the biotin carboxyl carrier protein (BCCP) subunit of acetyl-CoA carboxylase, or bind to the biotin operator site and inhibit transcription of the operon 0.018717
225 COG2962 Rard protein 0.018714
226 COG3328 transposase activity -0.018711
227 COG1160 GTP binding -0.018675
228 30970 -0.018608
229 COG0067 glutamate synthase 0.018601
230 32SWP Putative HNHc nuclease 0.018593
231 2Z7NQ membrane 0.018556
232 2Z7QV Domain of unknown function (DUF4310) 0.018556
233 2ZPWH Glycine-rich SFCGS 0.018556
234 31AM7 cytoplasmic protein 0.018556
235 32S2X 0.018556
236 COG0117 Converts 2,5-diamino-6-(ribosylamino)-4(3h)-pyrimidinone 5'-phosphate into 5-amino-6-(ribosylamino)-2,4(1h,3h)-pyrimidinedione 5'-phosphate 0.018551
237 COG2846 Di-iron-containing protein involved in the repair of iron-sulfur clusters 0.018540
238 COG0332 Catalyzes the condensation reaction of fatty acid synthesis by the addition to an acyl acceptor of two carbons from malonyl-ACP. Catalyzes the first condensation reaction which initiates fatty acid synthesis and may therefore play a role in governing the total rate of fatty acid production. Possesses both acetoacetyl-ACP synthase and acetyl transacylase activities. Its substrate specificity determines the biosynthesis of branched-chain and/or straight-chain of fatty acids 0.018530
239 COG0307 riboflavin synthase, alpha 0.018527
240 2Z86A Psort location Cytoplasmic, score -0.018491
241 31SP3 -0.018483
242 COG0732 type I restriction modification DNA specificity domain -0.018477
243 COG0427 acetyl-CoA hydrolase -0.018476
244 COG3877 Protein conserved in bacteria -0.018464
245 COG4918 Iron-sulphur cluster biosynthesis 0.018459
246 COG3649 crispr-associated protein -0.018438
247 COG3548 integral membrane protein -0.018406
248 COG1854 S-ribosylhomocysteine lyase activity 0.018398
249 COG3575 Nucleotidyltransferase 0.018391
250 COG1473 amidohydrolase 0.018381
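Because the rows above are whitespace-separated and some ENOGs have no description, reading them programmatically takes a little care. The following sketch shows one way to do it, assuming every row starts with the rank and the ENOG name and ends with the weight, with everything in between treated as the description. The `EnogRow` type and `parse_row` function are hypothetical helpers, not part of PhenDB.

```python
from typing import NamedTuple

class EnogRow(NamedTuple):
    rank: int
    enog: str
    description: str  # empty string when the table gives no description
    weight: float

def parse_row(line: str) -> EnogRow:
    """Split one table row into rank, ENOG name, description and weight.

    Assumption: the first token is the rank, the second the ENOG name,
    and the last the weight. Whitespace inside the description is
    collapsed to single spaces.
    """
    tokens = line.split()
    if len(tokens) < 3:
        raise ValueError(f"not a table row: {line!r}")
    return EnogRow(
        rank=int(tokens[0]),
        enog=tokens[1],
        description=" ".join(tokens[2:-1]),
        weight=float(tokens[-1]),
    )

# Three rows copied from the table, including two without a description.
sample = [
    "1 COG3527 Alpha-acetolactate decarboxylase 0.065154",
    "26 32D2I 0.029218",
    "228 30970 -0.018608",
]
rows = [parse_row(line) for line in sample]

# ENOGs whose presence argues for the trait, strongest first.
positive = sorted((r for r in rows if r.weight > 0), key=lambda r: -r.weight)
```

Taking the weight from the end of the line rather than from a fixed column keeps the parser working for rows such as rank 26, where the description is missing, and rank 228, where the ENOG name consists of digits only.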