Model Internals

Each PhenDB model is trained on the presence or absence of bacterial ENOGs (orthologous groups from EggNOG 4.5) in the training genomes. Each ENOG is assigned a weight. The magnitude of the weight reflects how important that ENOG is for the final prediction, and its sign indicates whether the presence (positive weight) or absence (negative weight) of the ENOG is indicative of the trait.
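As a rough illustration of how signed weights combine, the sketch below scores a genome from the ENOGs found in it. It assumes a plain linear decision function over binary presence/absence features; the function `score_genome`, the zero bias, and the example genome are hypothetical, and PhenDB's actual classifier may scale or threshold its inputs differently. The weights are a small subset copied from the table in this section.

```python
# Weights copied from the table in this section (subset, for illustration only).
WEIGHTS = {
    "COG3590": -0.087905,  # peptidase
    "COG0729": -0.079693,  # surface antigen
    "COG4326": 0.028288,   # sporulation control protein
    "COG2971": 0.027185,   # BadF BadG BcrA BcrD
}


def score_genome(present_enogs, weights=WEIGHTS, bias=0.0):
    """Hypothetical linear score: bias plus the weights of all ENOGs present."""
    return bias + sum(w for enog, w in weights.items() if enog in present_enogs)


# Hypothetical annotation result: the set of ENOGs identified in one genome.
genome = {"COG4326", "COG2971", "COG0729"}
score = score_genome(genome)
print(f"{score:+.6f}")
```

Under this assumption, a present ENOG with a negative weight lowers the score, while the same ENOG being absent contributes nothing, which is how absence ends up favoring the trait relative to presence.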

The table below lists the 250 highest-ranking ENOGs of this model, ordered by the absolute value of their weight.

rank in model enog name enog description weight in model
1 COG3590 peptidase -0.087905
2 COG0729 surface antigen -0.079693
3 32UTB -0.079636
4 33AGD -0.079636
5 33Z2K -0.079636
6 34AYS -0.079610
7 30YFD -0.079415
8 COG0580 Belongs to the MIP aquaporin (TC 1.A.8) family -0.077834
9 COG1266 CAAX protease self-immunity -0.077464
10 COG4977 sequence-specific DNA binding -0.071677
11 COG3226 Transcriptional regulator -0.069302
12 COG1376 ErfK ybiS ycfS ynhG family protein -0.050612
13 COG4326 sporulation control protein 0.028288
14 33BFA 0.027409
15 COG2971 BadF BadG BcrA BcrD 0.027185
16 COG1397 ADP-ribosylglycohydrolase 0.027004
17 COG0448 Catalyzes the synthesis of ADP-glucose, a sugar donor used in elongation reactions on alpha-glucans -0.026520
18 COG4636 protein conserved in cyanobacteria 0.025727
19 COG0746 molybdenum cofactor guanylyltransferase activity -0.025599
20 COG1971 Probably functions as a manganese efflux pump 0.025439
21 COG3469 chitinase activity 0.025093
22 COG3335 DDE superfamily endonuclease 0.024068
23 COG4978 Transcriptional regulator 0.023749
24 2Z7P7 sporulation peptidase YabG 0.023500
25 COG4193 domain, Protein 0.023353
26 COG5529 intein-mediated protein splicing 0.023189
27 32Y4N CotJB protein 0.023053
28 COG3953 SLT domain 0.022696
29 COG1366 Belongs to the anti-sigma-factor antagonist family 0.022516
30 COG1988 membrane-bound metal-dependent 0.022511
31 COG5293 efflux transmembrane transporter activity 0.022421
32 COG4521 taurine ABC transporter 0.022286
33 3395Y Domain of unknown function (DUF1918) 0.022143
34 330BT Protein of unknown function (DUF3243) 0.021958
35 COG1738 Involved in the import of queuosine (Q) precursors, required for Q precursor salvage -0.021527
36 COG3300 MHYT domain 0.021418
37 330Z5 0.021158
38 COG2718 Belongs to the UPF0229 family 0.020926
39 COG5502 conserved protein (DUF2267) 0.020804
40 COG2766 serine protein kinase 0.020717
41 COG4261 Acyltransferase -0.020585
42 COG2169 Transcriptional regulator 0.020462
43 COG4089 membrane 0.020422
44 COG4993 Dehydrogenase 0.020358
45 COG0303 Molybdopterin -0.020306
46 COG3614 Histidine kinase -0.020225
47 COG2189 Belongs to the N(4) N(6)-methyltransferase family -0.020153
48 COG2361 Protein of unknown function DUF86 -0.020104
49 COG1194 a g-specific adenine glycosylase -0.020080
50 COG2959 enzyme of heme biosynthesis 0.020065
51 COG4968 Prepilin-type N-terminal cleavage methylation domain -0.020024
52 313RY Protein of unknown function (DUF1256) 0.019947
53 2Z9N7 PFAM spore germination B3 GerAC family protein 0.019780
54 COG4111 nicotinate-nucleotide adenylyltransferase activity 0.019773
55 32T7R Protein of unknown function (DUF2500) 0.019680
56 COG5606 Transcriptional regulator 0.019657
57 COG3695 enzyme binding -0.019569
58 COG0432 Pfam Uncharacterised protein family UPF0047 0.019461
59 COG2061 Catalyzes the anaerobic formation of alpha-ketobutyrate and ammonia from threonine in a two-step reaction. The first step involved a dehydration of threonine and a production of enamine intermediates (aminocrotonate), which tautomerizes to its imine form (iminobutyrate). Both intermediates are unstable and short-lived. The second step is the nonenzymatic hydrolysis of the enamine imine intermediates to form 2-ketobutyrate and free ammonia. In the low water environment of the cell, the second step is accelerated by RidA 0.019386
60 32ZA5 YabP family 0.019372
61 COG4943 cyclic-guanylate-specific phosphodiesterase activity -0.019332
62 32RFA Sporulation and cell division protein SsgA 0.019135
63 COG4200 ABC-2 family transporter protein 0.018974
64 COG1809 phosphosulfolactate synthase activity 0.018943
65 COG3325 Belongs to the glycosyl hydrolase 18 family 0.018920
66 COG3593 DNA synthesis involved in DNA repair 0.018888
67 COG0476 Involved in molybdopterin and thiamine biosynthesis, family 2 -0.018727
68 COG1099 with the TIM-barrel fold 0.018618
69 COG1363 Peptidase, m42 -0.018609
70 COG3127 ABC-type transport system involved in lysophospholipase L1, biosynthesis, permease component 0.018594
71 COG1942 4-Oxalocrotonate Tautomerase 0.018588
72 COG0011 TIGRFAM Protein of 0.018585
73 2Z7JI Putative amidase domain 0.018569
74 COG0715 COG0715 ABC-type nitrate sulfonate bicarbonate transport systems periplasmic components 0.018567
75 2Z93X Spore germination protein 0.018553
76 COG0700 Nucleoside recognition 0.018520
77 31K93 stage ii sporulation protein r 0.018515
78 COG3858 chitin binding 0.018457
79 COG1977 Involved in sulfur transfer in the conversion of molybdopterin precursor Z to molybdopterin -0.018451
80 COG0398 Pfam SNARE associated Golgi protein 0.018381
81 COG3551 Protein conserved in bacteria -0.018355
82 COG0464 growth 0.018343
83 COG1897 L-methionine biosynthetic process from homoserine via O-succinyl-L-homoserine and cystathionine 0.018323
84 COG2307 Protein conserved in bacteria -0.018220
85 COG0836 Belongs to the mannose-6-phosphate isomerase type 2 family -0.018212
86 31MZ5 NUDIX domain 0.018176
87 COG2112 serine threonine protein kinase 0.018099
88 COG2221 Nitrite and sulphite reductase 4Fe-4S 0.018076
89 2Z86B Domain of unknown function (DUF4291) 0.018072
90 COG3708 Transcriptional regulator 0.018018
91 COG1556 Lactate utilization protein 0.018010
92 COG2021 Transfers an acetyl group from acetyl-CoA to L-homoserine, forming acetyl-L-homoserine 0.017981
93 32RKH 0.017968
94 COG3773 Cell wall hydrolase 0.017898
95 COG1332 CRISPR-associated RAMP protein, Csm5 family -0.017875
96 COG1421 Csm2 Type III-A -0.017831
97 COG5616 cAMP biosynthetic process -0.017822
98 COG1482 cell wall glycoprotein biosynthetic process 0.017745
99 COG5164 cell wall organization 0.017627
100 COG1043 involved in the biosynthesis of lipid A, a phosphorylated glycolipid that anchors the lipopolysaccharide to the outer membrane of the cell -0.017625
101 COG2719 SpoVR family 0.017622
102 COG3809 Transcription factor zinc-finger 0.017617
103 COG0784 Response regulator, receiver 0.017606
104 COG1234 tRNA 3'-trailer cleavage 0.017590
105 COG3423 Transcriptional regulator 0.017503
106 COG2087 Catalyzes ATP-dependent phosphorylation of adenosylcobinamide and addition of GMP to adenosylcobinamide phosphate 0.017483
107 COG1476 TRANSCRIPTIONal 0.017439
108 COG1430 acr, cog1430 -0.017428
109 COG2747 anti-sigma28 factor FlgM 0.017296
110 32S0Q Stage III sporulation protein AB (spore_III_AB) 0.017253
111 330BE Sporulation protein YtrH 0.017235
112 COG2192 nodulation 0.017217
113 COG3440 Restriction endonuclease -0.017183
114 COG0647 UMP catabolic process -0.017182
115 COG0302 gtp cyclohydrolase 0.017143
116 COG3903 Transcriptional regulator 0.017141
117 COG1620 l-lactate permease 0.017123
118 33AHG Protein of unknown function (DUF1659) 0.017120
119 COG1435 thymidine kinase activity -0.017078
120 COG4188 dienelactone hydrolase 0.017070
121 COG2964 Protein conserved in bacteria -0.017064
122 COG0562 UDP-galactopyranose mutase -0.017024
123 COG3448 diguanylate cyclase activity 0.016967
124 2ZKXY 0.016952
125 COG3937 granule-associated protein 0.016918
126 3315W Domain of unknown function (DUF397) 0.016829
127 COG3703 Catalyzes the cleavage of glutathione into 5-oxo-L-proline and a Cys-Gly dipeptide. Acts specifically on glutathione, but not on other gamma-glutamyl peptides 0.016822
128 COG0665 tRNA (5-methylaminomethyl-2-thiouridylate)-methyltransferase activity 0.016807
129 COG1515 deoxyribonuclease V activity 0.016750
130 COG1469 GTP cyclohydrolase I activity -0.016706
131 32DHJ 0.016654
132 COG1979 alcohol dehydrogenase 0.016605
133 COG2414 aldehyde ferredoxin oxidoreductase -0.016593
134 COG3584 3D domain protein 0.016580
135 COG1432 Conserved Protein -0.016544
136 COG3688 RNA-binding protein containing a PIN domain 0.016543
137 32ZT3 Thioredoxin domain -0.016494
138 COG2163 COG2163 Ribosomal protein L14E L6E L27E 0.016473
139 COG0338 D12 class N6 adenine-specific DNA methyltransferase 0.016446
140 COG4963 Pilus assembly protein -0.016395
141 COG0025 NhaP-type Na H and K H -0.016358
142 COG1794 racemase activity, acting on amino acids and derivatives -0.016340
143 COG1116 anion transmembrane transporter activity 0.016317
144 COG1742 UPF0060 membrane protein 0.016101
145 33D9H Peptidase propeptide and YPEB domain 0.016073
146 33H2J 0.016047
147 COG1804 L-carnitine dehydratase bile acid-inducible protein F 0.015955
148 COG3415 Transposase 0.015936
149 342R0 Trypsin-like peptidase domain -0.015852
150 COG0586 Pfam SNARE associated Golgi protein -0.015849
151 31SDF -0.015841
152 COG2120 A mycothiol (MSH, N-acetylcysteinyl-glucosaminyl-inositol) S-conjugate amidase, it recycles conjugated MSH to the N-acetyl cysteine conjugate (AcCys S-conjugate, a mercapturic acid) and the MSH precursor. Involved in MSH-dependent detoxification of a number of alkylating agents and antibiotics 0.015838
153 COG2115 Belongs to the xylose isomerase family 0.015775
154 32Z9X 0.015760
155 COG2805 Type II/IV secretion system protein -0.015748
156 COG3591 Belongs to the peptidase S1B family 0.015714
157 COG0633 Ferredoxin -0.015656
158 COG3616 Alanine racemase, N-terminal domain -0.015625
159 33BJ4 Flp Fap pilin component -0.015607
160 COG3973 AAA domain 0.015592
161 COG4581 dead DEAH box helicase 0.015585
162 COG4864 UPF0365 protein 0.015572
163 COG0252 asparaginase activity -0.015560
164 COG1519 Transferase -0.015556
165 COG0589 response to stress -0.015545
166 32SJW Stage III sporulation protein AD 0.015506
167 COG4869 propanediol catabolic process 0.015501
168 30PM8 Cysteine-rich secretory protein family 0.015490
169 COG4655 Putative Flp pilus-assembly TadE/G-like -0.015481
170 COG1857 crispr-associated protein 0.015430
171 COG1254 Belongs to the acylphosphatase family -0.015429
172 COG0609 Belongs to the binding-protein-dependent transport system permease family. FecCD subfamily -0.015415
173 COG4767 VanZ like family 0.015385
174 COG2453 phosphatase 0.015382
175 30SGP 0.015362
176 31R7N 0.015354
177 COG3541 Predicted nucleotidyltransferase 0.015336
178 COG2208 phosphoserine phosphatase activity 0.015291
179 COG1686 Belongs to the peptidase S11 family 0.015279
180 COG0663 COG0663 Carbonic anhydrases acetyltransferases, isoleucine patch superfamily -0.015145
181 COG5018 Exonuclease 0.015142
182 COG1929 Belongs to the glycerate kinase type-1 family -0.015126
183 COG0127 Pyrophosphatase that catalyzes the hydrolysis of nucleoside triphosphates to their monophosphate derivatives, with a high preference for the non-canonical purine nucleotides XTP (xanthosine triphosphate), dITP (deoxyinosine triphosphate) and ITP. Seems to function as a house-cleaning enzyme that removes non-canonical purine nucleotides from the nucleotide pool, thus preventing their incorporation into DNA RNA and avoiding chromosomal lesions -0.015120
184 COG2715 membrane protein required for spore maturation 0.015096
185 COG0387 Pfam Sodium calcium exchanger 0.015090
186 333C3 Domain of unknown function (DUF397) 0.015065
187 COG4533 DNA binding -0.015064
188 32SBU Tryptophan transporter 0.015062
189 COG0662 Cupin 2, conserved barrel domain protein 0.015045
190 COG0330 HflC and HflK could -0.015034
191 COG1089 Catalyzes the conversion of GDP-D-mannose to GDP-4-dehydro-6-deoxy-D-mannose -0.015022
192 32Z9Y DNA-templated transcription, termination 0.015008
193 COG0680 Initiates the rapid degradation of small, acid-soluble proteins during spore germination 0.014915
194 COG1134 teichoic acid transport -0.014905
195 COG1032 radical SAM domain protein 0.014858
196 COG3030 protein affecting phage T7 exclusion by the F plasmid 0.014841
197 COG3668 Plasmid stabilization system -0.014825
198 2Z7SK Protein of unknown function (DUF3754) 0.014823
199 COG0395 ABC-type sugar transport system, permease component -0.014774
200 33WXY 0.014736
201 COG0427 acetyl-CoA hydrolase -0.014705
202 COG1322 Protein conserved in bacteria -0.014702
203 COG1468 defense response to virus 0.014692
204 COG1734 Transcription factor that acts by binding directly to the RNA polymerase (RNAP). Required for negative regulation of rRNA expression and positive regulation of several amino acid biosynthesis promoters 0.014676
205 COG4401 Chorismate mutase type I 0.014668
206 COG0112 Catalyzes the reversible interconversion of serine and glycine with tetrahydrofolate (THF) serving as the one-carbon carrier. This reaction serves as the major source of one-carbon groups required for the biosynthesis of purines, thymidylate, methionine, and other important biomolecules. Also exhibits THF-independent aldolase activity toward beta-hydroxyamino acids, producing glycine and aldehydes, via a retro-aldol mechanism -0.014659
207 COG0600 ABC-type nitrate sulfonate bicarbonate transport system permease component 0.014649
208 2ZWHR helix-turn-helix, Psq domain 0.014639
209 COG0239 Important for reducing fluoride concentration in the cell, thus reducing its toxicity -0.014621
210 COG0288 reversible hydration of carbon dioxide 0.014610
211 COG1972 Belongs to the concentrative nucleoside transporter (CNT) (TC 2.A.41) family 0.014595
212 COG4300 Cadmium resistance transporter -0.014582
213 COG1741 Belongs to the pirin family -0.014566
214 COG3236 hydrolase activity, hydrolyzing N-glycosyl compounds 0.014535
215 COG0515 serine threonine protein kinase 0.014466
216 COG1887 Glycosyl glycerophosphate transferases involved in teichoic acid biosynthesis TagF TagB EpsJ RodC -0.014451
217 COG1605 Chorismate mutase 0.014437
218 COG3869 Catalyzes the specific phosphorylation of arginine residues in 0.014436
219 34CD0 Thioredoxin 0.014434
220 COG3961 Belongs to the TPP enzyme family 0.014380
221 2ZBPN Protein of unknown function DUF2625 0.014375
222 COG5460 Uncharacterized conserved protein (DUF2164) 0.014371
223 COG4257 antibiotic catabolic process 0.014362
224 COG1318 COGs COG1318 transcriptional regulator protein 0.014306
225 COG2102 of PP-loop superfamily 0.014302
226 COG3377 Domain of unknown function (DUF1805) 0.014240
227 2ZT6I 0.014233
228 COG2062 phosphohistidine phosphatase, SixA 0.014181
229 COG0296 Catalyzes the formation of the alpha-1,6-glucosidic linkages in glycogen by scission of a 1,4-alpha-linked oligosaccharide from growing alpha-1,4-glucan chains and the subsequent attachment of the oligosaccharide to the alpha-1,6 position -0.014162
230 COG1893 Catalyzes the NADPH-dependent reduction of ketopantoate into pantoic acid -0.014157
231 COG2184 nucleotidyltransferase activity -0.014154
232 COG2128 Antioxidant protein with alkyl hydroperoxidase activity. Required for the reduction of the AhpC active site cysteine residues and for the regeneration of the AhpC enzyme activity 0.014150
233 33X5Q Acetyltransferase (GNAT) domain 0.014119
234 COG1273 With LigD forms a non-homologous end joining (NHEJ) DNA repair enzyme, which repairs dsDNA breaks with reduced fidelity. Binds linear dsDNA with 5'- and 3'-overhangs but not closed circular dsDNA nor ssDNA. Recruits and stimulates the ligase activity of LigD 0.014080
235 COG1561 YicC-like family, N-terminal region 0.013960
236 COG3070 regulator of competence-specific genes -0.013938
237 COG1136 (ABC) transporter 0.013936
238 COG1501 Belongs to the glycosyl hydrolase 31 family 0.013930
239 COG0490 domain, Protein 0.013913
240 COG0365 Catalyzes the conversion of acetate into acetyl-CoA (AcCoA), an essential intermediate at the junction of anabolic and catabolic pathways. AcsA undergoes a two-step reaction. In the first half reaction, AcsA combines acetate with ATP to form acetyl-adenylate (AcAMP) intermediate. In the second half reaction, it can then transfer the acetyl group from AcAMP to the sulfhydryl group of CoA, forming the product AcCoA 0.013883
241 COG2402 Toxic component of a toxin-antitoxin (TA) module. An RNase 0.013873
242 COG1661 DNA-binding protein with PD1-like DNA-binding motif 0.013852
243 COG5423 Predicted metal-binding protein (DUF2284) -0.013849
244 COG4887 Protein of unknown function (DUF1847) 0.013847
245 COG1210 Utp--glucose-1-phosphate uridylyltransferase 0.013844
246 2Z7PW stage III sporulation protein AE 0.013824
247 COG1137 ATPase activity -0.013817
248 COG0594 ribonuclease P activity -0.013780
249 COG1041 phosphinothricin N-acetyltransferase activity 0.013770
250 COG2860 membrane -0.013745
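
The rows above are whitespace-separated, with a free-text description that is sometimes empty, so they can be split by anchoring on the first two tokens and the trailing number. The following is a minimal sketch, assuming one row per line in exactly this layout; `ModelRow` and `parse_row` are hypothetical names, not part of PhenDB. The final check reflects that the ranks in this table appear to follow decreasing absolute weight, regardless of sign.

```python
import re
from typing import NamedTuple


class ModelRow(NamedTuple):
    rank: int
    enog: str
    description: str  # empty string when the row carries no description
    weight: float


# rank, ENOG id, optional free-text description, signed decimal weight at line end
ROW_RE = re.compile(r"^(\d+)\s+(\S+)(?:\s+(.*?))?\s+(-?\d+\.\d+)$")


def parse_row(line: str) -> ModelRow:
    """Split one flattened table row into its four columns."""
    m = ROW_RE.match(line.strip())
    if m is None:
        raise ValueError(f"unrecognized row: {line!r}")
    rank, enog, desc, weight = m.groups()
    return ModelRow(int(rank), enog, desc or "", float(weight))


# Three rows copied from the table above.
rows = [parse_row(s) for s in (
    "1 COG3590 peptidase -0.087905",
    "3 32UTB -0.079636",
    "13 COG4326 sporulation control protein 0.028288",
)]

# Lower rank numbers should never have a smaller absolute weight.
ordered = all(abs(a.weight) >= abs(b.weight) for a, b in zip(rows, rows[1:]))
print(rows[1], ordered)
```

Matching the weight at the end of the line, rather than splitting on whitespace, keeps descriptions that contain digits or multiple sentences intact.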