Model Internals

Each PhenDB model is trained on the presence or absence of bacterial ENOGs (orthologous groups from EggNOG 4.5) in the training genomes. Each ENOG is assigned a weight whose magnitude reflects the importance of that ENOG for the final prediction. The sign of the weight indicates whether the presence (positive weight) or the absence (negative weight) of the ENOG is indicative of the trait.
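To make the role of sign and magnitude concrete, here is a minimal sketch in Python. It assumes the weights are combined as a plain linear sum over the ENOGs present in a genome; the actual PhenDB scoring (bias term, thresholds, handling of absent ENOGs) is not described here, so the function names and the scoring rule are illustrative assumptions. The four weights are copied from the table below.

```python
# A few (ENOG, weight) pairs copied from the table on this page.
WEIGHTS = {
    "COG1239": 0.061527,   # positive: presence points towards the trait
    "2ZS4D": 0.058038,
    "COG1017": -0.033104,  # negative: absence points towards the trait
    "COG2346": -0.033088,
}


def weighted_evidence(present_enogs, weights):
    """Sum the weights of all ENOGs found in a genome.

    Assumption: a simple linear score without bias term. This is an
    illustration of how signed weights add up, not PhenDB's own code.
    """
    return sum(w for enog, w in weights.items() if enog in present_enogs)


def rank_by_importance(weights):
    """Order ENOG names by the magnitude (absolute value) of their weight."""
    return sorted(weights, key=lambda enog: abs(weights[enog]), reverse=True)


# Hypothetical genome that contains one positive and one negative ENOG.
genome = {"COG1239", "COG1017"}
print(round(weighted_evidence(genome, WEIGHTS), 6))
print(rank_by_importance(WEIGHTS))
```

In this sketch a genome carrying a negatively weighted ENOG ends up with a lower score than the same genome without it, which is one way to read "absence is indicative of the trait".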

This table lists the 250 highest-ranking ENOGs of this model.

Columns: rank in model, ENOG name, ENOG description, weight in model
1 COG1239 Involved in chlorophyll biosynthesis. Catalyzes the insertion of magnesium ion into protoporphyrin IX to yield Mg-protoporphyrin IX 0.061527
2 2ZS4D 2-vinyl bacteriochlorophyllide hydratase 0.058038
3 COG1240 Involved in chlorophyll biosynthesis. Catalyzes the insertion of magnesium ion into protoporphyrin IX to yield Mg-protoporphyrin IX 0.051324
4 2Z7K3 photosynthetic reaction center L subunit 0.050036
5 2Z87P photosynthetic electron transport in photosystem II 0.050035
6 COG1348 The key enzymatic reactions in nitrogen fixation are catalyzed by the nitrogenase complex which has 2 components the iron protein and the molybdenum-iron protein 0.049721
7 COG2710 Component of the dark-operative protochlorophyllide reductase (DPOR) that uses Mg-ATP and reduced ferredoxin to reduce ring D of protochlorophyllide (Pchlide) to form chlorophyllide a (Chlide). This reaction is light-independent. The NB-protein 0.049709
8 2Z7SF The reaction center of purple bacteria contains a tightly bound cytochrome molecule which re-reduces the photo oxidized primary electron donor 0.046040
9 COG4742 methyltransferase activity 0.042987
10 COG1719 4-vinyl reductase, 4VR 0.040284
11 3313P PRC-barrel domain 0.039721
12 331U4 Antenna complexes are light-harvesting systems, which transfer the excitation energy to the reaction centers 0.036799
13 32YH7 Photosynthetic complex assembly protein 0.036300
14 333MC Antenna complexes are light-harvesting systems, which transfer the excitation energy to the reaction centers 0.036300
15 33BSN bacteriochlorophyll binding 0.035154
16 COG1429 ligase activity, forming nitrogen-metal bonds 0.034254
17 COG1233 COG1233 Phytoene dehydrogenase and related 0.033509
18 COG1017 nitric oxide dioxygenase activity -0.033104
19 COG2346 COG2346, Truncated hemoglobins -0.033088
20 3325Y 0.032725
21 COG1032 radical SAM domain protein 0.030531
22 COG3283 Transcriptional regulator of aromatic amino acids metabolism 0.030174
23 COG1035 Coenzyme F420 hydrogenase/dehydrogenase, beta subunit C terminus 0.028307
24 COG5621 secreted hydrolase 0.028195
25 COG1926 phosphoribosyltransferase -0.027427
26 COG3666 COG3666 Transposase and inactivated derivatives 0.027075
27 COG2309 aminopeptidase activity -0.026982
28 COG0644 oxidoreductase 0.026944
29 30PIR Uncharacterized protein conserved in bacteria (DUF2256) 0.026642
30 COG2307 Protein conserved in bacteria 0.026422
31 COG1010 Tetrapyrrole (Corrin/Porphyrin) Methylases 0.026420
32 COG2343 Domain of unknown function (DUF427) 0.025610
33 32SUV 0.025312
34 COG0631 protein serine/threonine phosphatase activity -0.025297
35 COG4277 radical SAM 0.025164
36 COG1140 nitrate reductase beta subunit -0.025126
37 339P0 KaiB 0.024599
38 COG2211 Major facilitator Superfamily 0.023675
39 COG2828 Protein conserved in bacteria -0.023587
40 COG4679 PFAM Phage derived protein Gp49-like (DUF891) -0.023212
41 COG1574 metal-dependent hydrolase with the TIM-barrel fold -0.022839
42 COG2175 clavaminate synthase activity -0.022762
43 COG3046 protein related to deoxyribodipyrimidine photolyase 0.022686
44 COG1407 ICC-like phosphoesterases 0.022623
45 COG4094 NnrU protein 0.022542
46 COG3189 MarR family transcriptional regulator -0.022415
47 COG4409 exo-alpha-(2->6)-sialidase activity -0.022413
48 32SUX NADH ubiquinone oxidoreductase complex i intermediate-associated protein 30 0.022311
49 COG1509 lysine 2,3-aminomutase activity -0.022204
50 COG2308 Evidence 4 Homologs of previously reported genes of 0.022141
51 COG0116 Belongs to the methyltransferase superfamily -0.021874
52 COG3166 PFAM Fimbrial assembly family protein -0.021758
53 COG2243 precorrin-2 c20-methyltransferase 0.021757
54 COG1122 ATPase activity 0.021630
55 COG1251 Belongs to the nitrite and sulfite reductase 4Fe-4S domain family -0.021608
56 COG2818 Glycosylase -0.021597
57 COG2180 nitrate reductase molybdenum cofactor assembly chaperone -0.021343
58 COG1022 Amp-dependent synthetase and ligase 0.021186
59 COG4641 Protein conserved in bacteria -0.021174
60 COG0562 UDP-galactopyranose mutase -0.021168
61 COG3476 COG3476 Tryptophan-rich sensory protein (mitochondrial benzodiazepine receptor homolog) 0.020989
62 COG5188 Subtilase family 0.020979
63 COG1872 DUF167 -0.020957
64 COG3971 2-oxopent-4-enoate hydratase activity -0.020943
65 COG3435 Gentisate 1,2-dioxygenase -0.020921
66 COG5012 cobalamin binding 0.020759
67 COG0207 Catalyzes the reductive methylation of 2'-deoxyuridine-5'-monophosphate (dUMP) to 2'-deoxythymidine-5'-monophosphate (dTMP) while utilizing 5,10-methylenetetrahydrofolate (mTHF) as the methyl donor and reductant in the reaction, yielding dihydrofolate (DHF) as a by-product. This enzymatic reaction provides an intracellular de novo source of dTMP, an essential precursor for DNA biosynthesis -0.020699
68 COG2241 protein methyltransferase activity 0.020620
69 COG0698 ribose 5-phosphate isomerase -0.020568
70 COG0393 Putative heavy-metal-binding -0.020550
71 COG3554 Major facilitator Superfamily -0.020517
72 COG3187 Heat shock protein 0.020467
73 COG0775 Catalyzes the irreversible cleavage of the glycosidic bond in both 5'-methylthioadenosine (MTA) and S-adenosylhomocysteine (SAH/AdoHcy) to adenine and the corresponding thioribose, 5'-methylthioribose and S-ribosylhomocysteine, respectively -0.020459
74 COG2162 N-hydroxyarylamine O-acetyltransferase activity -0.020443
75 COG1526 Required for formate dehydrogenase (FDH) activity. Acts as a sulfur carrier protein that transfers sulfur from IscS to the molybdenum cofactor prior to its insertion into FDH -0.020423
76 COG1633 Catalyzes the formation of the isocyclic ring in chlorophyll biosynthesis. Mediates the cyclase reaction, which results in the formation of divinylprotochlorophyllide (Pchlide) characteristic of all chlorophylls from magnesium-protoporphyrin IX 13-monomethyl ester (MgPMME) 0.020417
77 COG0586 Pfam SNARE associated Golgi protein -0.020417
78 COG1053 succinate dehydrogenase -0.020224
79 COG2327 Polysaccharide pyruvyl transferase -0.020047
80 COG1918 ferrous iron import across plasma membrane 0.019905
81 COG3552 Protein containing von Willebrand factor type A (VWA) domain 0.019865
82 COG2964 Protein conserved in bacteria 0.019857
83 COG2998 abc-type tungstate transport system, permease component -0.019760
84 3358U Antenna complexes are light-harvesting systems, which transfer the excitation energy to the reaction centers 0.019738
85 COG4327 Domain of unknown function (DUF4212) 0.019737
86 COG1783 phage Terminase large subunit -0.019712
87 COG2172 sigma factor antagonist activity 0.019667
88 COG4757 Alpha beta hydrolase -0.019651
89 COG2875 Belongs to the precorrin methyltransferase family 0.019644
90 2ZA89 Glycosyl transferase family 2 -0.019627
91 COG0310 cobalt ion transport 0.019471
92 COG2910 NAD(P)H-binding 0.019449
93 COG0069 glutamate synthase activity -0.019326
94 COG3004 Na(+)/H(+) antiporter that extrudes sodium in exchange for external protons -0.019275
95 COG3577 Aspartyl protease 0.019242
96 COG4662 Binding-protein-dependent transport system inner membrane component -0.019231
97 33KDF F subunit of K+-transporting ATPase (Potass_KdpF) 0.019229
98 COG1105 Belongs to the carbohydrate kinase PfkB family -0.019162
99 2ZA2F 0.019106
100 33APD 0.019058
101 COG3563 Capsule polysaccharide -0.018975
102 COG4691 Plasmid stability protein 0.018953
103 COG5271 translation initiation factor activity 0.018923
104 33EHH Antenna complexes are light-harvesting systems, which transfer the excitation energy to the reaction centers 0.018914
105 COG2169 Transcriptional regulator 0.018781
106 COG4451 ribulose bisphosphate carboxylase, small 0.018751
107 32YT2 Protein of unknown function (DUF2442) 0.018649
108 COG4852 Membrane 0.018648
109 COG4552 Acetyltransferase involved in intracellular survival and related -0.018570
110 COG2363 Small membrane protein -0.018568
111 COG3405 Belongs to the glycosyl hydrolase 8 (cellulase D) family -0.018549
112 30WUD -0.018501
113 COG2840 Smr protein MutS2 -0.018461
114 COG2357 RelA SpoT domain protein -0.018379
115 COG2378 regulation of single-species biofilm formation -0.018359
116 COG2842 Transposition protein 0.018291
117 COG0432 Pfam Uncharacterised protein family UPF0047 -0.018255
118 COG3332 Transport and Golgi organisation 2 -0.018214
119 2ZTDD -0.018214
120 COG2044 PFAM DsrE DsrF-like family 0.018183
121 32ZJ3 0.018166
122 COG5325 Glycosyltransferase like family 0.018115
123 COG3698 Phosphodiester glycosidase 0.018114
124 COG2833 Protein of unknown function (DUF455) -0.018051
125 COG3381 protein complex oligomerization 0.017977
126 COG3945 hemerythrin HHE cation binding domain -0.017975
127 COG1331 Highly conserved protein containing a thioredoxin domain -0.017959
128 COG0689 Phosphorolytic exoribonuclease that removes nucleotide residues following the -CCA terminus of tRNA and adds nucleotides to the ends of RNA molecules by using nucleoside diphosphates as substrates 0.017954
129 32372 T4-like virus tail tube protein gp19 0.017936
130 COG2158 Cysteine-rich small domain 0.017910
131 COG5637 Cyclase dehydrase -0.017905
132 COG1797 cobyrinic acid a,c-diamide synthase activity 0.017871
133 COG3232 5-carboxymethyl-2-hydroxymuconate isomerase -0.017867
134 COG4256 Hemin uptake protein 0.017767
135 COG1850 ribulose-bisphosphate carboxylase activity 0.017738
136 COG2520 Methyltransferase fkbm family 0.017725
137 COG4548 von Willebrand factor, type A 0.017657
138 COG2269 With EpmB is involved in the beta-lysylation step of the post-translational modification of translation elongation factor P (EF-P). Catalyzes the ATP-dependent activation of (R)-beta-lysine produced by EpmB, forming a lysyl-adenylate, from which the beta-lysyl moiety is then transferred to the epsilon-amino group of a conserved specific lysine residue in EF-P -0.017588
139 30G1W spheroidene monooxygenase 0.017571
140 COG2099 Precorrin-6x reductase 0.017444
141 COG1489 DNA binding -0.017437
142 COG0415 Belongs to the DNA photolyase family 0.017413
143 2ZWCV 0.017396
144 COG3608 succinylglutamate desuccinylase aspartoacylase 0.017347
145 COG1366 Belongs to the anti-sigma-factor antagonist family 0.017340
146 COG3735 TraB family -0.017335
147 32H4Q Bacterial regulatory proteins, tetR family -0.017319
148 COG5606 Transcriptional regulator -0.017286
149 COG2132 Multicopper oxidase -0.017282
150 COG1830 Aldolase -0.017269
151 COG1758 RNA polymerase activity -0.017223
152 33AK9 Antitoxin Phd_YefM, type II toxin-antitoxin system 0.017159
153 33DHE 0.017159
154 COG1770 oligopeptidase 0.017113
155 COG4689 acetoacetate decarboxylase -0.017109
156 COG0668 transmembrane transport 0.017071
157 33IKX 0.017047
158 COG4875 SnoaL-like domain 0.017017
159 COG3523 ImpA, N-terminal, type VI secretion system -0.017005
160 COG5343 regulation of RNA biosynthetic process 0.016956
161 COG1063 alcohol dehydrogenase 0.016916
162 COG3320 Male sterility 0.016842
163 COG3128 pkhd-type hydroxylase -0.016840
164 COG5135 pyridoxamine 5-phosphate 0.016837
165 COG2721 sulfolactate sulfo-lyase activity -0.016825
166 COG1139 Iron-sulfur cluster binding protein -0.016814
167 339KW 0.016801
168 COG4585 Histidine kinase 0.016790
169 331C5 0.016764
170 COG5302 Post-segregation antitoxin (ccd killing mechanism protein) encoded by the F plasmid -0.016756
171 COG0447 Converts o-succinylbenzoyl-CoA (OSB-CoA) to 1,4-dihydroxy-2-naphthoyl-CoA (DHNA-CoA) 0.016724
172 COG3913 Uncharacterized protein conserved in bacteria (DUF2094) -0.016688
173 COG3030 protein affecting phage T7 exclusion by the F plasmid -0.016640
174 COG2376 Dihydroxyacetone kinase -0.016638
175 COG3383 formate dehydrogenase (NAD+) activity -0.016623
176 COG5615 integral membrane protein -0.016594
177 COG3911 COG3911 Predicted ATPase 0.016576
178 COG2138 Cobalamin (vitamin B12) biosynthesis CbiX protein -0.016540
179 COG0301 tRNA thio-modification -0.016522
180 COG0619 Transmembrane (T) component of an energy-coupling factor (ECF) ABC-transporter complex. Unlike classic ABC transporters this ECF transporter provides the energy necessary to transport a number of different substrates 0.016504
181 COG1079 Belongs to the binding-protein-dependent transport system permease family 0.016502
182 COG4603 Belongs to the binding-protein-dependent transport system permease family 0.016502
183 COG0213 The enzymes which catalyze the reversible phosphorolysis of pyrimidine nucleosides are involved in the degradation of these compounds and in their utilization as carbon and energy sources, or in the rescue of pyrimidine bases for nucleotide synthesis -0.016485
184 COG3473 Maleate cis-trans isomerase -0.016479
185 COG1763 Mo-molybdopterin cofactor biosynthetic process -0.016476
186 COG1392 Phosphate transport regulator -0.016413
187 2Z7P1 Domain of unknown function (DUF4331) 0.016367
188 COG0634 Belongs to the purine pyrimidine phosphoribosyltransferase family -0.016317
189 COG3378 Phage plasmid primase P4 family -0.016293
190 COG3832 glyoxalase III activity -0.016291
191 33C8H 0.016246
192 COG2146 nitrite reductase [NAD(P)H] activity -0.016196
193 COG2082 Precorrin-8x methylmutase 0.016178
194 COG2331 Regulatory protein, FmdB family 0.016170
195 COG0780 Catalyzes the NADPH-dependent reduction of 7-cyano-7-deazaguanine (preQ0) to 7-aminomethyl-7-deazaguanine (preQ1) 0.016139
196 COG3585 molybdate ion transport 0.016121
197 COG1441 menaquinone biosynthetic process 0.016059
198 332UY Protein of unknown function (DUF1761) -0.016028
199 COG4117 Thiosulfate reductase cytochrome B subunit (Membrane anchoring protein) 0.015982
200 COG1533 DNA photolyase activity -0.015982
201 COG2137 regulation of response to DNA damage stimulus -0.015959
202 COG4118 positive regulation of growth 0.015898
203 COG2336 PFAM SpoVT AbrB 0.015855
204 COG2032 superoxide dismutase activity -0.015850
205 COG3183 HNH endonuclease 0.015822
206 COG3520 type VI secretion protein -0.015798
207 COG2055 Belongs to the LDH2 MDH2 oxidoreductase family 0.015795
208 COG3619 membrane -0.015791
209 COG0386 Belongs to the glutathione peroxidase family 0.015781
210 COG3342 major pilin protein fima -0.015780
211 COG4692 amylo-alpha-1,6-glucosidase activity -0.015661
212 COG3602 ACT domain 0.015659
213 COG4634 DNA integration 0.015575
214 COG1862 Preprotein translocase -0.015566
215 COG3484 0.015513
216 COG3970 fumarylacetoacetate (FAA) hydrolase -0.015512
217 COG3839 Belongs to the ABC transporter superfamily 0.015509
218 33AXN Gamma-glutamyl cyclotransferase, AIG2-like -0.015499
219 COG3676 manually curated -0.015498
220 COG1836 PFAM Integral membrane protein DUF92 0.015491
221 32XD7 0.015481
222 COG3123 guanosine phosphorylase activity -0.015457
223 COG3299 homolog of phage Mu protein gp47 0.015385
224 2Z7K2 Bacterial cellulose synthase subunit -0.015294
225 COG2318 DinB family -0.015284
226 COG1776 CheC inhibitor of MCP methylation -0.015283
227 COG3254 rhamnose metabolic process -0.015261
228 COG0846 NAD-dependent lysine deacetylase and desuccinylase that specifically removes acetyl and succinyl groups on target proteins. Modulates the activities of several proteins which are inactive in their acylated form -0.015244
229 COG1796 DNA polymerase -0.015209
230 COG3623 hexulose-6-phosphate isomerase -0.015175
231 COG3513 CRISPR (clustered regularly interspaced short palindromic repeat) is an adaptive immune system that provides protection against mobile genetic elements (viruses, transposable elements and conjugative plasmids). CRISPR clusters contain spacers, sequences complementary to antecedent mobile elements, and target invading nucleic acids. CRISPR clusters are transcribed and processed into CRISPR RNA (crRNA). In type II CRISPR systems correct processing of pre-crRNA requires a trans-encoded small RNA (tracrRNA), endogenous ribonuclease 3 (rnc) and this protein. The tracrRNA serves as a guide for ribonuclease 3-aided processing of pre-crRNA. Subsequently Cas9/crRNA/tracrRNA endonucleolytically cleaves linear or circular dsDNA target complementary to the spacer 0.015174
232 COG4967 type IV pilus modification protein PilV -0.015162
233 COG1744 ABC transporter substrate-binding protein PnrA-like 0.015105
234 COG3077 Addiction module antitoxin, RelB DinJ family -0.015043
235 3483J 0.015042
236 COG2975 iron-sulfur cluster assembly -0.015032
237 COG1487 ribonuclease activity 0.014985
238 32U3U 0.014977
239 COG3618 amidohydrolase -0.014952
240 COG0724 RNA recognition motif 0.014946
241 30CV1 Type II secretion system (T2SS), protein M subtype b -0.014902
242 COG4388 Caudovirus prohead serine protease 0.014887
243 COG0510 thiamine kinase activity -0.014879
244 COG0803 Belongs to the bacterial solute-binding protein 9 family -0.014876
245 COG2379 hydroxypyruvate reductase -0.014841
246 COG3721 Heme iron utilization protein 0.014833
247 COG0798 PFAM Bile acid sodium symporter 0.014824
248 33DC0 0.014808
249 COG3162 membrane -0.014790
250 COG2854 intermembrane phospholipid transfer -0.014767
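
The rows above are whitespace-separated, and some ENOGs have no description, so splitting on a fixed column count does not work. The following Python sketch shows one way to parse a row. It assumes that the first token is always the rank, the second the ENOG name, and the last the weight, which holds for the rows on this page. The helper names are hypothetical.

```python
def parse_row(line):
    """Split one table row into (rank, enog, description, weight).

    Assumes: first token = rank, second token = ENOG name, last token =
    weight. Everything in between is the (possibly empty) description.
    """
    rank, rest = line.split(None, 1)
    body, weight = rest.rsplit(None, 1)
    parts = body.split(None, 1)
    enog = parts[0]
    description = parts[1] if len(parts) > 1 else ""
    return int(rank), enog, description, float(weight)


def split_by_sign(rows):
    """Separate parsed rows into positively and negatively weighted ENOGs."""
    positive = [r for r in rows if r[3] > 0]
    negative = [r for r in rows if r[3] < 0]
    return positive, negative


# Three rows copied verbatim from the table above.
sample = [
    "18 COG1017 nitric oxide dioxygenase activity -0.033104",
    "20 3325Y 0.032725",
    "37 339P0 KaiB 0.024599",
]
rows = [parse_row(line) for line in sample]
positive, negative = split_by_sign(rows)
print(len(positive), len(negative))
```

Taking the weight from the right-hand end keeps the parser robust against descriptions that contain numbers or spaces.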