Model Internals

Each PhenDB model is trained on the presence or absence of bacterial ENOGs (orthologous groups from EggNOG 4.5) in the training genomes. Each ENOG receives a weight whose magnitude reflects the importance of that ENOG for the final prediction. The sign of the weight indicates whether the presence (positive weight) or the absence (negative weight) of the ENOG is indicative of the trait.
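To make the role of magnitude and sign concrete, here is a minimal sketch in Python. It assumes a plain linear model over binary presence/absence features, which this page does not confirm; the function names, the `bias` parameter, and the choice of ENOGs are hypothetical, and only the weights themselves are copied from the table.

```python
# A few weights copied from the table, for illustration only.
WEIGHTS = {
    "COG5554": 0.129188,
    "COG1433": 0.125728,
    "COG1348": 0.096548,
    "32SCF": -0.032379,
    "COG2833": -0.031217,
}


def linear_score(present_enogs, weights=WEIGHTS, bias=0.0):
    """Sum the weights of the ENOGs found in a genome.

    Assumption: the model is linear, so each present ENOG adds its
    weight to the score. Positive weights push towards the trait,
    negative weights push away from it.
    """
    return bias + sum(w for enog, w in weights.items() if enog in present_enogs)


def rank_by_importance(weights=WEIGHTS):
    """Order ENOG names by absolute weight, largest first."""
    return sorted(weights, key=lambda enog: abs(weights[enog]), reverse=True)


if __name__ == "__main__":
    genome = {"COG5554", "COG1433", "32SCF"}  # hypothetical genome content
    print(linear_score(genome))
    print(rank_by_importance())
```

Under this assumption, a genome that carries the high-weight nitrogenase ENOGs scores well above zero, while one that carries only negatively weighted ENOGs scores below zero.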

This table lists the 250 highest-ranking ENOGs of this model, ordered by the absolute value of their weight.

Columns: rank in model, ENOG name, ENOG description, weight in model
1 COG5554 nitrogen fixation 0.129188
2 COG1433 Dinitrogenase iron-molybdenum cofactor 0.125728
3 2ZBN7 nitrogen fixation protein 0.117944
4 330W8 May protect the nitrogenase Fe-Mo protein from oxidative damage 0.104573
5 COG1348 The key enzymatic reactions in nitrogen fixation are catalyzed by the nitrogenase complex, which has 2 components: the iron protein and the molybdenum-iron protein 0.096548
6 COG2710 Component of the dark-operative protochlorophyllide reductase (DPOR) that uses Mg-ATP and reduced ferredoxin to reduce ring D of protochlorophyllide (Pchlide) to form chlorophyllide a (Chlide). This reaction is light-independent. The NB-protein 0.096548
7 327FW molybdenum ion binding 0.083442
8 COG5420 small protein containing a coiled-coil domain 0.077234
9 2Z8CH SIR2-like domain 0.071741
10 COG1740 oxidoreductase activity, acting on hydrogen as donor, iron-sulfur protein as acceptor 0.042094
11 2ZV1D 0.041405
12 COG0374 Belongs to the [NiFe]/[NiFeSe] hydrogenase large subunit family 0.036909
13 COG2461 Hemerythrin HHE cation binding domain protein 0.036702
14 COG1355 regulation of microtubule-based process 0.034324
15 330EE PFAM NifZ family protein 0.033995
16 COG3585 molybdate ion transport 0.033115
17 COG1151 hydroxylamine reductase activity 0.032952
18 32SCF MEKHLA domain -0.032379
19 COG2358 TRAP transporter, solute receptor (TAXI family) 0.032043
20 COG0068 Along with HypE, it catalyzes the synthesis of the CN ligands of the active site iron of [NiFe]-hydrogenases using carbamoylphosphate as a substrate. It functions as a carbamoyl transferase using carbamoylphosphate as a substrate and transferring the carboxamido moiety in an ATP-dependent reaction to the thiolate of the C-terminal cysteine of HypE yielding a protein-S-carboxamide 0.031890
21 COG0298 carbon dioxide binding 0.031271
22 2ZXZV Outer membrane efflux protein 0.031234
23 COG2833 Protein of unknown function (DUF455) -0.031217
24 COG0375 protein maturation 0.031215
25 COG4916 TIR domain 0.030900
26 COG1816 Catalyzes the hydrolytic deamination of adenine to hypoxanthine. Plays an important role in the purine salvage pathway and in nitrogen catabolism 0.030433
27 COG2821 Murein-degrading enzyme. May play a role in recycling of muropeptides during cell elongation and/or cell division -0.030090
28 COG0409 Hydrogenase expression formation protein 0.029604
29 COG0309 Hydrogenase expression formation protein (HypE) 0.029533
30 COG3593 DNA synthesis involved in DNA repair -0.029495
31 2Z8KN PFAM Dinitrogenase reductase ADP-ribosyltransferase 0.029251
32 COG0680 Initiates the rapid degradation of small, acid-soluble proteins during spore germination 0.029000
33 COG1022 Amp-dependent synthetase and ligase 0.028680
34 COG1294 oxidase, subunit 0.028475
35 COG1669 hydrolase activity, acting on ester bonds 0.028236
36 COG5485 Ester cyclase 0.028209
37 COG4268 DNA restriction-modification system -0.028085
38 COG2234 aminopeptidase activity 0.027857
39 COG0589 response to stress -0.027839
40 3287S Domain of unknown function (DUF4334) -0.027766
41 COG5126 Ca2+-binding protein (EF-Hand superfamily) 0.027757
42 COG4380 Lipoprotein 0.027751
43 COG2032 superoxide dismutase activity -0.027739
44 COG0340 biotin-[acetyl-CoA-carboxylase] ligase activity -0.027622
45 COG1518 maintenance of DNA repeat elements 0.027603
46 COG0302 GTP cyclohydrolase -0.027589
47 COG1271 oxidase subunit 0.027485
48 COG2129 metallophosphoesterase -0.027319
49 COG1086 Polysaccharide biosynthesis protein -0.027298
50 COG3146 Protein conserved in bacteria -0.026801
51 COG2253 Psort location Cytoplasmic, score -0.026663
52 COG1190 Belongs to the class-II aminoacyl-tRNA synthetase family -0.026436
53 COG1468 defense response to virus 0.026197
54 COG0474 ATPase, P-type transporting, HAD superfamily, subfamily IC 0.026027
55 COG2733 membrane 0.025836
56 330M2 -0.025717
57 COG0588 phosphoglycerate mutase activity -0.025681
58 COG1785 Belongs to the alkaline phosphatase family 0.025612
59 COG1343 CRISPR (clustered regularly interspaced short palindromic repeat), is an adaptive immune system that provides protection against mobile genetic elements (viruses, transposable elements and conjugative plasmids). CRISPR clusters contain sequences complementary to antecedent mobile elements and target invading nucleic acids. CRISPR clusters are transcribed and processed into CRISPR RNA (crRNA). Functions as a ssRNA-specific endoribonuclease. Involved in the integration of spacer DNA into the CRISPR cassette 0.025432
60 COG3926 Glycosyl hydrolase 108 0.025276
61 COG1619 proteins homologs of microcin C7 resistance protein MccF -0.025258
62 COG5553 Cysteine dioxygenase type I 0.025219
63 COG4237 Hydrogenase 4 membrane 0.025205
64 32TH1 0.025176
65 COG2837 iron assimilation 0.024989
66 COG2808 Putative FMN-binding domain 0.024982
67 COG1543 Belongs to the glycosyl hydrolase 57 family -0.024864
68 COG3259 Nickel-dependent hydrogenase 0.024861
69 COG5592 hemerythrin HHE cation binding domain 0.024769
70 COG3547 Transposase (IS116/IS110/IS902 family) -0.024420
71 COG3655 Transcriptional regulator 0.024380
72 COG2214 Heat shock protein DnaJ domain protein 0.024107
73 COG3772 lysozyme -0.024103
74 COG3665 Urea carboxylase-associated protein 0.024069
75 COG5184 regulator of chromosome condensation, RCC1 0.024029
76 COG0561 phosphatase activity -0.023987
77 COG0418 dihydroorotase activity -0.023981
78 COG1167 Transcriptional regulators containing a DNA-binding HTH domain and an aminotransferase domain MocR family and their eukaryotic orthologs 0.023969
79 COG3428 Bacterial PH domain -0.023954
80 COG3271 cysteine-type peptidase activity 0.023933
81 COG4308 Limonene-1,2-epoxide hydrolase catalytic domain 0.023703
82 COG1654 Acts both as a biotin--[acetyl-CoA-carboxylase] ligase and a biotin-operon repressor. In the presence of ATP, BirA activates biotin to form the BirA-biotinyl-5'-adenylate (BirA-bio-5'-AMP or holoBirA) complex. HoloBirA can either transfer the biotinyl moiety to the biotin carboxyl carrier protein (BCCP) subunit of acetyl-CoA carboxylase, or bind to the biotin operator site and inhibit transcription of the operon 0.023643
83 COG0208 Provides the precursors necessary for DNA synthesis. Catalyzes the biosynthesis of deoxyribonucleotides from the corresponding ribonucleotides -0.023588
84 COG0861 Integral membrane protein TerC family -0.023461
85 COG0117 Converts 2,5-diamino-6-(ribosylamino)-4(3H)-pyrimidinone 5'-phosphate into 5-amino-6-(ribosylamino)-2,4(1H,3H)-pyrimidinedione 5'-phosphate -0.023405
86 COG0798 PFAM Bile acid sodium symporter 0.023382
87 COG4857 Catalyzes the phosphorylation of methylthioribose into methylthioribose-1-phosphate 0.023342
88 COG1401 restriction endodeoxyribonuclease activity -0.023284
89 COG0863 Belongs to the N(4)/N(6)-methyltransferase family -0.023257
90 331CC 0.023129
91 338SB 0.023129
92 COG5527 Initiator Replication protein -0.023017
93 COG2210 Belongs to the sulfur carrier protein TusA family 0.022795
94 COG2018 Roadblock/LC7 domain 0.022627
95 COG0610 Subunit R is required for both nuclease and ATPase activities, but not for modification -0.022607
96 COG1929 Belongs to the glycerate kinase type-1 family 0.022603
97 COG1833 Belongs to the SfsA family 0.022566
98 COG4231 catalyzes the ferredoxin-dependent oxidative decarboxylation of arylpyruvates 0.022562
99 COG3449 Inhibits the supercoiling activity of DNA gyrase. Acts by inhibiting DNA gyrase at an early step, prior to (or at the step of) binding of DNA by the gyrase. It protects cells against toxins that target DNA gyrase, by inhibiting activity of these toxins and reducing the formation of lethal double-strand breaks in the cell 0.022538
100 COG4327 Domain of unknown function (DUF4212) 0.022486
101 30BDS 0.022481
102 COG0670 Belongs to the BI1 family -0.022429
103 COG1209 Catalyzes the formation of dTDP-glucose, from dTTP and glucose 1-phosphate, as well as its pyrophosphorolysis -0.022250
104 COG3847 Flp/Fap pilin component -0.022229
105 33AYK 0.022190
106 COG2183 response to ionizing radiation 0.022171
107 332SV 0.022150
108 30A8Q Bacterial regulatory proteins, tetR family 0.022131
109 COG4773 Receptor 0.022114
110 COG1032 radical SAM domain protein 0.022112
111 COG0447 Converts o-succinylbenzoyl-CoA (OSB-CoA) to 1,4-dihydroxy-2-naphthoyl-CoA (DHNA-CoA) 0.022085
112 COG0380 Probably involved in the osmoprotection via the biosynthesis of trehalose. Catalyzes the transfer of glucose from UDP-glucose (UDP-Glc) to D-glucose 6-phosphate (Glc-6-P) to form trehalose-6-phosphate. Acts with retention of the anomeric configuration of the UDP-sugar donor -0.022058
113 COG0388 Catalyzes the ATP-dependent amidation of deamido-NAD to form NAD. Uses L-glutamine as a nitrogen source -0.022014
114 COG0786 glutamate:sodium symporter activity -0.021942
115 COG4704 Protein conserved in bacteria 0.021902
116 COG1969 Ni/Fe-hydrogenase, b-type cytochrome subunit 0.021857
117 COG2866 Carboxypeptidase -0.021847
118 COG1145 4Fe-4S ferredoxin, iron-sulfur binding domain protein 0.021819
119 COG4929 membrane-anchored protein 0.021783
120 330BP Nitrogen fixation protein NifW 0.021754
121 COG2329 Antibiotic biosynthesis monooxygenase -0.021700
122 32R8H Sulfotransferase domain 0.021658
123 COG2345 Transcriptional regulator 0.021521
124 COG4095 Psort location CytoplasmicMembrane, score 0.021469
125 COG0040 Catalyzes the condensation of ATP and 5-phosphoribose 1-diphosphate to form N'-(5'-phosphoribosyl)-ATP (PR-ATP). Has a crucial role in the pathway because the rate of histidine biosynthesis seems to be controlled primarily by regulation of hisG enzymatic activity -0.021451
126 COG0607 Catalyzes the ATP-dependent transfer of a sulfur to tRNA to produce 4-thiouridine in position 8 of tRNAs, which functions as a near-UV photosensor. Also catalyzes the transfer of sulfur to the sulfur carrier protein ThiS, forming ThiS-thiocarboxylate. This is a step in the synthesis of thiazole, in the thiamine biosynthesis pathway. The sulfur is donated as persulfide by IscS 0.021449
127 COG3835 regulator 0.021308
128 COG0783 Belongs to the Dps family -0.021304
129 COG3916 Acyl-homoserine-lactone synthase -0.021303
130 COG2188 Transcriptional regulator 0.021263
131 COG3405 Belongs to the glycosyl hydrolase 8 (cellulase D) family -0.021211
132 COG1657 PFAM Prenyltransferase squalene oxidase 0.021185
133 COG3023 N-Acetylmuramoyl-L-alanine amidase -0.021106
134 COG2907 Flavin containing amine oxidoreductase 0.021104
135 COG2425 protein containing a von Willebrand factor type A (vWA) domain -0.020989
136 COG3330 Domain of unknown function (DUF4912) -0.020862
137 COG1694 MazG nucleotide pyrophosphohydrolase -0.020854
138 COG1658 Required for correct processing of both the 5' and 3' ends of 5S rRNA precursor. Cleaves both sides of a double-stranded region yielding mature 5S rRNA in one step 0.020749
139 COG2856 Zn peptidase 0.020711
140 COG0141 Catalyzes the sequential NAD-dependent oxidations of L-histidinol to L-histidinaldehyde and then to L-histidine -0.020702
141 COG0186 One of the primary rRNA binding proteins, it binds specifically to the 5'-end of 16S ribosomal RNA -0.020642
142 COG3658 cytochrome 0.020639
143 COG1793 DNA ligase -0.020583
144 COG3558 Protein conserved in bacteria -0.020581
145 COG3647 Membrane 0.020552
146 COG3556 membrane -0.020542
147 COG3550 peptidyl-serine autophosphorylation 0.020434
148 COG4485 Bacterial membrane protein, YfhO 0.020411
149 COG3419 Tfp pilus assembly protein tip-associated adhesin 0.020301
150 COG3113 response to antibiotic 0.020211
151 COG1993 acr, cog1993 0.020202
152 COG4447 cellulose binding -0.020128
153 COG3249 protein conserved in bacteria 0.020128
154 COG4382 Mu-like prophage protein Gp16 0.020099
155 COG1715 Restriction endonuclease 0.020088
156 COG5470 NIPSNAP family containing protein -0.020030
157 COG2078 PFAM AMMECR1 domain protein 0.019973
158 COG1523 belongs to the glycosyl hydrolase 13 family 0.019960
159 COG1051 belongs to the nudix hydrolase family -0.019875
160 COG0389 Poorly processive, error-prone DNA polymerase involved in untargeted mutagenesis. Copies undamaged DNA at stalled replication forks, which arise in vivo from mismatched or misaligned primer ends. These misaligned primers can be extended by PolIV. Exhibits no 3'-5' exonuclease (proofreading) activity. May be involved in translesional synthesis, in conjunction with the beta clamp from PolIII -0.019860
161 COG0199 Binds 16S rRNA, required for the assembly of 30S particles and may also be responsible for determining the conformation of the 16S rRNA at the A site -0.019837
162 COG0257 Belongs to the bacterial ribosomal protein bL36 family -0.019689
163 COG0267 Belongs to the bacterial ribosomal protein bL33 family -0.019689
164 COG2801 Transposase and inactivated derivatives 0.019661
165 COG4976 Methyltransferase 0.019660
166 COG2246 polysaccharide biosynthetic process 0.019634
167 COG1003 The glycine cleavage system catalyzes the degradation of glycine. The P protein binds the alpha-amino group of glycine through its pyridoxal phosphate cofactor 0.019611
168 COG1451 nucleotide metabolic process -0.019591
169 COG4623 carbon-oxygen lyase activity, acting on polysaccharides -0.019591
170 2ZITU Cupin domain 0.019577
171 COG4118 positive regulation of growth -0.019567
172 COG1168 Bifunctional PLP-dependent enzyme with beta-cystathionase and maltose regulon repressor activities 0.019550
173 COG1397 ADP-ribosylglycohydrolase 0.019539
174 COG4960 Type IV leader peptidase family -0.019532
175 COG2824 Alkylphosphonate utilization operon protein PhnA 0.019505
176 32YE8 Psort location Cytoplasmic, score -0.019502
177 COG4597 amino acid ABC transporter 0.019475
178 COG2215 Belongs to the NiCoT transporter (TC 2.A.52) family -0.019425
179 COG2172 sigma factor antagonist activity 0.019354
180 COG0757 Catalyzes a trans-dehydration via an enolate intermediate -0.019341
181 COG1352 Methylation of the membrane-bound methyl-accepting chemotaxis proteins (MCP) to form gamma-glutamyl methyl ester residues in MCP 0.019310
182 COG0662 Cupin 2, conserved barrel domain protein -0.019294
183 COG2309 aminopeptidase activity 0.019289
184 COG0752 glycyl-tRNA synthetase alpha subunit -0.019232
185 339W8 Tautomerase enzyme 0.019220
186 COG3086 response to oxidative stress 0.019214
187 COG4961 PFAM TadE family protein -0.019181
188 COG1035 Coenzyme F420 hydrogenase/dehydrogenase, beta subunit C terminus -0.019177
189 COG3221 ABC-type phosphate/phosphonate transport system periplasmic component 0.019174
190 COG4679 PFAM Phage derived protein Gp49-like (DUF891) -0.019162
191 COG4712 double-strand break repair protein -0.019161
192 COG3090 TRAP-type C4-dicarboxylate transport system, small permease component 0.019140
193 COG3521 Type VI secretion 0.019140
194 COG3189 MarR family transcriptional regulator -0.019133
195 COG1393 Belongs to the ArsC family 0.019128
196 COG3945 hemerythrin HHE cation binding domain 0.019124
197 COG4660 Part of a membrane complex involved in electron transport 0.019056
198 COG1113 amino acid transport -0.019026
199 COG3712 iron ion homeostasis 0.019023
200 COG2610 gluconate transmembrane transporter activity 0.018980
201 32WCJ 0.018936
202 COG5464 transposase or invertase -0.018936
203 COG3946 virulence factor family protein -0.018850
204 COG4469 Competence protein -0.018843
205 COG3393 -acetyltransferase 0.018822
206 COG3505 Type IV secretory pathway VirD4 -0.018769
207 COG1803 Methylglyoxal synthase -0.018763
208 COG1539 Catalyzes the conversion of 7,8-dihydroneopterin to 6-hydroxymethyl-7,8-dihydropterin -0.018750
209 COG0698 ribose 5-phosphate isomerase -0.018743
210 COG1533 DNA photolyase activity 0.018689
211 COG3123 guanosine phosphorylase activity 0.018649
212 COG1941 NADH ubiquinone oxidoreductase 20 kDa subunit 0.018615
213 COG1979 alcohol dehydrogenase 0.018548
214 COG3830 Belongs to the UPF0237 family -0.018540
215 COG3504 Conjugal transfer protein -0.018473
216 COG5002 protein histidine kinase activity 0.018460
217 33FR9 -0.018425
218 COG5617 Psort location CytoplasmicMembrane, score 0.018408
219 33B1F -0.018404
220 COG3548 integral membrane protein 0.018384
221 2ZS4Q ankyrin repeats -0.018372
222 31B87 Bacterial regulatory proteins, tetR family -0.018371
223 COG4551 Low molecular weight phosphotyrosine protein phosphatase -0.018360
224 332E9 -0.018354
225 COG1305 Transglutaminase-like 0.018326
226 COG3336 cytochrome c oxidase 0.018282
227 COG3272 Protein of unknown function (DUF1722) 0.018271
228 COG1525 nuclease -0.018266
229 COG0788 formyltetrahydrofolate deformylase activity -0.018249
230 COG3592 Divergent 4Fe-4S mono-cluster 0.018242
231 COG0510 thiamine kinase activity 0.018236
232 COG0369 Component of the sulfite reductase complex that catalyzes the 6-electron reduction of sulfite to sulfide. This is one of several activities required for the biosynthesis of L-cysteine from sulfate. The flavoprotein component catalyzes the electron flow from NADPH -> FAD -> FMN to the hemoprotein component 0.018222
233 2ZC6N Domain of unknown function (DUF4272) -0.018213
234 COG3921 Protein conserved in bacteria 0.018212
235 COG2366 antibiotic biosynthetic process -0.018202
236 COG2264 protein methyltransferase activity -0.018179
237 COG2143 COG2143 Thioredoxin-related protein -0.018117
238 COG2187 AAA domain 0.018108
239 COG2317 Broad specificity carboxypeptidase that releases amino acids sequentially from the C-terminus, including neutral, aromatic, polar and basic residues 0.018080
240 COG2258 MOSC domain -0.018075
241 COG3825 VWA domain containing CoxE-like protein 0.018059
242 COG1324 tolerance protein -0.018045
243 COG1697 DNA topoisomerase VI subunit A -0.018027
244 COG1371 PFAM Archease protein family (DUF101/UPF0211) 0.018008
245 COG3797 Protein conserved in bacteria -0.017997
246 COG4641 Protein conserved in bacteria 0.017992
247 COG0144 Specifically methylates the cytosine at position 967 (m5C967) of 16S rRNA -0.017991
248 COG1331 Highly conserved protein containing a thioredoxin domain 0.017963
249 COG0327 Belongs to the GTP cyclohydrolase I type 2 NIF3 family -0.017942
250 COG2850 peptidyl-arginine hydroxylation -0.017937
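The rows above are whitespace-separated, and the description column is free text that may be empty. As a rough sketch for reading them programmatically, one could take the first token as the rank, the second as the ENOG name, the last as the weight, and everything in between as the description. The `EnogRow` type and `parse_row` function are hypothetical names, not part of PhenDB.

```python
from typing import NamedTuple


class EnogRow(NamedTuple):
    rank: int
    enog: str
    description: str
    weight: float


def parse_row(line: str) -> EnogRow:
    """Parse one table row of the form: rank, ENOG name, description, weight.

    The description may contain spaces or be missing entirely, so the row
    is split on whitespace and the middle tokens are re-joined.
    """
    tokens = line.split()
    if len(tokens) < 3:
        raise ValueError(f"not a table row: {line!r}")
    rank, enog, weight = tokens[0], tokens[1], tokens[-1]
    description = " ".join(tokens[2:-1])
    return EnogRow(int(rank), enog, description, float(weight))


if __name__ == "__main__":
    print(parse_row("18 32SCF MEKHLA domain -0.032379"))
    print(parse_row("11 2ZV1D 0.041405"))
```

Anchoring on the first two tokens and the last token keeps the parser working for descriptions that themselves contain numbers, such as "Glycosyl hydrolase 108".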