Model Internals

Each PhenDB model is trained on the presence or absence of bacterial ENOGs (orthologous groups from EggNOG 4.5) in the training genomes. Each ENOG is assigned a weight: its magnitude reflects how important that ENOG is for the final prediction, and its sign indicates whether the presence (positive weight) or the absence (negative weight) of the ENOG is indicative of the trait.
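The weight semantics described above can be illustrated with a minimal sketch. It is not PhenDB's own code: the names `EnogWeight`, `rank_by_importance` and `direction` are hypothetical, and the three weights are copied from the table on this page. The sketch only assumes that importance is the absolute value of the weight and that the sign encodes presence versus absence.

```python
from typing import List, NamedTuple


class EnogWeight(NamedTuple):
    """One ENOG and its weight in the model (hypothetical container)."""
    enog: str
    weight: float


def rank_by_importance(weights: List[EnogWeight]) -> List[EnogWeight]:
    """Order ENOGs by the magnitude of their weight, largest first."""
    return sorted(weights, key=lambda w: abs(w.weight), reverse=True)


def direction(w: EnogWeight) -> str:
    """Report whether presence or absence of the ENOG points to the trait."""
    if w.weight > 0:
        return "presence"
    if w.weight < 0:
        return "absence"
    return "neutral"


# Three weights taken from the table on this page.
weights = [
    EnogWeight("COG4677", 0.041532),
    EnogWeight("COG0025", -0.046601),
    EnogWeight("COG0670", -0.041145),
]

for rank, w in enumerate(rank_by_importance(weights), start=1):
    print(rank, w.enog, direction(w), abs(w.weight))
```

With these inputs COG0025 ranks first even though its weight is negative, because only the magnitude determines the rank.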

The table below lists the 250 highest-ranking ENOGs of this model, ordered by the absolute value of their weight.

Rank in model | ENOG name | ENOG description | Weight in model
1 COG0025 NhaP-type Na+/H+ and K+/H+ antiporter -0.046601
2 COG4677 pectinesterase activity 0.041532
3 COG0670 Belongs to the BI1 family -0.041145
4 COG1875 ATPase related to phosphate starvation-inducible protein PhoH 0.040448
5 COG0599 Antioxidant protein with alkyl hydroperoxidase activity. Required for the reduction of the AhpC active site cysteine residues and for the regeneration of the AhpC enzyme activity -0.040141
6 COG0379 Catalyzes the condensation of iminoaspartate with dihydroxyacetone phosphate to form quinolinate 0.039854
7 COG1048 Catalyzes the isomerization of citrate to isocitrate via cis-aconitate 0.039162
8 COG3021 interspecies interaction between organisms -0.037896
9 30J2J 0.036477
10 COG0535 radical SAM domain protein -0.036256
11 COG0616 signal peptide processing -0.036216
12 COG3500 Late control gene D protein 0.036004
13 COG0433 COG0433 Predicted ATPase -0.035797
14 COG1586 Catalyzes the decarboxylation of S-adenosylmethionine to S-adenosylmethioninamine (dcAdoMet), the propylamine donor required for the synthesis of the polyamines spermine and spermidine from the diamine putrescine 0.035567
15 COG3547 Transposase (IS116/IS110/IS902 family) -0.035373
16 COG5627 protein ubiquitination 0.035149
17 COG0790 COG0790 FOG TPR repeat, SEL1 subfamily 0.035090
18 COG4225 unsaturated rhamnogalacturonyl hydrolase activity 0.035011
19 COG3958 Transketolase 0.034926
20 COG0553 Transcription regulator that activates transcription by stimulating RNA polymerase (RNAP) recycling in case of stress conditions such as supercoiled DNA or high salt concentrations. Probably acts by releasing the RNAP, when it is trapped or immobilized on tightly supercoiled DNA. Does not activate transcription on linear DNA. Probably not involved in DNA repair -0.034818
21 COG2096 Adenosyltransferase -0.034567
22 COG3723 Recombinational DNA repair protein, RecE pathway 0.034532
23 COG1712 Catalyzes the reversible NADPH-dependent reductive amination of L-2-amino-6-oxopimelate, the acyclic form of L-tetrahydrodipicolinate, to generate the meso compound, D,L-2,6-diaminopimelate 0.034218
24 COG2253 Psort location Cytoplasmic, score -0.034141
25 COG1243 radical SAM domain protein 0.034136
26 COG4469 Competence protein -0.033607
27 COG3959 Transketolase, thiamine diphosphate binding domain 0.033195
28 COG1985 Converts 2,5-diamino-6-(ribosylamino)-4(3H)-pyrimidinone 5'-phosphate into 5-amino-6-(ribosylamino)-2,4(1H,3H)-pyrimidinedione 5'-phosphate 0.033096
29 COG3587 Type III restriction enzyme, res subunit -0.033019
30 COG1506 peptidase -0.032311
31 338B2 -0.031612
32 COG0679 Auxin Efflux Carrier -0.031561
33 COG2610 gluconate transmembrane transporter activity 0.031373
34 COG0549 Belongs to the carbamate kinase family 0.031352
35 COG2872 Ser-tRNA(Ala) hydrolase activity -0.031300
36 COG0607 Catalyzes the ATP-dependent transfer of a sulfur to tRNA to produce 4-thiouridine in position 8 of tRNAs, which functions as a near-UV photosensor. Also catalyzes the transfer of sulfur to the sulfur carrier protein ThiS, forming ThiS-thiocarboxylate. This is a step in the synthesis of thiazole, in the thiamine biosynthesis pathway. The sulfur is donated as persulfide by IscS -0.031015
37 COG1227 inorganic diphosphatase activity -0.030845
38 COG0863 Belongs to the N(4)/N(6)-methyltransferase family 0.030577
39 COG3394 polysaccharide catabolic process 0.030096
40 332HC 0.029709
41 32U5Q MarR family 0.029596
42 COG0716 FMN binding -0.029450
43 COG1307 EDD domain protein, DegV family -0.029398
44 31SA0 0.029080
45 COG4134 Bacterial extracellular solute-binding protein 0.028575
46 COG2862 Uncharacterized protein family, UPF0114 0.028547
47 COG5434 Belongs to the glycosyl hydrolase 28 family 0.028543
48 COG5586 Nucleotidyl transferase AbiEii toxin, Type IV TA system -0.028543
49 COG4732 ThiW protein 0.028378
50 COG4478 integral membrane protein -0.028298
51 COG2337 Toxic component of a toxin-antitoxin (TA) module -0.028124
52 32UUW Protein of unknown function (DUF2554) 0.028095
53 COG1296 Branched-chain amino acid permease (Azaleucine resistance) -0.028076
54 COG3267 Type II secretory pathway component ExeA 0.027987
55 COG5293 efflux transmembrane transporter activity 0.027865
56 COG2738 Putative neutral zinc metallopeptidase 0.027677
57 COG2407 Converts the aldose L-fucose into the corresponding ketose L-fuculose 0.027618
58 COG0855 Catalyzes the reversible transfer of the terminal phosphate of ATP to form a long-chain polyphosphate (polyP) -0.027612
59 COG0404 The glycine cleavage system catalyzes the degradation of glycine -0.027581
60 COG2076 Multidrug Resistance protein -0.027563
61 COG1954 Regulates expression of the glpD operon. In the presence of glycerol 3-phosphate (G3P) causes antitermination of transcription of glpD at the inverted repeat of the leader region to enhance its transcription. Binds and stabilizes glpD leader mRNA 0.027485
62 COG2077 Thiol-specific peroxidase that catalyzes the reduction of hydrogen peroxide and organic hydroperoxides to water and alcohols, respectively. Plays a role in cell protection against oxidative stress by detoxifying peroxides 0.027479
63 COG4584 PFAM Integrase catalytic 0.027408
64 COG4804 nuclease activity 0.027383
65 COG0076 glutamate decarboxylase activity 0.027231
66 COG5011 Protein conserved in bacteria 0.027155
67 COG3464 Transposase -0.027140
68 COG0833 amino acid 0.027103
69 COG1775 2-hydroxyglutaryl-CoA dehydratase, D-component 0.026777
70 30BKD Bacterial regulatory proteins, lacI family 0.026683
71 COG3481 metal-dependent phosphohydrolase, HD sub domain -0.026601
72 2ZMYT Psort location Cytoplasmic, score 0.026507
73 COG0428 transporter -0.026475
74 COG1145 4Fe-4S ferredoxin, iron-sulfur binding domain protein 0.026471
75 COG3968 glutamine synthetase -0.026454
76 COG2423 ornithine cyclodeaminase activity -0.026370
77 COG1611 Belongs to the LOG family -0.026349
78 COG3738 Protein conserved in bacteria 0.026316
79 32ZJK Nitrous oxide-stimulated promoter 0.026190
80 COG4823 Abi-like protein -0.026178
81 2ZAA7 RteC protein 0.026049
82 COG1813 peptidyl-tyrosine sulfation -0.026026
83 2Z7PZ Psort location Cytoplasmic, score 0.025903
84 COG1297 OPT oligopeptide transporter protein 0.025898
85 COG2065 Also displays a weak uracil phosphoribosyltransferase activity which is not physiologically significant 0.025874
86 2ZBND Psort location CytoplasmicMembrane, score 0.025785
87 COG4607 ABC-type enterochelin transport system periplasmic component -0.025761
88 COG0579 oxidoreductase 0.025758
89 COG0619 Transmembrane (T) component of an energy-coupling factor (ECF) ABC-transporter complex. Unlike classic ABC transporters this ECF transporter provides the energy necessary to transport a number of different substrates -0.025747
90 3368N 0.025692
91 COG4806 L-rhamnose isomerase 0.025687
92 COG3955 Domain of unknown function (DUF1919) 0.025685
93 COG2158 Cysteine-rich small domain 0.025593
94 COG3853 Toxic anion resistance protein (TelA) -0.025546
95 COG4637 Psort location Cytoplasmic, score -0.025507
96 COG4915 5-bromo-4-chloroindolyl phosphate hydrolysis protein -0.025482
97 COG1113 amino acid transport -0.025387
98 2ZA26 Psort location CytoplasmicMembrane, score 10.00 0.025245
99 33VTB Sigma-70 region 2 0.025191
100 COG0476 Involved in molybdopterin and thiamine biosynthesis, family 2 -0.025095
101 COG3149 Involved in a type II secretion system (T2SS, formerly general secretion pathway, GSP) for the export of proteins 0.025077
102 COG1055 Involved in arsenical resistance. Thought to form the channel of an arsenite pump -0.025058
103 COG0675 Transposase, IS605 OrfB family -0.025015
104 COG2388 GCN5-related N-acetyl-transferase 0.024996
105 COG1032 radical SAM domain protein 0.024915
106 COG0298 carbon dioxide binding 0.024733
107 2ZFPG -0.024708
108 COG4653 Phage capsid family 0.024641
109 COG1607 acyl-CoA hydrolase -0.024575
110 COG1525 nuclease -0.024574
111 COG1305 Transglutaminase-like -0.024571
112 COG1236 Exonuclease of the beta-lactamase fold involved in RNA processing -0.024532
113 COG3740 Phage prohead protease, HK97 family 0.024496
114 COG0114 fumarate hydratase activity -0.024476
115 COG3393 -acetyltransferase -0.024404
116 33A5I Psort location CytoplasmicMembrane, score 0.024379
117 COG4200 ABC-2 family transporter protein -0.024296
118 COG4392 branched-chain amino acid -0.024276
119 COG4264 IucA/IucC family -0.024242
120 2ZE9X Clostripain family 0.024234
121 COG5617 Psort location CytoplasmicMembrane, score -0.024225
122 COG4477 Modulates the frequency and position of FtsZ ring formation. Inhibits FtsZ ring formation at polar sites. Interacts either with FtsZ or with one of its binding partners to promote depolymerization -0.024217
123 COG2358 TRAP transporter, solute receptor (TAXI family) -0.024211
124 COG0536 An essential GTPase which binds GTP, GDP and possibly (p)ppGpp with moderate affinity, with high nucleotide exchange rates and a fairly low GTP hydrolysis rate. Plays a role in control of the cell cycle, stress response, ribosome biogenesis and in those bacteria that undergo differentiation, in morphogenesis control -0.024198
125 30BYH Psort location Cytoplasmic, score 8.96 0.024125
126 COG2115 Belongs to the xylose isomerase family 0.024037
127 COG2339 peptidase activity -0.024022
128 30YS7 PD-(D/E)XK nuclease superfamily 0.024002
129 2Z7JT 2-keto-3-deoxygluconate:proton symporter activity 0.023983
130 COG0095 Lipoate-protein ligase -0.023939
131 COG2186 Transcriptional regulator 0.023861
132 COG0312 modulator of DNA gyrase 0.023852
133 2ZCFF TerB N-terminal domain 0.023823
134 COG2206 PFAM metal-dependent phosphohydrolase, HD sub domain 0.023808
135 COG3201 nicotinamide mononucleotide transporter -0.023756
136 COG1578 Protein of unknown function DUF89 -0.023652
137 COG1674 FtsK/SpoIIIE -0.023643
138 COG4099 Phospholipase/carboxylesterase 0.023641
139 COG1122 ATPase activity -0.023593
140 COG4475 Protein of unknown function (DUF436) 0.023572
141 COG1526 Required for formate dehydrogenase (FDH) activity. Acts as a sulfur carrier protein that transfers sulfur from IscS to the molybdenum cofactor prior to its insertion into FDH -0.023533
142 COG3288 NAD(P)+ transhydrogenase (AB-specific) activity -0.023496
143 COG1338 Plays a role in the flagellum-specific transport system 0.023424
144 COG1256 bacterial-type flagellum assembly 0.023387
145 COG1951 Catalyzes the reversible hydration of fumarate to (S)-malate 0.023386
146 COG0757 Catalyzes a trans-dehydration via an enolate intermediate -0.023366
147 COG5301 cellulose 1,4-beta-cellobiosidase activity 0.023335
148 2ZKMW 0.023315
149 COG0388 Catalyzes the ATP-dependent amidation of deamido-NAD to form NAD. Uses L-glutamine as a nitrogen source -0.023254
150 31J2F Fimbrial protein 0.023251
151 32SK6 positive regulation of bacterial-type flagellum assembly 0.023230
152 COG1502 Catalyzes the reversible phosphatidyl group transfer from one phosphatidylglycerol molecule to another to form cardiolipin (CL) (diphosphatidylglycerol) and glycerol -0.023220
153 COG4624 Iron only hydrogenase large subunit, C-terminal domain 0.023196
154 COG4660 Part of a membrane complex involved in electron transport 0.023194
155 COG4977 sequence-specific DNA binding -0.023171
156 COG0071 Belongs to the small heat shock protein (HSP20) family -0.023169
157 COG2160 L-arabinose isomerase activity 0.023138
158 COG1381 Involved in DNA repair and RecF pathway recombination -0.023135
159 COG4581 DEAD/DEAH box helicase 0.023113
160 COG3769 mannosylglycerate metabolic process 0.023091
161 COG1315 Flagellar Assembly Protein A 0.023027
162 2ZCC8 Protein of unknown function (DUF1097) 0.023002
163 COG1373 ATPase (AAA superfamily) -0.022930
164 COG1804 L-carnitine dehydratase/bile acid-inducible protein F -0.022927
165 COG4249 Peptidase C14 caspase catalytic subunit p20 0.022865
166 2ZC58 0.022852
167 33Q7U COG NOG25193 non supervised orthologous group 0.022813
168 COG1699 Binds to the C-terminal region of flagellin, which is implicated in polymerization, and participates in the assembly of the flagellum 0.022810
169 COG0509 glycine decarboxylation via glycine cleavage system -0.022786
170 2ZAJ2 0.022712
171 COG1638 TRAP-type C4-dicarboxylate transport system periplasmic component -0.022706
172 COG4268 DNA restriction-modification system 0.022670
173 COG4594 ABC-type Fe3+-citrate transport system, periplasmic component -0.022664
174 COG1070 Carbohydrate kinase 0.022633
175 COG4476 Belongs to the UPF0223 family -0.022614
176 COG0350 Involved in the cellular defense against the biological effects of O6-methylguanine (O6-MeG) and O4-methylthymine (O4-MeT) in DNA. Repairs the methylated nucleobase in DNA by stoichiometrically transferring the methyl group to a cysteine residue in the enzyme. This is a suicide reaction: the enzyme is irreversibly inactivated -0.022603
177 COG1410 methionine synthase 0.022600
178 COG2062 phosphohistidine phosphatase, SixA -0.022583
179 2Z7N4 PFAM KilA, N-terminal APSES-type HTH, DNA-binding -0.022575
180 2Z7JK Psort location Cytoplasmic, score 0.022519
181 COG0255 Belongs to the universal ribosomal protein uL29 family -0.022505
182 COG1594 Catalyzes the reduction of ribonucleotides to deoxyribonucleotides. May function to provide a pool of deoxyribonucleotide precursors for DNA repair during oxygen limitation and/or for immediate growth after restoration of oxygen 0.022469
183 32Y15 Psort location CytoplasmicMembrane, score 0.022457
184 COG3328 transposase activity -0.022426
185 COG4521 taurine ABC transporter 0.022397
186 COG4739 protein containing a ferredoxin domain 0.022396
187 315RV Domain of unknown function (DUF4373) 0.022348
188 COG2151 metal-sulfur cluster biosynthetic enzyme -0.022329
189 COG5567 small periplasmic lipoprotein 0.022317
190 COG2891 Involved in formation of the rod shape of the cell. May also contribute to regulation of formation of penicillin-binding proteins 0.022312
191 COG4859 Psort location Cytoplasmic, score 0.022288
192 COG2855 membrane 0.022280
193 COG3039 Transposase 0.022270
194 COG1970 mechanosensitive ion channel activity -0.022247
195 COG1387 Histidinol phosphatase and related hydrolases of the PHP family 0.022246
196 COG1610 YqeY-like protein 0.022185
197 COG4990 cell redox homeostasis -0.022180
198 COG1300 Membrane 0.022165
199 COG1201 RNA secondary structure unwinding 0.022156
200 COG2120 A mycothiol (MSH, N-acetylcysteinyl-glucosaminyl-inositol) S-conjugate amidase, it recycles conjugated MSH to the N-acetyl cysteine conjugate (AcCys S-conjugate, a mercapturic acid) and the MSH precursor. Involved in MSH-dependent detoxification of a number of alkylating agents and antibiotics 0.022101
201 COG1618 nucleotide phosphatase activity, acting on free nucleotides -0.022091
202 COG1376 ErfK/YbiS/YcfS/YnhG family protein -0.022077
203 COG3682 Transcriptional regulator -0.021954
204 COG1781 'de novo' pyrimidine nucleobase biosynthetic process -0.021938
205 COG0673 inositol 2-dehydrogenase activity -0.021935
206 30S7H 0.021935
207 2Z7TX DNA replication 0.021892
208 COG3605 phosphoenolpyruvate-protein phosphotransferase activity 0.021870
209 COG2421 Acetamidase/Formamidase family -0.021850
210 COG1198 Involved in the restart of stalled replication forks. Recognizes and binds the arrested nascent DNA chain at stalled replication forks. It can open the DNA duplex, via its helicase activity, and promote assembly of the primosome and loading of the major replicative helicase DnaB onto DNA -0.021827
211 COG0324 Catalyzes the transfer of a dimethylallyl group onto the adenine at position 37 in tRNAs that read codons beginning with uridine, leading to the formation of N6-(dimethylallyl)adenosine (i(6)A) -0.021816
212 31FRY -0.021810
213 COG4766 ethanolamine catabolic process 0.021781
214 COG1585 Membrane protein implicated in regulation of membrane protease activity -0.021732
215 COG4976 Methyltransferase 0.021650
216 33HBZ -0.021644
217 COG1835 transferase activity, transferring acyl groups other than amino-acyl groups -0.021634
218 COG0210 DNA helicase -0.021625
219 33MZS COG NOG38865 non supervised orthologous group 0.021551
220 COG1687 branched-chain amino acid -0.021457
221 COG1267 Lipid phosphatase which dephosphorylates phosphatidylglycerophosphate (PGP) to phosphatidylglycerol (PG) -0.021438
222 COG4799 Acetyl-CoA carboxylase, carboxyltransferase component subunits alpha and beta -0.021360
223 COG0685 methylenetetrahydrofolate reductase (NAD(P)H) activity 0.021326
224 COG0475 glutathione-regulated potassium exporter activity -0.021295
225 COG4659 FMN binding 0.021286
226 COG2351 hydroxyisourate hydrolase activity 0.021263
227 COG4468 UDPglucose--hexose-1-phosphate uridylyltransferase -0.021254
228 32Q1X Protein of unknown function (DUF3106) 0.021189
229 COG3855 fructose 1,6-bisphosphate 1-phosphatase activity 0.021171
230 COG0393 Putative heavy-metal-binding 0.021164
231 COG2502 aspartate--ammonia ligase -0.021150
232 COG4096 Type I site-specific restriction-modification system, R (Restriction) subunit and related -0.021024
233 COG2113 Glycine betaine 0.021016
234 COG5280 Phage tail tape measure protein TP901 0.021014
235 31G17 0.020907
236 33GEW 0.020875
237 2Z9JQ Part of the ecpRABCDE operon, which encodes the E. coli common pilus (ECP). ECP is found in both commensal and pathogenic strains and plays a dual role in early-stage biofilm development and host cell recognition. Major subunit of the fimbria 0.020874
238 COG0412 Dienelactone hydrolase -0.020817
239 COG3486 L-lysine 6-monooxygenase (NADPH) activity -0.020788
240 COG3867 Arabinogalactan endo-beta-1,4-galactanase 0.020746
241 COG1914 H(+)-stimulated, divalent metal cation uptake system 0.020726
242 COG1015 Phosphotransfer between the C1 and C5 carbon atoms of pentose 0.020718
243 2Z95P Psort location CytoplasmicMembrane, score 10.00 0.020713
244 COG0414 Catalyzes the condensation of pantoate with beta-alanine in an ATP-dependent reaction via a pantoyl-adenylate intermediate 0.020687
245 COG3837 Cupin domain -0.020668
246 33GDE This gene contains a nucleotide ambiguity which may be the result of a sequencing error 0.020639
247 33BI0 Prokaryotic dksA/traR C4-type zinc finger 0.020611
248 2ZVF6 Psort location Cytoplasmic, score -0.020592
249 33CUC Cro/C1-type HTH DNA-binding domain 0.020591
250 COG4570 Endonuclease that resolves Holliday junction intermediates made during homologous genetic recombination and DNA repair. Exhibits sequence and structure-selective cleavage of four-way DNA junctions, where it introduces symmetrical nicks in two strands of the same polarity at the 5' side of dinucleotides. Corrects the defects in genetic recombination and DNA repair associated with inactivation of ruvAB or ruvC 0.020574
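
Because the table rows are plain whitespace-separated text, they can be split back into their four columns with a short script. The following is a minimal sketch rather than an official PhenDB tool: `parse_row` is a hypothetical helper, and it assumes that the first token of a row is the rank, the second is the ENOG name, the last is the weight, and everything in between is the description (which may be empty).

```python
from typing import Tuple


def parse_row(line: str) -> Tuple[int, str, str, float]:
    """Split one table row into (rank, ENOG name, description, weight)."""
    parts = line.split()
    if len(parts) < 3:
        raise ValueError(f"expected at least rank, name and weight: {line!r}")
    rank = int(parts[0])          # first token: rank in model
    enog = parts[1]               # second token: ENOG name
    weight = float(parts[-1])     # last token: weight in model
    # Everything in between is the description; runs of whitespace collapse
    # to single spaces, and the result is "" when no description is given.
    description = " ".join(parts[2:-1])
    return rank, enog, description, weight


rows = [
    "2 COG4677 pectinesterase activity 0.041532",
    "9 30J2J 0.036477",
    "98 2ZA26 Psort location CytoplasmicMembrane, score 10.00 0.025245",
]
for row in rows:
    print(parse_row(row))
```

Taking the weight from the end of the row keeps descriptions that themselves end in a number, such as "score 10.00", intact.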