Model Internals

Each PhenDB model is trained on the presence or absence of bacterial ENOGs (orthologous groups from EggNOG 4.5) in the training genomes. Each ENOG is assigned a weight whose magnitude reflects the importance of that ENOG for the final prediction. The sign of the weight indicates whether the presence (positive weight) or the absence (negative weight) of the ENOG is indicative of the trait.
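To make the role of the weights concrete, the following minimal sketch sums the weights of the ENOGs found in a genome. It is an illustration only: the function name `linear_score` and the example genome are hypothetical, and the actual PhenDB classifier may apply an intercept, feature scaling, or a decision threshold that this page does not describe.

```python
from typing import AbstractSet, Mapping


def linear_score(weights: Mapping[str, float], present: AbstractSet[str]) -> float:
    """Sum the weights of all ENOGs that are present in a genome.

    Assumption: presence/absence is encoded as 1/0, so absent ENOGs
    contribute nothing. The real model may differ.
    """
    return sum(w for enog, w in weights.items() if enog in present)


# Three weights copied from the table on this page.
weights = {
    "COG3520": 0.093477,   # positive: presence points towards the trait
    "COG3455": 0.091595,   # positive
    "COG2055": -0.027557,  # negative: presence points away from the trait
}

# Hypothetical genome that carries two of the three ENOGs.
genome = {"COG3520", "COG2055"}

score = linear_score(weights, genome)
print(f"score = {score:.6f}")
```

In this toy case the positive weight of COG3520 outweighs the negative weight of COG2055, so the summed score is positive.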

This table lists the 250 highest-ranking ENOGs of this model, ordered by the absolute value of their weight.

Columns: rank in model, ENOG name, ENOG description, weight in model
1 COG3520 type VI secretion protein 0.093477
2 COG3455 Type IV VI secretion system protein, DotU family 0.091595
3 COG3517 type VI secretion protein 0.090988
4 COG3523 ImpA, N-terminal, type VI secretion system 0.090538
5 COG3516 type VI secretion protein 0.087710
6 COG3522 type VI secretion protein 0.084804
7 COG3518 anti-sigma factor antagonist activity 0.079786
8 COG3515 Protein conserved in bacteria 0.074923
9 COG3519 Type VI secretion system, TssF 0.073947
10 COG3157 Type VI secretion system effector, Hcp 0.072735
11 COG3521 Type VI secretion 0.059650
12 COG3501 Rhs element vgr protein 0.057794
13 COG3913 Uncharacterized protein conserved in bacteria (DUF2094) 0.054289
14 COG4104 PAAR repeat-containing protein 0.050529
15 COG3456 conserved protein contains FHA domain 0.036269
16 COG4455 Protein of avirulence locus involved in temperature-dependent protein secretion 0.033480
17 COG4249 Peptidase C14 caspase catalytic subunit p20 0.031837
18 COG2268 Band 7 protein 0.030176
19 COG2603 Catalyzes the transfer of selenium from selenophosphate for conversion of 2-thiouridine to 2-selenouridine at the wobble position in tRNA 0.029360
20 COG2216 Part of the high-affinity ATP-driven potassium transport (or Kdp) system, which catalyzes the hydrolysis of ATP coupled with the electrogenic transport of potassium into the cytoplasm. This subunit is responsible for energy coupling to the transport system 0.029017
21 COG4968 Prepilin-type N-terminal cleavage methylation domain 0.028732
22 COG3299 homolog of phage Mu protein gp47 0.027918
23 COG2055 Belongs to the LDH2 MDH2 oxidoreductase family -0.027557
24 COG3034 peptidoglycan biosynthetic process 0.027278
25 COG1113 amino acid transport 0.026977
26 COG2156 Part of the high-affinity ATP-driven potassium transport (or Kdp) system, which catalyzes the hydrolysis of ATP coupled with the electrogenic transport of potassium into the cytoplasm. This subunit acts as a catalytic chaperone that increases the ATP-binding affinity of the ATP-hydrolyzing subunit KdpB by the formation of a transient KdpB/KdpC/ATP ternary complex 0.026674
27 COG1262 PFAM Formylglycine-generating sulfatase enzyme -0.026461
28 COG0300 oxidoreductase activity 0.026261
29 COG4799 Acetyl-CoA carboxylase, carboxyltransferase component subunits alpha and beta -0.026206
30 COG4695 Portal protein -0.025614
31 COG1213 nucleotidyl transferase 0.025449
32 COG2045 Belongs to the ComB family -0.024774
33 COG0392 lysyltransferase activity -0.024610
34 COG5435 protein kinase activity 0.024394
35 COG1285 pathogenesis 0.023963
36 COG4461 4-amino-4-deoxy-alpha-L-arabinopyranosyl undecaprenyl phosphate biosynthetic process 0.023194
37 COG3448 diguanylate cyclase activity 0.022822
38 COG0591 Belongs to the sodium solute symporter (SSF) (TC 2.A.21) family 0.022806
39 COG1027 Aspartate ammonia-lyase 0.022750
40 COG4579 [isocitrate dehydrogenase (NADP+)] phosphatase activity -0.022581
41 COG3072 Adenylate cyclase 0.022444
42 COG4555 ABC transporter -0.022329
43 COG0538 isocitrate dehydrogenase activity -0.022170
44 COG1135 Part of the ABC transporter complex MetNIQ involved in methionine import. Responsible for energy coupling to the transport system 0.022118
45 COG4970 Tfp pilus assembly protein FimT 0.021813
46 COG0672 )-iron permease -0.021679
47 COG2838 Isocitrate dehydrogenase 0.021580
48 COG3600 Phage-associated protein -0.021523
49 COG5615 integral membrane protein -0.021476
50 COG4252 Chase2 domain -0.021242
51 COG4607 ABC-type enterochelin transport system periplasmic component 0.021132
52 COG2011 ABC-type metal ion transport system permease component 0.021003
53 COG3831 ATPase activity 0.020952
54 COG1607 acyl-CoA hydrolase -0.020935
55 COG4307 Protein conserved in bacteria 0.020654
56 COG0724 RNA recognition motif 0.020642
57 COG3619 membrane 0.020612
58 COG4857 Catalyzes the phosphorylation of methylthioribose into methylthioribose-1-phosphate -0.020608
59 COG2461 Hemerythrin HHE cation binding domain protein -0.020545
60 COG2831 hemolysin activation secretion protein 0.020491
61 COG0388 Catalyzes the ATP-dependent amidation of deamido-NAD to form NAD. Uses L-glutamine as a nitrogen source 0.020439
62 COG2801 Transposase and inactivated derivatives 0.020425
63 COG1232 Catalyzes the 6-electron oxidation of protoporphyrinogen-IX to form protoporphyrin-IX -0.020309
64 COG3011 Protein conserved in bacteria 0.020242
65 COG3915 nitrate assimilation 0.020212
66 COG4253 self proteolysis 0.020144
67 COG2895 sulfate adenylyltransferase (ATP) activity 0.019944
68 COG0715 ABC-type nitrate sulfonate bicarbonate transport systems periplasmic components -0.019933
69 COG4924 0.019863
70 COG1636 Catalyzes the conversion of epoxyqueuosine (oQ) to queuosine (Q), which is a hypermodified base found in the wobble positions of tRNA(Asp), tRNA(Asn), tRNA(His) and tRNA(Tyr) -0.019670
71 32H4Q Bacterial regulatory proteins, tetR family 0.019490
72 COG2104 thiamine diphosphate biosynthetic process 0.019453
73 COG0428 transporter 0.019400
74 COG1434 Gram-negative-bacterium-type cell wall biogenesis -0.019356
75 COG3033 Beta-eliminating lyase -0.019303
76 32AXC Prokaryotic dksA/traR C4-type zinc finger -0.019205
77 COG4675 tail collar domain protein 0.019203
78 COG0402 S-adenosylhomocysteine deaminase activity 0.019159
79 COG3172 ATPase kinase involved in NAD metabolism -0.019139
80 COG3154 lipid carrier protein 0.018930
81 COG4269 membrane -0.018865
82 COG0627 Serine hydrolase involved in the detoxification of formaldehyde 0.018860
83 COG1416 DNA-binding transcription factor activity -0.018846
84 COG5654 RES domain protein -0.018845
85 COG0601 transmembrane transport 0.018762
86 COG0618 3'(2'),5'-bisphosphate nucleotidase activity -0.018757
87 COG2862 Uncharacterized protein family, UPF0114 0.018633
88 COG4692 amylo-alpha-1,6-glucosidase activity 0.018579
89 COG2978 secondary active p-aminobenzoyl-glutamate transmembrane transporter activity -0.018573
90 COG4254 PFAM FecR protein -0.018545
91 COG1635 Involved in the biosynthesis of the thiazole moiety of thiamine. Catalyzes the conversion of NAD and glycine to adenosine diphosphate 5-(2-hydroxyethyl)-4-methylthiazole-2-carboxylate (ADT), an adenylated thiazole intermediate, using free sulfide as a source of sulfur -0.018524
92 COG1984 allophanate hydrolase subunit 2 0.018521
93 COG0553 Transcription regulator that activates transcription by stimulating RNA polymerase (RNAP) recycling in case of stress conditions such as supercoiled DNA or high salt concentrations. Probably acts by releasing the RNAP, when it is trapped or immobilized on tightly supercoiled DNA. Does not activate transcription on linear DNA. Probably not involved in DNA repair -0.018506
94 COG1423 DNA ligase 0.018504
95 COG3004 Na(+)/H(+) antiporter that extrudes sodium in exchange for external protons 0.018457
96 COG3192 ethanolamine utilization protein 0.018444
97 32Y12 ParD-like antitoxin of type II bacterial toxin-antitoxin system -0.018366
98 COG0826 peptidase U32 0.018324
99 COG3485 protocatechuate 3,4-dioxygenase 0.018229
100 COG3469 chitinase activity 0.018181
101 2ZNS7 0.018175
102 COG1419 protein localization to endoplasmic reticulum 0.018102
103 COG2020 methyltransferase activity -0.018080
104 COG0683 leucine binding -0.018068
105 COG2026 Addiction module toxin, RelE StbE family -0.018062
106 COG2704 Responsible for the transport of C4-dicarboxylates from the periplasm across the inner membrane 0.017995
107 COG2210 Belongs to the sulfur carrier protein TusA family 0.017990
108 COG1765 OsmC-like protein -0.017898
109 COG3969 phosphoadenosine phosphosulfate 0.017892
110 COG0530 calcium, potassium:sodium antiporter activity -0.017865
111 32TJX 0.017850
112 COG2610 gluconate transmembrane transporter activity 0.017825
113 COG0154 amidase activity -0.017735
114 COG1352 Methylation of the membrane-bound methyl-accepting chemotaxis proteins (MCP) to form gamma-glutamyl methyl ester residues in MCP 0.017723
115 COG0327 Belongs to the GTP cyclohydrolase I type 2 NIF3 family -0.017703
116 COG3797 Protein conserved in bacteria 0.017689
117 COG4567 Response regulator consisting of a CheY-like receiver domain and a Fis-type HTH domain 0.017597
118 COG4978 Transcriptional regulator 0.017589
119 COG3760 aminoacyl-tRNA metabolism involved in translational fidelity -0.017588
120 COG2110 phosphatase homologous to the C-terminal domain of histone macroH2A1 0.017561
121 COG2060 Part of the high-affinity ATP-driven potassium transport (or Kdp) system, which catalyzes the hydrolysis of ATP coupled with the electrogenic transport of potassium into the cytoplasm. This subunit binds and transports the potassium across the cytoplasmic membrane 0.017559
122 COG2356 endonuclease I -0.017540
123 COG2040 homocysteine 0.017531
124 COG1816 Catalyzes the hydrolytic deamination of adenine to hypoxanthine. Plays an important role in the purine salvage pathway and in nitrogen catabolism 0.017525
125 COG3476 Tryptophan-rich sensory protein (mitochondrial benzodiazepine receptor homolog) 0.017500
126 COG2323 membrane 0.017491
127 COG2854 intermembrane phospholipid transfer 0.017448
128 COG4658 electron transport chain -0.017427
129 COG0308 aminopeptidase N 0.017385
130 COG1989 Cleaves type-4 fimbrial leader sequence and methylates the N-terminal (generally Phe) residue 0.017336
131 COG1173 ABC-type dipeptide oligopeptide nickel transport systems, permease components 0.017319
132 COG3073 An anti-sigma factor for extracytoplasmic function (ECF) sigma factor sigma-E (RpoE). ECF sigma factors are held in an inactive form by an anti-sigma factor until released by regulated intramembrane proteolysis (RIP). RIP occurs when an extracytoplasmic signal triggers a concerted proteolytic cascade to transmit information and elicit cellular responses. The membrane-spanning regulatory substrate protein is first cut periplasmically (site-1 protease, S1P, DegS), then within the membrane itself (site-2 protease, S2P, RseP), while cytoplasmic proteases finish degrading the anti-sigma factor, liberating sigma-E 0.017265
133 COG1773 rubredoxin 0.017111
134 COG0004 ammonium transporter -0.017091
135 COG2826 transposase and inactivated derivatives, IS30 family 0.017072
136 2Z9D2 Psort location CytoplasmicMembrane, score -0.017051
137 COG1459 type II secretion system 0.017049
138 COG3534 alpha-L-arabinofuranosidase 0.017016
139 COG2962 RarD protein 0.016985
140 COG2912 Transglutaminase-like superfamily -0.016981
141 COG3178 Phosphotransferase -0.016968
142 COG0252 asparaginase activity 0.016947
143 33968 0.016918
144 33VRK GDSL-like Lipase/Acylhydrolase family -0.016873
145 COG3572 ergothioneine biosynthetic process 0.016867
146 COG1540 5-oxoprolinase (ATP-hydrolyzing) activity 0.016849
147 COG1823 symporter activity 0.016836
148 COG1030 water channel activity 0.016827
149 COG1021 2,3-dihydroxybenzoate-AMP ligase 0.016801
150 COG1464 Belongs to the NlpA lipoprotein family 0.016780
151 COG2115 Belongs to the xylose isomerase family 0.016747
152 COG3465 dGTPase activity 0.016733
153 COG1457 Belongs to the purine-cytosine permease (2.A.39) family 0.016721
154 COG3093 addiction module antidote protein HigA -0.016699
155 COG3876 Protein conserved in bacteria -0.016634
156 34AQQ 0.016605
157 COG2808 Putative FMN-binding domain 0.016581
158 COG3284 Transcriptional regulator 0.016578
159 COG1929 Belongs to the glycerate kinase type-1 family 0.016527
160 COG0520 Catalyzes the removal of elemental sulfur and selenium atoms from L-cysteine, L-cystine, L-selenocysteine, and L-selenocystine to produce L-alanine 0.016523
161 COG2135 peptidase activity 0.016513
162 349CY COG1670 acetyltransferases, including N-acetylases of ribosomal proteins -0.016461
163 33JYX 0.016415
164 COG2804 Type II secretory pathway, ATPase PulE Tfp pilus assembly pathway, ATPase PilB 0.016407
165 33BQB cellular response to cell envelope stress 0.016402
166 32YM4 -0.016371
167 32SEB 0.016341
168 COG4133 protoheme IX ABC transporter activity -0.016273
169 COG4742 methyltransferase activity 0.016269
170 COG2853 Lipoprotein 0.016234
171 COG3975 serine-type endopeptidase activity -0.016204
172 COG0709 Synthesizes selenophosphate from selenide and ATP 0.016119
173 COG1378 Sugar-specific transcriptional regulator TrmB 0.016070
174 COG0347 Belongs to the P(II) protein family -0.016024
175 30JKP 0.015969
176 COG0797 Lytic transglycosylase with a strong preference for naked glycan strands that lack stem peptides -0.015963
177 COG1518 maintenance of DNA repeat elements 0.015872
178 COG3540 Alkaline phosphatase -0.015856
179 COG1473 amidohydrolase 0.015799
180 COG3919 ATP-grasp 0.015766
181 COG5361 Conserved Protein 0.015758
182 COG0555 PFAM binding-protein-dependent transport systems inner membrane component 0.015754
183 COG4392 branched-chain amino acid 0.015666
184 COG1592 Rubrerythrin -0.015650
185 COG3710 Transcriptional regulator 0.015585
186 COG1708 protein C-terminal S-isoprenylcysteine carboxyl O-methyltransferase activity 0.015579
187 32RR9 Type VI secretion-associated protein, VC_A0118 family 0.015577
188 COG1796 DNA polymerase -0.015575
189 COG3668 Plasmid stabilization system 0.015571
190 COG4653 Phage capsid family -0.015565
191 COG1961 Site-specific recombinases, DNA invertase Pin homologs 0.015549
192 COG0213 The enzymes which catalyze the reversible phosphorolysis of pyrimidine nucleosides are involved in the degradation of these compounds and in their utilization as carbon and energy sources, or in the rescue of pyrimidine bases for nucleotide synthesis 0.015531
193 COG4459 periplasmic nitrate reductase 0.015523
194 COG1166 arginine decarboxylase activity 0.015496
195 COG0529 Catalyzes the synthesis of activated sulfate 0.015484
196 COG4390 Protein conserved in bacteria -0.015474
197 2ZUKB PAS domain 0.015464
198 32ZJ3 -0.015460
199 COG3049 Linear amide C-N hydrolases, choloylglycine hydrolase family 0.015451
200 COG3039 Transposase 0.015442
201 COG1237 beta-lactamase domain protein 0.015407
202 COG0027 Involved in the de novo purine biosynthesis. Catalyzes the transfer of formate to 5-phospho-ribosyl-glycinamide (GAR), producing 5-phospho-ribosyl-N-formylglycinamide (FGAR). Formate is provided by PurU via hydrolysis of 10-formyl-tetrahydrofolate 0.015405
203 COG3310 Protein conserved in bacteria 0.015385
204 COG2022 thiamine diphosphate biosynthetic process 0.015372
205 COG4421 Capsular polysaccharide biosynthesis protein -0.015367
206 COG0634 Belongs to the purine pyrimidine phosphoribosyltransferase family 0.015361
207 COG3751 2OG-Fe(II) oxygenase superfamily -0.015345
208 COG0689 Phosphorolytic exoribonuclease that removes nucleotide residues following the -CCA terminus of tRNA and adds nucleotides to the ends of RNA molecules by using nucleoside diphosphates as substrates 0.015339
209 COG4605 iron ion homeostasis 0.015324
210 COG5018 Exonuclease 0.015311
211 COG1288 antiporter activity -0.015267
212 COG0369 Component of the sulfite reductase complex that catalyzes the 6-electron reduction of sulfite to sulfide. This is one of several activities required for the biosynthesis of L-cysteine from sulfate. The flavoprotein component catalyzes the electron flow from NADPH -> FAD -> FMN to the hemoprotein component 0.015260
213 32DP5 -0.015257
214 COG3916 Acyl-homoserine-lactone synthase 0.015243
215 COG0157 Belongs to the NadC ModD family 0.015243
216 COG3513 CRISPR (clustered regularly interspaced short palindromic repeat) is an adaptive immune system that provides protection against mobile genetic elements (viruses, transposable elements and conjugative plasmids). CRISPR clusters contain spacers, sequences complementary to antecedent mobile elements, and target invading nucleic acids. CRISPR clusters are transcribed and processed into CRISPR RNA (crRNA). In type II CRISPR systems correct processing of pre-crRNA requires a trans-encoded small RNA (tracrRNA), endogenous ribonuclease 3 (rnc) and this protein. The tracrRNA serves as a guide for ribonuclease 3-aided processing of pre-crRNA. Subsequently Cas9 crRNA tracrRNA endonucleolytically cleaves linear or circular dsDNA target complementary to the spacer -0.015224
217 32THI PFAM SapC family protein -0.015210
218 30TX1 -0.015210
219 33492 0.015197
220 COG4706 dehydratase 0.015196
221 COG4258 3-demethylubiquinone-9 3-O-methyltransferase activity 0.015150
222 COG3549 Plasmid maintenance system killer -0.015138
223 32YCB Plasmid maintenance protein CcdB 0.015123
224 COG0371 Dehydrogenase -0.015120
225 COG3055 Converts alpha-N-acetylneuraminic acid (Neu5Ac) to the beta-anomer, accelerating the equilibrium between the alpha- and beta-anomers. Probably facilitates sialidase-negative bacteria to compete successfully for limited amounts of extracellular Neu5Ac, which is likely taken up in the beta-anomer. In addition, the rapid removal of sialic acid from solution might be advantageous to the bacterium to damp down host responses 0.015114
226 32XFJ 0.015100
227 COG3019 metal-binding protein 0.015073
228 COG4941 Belongs to the sigma-70 factor family -0.015059
229 COG3166 PFAM Fimbrial assembly family protein 0.015014
230 COG3271 cysteine-type peptidase activity 0.015001
231 COG0235 Class II aldolase 0.014906
232 COG3506 Protein of unknown function (DUF1349) 0.014863
233 COG2821 Murein-degrading enzyme. May play a role in recycling of muropeptides during cell elongation and/or cell division -0.014851
234 COG0739 heme binding -0.014849
235 COG5505 Protein of unknown function (DUF819) -0.014843
236 COG5270 sulfate reduction -0.014815
237 COG2240 Pyridoxal kinase involved in the salvage pathway of pyridoxal 5'-phosphate (PLP). Catalyzes the phosphorylation of pyridoxal to PLP 0.014812
238 COG1574 metal-dependent hydrolase with the TIM-barrel fold -0.014798
239 33NXX Transposase 0.014783
240 COG4763 Acyl-transferase -0.014777
241 COG4425 Alpha/beta-hydrolase family 0.014767
242 COG4566 intracellular signal transduction -0.014749
243 COG2375 cellular response to nickel ion 0.014739
244 COG3555 Aspartyl Asparaginyl beta-hydroxylase 0.014724
245 COG1522 sequence-specific DNA binding -0.014688
246 COG3395 kinase activity 0.014684
247 COG4737 Cytotoxic translational repressor of toxin-antitoxin stability system -0.014671
248 COG5330 Evidence 4 Homologs of previously reported genes of -0.014663
249 COG1754 DNA topoisomerase type I activity -0.014644
250 COG2332 Heme chaperone required for the biogenesis of c-type cytochromes. Transiently binds heme delivered by CcmC and transfers the heme to apo-cytochromes in a process facilitated by CcmF and CcmH -0.014624
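
The rows above are whitespace-separated, and the description column may contain spaces or be empty. The sketch below shows one way to split such a row into its four columns. It assumes that the first token is the rank, the second is the ENOG name, and the last is the weight; the names `EnogRow` and `parse_row` are hypothetical and not part of PhenDB.

```python
from typing import NamedTuple


class EnogRow(NamedTuple):
    rank: int
    enog: str
    description: str  # empty string when the table gives no description
    weight: float


def parse_row(line: str) -> EnogRow:
    """Split one table row into rank, ENOG name, description and weight."""
    fields = line.split()
    if len(fields) < 3:
        raise ValueError(f"expected at least 3 fields, got: {line!r}")
    # First two tokens and the last token are fixed columns;
    # everything in between is the free-text description.
    rank, enog, *description, weight = fields
    return EnogRow(int(rank), enog, " ".join(description), float(weight))


rows = [
    parse_row("1 COG3520 type VI secretion protein 0.093477"),
    parse_row("23 COG2055 Belongs to the LDH2 MDH2 oxidoreductase family -0.027557"),
    parse_row("69 COG4924 0.019863"),  # row without a description
]

# ENOGs whose presence argues against the trait (negative weight).
negative = [row.enog for row in rows if row.weight < 0]
print(negative)
```

Taking the weight from the end of the row rather than from a fixed position keeps the parser robust to descriptions that themselves contain numbers.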