In silico discovery of druggable targets in Citrobacter koseri using echinoderm metabolites and molecular dynamics simulation | Scientific Reports

Nov 05, 2024

Scientific Reports volume 14, Article number: 26776 (2024)

Citrobacter koseri causes infections in immunocompromised people. Without effective antibiotics, these infections can become severe and life-threatening, so effective drugs are essential. Using a subtractive genomics approach, 2699 ORFs were predicted and translated into amino acid sequences. Metabolic pathway analysis and subcellular localization helped define the roles of key bacterial proteins. Two druggable proteins, WP_012000829.1 and WP_275157394.1, were identified as promising targets. AlphaFold provided their 3D structures, and a library of 1600 echinoderm metabolites was docked against these proteins, with Ampicillin, Levofloxacin, and Doxycycline as controls. Notably, CMNPD13085 and CMNPD15632 exhibited the highest binding affinities for WP_012000829.1 and WP_275157394.1, respectively. Molecular dynamics simulations and MM-GBSA binding free energy calculations complemented the docking results. Because these findings rest on computational validation alone, the study emphasizes the need for in vitro research to develop these potential inhibitors into therapeutic drugs.

Citrobacter koseri, formerly known as Citrobacter diversus, is a bacterium of the genus Citrobacter. It is a rod-shaped, Gram-negative, facultatively anaerobic bacillus. It is a typical inhabitant of the human digestive tract and may also be found in a variety of foods, soil, and water. Although categorized as an opportunistic pathogen, C. koseri can cause devastating infections, especially in immunocompromised and neonatal patients1,2. The bacterium can spread from person to person, particularly in hospital environments, through contaminated food or drink, the fecal-oral route, or interpersonal contact3,4.

In the past, C. koseri was commonly isolated from elderly patients with severe underlying medical disorders. The urinary tract is the most common site of infection5,6. C. koseri infections induce a wide range of symptoms depending on the site of infection. In the urinary system, infections can cause painful urination (dysuria), frequent urination, urgency, blood in the urine (hematuria), and lower abdominal discomfort7. The bacterium is especially harmful in newborns, producing meningitis and brain abscesses, which can be accompanied by fever, irritability, poor feeding, convulsions, and fatigue. These infections can result in serious brain damage8. C. koseri can also cause septicemia, which is characterized by fever, chills, hypotension, and signs of systemic infection and primarily affects immunocompromised people9. Although less prevalent, respiratory infections such as pneumonia can also occur, especially in hospitalized patients9.

Treatment for C. koseri infections is determined by the intensity and site of the infection and the bacteria’s resistance profile. Antibiotic treatment must be adapted to the susceptibility results to guarantee effectiveness. Supportive therapy is critical, especially for serious diseases such as newborn meningitis or septicemia, which necessitate hospitalization and extensive monitoring10,11.

Prevention frequently includes appropriate hygiene standards, particularly in healthcare settings where the bacterium can be a source of hospital-acquired diseases. This involves handwashing on a regular basis, adequate sanitation of medical equipment, and infection control procedures12,13. Infection prevention also heavily depends on ensuring access to clean water and food supplies.

In most cases, antibiotics are used to treat C. koseri infections. Antibiotic selection should be guided by antimicrobial susceptibility testing so that the chosen drug is effective against the specific strain causing the infection. Antibiotic resistance influences infection incidence and pathogen dissemination, and it increases treatment costs and death rates. C. koseri has recently developed resistance to a variety of antibiotics, including third-generation cephalosporins, beta-lactams, aminoglycosides, and quinolones14,15. Hence, having effective drugs to treat C. koseri infections is crucial for patient care, public health, and the ongoing fight against antibiotic resistance.

The need for novel drugs or treatment alternatives against C. koseri stems from the growing problems posed by antibiotic resistance and the severity of the infections it causes. Developing efficient drugs is critical for safeguarding vulnerable groups, such as infants and immunocompromised adults, who are more likely to suffer severe illness and complications. Novel therapeutic drugs may also help reduce dependence on current antibiotics, slowing the spread of resistance. Drugs that prevent or cure infections more effectively will not only improve patient outcomes but also lessen the pressure on healthcare systems, which underlines the need to expand treatment choices against C. koseri16.

The current study focuses primarily on discovering potential targets in C. koseri using a subtractive proteomics approach, as well as selecting inhibitors for identified novel proteins to avoid cross-reactivity with the human proteome. The goal was to identify therapeutic targets that are exclusively found in C. koseri, are essential, are cytoplasmic, have unique pathways, and are also absent in humans and human gut microbiota.

The genome of Citrobacter koseri NCTC 11075 was downloaded from NCBI (https://www.ncbi.nlm.nih.gov/)17.

ORF prediction identifies candidate genes within a genome. Predicted ORFs can be translated into protein sequences, and putative functions can then be assigned to these proteins through homology searches against known protein databases. ORF prediction also provides insight into the organization of the genome, including the location of coding sequences and regulatory regions. The Glimmer gene prediction tool (https://ccb.jhu.edu/software/glimmer/index.shtml) relies on a statistical model called the Interpolated Context Model (ICM). The ICM was originally used in Glimmer2 and was carried over into Glimmer3 to increase the precision of gene prediction, as described by Delcher et al. (2007)18. ORFs were therefore predicted from the Citrobacter koseri genome using the ICM in Glimmer3. Start codons were specified as the comma-separated list “atg, gtg, ttg”, and stop codons were taken from the standard GenBank translation table entry.
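To make the start and stop codon settings concrete, the following is a minimal sketch of a naive forward-strand ORF scan. It is not Glimmer3 and does not use an Interpolated Context Model; it only illustrates how the start codons atg, gtg, and ttg and a set of stop codons delimit candidate ORFs. The stop codons of GenBank translation table 11, the `min_len` parameter, and the function name `find_orfs` are assumptions made for illustration.

```python
# Naive ORF scan (illustration only; Glimmer3 scores candidates with an ICM).
START_CODONS = {"ATG", "GTG", "TTG"}   # start codons named in the text
STOP_CODONS = {"TAA", "TAG", "TGA"}    # assumed: stops of GenBank translation table 11


def find_orfs(seq, min_len=90):
    """Return (start, end, frame) tuples for forward-strand ORFs.

    `end` is exclusive and includes the stop codon. `min_len` is a
    hypothetical minimum ORF length in nucleotides.
    """
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon in START_CODONS:
                start = i                      # open a candidate ORF
            elif start is not None and codon in STOP_CODONS:
                if i + 3 - start >= min_len:
                    orfs.append((start, i + 3, frame))
                start = None                   # close it and keep scanning
    return orfs


example_orfs = find_orfs("CCATGAAATTTGGGTAACC", min_len=9)
```

A real gene finder also scans the reverse strand, resolves overlapping candidates, and scores each ORF statistically, none of which this sketch attempts.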

The Glimmer3 output represents the predicted genes as nucleotide sequences. The “Transeq” program from the EMBOSS suite (https://www.ebi.ac.uk/jdispatcher/st/emboss_transeq)19 was used to derive the corresponding protein sequences. “Transeq” converts nucleotide sequences into the matching protein sequences, which supports functional analysis further down the line. Translating the genes predicted by “Glimmer3” with “Transeq” produced a sizable protein dataset for downstream analysis. This dataset is a valuable resource for characterizing gene products and inferring their functional roles in the Citrobacter koseri genome.
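The translation step can be sketched in a few lines. This is not the Transeq implementation; it is a minimal codon-table lookup using the standard genetic code, with unknown codons mapped to `X`. The function name `translate` is an assumption made for illustration.

```python
from itertools import product

# Standard genetic code, codons ordered by base in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucleotides):
    """Translate a nucleotide sequence in frame 1; '*' marks a stop codon."""
    nt = nucleotides.upper().replace("U", "T")
    usable = len(nt) - len(nt) % 3          # ignore a trailing partial codon
    return "".join(
        CODON_TABLE.get(nt[i:i + 3], "X")   # 'X' for codons with ambiguous bases
        for i in range(0, usable, 3)
    )


example_protein = translate("ATGGCCTAA")
```

Note that this literal lookup translates an initiating GTG or TTG as valine or leucine, whereas in the cell an initiator codon is read as methionine; whether to correct the first residue is a choice left to the downstream analysis.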

After the predicted genes from Glimmer3 were translated with “Transeq”19, the goal was to reduce sequence redundancy and obtain a non-redundant protein dataset. The “CD-HIT Tools” suite (https://github.com/weizhongli/cdhit)20, a widely used program for sequence clustering and redundancy elimination, was used for this purpose. Several CD-HIT parameters were set. The sequence identity threshold was set at 0.6, meaning that sequences with at least 60% identity were grouped together. A word size of 4 was used, as appropriate for identity thresholds in the range “0.6 to 0.7”. The redundancy tolerance was set to 2, and the alignment bandwidth was set to 20 amino acids, allowing more flexible alignment during clustering. Finally, all sequences shorter than 100 amino acids were removed to ensure the quality of the final dataset. Running CD-HIT with these settings produced a non-redundant protein dataset, which made the subsequent analyses more efficient and less computationally demanding.
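The clustering idea can be sketched as a greedy, longest-first procedure. This is a simplified stand-in and not CD-HIT itself: CD-HIT uses short-word filtering and banded alignment, whereas this sketch estimates identity with Python's `difflib`, which is an assumption made purely for illustration. The function names and the tiny example sequences are hypothetical.

```python
from difflib import SequenceMatcher


def identity(a, b):
    """Crude identity estimate: matched characters over the shorter length.

    Stand-in for the alignment-based identity that CD-HIT computes.
    """
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / min(len(a), len(b))


def greedy_cluster(seqs, threshold=0.6, min_len=100):
    """Cluster sequences longest-first; each cluster is keyed by its representative."""
    kept = {name: s for name, s in seqs.items() if len(s) >= min_len}
    order = sorted(kept, key=lambda name: len(kept[name]), reverse=True)
    clusters = {}
    for name in order:
        for rep in clusters:
            if identity(kept[name], kept[rep]) >= threshold:
                clusters[rep].append(name)   # redundant with an existing representative
                break
        else:
            clusters[name] = [name]          # becomes a new representative
    return clusters


example_clusters = greedy_cluster(
    {
        "a": "MKTAYIAKQRQISFVKSHFSRQ",
        "b": "MKTAYIAKQRQISFVKSHFSR",
        "c": "GGGGGGGGGGPPPPPPPPPP",
    },
    threshold=0.6,
    min_len=10,   # lowered from 100 only so the toy sequences survive the filter
)
```

The representatives (the dictionary keys) form the non-redundant set. With the settings in the text, `threshold=0.6` and `min_len=100` would be used.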

Several bioinformatic analyses were conducted to identify the pathogen proteins that are non-homologous to human proteins and to eliminate those with human homologs. First, a human protein database was built from the NCBI RefSeq database, which contains a vast array of human protein sequences21. The pathogen protein sequences were then compared against this database with the Blastp algorithm as implemented in DIAMOND (https://github.com/bbuchfink/diamond)22 version 2.3.0+. DIAMOND performs fast and efficient sequence alignment, which makes it appropriate for large-scale comparisons. To prevent repetition in the results, identical self-hits between sequences were masked. Composition-based statistics in DIAMOND were used to evaluate statistical significance and sequence similarity23. Alignments were scored with the BLOSUM62 matrix, which captures amino acid similarity between proteins. To ensure accurate comparisons, the “Tantan” masking technique was applied to filter low-complexity regions. An E-value cutoff of 0.001 was used to determine significant homology: alignments below this cutoff were deemed statistically significant and indicative of probable homology. All proteins with such hits were discarded, leaving a pathogen protein dataset containing only proteins that are non-homologous to the host under these parameters.
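The filtering logic described above can be sketched as follows, assuming the search results are available in the 12-column BLAST tabular format (query ID in column 1, E-value in column 11). The function name, the protein IDs, and the example hits are hypothetical and only illustrate the cutoff; they are not data from the study.

```python
def non_homologous(query_ids, tabular_hits, evalue_cutoff=1e-3):
    """Return the queries that have no hit at or below the E-value cutoff.

    `tabular_hits` is assumed to be tab-separated BLAST tabular output:
    qseqid sseqid pident length mismatch gapopen qstart qend sstart send evalue bitscore
    """
    homologous = set()
    for line in tabular_hits.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split("\t")
        if float(fields[10]) <= evalue_cutoff:
            homologous.add(fields[0])      # query has a significant host homolog
    return [q for q in query_ids if q not in homologous]


# Hypothetical example: protA has a significant hit, protB only a weak one,
# protC has no hit at all.
example_hits = "\n".join([
    "\t".join(["protA", "human_1", "45.0", "200", "110", "2", "1", "200", "5", "204", "1e-30", "150"]),
    "\t".join(["protB", "human_2", "28.0", "40", "29", "0", "3", "42", "10", "49", "0.5", "25"]),
])
example_kept = non_homologous(["protA", "protB", "protC"], example_hits)
```

The same pattern applies to any later step that screens proteins by an E-value threshold, with the retained set inverted where hits are wanted rather than excluded.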

The Geptop 2.0 server (http://guolab.whu.edu.cn/geptop/) was used to identify genes essential for the pathogen’s survival. Geptop 2.0 is a web-based application for predicting the essential genes that bacteria need to survive24. To find likely essential genes within the input proteome, it uses a Blastp-based technique with a user-defined threshold value, in this case 1e-5. The program employs a comparative genomics approach to identify genes that are conserved across related bacterial species and are therefore expected to have vital roles24.

The KEGG (Kyoto Encyclopedia of Genes and Genomes) database (https://www.genome.jp/kegg/) and the KAAS (KEGG Automatic Annotation Server) (https://www.genome.jp/kegg/kaas/) were used for comparative pathway analysis between the pathogen and humans25,26. According to Kanehisa and Goto (2000), KEGG offers a comprehensive collection of pathway data and functional annotations for genes and proteins. KAAS is an annotation tool that automates the assignment of KEGG Orthology (KO) numbers to genes, which makes it possible to identify the genes involved in unique pathways27. The pathways of the two organisms were compared to uncover pathways of the pathogen that are not present in humans. These servers also reported the number of genes involved in each metabolic pathway of the pathogen.
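The comparison amounts to a set difference between the pathway lists of the two organisms, followed by selecting proteins mapped to at least one pathogen-only pathway. The sketch below assumes the pathway assignments have already been exported from KEGG/KAAS; the function name, the protein names, and the small example mapping are hypothetical.

```python
def unique_pathway_proteins(pathogen_pathways, host_pathways, protein_to_pathways):
    """Return (pathogen-only pathways, proteins mapped to at least one of them)."""
    unique = set(pathogen_pathways) - set(host_pathways)
    selected = {}
    for protein, pathways in protein_to_pathways.items():
        hits = sorted(set(pathways) & unique)
        if hits:
            selected[protein] = hits
    return unique, selected


# Illustrative pathway identifiers; the protein names are made up.
example_unique, example_selected = unique_pathway_proteins(
    pathogen_pathways=["map00010", "map00550", "map02020"],
    host_pathways=["map00010"],
    protein_to_pathways={
        "prot_1": ["map00010"],               # shared pathway only: dropped
        "prot_2": ["map00550"],               # pathogen-only pathway: kept
        "prot_3": ["map00010", "map02020"],   # kept because of map02020
    },
)
```

Whether a protein that participates in both shared and pathogen-only pathways should be kept is a design choice; this sketch keeps it, which may or may not match the criterion applied in the study.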

The PSORTb online server (https://www.psort.org/psortb/) was used to predict the subcellular localization of the resulting proteins28. PSORTb is a popular tool for predicting the subcellular localization of bacterial proteins. It predicts the location of a protein within the bacterial cell using a machine-learning approach and a comprehensive collection of localization features (Yu et al., 2010). PSORTb was applied to the genes involved in the distinct pathways. Bacterial proteins can be located in five main compartments: the outer membrane, periplasm, plasma membrane, extracellular space, and cytoplasm29. Membrane proteins have been suggested as vaccine targets and cytoplasmic proteins as drug targets30.

The Virulence Factor Database (VFDB) (http://www.mgc.ac.cn/VFs/), which is designed to identify virulence factors in bacterial pathogens, was used to predict the virulence of the proteins31. VFDB analysis characterized the possible virulence-related functions of the identified proteins, while PSORTb indicated their location within the bacterial cell.

The DrugBank 3.0 database (https://go.drugbank.com/structures/search/bonds/sequence#results) was used to identify prospective therapeutic targets. Wishart et al.32 describe DrugBank as a comprehensive resource that unifies information on drugs, drug targets, and drug interactions. The identified proteins were searched against DrugBank with BLAST using the following criteria: E-value 0.00001, gapped alignment, drug type “approved,” and protein type “target.” The aim was to identify proteins that could serve as therapeutic targets and that may be amenable to repurposing of existing FDA-approved drugs.

The human gut microbiome is critical to the host’s health and metabolism. A sequence similarity search was performed to investigate potential connections between the identified pathogen proteins and gut flora. The previously identified proteins were used as query sequences in a BLASTP search against a vast collection of human gut microbiome/metagenome sequences. The goal of this search was to see if any of the pathogen proteins shared sequence similarities with known components of the gut microbial ecosystem. For this investigation, BLASTP version 2.10.1+ (https://github.com/ncbi/docker/tree/master/blast/2.10.1) was used33.

A library of 1600 echinoderm metabolites was obtained from the Comprehensive Marine Natural Products Database (https://www.cmnpd.org/) and prepared with the LigPrep tool for virtual screening against the identified drug target proteins34. The 3D structures of the target proteins were retrieved from the AlphaFold database and prepared using the Protein Preparation Wizard in the Schrödinger Maestro software package35. Protein preparation involved several steps. Disulfide bonds were created, bond orders were assigned, and zero-order bonds to metals were assigned. Hydrogens were added to the protein structures, and all unnecessary ligands and water molecules were removed. The pKa values of ionizable groups were computed with the PROPKA program36, and the hydrogen bond networks of the proteins were optimized at pH 7.0. Finally, the protein structures were energy-minimized with the OPLS_2005 force field, an empirical force field frequently used in molecular simulations. The protein binding sites were predicted using the CASTp web tool, and 3D grids were built at the predicted sites. Using these grids, ligands were docked into the pre-specified binding sites to study their interactions and binding patterns, a process known as site-specific docking. The prepared ligands were docked onto the prepared protein structures with the Glide docking module in SP (Standard Precision) mode15. Finally, the Glide scores of the docked ligands were analyzed and the top-scoring ligands were selected.
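The final selection step reduces to ranking ligands by docking score, where a more negative score indicates a more favorable predicted pose. The sketch below is not part of the Glide workflow; it assumes the scores have already been exported as a mapping from ligand name to score. The function names, ligand names, and scores are hypothetical.

```python
def top_hits(docking_scores, n=10):
    """Return the n ligands with the most negative docking scores, best first."""
    return sorted(docking_scores.items(), key=lambda item: item[1])[:n]


def better_than_controls(docking_scores, control_scores):
    """Keep ligands that score better (more negative) than the best control."""
    best_control = min(control_scores.values())
    return {lig: s for lig, s in docking_scores.items() if s < best_control}


# Made-up scores for illustration only.
example_scores = {"lig_A": -7.2, "lig_B": -4.1, "lig_C": -6.8}
example_controls = {"ctrl_1": -5.3, "ctrl_2": -4.6}

example_top = top_hits(example_scores, n=2)
example_better = better_than_controls(example_scores, example_controls)
```

Comparing against the best control rather than the mean control is the stricter of the two obvious choices and is an assumption of this sketch.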

Desmond was used to run 200 ns Molecular Dynamics (MD) simulations of the selected compounds37. The stability of the protein-ligand complexes was assessed through these simulations, following a series of steps that included preprocessing, optimization, and minimization. The OPLS_2005 force field was used for minimization38. The complexes were solvated in a 10 Å periodic box of TIP3P water molecules39. The systems were neutralized by adding counter ions, and 0.15 M NaCl was added to mimic physiological conditions. The NPT ensemble was used at a temperature of 300 K and a pressure of 1 atm. The systems underwent a relaxation phase before the production run. Trajectory frames were saved at 40 ps intervals for subsequent analysis.

The C. koseri NCTC 11075 genome was obtained from NCBI. In the first step, 2699 ORFs were predicted from the genome. The accuracy of ORF prediction is crucial for understanding an organism’s genomic makeup and for finding potential genes and functional elements. The predicted genes were then translated into amino acid sequences with “Transeq,” yielding a complete protein dataset for further analysis. After paralogs were removed with the CD-HIT suite at a 60% identity threshold, 2212 of the 2699 proteins remained as non-redundant. These 2212 sequences were subjected to Blastp analysis against the human proteome, which identified 1809 non-homologous proteins. The Geptop 2.0 server was then used to identify genes essential for pathogen survival, and 175 of the 1809 non-homologous proteins were predicted to be essential. Table 1 summarizes each step of the approach and the number of proteins retained at each step.

A comparative metabolic pathway analysis of the non-homologous essential proteins was also performed to determine the metabolic pathways with which they are associated. The aim was to identify therapeutic targets among important, widely used enzymes of bacterial pathways. C. koseri-specific metabolic pathways that are not found in humans were selected. In total, 58 proteins involved in distinct metabolic pathways were chosen for further investigation. Table 2 lists these pathways and the proteins involved.

Localization is a crucial consideration when selecting a therapeutic target, since proteins can reside in a variety of cellular locations. The subcellular localization of the 58 selected proteins was therefore investigated. Of the 58 proteins, 40 were predicted to be cytoplasmic, two had unknown localization, and the rest were predicted to reside in the cytoplasmic membrane (Table 3). Cytoplasmic proteins are established therapeutic targets: Wright, G. D. (2005) noted that many antibiotics target cytoplasmic components in bacteria, such as ribosomes or enzymes involved in DNA replication, transcription, and translation, and Spillantini, M. G., & Goedert, M. (2013) reviewed the targeting of cytoplasmic proteins such as tau or alpha-synuclein as a therapeutic strategy in neurodegenerative diseases like Alzheimer’s and Parkinson’s40,41. The 40 cytoplasmic proteins were therefore retained, and the other proteins were excluded. These proteins were subsequently analyzed using the VFDB, and 13 of them were predicted to be virulent (Table 3).

Druggability is another essential requirement for a therapeutic target. It is defined as the likelihood that a small-molecule drug can modulate the function of the target protein. The druggability of the C. koseri pathogenic proteins was assessed by sequence similarity to drug targets in the DrugBank database. This identified two C. koseri proteins with strong similarity to targets of FDA-approved small-molecule drugs (Table 4).
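The subtractive workflow can be summarized as a funnel of protein counts. The sketch below uses only the counts reported in the text and computes how much of each step's input survives; it assumes each step filters the output of the previous one, which the text implies but does not state explicitly for every step. The step labels and the function name `retention` are illustrative.

```python
# Counts as reported in the text; labels are informal.
FUNNEL = [
    ("Predicted ORFs (Glimmer3)", 2699),
    ("Non-redundant proteins (CD-HIT)", 2212),
    ("Non-homologous to human (Blastp)", 1809),
    ("Essential (Geptop 2.0)", 175),
    ("In pathogen-specific pathways (KEGG/KAAS)", 58),
    ("Cytoplasmic (PSORTb)", 40),
    ("Virulence-associated (VFDB)", 13),
    ("Druggable (DrugBank)", 2),
]


def retention(funnel):
    """Return (step, count, percent of previous step) for each step."""
    rows = []
    previous = None
    for step, count in funnel:
        percent = None if previous is None else round(100 * count / previous, 1)
        rows.append((step, count, percent))
        previous = count
    return rows


for step, count, percent in retention(FUNNEL):
    suffix = "" if percent is None else f" ({percent}% of previous step)"
    print(f"{count:>5}  {step}{suffix}")
```

The steepest reduction is the essentiality filter, which retains 175 of 1809 proteins.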

The library of 1600 echinoderm metabolites was docked against the drug target proteins using the standard precision protocol of the Glide tool. The docking results were evaluated using the Glide gscore, and the top 10 compounds for each receptor were selected (Table 5). Doripenem and Imipenem42 were used as controls during docking. In the WP_012000829.1 docking analysis, the controls showed binding affinities of −5.353 and 4.644, while the selected natural products showed affinities in the range of −7.251 to −6.718 kcal/mol. For WP_275157394.1, the docking scores of the controls were −4.552 and −3.566, and the docking scores of the selected products ranged from −7.887 to −7.401 kcal/mol.

The plausible binding modes of the selected compounds were analyzed, and the compound with the best Glide score for each target was selected for further analysis of its molecular interactions and the persistence of its binding. CMNPD13085 was selected against the WP_012000829.1 protein, and the ligand was well aligned with the binding pocket of the protein (Fig. 1a). The compound formed three hydrogen bonds, with Asp302, Arg324, and Asp339, and six hydrophobic interactions, with Phe149, Val321, Val299, Phe313, Leu304, and Leu338 (Fig. 1b). For the WP_275157394.1 protein, CMNPD15632 showed the highest binding affinity among the selected ligands and was likewise well aligned in the binding pocket (Fig. 1c). The compound formed hydrogen bonds with six residues, Arg187, Gly190, Arg161, Asn487, Asp255, and Gly319, and one hydrophobic interaction with Pro145 (Fig. 1d).

Fig. 1. The plausible binding modes and molecular interactions of the selected compounds with the proteins. (a) The binding mode of CMNPD13085 against WP_012000829.1. (b) The molecular interactions of CMNPD13085 with the protein. (c) The binding mode of CMNPD15632 against WP_275157394.1. (d) The molecular interactions of CMNPD15632 with the protein. The molecular interactions are represented by green (hydrogen bonds) and magenta (hydrophobic) spheres.

The stability of each target in complex with the control compound Doripenem and with the selected ligand was analyzed over a 200 ns simulation. The RMSD of the C-alpha atoms was calculated and plotted to follow structural deviations over time. The RMSD of the WP_012000829.1 control complex gradually increased to 2.4 Å at 50 ns and then stayed within 2.4–2.8 Å until the end of the simulation (Fig. 2a), while the RMSD of the hit complex remained within 2.4–2.8 Å throughout the simulation (Fig. 2b). The RMSD of the WP_275157394.1 control complex increased to 5.6 Å at 20 ns and remained in this range until the end of the simulation (Fig. 2c); the hit complex behaved similarly, reaching 5.6 Å at 20 ns and remaining in this range thereafter (Fig. 2d). The ligand RMSD tracked the protein RMSD in the WP_012000829.1 complexes, whereas in the WP_275157394.1 complexes the ligand RMSD remained lower than the protein RMSD. The RMSD analysis indicated that the ligands remained stably bound and that the proteins did not undergo conformational changes upon ligand binding.

Fig. 2. MD trajectory analysis: RMSD of the C-alpha atoms of the protein complexes. (a) WP_012000829.1 complexed with control. (b) WP_012000829.1 complexed with CMNPD13085. (c) WP_275157394.1 complexed with control. (d) WP_275157394.1 complexed with CMNPD15632.
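For reference, the RMSD used in this kind of analysis is the square root of the mean squared displacement of the selected atoms relative to a reference structure. The following is a minimal pure-Python sketch, assuming each frame has already been superimposed on the reference and that coordinates are given in Å as (x, y, z) tuples. It is not the Desmond analysis code, and the function name is hypothetical.

```python
import math


def rmsd(reference, frame):
    """RMSD between two equally sized lists of (x, y, z) coordinates.

    Assumes `frame` is already aligned (superimposed) onto `reference`.
    """
    if len(reference) != len(frame):
        raise ValueError("reference and frame must contain the same atoms")
    squared = sum(
        (a - b) ** 2
        for ref_atom, frame_atom in zip(reference, frame)
        for a, b in zip(ref_atom, frame_atom)
    )
    return math.sqrt(squared / len(reference))


# Two C-alpha atoms, both displaced by 1 Å along z.
example_rmsd = rmsd([(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
                    [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)])
```

Evaluating this for every saved frame against the first frame gives the time series plotted in an RMSD figure.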

RMSF analysis, which measures the flexibility of individual protein residues, was conducted to examine residue-level fluctuations during the simulation. Higher RMSF values indicate flexible residues, and lower values indicate rigid ones. Most WP_012000829.1 residues in both complexes remained rigid during the simulation, except for residues 110 to 120, 290 to 310, and 380 to 400, which showed higher fluctuations than the other residues (Fig. 3a-b). Most WP_275157394.1 residues in both complexes also remained rigid, except for residues 390 to 420, where RMSF values reached ~5.6 Å (Fig. 3c-d). The analysis indicated that ligand binding did not induce additional fluctuations in the protein residues during the simulation.

Fig. 3. MD trajectory analysis: RMSF of the C-alpha atoms of the protein complexes. (a) WP_012000829.1 complexed with control. (b) WP_012000829.1 complexed with CMNPD13085. (c) WP_275157394.1 complexed with control. (d) WP_275157394.1 complexed with CMNPD15632.
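Whereas RMSD gives one value per frame, RMSF gives one value per atom: the root mean square deviation of that atom from its own time-averaged position. A minimal pure-Python sketch follows, again assuming pre-aligned frames with coordinates in Å. The function name and the toy trajectory are hypothetical.

```python
import math


def rmsf(trajectory):
    """Per-atom RMSF for a trajectory given as frames of (x, y, z) tuples.

    Assumes all frames are aligned to a common reference and list the
    same atoms in the same order.
    """
    n_frames = len(trajectory)
    n_atoms = len(trajectory[0])
    values = []
    for i in range(n_atoms):
        mean = [sum(frame[i][k] for frame in trajectory) / n_frames for k in range(3)]
        mean_sq_dev = sum(
            sum((frame[i][k] - mean[k]) ** 2 for k in range(3))
            for frame in trajectory
        ) / n_frames
        values.append(math.sqrt(mean_sq_dev))
    return values


# Atom 0 never moves; atom 1 oscillates between x = 1 and x = 3.
example_rmsf = rmsf([
    [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)],
    [(0.0, 0.0, 0.0), (3.0, 0.0, 0.0)],
])
```

Plotting these values against residue number for the C-alpha atoms highlights flexible regions such as loops.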

The most important protein-ligand interactions observed during the MD simulations were hydrophobic interactions, hydrogen bonds, and ionic interactions. These interactions are essential for maintaining the stability of the protein-ligand complex and for its functional properties. In the WP_012000829.1 control complex, the residues involved in hydrogen bonding were Gly151, Asp302, Asp303, Leu304, Ser323, and Asp339; Ser323 interacted with the ligand most frequently, in 46% of the total frames (Fig. 4a). In the WP_012000829.1 hit complex, Ala152, Pro300, Asp302, Asp303, Ser323, Arg324, Gln325, Asp336, and Asp339 formed hydrogen bonds with the ligand, and Asp336 and Asp339 were also involved in ionic interactions. Among all interacting residues, Leu322 showed the highest interaction frequency, in 73% of the total frames (Fig. 4b). In the WP_275157394.1 control complex, the residues involved in hydrogen bonding were Asp143, Gln144, Gln147, Arg161, Asn185, Ser189, Gln382, Lys384, Asn487, and Ser489, with Asn185 interacting in 90% of the frames (Fig. 4c). In the WP_275157394.1 hit complex, the residues involved in hydrogen bonding were Val21, His26, Gln144, Arg161, Asp188, Lys192, Glu254, Asp255, Arg303, Gly319, Lys320, Thr321, Glu485, Gln488, Ser489, and Gly490. Among these, Asp255 showed the highest interaction frequency, in 32% of the total frames (Fig. 4d).

Fig. 4. Protein-ligand contacts calculated during simulation. (a) WP_012000829.1 complexed with control. (b) WP_012000829.1 complexed with CMNPD13085. (c) WP_275157394.1 complexed with control. (d) WP_275157394.1 complexed with CMNPD15632.
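The percentages quoted for individual residues are contact occupancies: the share of saved frames in which a given interaction is present. The sketch below shows that calculation together with the approximate number of frames implied by a 200 ns run saved every 40 ps. It assumes a per-frame true/false flag for each contact has already been extracted from the trajectory, and it ignores whether the initial frame is counted. The function names are hypothetical.

```python
def occupancy(contact_flags):
    """Percentage of frames in which a contact is present."""
    if not contact_flags:
        raise ValueError("at least one frame is required")
    return 100.0 * sum(1 for flag in contact_flags if flag) / len(contact_flags)


def n_frames(length_ns, interval_ps):
    """Approximate number of saved frames for a run of the given length."""
    return int(round(length_ns * 1000 / interval_ps))


example_occupancy = occupancy([True, False, True, True])
example_frames = n_frames(200, 40)
```

Under these assumptions the run yields about 5000 frames, so an occupancy of 46% corresponds to roughly 2300 frames in which the contact is present.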

The Prime MM-GBSA module was used to calculate the binding free energy of the selected complexes43. The binding energies of the WP_012000829.1 complexes with control and hit were −72.10 and −118.85 kcal/mol, respectively, while the WP_275157394.1 complexes with control and hit showed binding free energies of −60.05 and −83.02, respectively. The total binding free energy (Gbind) comprised the non-bonded contributions GCoulomb, GPacking, GHbond, GLipo, and GvdW (Table 6). GbindLipo, GbindvdW, and GbindCoulomb contributed most to the average binding free energies, whereas GbindSolvGB and GbindCovalent contributed least. The GbindHbond values also indicated stable hydrogen bonds between the ligands and amino acid residues. Thus, the binding energies calculated during simulation supported the binding affinities obtained in the docking studies42,44.
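The MM-GBSA binding free energy is conventionally obtained as the energy of the complex minus the energies of the free receptor and free ligand. The sketch below states that relation and then uses the values reported above to compute how much more favorable each hit is than its control. The second pair of values is assumed to be in kcal/mol like the first, since the text gives no unit for it. The function names are hypothetical.

```python
def dg_bind(g_complex, g_receptor, g_ligand):
    """Binding free energy: G(complex) - [G(receptor) + G(ligand)]."""
    return g_complex - (g_receptor + g_ligand)


# Binding free energies as reported in the text (kcal/mol assumed throughout).
REPORTED = {
    "WP_012000829.1": {"control": -72.10, "hit": -118.85},
    "WP_275157394.1": {"control": -60.05, "hit": -83.02},
}


def hit_vs_control(reported):
    """Difference (hit minus control); negative means the hit binds more favorably."""
    return {
        target: round(values["hit"] - values["control"], 2)
        for target, values in reported.items()
    }


example_differences = hit_vs_control(REPORTED)
```

On these numbers the hits are more favorable than the controls by about 46.75 and 22.97 kcal/mol for WP_012000829.1 and WP_275157394.1, respectively. MM-GBSA values are best read as relative rankings rather than absolute affinities.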

Citrobacter koseri is a highly adaptable, opportunistic bacterium with exceptional survival and transmission mechanisms within the host. C. koseri infections can be dangerous if left untreated or treated insufficiently. Untreated bacteremia, for example, can lead to sepsis, a potentially fatal illness. Antibiotics that are effective in preventing these consequences can help minimize the overall morbidity and mortality associated with C. koseri infections45.

In this study, a subtractive proteomics technique was employed to evaluate therapeutic candidates against C. koseri. This method is used to find targets by identifying essential and non-homologous proteins in pathogenic organisms. When applied to the genome, subtractive proteomics can identify proteins that are uniquely linked to or regulated by specific genes or genomic areas. Finding therapeutic targets is an essential step in computer-based drug design techniques46. Advancements in bioinformatics and computational biology have yielded several methods for drug design and in silico analysis, reducing the expenses and duration associated with the iterative process of developing new drugs47.

The genome of C. koseri NCTC 11075 was obtained from NCBI. The first step was to identify 2699 ORFs from the C. koseri genome, which were then translated into amino acid sequences. Prediction of ORFs is a critical step in genomics and bioinformatics, and its accuracy has a significant impact on downstream analyses such as comparative genomics and functional annotation48. Reducing redundancy is necessary because protein sequence datasets contain many redundant entries. Redundancy occurs when a dataset contains one or more homologous sequences, and such sequences introduce unwanted bias into an analysis49. Hence, the 2699 translated proteins were analyzed with CD-HIT, which eliminated the redundant proteins and produced 2212 non-redundant proteins. Some of these proteins could be homologous to human proteins, and targeting such proteins could interfere with human metabolism and be deadly. Selecting non-homologous proteins limits the potential for cross-reactivity and undesirable outcomes47. Hence, 1809 non-homologous proteins were retained. Essential proteins are necessary for bacterial life, and damage to or modification of these proteins renders bacteria incapable of surviving. Targeting these proteins therefore offers a way to kill the pathogen and treat the infection. Research on bacterial essential genes contributes to a better understanding of the nature of life and to the identification of potential therapeutic targets for treating pathogenic diseases50,51. The development of vaccines and antibacterial drugs preferentially targets essential proteins. Using this method, Shilpa S. et al. discovered 807 essential proteins in Eubacterium nodatum, Sakharkar et al. discovered 306 essential genes in Pseudomonas aeruginosa, and Chan-Eng Chong et al. discovered 312 essential proteins in Burkholderia pseudomallei52,53,54.

A comparative analysis of metabolic pathways between humans and the pathogen using the KEGG database identified 23 pathways present exclusively in the pathogen. These unique pathways encompass diverse processes such as photosynthesis, biosynthesis of antibiotics, and various bacterial signaling mechanisms. The identification of pathogen-specific pathways is consistent with previous findings in other pathogens, including L. interrogans, A. baumannii, and S. saprophyticus, and highlights the distinct metabolic signatures that contribute to the pathogenicity of these microorganisms41,55,56. Hence, 58 proteins involved in distinct metabolic pathways were chosen for further study. Subcellular localization prediction is a quick and low-cost way to gain insight into a protein’s function. Protein drug discovery, protein function prediction, and genome annotation all rely on subcellular localization57. The subcellular localization of the 58 proteins involved in unique pathways was therefore investigated. Because membrane proteins are harder to purify and analyze, cytoplasmic proteins are more appealing as therapeutic targets58. Hence, 40 cytoplasmic proteins were retained. These proteins were subsequently evaluated for virulence factors, and 13 were predicted to be virulent. Additionally, the druggability of the pathogenic proteins was assessed against the drug targets in the DrugBank database. WP_012000829.1 and WP_275157394.1 were identified as potential drug targets.

The 3D structures of the target proteins were predicted with AlphaFold. A library of 1600 echinoderm metabolites was then docked against the two target proteins. Molecular docking predicts the orientation of small molecules relative to their protein targets, and these computational methods provide information on a compound’s binding affinity and binding mode against its target protein59. Ampicillin, Levofloxacin, and Doxycycline were used as controls during docking. The docking results were evaluated using the Glide GScore, and the top 10 compounds for each protein receptor were selected. The binding poses of the compounds with the highest binding affinities were then examined to characterize their chemical interactions with the target proteins. Among the selected compounds, CMNPD13085 showed the highest binding affinity for WP_012000829.1, and CMNPD15632 showed the highest binding affinity for WP_275157394.1.
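
To make the ranking step concrete, the sketch below selects the best-scoring compounds from a table of docking scores. It assumes the usual Glide convention that a more negative GScore indicates stronger predicted binding. The function name top_hits, the compound labels, and the scores are placeholders invented for illustration and are not results from the study.

```python
def top_hits(scores, n=10):
    """Return the n (compound, score) pairs with the most negative scores.

    Assumes that a more negative docking score means more favorable
    predicted binding.
    """
    return sorted(scores.items(), key=lambda item: item[1])[:n]


# Placeholder labels and scores, invented only to exercise the function.
example_scores = {
    "ligand_a": -5.2,
    "ligand_b": -8.1,
    "ligand_c": -6.7,
    "control_x": -4.9,
}
best = top_hits(example_scores, n=2)
```

Keeping the control compounds in the same table makes it easy to check whether the selected hits score better than the reference antibiotics.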

MD simulation and MM-GBSA analysis of docked complexes make it possible to investigate the dynamics, stability, and binding affinity of protein-ligand interactions. For each protein, the two best-docked complexes were subjected to molecular dynamics (MD) simulations and MM-GBSA calculations, since these ligands showed strong binding affinity, as indicated by favorable docking scores and strong molecular interaction networks. These methods help refine docking predictions, rank ligand candidates, guide future optimization efforts, and deepen our understanding of the molecular mechanisms behind protein-ligand binding. According to the MM-GBSA analysis and MD simulations, these compounds remained stable within the protein binding pockets, supporting their potential as effective inhibitors. Consequently, the identified novel drug targets hold significant potential for therapeutic applications against C. koseri infections, pending further experimental validation.
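
For reference, the end-point binding free energy estimated by MM-GBSA is commonly written in the general form below. This is the textbook expression rather than the exact output of the software used in the study; the symbols are introduced here for illustration, and the entropy term is left out, as is often done in practice.

```latex
\Delta G_{\mathrm{bind}}
  = G_{\mathrm{complex}} - \left( G_{\mathrm{receptor}} + G_{\mathrm{ligand}} \right),
\qquad
G = E_{\mathrm{MM}} + G_{\mathrm{GB}} + G_{\mathrm{SA}}
```

Here E_MM is the molecular mechanics energy, G_GB is the polar solvation term from the generalized Born model, and G_SA is the nonpolar term estimated from the solvent-accessible surface area. A more negative value of the binding free energy indicates more favorable predicted binding.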

Given the importance of identifying effective drug targets, this study carried out a computational assessment of the human pathogen C. koseri to identify potential targets using a variety of computational tools and techniques. In the first phase, two proteins were identified as drug targets on the basis of unique pathways and druggability. In the second phase, the structures of these targets were modeled and used for molecular docking and simulation studies. This work therefore represents a significant step toward the development of novel, effective anti-C. koseri drugs.

The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.

Pennington, K., Van Zyl, M. & Escalante, P. Citrobacter koseri pneumonia as initial presentation of underlying pulmonary adenocarcinoma. Clinical Medicine Insights: Case Reports 9, CCRep. S40616 (2016).

Hua, D. T., Lo, J., Do, H. Q. & Pham, C. D. A case of Citrobacter koseri renal abscess and review of the literature. SAGE Open. Med. Case Rep. 10, 2050313X221135347 (2022).

Hauser, L. et al. Fatal transfusion-transmitted infection due to Citrobacter koseri. Transfusion. 56, 1311–1313 (2016).

Wang, J.-T. & Chang, S.-C. (2016).

Lipsky, B. A., Hook, E. W. III, Smith, A. A. & Plorde, J. J. Citrobacter infections in humans: Experience at the Seattle Veterans Administration Medical Center and a review of the literature. Rev. Infect. Dis. 2, 746–760 (1980).

Shih, C. C., Chen, Y. C., Chang, S. C., Luh, K. T. & Hsieh, W. C. Bacteremia due to Citrobacter species: Significance of primary intraabdominal infection. Clin. Infect. Dis. 23, 543–549 (1996).

Nair, G., Lakshminarayana, M., Nagaraj, L., Kumar, M. & Thilak, S. Urinary infections and Citrobacter: An unpleasant scenario. Int. J. Adv. Community Med. 3, 249–253 (2020).

Liu, H. W., Chang, C. J. & Hsieh, C. T. Brain abscess caused by Citrobacter koseri infection in an adult. Neurosciences J. 20, 170–172 (2015).

Marecos, C. V., Ferreira, M., Ferreira, M. M. & Barroso, M. R. Sepsis, meningitis and cerebral abscesses caused by Citrobacter koseri. BMJ Case Reports bcr1020114941 (2012).

Samonis, G. et al. Citrobacter infections in a general hospital: characteristics and outcomes. Eur. J. Clin. Microbiol. Infect. Dis. 28, 61–68 (2009).

Jabeen, I., Islam, S., Hassan, A. I., Tasnim, Z. & Shuvo, S. R. A brief insight into Citrobacter species-a growing threat to public health. Front. Antibiot. 2, 1276982 (2023).

Yuan, C. et al. Comparative genomic analysis of Citrobacter and key genes essential for the pathogenicity of Citrobacter koseri. Front. Microbiol. 10, 2774 (2019).

Deveci, A. & Coban, A. Y. Optimum management of Citrobacter koseri infection. Expert Rev. anti-infective Therapy. 12, 1137–1142 (2014).

Ando, T. et al. Infectious aneurysm caused by Citrobacter koseri in an immunocompetent patient. Intern. Med. 58, 813–816 (2019).

Townsend, S. M., Pollack, H. A., Gonzalez-Gomez, I., Shimada, H. & Badger, J. L. Citrobacter koseri brain abscess in the neonatal rat: survival and replication within human and rat macrophages. Infect. Immun. 71, 5871–5880 (2003).

Fatima, I. et al. Revolutionizing and identifying novel drug targets in Citrobacter koseri via subtractive proteomics and development of a multi-epitope vaccine using reverse vaccinology and immuno-informatics. J. Biomol. Struct. Dynamics, 1–14 (2024).

Sayers, E. W. et al. Database resources of the national center for biotechnology information. Nucleic Acids Res. 49, D10 (2021).

Delcher, A. L., Bratke, K. A., Powers, E. C. & Salzberg, S. L. Identifying bacterial genes and endosymbiont DNA with glimmer. Bioinformatics. 23, 673–679. https://doi.org/10.1093/bioinformatics/btm009 (2007).

Rice, P., Longden, I. & Bleasby, A. EMBOSS: The European Molecular Biology Open Software Suite. Trends Genet. 16, 276–277. https://doi.org/10.1016/S0168-9525(00)02024-2 (2000).

Huang, Y., Niu, B., Gao, Y., Fu, L. & Li, W. CD-HIT suite: a web server for clustering and comparing biological sequences. Bioinformatics. 26, 680–682 (2010).

O’Leary, N. A. et al. Reference sequence (RefSeq) database at NCBI: Current status, taxonomic expansion, and functional annotation. Nucleic Acids Res. 44, D733–D745 (2016).

Buchfink, B., Xie, C. & Huson, D. H. Fast and sensitive protein alignment using DIAMOND. Nat. Methods. 12, 59–60 (2015).

Hauser, M. Taxonomic and functional marker genes for viruses. Curr. Opin. Microbiol. 31, 82–89. https://doi.org/10.1016/j.mib.2016.03.009 (2016).

Wen, Q. F., Wei, W. & Guo, F. B. Geptop 2.0: Accurately select essential genes from the list of protein-coding genes in prokaryotic genomes. Essent. Genes Genomes: Methods Protocols, 423–430 (2022).

Aoki-Kinoshita, K. F. & Kanehisa, M. In Comparative Genomics 71–91 (Springer, 2007).

Kanehisa, M. & Goto, S. KEGG: Kyoto Encyclopedia of Genes and Genomes. Nucleic Acids Res. 28, 27–30 (2000).

Moriya, Y., Itoh, M., Okuda, S., Yoshizawa, A. C. & Kanehisa, M. KAAS: An automatic genome annotation and pathway reconstruction server. Nucleic Acids Res. 35, W182–W185 (2007).

Yu, N. Y. et al. PSORTb 3.0: improved protein subcellular localization prediction with refined localization subcategories and predictive capabilities for all prokaryotes. Bioinformatics. 26, 1608–1615 (2010).

Maurya, S., Akhtar, S., Siddiqui, M. H. & Khan, M. K. A. Subtractive proteomics for identification of drug targets in bacterial pathogens: A review. Int. J. Eng. Res. Technol. 9 (2020).

Hema, K. et al. 202 subunit vaccine design against pathogens causing atherosclerosis. J. Biomol. Struct. Dynamics. 33, 135–136 (2015).

Chen, L. et al. VFDB: A reference database for bacterial virulence factors. Nucleic Acids Res. 33, D325–D328 (2005).

Wishart, D. S. et al. DrugBank 5.0: A major update to the DrugBank database for 2018. Nucleic Acids Res. 46, D1074–D1082 (2018).

Camacho, C. et al. BLAST+: architecture and applications. BMC Bioinform. 10, 1–9 (2009).

LigPrep (Schrödinger LLC, 2018).

Schrödinger, L. J. S. S. & Schrödinger LLC; New York, NY: 2, 2017 – 2011 (2017).

Kim, M. O., Nichols, S. E., Wang, Y. & McCammon, J. A. Effects of histidine protonation and rotameric states on virtual screening of M. tuberculosis RmlC. J. Comput. Aided Mol. Des. 27, 235–246 (2013).

Bowers, K. J. et al. in Proceedings of the 2006 ACM/IEEE Conference on Supercomputing. 84-es.

Shivakumar, D. et al. Improving the prediction of absolute solvation free energies using the next generation OPLS force field. J. Chem. Theory Comput. 8, 2553–2558 (2012).

Price, D. J. & Brooks, C. L. III. A modified TIP3P water potential for simulation with Ewald summation. J. Chem. Phys. 121, 10096–10103 (2004).

Uddin, R., Siddiqui, Q. N., Sufian, M., Azam, S. S. & Wadood, A. Proteome-wide subtractive approach to prioritize a hypothetical protein of XDR-Mycobacterium tuberculosis as potential drug target. Genes Genomics. 41, 1281–1292 (2019).

Shahid, F. et al. In silico subtractive proteomics approach for identification of potential drug targets in Staphylococcus saprophyticus. Int. J. Environ. Res. Public Health. 17, 3644 (2020).

Halder, S. K. et al. Oxa-376 and Oxa-530 variants of β-lactamase: computational study uncovers potential therapeutic targets of Acinetobacter baumannii. RSC Adv. 12, 24319–24338 (2022).

Godschalk, F., Genheden, S., Söderhjelm, P. & Ryde, U. Comparison of MM/GBSA calculations based on explicit and implicit solvent simulations. Phys. Chem. Chem. Phys. 15, 7731–7739 (2013).

Decherchi, S. & Cavalli, A. Thermodynamics and kinetics of drug-target binding by molecular simulation. Chem. Rev. 120, 12788–12833 (2020).

Sadeq, A. Genotypic detection of qnrA and qnrC genes in Citrobacter koseri isolated from patients with urinary tract infection. Arch. Razi Inst. 77, 675 (2022).

Hosen, M. I. et al. Application of a subtractive genomics approach for in silico identification and characterization of novel drug targets in Mycobacterium tuberculosis F11. Interdisciplinary Sciences: Comput. Life Sci. 6, 48–56 (2014).

Barh, D. et al. In silico subtractive genomics for target identification in human bacterial pathogens. Drug Dev. Res. 72, 162–177 (2011).

Brent, M. R. Genome annotation past, present, and future: How to define an ORF at each locus. Genome Res. 15, 1777–1786 (2005).

Sikic, K. & Carugo, O. Protein sequence redundancy reduction: comparison of various method. Bioinformation. 5, 234 (2010).

Chen, W. H., Lu, G., Chen, X., Zhao, X. M. & Bork, P. OGEE v2: An update of the online gene essentiality database with special focus on differentially essential genes in human cancer cell lines. Nucleic Acids Res. gkw1013 (2016).

Dickerson, J. E., Zhu, A., Robertson, D. L. & Hentges, K. E. Defining the role of essential genes in human disease. PLoS One 6 (2011).

Sakharkar, K. R., Sakharkar, M. K. & Chow, V. T. A novel genomics approach for the identification of drug targets in pathogens, with special reference to Pseudomonas aeruginosa. In Silico Biol. 4, 355–360 (2004).

Shiragannavar, S. S., Shettar, A. K., Madagi, S. B. & Sarawad, S. Subtractive genomics approach in identifying polysacharide biosynthesis protein as novel drug target against Eubacterium nodatum. Asian J. Pharm. Pharmacol. 5, 382–392 (2019).

Chong, C. E., Lim, B. S., Nathan, S. & Mohamed, R. In silico analysis of Burkholderia pseudomallei genome sequence for potential drug targets. In Silico Biol. 6, 341–346 (2006).

Goyal, M., Citu, C. & Singh, N. In silico identification of novel drug targets in acinetobacter baumannii by subtractive genomic approach. Asian J. Pharm. Clin. Res. 11, 230. https://doi.org/10.22159/ajpcr.2018.v11i3.22105 (2018).

Amineni, U., Pradhan, D. & Marisetty, H. In silico identification of common putative drug targets in Leptospira interrogans. J. Chem. Biol. 3, 165–173. https://doi.org/10.1007/s12154-010-0039-1 (2010).

Su, E. C. Y. et al. Protein subcellular localization prediction based on compartment-specific features and structure conservation. BMC Bioinform. 8, 330 (2007).

Mondal, S. I. et al. Identification of potential drug targets by subtractive genome analysis of Escherichia coli O157:H7: An in silico approach. Adv. Appl. Bioinf. Chemistry: AABC. 8, 49 (2015).

Qamar, M. et al. In-silico identification and evaluation of plant flavonoids as dengue NS2B/NS3 protease inhibitors using molecular docking and simulation approach. Pak J. Pharm. Sci. 30, 2119–2137 (2017).

The authors extend their appreciation to the Researchers Supporting Project number (RSPD2024R885), King Saud University, Riyadh, Saudi Arabia, for funding this research.

Department of Pharmacognosy, College of Pharmacy, King Saud University, PO Box 2457, Riyadh, 11451, Saudi Arabia

Bayan A. Alhaidhal, Fatimah M. Alsulais, Ramzi A. Mothana & Abdullah R. Alanzi

Conceptualization, A.A; Methodology, B.A; Supervision, A.A; Writing – original draft, F.A and R.M; All authors reviewed the manuscript.

Correspondence to Abdullah R. Alanzi.

The authors declare no competing interests.

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Open Access This article is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License, which permits any non-commercial use, sharing, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if you modified the licensed material. You do not have permission under this licence to share adapted material derived from this article or parts of it. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by-nc-nd/4.0/.

Alhaidhal, B.A., Alsulais, F.M., Mothana, R.A. et al. In silico discovery of druggable targets in Citrobacter koseri using echinoderm metabolites and molecular dynamics simulation. Sci Rep 14, 26776 (2024). https://doi.org/10.1038/s41598-024-77342-5

Received: 27 April 2024

Accepted: 22 October 2024

Published: 05 November 2024

DOI: https://doi.org/10.1038/s41598-024-77342-5
