Scientific Reports volume 12, Article number: 21070 (2022)
Developing a common medication strategy for disease control and management could be greatly beneficial. Investigating the differences between diseased and healthy states using differentially expressed genes aids in understanding disease pathophysiology and enables the exploration of protein-drug interactions. This study aimed to find the most common genes in diarrhea-causing bacteria such as Salmonella enterica serovar Typhimurium, Campylobacter jejuni, Escherichia coli, Shigella dysenteriae (CESS) to find new drugs. Thus, differential gene expression datasets of CESS were screened through computational algorithms and programming. Subsequently, hub and common genes were prioritized from the analysis of extensive protein–protein interactions. Binding predictions were performed to identify the common potential therapeutic targets of CESS. We identified a total of 827 dysregulated genes that are highly linked to CESS. Notably, no common gene interaction was found among all CESS bacteria, but we identified 3 common genes in both Salmonella-Escherichia and Escherichia-Campylobacter infections. Later, out of 73 protein complexes, molecular simulations confirmed 5 therapeutic candidates from the CESS. We have developed a new pipeline for identifying therapeutic targets for a common medication strategy against CESS. However, further wet-lab validation is needed to confirm their effectiveness.
Microarrays have revolutionized biotechnology, allowing researchers to track down the expression of tens of thousands of genes simultaneously1. In most cases, any microarray experiment results in a list of genes found to be differentially expressed. The analysis of these large-scale gene expressions has become a fundamental approach to the identification of clinical diagnostic factors as well as potential drug targets2. The common challenge here is translating such lists of gene expression data into a better understanding of the underlying disease phenomena. The first solution in this direction can be to translate the gene expression pattern into a functional profile, which will offer insight into the cellular mechanisms relevant to the given disease condition3. Over the last decade, high-throughput in silico genomics, transcriptomics, and proteomics technologies have allowed researchers to rapidly acquire and analyze several thousand gene expression profiles in any experiment4.
Enteric bacterial pathogens and parasites are the leading cause of infectious diarrhea in developing countries5. Common bacteria that cause diarrhea include Salmonella, Escherichia, Shigella, and Campylobacter. The virulence of Salmonella enterica serovar Typhimurium (S. Typhimurium) depends largely on two type III secretion systems (T3SSs), which are encoded in pathogenicity islands 1 (SPI1) and 2 (SPI2), respectively6. These systems translocate proteins called effectors into the eukaryotic host cell, where they interfere with certain host signal transduction pathways to allow the internalization of the pathogen and its survival and proliferation inside vacuoles6. Escherichia coli (E. coli) is the primary cause of watery diarrhea in infants, often accompanied by low-grade fever and vomiting7. Compared with other pathogens such as Shigella and Salmonella, E. coli is typically considered non-invasive. However, it encodes a T3SS that produces a characteristic attaching and effacing (A/E) lesion8. The effacement of microvilli on the epithelial surface induced by A/E lesions contributes to the diarrheal phenotype owing to the loss of overall absorptive surface8. Of the four major Shigella species that cause diarrheal disease, Shigella sonnei (S. sonnei) and Shigella flexneri (S. flexneri) are the most common in the U.S. and other developed countries9. However, a 2013 study reported the sudden emergence of S. sonnei in Bangladesh, indicating that this pattern is changing10. The two other Shigella species, Shigella dysenteriae (S. dysenteriae) and Shigella boydii (S. boydii), generally have a low infection rate and are found very rarely in developed countries. S. dysenteriae produces Shiga toxin, which makes it the most life-threatening of these infections and can also lead to hemolytic uremic syndrome (HUS). The exotoxin released by S. dysenteriae compromises the central nervous system and the gut, while the enterotoxin causes the diarrhea11.
The primary cause of inflammation by Shigella involves various steps of the invasion process. An initial release of IL-1β, caused by the destruction of macrophages after the bacteria emerge from M-cells, attracts polymorphonuclear leukocytes (PMNs) that release a precursor to the secretagogue adenosine, ultimately activating Cl− secretion. The presence of free bacteria on the basolateral side of cells aggravates this early step in inflammation12,13. Campylobacter jejuni (C. jejuni) initiates infection by penetrating the gastrointestinal mucus using its high motility and spiral shape14. The bacteria then adhere to the gut enterocytes and induce diarrhea by releasing toxins. C. jejuni releases several different enterotoxins and cytotoxins that vary from strain to strain, and the severity of enteritis correlates with these toxins14. These four bacterial infections share a common disease manifestation in the human body, namely diarrhea. A significant number of host (human) genes are upregulated or downregulated during pathogenesis by these four diarrhea-causing bacteria. Thus, it is important to decipher the molecular mechanisms underlying these dysregulated gene networks in the pathogenesis of diarrhea.
Previously, several investigations have been conducted individually on the four bacteria to deduce how their gene expression events contribute to various intestinal and systemic infections15,16,17,18. However, to date, no work has been done to reveal the gene–gene associations and their dysregulation due to the pathogenesis of these four bacteria. Therefore, we addressed all four bacteria in this study. Disclosing the significant networks of genes dysregulated by these bacterial infections could reveal plausible drug targets and shed light on the possibility of a common therapy.
All the molecular events in the cell are controlled primarily by changes in the expression of key genes. Gene transcription is pivotal in regular events such as cell division, proliferation, differentiation, and cell death. Much interest is therefore focused on gene expression profiling to identify the key gene clusters whose expression changes in disease states. Gene–gene interactions indicate how, and to what extent, one gene is related to other genes. A gene can interact with other genes in several ways, including shared domains, motifs, co-localization, and pathways. For this reason, one gene can strongly influence another gene's function and regulation. Therefore, the goal of this study was to reveal the commonly dysregulated genes and the significant gene networks associated with these four bacterial infections. We analyzed the gene expression patterns, hub genes, pathways, regulatory biomarkers, and structural associations of the interacting proteins involved in the progression of these diseases. Furthermore, we investigated the functional associations of the final products of these dysregulated genes to identify the major drug targets for a common therapy.
Four gene expression profiles were used in this investigation, namely GSE51043, GSE18810, GSE19315, and GSE36701. A total of 215 DEGs (Differentially Expressed Genes) were screened from GSE51043, with 72 upregulated and 143 downregulated genes. The GSE18810 dataset yielded 187 DEGs, 50 of which were upregulated and 137 of which were downregulated. In the GSE19315 gene chip, 214 DEGs were discovered, with 109 upregulated and 105 downregulated genes. Finally, the GSE36701 dataset yielded 213 DEGs, with 67 upregulated and 146 downregulated genes. All the upregulated and downregulated DEGs resulting from the pathogenesis of the four bacteria are listed in Supplementary files 1–2.
In a complex disease state, a wide range of signaling pathways and GO terms are involved in disease progression. We used all the DEGs (298 upregulated and 529 downregulated genes) to determine significant pathways and gene ontologies that may be linked to diarrhea pathogenesis. GO terms and pathways were selected based on the number of genes involved and a p-value less than or equal to 0.05. The top GO terms were regulation of cell population proliferation (27) and RNA polymerase II cis-regulatory region sequence-specific DNA binding (25) for upregulated genes, and nervous system development (20) and cis-regulatory region sequence-specific DNA binding (39) for downregulated genes. In addition to GO terms, over-represented signaling pathways were predicted for the DEGs. The top pathways were cancer pathways (12) and cytokine-cytokine receptor interaction (10) for upregulated genes, and chemical carcinogenesis (12) and Rap1 signaling pathway (11) for downregulated genes. Supplementary Files 3–5 contain the list of all GO terms and pathways of the DEGs with a p-value less than or equal to 0.05. Furthermore, we identified the commonly altered pathways (due to both upregulation and downregulation) from the top 30 pathways of each DEG set resulting from the pathogenesis of the four bacteria, as shown in Table 1.
The Cytoscape software v3.8 and the InteractiVenn tool (http://www.interactivenn.net/) were used to identify the common up and downregulated genes from the pathogenesis of four bacterial species. There was no single common DEG found in all four categories. The highest number (3) of commonly downregulated genes were found from S. Typhimurium-E. coli and E. coli-C. jejuni infections. The result is depicted in Figs. 1, 2, 3.
Schematic workflow of identifying the major druggable targets of diarrheal pathogens.
The common upregulated DEGs (black colored diamond boxes) found in the pathogenesis of four bacterial species. DEGs from Salmonella, E. coli, Shigella, and Campylobacter infection are shown in red, green, yellow, and blue colors, respectively.
The common downregulated DEGs (black colored diamond boxes) found in the pathogenesis of four bacterial species. DEGs from Salmonella, E. coli, Shigella, and Campylobacter infection are shown in red, green, yellow, and blue colors, respectively.
Two PPI networks have been built from all the up-and downregulated gene interactions, shown in Supplementary file 6 (supplementary Fig. 1–2). Genes that often interact with other genes are known as hub genes in the gene networks. Hub genes typically play an essential function in a biological system due to these interactions. The protein–protein interaction (PPI) network, which is composed of highly connected (hub) genes, has a biological role demonstrated by the centrality-lethality rule19. Further, we employed four methods to determine the hub proteins within each group of proteins that differentially regulate pathogenesis by four bacteria. Each method identified the top 10 hub nodes within the PPI network. Except for the upregulated proteins resulting from Campylobacter and Salmonella pathogenesis, more than one hub protein was found to be common in all four methods within each group. The list of these hub proteins is tabulated in Table 2.
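The consensus step described above (rank nodes with several methods, keep the top 10 from each, and retain proteins shared by all rankings) can be sketched as follows. This is only an illustration: the four ranking methods are not named in this section, so degree and closeness centrality are used here as stand-ins, and the toy network, node names, and helper functions are hypothetical.

```python
from collections import deque


def build_adjacency(edges):
    """Build an undirected adjacency map from (node, node) pairs."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def degree_scores(adj):
    """Score each node by its number of interaction partners."""
    return {node: len(nbrs) for node, nbrs in adj.items()}


def closeness_scores(adj):
    """Score each node by (reachable nodes) / (sum of shortest-path lengths)."""
    scores = {}
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:  # breadth-first search from the source
            node = queue.popleft()
            for nbr in adj[node]:
                if nbr not in dist:
                    dist[nbr] = dist[node] + 1
                    queue.append(nbr)
        total = sum(dist.values())
        scores[source] = (len(dist) - 1) / total if total else 0.0
    return scores


def top_k(scores, k):
    """Return the k best-scoring nodes (ties broken alphabetically)."""
    ranked = sorted(scores, key=lambda n: (-scores[n], n))
    return set(ranked[:k])


def consensus_hubs(adj, methods, k=10):
    """Keep only nodes that appear in the top k of every ranking method."""
    return set.intersection(*(top_k(method(adj), k) for method in methods))


# Hypothetical toy network: "H" is a star centre, "D" bridges to "E".
toy_edges = [("H", "A"), ("H", "B"), ("H", "C"), ("H", "D"), ("D", "E")]
toy_adj = build_adjacency(toy_edges)
hubs = consensus_hubs(toy_adj, [degree_scores, closeness_scores], k=2)
print(sorted(hubs))
```

In a real analysis the same intersection would be taken over the four rankings produced by the network tool, with k set to 10 as in the text.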
Using the common and hub DEGs, we found 276 TFs (Transcriptional Factors) and 959 miRNAs (micro RNAs) that might influence the expression pattern of those genes and lead to the progression of diseases, as depicted in Fig. 4 and Supplementary File 8–9. Out of 276 TFs, the top TFs (i.e., ZNF354C, FOXC1, GATA2, FOXL1, YY1, MEF2A, NFIC, TFAP2A, SREBF1) were identified with betweenness centrality ≥ 45 as shown in Fig. 4A. Among all the miRNAs, we identified seventeen miRNAs (i.e., hsa-mir-17-5p, hsa-mir-20a-5p, hsa-mir-92a-3p, hsa-mir-93-5p, hsa-mir-122-5p, hsa-mir-155-5p, hsa-mir-106b-5p, hsa-mir-373-3p, hsa-mir-20b-5p, hsa-mir-329-3p, hsa-mir-520a-3p, hsa-mir-520c-3p, hsa-mir-519d-3p, hsa-mir-603, hsa-mir-362-3p, hsa-mir-6778-3p, hsa-mir-8485) with betweenness centrality ≥ 100 (Fig. 4B).
Gene regulatory networks associated with the dysregulated common and hub genes. The figure shows (A) the gene–TF interaction network and (B) the gene–miRNA interaction network. The miRNA and TF interaction networks were filtered with betweenness centrality ≥ 100 and ≥ 45, respectively.
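To make the betweenness filter concrete, the sketch below computes unnormalized betweenness centrality for a small unweighted, undirected network using Brandes' algorithm and then keeps nodes above a cutoff. It assumes that the reported thresholds (≥ 45 and ≥ 100) refer to unnormalized values, which the text does not state; the node names and the toy cutoff of 5 are hypothetical.

```python
from collections import deque


def betweenness(adj):
    """Unnormalized betweenness centrality (Brandes) for an undirected graph."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        stack = []                        # nodes in order of non-decreasing distance
        preds = {v: [] for v in adj}      # predecessors on shortest paths from s
        sigma = {v: 0 for v in adj}       # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            stack.append(v)
            for w in adj[v]:
                if dist[w] < 0:           # first visit
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        while stack:                      # accumulate dependencies back to s
            w = stack.pop()
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted once from each endpoint.
    return {v: c / 2 for v, c in score.items()}


def filter_by_betweenness(adj, cutoff):
    """Return the nodes whose betweenness is at least the cutoff."""
    return {v for v, c in betweenness(adj).items() if c >= cutoff}


# Hypothetical toy regulatory network: TF_X targets four genes, TF_Y targets one.
toy_adj = {
    "TF_X": {"g1", "g2", "g3", "g4"},
    "TF_Y": {"g4"},
    "g1": {"TF_X"},
    "g2": {"TF_X"},
    "g3": {"TF_X"},
    "g4": {"TF_X", "TF_Y"},
}
bc = betweenness(toy_adj)
kept = filter_by_betweenness(toy_adj, cutoff=5)
print(bc["TF_X"], bc["g4"], sorted(kept))
```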
NCBI’s (National Center for Biotechnology Information) conserved domain search tool revealed that several common and hub proteins were found to have multiple domains. In these cases, we selected one domain from each protein based on having a lower E-value, and the proteins that did not contain any functional domains predicted by this tool were excluded from this study. Out of the 26 domains, 3D structures of 8 domains were available in the RCSB PDB (Protein Data Bank) database, 12 of them were modeled through MODELLER 9.22, and 5 structures were modeled using the trROSETTA server due to having less than 40% query coverages. Table 3 contains the domain names, sequence length, e-values, and the method of 3D structure modeling of each domain. Later, all the modeled structures were refined using the GalaxyRefine server. The summary of quality assessment results (Ramachandran plot analysis, ERRAT server, and ProSA-Web analysis) of the refined structure are shown in Supplementary file 6 (Supplementary table 1), while Supplementary file 7 contains the Ramachandran plots of all the modeled structures.
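The selection rules in this paragraph can be summarized in a few lines. The sketch below is a hedged illustration only: it picks the domain hit with the lowest E-value for each protein, drops proteins with no predicted domain, and chooses a structure source using the 40% coverage rule. The data layout, domain names, and function names are hypothetical, and the exact ordering of the decision (experimental structure first, then coverage) is an assumption.

```python
def pick_domain(hits):
    """Return the hit with the lowest E-value, or None when no domain was predicted."""
    return min(hits, key=lambda h: h["evalue"]) if hits else None


def modelling_route(has_pdb_structure, query_coverage):
    """Choose a structure source; query_coverage is a percentage (0-100)."""
    if has_pdb_structure:
        return "RCSB PDB"       # experimental structure already available
    if query_coverage < 40.0:
        return "trRosetta"      # template coverage too low for homology modelling
    return "MODELLER"           # homology modelling


# Hypothetical conserved-domain search results.
cd_hits = {
    "PROT_A": [
        {"domain": "dom_x", "evalue": 1e-30},
        {"domain": "dom_y", "evalue": 1e-5},
    ],
    "PROT_B": [],  # no functional domain predicted, so the protein is excluded
}
selected = {}
for protein, hits in cd_hits.items():
    best = pick_domain(hits)
    if best is not None:
        selected[protein] = best

print(selected)
print(modelling_route(False, 35.0))
```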
Domain-specific protein–protein docking was performed to predict binding affinities and interactions. The docking protocol is shown schematically in Fig. 5. The ClusPro v2.0 server provided up to 30 docked complexes with different poses. From each docking run, the complex with the lowest energy score and a binding pose with functional interactions was selected. The alpha-1 adrenergic receptors subtype D domain-zinc dependent metalloprotease domain complex showed the highest docking energy score of –1300 kcal/mol. The highest number (29) of hydrogen bonds was present in the FOXP coiled-coil domain-PH domain complex. In contrast, the highest number (9) of salt bridges was present in the interacting plane of the core-2/I-branching enzyme-ribosomal protein s6e domain complex. The docking energies and the numbers of hydrogen bonds and salt bridges formed in the complexes are listed in Supplementary file 6 (supplementary table 2).
Schematic representation of the domain-specific molecular docking protocol used in this study. The figure shows docking between (A) common upregulated protein domains, (B) common downregulated protein domains, (C) domains of upregulated hub proteins, and (D) domains of downregulated hub proteins. The black lines represent molecular docking, and the numbers give the possible docking combinations among the domains. From these combinations, a total of 73 molecular dockings were performed to elucidate binding affinities.
We employed molecular dynamics simulation to verify the stability of the docked protein complexes and to identify common drug targets that are strongly associated with significant proteins dysregulated by the pathogenesis of all four bacteria. A total of 7.3 microseconds (µs) of production simulation was run for the 73 complexes. Of these, 20 complexes remained stable, while the others either became unstable several times during the simulation period or were completely dissociated at the end of the simulation. From these analyses, we identified 5 proteins (Fig. 6) that might be targeted for single therapy, as each is associated with significant proteins dysregulated by the pathogenesis of other bacteria. The first identified protein is RAPH1, a common protein that is downregulated in S. Typhimurium and E. coli infections. RAPH1 showed stable interactions with the MCF2L and AP1G2 proteins, which were downregulated in S. dysenteriae-C. jejuni and S. Typhimurium-C. jejuni infections, respectively (Fig. 6A). The second group of drug targets identified was GCNT7 and ADAMTS1, upregulated in E. coli and S. Typhimurium infection, respectively. Both had stable interactions with the HIST1H2BC and RPS6 proteins, which were upregulated in S. dysenteriae and C. jejuni infection (Fig. 6B). GCNT7 also showed a stable association with ADAMTS1 and JAZF1, which are upregulated hub proteins resulting from S. Typhimurium infection. The remaining 2 plausible drug targets were GLI1 and SEC31B, downregulated in the human body in E. coli and C. jejuni infection, respectively (Fig. 6C). GLI1 had a stable association with ADCY2, RAB3IP, TIAM1, and SEC31B, which are downregulated in S. dysenteriae, S. Typhimurium, and C. jejuni infections, respectively, while SEC31B had a stable association with PIK3R1, AAK1, and TAS2R14, which were downregulated in S. dysenteriae, S. Typhimurium, and E. coli pathogenesis (Fig. 6C).
Thus, targeting these 5 proteins with a single therapy might be a promising way to prevent diarrheal disease caused by any of the four bacterial species.
The 5 potential drug targets (shown in dotted circles) for single therapy. These 5 proteins have stable interactions with other proteins that are dysregulated by the pathogenesis of any of the four bacteria. The figure shows (A) the first drug target, RAPH1, which has stable interactions with the MCF2L and AP1G2 proteins that are downregulated in Shigella-Campylobacter and Salmonella-Campylobacter pathogenesis, respectively. (B) The second group of drug targets, GCNT7 and ADAMTS1, which are upregulated in E. coli and Salmonella infections, had stable interactions with the HIST1H2BC and RPS6 proteins that are upregulated in Shigella and Campylobacter infection. GCNT7 was also found to be stable with ADAMTS1 and JAZF1, which are upregulated hub proteins resulting from Salmonella pathogenesis. (C) The third group of drug targets, GLI1 and SEC31B, are downregulated in E. coli and Campylobacter infection. GLI1 showed a stable association with ADCY2, RAB3IP, TIAM1, and SEC31B, which are downregulated in Shigella, Salmonella, and Campylobacter infection, respectively, while SEC31B had a stable association with PIK3R1, AAK1, and TAS2R14, which are downregulated in Salmonella, Shigella, and E. coli infection.
The dynamic behavior of the protein complexes was analyzed by RMSD, atomic distances of the interacting planes, number of hydrogen bonds, Rg, and SASA analysis (Fig. 7).
MD simulation results of protein complexes. The figure shows RMSD analysis (A1, B1, C1), minimum distances in the interacting residues of the complexes (A2, B2, C2), number of hydrogen bonds (A3, B3, C3), Rg analysis (A4, B4, C4), and SASA analysis (A5, B5, C5).
Configuration changes of all protein complexes were analyzed in terms of RMSD over the simulation period. Fig. 7(A1, B1, C1) shows that, apart from some fluctuations in the AnTD-RA_PH, C2/Ibe-RPS6e, and FOXP-PH complexes, the RMSD values of the remaining complexes were quite stable. Although some fluctuations were observed in the C2/Ibe and FOXP bound complexes, they tended to stabilize after 75 ns (Figs. 7B1 and 7C1). We measured the changes in the minimum distances between the residues of the interacting planes of the complexes during the simulation (Fig. 7A2, 7B2, 7C2). Large distances were observed in the RPS6e bound C2/IBe and ZDM complexes as well as in the C2/IBe-ZDM complex (Fig. 7B2). Apart from these, all the complexes showed a small distance (< 0.3 nm) between the interacting residues throughout the simulation. We also calculated the number of hydrogen bonds formed between the domains of the protein complexes during the simulations, as shown in Fig. 7 (A3, B3, C3). Hydrogen bonding is one of the primary components responsible for molecular interactions in any biological system. In the FOXP-Rab11 complex, the highest number of conformations formed up to 30 hydrogen bonds during the simulation (Fig. 7C3). In the rest of the complexes, very few conformations formed fewer than five hydrogen bonds. Further, we measured the radius of gyration (Rg) of the protein complexes as an indicator of their compactness, shown in Fig. 7(A4, B4, C4). It can be inferred that all the complexes retained approximately the same compactness as their starting structures except the AnTD-RA_PH complex, where a higher Rg was observed (Fig. 7A4). Finally, the solvent-accessible surface areas (SASAs) were analyzed to investigate the changes in protein volume upon association of the complexes (Fig. 7A5, 7B5, 7C5). Notably, almost all the complexes showed slightly decreased SASA values compared to the starting point of the simulation.
These decreased SASA values in the protein complexes indicate a relatively more compact state upon association. Supplementary movies 1–15 visualize the 100 ns MD simulations of the 15 complexes related to the 5 identified drug targets (the color codes of the proteins are the same as in Fig. 6). The simulation analysis of the remaining 5 stable complexes and the RMSD analysis of the 53 unstable complexes are shown in supplementary file 6 (supplementary Fig. 4).
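For readers unfamiliar with the trajectory metrics used above, the following minimal sketch shows how RMSD and the radius of gyration are defined for a set of atomic coordinates. It is not the analysis pipeline used in the study: it omits the least-squares superposition that MD analysis tools normally apply before computing RMSD, and the coordinates are hypothetical.

```python
import math


def rmsd(coords_a, coords_b):
    """Root-mean-square deviation between two equally sized coordinate lists.

    No structural superposition is performed here (a simplification).
    """
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    squared = sum(
        (a - b) ** 2
        for p, q in zip(coords_a, coords_b)
        for a, b in zip(p, q)
    )
    return math.sqrt(squared / len(coords_a))


def radius_of_gyration(coords, masses=None):
    """Mass-weighted radius of gyration; unit masses are assumed by default."""
    if masses is None:
        masses = [1.0] * len(coords)
    total = sum(masses)
    center = [
        sum(m * p[i] for m, p in zip(masses, coords)) / total for i in range(3)
    ]
    squared = sum(
        m * sum((p[i] - center[i]) ** 2 for i in range(3))
        for m, p in zip(masses, coords)
    )
    return math.sqrt(squared / total)


# Hypothetical two-atom system (coordinates in nm).
reference = [(0.0, 0.0, 0.0), (2.0, 0.0, 0.0)]
frame = [(0.0, 0.0, 1.0), (2.0, 0.0, 1.0)]  # every atom shifted by 1 nm along z

print(rmsd(reference, frame))
print(radius_of_gyration(reference))
```

A stable RMSD trace means the frame-to-reference deviation stops drifting, and a flat Rg trace means the complex keeps roughly the same compactness.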
We calculated the binding energy over the last 20 ns of the MD production run of the protein complexes (associated with the 5 identified drug targets) at intervals of 50 ps from the MD trajectories using the MM/PBSA method. Further, we used the MmPbSaStat.py script to calculate the average binding free energy and its standard deviation/error from the output files obtained from the g_mmpbsa package (Table 4). The interaction between two proteins is expressed as a binding energy, where a lower binding energy indicates better binding. The final binding energy is the sum of the van der Waals, electrostatic, polar solvation, and SASA energies. The majority of the complexes showed favorable binding energies, which further supports their stability, and the ZDM-RPS6E complex (ADAMTS1-RPS6) showed the lowest binding free energy (−3060.937 kJ/mol) among all the complexes.
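The summation described above can be illustrated with a short sketch that adds the four energy components per snapshot and reports the mean and sample standard deviation. This is not a reimplementation of MmPbSaStat.py or g_mmpbsa: the field names and energy values are hypothetical, and a plain mean and standard deviation are used as a simplification. Sampling 20 ns every 50 ps corresponds to roughly 400 snapshots.

```python
import statistics


def frame_binding_energy(frame):
    """Sum the four MM/PBSA components of one snapshot (kJ/mol)."""
    return frame["vdw"] + frame["elec"] + frame["polar"] + frame["sasa"]


def summarize(frames):
    """Return (mean, sample standard deviation) of per-frame binding energies."""
    totals = [frame_binding_energy(f) for f in frames]
    mean = statistics.fmean(totals)
    sd = statistics.stdev(totals) if len(totals) > 1 else 0.0
    return mean, sd


# Number of sampling intervals in the analysis window (20 ns at 50 ps spacing).
n_intervals = 20_000 // 50

# Hypothetical per-frame component energies in kJ/mol.
toy_frames = [
    {"vdw": -100.0, "elec": -50.0, "polar": 80.0, "sasa": -10.0},
    {"vdw": -110.0, "elec": -60.0, "polar": 90.0, "sasa": -12.0},
]
mean_energy, sd_energy = summarize(toy_frames)
print(n_intervals, mean_energy, round(sd_energy, 3))
```

The polar solvation term is typically unfavorable (positive), so the total is more negative when the van der Waals and electrostatic terms dominate.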
Examining the molecular mechanisms that drive illness onset and progression is generally focused on biomolecules such as genes and proteins, whose abnormal expression contributes to changes in cellular function and, eventually, disease. By concentrating on illness molecular pathways, researchers can uncover crucial events that can be addressed with novel therapy techniques. Identifying these disease-associated targets is thus an important first step in disease mechanism research.
We focused on the causal agents of diarrheal disease in this study, specifically S. Typhimurium, C. jejuni, E. coli, and S. dysenteriae (Fig. 1). After infection by these bacteria, a large number of genes in the human body become dysregulated, either individually or as a group. From four datasets available through the Gene Expression Omnibus database, we found 298 upregulated and 529 downregulated genes (Supplementary files 1–2). The pathways that may play a critical role in disease development were selected. Notably, the majority of these genes and pathways have been linked to cancer (Supplementary File 3–4). Cancer pathways, cytokine-cytokine receptor interactions, chemical carcinogenesis, Rap1 signaling pathway, and nuclear receptors meta-pathway are among the predicted pathways (Table 1 and supplementary file 5). Although diarrhea is a common and often dose-limiting complication of cancer chemotherapy, it is underappreciated and poorly managed20. Diarrhea affects 80 percent of patients with carcinoid syndrome (CS), who experience diarrhea and flushing that necessitate considerable modifications in daily activities and lifestyle21. Proinflammatory cytokines are known to increase in patients with diarrhea-predominant irritable bowel syndrome, which could explain why cytokine-cytokine receptor interactions appear among the predicted pathways22. Ras-associated protein 1 (Rap1) is triggered by various stimuli in the Rap1 signaling pathway. It then recruits several effectors, resulting in its involvement in essential physiological processes such as integrin signalling and ERK activation23. Epac, a family of intracellular cAMP sensors, activates Rap1 by accelerating the conversion of GDP-Rap1 to GTP-Rap1. Active GTP-Rap1, in turn, may have a role in the pathophysiology of secretory diarrhea via the RhoA-Rho-associated kinase (ROCK) pathway24.
As with the pathways, the enriched GO terms include regulation of cell population proliferation, RNA polymerase II cis-regulatory region sequence-specific DNA binding, nervous system development, and cis-regulatory region sequence-specific DNA binding, which are relevant to diarrhea.
Based on the comparative analysis, we identified the common genes among the upregulated and downregulated DEGs (Fig. 2, 3). The highest number of common genes (three) was found among the downregulated DEGs resulting from S. Typhimurium-E. coli and E. coli-C. jejuni infections (Fig. 2). Two PPI networks were built using the DEGs to display their relationships and identify the key disease modulators in diarrhea (supplementary Fig. 1–2). The centrality-lethality rule states that deleting a protein node that is highly connected (a "hub") is more likely to be fatal to an organism than deleting a node that is weakly connected (a "non-hub"). The rule is commonly regarded as reflecting the role of network design in defining network function, since hubs are more important than non-hubs in organizing the global network structure. Hub proteins have eight or more interactions, whereas non-hub proteins have four or fewer interactions25. Because they have many interacting partners within a network, hub proteins are considered functionally significant26. We identified eighteen hub proteins (OLR1, JAZF1, ADAMTS1, PIK3R1, TIAM1, GCNT7, GNB5, CD44, GLI1, TAS2R14, HIST1H2BC, ADCY2, AAK1, RPS6, ADRA1D, RAB3IP, SYNRG, SEC31B) implicated in diarrheal pathogenesis by each of the four bacteria using various approaches (Table 2). TFs also influence the rate of transcription27, and miRNAs are involved in RNA silencing and the regulation of gene expression at the post-transcriptional level28. As a result, both are necessary to understand the progression of a given disease. We identified several TFs, such as NFIC, FOXC1, FOXL1, and ZNF354C, which are known to be involved in DNA-binding transcription factor activity29,30,31,32 (Fig. 4 and supplementary file 8).
The remaining transcription factors, such as YY1, TFAP2A, GATA2, MEF2A, and SREBF1, are implicated in positive and negative regulation of the transcription of several target genes, branchiooculofacial syndrome (BOFS), development and proliferation of hematopoietic and endocrine cell lineages, muscle development, neuronal differentiation, cell growth control, and apoptosis as well as sterol biosynthesis33,34,35,36. Moreover, YY1, GATA2, MEF2A, FOXC1, and SREBF1 were found to be involved in Irritable Bowel Syndrome (IBS)37,38,39,40,41. Among the 24 miRNAs studied, mir-92a-3p, mir-122-5p, mir-143-3p, mir-106b-5p, and mir-6826-3p were found to be linked to lupus erythematosus42, lipoprotein metabolism43, acute ischemic stroke44, chronic thromboembolic pulmonary hypertension45, and neuronal function loss46, respectively (supplementary file 9). The remaining 19 miRNAs were implicated in gastric, cervical, pancreatic, lung, colon, colorectal, thyroid, ovarian, and prostate cancers, hepatocellular carcinoma, osteosarcoma, and testicular germ cell malignancies47,48,49,50. Previous research also showed that mir-15, miR-16, miR-125b, and mir-106b were involved in IBS with diarrhea51,52, while miR-143, miR-145, miR-21, miR-155, miR-92a, miR-122, miR-17, mir-106a, and mir-362-3p were found to be involved in ulcerative colitis and other gastrointestinal tract diseases53,54,55,56,57,58,59.
Identifying common proteins and hub proteins in each of the four protein groups dysregulated by the four bacterial infections prompted us to investigate the possibility of a functional relationship between them, in order to better understand their co-regulation. We identified the domains in each protein and used multiple up-to-date modeling tools to model their 3D structures before subjecting them to molecular docking through an integrated procedure (Figs. 5, 6, Table 3 and supplementary file 7). Among the protein complexes, we found a wide range of binding energies, hydrogen bonds, and salt bridges (supplementary table 1). To confirm the stability of the docked complexes and gain a better understanding of their possible interaction and co-regulation, we performed molecular dynamics simulations on all 73 protein complexes and observed that only 20 of them were stable in the solvated state. This suggests that only 20 of these 73 protein complexes might be co-regulated (either upregulated or downregulated) during diarrhea caused by the four bacteria. Following these investigations, we identified five proteins (RAPH1, GCNT7, ADAMTS1, GLI1, and SEC31B) that have a stable, functional relationship with the other hub proteins resulting from the pathogenesis of the four bacteria (Fig. 6).
The RMSD analysis revealed that all of the complexes were stable and showed reduced changes during the simulation period, based on the interpretation of post-MD data (Fig. 7A1, 7B1, 7C1). Except for the C2/Ibe-RPS6e and C2/IBe-ZDM complexes, measurements of changes in the minimum distances between the residues of interaction planes of the complexes during simulation revealed that the remaining complexes had a distance of less than 0.3 nm between the interacting residues (Fig. 7A2, 7B2, 7C2). The stability of the protein complexes was further validated by a sufficient number of H-bond estimations during MD simulation (Fig. 7A3, 7B3, 7C3). The gyration radius revealed that the protein complexes maintained a consistent level of compactness across time (Fig. 7A4, 7B4, 7C4). The SASA study revealed that the complexes obtained less volume as a result of their interaction, which could be the cause of protein functional alterations (Fig. 7A5, 7B5, 7C5). The MM/PBSA results revealed that the majority of the proteins bind efficiently among themselves, as evidenced by good binding free energies (Table 4). In particular, the ADAMTS1 protein exhibited the most efficient binding energies with the GCNT7, RPS6, and HIST1H2BC proteins, which are upregulated hub proteins resulting from pathogenesis by E. coli, C. jejuni and S. dysenteriae respectively, while the ADAMTS1 protein itself is upregulated by S. Typhimurium (Table 4). These analyses make ADAMTS1 the most plausible therapeutic target of all the five identified proteins for the development of common drugs against diarrhea.
The biological function and activity of a cell are driven by switching on and off gene expression. Conversely, gene transcription is a facilitator of the pathogenic events that drive the evolution and progression of the disease, as well as directing the response to therapy. By comparing gene expression profiles under different disease conditions, individual genes or their corresponding protein products can be identified as therapeutic targets. Our study reports five such proteins, RAPH1, GCNT7, ADAMTS1, GLI1, and SEC31B, which are strong binding partners of other significant proteins and could thus be targeted for the discovery of a common medication system against diarrheal disease.
A step-wise protocol consisting of two approaches was followed to identify the major druggable targets against the four diarrheal pathogens. The workflow is depicted in Fig. 1.
Gene Expression Omnibus (GEO)60 (https://www.ncbi.nlm.nih.gov/geo/) is a public online database maintained by the National Center for Biotechnology Information (NCBI) for high-throughput sequencing, microarray, and hybridization array data. We downloaded four datasets from this database for this study: GSE51043 (GSM1236481-GSM1236489)6 for S. Typhimurium, GSE18810 (GSM466514-GSM466519)61 for E. coli, GSE19315 (GSM479983-GSM479991)62 for S. dysenteriae, and GSE36701 (GSM899034-GSM899254)63 for C. jejuni. The limma64 and DESeq265 packages of R were used to analyze these datasets. We used the False Discovery Rate (FDR) to find the dysregulated genes from the RNA-seq data analysis. The magnitude of the expression change between the control and the sample (bacteria-infected patients) was measured as the log2 fold change. Cutoffs of 2 and −2 were applied to retain the most strongly upregulated and downregulated genes, respectively, which together make up the dysregulated genes. We additionally required a statistically significant P (probability) value (P ≤ 0.05), because a P value larger than 0.05 indicates that no significant difference between the control and the sample was observed.
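The thresholding step described above can be sketched in a few lines. This is a minimal illustration and not the limma/DESeq2 workflow itself: the `GeneStat` record, the `classify_degs` function, and the toy values are hypothetical, and treating both cutoffs as inclusive is an assumption.

```python
from typing import NamedTuple


class GeneStat(NamedTuple):
    gene: str
    log2fc: float   # log2 fold change, infected vs. control
    p_value: float  # P value reported by the differential expression test


def classify_degs(stats, lfc_cutoff=2.0, p_cutoff=0.05):
    """Split genes into up- and downregulated lists using the stated cutoffs."""
    up, down = [], []
    for s in stats:
        if s.p_value > p_cutoff:
            continue  # not statistically significant, so not a DEG
        if s.log2fc >= lfc_cutoff:
            up.append(s.gene)
        elif s.log2fc <= -lfc_cutoff:
            down.append(s.gene)
    return up, down


# Hypothetical toy values, not taken from the GEO datasets.
toy = [
    GeneStat("GENE_A", 2.7, 0.001),
    GeneStat("GENE_B", -3.1, 0.020),
    GeneStat("GENE_C", 2.4, 0.300),  # large change but not significant
    GeneStat("GENE_D", 0.8, 0.004),  # significant but small change
]
up, down = classify_degs(toy)
print("up:", up, "down:", down)
```

Only genes that pass both filters are kept, so GENE_C and GENE_D are dropped in this toy input.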
The signaling pathways and gene ontologies associated with the up- and downregulated genes were predicted using different databases via the Enrichr online enrichment analysis tool66 (https://maayanlab.cloud/Enrichr/). For pathways, we considered KEGG67,68 (2021), while biological process (2021) and molecular function (2021) were evaluated for gene ontologies. Significant pathways were retained using a P value cutoff of 0.05.
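The idea behind filtering enriched terms at a 0.05 cutoff can be illustrated with a one-sided hypergeometric over-representation test. This is a generic sketch of that class of test and not Enrichr's own scoring; the function name, the background size, and the toy counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(universe, term_size, list_size, overlap):
    """P(X >= overlap) when list_size genes are drawn at random from the universe.

    universe  : number of genes in the background
    term_size : number of background genes annotated to the pathway/GO term
    list_size : number of genes in the submitted DEG list
    overlap   : number of DEGs annotated to the term
    """
    upper = min(term_size, list_size)
    total = comb(universe, list_size)
    tail = sum(
        comb(term_size, k) * comb(universe - term_size, list_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts: 5 of 50 DEGs fall in a 100-gene term, background of 20000.
p = hypergeom_enrichment_p(20000, 100, 50, 5)
is_significant = p <= 0.05  # same cutoff as stated in the text
print(p, is_significant)
```

In practice the P values of many terms would also be corrected for multiple testing before applying the cutoff.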
We identified the common dysregulated (both upregulated and downregulated) genes among the four bacteria using the Cytoscape software v3.869 and the InteractiVenn tool (http://www.interactivenn.net/). To generate the network, the name of each bacterial species was set as a source node and the DEGs were set as target nodes. The resulting network showed the genes shared among the four species.
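The Venn-style comparison amounts to intersecting the DEG sets of every group of pathogens. A minimal sketch, assuming each pathogen's DEGs are available as a set; the function name `shared_genes` and the gene labels G1 to G6 are hypothetical placeholders, not genes from the study.

```python
from itertools import combinations


def shared_genes(degs_by_pathogen):
    """Map each group of two or more pathogens to the genes common to all of them."""
    names = sorted(degs_by_pathogen)
    result = {}
    for size in range(2, len(names) + 1):
        for group in combinations(names, size):
            common = set.intersection(*(degs_by_pathogen[n] for n in group))
            if common:  # keep only groups that actually share genes
                result[group] = common
    return result


# Hypothetical toy DEG sets.
toy_degs = {
    "C. jejuni": {"G1", "G2", "G3"},
    "E. coli": {"G2", "G3", "G4"},
    "S. Typhimurium": {"G4", "G5"},
    "S. dysenteriae": {"G6"},
}
shared = shared_genes(toy_degs)
for group, genes in shared.items():
    print(" & ".join(group), "->", sorted(genes))
```

Groups with an empty intersection are omitted, so a result with no four-way entry means no gene is shared by all four pathogens.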
Protein–protein interaction (PPI) of the DEGs was analyzed using the STRING database70 with a confidence score of ≥ 0.4. The organism was specified as H. sapiens, and the generated PPI network was visualized using the Cytoscape software. We also generated separate PPI networks of the up- and downregulated DEGs for each of the four bacteria and determined potential hubs within these networks by applying different local methods using the cytoHubba71 plugin in Cytoscape v3.8. Local methods rank hub proteins based on the relationship between a node and its direct neighbors. In total, four local ranking methods were considered: maximal clique centrality (MCC), maximum neighborhood component (MNC), density of maximum neighborhood component (DMNC), and degree.
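Two of the local measures named above are simple enough to sketch directly: degree counts a node's direct neighbors, and MNC is the size of the largest connected component in the subgraph induced by those neighbors. The code below is an illustrative reimplementation under that definition, not the cytoHubba code; the function names and the toy edge list are hypothetical.

```python
from collections import defaultdict


def build_graph(edges):
    """Build an undirected adjacency map from (a, b) edge pairs."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)


def degree(graph, node):
    """Number of direct interaction partners of the node."""
    return len(graph[node])


def mnc(graph, node):
    """Size of the largest connected component among the node's neighbours."""
    neighbours = graph[node]
    seen, best = set(), 0
    for start in neighbours:
        if start in seen:
            continue
        stack, size = [start], 0
        seen.add(start)
        while stack:  # depth-first walk restricted to the neighbourhood
            cur = stack.pop()
            size += 1
            for nxt in graph[cur] & neighbours:
                if nxt not in seen:
                    seen.add(nxt)
                    stack.append(nxt)
        best = max(best, size)
    return best


# Hypothetical toy PPI network.
g = build_graph([("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")])
ranking = sorted(g, key=lambda n: (-degree(g, n), n))
print(ranking, mnc(g, "A"))
```

Here node A has three neighbors, of which B and C are connected to each other, so its degree is 3 and its MNC is 2.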
Transcription factors (TFs) and microRNAs (miRNAs) are regulatory molecules responsible for significant changes in transcription and expression. Therefore, we used the experimentally verified JASPAR72 and miRTarBase v6.073 datasets to predict TF–gene and miRNA–gene interactions via the NetworkAnalyst v3.074 web tool. Both networks were visualized with Cytoscape v3.8.
Domains are the distinct functional and structural units of a protein; each is responsible for a particular function that contributes to the overall role of the protein75. The domains of the significant hub proteins were predicted using NCBI’s CD-Search tool (https://www.ncbi.nlm.nih.gov/Structure/cdd/wrpsb.cgi)76. The three-dimensional (3D) structures of the domains were then searched for in, and retrieved from, the PDB database (https://www.rcsb.org/). Structures that were not present in the database were modeled using MODELLER 9.2277 and the trRosetta78 server based on query coverage and further refined through the GalaxyRefine79 server. The best structure from MODELLER was chosen using the DOPE and GA341 objective functions, where a higher GA341 and/or lower DOPE score indicates a higher-quality model. Further, the modeled structures were assessed using the PROCHECK80 and ERRAT80 tools from the SAVES 6.0 server and the ProSA-web81 analysis program.
Molecular docking analysis was performed to study the stable domain-domain interactions among all the proteins and related sub-cellular functions. The ClusPro 2.0 protein–protein docking server82 was used for this purpose. In this server, the rigid-body docking phase uses the PIPER docking program83, which relies on the Fast Fourier Transform (FFT)84 correlation approach. PIPER represents the interaction energy between two protein molecules using an expression of the form $E = w_1 E_{rep} + w_2 E_{attr} + w_3 E_{elec} + w_4 E_{DARS}$, where $E_{rep}$ and $E_{attr}$ represent the repulsive and attractive contributions to the van der Waals interaction energy, respectively, $E_{elec}$ is the electrostatic energy, and $E_{DARS}$85 is a pairwise structure-based potential that primarily represents desolvation contributions. The ClusPro 2.0 server can differentiate thousands of conformations of the protein on the basis of different desolvation and electrostatic potentials. Following the docking process, the complex with the lowest binding energy in each case was submitted to the PDBsum86 server (http://www.ebi.ac.uk/thornton-srv/databases/pdbsum/Generate.html) to view the residues involved in the interacting planes.
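The weighted-sum form of the scoring expression, and the selection of the lowest-energy complex, can be illustrated as follows. This is only a sketch of the arithmetic: the function name, the pose labels, the energy terms, and the weights are all hypothetical values and do not correspond to any coefficient set used by PIPER or ClusPro.

```python
def weighted_energy(terms, weights):
    """E = w1*E_rep + w2*E_attr + w3*E_elec + w4*E_DARS for one docked pose."""
    keys = ("rep", "attr", "elec", "dars")
    return sum(weights[k] * terms[k] for k in keys)


# Hypothetical weights and per-pose energy terms (arbitrary units).
weights = {"rep": 0.5, "attr": 1.0, "elec": 2.0, "dars": 1.0}
poses = {
    "pose_1": {"rep": 10.0, "attr": -40.0, "elec": -5.0, "dars": -2.0},
    "pose_2": {"rep": 4.0, "attr": -30.0, "elec": -1.0, "dars": -6.0},
}

scores = {name: weighted_energy(t, weights) for name, t in poses.items()}
best_pose = min(scores, key=scores.get)  # lowest energy is taken as most favorable
print(scores, best_pose)
```

Changing the weights shifts the balance between shape complementarity, electrostatics, and desolvation, which is why different weight sets can favor different poses.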
Molecular dynamics simulation of the docked protein complexes was performed using GROMACS version 5.1.487,88 on Linux 5.4. The GROMOS96 54a789 force field was selected because this parameter set gives the backbone NH and CO groups an enhanced capacity to form hydrogen bonds with each other, which reproduces folding equilibria slightly better and samples more $3_{14}$-helical or hairpin conformations than the previous 53A6 or 45A3 force fields90. The protein complexes were solvated with simple point charge (SPC) water molecules in a rectangular box, and the required number of Na+ and Cl− ions were added to electrically neutralize the simulation system. After setting the salt concentration to 0.15 mol/L, the solvated systems were subjected to energy minimization for 5000 steps using the steepest descent method. Afterward, the systems were equilibrated under an NVT (constant number of particles, volume, and temperature) ensemble and an NPT (constant number of particles, pressure, and temperature) ensemble at 300 K and 1 atm for a duration of 100 picoseconds (ps). Throughout the simulation, V-rescale and Parrinello-Rahman were selected as the thermostat and barostat, respectively. Finally, the production runs of all the protein complexes were performed at 300 K for a duration of 100 ns (nanoseconds) on a GPU (graphics processing unit) accelerated supercomputing system provided by the Bioinformatics Division of the National Institute of Biotechnology (NIB), Bangladesh. Thereafter, to evaluate the stability of the complexes, root mean square deviation (RMSD), root mean square fluctuation (RMSF), the number of hydrogen bonds, the radius of gyration (Rg), atomic distances, and solvent accessible surface area (SASA) were analyzed and plotted using the Qtgrace program.
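Two of the stability measures listed above have compact definitions that are easy to state in code. The sketch below computes RMSD between two frames and a mass-weighted radius of gyration for one frame; it is an illustration of the formulas, not the GROMACS analysis tools, and it deliberately skips the least-squares superposition that is normally applied before RMSD. The function names and coordinates are hypothetical.

```python
from math import sqrt


def rmsd(coords_a, coords_b):
    """Root mean square deviation between two frames (no superposition applied)."""
    if len(coords_a) != len(coords_b):
        raise ValueError("frames must contain the same number of atoms")
    sq = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return sqrt(sq / len(coords_a))


def radius_of_gyration(coords, masses):
    """Mass-weighted radius of gyration about the centre of mass."""
    total = sum(masses)
    centre = tuple(
        sum(m * c[i] for m, c in zip(masses, coords)) / total for i in range(3)
    )
    sq = sum(
        m * sum((c[i] - centre[i]) ** 2 for i in range(3))
        for m, c in zip(masses, coords)
    )
    return sqrt(sq / total)


# Hypothetical two-atom frames (nm) with unit masses.
reference = [(-1.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
shifted = [(-1.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
print(rmsd(reference, shifted), radius_of_gyration(reference, [1.0, 1.0]))
```

A flat RMSD trace and a steady radius of gyration over the trajectory are the kind of signals read as stability and constant compactness.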
Further, to calculate the binding energies by the MM/PBSA (Molecular Mechanics/Poisson–Boltzmann Surface Area) method, the g_mmpbsa91 package for GROMACS was applied after the final MD run to obtain a more detailed overview of the biomolecular interactions between the two domains in every protein complex. The total ΔGbind of each protein–protein complex was determined from the free solvation energy (polar and nonpolar solvation energies) and potential energy (electrostatic and van der Waals interactions). The binding energies were calculated using the following equation:
$$\Delta G_{binding} = G_{complex} - (G_{protein1} + G_{protein2})$$
Here, $\Delta G_{binding}$ = the total binding energy of the protein–protein complex, $G_{complex}$ = the free energy of the complex, $G_{protein1}$ = the free energy of the first protein, and $G_{protein2}$ = the free energy of the second protein.
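The bookkeeping behind an MM/PBSA estimate can be sketched as plain arithmetic. The sketch assumes the usual convention that the binding free energy is the free energy of the complex minus that of the two separate partners, and that the total is the sum of the van der Waals, electrostatic, polar solvation, and nonpolar solvation contributions named in the text. The function names and all numbers are hypothetical; this is not the g_mmpbsa implementation.

```python
def binding_free_energy(g_complex, g_protein1, g_protein2):
    """Delta G_binding = G_complex - (G_protein1 + G_protein2)."""
    return g_complex - (g_protein1 + g_protein2)


def total_from_components(vdw, electrostatic, polar_solvation, nonpolar_solvation):
    """Sum the energy contributions that make up the total binding energy."""
    return vdw + electrostatic + polar_solvation + nonpolar_solvation


# Hypothetical values in kJ/mol.
dg = binding_free_energy(g_complex=-1000.0, g_protein1=-500.0, g_protein2=-375.0)
dg_components = total_from_components(
    vdw=-200.0, electrostatic=-150.0, polar_solvation=250.0, nonpolar_solvation=-25.0
)
print(dg, dg_components)
```

A more negative value indicates more favorable binding; the polar solvation term is typically positive and offsets part of the favorable electrostatic and van der Waals contributions.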
The identification of therapeutic targets is critical for the development of novel medications to treat pathogen-related disorders. According to our findings, five crucial genes from the CESS are likely candidates for common drug discovery against the CESS. The pharmaceutical and scientific communities may be interested in this innovative method through differential gene expression for identifying targets for future therapeutic development research.
All data generated and analyzed during this study are included in this article.
Khatri, P. & Drăghici, S. Ontological analysis of gene expression data: current tools, limitations, and open problems. Bioinformatics 21, 3587–3595 (2005).
Jafari, P. & Azuaje, F. An assessment of recently published gene expression data analyses: reporting experimental design and statistical factors. BMC Med. Informatics Decis. Mak. 6(1), 1–8 (2006).
Dopazo, J., Zanders, E., Dragoni, I., Amphlett, G. & Falciani, F. Methods and approaches in the analysis of gene expression data. J. Immunol. Methods 250, 93–112 (2001).
Wu, Y.-H. et al. Severe acute respiratory syndrome coronavirus (SARS-CoV)-2 infection induces dysregulation of immunity: in silico gene expression analysis. Int. J. Med. Sci. 18, 1143 (2021).
Hodges, K. & Gill, R. Infectious diarrhea: Cellular and molecular mechanisms. Gut Microbes 1, 4 (2010).
Cardenal-Muñoz, E., Gutiérrez, G. & Ramos-Morales, F. Global impact of Salmonella type III secretion effector SteA on host cells. Biochem. Biophys. Res. Commun. 449, 419–424 (2014).
Kenny, B., Abe, A., Stein, M. & Finlay, B. B. Enteropathogenic Escherichia coli protein secretion is induced in response to conditions similar to those in the gastrointestinal tract. Infect. Immun. 65, 2606–2612 (1997).
Hecht, G. Microbes and microbial toxins: paradigms for microbial-mucosal interactions. VII. Enteropathogenic Escherichia coli: physiological alterations from an extracellular position. Am. J. Physiol. 281(1), G1–G7 (2001).
Centers for Disease Control and Prevention. https://www.cdc.gov/.
Das, S. K. et al. Changing emergence of shigella sero-groups in bangladesh: Observation from four different diarrheal disease hospitals. PLoS ONE 8, e62029 (2013).
Wassef, J. S., Keren, D. F. & Mailloux, J. L. Role of M cells in initial antigen uptake and in ulcer formation in the rabbit intestinal loop model of shigellosis. Infect. Immun. 57, 858–863 (1989).
Yee, R. B. & Buffenmyer, C. L. Infection of cultured mouse macrophages with Shigella flexneri. Infect. Immun. 1, 459–463 (1970).
Sansonetti, P. J. et al. Caspase-1 activation of IL-1β and IL-18 are essential for Shigella flexneri–induced inflammation. Immunity 12(5), 581–590 (2000).
Wallis, M. R. The pathogenesis of Campylobacter jejuni. Br. J. Biomed. Sci. 51, 57–64 (1994).
Martínez-Flores, I. et al. In silico clustering of Salmonella global gene expression data reveals novel genes co-regulated with the SPI-1 virulence genes through HilD. Sci. Rep. 6(1), 1–12 (2016).
Metris, A., Reuter, M., Gaskin, D. J., Baranyi, J. & van Vliet, A. H. In vivo and in silico determination of essential genes of Campylobacter jejuni. BMC Genomics 12(1), 1–14 (2011).
Basharat, Z., Jahanzaib, M. & Rahman, N. Therapeutic target identification via differential genome analysis of antibiotic resistant Shigella sonnei and inhibitor evaluation against a selected drug target. Infect. Genet. Evol. 94, 105004 (2021).
Edwards, J. S. & Palsson, B. O. The Escherichia coli MG1655 in silico metabolic genotype: Its definition, characteristics, and capabilities. Proc. Natl. Acad. Sci. 97, 5528–5533 (2000).
Peng, X., Wang, J., Wang, J., Wu, F. X. & Pan, Y. Rechecking the centrality-lethality rule in the scope of protein subcellular localization interaction networks. PLoS ONE 10, e0130743 (2015).
Arnold, R. J. et al. Clinical implications of chemotherapy-induced diarrhea in patients with cancer. J. Support. Oncol. 3, 227–232 (2005).
Naraev, B. G. et al. Management of diarrhea in patients with carcinoid syndrome. Pancreas 48, 961–972 (2019).
Rana, S. V. et al. Pro-inflammatory and anti-inflammatory cytokine response in diarrhoea-predominant irritable bowel syndrome patients. Trop. Gastroenterol. 33, 251–256 (2012).
Fox, K. et al. Ivabradine in stable coronary artery disease without clinical heart failure. N. Engl. J. Med. 371, 1091–1099 (2014).
Sheikh, I. A., Koley, H., Chakrabarti, M. K. & Hoque, K. M. The Epac1 signaling pathway regulates Cl− secretion via modulation of apical KCNN4c channels in Diarrhea. J. Biol. Chem. 288, 20404–20415 (2013).
Ekman, D., Light, S., Björklund, Å. K. & Elofsson, A. What properties characterize the hub proteins of the protein-protein interaction network of Saccharomyces cerevisiae? Genome Biol. 7 (2006).
He, X. & Zhang, J. Why do hubs tend to be essential in protein networks?. PLOS Genet. 2, e88 (2006).
Latchman, D. S. Transcription factors: An overview. Int. J. Biochem. Cell Biol. 29, 1305–1312 (1997).
Ambros, V. The functions of animal microRNAs. Nature 431, 350–355 (2004).
Bailey, T. L. & Machanick, P. Inferring direct DNA binding from ChIP-seq. Nucleic Acids Res. 40, e128–e128 (2012).
Saleem, R. A., Banerjee-Basu, S., Berry, F. B., Baxevanis, A. D. & Walter, M. A. Analyses of the effects that disease-causing missense mutations have on the structure and function of the winged-helix protein FOXC1. Am. J. Hum. Genet. 68, 627–641 (2001).
Dutton Sackett, S., Kaestner, K. H. & Advisor Jonathan Raper, D. A. The winged helix transcription factor Foxll in proliferation and homeostasis of the gastrointestinal tract and liver. (2008).
Kubosaki, A. et al. Genome-wide investigation of in vivo EGR-1 binding sites in monocytic differentiation. Genome Biol. 10(4), 1–14 (2009).
Nguyen, N., Zhang, X., Olashaw, N. & Seto, E. Molecular cloning and functional characterization of the transcription factor YY2 *. J. Biol. Chem. 279, 25927–25934 (2004).
Milunsky, J. M. et al. TFAP2A mutations result in branchio-oculo-facial syndrome. Am. J. Hum. Genet. 82, 1171–1177 (2008).
Lin, X., Shah, S. & Bulleit, R. F. The expression of MEF2 genes is implicated in CNS neuronal differentiation. Mol. Brain Res. 42, 307–316 (1996).
Wang, H. et al. Mutations in SREBF1, encoding sterol regulatory element binding transcription factor 1, cause autosomal-dominant IFAP syndrome. Am. J. Hum. Genet. 107, 34–45 (2020).
Kumar, N. et al. A YY1-dependent increase in aerobic metabolism is indispensable for intestinal organogenesis. Development 143, 3711 (2016).
Cohen, J. I. et al. Association of GATA2 deficiency with severe primary Epstein-Barr virus (EBV) infection and EBV-associated cancers. Clin. Infect. Dis. 63, 41 (2016).
Wang, L., Fan, C., Topol, S. E., Topol, E. J. & Wang, Q. Mutation of MEF2A in an Inherited Disorder with Features of Coronary Artery Disease.
Su, D. N., Wu, S. P., Chen, H. T. & He, J. H. HOTAIR, a long non-coding RNA driver of malignancy whose expression is activated by FOXC1, negatively regulates miRNA-1 in hepatocellular carcinoma. Oncol. Lett. 12, 4061–4067 (2016).
Mahurkar-Joshi, S. et al. The colonic mucosal MicroRNAs, MicroRNA-219a-5p, and MicroRNA-338-3p are downregulated in irritable bowel syndrome and are associated with barrier function and MAPK signaling. Gastroenterology 160, 2409-2422.e19 (2021).
Kim, B.-S., Jung, J.-Y., Jeon, J.-Y., Kim, H.-A. & Suh, C.-H. Circulating hsa-miR-30e-5p, hsa-miR-92a-3p, and hsa-miR-223-3p may be novel biomarkers in systemic lupus erythematosus. HLA 88, 187–193 (2016).
Raitoharju, E. et al. Blood hsa-miR-122–5p and hsa-miR-885–5p levels associate with fatty liver and related lipoprotein metabolism—The Young Finns Study. Sci. Rep. 6(1), 1–13 (2016).
Tiedt, S. et al. RNA-Seq Identifies Circulating miR-125a-5p, miR-125b-5p, and miR-143-3p as Potential Biomarkers for Acute Ischemic Stroke. Circ. Res. 121, 970–980 (2017).
Miao, R. et al. hsa-miR-106b-5p participates in the development of chronic thromboembolic pulmonary hypertension via targeting matrix metalloproteinase 2. Pulmonary Circ. 10(3), 2045894020928300 (2020).
Yoshino, Y., Roy, B. & Dwivedi, Y. Altered miRNA landscape of the anterior cingulate cortex is associated with potential loss of key neuronal functions in depressed brain. Eur. Neuropsychopharmacol. 40, 70–84 (2020).
Xu, J., Zhang, J., Shan, F., Wen, J. & Wang, Y. SSTR5-AS1 functions as a ceRNA to regulate CA2 by sponging miR-15b-5p for the development and prognosis of HBV-related hepatocellular carcinoma. Mol. Med. Rep. 20, 5021–5031 (2019).
Ulivi, P. et al. Circulating plasma levels of miR-20b, miR-29b and miR-155 as predictors of bevacizumab efficacy in patients with metastatic colorectal cancer. Int. J. Mol. Sci. 19, 307 (2018).
Plaza, X. R. et al. miR-371a-3p, miR-373–3p and miR-367–3p as serum biomarkers in metastatic testicular germ cell cancers before, during and after chemotherapy. Cells 8, 1221 (2019).
Liang, H. et al. The PTTG1-targeting miRNAs miR-329, miR-300, miR-381, and miR-655 inhibit pituitary tumor cell tumorigenesis and are involved in a p53/PTTG1 regulation feedback loop. Oncotarget 6, 29413 (2015).
Martínez, C. et al. miR-16 and miR-125b are involved in barrier function dysregulation through the modulation of claudin-2 and cingulin expression in the jejunum in IBS with diarrhoea. Gut 66, 1597–1610 (2017).
Tao, W. et al. Elevated Circulating hsa-miR-106b, hsa-miR-26a, and hsa-miR-29b in Type 2 Diabetes Mellitus with Diarrhea-Predominant Irritable Bowel Syndrome. Gastroenterol. Res. Pract. 2016, (2016).
Pekow, J. R. et al. miR-143 and miR-145 are down-regulated in ulcerative colitis: putative regulators of inflammation and protooncogenes. Inflamm. Bowel Dis. 18, 94 (2012).
Yan, H., Zhang, X. & Xu, Y. Aberrant expression of miR-21 in patients with inflammatory bowel disease A protocol for systematic review and meta analysis. Med. United States 99, e19693 (2020).
Wan, J., Xia, L., Xu, W. & Lu, N. Expression and function of miR-155 in diseases of the gastrointestinal tract. Int. J. Mol. Sci. 17(5), 709 (2016).
Hassan, E. A., El-Din Abd El-Rehim, A. S., Mohammed Kholef, E. F. & Elsewify, W. A. E. Potential role of plasma miR-21 and miR-92a in distinguishing between irritable bowel syndrome, ulcerative colitis, and colorectal cancer. Gastroenterol. Hepatol. From Bed to Bench 13, 147 (2020).
Chen, Y. et al. miR-122 targets NOD2 to decrease intestinal epithelial cell injury in Crohn’s disease. Biochem. Biophys. Res. Commun. 438, 133–139 (2013).
Mogilyansky, E. & Rigoutsos, I. The miR-17/92 cluster: a comprehensive update on its genomics, genetics, functions and increasingly important and numerous roles in health and disease. Cell Death Differ. 20, 1603–1614 (2013).
Omidbakhsh, A., Saeedi, M., Khoshnia, M., Marjani, A. & Hakimi, S. Micro-RNAs -106a and -362-3p in peripheral blood of inflammatory bowel disease patients. Open Biochem. J. 12, 78 (2018).
Edgar, R., Domrachev, M. & Lash, A. E. Gene expression omnibus: NCBI gene expression and hybridization array data repository. Nucleic Acids Res. 30, 207–210 (2002).
GEO Accession viewer. https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE18810.
Leyva-Illades, D., Cherla, R. P., Galindo, C. L., Chopra, A. K. & Tesh, V. L. Global transcriptional response of macrophage-like THP-1 cells to Shiga toxin type 1. Infect. Immun. 78, 2454–2465 (2010).
Swan, C. et al. Identifying and testing candidate genetic polymorphisms in the irritable bowel syndrome (IBS): association with TNFSF15 and TNFα. Gut 62, 985–994 (2013).
Ritchie, M. E. et al. limma powers differential expression analyses for RNA-sequencing and microarray studies. Nucleic Acids Res. 43, e47–e47 (2015).
Zhang, Z. H. et al. A comparative study of techniques for differential expression analysis on RNA-Seq data. PLoS ONE 9, e103207 (2014).
Kuleshov, M. V. et al. Enrichr: A comprehensive gene set enrichment analysis web server 2016 update. Nucleic Acids Res. 44, W90 (2016).
Kanehisa, M. & Goto, S. KEGG: kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 28, 27–30 (2000).
Kanehisa, M., Furumichi, M., Tanabe, M., Sato, Y. & Morishima, K. KEGG: new perspectives on genomes, pathways, diseases and drugs. Nucleic Acids Res. 45, D353 (2017).
Shannon, P. et al. Cytoscape: A software environment for integrated models of biomolecular interaction networks. Genome Res. 13, 2498 (2003).
Szklarczyk, D. et al. STRING v11: protein–protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res. 47, D607–D613 (2019).
Chin, C.-H. et al. cytoHubba: Identifying hub objects and sub-networks from complex interactome. BMC Syst. Biol. 8(4), 1–7 (2014).
Sandelin, A., Alkema, W., Engström, P., Wasserman, W. W. & Lenhard, B. JASPAR: An open-access database for eukaryotic transcription factor binding profiles. Nucleic Acids Res. 32, D91 (2004).
Huang, H.-Y. et al. miRTarBase 2020: Updates to the experimentally validated microRNA–target interaction database. Nucleic Acids Res. 48, D148–D154 (2020).
Xia, J., Gill, E. E. & Hancock, R. E. W. NetworkAnalyst for statistical, visual and network-based meta-analysis of gene expression data. Nat. Protoc. 10(6), 823–844 (2015).
Bagowski, C. P., Bruins, W. & te Velthuis, A. J. The nature of protein domain evolution: Shaping the interaction network. Curr. Genomics 11, 368 (2010).
Marchler-Bauer, A. & Bryant, S. H. CD-Search: Protein domain annotations on the fly. Nucleic Acids Res. 32, W327 (2004).
Eswar, N. et al. Comparative protein structure modeling using modeller. Curr. Protocols Bioinform. 15(1), 5–6 (2006).
Yang, J. et al. Improved protein structure prediction using predicted interresidue orientations. Proc. Natl. Acad. Sci. 117, 1496–1503 (2020).
Lee, G. R., Won, J., Heo, L. & Seok, C. GalaxyRefine2: Simultaneous refinement of inaccurate local regions and overall protein structure. Nucleic Acids Res. 47, W451–W455 (2019).
Laskowski, R. A., MacArthur, M. W., Moss, D. S. & Thornton, J. M. PROCHECK: A program to check the stereochemical quality of protein structures. J. Appl. Crystallogr. 26, 283–291 (1993).
Wiederstein, M. & Sippl, M. J. ProSA-web: interactive web service for the recognition of errors in three-dimensional structures of proteins. Nucleic Acids Res. 35, W407–W410 (2007).
Kozakov, D. et al. The ClusPro web server for protein-protein docking. Nat. Protoc. 12, 255 (2017).
Kozakov, D., Brenke, R., Comeau, S. R. & Vajda, S. PIPER: an FFT-based protein docking program with pairwise potentials. Proteins Struct. Funct. Bioinform. 65(2), 392–406 (2006).
Katchalski-Katzir, E. et al. Molecular surface recognition: determination of geometric fit between proteins and their ligands by correlation techniques. Proc. Natl. Acad. Sci. 89(6), 2195–2199 (1992).
Chuang, G. Y., Kozakov, D., Brenke, R., Comeau, S. R. & Vajda, S. DARS (decoys as the reference state) potentials for protein-protein docking. Biophys. J. 95(9), 4217–4227 (2008).
Laskowski, R. A., Jabłońska, J., Pravda, L., Vařeková, R. S. & Thornton, J. M. PDBsum: Structural summaries of PDB entries. Protein Sci. 27, 129–134 (2018).
GROMACS: A parallel computer for molecular-dynamics simulations. University of Groningen research portal. https://research.rug.nl/en/publications/gromacs-a-parallel-computer-for-molecular-dynamics-simulations.
Abraham, M. J. et al. GROMACS: High performance molecular simulations through multi-level parallelism from laptops to supercomputers. SoftwareX 1–2, 19–25 (2015).
Huang, W., Lin, Z. & van Gunsteren, W. F. Validation of the GROMOS 54A7 force field with respect to β-peptide folding. J. Chem. Theory Comput. 7(5), 1237–1243 (2011).
Schmid, N. et al. Definition and testing of the GROMOS force-field versions 54A7 and 54B7. Eur. Biophys. J. 40(7), 843–856 (2011).
Kumari, R., Kumar, R., Consortium, O. S. D. D. & Lynn, A. g_mmpbsa—A GROMACS tool for high-throughput MM-PBSA calculations. J. Chem. Inf. Model. 54, 1951–1962 (2014).
The author(s) acknowledge the Department of Pharmacology, University of Oxford, United Kingdom (UK), for their extended support during this study.
These authors contributed equally: Mohammad Uzzal Hossain, Nadim Ferdous, Mahjerin Nasrin Reza and Ishtiaque Ahammad.
Department of Pharmacology, Medical Sciences Division, University of Oxford, Oxford, OX13QT, UK
Mohammad Uzzal Hossain, Zachary Tiernan & Yi Wang
Bioinformatics Division, National Institute of Biotechnology, Ganakbari, Ashulia, Savar, Dhaka, 1349, Bangladesh
Mohammad Uzzal Hossain & Ishtiaque Ahammad
Department of Biotechnology and Genetic Engineering, Mawlana Bhashani Science and Technology University, Santosh, Tangail, 1902, Bangladesh
Nadim Ferdous, Mahjerin Nasrin Reza & A. K. M. Mohiuddin
Mathematical Institute, University of Oxford, Oxford, OX2 6GG, UK
Department of Chemistry, University of Oxford, Oxford, OX2 6GG, UK
Department of Microbiology, Jagannath University, Dhaka, 1100, Bangladesh
Molecular Biotechnology Division, Ministry of Science and Technology, National Institute of Biotechnology, Ganakbari, Ashulia, Savar, Dhaka, 1349, Bangladesh
Keshob Chandra Das & Md. Salimullah
Department of Biochemistry and Microbiology, North South University, Dhaka, 1229, Bangladesh
M.S., K.C.D., C.A.K., A.K.M.M., M.U.H., and I.A., conceptualized the work. M.U.H., I.A., N.F., M.N.R., Z.T., B.L., F.O’., W.W., and S.S. performed the formal analysis. M.S., K.C.D., C.A.K., A.K.M.M., and M.U.H. investigated and supervised the work. M.U.H., I.A., N.F., and M.N.R. wrote the original draft. All authors reviewed the manuscript.
The authors declare no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
Hossain, M.U., Ferdous, N., Reza, M.N. et al. Pathogen-driven gene expression patterns lead to a novel approach to the identification of common therapeutic targets. Sci Rep 12, 21070 (2022). https://doi.org/10.1038/s41598-022-25102-8
DOI: https://doi.org/10.1038/s41598-022-25102-8
Scientific Reports (Sci Rep) ISSN 2045-2322 (online)