Discovering Pirin's Molecular Interactions Using the HuRI and STRING Databases

        This week I will be using two databases to discover the molecular interactions between pirin and other proteins in the human body. The first is the Human Reference Protein Interactome Mapping Project (HuRI), a project by the Center for Cancer Systems Biology at Dana-Farber Cancer Institute that attempts to better understand how proteins within the human body interact with one another (Luck, Kim, Lambourne, et al., 2020). The second is STRING, a broader database that collects and aggregates protein-protein interactions from "24,584,628 proteins in 5,090 organisms" (Szklarczyk, Gable, Nastou, et al., 2021).


        Or, perhaps, this would be the case if HuRI's protein database contained data on pirin, which is, unfortunately, one of the proteins absent from the database. This is a shame, as the Interactome database was exceptionally intuitive and had a user interface that made navigating it self-evident. While the STRING database relayed a substantial amount of information to me in my research, I found that navigating the website required a lot more trial and error. However, it should be noted that in terms of raw data the STRING database was, for the purposes of my research, the quintessential catalog of protein-protein interactions. STRING returned the following 10 proteins as interacting with pirin in the human body, with all descriptions for these proteins on the STRING website being pulled from the UniProt database used in my previous blog post, A Summary of Pirin (The UniProt Consortium, 2021).

 

PIWIL1: "Piwi-like protein 1; endoribonuclease that plays a central role in postnatal germ cells by repressing transposable elements and preventing their mobilization, which is essential for the germline integrity. Acts via the piRNA metabolic process, which mediates the repression of transposable elements during meiosis by forming complexes composed of piRNAs and Piwi proteins and governs the methylation and subsequent repression of transposons. Directly binds methylated piRNAs, a class of 24 to 30 nucleotide RNAs that are generated by a Dicer-independent mechanism and are primarily derived from transposons and other repeated sequence elements. Strongly prefers a uridine in the first position of their guide (g1U preference, also named 1U-bias). Not involved in the piRNA amplification loop, also named ping-pong amplification cycle. Acts as an endoribonuclease that cleaves transposon messenger RNAs. Besides their function in transposable elements repression, piRNAs are probably involved in other processes during meiosis such as translation regulation. Probable component of some RISC complex, which mediates RNA cleavage and translational silencing. Also plays a role in the formation of chromatoid bodies and is required for some miRNAs stability. Required to sequester RNF8 in the cytoplasm until late spermatogenesis; RNF8 being released upon ubiquitination and degradation of PIWIL1."

 

ABL2: "Abelson tyrosine-protein kinase 2; non-receptor tyrosine-protein kinase that plays an ABL1-overlapping role in key processes linked to cell growth and survival such as cytoskeleton remodeling in response to extracellular stimuli, cell motility and adhesion and receptor endocytosis. Coordinates actin remodeling through tyrosine phosphorylation of proteins controlling cytoskeleton dynamics like MYH10 (involved in movement); CTTN (involved in signaling); or TUBA1 and TUBB (microtubule subunits). Binds directly F-actin and regulates actin cytoskeletal structure through its F-actin-bundling activity. Involved in the regulation of cell adhesion and motility through phosphorylation of key regulators of these processes such as CRK, CRKL, DOK1 or ARHGAP35. Adhesion-dependent phosphorylation of ARHGAP35 promotes its association with RASA1, resulting in recruitment of ARHGAP35 to the cell periphery where it inhibits RHO. Phosphorylates multiple receptor tyrosine kinases like PDGFRB and other substrates which are involved in endocytosis regulation such as RIN1. In brain, may regulate neurotransmission by phosphorylating proteins at the synapse. ABL2 acts also as a regulator of multiple pathological signaling cascades during infection. Pathogens can highjack ABL2 kinase signaling to reorganize the host actin cytoskeleton for multiple purposes, like facilitating intracellular movement and host cell exit. Finally, functions as its own regulator through autocatalytic activity as well as through phosphorylation of its inhibitor, ABI1." 

 

PSMA2: "Proteasome subunit alpha type-7; component of the 20S core proteasome complex involved in the proteolytic degradation of most intracellular proteins. This complex plays numerous essential roles within the cell by associating with different regulatory particles. Associated with two 19S regulatory particles, forms the 26S proteasome and thus participates in the ATP-dependent degradation of ubiquitinated proteins. The 26S proteasome plays a key role in the maintenance of protein homeostasis by removing misfolded or damaged proteins that could impair cellular functions, and by removing proteins whose functions are no longer required. Associated with the PA200 or PA28, the 20S proteasome mediates ubiquitin-independent protein degradation. This type of proteolysis is required in several pathways including spermatogenesis (20S-PA200 complex) or generation of a subset of MHC class I-presented antigenic peptides (20S-PA28 complex). Inhibits the transactivation function of HIF-1A under both normoxic and hypoxia-mimicking conditions. The interaction with EMAP2 increases the proteasome-mediated HIF-1A degradation under the hypoxic conditions. Plays a role in hepatitis C virus internal ribosome entry site-mediated translation. Mediates nuclear translocation of the androgen receptor (AR) and thereby enhances androgen-mediated transactivation. Promotes MAVS degradation and thereby negatively regulates MAVS-mediated innate immune response." 

 

RAC1: "Ras-related C3 botulinum toxin substrate 1; plasma membrane-associated small GTPase which cycles between active GTP-bound and inactive GDP-bound states. In its active state, binds to a variety of effector proteins to regulate cellular responses such as secretory processes, phagocytosis of apoptotic cells, epithelial cell polarization, neurons adhesion, migration and differentiation, and growth-factor induced formation of membrane ruffles.

Rac1 p21/rho GDI heterodimer is the active component of the cytosolic factor sigma 1, which is involved in stimulation of the NADPH oxidase activity in macrophages. Essential for the SPATA13-mediated regulation of cell migration and adhesion assembly and disassembly. Stimulates PKN2 kinase activity. 

In concert with RAB7A, plays a role in regulating the formation of RBs (ruffled borders) in osteoclasts. 

In podocytes, promotes nuclear shuttling of NR3C2; this modulation is required for a proper kidney functioning. Required for atypical chemokine receptor ACKR2-induced LIMK1-PAK1-dependent phosphorylation of cofilin (CFL1) and for up-regulation of ACKR2 from endosomal compartment to cell membrane, increasing its efficiency in chemokine uptake and degradation. In neurons, is involved in dendritic spine formation and synaptic plasticity (By similarity). 

In hippocampal neurons, involved in spine morphogenesis and synapse formation, through local activation at synapses by guanine nucleotide exchange factors (GEFs), such as ARHGEF6/ARHGEF7/PIX.

In synapses, seems to mediate the regulation of F-actin cluster formation performed by SHANK3. In neurons, plays a crucial role in regulating GABA(A) receptor synaptic stability and hence GABAergic inhibitory synaptic transmission through its role in PAK1 activation and eventually F-actin stabilization (By similarity)." 

 

NCKAP1: "Nck-associated protein 1; part of the WAVE complex that regulates lamellipodia formation. The WAVE complex regulates actin filament reorganization via its interaction with the Arp2/3 complex. Actin remodeling activity is regulated by RAC1. As component of the WAVE1 complex, required for BDNF-NTRK2 endocytic trafficking and signaling from early endosomes; Belongs to the HEM-1/HEM-2 family."


NCK1: "Cytoplasmic protein NCK1; adapter protein which associates with tyrosine-phosphorylated growth factor receptors, such as KDR and PDGFRB, or their cellular substrates. Maintains low levels of EIF2S1 phosphorylation by promoting its dephosphorylation by PP1. Plays a role in the DNA damage response, not in the detection of the damage by ATM/ATR, but for efficient activation of downstream effectors, such as that of CHEK2. Plays a role in ELK1-dependent transcriptional activation in response to activated Ras signaling. Modulates the activation of EIF2AK2/PKR by dsRNA. May play a role in cell adhesion and migration through interaction with ephrin receptors."


WASF2: "Wiskott-Aldrich syndrome protein family member 2; downstream effector molecule involved in the transmission of signals from tyrosine kinase receptors and small GTPases to the actin cytoskeleton. Promotes formation of actin filaments. Part of the WAVE complex that regulates lamellipodia formation. The WAVE complex regulates actin filament reorganization via its interaction with the Arp2/3 complex; Wiskott-Aldrich Syndrome protein family." 


CTF1: "Cardiotrophin-1; induces cardiac myocyte hypertrophy in vitro. Binds to and activates the ILST/gp130 receptor; Interleukin 6 type cytokine family."


PIWIL2: "Piwi-like protein 2; endoribonuclease that plays a central role during spermatogenesis by repressing transposable elements and preventing their mobilization, which is essential for the germline integrity (By similarity).

Plays an essential role in meiotic differentiation of spermatocytes, germ cell differentiation and in self-renewal of spermatogonial stem cells (By similarity).

Acts via the piRNA metabolic process, which mediates the repression of transposable elements during meiosis by forming complexes composed of piRNAs and Piwi proteins and govern the methylation and subsequent repression of transposons (By similarity). 

During piRNA biosynthesis, plays a key role in the piRNA amplification loop, also named ping-pong amplification cycle, by acting as a 'slicer-competent' piRNA endoribonuclease that cleaves primary piRNAs, which are then loaded onto 'slicer-incompetent' PIWIL4 (By similarity). 

PIWIL2 slicing produces a pre-miRNA intermediate, which is then processed in mature piRNAs, and as well as a 16 nucleotide by-product that is degraded (By similarity). 

Required for PIWIL4/MIWI2 nuclear localization and association with secondary piRNAs antisense (By similarity). 

Besides their function in transposable elements repression, piRNAs are probably involved in other processes during meiosis such as translation regulation (By similarity). 

Indirectly modulates expression of genes such as PDGFRB, SLC2A1, ITGA6, GJA7, THY1, CD9 and STRA8 (By similarity). 

When overexpressed, acts as an oncogene by inhibition of apoptosis and promotion of proliferation in tumors.

Represses circadian rhythms by promoting the stability and activity of core clock components ARNTL/BMAL1 and CLOCK by inhibiting GSK3B-mediated phosphorylation and ubiquitination-dependent degradation of these proteins."


PIWIL4: "Piwi-like protein 4; plays a central role during spermatogenesis by repressing transposable elements and preventing their mobilization, which is essential for the germline integrity (By similarity).

Acts via the piRNA metabolic process, which mediates the repression of transposable elements during meiosis by forming complexes composed of piRNAs and Piwi proteins and governs the methylation and subsequent repression of transposons (By similarity). 

Directly binds piRNAs, a class of 24 to 30 nucleotide RNAs that are generated by a Dicer-independent mechanism and are primarily derived from transposons and other repeated sequence elements (By similarity). 

Associates with secondary piRNAs antisense and PIWIL2/MILI is required for such association (By similarity). 

The piRNA process acts upstream of known mediators of DNA methylation (By similarity). 

Does not show endonuclease activity (By similarity). 

Plays a key role in the piRNA amplification loop, also named ping-pong amplification cycle, by acting as a 'slicer-incompetent' component that loads cleaved piRNAs from the 'slicer-competent' component PIWIL2 and target them on genomic transposon loci in the nucleus (By similarity). 

May be involved in the chromatin-modifying pathway by inducing 'Lys-9' methylation of histone H3 at some loci. 

In addition to its role in germline, PIWIL4 also plays a role in the regulation of somatic cells activities. Plays a role in pancreatic beta cell function and insulin secretion (By similarity). 

Involved in maintaining cell morphology and functional integrity of retinal epithelial through Akt/GSK3alpha/beta signaling pathway.

When overexpressed, acts as an oncogene by inhibition of apoptosis and promotion of cells proliferation in tumors." 

 

        Perhaps the most striking and immediate information relayed by the STRING database on pirin interactions is the lack of experimentally validated results. Since starting this blog 2 weeks ago I have been aware of pirin's status as a still-novel protein; however, I had not appreciated how little experimentation has been done on its interactions with other proteins. This is not to say that the results found on the STRING database are without any substantiation; while pirin's interactions with other proteins are still only predicted by the program, the basis for many of these inferences is found in highly respected, curated databases such as RefSeq and UniProt. However, the proteins PIWIL4, PIWIL2, PIWIL1, and CTF1 were listed as pirin interactors by STRING only because pirin and one of the aforementioned proteins were mentioned together in several PubMed abstracts.
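
        For anyone who would rather check this evidence breakdown programmatically than click through the website, here is a minimal Python sketch. It assumes STRING's public REST endpoint for interaction partners and the per-channel score fields (nscore, fscore, pscore, ascore, escore, dscore, tscore) as I understand them from STRING's API documentation, so verify them against the current docs before relying on this. The two records below are made-up placeholders, not real STRING output for pirin, and the helper names are my own.

```python
from urllib.parse import urlencode

# Assumed endpoint, per my reading of STRING's API documentation.
STRING_API = "https://string-db.org/api/json/interaction_partners"

# Per-channel score fields assumed to appear in each JSON record.
CHANNELS = {
    "nscore": "gene neighborhood",
    "fscore": "gene fusion",
    "pscore": "phylogenetic profile",
    "ascore": "coexpression",
    "escore": "experimental",
    "dscore": "curated database",
    "tscore": "text mining",
}


def partners_url(identifier, species=9606, limit=10):
    """Build the request URL (9606 is the NCBI taxon ID for human)."""
    query = urlencode({"identifiers": identifier, "species": species, "limit": limit})
    return f"{STRING_API}?{query}"


def supporting_channels(record):
    """Return the labels of every evidence channel with a non-zero score."""
    return [label for key, label in CHANNELS.items() if float(record.get(key, 0)) > 0]


def text_mining_only(records):
    """Names of partners whose only supporting evidence is text mining."""
    return [
        r["preferredName_B"]
        for r in records
        if supporting_channels(r) == ["text mining"]
    ]


# Hypothetical records shaped like the assumed JSON output (NOT real data).
example_records = [
    {"preferredName_A": "PIR", "preferredName_B": "PARTNER_A",
     "nscore": 0, "fscore": 0, "pscore": 0, "ascore": 0,
     "escore": 0, "dscore": 0, "tscore": 0.52},
    {"preferredName_A": "PIR", "preferredName_B": "PARTNER_B",
     "nscore": 0, "fscore": 0, "pscore": 0, "ascore": 0,
     "escore": 0, "dscore": 0.5, "tscore": 0.3},
]

print(partners_url("PIR"))
print(text_mining_only(example_records))
```

        Fetching the URL and passing the decoded JSON list to text_mining_only would, under these assumptions, single out the partners that rest on co-mentions alone.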


        This brings us to the limitations of the STRING tool, which are most pronounced when there is an absence of experimental data, as was the case with pirin. When using tools such as STRING it is important to remember that the data involved is compiled by a computer, and is thus subject to mistakes that a human might otherwise avoid. From what I can gather, STRING uses a program known as a web crawler to search for co-mentions of protein names, as well as mentions of those proteins' homologs and orthologs, in curated databases and PubMed abstracts. A web crawler essentially functions by routinely monitoring a defined set of websites and cataloguing the information on these websites for future use. While some web crawlers will systematically gather all data available, more sophisticated programs can perform text mining, which is the computational parsing of text into statistically significant groupings and models. Essentially, STRING will scan enormous amounts of scientific literature and try to determine which proteins interact with one another based upon how and where they are mentioned together, as well as how homologs and orthologs of those proteins are said to interact (Szklarczyk, Gable, Nastou, et al., 2021).
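
        To make the idea of a co-mention concrete, here is a toy Python sketch that counts how often pairs of protein names appear in the same abstract. This is only an illustration of the general idea and not STRING's actual method, which as far as I can tell involves far more careful name recognition and statistical scoring. The protein names and abstracts are invented for the example.

```python
import re
from collections import Counter
from itertools import combinations


def co_mentions(abstracts, names):
    """Count, for each pair of names, the abstracts that mention both."""
    # Whole-word, case-insensitive match for each name.
    patterns = {
        name: re.compile(rf"\b{re.escape(name)}\b", re.IGNORECASE)
        for name in names
    }
    counts = Counter()
    for text in abstracts:
        # Names present in this abstract, sorted so pairs have a stable order.
        found = sorted(name for name, pat in patterns.items() if pat.search(text))
        counts.update(combinations(found, 2))
    return counts


# Made-up abstracts and placeholder protein names (not real literature).
abstracts = [
    "Protein ALPHA1 was detected alongside BETA2 in the nucleus.",
    "BETA2 expression was unchanged; alpha1 levels rose.",
    "GAMMA3 is discussed without reference to other proteins.",
    "ALPHA1 and GAMMA3 were both mentioned in this made-up abstract.",
]
names = ["ALPHA1", "BETA2", "GAMMA3"]

counts = co_mentions(abstracts, names)
for pair, n in counts.most_common():
    print(pair, n)
```

        A raw count like this says nothing about whether two proteins physically interact, which is exactly why co-mention evidence on its own deserves caution.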


        STRING is not open source, so it is difficult to determine how sophisticated its text mining is, but for proteins that have been experimentally proven to interact with one another it should be a more than sufficient tool. While using STRING I observed numerous descriptions of proteins that were integrated from other databases effectively; I do not think so little of the program as to believe that it cannot recognize when humans have established with relative certainty that a given protein-protein interaction exists and then feed that information back to the user. The problem arises when these experimental results are absent, as STRING has no choice but to make its best guess. This, however, does not diminish the value of STRING, but merely alters it. While we must keep in mind that STRING results are not to be taken as gospel, its ability to compile co-mentions of proteins in scientific literature, in addition to its ability to parse through enormous quantities of information on a given protein's homologs and orthologs to make keener predictions, is impressive, and surpasses anything that is humanly possible without the assistance of machines. This makes STRING an excellent starting place when forming hypotheses. For example, because there are large amounts of curated literature linking pirin with Cytoplasmic protein NCK1, a dephosphorylation coenzyme as well as a protein associated with DNA repair, we might hypothesize that pirin serves functions in the hydrolysis of ADP or as a transcriptional regulator (Cardin & Larose, 2008; Latreille & Larose, 2006).





References


Cardin E, Larose L. Nck-1 interacts with PKR and modulates its activation by dsRNA. Biochem Biophys Res Commun. 2008 Dec 5;377(1):231-5. doi: 10.1016/j.bbrc.2008.09.112. Epub 2008 Oct 1. PMID: 18835251.

Latreille M, Larose L. Nck in a complex containing the catalytic subunit of protein phosphatase 1 regulates eukaryotic initiation factor 2alpha signaling and cell survival to endoplasmic reticulum stress. J Biol Chem. 2006 Sep 8;281(36):26633-44. doi: 10.1074/jbc.M513556200. Epub 2006 Jul 11. PMID: 16835242.

Luck, K., Kim, DK., Lambourne, L. et al. A reference map of the human binary protein interactome. Nature 580, 402–408 (2020). https://doi.org/10.1038/s41586-020-2188-x

Szklarczyk D, Gable AL, Nastou KC, Lyon D, Kirsch R, Pyysalo S, Doncheva NT, Legeay M, Fang T, Bork P, Jensen LJ, von Mering C. The STRING database in 2021: customizable protein–protein networks, and functional characterization of user-uploaded gene/measurement sets. Nucleic Acids Res. 2021 Jan 8;49(D1):D605-12.

The UniProt Consortium, UniProt: the universal protein knowledgebase in 2021, Nucleic Acids Research, Volume 49, Issue D1, 8 January 2021, Pages D480–D489.


