This web page originated as an assignment in Emory University's Biology 142 lab course. Students were assigned proteins of interest and asked to research what is known about the protein and to examine whether the newly sequenced whale shark genome had evidence of an orthologous protein.
Background
The RIPK2 gene’s main function is to help regulate both innate and adaptive immune responses.[1] It does this by activating receptors, which then activate the host defense immune systems.[2] The caspase recruitment domain (CARD) is a protein domain that helps to mediate the interactions of RIP-family proteins such as RICK (another name for RIPK2).[3] Studies have shown a strong connection between regulated RIPK2 and cell survival.[4] This protein may be useful for the detection of pancreatic cancer because of its production of enzymes and bicarbonate. It has also been linked to Huntington’s disease: when it is not regulated properly, it has been found to contribute to aberrant activation of caspase-1, a protein implicated in Huntington’s.[5] Research is ongoing, but recent findings indicate that RIPK2 might play a major role in hepatic cell migration. As a result, scientists hope that these discoveries will lead to new information on carcinogenesis and liver regeneration.[6]
Figure 1: RIPK2 activates the receptors, in this case of the TAK1 and CARD9 proteins, so that they can receive the signals that trigger immune system activation.[7]
Methods:
Whale Shark predicted orthologs:
To find predicted whale shark orthologs, the human protein sequence (ENSP00000220751) was obtained from Ensembl and used as the query in a BLAST search against the predicted whale shark protein database on the Galaxy server (whaleshark.georgiaaquarium.org). From the Galaxy results, the top 4 predicted protein hits were chosen as those with the lowest, most significant E-values. The full predicted sequence of each of these 4 hits was then used as the query in a protein BLAST against the NCBI human protein database. A predicted protein was considered an ortholog if it returned Homo sapiens RIPK2 as its top hit.
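The hit-selection step can be sketched in a few lines of Python. This is only an illustration of "keep the hits with the lowest E-values", not the Galaxy workflow we actually ran; the `Hit` type and the `top_hits` function are names invented for this example, and the values are copied from the whale shark hit table on this page.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    """One BLAST hit (hypothetical record type for this sketch)."""
    gene_id: str
    e_value: float
    alignment_length: int
    percent_identity: float


def top_hits(hits, n=4):
    """Return the n hits with the lowest (most significant) E-values."""
    return sorted(hits, key=lambda hit: hit.e_value)[:n]


# Values from the whale shark hit table on this page, deliberately out of order.
hits = [
    Hit("g48190.t1", 2e-21, 263, 27.76),
    Hit("g29118.t1", 5e-44, 162, 46.3),
    Hit("g41392.t1", 9e-21, 252, 26.98),
    Hit("g43597.t1", 2e-26, 276, 28.62),
]

best = top_hits(hits)
print([hit.gene_id for hit in best])
```

Sorting on the E-value alone mirrors what was done by hand. A stricter pipeline might also filter on alignment length or percent identity.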
Predicted orthologs:
The NCBI BLAST server was used to identify predicted RIPK2 orthologs in species other than the whale shark. Protein BLASTs were performed against single-species protein databases for the following species: mouse, zebrafish, chimp, fruit fly, yeast, and dog. The human RIPK2 protein (ENSP00000220751) was used as the query sequence in these searches with default settings. A hit was treated as a candidate ortholog if the search returned RIPK2 as the top hit. To verify each candidate, the full protein sequence of the top hit was then BLASTed against the human protein database. If RIPK2 was again returned as the top hit, the protein was considered a potential ortholog.
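The two-step check described above is often called a reciprocal best hit test. A minimal sketch of the logic follows, assuming each BLAST result is available as a list of (subject ID, E-value) pairs. The function names, protein IDs, and E-values here are hypothetical and are for illustration only.

```python
def best_hit(hits):
    """Return the subject ID with the lowest E-value, or None if there are no hits."""
    if not hits:
        return None
    return min(hits, key=lambda pair: pair[1])[0]


def is_reciprocal_best_hit(query_id, forward_hits, reverse_hits_by_subject):
    """Check query -> best subject -> back to the same query."""
    candidate = best_hit(forward_hits)
    if candidate is None:
        return False
    return best_hit(reverse_hits_by_subject.get(candidate, [])) == query_id


# Hypothetical IDs and E-values, not real search results.
forward = [("speciesX_protA", 1e-80), ("speciesX_protB", 1e-20)]
reverse = {
    "speciesX_protA": [("human_RIPK2", 1e-82), ("human_RIPK1", 1e-30)],
    "speciesX_protB": [("human_RIPK1", 1e-40), ("human_RIPK2", 1e-18)],
}

print(is_reciprocal_best_hit("human_RIPK2", forward, reverse))  # True
```

If the best forward hit had pointed back to a different human protein, the function would return False, which corresponds to a "No" in the ortholog column.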
Phylogenetic tree:
To construct the phylogenetic tree, the full protein sequence of the top hit (lowest E-value) was obtained for each non-whale shark species search (using the human protein as query). These were collected in a single document together with the protein sequences of the top 4 whale shark hits from Galaxy. All of the protein sequences were in FASTA format, and the collection was used to create a multiple sequence alignment and phylogenetic tree. ClustalW2 with default settings was used to create the alignment, and the option “rooted phylogenetic tree” was used to create the tree.
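Collecting the sequences into one FASTA document can be sketched as below. We assembled the file by hand, so this is only an assumed equivalent. The function name `to_fasta` is made up for this example and the sequences are short placeholders, not the real proteins.

```python
def to_fasta(records, width=60):
    """Format (identifier, sequence) pairs as multi-FASTA text."""
    lines = []
    for identifier, sequence in records:
        lines.append(f">{identifier}")
        # Wrap the sequence into fixed-width lines, as FASTA files usually do.
        for start in range(0, len(sequence), width):
            lines.append(sequence[start:start + width])
    return "\n".join(lines) + "\n"


# Placeholder sequences for illustration only (not the real RIPK2 sequences).
records = [
    ("human_RIPK2_placeholder", "MNGEAICSALPTIPYHKLADLRYLSRGASGTVSSARHADWRVQVAVKHLHIHTPLLDSERKDVLREAEILHKARFS"),
    ("g29118.t1_placeholder", "MSSEAVCSSLPTIPYHKLTDL"),
]

print(to_fasta(records), end="")
```

The resulting text can be pasted into an alignment tool that accepts FASTA input.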
Searching for RIPK2 in the Whale Shark:
The human RIPK2 protein sequence was used to query the whale shark predicted protein database, and the results are shown in Table 1. The 4 best hits had E-values ranging from 5e-44 (the most significant) to 9e-21, with the second-best hit at 2e-26. These 4 hits were then BLASTed against the human protein database using NCBI BLASTp. Of the 4 genes, only g29118.t1 returned the RIPK2 protein as its best hit. This indicates that g29118.t1 is a potential ortholog.
Table 1: Significant Predicted Protein Hits in the Whale Shark Genome
| Gene ID | E-value | Alignment length | % Identical | Amino acid length |
|---|---|---|---|---|
| g29118.t1 | 5e-44 | 162 | 46.3 | 172 |
| g43597.t1 | 2e-26 | 276 | 28.62 | 331 |
| g48190.t1 | 2e-21 | 263 | 27.76 | 409 |
| g41392.t1 | 9e-21 | 252 | 26.98 | 258 |
Table 1. The best human RIPK2 BLASTp hits against the whale shark predicted protein database. The top 4 hits are listed by gene ID with their respective E-values, alignment lengths, % identity, and amino acid lengths. The top hit (g29118.t1) is a potential ortholog.
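To put the gap between the best and second-best whale shark hits in perspective, the two E-values can be compared on a log scale. The short sketch below uses only the E-values reported above; the function name `log10_gap` is made up for this example.

```python
import math


def log10_gap(best_e, next_e):
    """Orders of magnitude separating the best E-value from the next one."""
    return math.log10(next_e) - math.log10(best_e)


gap = log10_gap(5e-44, 2e-26)
print(f"{gap:.1f} orders of magnitude")  # 17.6 orders of magnitude
```

A gap this large suggests that g29118.t1 stands well apart from the other three hits, which agrees with the result of the search back against the human database.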
Protein Domain:
RIPK2’s domains were obtained by using the human RIPK2 protein (ENSP00000220751) as a query in Ensembl. From the results we obtained, most of the predicted proteins belong to the PKc-like superfamily. Our best hit, g29118.t1, is multi-domain, exhibiting a Pkinase_Tyr domain as well as the PKc-like superfamily, as shown in Figure 3.
The overarching domain of RIPK2 is the “Protein Kinase-like domain”. Proteins with this domain modify other proteins by phosphorylating them, a fundamental and necessary step in signaling and regulatory processes in eukaryotic cells. The domain contains a nucleotide-binding site and a catalytic “apparatus in the interlobe cleft” (InterPro). It includes the more specific “Protein Kinase domain”, which catalyzes the transfer of the gamma phosphate from ATP to one or more amino acids. This phosphorylation changes the shape of the target protein, which affects its function by changing its enzyme activity, its cellular location, and its association with other proteins. These proteins play a role in apoptosis, differentiation, proliferation, and cell division, among other things. The eukaryotic kinases share a conserved catalytic core: the N-terminal end is oriented for ATP binding (a glycine-rich stretch near a lysine residue), and the central part of the domain is more important for enzyme activity (a conserved aspartic acid residue).
Overall, this domain splits into 3 subsets of proteins: serine/threonine protein kinases, tyrosine protein kinases, and dual-specificity kinases (which can phosphorylate both serine/threonine and tyrosine on target proteins). RIPK2 is both a serine/threonine protein kinase and a tyrosine protein kinase and functions the same way in both roles, by phosphorylation. The kinase domain is about 250-300 amino acids long. Its position is not fixed across kinases; in human RIPK2 it lies toward the N-terminus, with the CARD near the C-terminus. These findings come from Figure 4 and subsequent research.
Figure 3: Domains of predicted protein g29118.t1, the putative RIPK2 ortholog
Figure 3. Putative domains of the RIPK2 best-hit predicted proteins. All contain the PKc-like superfamily.
Figure 4: In-depth domains of RIPK2
Figure 4. This figure shows all of the domains found in the human protein RIPK2 (ENSP00000220751).
Orthologs:
The human RIPK2 protein sequence (ENSP00000220751) was used as the query in NCBI BLAST searches against other individual species’ protein databases. Candidate orthologs found this way were then BLASTed back against the human database to confirm them.
| Animal name | Query coverage | E-value | % Identical | Protein length | Accession # | Ortholog? |
|---|---|---|---|---|---|---|
| Mouse | 100% | 0.0 | 84 | 539 | NP_003812.1 | Yes |
| Zebrafish | 77% | 4e-144 | 62 | 584 | NP_919392.2 | Yes |
| Yeast | 37% | 2e-16 | 29 | 567 | NP_013466.1 | No |
| Fruit fly | 53% | 1e-28 | 30 | 572 | NP_525080.1 | No |
| Chimp | 100% | 0.0 | 99 | 539 | XP_519850.2 | Yes |
| Dog | 100% | 0.0 | 87 | 538 | XP_005638158.1 | Yes |
Table 2. The best hits for the human RIPK2 protein BLAST searches. The name of the animal, query coverage, E-value, % identity, protein length, accession number, and confirmation of ortholog status are shown.
Phylogeny:
The best hits from our query searches were used to create a phylogenetic tree. The tree shows that the whale shark proteins are much more distant from the human protein than those of other animals, such as the chimp or dog. The four whale shark proteins do share a high degree of similarity with each other, though, as they are all grouped together.
Figure 5. Phylogenetic tree of RIPK2 best hits. The best hit from each species search and the best 4 whale shark hits were used in the ClustalW2 program to create the phylogenetic tree. Branch lengths represent relative evolutionary distance, and names that sit closest together on the figure are the most similar sequences.
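As we understand it, a tree like this is built from pairwise distances between the aligned sequences. The sketch below shows one simple distance of that kind, the p-distance (the fraction of aligned positions that differ). It is meant to illustrate the idea behind branch lengths and is not a reimplementation of ClustalW2; the function name and the toy sequences are hypothetical.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Fraction of ungapped aligned columns at which two sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differences / compared


# Toy aligned sequences, for illustration only.
aligned = {
    "seq1": "MKTAY-LQ",
    "seq2": "MKSAYGLQ",
    "seq3": "MRSCY-LE",
}

print(round(p_distance(aligned["seq1"], aligned["seq2"]), 3))
print(round(p_distance(aligned["seq1"], aligned["seq3"]), 3))
```

In this toy example seq1 is closer to seq2 than to seq3, so a distance-based tree would be expected to group seq1 and seq2 together.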
Conclusions:
From the reciprocal BLAST results and the phylogenetic tree, we were able to identify g29118.t1 as a predicted ortholog of human RIPK2. The extremely low E-value we found (Table 1) strengthens this prediction. Our data are not too surprising, because the RIPK2 gene is directly associated with the regulation of the immune system. Thus, it is logical that it would be found as a predicted ortholog in so many other organisms. Of the three other gene IDs we researched, none seemed to be orthologs based on the phylogenetic tree, but they do seem to be very closely related to each other. These results could imply that these genes share an ancestor with the human gene, but more research would need to be conducted in order to be conclusive. Additional research should also be performed on RIPK2’s ability to help with liver regeneration, which we briefly discussed in the Background section.
References:
[1] "Receptor-interacting Serine/threonine-protein Kinase 2." UniProt. Accessed April 14, 2015. http://www.uniprot.org/uniprot/O43353.
[2] Medzhitov, R. and Janeway, C., Jr. (2000), Innate immune recognition: mechanisms and pathways. Immunological Reviews, 173: 89–97. doi: 10.1034/j.1600-065X.2000.917309.x Accessed April 14, 2015.
[3] Dufner, Almut. “CARD Tricks: Controlling the interactions of CARD6 with RICK and Microtubules.” Taylor and Francis Online. March 24, 2006. Accessed April 13, 2015.
[4] Zhang, W.H., Wang, X., Narayanan, M., Zhang, Y., Huo, C., Reed, J.C., Friedlander, R.M. "Fundamental role of the Rip2/caspase-1 pathway in hypoxia and ischemia-induced neuronal cell death." Proc. Natl. Acad. Sci. U.S.A. (2003). PubMed. Accessed April 13, 2015.
[5] Wang, X., Wang, H., Figueroa, B.E., Zhang, W.H., Huo, C., Guan, Y., Zhang, Y., Bruey, J.M., Reed, J.C., Friedlander, R.M. "Dysregulation of receptor interacting protein-2 and caspase recruitment domain only protein mediates aberrant caspase-1 activation in Huntington's disease." J. Neurosci. (2005). PubMed. Accessed April 13, 2015.
[6] Wu, S., T. Kanda, S. Nakamoto, F. Imazeki, and O. Yokosuka. "Knockdown of Receptor-interacting Serine/threonine Protein Kinase-2 (RIPK2) Affects EMT-associated Gene Expression in Human Hepatoma Cells." National Center for Biotechnology Information. September 1, 2012. Accessed April 13, 2015.
[7] "DCAM Pharma Pipeline." DCAM Pharma. Accessed April 14, 2015. http://www.dcampharma.com/pipeline.html
"Tyrosine Kinase Catalytic Domain Signature." ==SPRINT== Query Results. N.p., n.d. Web. 14 Apr. 2015.
"Receptor-interacting Serine/threonine-protein Kinase 2." RIPK2. N.p., n.d. Web. 14 Apr. 2015.