This project emerged from an Emory University laboratory course, Biology 142, during the spring semester of 2015. Newly acquired whale shark proteins of interest were assigned to student groups in the hope of better understanding the origins of these proteins. Students were tasked with using different resources to BLAST the assigned protein sequences, both to better understand the homology of the whale shark proteins and to determine whether an orthologous protein exists within the whale shark genome.
Background Information
The ThPOK gene, also known as ZBTB7B, plays a key role in the human immune system by encoding a zinc-finger transcription factor. This transcription factor is important in regulating the maturation of T cells, white blood cells that recognize and destroy infected cells in the body. During maturation, T cells differentiate into either CD4+ (helper) or CD8+ (cytotoxic) cells. ThPOK expression has been shown to activate development of CD4+ cells, while ThPOK silencing promotes development of CD8+ cells.
Figure 1. Role of ThPOK in CD4+ and CD8+ differentiation pathways (Setoguchi 2009). Part (a) shows ThPOK expression in CD4+ helper cells, while part (c) shows ThPOK's role in regulating the expression of genes that determine CD8+ cytotoxic fates, such as Runx3.
The ThPOK protein contains several important structural features, visible in Figure 2 below. The blue region represents the broad-complex, tramtrack, bric-à-brac (BTB) domain. The purple region represents a proline-rich sequence. The red ovals represent four zinc fingers, which serve as the DNA-binding domain for its transcriptional regulation activity. There are eight BTB-ZF transcription factors found in the human genome.
Figure 2. ThPOK structural features (Beaulieu 2011).
Methods
Whale Shark Predicted Orthologs
The human ZBTB7B protein sequence (ENSP00000292176) was used as the query in a BLAST search against the predicted whale shark protein database on the Galaxy server at whaleshark.georgiaaquarium.org. The full predicted sequences of the top five hits were then used as queries in NCBI BLAST searches against the human protein database. To further check for orthologs, the whale shark protein database was also searched using the predicted elephant shark ZBTB7B protein sequence as the query, since the elephant shark is one of the whale shark's closest relatives.
Predicted Orthologs
The ZBTB7B protein sequence was also used as the query in NCBI BLAST to check for orthologs in organisms other than the whale shark. The protein BLAST searches were performed against single-species protein databases for the mouse, yeast, zebrafish, guinea pig, fruit fly, and whale shark. Default settings were used for each search, with the human ZBTB7B protein (ENSP00000292176) as the query.
Phylogenetic tree
The best hit (the one with the lowest E-value) from each non-whale shark search, along with the top whale shark BLAST hit, was used to create a multiple sequence alignment in ClustalW2 alongside the human query protein. The aligned sequences were then used to create a phylogenetic tree with the same ClustalW2 software.
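ClustalW2 builds its trees with the neighbor-joining method, which starts from pairwise distances between the aligned sequences. The sketch below illustrates only that first step, using a simple p-distance (the fraction of differing residues over ungapped columns). The three short aligned sequences are invented for illustration and are not taken from this project's alignment; ClustalW2's own distance calculation and corrections may differ.

```python
from itertools import combinations


def p_distance(seq_a: str, seq_b: str) -> float:
    """Fraction of mismatched residues over columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    mismatches = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue  # skip any column that contains a gap
        compared += 1
        if a != b:
            mismatches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return mismatches / compared


def distance_matrix(alignment: dict) -> dict:
    """Pairwise p-distances for every pair of sequences in the alignment."""
    return {
        (name_a, name_b): p_distance(alignment[name_a], alignment[name_b])
        for name_a, name_b in combinations(alignment, 2)
    }


# Hypothetical aligned fragments (NOT real ZBTB7B data), already gap-padded.
toy_alignment = {
    "human": "CGKAFSRSD-LTRH",
    "mouse": "CGKAFSRSDHLTRH",
    "whale_shark": "CEKSFTRKD-LKRH",
}

distances = distance_matrix(toy_alignment)
for (name_a, name_b), dist in sorted(distances.items(), key=lambda kv: kv[1]):
    print(f"{name_a} vs {name_b}: {dist:.3f}")
```

Smaller distances correspond to sequences that would be joined earlier in the tree, which is why the most similar proteins end up as neighbors.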
Searching for ZBTB7B/ThPOK in the whale shark
Whale Shark ID | E-value | Alignment Length | Predicted Protein Length | % Identity
g19887.t1      | 2e-29   | 101              | 116                      | 42.57
g14292.t1      | 1e-29   | 169              | 181                      | 35.50
g15112.t1      | 8e-28   | 106              | 204                      | 46.23
g18750.t1      | 4e-28   | 108              | 272                      | 43.52
g19873.t1      | 3e-28   | 108              | 156                      | 45.37
Table 1. The whale shark protein best hits are shown above. In addition to the E-value, the alignment length, predicted protein length and percent identity are displayed in the table.
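As a small worked example, the rows of Table 1 can be ranked by E-value, since a lower E-value means a match is less likely to have arisen by chance. The sketch below uses only the values from Table 1. The coverage ratio (alignment length divided by predicted protein length) is an extra quantity added here for illustration and is not a statistic reported in the original analysis.

```python
from collections import namedtuple

Hit = namedtuple("Hit", "whale_shark_id evalue aln_len protein_len pct_identity")

# Values copied from Table 1.
hits = [
    Hit("g19887.t1", 2e-29, 101, 116, 42.57),
    Hit("g14292.t1", 1e-29, 169, 181, 35.50),
    Hit("g15112.t1", 8e-28, 106, 204, 46.23),
    Hit("g18750.t1", 4e-28, 108, 272, 43.52),
    Hit("g19873.t1", 3e-28, 108, 156, 45.37),
]


def coverage(hit: Hit) -> float:
    """Fraction of the predicted whale shark protein covered by the alignment."""
    return hit.aln_len / hit.protein_len


# Sort ascending: the smallest E-value is the strongest match.
ranked = sorted(hits, key=lambda h: h.evalue)

for h in ranked:
    print(
        f"{h.whale_shark_id}  E={h.evalue:.0e}  "
        f"identity={h.pct_identity:.2f}%  coverage={coverage(h):.0%}"
    )
```

Ranking by E-value and ranking by percent identity give different orders here, which is one reason to look at several columns together before choosing hits for a reciprocal search.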
Reciprocal cross and bootstrapping
The three whale shark hits with the lowest E-values were searched against the human protein database using NCBI BLASTp in a reciprocal BLAST. The top hits from these searches did not match the ZBTB7B protein. This was expected, because the primary whale shark BLAST on Galaxy did not produce very significant E-values (the lowest values in Table 1 are on the order of 1e-29).
Next, the human ZBTB7B query was searched against the elephant shark database; a reciprocal BLAST of the best hit yielded the ZBTB7C protein, which is closely related to ZBTB7B, with an E-value of 4e-56. This may suggest that the whale shark has several BTB domain proteins that are a closer match to ZBTB7C than to ZBTB7B.
Protein domains
Figure 3. Protein families and domains for the ZBTB7B/ThPOK human gene. ZBTB7B/ThPOK is a zinc finger protein that comes from the BTB superfamily.
Species     | Name                                                        | ID             | E-value | Length
Human       | zinc finger and BTB domain-containing protein 7B isoform 2  | NP_001239335.1 | 0.0     | 573
Mouse       | zinc finger and BTB domain-containing protein 7B            | NP_033591.2    | 0.0     | 544
Zebrafish   | uncharacterized protein LOC393602                           | NP_956923.1    | 2e-68   | 353
Yeast       | Azf1p                                                       | NP_014756.3    | 5e-23   | 914
Whale shark | Unknown                                                     | g19793.t1      | 2e-31   | 106
Guinea pig  | zinc finger and BTB domain-containing protein 7B isoform X1 | XP_003475641.1 | 0.0     | 543
Fruit fly   | glass, isoform A                                            | NP_476854.1    | 2e-27   | 604
Table 2. Data for best protein hits using human ZBTB7B/ThPOK FASTA sequence and BLASTp for several species.
Orthologues
Based on the E-value data, the mouse, the zebrafish, and the guinea pig each have a predicted orthologue of the ZBTB7B/ThPOK gene. For the whale shark, a reciprocal best hit search returned zinc finger protein 383 with an E-value of 4e-57 and 48% identity; the whale shark hit is therefore unlikely to be an orthologue of ZBTB7B/ThPOK, but it is likely an orthologue of a related gene encoding a BTB domain protein.
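The reciprocal best hit test used above can be summarized in a few lines: a query and its top hit are treated as candidate orthologues only if searching back with the hit returns the original query as the best match. The sketch below is a minimal illustration. The function names are hypothetical, the pairing of the Table 2 whale shark entry (g19793.t1) with the zinc finger protein 383 result is an assumption made for the example, and the second case uses invented identifiers.

```python
def best_hit(hits):
    """Return the subject ID with the lowest E-value, or None if there are no hits.

    `hits` is a list of (subject_id, evalue) pairs.
    """
    if not hits:
        return None
    return min(hits, key=lambda hit: hit[1])[0]


def is_reciprocal_best_hit(query_id, forward_hits, reverse_hits_by_subject):
    """True if the best forward hit's own best hit is the original query."""
    top_subject = best_hit(forward_hits)
    if top_subject is None:
        return False
    reverse_hits = reverse_hits_by_subject.get(top_subject, [])
    return best_hit(reverse_hits) == query_id


# Whale shark case: human ZBTB7B finds g19793.t1 (Table 2), but the search
# back against human proteins returns zinc finger protein 383 instead.
# Linking these two reported results to the same protein is an assumption.
forward = [("g19793.t1", 2e-31)]
reverse = {"g19793.t1": [("zinc finger protein 383", 4e-57)]}
whale_shark_rbh = is_reciprocal_best_hit("ZBTB7B", forward, reverse)

# Invented positive case, for contrast only.
toy_forward = [("subject_Y", 1e-80), ("subject_Z", 1e-20)]
toy_reverse = {"subject_Y": [("query_X", 1e-78), ("other", 1e-15)]}
toy_rbh = is_reciprocal_best_hit("query_X", toy_forward, toy_reverse)

print("whale shark reciprocal best hit:", whale_shark_rbh)
print("toy reciprocal best hit:", toy_rbh)
```

Passing this test is a common heuristic for orthology rather than proof of it, so a failed test, as in the whale shark case, is best read as a lack of support for orthology.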
Phylogeny
Figure 4. Rooted phylogenetic tree with branch lengths, comparing the best hit proteins for the human ZBTB7B/ThPOK protein query across species. "Guinea" refers to guinea pig, while "fruit" refers to fruit fly.
Conclusions
Based on the phylogenetic tree, the closest match to the human ZBTB7B gene is in the guinea pig. Orthologues of the gene were also found in the mouse and zebrafish, while closely related BTB-ZF transcription factors were found in the whale shark. These findings suggest that mechanisms of cell-mediated immunity are shared across species' immune systems and likely originate from a common ancestor.
References
Beaulieu A, Sant'Angelo D. The BTB-ZF family of transcription factors: Key regulators of lineage commitment and effector function development in the immune system. Journal of Immunology. 2011. 187(6): 2841-2847.
Bell JJ, Bhandoola A. Putting ThPOK in place. Nature Immunology. 2008. 9: 1095-1096.
He X, Park K, Kappes DJ. The role of ThPOK in control of CD4/CD8 lineage commitment. Annual Review of Immunology. 2010. 28: 295-320.
Reis BS, Rogoz A, Costa-Pinto FA, Taniuchi I, Mucida D. Mutual expression of the transcription factors Runx3 and ThPOK regulates intestinal CD4+ T cell immunity. Nature Immunology. 2013. 14: 271-280.
Setoguchi R, Taniuchi I, Bevan M. ThPOK Derepression is Required for Robust CD8 Cell Responses to Viral Infection. Journal of Immunology. 2009. 183(7): 4467-4474.
Tanaka H, Taniuchi I. The CD4/CD8 Lineages: Central Decisions and Peripheral Modifications for T Lymphocytes. Current Topics in Microbiology and Immunology. 2014. 373: 113-129.