TLR6 (ENSP00000371376)
This Project
This web page originated as an assignment in Emory University's Biology 142 lab course. Students were assigned proteins of interest and asked to research what is known about the protein and to examine whether the newly sequenced whale shark genome had evidence of an orthologous protein. On receiving the protein TLR6, our interest was immediately piqued as we realized its importance in the normal functioning of our bodies and in how these proteins keep the body safe.
Background/Introduction:
TLR6 is a protein-coding gene found in Homo sapiens, among other species. It belongs to the Toll-like receptor (TLR) gene family, which plays a pivotal role in innate immunity. Innate immunity, as the name suggests, is immunity that is not acquired from an outside source (vaccinations etc.) or learned from a previous attack by an invader; instead, it is built into our own body for an immediate response to pathogens. Proteins of the TLR family recognize PAMPs (pathogen-associated molecular patterns), molecules produced by the attacker (for example gram-positive bacteria and fungi), and respond by rapidly mobilizing the necessary immune cells, such as macrophages, in order to develop effective immunity.
TLR proteins span the cell membrane of immune cells such as macrophages, B lymphocytes and mast cells. Existing as homodimers (and sometimes heterodimers), these receptors are activated by the presence of foreign bacterial cells that are capable of breaching the immune system. Activation tends to be specific to evolutionarily conserved features on the surface of the invading cells, such as lipopolysaccharides, which are attached to the outer surface of these cells, and specific flagellar components. Activation of members of the family, including TLR6, triggers a signaling cascade that ultimately stimulates an inflammatory and/or “antigen-specific immune response” (Hallman, M. 2001).
Figure 1. The Toll-like receptor 6 (TLR6) signaling pathway. TLR6 associates with TLR2 and the membrane proteins CD14 and CD36 to initiate a MyD88-dependent response that leads to the production of pro-inflammatory cytokines (Mutagenetix Phenotypic Mutation 'insouciant').
Methods:
The protein sequence for human TLR6 was obtained from Ensembl.
On the Georgia Aquarium site, the whale shark predicted genome database was accessed and the human TLR6 protein sequence was used as a query against it.
The best hits were identified as those with the lowest E-value and the highest percent identity.
To double-check that the best hits were TLR6, the FASTA sequence of each best hit was retrieved and BLASTed against the human genome on NCBI.
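The ranking rule in the steps above (lowest E-value first, then highest percent identity) can be sketched in a few lines of Python. This is only an illustration, not the procedure we actually ran: the three-column tab-separated layout, the names `Hit`, `parse_hits` and `rank_hits`, and the hit IDs and values in `EXAMPLE` are all made up for the example.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    subject_id: str
    identity: float  # percent identity
    evalue: float


def parse_hits(tabular_text):
    """Parse lines of 'subject_id<TAB>identity<TAB>evalue' (assumed layout)."""
    hits = []
    for line in tabular_text.strip().splitlines():
        subject_id, identity, evalue = line.split("\t")
        hits.append(Hit(subject_id, float(identity), float(evalue)))
    return hits


def rank_hits(hits):
    """Sort by lowest E-value first; break ties by highest identity."""
    return sorted(hits, key=lambda h: (h.evalue, -h.identity))


# Made-up hits, for illustration only (not data from this project).
EXAMPLE = "hitA\t28.5\t2e-20\nhitB\t31.0\t4e-25\nhitC\t30.2\t4e-25\n"

ranked = rank_hits(parse_hits(EXAMPLE))
ranked_ids = [h.subject_id for h in ranked]
print(ranked_ids)
```

Sorting on the pair (E-value, negative identity) keeps the E-value as the primary criterion, so identity only matters when two hits have the same E-value.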
Bootstrapping Method
To further check the best hits, the following method was used:
The human TLR6 protein sequence was BLASTed against the elephant shark genome to see whether this organism had the protein.
The best-hit protein sequence from the elephant shark (which is evolutionarily closer to the whale shark than humans are) was then queried against the whale shark genome, and those hits were also recorded.
The top hits were compared with the earlier whale shark hits to look for overlap (identical hits).
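The final comparison step amounts to intersecting two lists of hit IDs. Here is a minimal Python sketch under stated assumptions: `g48010.t1` and `g21305.t1` are the two whale shark IDs reported in Table 3, while the `placeholder_*` entries and the position of `g21305.t1` in the human-query list are hypothetical stand-ins, since the full top-five list is not reproduced on this page.

```python
def shared_hits(hits_a, hits_b):
    """Return the IDs found in both lists, keeping the order of hits_a."""
    in_b = set(hits_b)
    return [hit for hit in hits_a if hit in in_b]


# Whale shark hits for the elephant shark query (IDs from Table 3).
elephant_shark_query_hits = ["g48010.t1", "g21305.t1"]

# Whale shark top five for the human query. Only the two real IDs are known
# from the text; the placeholders and the ordering are hypothetical.
human_query_top5 = [
    "g48010.t1",
    "placeholder_2",
    "g21305.t1",
    "placeholder_4",
    "placeholder_5",
]

overlap = shared_hits(elephant_shark_query_hits, human_query_top5)
print(overlap)
```

A hit that appears in both lists is a stronger candidate than one that appears in only one, because two different queries converge on the same whale shark sequence.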
Finding Orthologs & Phylogenetic Tree
Using NCBI BLAST, organisms chosen at random (zebrafish, mouse, elephant shark, yeast, rabbit) were searched to see whether their genomes contained the TLR6 gene.
The human TLR6 protein sequence was used as the query and each organism's genome as the database, and the best hits were recorded.
Using the accession code of the top hit, the FASTA sequence of each match was retrieved and saved in a separate document.
The E-value, query coverage and percent identity of the closest match were recorded in a table.
Using ClustalW, the protein sequences of the best TLR6 hits in each organism were aligned and a phylogenetic tree was constructed from the alignment.
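To give a feel for how an alignment turns into a tree, the sketch below computes pairwise p-distances (the fraction of aligned positions that differ) from a small alignment. Distance-based tree methods such as neighbour joining start from a matrix like this. It is a simplified illustration rather than what ClustalW does internally, and the three short sequences are invented toy data, not our TLR6 hits.

```python
from itertools import combinations


def p_distance(seq_a, seq_b):
    """Fraction of differing positions, ignoring columns with a gap ('-')."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == "-" or b == "-":
            continue  # skip gapped columns
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return differing / compared


# Toy alignment, invented for illustration only.
toy_alignment = {
    "seq1": "MKTLLA-GS",
    "seq2": "MKTLLAQGS",
    "seq3": "MRSLIA-GT",
}

distances = {
    (name_a, name_b): p_distance(toy_alignment[name_a], toy_alignment[name_b])
    for name_a, name_b in combinations(toy_alignment, 2)
}
for pair, dist in distances.items():
    print(pair, dist)
```

Sequences separated by small distances end up on neighbouring branches, while a distant sequence (as yeast is in our data) sits on a long branch of its own.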
Best Hits for TLR6 in the Whale Shark Genome
Table 2: Best hits for human TLR6 among the whale shark predicted protein sequences. These hits had the lowest E-values of all the hits and were therefore chosen as the closest matches to TLR6.
Because even the lowest E-value was relatively high (far from 0) and the percent identity of these protein sequences was low, we were skeptical about how well these top hits matched human TLR6.
The FASTA sequence of the best hit was obtained and queried against the human genome in NCBI BLAST. The matches were not TLR6 itself but 'PREDICTED' proteins or other members of the TLR gene family. The best alignment for the top hit, g48010.t1, was with TLR2.
Figure 2: The best whale shark hit matched the TLR2 protein when queried against the human genome.
Bootstrapping:
The human TLR6 protein was then BLASTed against the elephant shark genome, and the best hit was TLR1. This suggests that humans and elephant sharks, and thus possibly whale sharks, share a related and possibly orthologous TLR gene.
Figure 3: The best hit in the elephant shark genome for TLR6.
This best hit (XP_007887373.1) was then queried against the whale shark genome to look for best hits. The results are shown below:
sequence ID    E-value    identity %    alignment length    query coverage
g48010.t1      5e-33      32.31         84                  260
g21305.t1      1e-32      29.75         83                  279
Table 3: The two best hits in the whale shark genome when the elephant shark sequence was used as the query.
These two best hits were also among the top five hits when the human protein sequence was queried against the whale shark genome. The top hit (g48010.t1) was found to match the TLR2 protein in the previous analysis (see Figure 2).
Protein Domain (Conserved Domain)
Conserved domain of the TLR 6 gene
Figure 4: Putative conserved domains in TLR6
The conserved domains within this protein consist of leucine-rich repeats and the TIR_2 superfamily. "The TIR superfamily is defined by a common intracellular TIR domain, involved in the initiation of signaling." (Boraschi, 2006) The TIR superfamily contains several smaller domain families, such as the LRR domain family, which includes the Toll-like receptors (TLRs) and other similar receptors. LRR stands for leucine-rich repeat; these consist of 2-45 (mainly 22-28) "amino acid motifs" that tend to fold into a "horseshoe" or crescent-moon shape. (Enkhbayar P, 2004) These repeats are usually found in eukaryotes, and also in viruses. Their main role, as suggested by their distinctive fold, is to mediate interactions with other proteins and molecules present in the cell. (Kobe, B. 2001) Proteins containing leucine-rich repeats are usually involved in a variety of processes such as disease resistance, apoptosis and immune response. (Rothberg, JM., 1990) This domain composition is therefore consistent with the function of TLR6, which plays a key role in immunity. The TIR superfamily also contains the Ig domain family (which includes the IL receptors) and a "series of TIR domain-containing intracellular adapter molecules." (Boraschi, 2006)
Orthologs
Organism       Actual Protein     E-value       query coverage (%)    identity %    Accession code
mouse          TLR 6              0             100                   74            NP_035734.3
zebrafish      TLR 6              6.00 E-147    94                    36            NP_001124065.1
yeast          SOG2p              0.025         36                    26            NP_014998.3
rabbit         PREDICTED TLR 6    0             100                   79            XP_002709434.1
clawed frog    PREDICTED TLR 6    0             94                    46            XP_002938695.1
Table 4: Human TLR6 compared with its best hits in other organisms that might carry the same or a similar gene. These are the best hits to human TLR6; some are the same gene and others belong to the same gene family.
The E-values of these matches are very low, which indicates likely homology. Query coverage is also high (94-100%), and identity ranges from 36% (zebrafish) to 79% (rabbit), which supports the conclusion that these organisms have the TLR6 gene (or, more likely in some cases, a gene in the same gene family, e.g. TLR1). The exception is yeast, which has a much higher E-value (0.025) and low query coverage (36%) and identity (26%).
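The reasoning applied to Table 4 can be written as a simple filter. In this Python sketch the rows are copied from Table 4, but the cut-offs `MAX_EVALUE` and `MIN_COVERAGE` are illustrative assumptions chosen for the example, not thresholds we formally applied in the project.

```python
# (organism, E-value, query coverage %, identity %) copied from Table 4.
TABLE_4 = [
    ("mouse", 0.0, 100, 74),
    ("zebrafish", 6.00e-147, 94, 36),
    ("yeast", 0.025, 36, 26),
    ("rabbit", 0.0, 100, 79),
    ("clawed frog", 0.0, 94, 46),
]

# Illustrative cut-offs (assumptions, not values used in the original analysis).
MAX_EVALUE = 1e-5
MIN_COVERAGE = 50


def is_plausible_homolog(e_value, coverage):
    """A hit passes if its E-value is small and most of the query is covered."""
    return e_value <= MAX_EVALUE and coverage >= MIN_COVERAGE


plausible = [row[0] for row in TABLE_4 if is_plausible_homolog(row[1], row[2])]
rejected = [row[0] for row in TABLE_4 if not is_plausible_homolog(row[1], row[2])]

print("plausible homologs:", plausible)
print("rejected:", rejected)
```

With these cut-offs only the yeast hit is rejected, which matches the interpretation given above.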
Clustal Sequence Alignments
Figure 5: Two Clustal sequence alignments of the best hits to the human TLR6 protein in different organisms
Phylogenetic Tree
Figure 6: The phylogenetic tree of the best hits for TLR6 in human, rabbit, mouse, yeast, zebrafish, etc.
The figure shows the phylogenetic tree of the best hits for TLR6. The branch-length version was used to give a better indication of the evolutionary relationships, and thus the homology, of the sequences studied. The tree indicates a very low alignment match between human TLR6 and the best hit in the whale shark genome. However, rabbits and mice both have the TLR6 gene in their genomes, as indicated by the close alignment match and confirmed by the data in Table 4.
Conclusion
Our search for the TLR6 gene in the whale shark genome, after several checks and cross-references using more closely related organisms, was not successful. Our results therefore suggest that the whale shark does not have the TLR6 gene within its genome. Nevertheless, some of our hits are worth focusing on, as they point to the possibility of other similar genes within the whale shark genome, possibly in the TLR gene family. The TLR family is responsible, along with other genes, for innate immunity in many organisms, so it is likely that other TLR genes are present in the whale shark, just not the specific protein we were assigned (TLR6). The TLR6 gene is found in the genomes of many other organisms, among them humans, rabbits and mice. However, because we obtained identical best hits in both the whale shark and elephant shark genomes, the evolutionary changes probably happened before the divergence of these two sharks. TLR6 may have become useless or may never have been needed in the shark (elephant and whale) lineage, as there may have been different methods of innate immunity or different protein characteristics that resulted in another TLR gene taking on the function that TLR6 has in humans.
Works Cited:
Boraschi, D., and A. Tagliabue. (2006) "The Interleukin-1 Receptor Family." National Center for Biotechnology Information. U.S. National Library of Medicine, n.d. Web. <http://www.ncbi.nlm.nih.gov/pubmed/17027517>.
Enkhbayar P, Kamiya M, Osaki M, Matsumoto T, Matsushima N. (2004) Structural principles of leucine-rich repeat (LRR) proteins.
"Genes and Mapped Phenotypes." National Center for Biotechnology Information. U.S. National Library of Medicine, n.d. Web. 07 Apr. 2015. <http://www.ncbi.nlm.nih.gov/gene/10333>.
Hajjar, A. M., O’Mahony, D. S., Ozinsky, A., Underhill, D. M., Aderem, A., Klebanoff, S. J., Wilson, C. B. (2001) Cutting edge: functional interactions between Toll-like receptor (TLR) 2 and TLR1 or TLR6 in response to phenol-soluble modulin. J. Immunol. 166, 1-3
Hallman, M. "Toll-like Receptors as Sensors of Pathogens." WikiGenes - Collaborative Publishing. N.p., n.d. Web. 06 Apr. 2015. <https://www.wikigenes.org/e/ref/e/11518816.html>.
Jang, T. H. "Crystal Structure of TIR Domain of TLR6 Reveals Novel Dimeric Interface of TIR-TIR Interaction for Toll-like Receptor Signaling Pathway." National Center for Biotechnology Information. U.S. National Library of Medicine, n.d. Web. <http://www.ncbi.nlm.nih.gov/pubmed/25088687>.
Kobe B, Kajava AV. (2001) The leucine-rich repeat as a protein recognition motif.
Misch EA, et al. (2013) A TLR6 polymorphism is associated with increased risk of Legionnaires' disease. Genes Immun, 2013 Oct.
Mutagenetix Phenotypic Mutation 'insouciant' (n.d.). Retrieved April 14, 2015, from https://mutagenetix.utsouthwestern.edu/phenotypic/phenotypic_rec.cfm?pk=128
Rothberg JM, Jacobs JR, Goodman CS, Artavanis-Tsakonas S. (1990) slit: an extracellular protein necessary for development of midline glia and commissural axon pathways contains both EGF and LRR domains. Genes Dev. 4, 2169-87
"TLR6 Gene." GeneCards. N.p., n.d. Web. 07 Apr. 2015. <http://www.genecards.org/cgi-bin/carddisp.pl?gene=TLR6>.