Contents

  1. Introduction to Protein Analysis
  2. Obtain a sequence of interest
  3. Identify ORFs and translate into protein
  4. Identify Similar Proteins from the Databases
  5. Align your sequence vs similar sequences and look for Gene Families
  6. Determine the putative function of your protein
  7. Determine the putative structure of your protein
  8. Protein Structure Visualization Tools
  9. Other Interesting Things You Can Do With Proteins

IX. Other Interesting Things You Can Do With Proteins

  1. Protein Analysis of Entire Genomes
    1. PEDANT - Protein Extraction, Description and Analysis Tool
      http://pedant.mips.biochem.mpg.de/
      • PEDANT is a software system for completely automatic and exhaustive analysis of protein sequence sets, from individual sequences to complete genomes.
      • To date, 17 complete microbial genomes and 19 incomplete genomes have been analyzed with this tool.
      • The analysis has already been performed; this site gives you access to the results and statistics.
      • The analysis process uses most of the tools we have already discussed here.
      • In addition to providing a list of all ORFs for that genome, the site lets you search through the results by protein name, structure, patterns, etc.
      • Although human is not listed as an option, it can be accessed at http://pedant.mips.biochem.mpg.de/human/
    2. ProDom-CG - Complete Genomes Protein Domain Database
      http://www.toulouse.inra.fr/prodomCG.html
      • ProDom-CG 20 was constructed by automatic clustering of protein domains derived from 20 complete genomes available in April 1999 (4 archaeal genomes, 14 bacterial genomes and 2 eukaryotic genomes).
      • You can search for the following things:
        • BLAST to compare your sequence to the ProDom-CG 20 database
        • Fetch Domain Families, their Consensus Sequences and PROSITE patterns
        • Fetch Domain Families by Keyword (e.g.: kinase phytochrome => kinase AND phytochrome)
        • Fetch Domain Families by PROSITE pattern (e.g.: PROSITE:PS00010,PDOC00010 or ASX_HYDROXYL)
        • Fetch Domain Families by archaea, bacteria or eukaryote
        • Fetch Graphic Representations of Proteins:
          • Of all proteins having the query domain (enter the ProDom-CG 20 Family Number, e.g.: 350)
          • Of all proteins having a domain in common with the query protein
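
The keyword example in the list above (kinase phytochrome => kinase AND phytochrome) means that every keyword must match. The following is a minimal Python sketch of that AND logic only. It does not call the ProDom-CG server; the function names and the two family descriptions are hypothetical and invented for illustration.

```python
def expand_keywords(query: str) -> str:
    """Join whitespace-separated keywords with AND."""
    return " AND ".join(query.split())


def matches_all(description: str, query: str) -> bool:
    """Return True if every keyword occurs in the description (case-insensitive)."""
    text = description.lower()
    return all(term.lower() in text for term in query.split())


# Hypothetical family descriptions, invented for illustration only.
families = {
    "F1": "phytochrome-like histidine kinase",
    "F2": "serine/threonine protein kinase",
}

print(expand_keywords("kinase phytochrome"))  # kinase AND phytochrome

# Keep only the families whose description contains every keyword.
hits = [fid for fid, desc in families.items()
        if matches_all(desc, "kinase phytochrome")]
print(hits)  # ['F1']
```

Only the first entry is returned, because the second description contains "kinase" but not "phytochrome".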

  2. Structure / Function Databases
    1. GeneCards
      http://bioinfo.weizmann.ac.il/cards
      • GeneCards concept applied to protein structure/function
      • Provides rich content annotation on structure and function, generating dynamic links to several external sources.
      • Integrates data from several sources that are freely accessible to universities. If you plan to use OCA in a commercial environment or for profit, please be sure you have the proper clearance from OCA and all the OCA data sources.