Contents
- Introduction to Protein Analysis
- Obtain a sequence of interest.
- Identify ORFs and translate into protein
- Identify Similar Proteins from the Databases
- Align your sequence vs similar sequences and look for Gene Families
- Determine the putative function of your protein
- Determine the putative structure of your protein
- Protein Structure Visualization Tools
- Other Interesting Things You Can Do With Proteins
VII. Determine the putative structure of your protein
- Fold and secondary structure prediction
- PredictProtein Server
http://www.embl-heidelberg.de/Services/sander/predictprotein/
http://dodo.cpmc.columbia.edu/predictprotein/ (USA mirror)
Example
- Many programs, most of them based on neural networks, are used to predict protein structure
Figure 1
- The default is to run through them all sequentially. The order is as follows:
- Blast (for fast database search vs SWISSPROT)
- Maxhom (for multiple sequence alignment of similar sequences identified by BLAST)
- ProSite (scanning for functional motifs) reported only if hit found
- SEG (detection of composition-biased regions) reported only if more than 10 residues of low-complexity found
- ProDom (scanning for the putative domain structure for your protein) reported only if hit found
- Coils (prediction of coiled-coil regions) reported only if hit found
- PHDsec (prediction of secondary structure)
- PHDacc (prediction of solvent accessibility)
- PHDhtm (prediction of transmembrane helices and their topology) reported only if hit found
- NOTE: By default, the threshold for what is considered to be a membrane helix is rather restrictive. This has two consequences:
- Almost no false positives (proteins predicted to contain membrane helices that actually do NOT contain them),
- Some membrane proteins may be missed
- The results are returned via email or ftp
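The threshold trade-off described in the NOTE above can be illustrated with a much simpler method than the one PHDhtm uses. The sketch below is not PHDhtm (which is a neural-network method); it swaps in a Kyte-Doolittle hydropathy sliding window, and the function name, the window length of 19, the two cutoffs, and both toy sequences are assumptions chosen only for illustration. It shows how a stricter cutoff keeps the clearly hydrophobic segment but drops the weaker one.

```python
# Kyte-Doolittle hydropathy scale (one value per amino acid).
KD = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydrophobic_windows(seq, window=19, threshold=1.6):
    """Return 0-based start positions of windows whose mean
    hydropathy is at or above the threshold.

    This is a hydropathy heuristic used for illustration only;
    it is NOT the algorithm used by PHDhtm.
    """
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - window + 1):
        segment = seq[start:start + window]
        # Unknown residue codes contribute 0.0 to the mean.
        mean = sum(KD.get(aa, 0.0) for aa in segment) / window
        if mean >= threshold:
            hits.append(start)
    return hits


# Hypothetical toy sequences (not real proteins).
strong = "KKDE" + "L" * 19 + "DEKK"                  # very hydrophobic core
weak = "KKDE" + "ALGSALGTALGSALGTALG" + "DEKK"       # mildly hydrophobic core

for name, seq in (("strong", strong), ("weak", weak)):
    for cutoff in (1.0, 1.6):
        found = bool(hydrophobic_windows(seq, threshold=cutoff))
        print(name, "cutoff", cutoff, "candidate helix found:", found)
```

With the stricter cutoff the mildly hydrophobic segment is no longer reported, which mirrors the behavior described above: fewer false positives at the cost of missing some genuine membrane segments.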
- Meta PP
http://dodo.cpmc.columbia.edu/predictprotein/submit_meta.html
- You submit your sequence once (paste it into the WWW form) and receive results from more than 10 different programs via email. The following services are available at the moment:
- Miscellaneous services
- SignalP - prediction of presence and location of signal peptide cleavage sites in amino acid sequences from different organisms
- NetOglyc - prediction of mucin type GalNAc O-glycosylation sites in mammalian proteins.
- NetPicoRNA - predictions of cleavage sites of picornaviral proteases.
- ChloroP - predictions of whether or not a protein contains an N-terminal chloroplast transit peptide, cTP, and of probable sites for cleavage of the transit peptide.
- Secondary structure prediction
- JPRED - consensus method for protein secondary structure prediction.
- TMHMM - prediction of the location of transmembrane helices and their topology.
- TOPPRED - prediction of location and orientation of transmembrane helices.
- DAS - prediction of location of transmembrane helices.
- Threading (remote homology search)
- FRSVR - prediction-based threading, also incorporating purely sequence-based database searches.
- SAMT98 - hidden Markov model method (SAM-T98) for finding remote homologs of protein sequences.
- SWISS-MODEL - prediction of 3D structure by homology modeling (automated server).
- CPHmodels - prediction of 3D structure by homology modeling through a collection of methods and databases developed to predict protein structures.
- Transmembrane region prediction
- TMHMM
http://www.cbs.dtu.dk/services/TMHMM-1.0/
- TOPPRED
http://www.biokemi.su.se/~server/toppred2/
- Coiled coil region prediction
- MultiCoil
http://gaiberg.wi.mit.edu/cgi-bin/multicoil.pl
The MultiCoil program predicts the location of coiled-coil regions in amino acid sequences and classifies the predictions as dimeric or trimeric. The method is based on the PairCoil algorithm.
- Tertiary structure prediction by homology modeling
- Swiss-Model - Automated Protein Modeling Server
http://www.expasy.ch/swissmod/SWISS-MODEL.html
- A free service that generates a PDB coordinate file for your protein sequence of interest
- Methods of operation
- Finds all proteins with similarities to your protein and that have a known structure
- Selects as templates those matches with > 25% identity and a projected model size of > 20 residues. This also allows detection of domains which may be modeled separately.
- Creates input files
- Generates model
- Performs energy minimization
- Returns results to you as an email attachment
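The template selection step above can be sketched as a simple filter. This is a minimal illustration, not Swiss-Model's actual code: the `Hit` record, its field names, the `select_templates` function, and the example hits are all hypothetical. Only the two cutoffs (more than 25% identity, more than 20 residues) come from the description above.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """One database match to a protein of known structure (hypothetical record)."""
    template_id: str      # made-up identifier of the structure
    identity_pct: float   # percent sequence identity to the target
    start: int            # first target residue covered (1-based)
    end: int              # last target residue covered (1-based, inclusive)

    @property
    def model_size(self):
        # Number of target residues the resulting model would span.
        return self.end - self.start + 1


def select_templates(hits, min_identity=25.0, min_size=20):
    """Keep hits with identity > min_identity and model size > min_size."""
    return [
        h for h in hits
        if h.identity_pct > min_identity and h.model_size > min_size
    ]


# Hypothetical search results for a 300-residue target.
hits = [
    Hit("tmplA", 42.0, 1, 150),     # passes both cutoffs
    Hit("tmplB", 31.5, 180, 290),   # passes; covers a second region
    Hit("tmplC", 22.0, 1, 300),     # rejected: identity too low
    Hit("tmplD", 60.0, 10, 25),     # rejected: only 16 residues
]

for h in select_templates(hits):
    print(h.template_id, h.identity_pct, h.model_size)
```

Because each hit carries its own residue range, two accepted templates that cover different parts of the target (as `tmplA` and `tmplB` do here) correspond to domains that may be modeled separately.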
- Use the 'First approach' mode
- Fill in required information (name, email, title)
- Provide sequence or accession number (SWISSPROT only?)
- You may alter the P(N) or E value
- You may provide your own templates to use for alignment purposes
- Result options
- Normal - PDB model coordinates and log file
- Swiss PdbViewer - PDB model coordinates and log file as a Swiss PdbViewer project file
- Short - PDB coordinates only
- Fold Recognition
- You have the option of having your sequence forwarded to another server for fold prediction
- Swiss PdbViewer
- RasMol
- MAGE
- ModBase - Database of comparative protein structure models
http://pipe.rockefeller.edu/modbase/
- Currently, the database contains approximately 15,000 reliable models for substantial segments of approximately 4,000 proteins in the genomes of Saccharomyces cerevisiae, Mycoplasma genitalium, Methanococcus jannaschii, Caenorhabditis elegans, and Escherichia coli.
- The database also contains the alignments on which the models were based and model evaluations.
- The database is searchable by keyword or sequence alignment
- The models are derived by ModPipe, an automated modeling pipeline relying on the programs PSI-BLAST and MODELLER.
- Align your protein structure vs other protein structures
- DALI Server - Automated Protein Structure Alignment
http://www.ebi.ac.uk/dali/
- Compares protein structure in 3D
- Submit coordinate files
- Must include at least the CA coordinates
- Multiple alignment of structural neighbors is emailed back to you
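Since a submitted coordinate file must include at least the CA (alpha carbon) coordinates, it can help to check a file before submitting it. The sketch below reads CA atoms from PDB-format text using the fixed-column ATOM record layout. The function name and the three sample records (with made-up coordinates) are assumptions for illustration, and the code is not part of the DALI server.

```python
def extract_ca_atoms(pdb_text):
    """Return (chain, residue_number, x, y, z) for every CA atom.

    Uses the fixed-column layout of PDB ATOM records:
    atom name in columns 13-16, chain in column 22,
    residue number in columns 23-26, x/y/z in columns 31-54.
    Only ATOM records are read, so calcium ions stored as
    HETATM records (also named CA) are not picked up.
    """
    atoms = []
    for line in pdb_text.splitlines():
        if not line.startswith("ATOM"):
            continue
        if line[12:16].strip() != "CA":
            continue
        chain = line[21]
        resseq = int(line[22:26])
        x = float(line[30:38])
        y = float(line[38:46])
        z = float(line[46:54])
        atoms.append((chain, resseq, x, y, z))
    return atoms


# Hypothetical sample records with made-up coordinates.
sample = "\n".join([
    "ATOM      1  N   MET A   1      27.340  24.430   2.614  1.00  9.67           N",
    "ATOM      2  CA  MET A   1      26.266  25.413   2.842  1.00 10.38           C",
    "ATOM      6  CA  GLY A   2      23.100  26.000   4.500  1.00 11.00           C",
])

ca_atoms = extract_ca_atoms(sample)
print(len(ca_atoms), "CA atoms found")
for atom in ca_atoms:
    print(atom)
```

If the function returns an empty list for a file, that file has no CA records in its ATOM lines and would not meet the minimum requirement stated above.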