Structure Analysis of Thymidylate Synthase

PDB ID: 2TSC      Project by Ileen Nagorner

Spacefill display generated with the RasMol molecular graphics package. The currently selected atoms are represented as solid spheres.

Color key: Oxygen, Nitrogen, Carbon, Phosphorus, Sulfur

General Information for 2TSC

Protein Name: Thymidylate Synthase

Protein Source: Escherichia coli

Protein Function: Provides the sole de novo source of dTMP for DNA biosynthesis

Thymidylate synthase (TS) catalyzes the final step in the de novo synthesis of deoxythymidine monophosphate (dTMP), converting the substrate dUMP to dTMP with the help of a folate cofactor. This conversion is the rate-limiting, irreversible step in the de novo pathway that supplies dTMP for DNA synthesis. The enzyme is essential for maintaining the balanced supply of the four DNA precursors needed for normal DNA replication, and it is an important target for certain chemotherapeutic drugs.

Protein Information: Residues: 528, Atoms: 4600, Polymer Chains: A, B

Compound: Thymidylate Synthase Complex with dUMP and an Anti-Folate

Experiment: X-ray diffraction

Remark: THE ASYMMETRIC UNIT CONSISTS OF ONE DIMER. THE TWO MONOMERS IN THE DIMER ARE EACH BOUND TO A MOLECULE OF SUBSTRATE (DUMP) AND A COFACTOR ANALOG, 10-PROPARGYL-5

 

Rasmol Images

RasMol generated image using a Wireframe Display. 

The ligands were selected from RasMol's command line and highlighted in green.

RasMol reported 110 atoms selected for the ligand selection.

 

RasMol generated image using a Wireframe Display. 

Ligands were again selected and colored green. The zoom and rotation features within RasMol allow a close-up view of the two thymidylate synthase ligands, UMP and CB3.
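As a rough cross-check of a ligand selection like the one described above, the hetero atoms in a PDB file can be counted per residue name. The following Python sketch is only an illustration and was not part of the original RasMol session. The helper name `count_hetero_atoms` is hypothetical, and the code assumes standard fixed-column PDB records (residue name in columns 18-20) and that water molecules (HOH) should be skipped.

```python
from collections import Counter


def count_hetero_atoms(pdb_lines, skip=("HOH",)):
    """Count HETATM atoms per residue name in PDB-format lines.

    Assumes fixed-column PDB records: the record name starts the line
    and the residue name occupies columns 18-20 (indices 17..19).
    """
    counts = Counter()
    for line in pdb_lines:
        if line.startswith("HETATM"):
            res_name = line[17:20].strip()
            if res_name not in skip:
                counts[res_name] += 1
    return counts


def count_from_file(path):
    """Convenience wrapper: read a local PDB file (path is an assumption)."""
    with open(path) as handle:
        return count_hetero_atoms(handle)
```

Applied to a local copy of the 2TSC entry, the per-ligand counts for UMP and CB3 could then be compared with the atom total that RasMol reports for its ligand selection.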

 

Domain  Segment  From     To
1       1        A:1:-    A:56:-
1       2        A:146:-  A:264:-
2       1        A:57:-   A:145:-

RasMol image: highlighted domains

 

Experimental Detail 

Space Group: P63
The symmetry relationships between the molecules in a crystal lattice, including translational and rotational symmetry.

 

Unit Cell

The unit cell is the simplest portion of the structure that is repeated and shows its full symmetry.

dim [Å]:     a = 127.10   b = 127.10   c = 67.90
angles [°]:  alpha = 90.00   beta = 90.00   gamma = 120.00
Crystal system: hexagonal
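The unit cell parameters above determine the cell volume. The following Python sketch uses the general triclinic volume formula, V = abc * sqrt(1 - cos²α - cos²β - cos²γ + 2 cosα cosβ cosγ), which reduces to abc * sin(γ) when alpha and beta are 90°. The function name `unit_cell_volume` is an illustrative choice, not something taken from the original analysis.

```python
import math


def unit_cell_volume(a, b, c, alpha, beta, gamma):
    """Return the unit cell volume in cubic angstroms.

    Lengths a, b, c are in angstroms; angles are in degrees.
    Uses the general (triclinic) volume formula.
    """
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)


# Cell parameters reported for 2TSC.
volume = unit_cell_volume(127.10, 127.10, 67.90, 90.0, 90.0, 120.0)
print(f"Unit cell volume: {volume:.0f} cubic angstroms")
```

For this cell the result is on the order of 9.5 x 10^5 cubic angstroms.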

R-Factor: 0.180
Compares the structure factors calculated from the model with the measured structure factors. It is a measure of agreement between the crystallographic model and the X-ray diffraction data.

Resolution: 1.97 Å
An indicator of the quality of the X-ray structure as compared with structures solved at similar resolutions. In X-ray crystallography, the precision with which atoms are located in space, usually expressed in Å (10^-10 m).

Data of 1.2 Å or better resolution is considered "high resolution."  Small numeric values for resolution mean small uncertainty, hence good resolution; larger values mean poor resolution. For example, 5.0 Å is rather poor resolution for a protein and 6 Å may resemble randomness.
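The R-factor reported above is conventionally defined as R = Σ| |Fobs| - |Fcalc| | / Σ|Fobs|, summed over all measured reflections. The following Python sketch shows that arithmetic on a tiny made-up data set. The amplitudes are hypothetical and are not taken from the 2TSC diffraction data, and real refinement programs also apply scaling between observed and calculated amplitudes, which is omitted here.

```python
def r_factor(f_obs, f_calc):
    """Return the crystallographic R-factor for paired amplitudes.

    f_obs and f_calc are equal-length sequences of observed and
    calculated structure factor amplitudes (no scale factor applied).
    """
    if len(f_obs) != len(f_calc) or len(f_obs) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    numerator = sum(abs(abs(o) - abs(c)) for o, c in zip(f_obs, f_calc))
    denominator = sum(abs(o) for o in f_obs)
    return numerator / denominator


# Hypothetical amplitudes, for illustration only.
observed = [10.0, 20.0, 30.0]
calculated = [9.0, 22.0, 30.0]
print(f"R = {r_factor(observed, calculated):.3f}")
```

A value of 0 would mean perfect agreement between model and data; the 0.180 reported for 2TSC corresponds to an average disagreement of 18% of the observed amplitudes.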

 

CATH Classification: Hierarchical Classification of Protein Domain Structure

Class: Class 3, Mixed Alpha-Beta
Class is determined according to the secondary structure composition and packing within the structures.

Architecture: 2-Layer Sandwich
Overall shape as determined by the orientations of the secondary structures, ignoring connectivity.

Topology: Thymidylate Synthase
Structures are clustered according to their topological connections and number of secondary structures. Structures are grouped into fold families at this level.

Topology Representative: 1tys00

Homologous Superfamily: Transferase (Methyltransferase)
Proteins clustered at this level have highly similar structures and functions.

Left: Domain 2tscA0     Right: Domain 2tscB0

 

SCOP Structure Classification: Structural Classification of Proteins

Classification reflects both structural and evolutionary relatedness.

Family: Clear evolutionary relationship
Proteins clustered together into families are clearly evolutionarily related (in general, >30% pairwise residue identity between the proteins).
Superfamily: Probable common evolutionary origin
Proteins that have low sequence identities, but whose structural and functional features suggest that a common evolutionary origin is probable.
Fold: Major structural similarity
Proteins are defined as having a common fold if they have the same major secondary structures in the same arrangement and with the same topological connections.

Lineage:

Conservation of function:

Secondary Structure Assignment

 

RasMol-generated image highlighting helices (orange) and sheets (green)

 

 

Chains Residues  
2TSC:A 264
2TSC:B 264

H=helix; B=residue in isolated beta bridge; E=extended beta strand; G=3-10 helix; I=pi helix; T=hydrogen bonded turn; S=bend.

Chain 2TSC:A  Sequence and secondary structure

   1 MKQYLELMQK VLDEGTQKND RTGTGTLSIF GHQMRFNLQD GFPLVTTKRC 
       HHHHHHHH HHHH EEE   TTSS EEEEE  EEEEEETTT  B   SSS   

  51 HLRSIIHELL WFLQGDTNIA YLHENNVTIW DEWADENGDL GPVYGKQWRA 
      HHHHHHHHH HHHTT  BSH HHHHTT  TT GGGTTTTSB   S HHHHHH  

 101 WPTPDGRHID QITTVLNQLK NDPDSRRIIV SAWNVGELDK MALAPCHAFF 
     EE TTS EE  HHHHHHHHHH H TT S  EE E   TTTGGG  SS  SB EE 

 151 QFYVADGKLS CQLYQRSCDV FLGLPFNIAS YALLVHMMAQ QCDLEVGDFV 
     EEEESSSEEE EEEEESEEET TTTHHHHHHH HHHHHHHHHH HTT EE EEE 

 201 WTGGDTHLYS NHMDQTHLQL SREPRPLPKL IIKRAPESIF DYRFEDFEIE 
     EEESEEEEEG GGHHHHHHHH TS     EEE EE    SSTT    GGGEEEE 

 251 GYDPHPGIKA PVAI 
     S               

Chain 2TSC:B  Sequence and secondary structure

   1 MKQYLELMQK VLDEGTQKND RTGTGTLSIF GHQMRFNLQD GFPLVTTKRC 
       HHHHHHHH HHHH EEE   SSSS EEEEE  EEEEEESTT      SSS   

  51 HLRSIIHELL WFLQGDTNIA YLHENNVTIW DEWADENGDL GPVYGKQWRA 
      HHHHHHHHH HHHHT  BSH HHHTTT  TT GGG  SSSB   S HHHHHH  

 101 WPTPDGRHID QITTVLNQLK NDPDSRRIIV SAWNVGELDK MALAPCHAFF 
     EE TTS EE  HHHHHHHHHH H TT S  EE E   GGGTTT SSS  SB EE 

 151 QFYVADGKLS CQLYQRSCDV FLGLPFNIAS YALLVHMMAQ QCDLEVGDFV 
     EEEE SSEEE EEEEESEEET TTTHHHHHHH HHHHHHHHHH HHT EE EEE 

 201 WTGGDTHLYS NHMDQTHLQL SREPRPLPKL IIKRAPESIF DYRFEDFEIE 
     EEESEEEEEG GGHHHHHHHH TS      EE EE    SSGG GTTGGGEEEE 

 251 GYDPHPGIKA PVAI 
     S  
         
Amino acid sequence with display of secondary structure elements.
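The one-letter codes in the listings above can be tallied to estimate how much of a chain is helix or strand. The following Python sketch groups H, G and I as helix and E and B as strand, and counts everything else (T, S and unassigned positions) as other. The grouping and the function name `summarize_secondary_structure` are illustrative choices. The example string is hypothetical: the listings above contain spaces that separate blocks of ten residues as well as spaces for unassigned residues, so they would need to be realigned against the sequence before counting.

```python
HELIX_CODES = set("HGI")   # alpha, 3-10 and pi helix
STRAND_CODES = set("EB")   # extended strand and isolated beta bridge


def summarize_secondary_structure(ss):
    """Count helix, strand and other positions in a secondary structure string."""
    counts = {"helix": 0, "strand": 0, "other": 0}
    for code in ss:
        if code in HELIX_CODES:
            counts["helix"] += 1
        elif code in STRAND_CODES:
            counts["strand"] += 1
        else:
            counts["other"] += 1
    return counts


# Hypothetical 15-residue assignment, for illustration only.
example = "HHHHGGG EEEB TS"
summary = summarize_secondary_structure(example)
for kind, n in summary.items():
    print(f"{kind}: {n} ({100.0 * n / len(example):.0f}%)")
```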

  Structure & Sequence Alignments

Combinatorial Extension (CE) Structure Alignment - 2TSC:A Neighbors

Sequence alignment based on assembled pairwise structure alignments between 2TSC:A and its neighbors. Light color indicates not-aligned residues in structural neighbors.

Search parameters to find neighbors: Z-score > 4.0, RMSD < 3.0 Å, Gaps < 30.0%, Sequence identity < 40.0%

  2TSC:A    1/2     MKQYLELMQKVLDEGTQKN---DRT-------GTGTLSIFGHQMRFNLQDGFPLVTTKRC
  1B02:A    5/6     DKQYNSIIKDIINNGISDEEFDVRTKWDSDGTPAHTLSVISKQMRFDNS-EVPILTTKKV
  1B5E:A    9/10    -EEIRLHLGLALKEKDFVV---DKT-------GVKTIEIIGASFVAD-----EPFIFGAL

  2TSC:A   51/52    HLRSIIHELLWF-LQGDTNIAYLHENNVTIWDEWADENGDLGPVYGKQWRAWPTPDGRHI
  1B02:A   64/65    AWKTAIKELLWIWQLKSNDVNDLNMMGVHIWDQWKQEDGTIGHAYGFQLGK-------KN
  1B5E:A   53/54    NDEYIQRELEWY-KSKSLFVKDIPGETPKIWQQVASSKGEINSNYGWAIWSED-----NY

  2TSC:A  110/111   --------DQITTVLNQLKNDPDSRRIIVSAWNVGELDKM-----ALAPCHAFFQFYVAD
  1B02:A  117/118   RSLNGEKVDQVDYLLHQLKNNPSSRRHITMLWNPDELDAM-----ALTPCVYETQWYVKH
  1B5E:A  107/108   --------AQYDMCLAELGQNPDSRRGIMIYTRPSMQFDYNKDGMSDFMCTNTVQYLIRD

  2TSC:A  157/158   GKLSCQLYQRSCDVFLGLPFNIASYALLVHMMAQQCD-------LEVGDFVWTGGDTHLY
  1B02:A  172/173   GKLHLEVRARSNDMALGNPFNVFQYNVLQRMIAQVTG-------YELGEYIFNIGDCHVY
  1B5E:A  159/160   KKINAVVNMRSNDVVFGFRNDYAWQKYVLDKLVSDLNAGDSTRQYKAGSIIWNVGSLHVY

  2TSC:A  210/211   SNHMDQTHLQLSREPRPLPKLIIKRAPESIFDYRFEDFEIEGYDPHPGIKAPVAI
  1B02:A  225/226   TRHIDNLKIQMEREQFEAPELWINPEVKDFYDFTIDDFKLINYKHGDKLLFEVAV
  1B5E:A  219/220   SRHFYLVDHWWKTGE----------------------------------------

Pairwise CE statistics:

Pair                RMSD (Å)  Sequence identity (%)  Length of alignment  Z-score
1B02:A vs 2TSC:A    1.2       37.1                   256                  7.3
1B5E:A vs 2TSC:A    2.8       20.7                   213                  6.8
1B5E:A vs 1B02:A    2.8       22.7                   211                  -1.0
CE Result Observations

RMSD: All RMSD values (1.2, 2.8 and 2.8 Å) are under 3 Å, which indicates strong structural similarity. Values approaching 6 Å indicate essentially random similarity.
Z-scores: Proteins with a similar fold typically have a Z-score of 3.5 or better, so the Z-scores of 7.3 and 6.8 indicate structural similarity between these proteins.
Sequence Identity: The search was limited to structures with < 40% sequence identity (37.1, 20.7 and 22.7%). As a result, 1B02:A and 1B5E:A show only limited sequence identity.
Alignment Lengths: The alignment lengths are 256, 213 and 211 residues.
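The search thresholds listed earlier (Z-score > 4.0, RMSD < 3.0 Å, sequence identity < 40%) can be expressed as a simple filter over the neighbor statistics. The following Python sketch is an illustration only and does not reproduce how the CE server works. The `Neighbor` class and `passes_filters` function are hypothetical names, the values are the ones read from the statistics above for the comparisons against 2TSC:A, and the gap criterion is left out because gap percentages are not reported here.

```python
from dataclasses import dataclass


@dataclass
class Neighbor:
    chain: str
    rmsd: float       # angstroms
    identity: float   # percent sequence identity
    length: int       # alignment length in residues
    z_score: float


def passes_filters(n, min_z=4.0, max_rmsd=3.0, max_identity=40.0):
    """Apply the Z-score, RMSD and sequence identity thresholds."""
    return n.z_score > min_z and n.rmsd < max_rmsd and n.identity < max_identity


# Statistics against 2TSC:A, as read from the CE results above.
neighbors = [
    Neighbor("1B02:A", 1.2, 37.1, 256, 7.3),
    Neighbor("1B5E:A", 2.8, 20.7, 213, 6.8),
]
kept = [n.chain for n in neighbors if passes_filters(n)]
print(kept)
```

Both neighbors satisfy all three thresholds, which is consistent with their appearing in the result list.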
CLUSTAL W (1.82) multiple sequence alignment
1B02_A          --MTQFDKQYNSIIKDIINNGISDEEFDVRTKWDSDGTPAHTLSVISKQMRFDNSE-VPI 57
2TSC_A          CHAINAMKQYLELMQKVLDEGTQK----------NDRTGTGTLSIFGHQMRFNLQDGFPL 50
1B5E_A          ---MISDSMTVEEIRLHLGLALKEKD------FVVDKTGVKTIEIIGASFVADEPFIFGA 51
                      .   . ::  :. . ..           * * . *:.::. .:  :    .  

1B02_A          LTTKKVAWKTAIKELLWIWQLKSNDVNDLNMMGVHIWDQWKQEDGTIGHAYGFQLGKKNR 117
2TSC_A          VTTKRCHLRSIIHELLWFLQGDTN-IAYLHENNVTIWDEWADENGDLGPVYGKQWR-AWP 108
1B5E_A          LN------DEYIQRELEWYKSKSLFVKDIPGETPKIWQQVASSKGEINSNYG------WA 99
                :.         *:. *   : .:  :  :      **::  ...* :.  **        

1B02_A          SLNGEKVDQVDYLLHQLKNNPSSRRHITMLWNP-----DELDAMALTPCVYETQWYVKHG 172
2TSC_A          TPDGRHIDQITTVLNQLKNDPDSRRIIVSAWNV-----GELDKMALAPCHAFFQFYVADG 163
1B5E_A          IWSEDNYAQYDMCLAELGQNPDSRRGIMIYTRPSMQFDYNKDGMSDFMCTNTVQYLIRDK 159
                  .  :  *    * :* ::*.*** *    .       : * *:   *    *: : . 
1B02_A          KLHLEVRARSNDMALGNPFN-VFQYNVLQRMIAQVTG------YELGEYIFNIGDCHVYT 225
2TSC_A          KLSCQLYQRSCDVFLGLPFN-IASYALLVHMMAQQCD------LEVGDFVWTGGDTHLYS 216
1B5E_A          KINAVVNMRSNDVVFGFRNDYAWQKYVLDKLVSDLNAGDSTRQYKAGSIIWNVGSLHVYS 219
                *:   :  ** *: :*   :   .  :* :::::          : *. ::. *. *:*:

1B02_A          RHIDNLKIQMEREQFEAPELWINPEVKDFYDFTIDDFKLINYKHGDKLLFEVAV 279
2TSC_A          NHMDQTHLQLSREPRPLPKLIIKRAPESIFDYRFEDFEIEGYDPHPGIKAPVAI 270
1B5E_A          RHFYLVDHWWKTG----ETHISKKDYVGKYA----------------------- 246
                .*:   .   .           :    . :  

"*"  residues or nucleotides in that column are identical in all sequences in the alignment

":"  conserved substitutions have been observed

"."  semi-conserved substitutions are observed
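The identity columns marked in the alignment above can also be summarized as a pairwise percent identity. The following Python sketch counts identical residues over the columns where neither sequence has a gap. This is one of several common conventions for the denominator, so the result will not necessarily match the identity values reported by CE or CLUSTAL W. The function name `percent_identity` is an illustrative choice; the two rows are the final block of the CLUSTAL W alignment for 1B02_A and 2TSC_A.

```python
def percent_identity(seq_a, seq_b, gap="-"):
    """Percent identity over aligned columns where neither sequence has a gap."""
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for x, y in zip(seq_a, seq_b):
        if x == gap or y == gap:
            continue  # skip columns with a gap in either sequence
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


# Final alignment block for 1B02_A and 2TSC_A, copied from the listing above.
row_1b02 = "RHIDNLKIQMEREQFEAPELWINPEVKDFYDFTIDDFKLINYKHGDKLLFEVAV"
row_2tsc = "NHMDQTHLQLSREPRPLPKLIIKRAPESIFDYRFEDFEIEGYDPHPGIKAPVAI"
block_identity = percent_identity(row_1b02, row_2tsc)
print(f"Identity over this block: {block_identity:.1f}%")
```

Running the same function over the concatenated rows of each sequence would give a whole-alignment figure under this convention.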

  Molecular Motions

1. Motions of Fragments Smaller than Domains

A. Motion is predominantly Shear (Known Fragment Motion - Shear Mechanism)

1. Proteins for which open and closed conformations are known

  • Thymidylate Synthase [tms]

  • Same classification as insulin

Animations of pathways between conformations within thymidylate synthase