Revision as of 11:26, 16 July 2004
Protein structural alignment is a form of alignment that tries to establish equivalences between two or more protein structures based on their fold. In contrast to simple structural superposition, where at least some equivalent residues of the two structures are known, structural alignment requires no a priori knowledge of equivalent positions. Structural alignment is a valuable tool for comparing proteins in the so-called "twilight zone" and "midnight zone" of homology, where relationships between proteins cannot be detected by sequence alignment methods. The method can therefore be used to establish evolutionary relationships between proteins that share little or no similarity in primary structure. This is especially important in light of structural genomics and proteomics projects. The result of a structural alignment of two proteins is a superposition of their atomic coordinate sets with a minimal root mean square deviation (RMSD) between the two structures.
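As a rough illustration of the RMSD measure mentioned above, the following Python sketch computes the RMSD between two coordinate sets whose equivalences are already known (atom i of one structure is paired with atom i of the other). It is only a simplified sketch: the function names and the toy coordinates are hypothetical, and only the translation is removed by centring both sets on their centroids. A full superposition would also optimise the rotation, for example with the Kabsch algorithm, which is not shown here.

```python
import math

def centroid(coords):
    """Mean position of a list of (x, y, z) tuples."""
    n = len(coords)
    return tuple(sum(p[k] for p in coords) / n for k in range(3))

def center(coords):
    """Translate the coordinates so that their centroid lies at the origin."""
    c = centroid(coords)
    return [tuple(p[k] - c[k] for k in range(3)) for p in coords]

def rmsd(coords_a, coords_b):
    """Root mean square deviation over paired atoms.

    Assumes coords_a[i] is equivalent to coords_b[i]; no rotation is applied.
    """
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("need two non-empty coordinate lists of equal length")
    total = sum(math.dist(p, q) ** 2 for p, q in zip(coords_a, coords_b))
    return math.sqrt(total / len(coords_a))

# Toy C-alpha traces (hypothetical data): b is a translated copy of a.
a = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0)]
b = [(10.0, 5.0, 1.0), (13.8, 5.0, 1.0), (17.6, 5.0, 1.0)]

print(rmsd(a, b))                  # large: the structures are not superposed
print(rmsd(center(a), center(b)))  # close to zero once the translation is removed
```

The second value drops to nearly zero because the two toy structures differ only by a translation. For real proteins the remaining deviation after optimal superposition is what is reported as the RMSD of the alignment.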
Algorithms
To date there is no definitive algorithmic solution to structural alignment. The alignment problem has been shown to be NP-hard, so all current algorithms employ heuristic methods. Different algorithms may therefore not produce exactly the same results for the same alignment problem.
Representation of structures
Protein structures have to be represented in some coordinate-independent space to make them comparable. One possible representation is the so-called distance matrix, a two-dimensional matrix containing the pairwise distances between all Cα atoms of the protein backbone. This can also be represented as a set of overlapping sub-matrices, each spanning only a fragment of the protein. Another possible representation reduces the protein structure to the level of secondary structure elements (SSEs), which can be represented as vectors and can carry additional information about their relationships to other SSEs, as well as about certain biophysical properties.
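The distance matrix representation can be sketched in a few lines of Python. The example below is illustrative only: the function names and the four Cα positions are hypothetical. It builds the matrix for a toy structure and checks that the matrix does not change when the structure is rotated and translated, which is what makes the representation coordinate-independent.

```python
import math

def distance_matrix(ca_coords):
    """All pairwise distances between C-alpha positions, as a list of rows."""
    return [[math.dist(p, q) for q in ca_coords] for p in ca_coords]

def rotate_z_90(p):
    """Rotate a point by 90 degrees about the z axis."""
    x, y, z = p
    return (-y, x, z)

def translate(p, shift):
    """Shift a point by the given (dx, dy, dz) vector."""
    return tuple(a + s for a, s in zip(p, shift))

def max_abs_difference(m1, m2):
    """Largest element-wise difference between two equally sized matrices."""
    return max(abs(x - y) for r1, r2 in zip(m1, m2) for x, y in zip(r1, r2))

# Toy C-alpha trace (hypothetical coordinates, in Angstrom).
ca = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (5.0, 3.6, 0.0), (5.0, 3.6, 3.8)]

# The same structure in a different orientation and position.
moved = [translate(rotate_z_90(p), (10.0, -4.0, 2.5)) for p in ca]

dm = distance_matrix(ca)
dm_moved = distance_matrix(moved)

# The two matrices agree up to floating-point rounding.
print(max_abs_difference(dm, dm_moved))
```

A sub-matrix for a fragment can be taken by slicing the same rows and columns, for example `[row[0:3] for row in dm[0:3]]` for the first three residues.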
Comparison and optimization
In the case of the distance matrix representation, the comparison algorithm breaks the distance matrices down into overlapping regions, which are then recombined wherever adjacent fragments overlap, thereby extending the alignment. If the SSE representation is chosen, there are several possibilities. One can search for the largest set of mutually compatible equivalent SSE pairs using algorithms that solve the maximum clique problem from graph theory. Other approaches employ dynamic programming or combinatorial simulated annealing.
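The maximum clique formulation for SSEs can be illustrated with a small Python sketch. Everything in it is an assumption made for illustration, not the method of any particular package: each SSE is reduced to a type ("H" for helix, "E" for strand) and a midpoint, two SSE pairs are called compatible when the distance between the SSEs in the first protein matches the distance between their partners in the second within a tolerance `tol`, and the largest clique is found with the Bron-Kerbosch algorithm with pivoting. Real methods typically also compare angles and other properties of the SSE vectors.

```python
import math
from itertools import combinations

def build_compatibility_graph(sses_a, sses_b, tol=1.0):
    """Nodes are index pairs (i, j) of same-type SSEs; edges join compatible pairs.

    sses_a and sses_b are lists of (type, midpoint) tuples (a simplification).
    """
    nodes = [(i, j)
             for i, (type_a, _) in enumerate(sses_a)
             for j, (type_b, _) in enumerate(sses_b)
             if type_a == type_b]
    adj = {node: set() for node in nodes}
    for (i, j), (k, l) in combinations(nodes, 2):
        if i == k or j == l:
            continue  # an SSE may be matched only once
        d_a = math.dist(sses_a[i][1], sses_a[k][1])
        d_b = math.dist(sses_b[j][1], sses_b[l][1])
        if abs(d_a - d_b) <= tol:
            adj[(i, j)].add((k, l))
            adj[(k, l)].add((i, j))
    return adj

def maximum_clique(adj):
    """Largest clique of the graph, found by Bron-Kerbosch with pivoting."""
    best = set()

    def expand(r, p, x):
        nonlocal best
        if not p and not x:
            if len(r) > len(best):
                best = set(r)
            return
        pivot = max(p | x, key=lambda v: len(adj[v] & p))
        for v in list(p - adj[pivot]):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(set(), set(adj), set())
    return sorted(best)

# Toy proteins (hypothetical data): b is a shifted copy of a plus one extra strand.
protein_a = [("H", (0.0, 0.0, 0.0)), ("E", (4.0, 0.0, 0.0)), ("H", (4.0, 7.0, 0.0))]
protein_b = [("H", (10.0, 10.0, 10.0)), ("E", (14.0, 10.0, 10.0)),
             ("H", (14.0, 17.0, 10.0)), ("E", (40.0, 40.0, 40.0))]

graph = build_compatibility_graph(protein_a, protein_b)
print(maximum_clique(graph))
```

For this toy input the largest clique pairs the three SSEs of the first protein with the first three SSEs of the second, and the extra strand is left unmatched. Because the maximum clique problem is itself NP-hard, the exhaustive search shown here is only practical for the small graphs that arise at the SSE level.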
Packages
Several tools for pairwise and multiple structural alignment are available on the web:
- CE (Combinatorial Extension of the optimum path): http://cl.sdsc.edu/ce.html
- DALI (Distance Matrix Alignment): http://www.ebi.ac.uk/dali/
- HOMSTRAD (Homologous Structure Alignment Database): http://www-cryst.bioc.cam.ac.uk/~homstrad/
- SARF2 (Spatial Arrangement of Backbone Fragments): http://123d.ncifcrf.gov/sarf2.html
- SSAP (Sequential Structure Alignment Program): http://www.biochem.ucl.ac.uk/cgi-bin/cath/GetSsapRasmol.pl
See also
References
- Bourne, P.E. & Shindyalov, I.N. (2003): Structure Comparison and Alignment. In: Bourne, P.E., Weissig, H. (Eds): Structural Bioinformatics. Hoboken NJ: Wiley-Liss. ISBN 0-471-20200-2