| Quick navigation: | Home | Site Map || References | Biography || Copyright | Other copyright | Contact us | Advert | | |||
Re: [ccp4bb] PDBSET bug with unusual (but legal) atom names |
||||
- Protein crystallographyMain steps:- Protein purification- Crystallisation Special:- Programs for crystallography- X-ray detectors Basic tutorials:- Chemistry- Protein - Peptide - Amino Acids Xtal community:- CCP4BB |
CCP4bb navigationCCP4bb <-- 1999 <-- November 1999 <-- 30 November 1999Subject: Re: PDBSET bug with unusual (but legal) atom names From: "Borhani, David" David {- dot -} Borhani {- at -} DESHAWRESEARCH {- dot -} COM Date: 2008-09-05 I believe I found the bug, but I don't understand enough to know how to fix the code correctly. In $CCP4/lib/src/mmdb/mmdb_rwbrook.cpp, lines 1546-1561: if (FTN_STR(ID)[0]==' ') { atomName[0] = char(0); if ((FTN_STR(AtNam)[1]=='H') ((FTN_STR(AtNam)[1]=='D') && (FTN_STR(ID)[2]=='D'))) { i = 0; while ((i if (i } if (!atomName[0]) { atomName[0] = ' '; GetStrTer ( &(atomName[1]),FTN_STR(AtNam),3,4,FTN_LEN(AtNam) ); } } else GetStrTer ( atomName,FTN_STR(AtNam),4,5,4 ); This code snippet is in subroutine MMDB_F_ATOM, which is called from $CCP4/lib/src/rwbrook.f S/R XYZATOM, which is called (multiple places for both input & output) from pdbset. The logic here, which I don't understand fully (in part because I don't really understand c++, but also because the code is so sparsely commented), seems to me to be wrong. It seems to be testing whether the first character of the "ID", which I think is the two-character atomic element name, is a space or not. If a space, do something special for hydrogen/deuterium, AND importantly (and incorrectly) ensure that the 4-character atom name on output also begins with a space; if not a space (correct for ALL atoms, no???), simply copy the input 4-character atom name (without adjustment to its justification) to output. Could someone more in the know confirm that this logic is flawed or not, and also please suggest a proper code fix? Thanks! Dave David Borhani, Ph.D. D. E. Shaw Research, LLC 120 West Forty-Fifth Street, 39th Floor New York, NY 10036 David.Borhani@DEShawResearch.com 212-478-0698 http://www.deshawresearch.com ======================================================================== ========= I used PDBSET (ccp4-6.0.2; linux; script at end of email) to generate symmetry-related chains (with chain renaming). Certain atom names get mangled. Specifically, the NAP residue, which as many of us are painfully aware is named unusually, comes out wrong: Example input: ATOM 12107 AOP1 NAP G1001 22.769 33.214 214.847 1.00 34.14 O ATOM 12108 AP2* NAP G1001 21.800 33.997 213.998 1.00 37.32 P ATOM 12109 AOP2 NAP G1001 22.509 35.058 213.208 1.00 32.04 O ATOM 12110 AOP3 NAP G1001 20.909 33.196 213.051 1.00 31.50 O INCORRECT output: ATOM 12107 AOP NAP M1001 22.769 33.214 214.847 1.00 34.14 M O ATOM 12108 AP2 NAP M1001 21.800 33.997 213.998 1.00 37.32 M P ATOM 12109 AOP NAP M1001 22.509 35.058 213.208 1.00 32.04 M O ATOM 12110 AOP NAP M1001 20.909 33.196 213.051 1.00 31.50 M O Note the incorrect one-character rightward shift of the atom name, and thus loss of the last character of the atom name. ALSO, the bug has nothing to do with the symmetry generation/chain renaming: I re-ran PDBSET, simply requesting translation of all coordinates by [0.00001,0,0], and I get the same incorrect atom name output. I waded through pdbset.f, rwbrook.f, and the new MMDB lib routines, but I must admit that the bug location remains totally opaque to me! Any help in locating the bug would be much appreciated! Thanks, Dave David Borhani, Ph.D. D. E. Shaw Research, LLC 120 West Forty-Fifth Street, 39th Floor New York, NY 10036 David.Borhani@DEShawResearch.com 212-478-0698 http://www.deshawresearch.com PDBSET scripts used (via ccp4i gui): symgen X, Y, Z symgen -Y+1/2, X+1/2, Z symgen Y-1/2, -X+1/2, Z symgen -X, -Y+1, Z chain symmetry 1 - B A chain symmetry 2 - B D chain symmetry 3 - B C chain symmetry 4 - B B chain symmetry 1 - A E chain symmetry 2 - A H chain symmetry 3 - A G chain symmetry 4 - A F chain symmetry 1 - H I chain symmetry 2 - H L chain symmetry 3 - H K chain symmetry 4 - H J chain symmetry 1 - G M chain symmetry 2 - G P chain symmetry 3 - G O chain symmetry 4 - G N End remark test shift - 1e-05 0.0 0.0 end CCP4bb navigationCCP4bb <-- 1999 <-- November 1999 <-- 30 November 1999 |
|||
| ProteinCrystallography.org: Copyright 2006-2010 by Quid United Ltd |