Re: [ccp4bb] PDBSET bug with unusual (but legal) atom names
Subject: Re: PDBSET bug with unusual (but legal) atom names
From: Tim Gruene tg {- at -} SHELX {- dot -} UNI-AC {- dot -} GWDG {- dot -} DE
Date: 2008-09-05

Even though I also don't fully understand the code, your interpretation seems to make sense. I don't know a fix for the code.

Yet, when I used pdbset from ccp4-6.0.99e, this error did not occur (but did with pdbset from ccp4-6.0.2). So if it's an option you might simply update to the latest beta version of ccp4.

$CCP4/lib/src/mmdb/mmdb_rwbrook.cpp seems completely rewritten in the beta version, since that code and your excerpt hardly share any similarity!

Tim

--
Tim Gruene
Institut fuer anorganische Chemie
Tammannstr. 4
D-37077 Goettingen

GPG Key ID = A46BEE1A

On Thu, 4 Sep 2008, Borhani, David wrote:

> I believe I found the bug, but I don't understand enough to know how to
> fix the code correctly.
>
> In $CCP4/lib/src/mmdb/mmdb_rwbrook.cpp, lines 1546-1561:
>
>   if (FTN_STR(ID)[0]==' ') {
>     atomName[0] = char(0);
>     if ((FTN_STR(AtNam)[1]=='H') ||
>         ((FTN_STR(AtNam)[1]=='D') && (FTN_STR(ID)[2]=='D'))) {
>       i = 0;
>       while ((i
>              (FTN_STR(AtNam)[0]!=HydAtomName[i][0])) i++;
>       if (i
>         GetStrTer ( atomName,FTN_STR(AtNam),4,5,FTN_LEN(AtNam) );
>     }
>     if (!atomName[0]) {
>       atomName[0] = ' ';
>       GetStrTer ( &(atomName[1]),FTN_STR(AtNam),3,4,FTN_LEN(AtNam) );
>     }
>   } else
>     GetStrTer ( atomName,FTN_STR(AtNam),4,5,4 );
>
> This code snippet is in subroutine MMDB_F_ATOM, which is called from
> $CCP4/lib/src/rwbrook.f S/R XYZATOM, which is called (multiple places
> for both input & output) from pdbset.
>
> The logic here, which I don't understand fully (in part because I don't
> really understand c++, but also because the code is so sparsely
> commented), seems to me to be wrong.
>
> It seems to be testing whether the first character of the "ID", which I
> think is the two-character atomic element name, is a space or not.
> If a space, do something special for hydrogen/deuterium, AND importantly (and
> incorrectly) ensure that the 4-character atom name on output also begins
> with a space; if not a space (correct for ALL atoms, no???), simply copy
> the input 4-character atom name (without adjustment to its
> justification) to output.
>
> Could someone more in the know confirm that this logic is flawed or not,
> and also please suggest a proper code fix?
>
> Thanks!
> Dave
>
> David Borhani, Ph.D.
> D. E. Shaw Research, LLC
> 120 West Forty-Fifth Street, 39th Floor
> New York, NY 10036
> David.Borhani@DEShawResearch.com
> 212-478-0698
> http://www.deshawresearch.com
>
> ========================================================================
> =========
>
> I used PDBSET (ccp4-6.0.2; linux; script at end of email) to generate
> symmetry-related chains (with chain renaming).
>
> Certain atom names get mangled. Specifically, the NAP residue, which as
> many of us are painfully aware is named unusually, comes out wrong:
>
> Example input:
> ATOM  12107 AOP1 NAP G1001      22.769  33.214 214.847  1.00 34.14
> O
> ATOM  12108 AP2* NAP G1001      21.800  33.997 213.998  1.00 37.32
> P
> ATOM  12109 AOP2 NAP G1001      22.509  35.058 213.208  1.00 32.04
> O
> ATOM  12110 AOP3 NAP G1001      20.909  33.196 213.051  1.00 31.50
> O
>
> INCORRECT output:
> ATOM  12107  AOP NAP M1001      22.769  33.214 214.847  1.00 34.14
> M O
> ATOM  12108  AP2 NAP M1001      21.800  33.997 213.998  1.00 37.32
> M P
> ATOM  12109  AOP NAP M1001      22.509  35.058 213.208  1.00 32.04
> M O
> ATOM  12110  AOP NAP M1001      20.909  33.196 213.051  1.00 31.50
> M O
>
> Note the incorrect one-character rightward shift of the atom name, and
> thus loss of the last character of the atom name.
>
> ALSO, the bug has nothing to do with the symmetry generation/chain
> renaming: I re-ran PDBSET, simply requesting translation of all
> coordinates by [0.00001,0,0], and I get the same incorrect atom name
> output.
>
> I waded through pdbset.f, rwbrook.f, and the new MMDB lib routines, but
> I must admit that the bug location remains totally opaque to me! Any
> help in locating the bug would be much appreciated!
>
> Thanks,
> Dave
>
> David Borhani, Ph.D.
> D. E. Shaw Research, LLC
> 120 West Forty-Fifth Street, 39th Floor
> New York, NY 10036
> David.Borhani@DEShawResearch.com
> 212-478-0698
> http://www.deshawresearch.com
>
> PDBSET scripts used (via ccp4i gui):
>
> symgen X, Y, Z
> symgen -Y+1/2, X+1/2, Z
> symgen Y-1/2, -X+1/2, Z
> symgen -X, -Y+1, Z
> chain symmetry 1 -
>     B A
> chain symmetry 2 -
>     B D
> chain symmetry 3 -
>     B C
> chain symmetry 4 -
>     B B
> chain symmetry 1 -
>     A E
> chain symmetry 2 -
>     A H
> chain symmetry 3 -
>     A G
> chain symmetry 4 -
>     A F
> chain symmetry 1 -
>     H I
> chain symmetry 2 -
>     H L
> chain symmetry 3 -
>     H K
> chain symmetry 4 -
>     H J
> chain symmetry 1 -
>     G M
> chain symmetry 2 -
>     G P
> chain symmetry 3 -
>     G O
> chain symmetry 4 -
>     G N
> End
>
> remark test
> shift -
>     1e-05 0.0 0.0
> end
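
The atom-name justification discussed in this thread can be illustrated with a small sketch. It is not the MMDB code and not a proposed patch for it; `trimBlanks`, `padAtomName` and `checkAtomNameExamples` are hypothetical names introduced only for illustration. The sketch assumes the usual PDB convention that the atom name occupies columns 13-16 with the element symbol right-justified in columns 13-14, so a name gets a leading blank only when its element symbol has one letter and the name is shorter than four characters. Special cases for hydrogen and deuterium names are deliberately ignored.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical helper (not part of MMDB): strip leading and trailing blanks.
static std::string trimBlanks(const std::string& s) {
    const std::string::size_type first = s.find_first_not_of(' ');
    if (first == std::string::npos) return std::string();
    const std::string::size_type last = s.find_last_not_of(' ');
    return s.substr(first, last - first + 1);
}

// Hypothetical helper (not part of MMDB): build the 4-character atom-name
// field for columns 13-16 of an ATOM/HETATM record.
// Assumption: hydrogen/deuterium special cases are not handled here.
std::string padAtomName(const std::string& name, const std::string& element) {
    const std::string n = trimBlanks(name);
    const std::string e = trimBlanks(element);
    if (n.empty() || n.size() > 4)
        throw std::invalid_argument("atom name must have 1 to 4 characters");

    // A full 4-character name already fills the field; shifting it right
    // would push its last character out, as in the output quoted above.
    if (n.size() == 4) return n;

    // Two-letter element symbols start in column 13; anything else is
    // treated as a one-letter element and starts in column 14.
    std::string field = (e.size() == 2) ? n : " " + n;
    field.resize(4, ' ');  // pad on the right with blanks
    return field;
}

// Checks based on the atom names quoted in this thread plus two common cases.
void checkAtomNameExamples() {
    assert(padAtomName("AOP1", "O") == "AOP1");  // must not become " AOP"
    assert(padAtomName("AP2*", "P") == "AP2*");  // must not become " AP2"
    assert(padAtomName("CA", "C") == " CA ");    // alpha carbon
    assert(padAtomName("CA", "CA") == "CA  ");   // calcium
}
```

With this rule, the NAP names from the example input (`AOP1`, `AP2*`, `AOP2`, `AOP3`) pass through unchanged, because a name that already has four characters is never shifted, whatever its element symbol.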
| ProteinCrystallography.org: Copyright 2006-2010 by Quid United Ltd |