Quick navigation: Home   |    Site Map   ||    References   |    Biography   ||    Copyright   |    Other copyright   |    Contact us   |    Advert   |   
 

Re: [ccp4bb] PDBSET bug with unusual (but legal) atom names

- Protein crystallography

Main steps:

   - Protein purification
   - Crystallisation

Special:

   - Programs for crystallography
   - X-ray detectors

Basic tutorials:

   - Chemistry
   - Protein
   - Peptide
   - Amino Acids

Xtal community:

   - CCP4BB

CCP4bb navigation

CCP4bb <-- 1999 <-- November 1999 <-- 30 November 1999
Previous message:
Subject: Pittsburgh Diffraction Society meeting and Remote Access Workshop
From: Clyde Smith csmith {- at -} SLAC {- dot -} STANFORD {- dot -} EDU
Date: 2008-09-04
Next message:
Subject: Re: PDBSET bug with unusual (but legal) atom names
From: Tim Gruene tg {- at -} SHELX {- dot -} UNI-AC {- dot -} GWDG {- dot -} DE
Date: 2008-09-05


Subject: Re: PDBSET bug with unusual (but legal) atom names
From: "Borhani, David" David {- dot -} Borhani {- at -} DESHAWRESEARCH {- dot -} COM
Date: 2008-09-05

I believe I found the bug, but I don't understand enough to know how to
fix the code correctly.

In $CCP4/lib/src/mmdb/mmdb_rwbrook.cpp, lines 1546-1561:

if (FTN_STR(ID)[0]==' ') {
atomName[0] = char(0);
if ((FTN_STR(AtNam)[1]=='H')


((FTN_STR(AtNam)[1]=='D') && (FTN_STR(ID)[2]=='D'))) {
i = 0;
while ((i (FTN_STR(AtNam)[0]!=HydAtomName[i][0])) i++;
if (i GetStrTer ( atomName,FTN_STR(AtNam),4,5,FTN_LEN(AtNam) );
}
if (!atomName[0]) {
atomName[0] = ' ';
GetStrTer ( &(atomName[1]),FTN_STR(AtNam),3,4,FTN_LEN(AtNam) );
}
} else
GetStrTer ( atomName,FTN_STR(AtNam),4,5,4 );

This code snippet is in subroutine MMDB_F_ATOM, which is called from
$CCP4/lib/src/rwbrook.f S/R XYZATOM, which is called (multiple places
for both input & output) from pdbset.

The logic here, which I don't understand fully (in part because I don't
really understand c++, but also because the code is so sparsely
commented), seems to me to be wrong.

It seems to be testing whether the first character of the "ID", which I
think is the two-character atomic element name, is a space or not. If a
space, do something special for hydrogen/deuterium, AND importantly (and
incorrectly) ensure that the 4-character atom name on output also begins
with a space; if not a space (correct for ALL atoms, no???), simply copy
the input 4-character atom name (without adjustment to its
justification) to output.

Could someone more in the know confirm that this logic is flawed or not,
and also please suggest a proper code fix?

Thanks!
Dave

David Borhani, Ph.D.
D. E. Shaw Research, LLC
120 West Forty-Fifth Street, 39th Floor
New York, NY 10036
David.Borhani@DEShawResearch.com
212-478-0698
http://www.deshawresearch.com

========================================================================
=========

I used PDBSET (ccp4-6.0.2; linux; script at end of email) to generate
symmetry-related chains (with chain renaming).

Certain atom names get mangled. Specifically, the NAP residue, which as
many of us are painfully aware is named unusually, comes out wrong:

Example input:
ATOM 12107 AOP1 NAP G1001 22.769 33.214 214.847 1.00 34.14
O
ATOM 12108 AP2* NAP G1001 21.800 33.997 213.998 1.00 37.32
P
ATOM 12109 AOP2 NAP G1001 22.509 35.058 213.208 1.00 32.04
O
ATOM 12110 AOP3 NAP G1001 20.909 33.196 213.051 1.00 31.50
O

INCORRECT output:
ATOM 12107 AOP NAP M1001 22.769 33.214 214.847 1.00 34.14
M O
ATOM 12108 AP2 NAP M1001 21.800 33.997 213.998 1.00 37.32
M P
ATOM 12109 AOP NAP M1001 22.509 35.058 213.208 1.00 32.04
M O
ATOM 12110 AOP NAP M1001 20.909 33.196 213.051 1.00 31.50
M O

Note the incorrect one-character rightward shift of the atom name, and
thus loss of the last character of the atom name.

ALSO, the bug has nothing to do with the symmetry generation/chain
renaming: I re-ran PDBSET, simply requesting translation of all
coordinates by [0.00001,0,0], and I get the same incorrect atom name
output.

I waded through pdbset.f, rwbrook.f, and the new MMDB lib routines, but
I must admit that the bug location remains totally opaque to me! Any
help in locating the bug would be much appreciated!

Thanks,
Dave

David Borhani, Ph.D.
D. E. Shaw Research, LLC
120 West Forty-Fifth Street, 39th Floor
New York, NY 10036
David.Borhani@DEShawResearch.com
212-478-0698
http://www.deshawresearch.com

PDBSET scripts used (via ccp4i gui):

symgen X, Y, Z
symgen -Y+1/2, X+1/2, Z
symgen Y-1/2, -X+1/2, Z
symgen -X, -Y+1, Z
chain symmetry 1 -
B A
chain symmetry 2 -
B D
chain symmetry 3 -
B C
chain symmetry 4 -
B B
chain symmetry 1 -
A E
chain symmetry 2 -
A H
chain symmetry 3 -
A G
chain symmetry 4 -
A F
chain symmetry 1 -
H I
chain symmetry 2 -
H L
chain symmetry 3 -
H K
chain symmetry 4 -
H J
chain symmetry 1 -
G M
chain symmetry 2 -
G P
chain symmetry 3 -
G O
chain symmetry 4 -
G N
End

remark test
shift -
1e-05 0.0 0.0
end




ProteinCrystallography.org: Copyright 2006-2010 by Quid United Ltd