Re: [ccp4bb] Highest shell standards
Subject: Re: Highest shell standards
From: Ulrich Genick <genick {- at -} BRANDEIS {- dot -} EDU>
Date: 2007-03-21

The first thing to keep in mind is that the goal of a structure determination is not to get the best stats or to claim the highest possible resolution. The goal is to get the best possible structure and to be confident that observed features in a structure are real and not the result of noise.

From that perspective, if any of the conclusions one draws from a structure change depending on whether one includes data with an I/sigI of 2 or of 1 in the highest resolution shell, one is probably treading on thin ice.

The general guideline that one should include only data for which the shell's average I/sigI > 2 comes from the following simple consideration. Since I is proportional to F^2, first-order error propagation gives sigI = 2 F sigF, and therefore

    F/sigF = 2 I/sigI

So if you include data with an I/sigI of 2, then your F/sigF = 4. In other words, you will have a roughly 25% experimental uncertainty in your F.

Now assume that you actually knew the true structure of your protein and calculated the crystallographic R-factor between the Fcalcs from that true structure and the observed F. In this situation, you would expect to get a crystallographic R-factor of around 25%, simply because of the average error in your experimental structure factors. Since most macromolecular structures have R-factors around 20%, it makes little sense to include data for which the experimental uncertainty alone guarantees that your R-factor will be worse. Of course, these days maximum-likelihood refinement will just down-weight such data, and all you do is burn CPU cycles.

If you actually want to do a semi-rigorous test of where you should stop including data, simply include increasingly higher-resolution data in your refinement and see if your structure improves.

If you have really high-resolution data (i.e. better than 1.2 Angstrom), you can do matrix inversion in SHELX and get estimated standard deviations (esds) for your refined parameters. As you include more and more data, the esds should initially decrease. Simply keep including higher-resolution data until your esds start to increase again.

Similarly, for lower-resolution data you can monitor some molecular parameters that are not included in the stereochemical restraints and see if the inclusion of higher-resolution data improves the agreement between the observed and expected parameters. For example, SHELX does not restrain torsion angles in aliphatic portions of side chains. If your structure improves, those angles should cluster more tightly around +60, -60 and 180 degrees...

Cheers,

Ulrich

> Could someone point me to some standards for data quality,
> especially for publishing structures? I'm wondering in particular
> about highest shell completeness, multiplicity, sigma and Rmerge.
>
> A co-worker pointed me to a '97 article by Kleywegt and Jones:
>
> http://xray.bmc.uu.se/gerard/gmrp/gmrp.html
>
> "To decide at which shell to cut off the resolution, we nowadays
> tend to use the following criteria for the highest shell:
> completeness > 80 %, multiplicity > 2, more than 60 % of the
> reflections with I > 3 sigma(I), and Rmerge < 40 %. In our opinion,
> it is better to have a good 1.8 Å structure, than a poor 1.637 Å
> structure."
>
> Are these recommendations still valid with maximum likelihood
> methods? We tend to use more data, especially in terms of the
> Rmerge and sigma cutoff.
>
> Thanks in advance,
>
> Shane Atwell
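
For anyone who wants to check the F/sigF = 2 I/sigI argument in the reply numerically, here is a minimal sketch in Python. It assumes only that I = F^2 and that errors propagate to first order. The function names are made up for illustration and are not part of SHELX or any CCP4 program.

```python
def f_over_sigf(i_over_sigi: float) -> float:
    """Signal-to-noise of the amplitude F implied by that of the intensity I.

    Assumes I = F**2 and first-order error propagation:
    sigI = 2 * F * sigF, hence F/sigF = 2 * (I/sigI).
    """
    return 2.0 * i_over_sigi


def fractional_error_in_f(i_over_sigi: float) -> float:
    """Relative uncertainty sigF/F implied by a given I/sigI."""
    return 1.0 / f_over_sigf(i_over_sigi)


if __name__ == "__main__":
    # Tabulate a few candidate cutoffs for the highest resolution shell.
    for ratio in (1.0, 2.0, 3.0):
        print(
            f"I/sigI = {ratio:.1f}   "
            f"F/sigF = {f_over_sigf(ratio):.1f}   "
            f"sigF/F = {fractional_error_in_f(ratio):.0%}"
        )
```

At I/sigI = 2 this gives F/sigF = 4, i.e. the roughly 25% uncertainty in F quoted in the reply, and at I/sigI = 1 it gives 50%. First-order propagation becomes rough when sigI is not small compared with I, so the numbers for weak shells are best read as estimates.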