Re: [ccp4bb] Does NCS bias a randomly-chosen test set (even if not enforced)?

 


CCP4bb navigation

CCP4bb <-- 2008 <-- February 2008 <-- 11 February 2008
Previous message:
Subject: Protein crystallography position at Sussex University
From: Darren Thompson D {- dot -} Thompson {- at -} SUSSEX {- dot -} AC {- dot -} UK
Date: 2008-02-11
Next message:
Subject: Re: Protein-protein docking
From: Walter Novak wnovak {- at -} BRANDEIS {- dot -} EDU
Date: 2008-02-11


Subject: Re: Does NCS bias a randomly-chosen test set (even if not enforced)?
From: Dirk Kostrewa kostrewa {- at -} LMB {- dot -} UNI-MUENCHEN {- dot -} DE
Date: 2008-02-11

Dear Ed,

Although I don't think that comparing refinement in a higher- and a
lower-symmetry space group is valid for general NCS cases, I will try
to answer your question. Here are my thoughts for two different cases:

(1) You have data to atomic resolution with high I/sigma and low Rsym
(I assume high redundancy). The n copies of the asymmetric unit in
the unit cell are really identical and obey the higher symmetry (so,
not a protein crystal). When you process the data in lower symmetry
(say, P1), the non-averaged "higher-symmetry"-equivalent Fobs will
differ due to measurement errors, and thus reflections in the
working-set will differ from "higher-symmetry"-related reflections in
the test-set due to these measurement errors. If you then refine the n
copies against the working-set in the lower P1 symmetry, you minimize
|Fobs(work)-Fcalc|, resulting in Fcalcs that become closer to the
working-set Fobs. As a consequence, the Fcalcs will diverge somewhat
from the test-set Fobs. However, since this atomic model is assumed
to be very well defined obeying the higher symmetry, and,
furthermore, the working-set contains well measured "higher-symmetry"-
equivalent Fobs, the resulting atomic positions, and thus the Fcalcs,
will be very close to their equivalent values in the higher-symmetry
refinement. Therefore, the Fcalcs will also be still very similar to
the "higher-symmetry"-equivalent Fobs in the test-set, and I would
expect a difference between Rwork and Rfree ranging from "0" to the
value of Rsym. In other words, the Fobs in the test-set are not
really independent of the reflections in the working-set, and thus
Rfree is heavily biased towards Rwork.
In this case, I would not expect large differences in the outcome due
to the additional application of "NCS"-constraints/restraints.
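
As a rough numerical illustration of case (1), here is a minimal Python sketch under stated assumptions; it is a toy, not a real refinement. Each unique reflection is measured twice (one mate in the working-set, one in the test-set), the model is assumed to obey the higher symmetry so that both mates share one Fcalc, and a hypothetical parameter `fit` says how much of the working-set measurement error the Fcalc has absorbed. All names (`r_factor`, `simulate_case1`, `sigma_meas`, `fit`, `r_pair`) are made up for this example, and `r_pair` is only an amplitude-based stand-in for Rsym, which is normally computed on intensities. Because the Fcalc is shared, the triangle inequality |Fobs(test)-Fcalc| <= |Fobs(test)-Fobs(work)| + |Fobs(work)-Fcalc| limits how far Rfree can rise above Rwork.

```python
import random


def r_factor(f_obs, f_calc):
    """Conventional R = sum|Fobs - Fcalc| / sum(Fobs)."""
    return sum(abs(o - c) for o, c in zip(f_obs, f_calc)) / sum(f_obs)


def simulate_case1(n=2000, sigma_meas=0.02, fit=0.7, seed=0):
    """Toy model: each unique reflection has a work mate and a test mate."""
    rng = random.Random(seed)
    f_true = [rng.uniform(50.0, 150.0) for _ in range(n)]
    # Two "higher-symmetry"-equivalent measurements, kept separate in P1.
    f_work = [f * (1.0 + rng.gauss(0.0, sigma_meas)) for f in f_true]
    f_test = [f * (1.0 + rng.gauss(0.0, sigma_meas)) for f in f_true]
    # Assumption: the model obeys the higher symmetry, so both mates share
    # one Fcalc; 'fit' is the fraction of the working-set error it absorbs.
    f_calc = [t + fit * (w - t) for t, w in zip(f_true, f_work)]
    r_work = r_factor(f_work, f_calc)
    r_free = r_factor(f_test, f_calc)
    # Amplitude-based stand-in for Rsym between the two mates.
    r_pair = sum(abs(w - t) for w, t in zip(f_work, f_test)) / sum(f_test)
    # Triangle-inequality bound on Rfree, valid whenever Fcalc is shared.
    bound = r_pair + r_work * sum(f_work) / sum(f_test)
    return r_work, r_free, r_pair, bound


if __name__ == "__main__":
    r_work, r_free, r_pair, bound = simulate_case1()
    print(f"Rwork={r_work:.4f} Rfree={r_free:.4f} "
          f"Rpair={r_pair:.4f} bound={bound:.4f}")
```

With `fit=1.0` (the Fcalc reproduces the working-set Fobs exactly) Rwork drops to zero and Rfree equals `r_pair`; with `fit=0.0` the two R values differ only by sampling noise. The sketch says nothing about how a real refinement program arrives at its Fcalc.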

(2) You have data to non-atomic lower resolution, weak I/sigma and
poor Rsym. It is impossible to say whether the n copies of the
asymmetric unit in the unit cell are really identical, but they are
treated so assuming the higher symmetry (so, a real protein crystal).
For data processing, the same holds true as for case (1). In
contrast, here I think that it makes a difference whether or not you
apply "NCS"-constraints/restraints between the n copies in the lower
symmetry P1. If you apply "NCS"-constraints or strong
"NCS"-restraints, the n copies are made equal and you get n times the
average structure. This is similar to the refinement in the higher
symmetry, except that again you minimize the discrepancy between
Fcalcs and working-set Fobs, which will increase the discrepancy to
the "higher-symmetry"-related Fobs in the test-set. But since the
Fobs in the test-set are still not really independent of the Fobs in
the working-set, I would again expect maximum differences between
Rwork and Rfree of the same order of magnitude as Rsym. So, Rfree is
still biased towards Rwork, but it might be more difficult to notice
this. But if you do not apply "NCS"-constraints/restraints, you give
the less well-defined atomic model more freedom to converge towards
the working-set Fobs, resulting in a larger discrepancy between Rwork
and Rfree. But since the Fobs in the working set still contain
"higher-symmetry"-equivalent Fobs, you will end up with a model that
still shows some similarity to the refined structure in the higher
symmetry. As a result, the Rfree is even then not really independent
of Rwork, but it might be even more difficult to notice this,
depending on data resolution and quality. Here, I can't give a range
of differences between Rwork and Rfree.
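
To make the qualitative argument of case (2) a bit more concrete, here is a minimal Python sketch under strong assumptions; it is a toy and not a refinement program. A hypothetical parameter `coupling` describes how much of a shift in Fcalc, driven only by the working-set mate, carries over to the test-set mate: 1.0 mimics "NCS"-constraints, 0.0 mimics completely independent copies, and values in between mimic a model that still retains some of the higher symmetry. The names `rfree_gap`, `coupling`, `step`, `sigma_model` and `sigma_meas`, and all numerical values, are made up for illustration.

```python
import random


def r_factor(f_obs, f_calc):
    """Conventional R = sum|Fobs - Fcalc| / sum(Fobs)."""
    return sum(abs(o - c) for o, c in zip(f_obs, f_calc)) / sum(f_obs)


def rfree_gap(coupling, n=4000, sigma_model=20.0, sigma_meas=3.0,
              step=0.8, seed=1):
    """Return Rfree - Rwork after one toy 'refinement' shift."""
    rng = random.Random(seed)  # same seed: identical data for every coupling
    fo_work, fo_test, fc_work, fc_test = [], [], [], []
    for _ in range(n):
        f_true = rng.uniform(100.0, 300.0)
        fo_w = f_true + rng.gauss(0.0, sigma_meas)  # mate in the working-set
        fo_t = f_true + rng.gauss(0.0, sigma_meas)  # mate in the test-set
        # Symmetric starting model: both mates begin with the same Fcalc.
        fc_start = f_true + rng.gauss(0.0, sigma_model)
        # The shift is driven by the working-set mate only.
        shift = step * (fo_w - fc_start)
        fo_work.append(fo_w)
        fo_test.append(fo_t)
        fc_work.append(fc_start + shift)
        # Assumption: only a fraction 'coupling' of the shift reaches the mate.
        fc_test.append(fc_start + coupling * shift)
    return r_factor(fo_test, fc_test) - r_factor(fo_work, fc_work)


if __name__ == "__main__":
    for c in (1.0, 0.5, 0.0):
        print(f"coupling={c:.1f}  Rfree-Rwork={rfree_gap(c):.4f}")
```

In this toy the gap between Rwork and Rfree grows as the coupling is reduced, which is the trend argued above. How strong the coupling really is in an unrestrained refinement of a model with (pseudo-)symmetry is exactly the open question, and the sketch does not answer it.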

So, this is still not quantitative, and I hope that I'm not
completely wrong in my reasoning.

The lower vs. higher symmetry examples given above are only
transferable to reality in special NCS cases with pseudo-higher
symmetry (what Dale Tronrud discussed). Setting these special cases
aside, what do the NCS experts say about my original statement that
precautions against NCS bias in Rfree need only be taken if
NCS-constraints/restraints are actually applied during refinement?

Best regards,

Dirk.

On 08.02.2008 at 21:43, Edward A. Berry wrote:

> Clarification-
>
> Someone wrote:
>>> Ah- that's going way too fast for the beginners, at least one of
>>> them!
>>> Can someone explain why the R-free will be very close to the R-work,
>>> preferably in simple concrete terms like Fo, Fc, at sym-related
>>> reflections, and the change in the Fc resulting from a step of
>>> refinement?
>>>
>>> Ed
>> Hi Ed,
>> Here's what I think they're saying:
>> If the NCS is almost crystallographic, then one wedge of spots
>> will be almost identical to another wedge. If spot "a" is in the
>> test set, but the almost-crystallographically identical spot "a' "
>> in the 2nd wedge isn't, then because you're refining directly
>> against a', spot a doesn't really count as "free".
>> Was that the question?
> Thanks, but,
>
> Here we are talking about refining a structure in an artificially low
> space group, to get away from the complexities of the G-function and
> degree of overlap. The "NCS" brings a reflection in the test set
> exactly onto a reflection in the work set. I'm asking "so what?"
>
> Think about what you mean when you say "spot a and spot a' are
> crystallographically identical".
>
> Do you mean the Fo are identical?
> They are not, because if we consider it a lower space group then
> we will not average these spots, but have separate experimentally
> determined values for them. However as pointed out by Jon Wright
> and Dean Madden yesterday, the difference between sym-related Fobs
> is usually much smaller than the difference between Fo and Fc, so
> the sym-related Fobs can be considered almost the same in comparison
> to Fc. Specifically, they are likely to be both on the same side of Fc,
> so changing two Fc in the same direction will have the same effect on
> |Fo-Fc| at the two reflections.
>
> Do you mean the Fc are identical? If we start with the symmetrical
> structure refined in the higher space group, their initial values will
> be the same. However if we do not enforce NCS, then the changes
> induced by refinement will be asymmetric, and the two "NCS-related"
> Fc will start to diverge. A change which is made because it improves
> the fit for some reflections in the working set may well make the fit
> worse for the related reflections in the test set. The only way they
> are coupled is through the fact that if a change makes the model more
> like the real structure, then the expected value of the resulting
> change in |Fo-Fc| is negative for all reflections.
>
> Remember R and Rfree will be statistically the same before refinement,
> and start to diverge once refinement begins. Dirk's lesson seems
> to imply they will diverge less if there is (perfect) NCS, even if
> the NCS is not applied.
>
> (I'm probably wrong, but I want someone to show me, and not with
> hand-waving arguments or invocation of crystallographic intuition or
> such)
>
> To convince me, someone needs to show that the expected value of the
> change in |Fo-Fc| at a test reflection upon a change in the model (a
> step of refinement) is negative, even in the absence of any real
> improvement in the model, simply because the change reduces |Fo-Fc|
> at a sym-related working reflection.
>
> Ed
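
The "same side of Fc" argument in the quoted message can be tried numerically. The following is a minimal Python sketch and only a toy: it assumes that both sym-related reflections share one Fc (so it covers the enforced-symmetry situation only, not the unenforced one Ed asks about), that the model error is much larger than the measurement error, and that a refinement step moves Fc a small fraction `step` towards the working-set Fo. The function name `free_residual_change` and all parameters and values are hypothetical.

```python
import random


def free_residual_change(n=5000, sigma_model=20.0, sigma_meas=3.0,
                         step=0.1, seed=2):
    """Return (fraction of mates on the same side of Fc,
    mean change in |Fo-Fc| at the test mate after one small step)."""
    rng = random.Random(seed)
    same_side = 0
    total_change = 0.0
    for _ in range(n):
        f_true = rng.uniform(100.0, 300.0)
        fo_w = f_true + rng.gauss(0.0, sigma_meas)  # working-set mate
        fo_t = f_true + rng.gauss(0.0, sigma_meas)  # test-set mate
        # Assumption: one shared Fc for both mates, with a model error
        # that is large compared with the measurement error.
        fc = f_true + rng.gauss(0.0, sigma_model)
        if (fo_w - fc) * (fo_t - fc) > 0:
            same_side += 1
        # The step uses the working-set mate only.
        fc_new = fc + step * (fo_w - fc)
        total_change += abs(fo_t - fc_new) - abs(fo_t - fc)
    return same_side / n, total_change / n


if __name__ == "__main__":
    frac, mean_change = free_residual_change()
    print(f"same side: {frac:.3f}  mean change in test |Fo-Fc|: {mean_change:.3f}")
```

Under these assumptions most pairs lie on the same side of Fc, and the mean |Fo-Fc| at the test mate decreases although the test mate was never used. Whether a comparable coupling survives when the copies are refined independently is not addressed by this sketch.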


*******************************************************
Dirk Kostrewa
Gene Center, A 5.07
Ludwig-Maximilians-University
Feodor-Lynen-Str. 25
81377 Munich
Germany
Phone: +49-89-2180-76845
Fax: +49-89-2180-76999
E-mail: kostrewa@lmb.uni-muenchen.de
*******************************************************






ProteinCrystallography.org: Copyright 2006-2008 by Quid United Ltd