Re: [ccp4bb] R-free ratio, effect of ncs restraints?
Subject: Re: R-free ratio, effect of ncs restraints?
From: Ian Tickle ianjt05 {- at -} GMAIL {- dot -} COM
Date: 2010-04-09

Hi Mark

I think you need to distinguish between the mechanics of the refinement software on the one hand and the effect on the statistics on the other. I think you are referring to the former: in the software the restraints are, as you say, treated exactly like the X-ray observations; they appear to augment the observations, and clearly do not reduce the *actual* number of parameters in any way (unlike constraints, which do). Unfortunately this train of thought leads nowhere because, even though in the software restraints and observations appear to be equivalent, 1 restraint is in no way *statistically* equivalent to 1 X-ray observation.

We can make progress in understanding the statistics, however, if we consider the *effective* number of parameters, which turns out to be (see the paper that Ed referred to for the proof):

    m_eff = m - r + Drest

where m is the actual number of parameters, r is the number of restraints and Drest is a kind of correction for the fact that 1 parameter is not equivalent to 1 restraint (Drest depends on r in a complicated way; it's actually the contribution of the restraints to the least-squares residual, or equivalently to the negative log-likelihood, so 'good' restraints increase Drest less than 'bad' ones). In other words, adding restraints does indeed have the effect of reducing the *effective* number of parameters (though not 1-for-1, since Drest also varies as you add restraints). We need m_eff in order to compute the ratio (no. of observations) / (effective no. of parameters).

The expected Rfree/R (i.e.
the expectation is predicated on the assumption that the parameter refinement is at a global minimum whose position in parameter space is a function of the weights you used) is then

    sqrt((f + m_eff) / (f - m_eff))

where f is the size of the working set. This can be written as

    sqrt((x + 1) / (x - 1))

where x = f / m_eff, i.e. the effective obs/param ratio. This shows the direct relationship between the expected Rfree/R and the effective obs/param ratio: for example, you can see what happens as x tends towards unity on the one hand and towards infinity on the other!

Cheers

-- Ian

On Fri, Apr 9, 2010 at 9:31 PM, Mark J. van Raaij wrote:
> Hi All,
> in a paper (which I can't locate now...) which I read recently it was stated
> that restraints do not reduce the number of parameters, rather they augment
> the number of data points (so strong restraints are like strong data, weak
> restraints weak data...). Only strict NCS constraints, where the copies have
> to stay exactly the same, would reduce the number of parameters. Both
> augment the data to parameter ratio, of course. I really liked this
> explanation.
> Mark
>
> On 9 April 2010 21:54, Ian Tickle wrote:
>>
>> Hi Ed
>>
>> It's very difficult to deal theoretically with NCS because, unlike
>> bond lengths where the uncertainties are known a priori (at least in
>> principle), with NCS you don't know the uncertainties a priori, if you
>> see what I mean (rather like unknown unknowns!). In other words the
>> optimal weights and hence the effective number of parameters will
>> depend on the exactness of the NCS. In practice you can of course
>> determine the weights by minimising Rfree w.r.t. them. So I think it
>> would be quite difficult to do what you are proposing, i.e. to
>> disentangle the effects of the obs/param ratio and any effect of
>> correlation of the working & test sets. Interesting problem though!
>>
>> BTW I think you are mis-quoting the formula in the paper, it should be
>>
>>     Rfree/R = sqrt((Nobs + Nparam) / (Nobs - Nparam))
>>
>> In other words R is reduced below its expected value in the absence of
>> random error, by overfitting the errors in the working set, but people
>> tend to forget that the test set also has, on average, random errors
>> of the same magnitude which tend to increase Rfree *above* its
>> expected value.
>>
>> Cheers
>>
>> -- Ian
>>
>>
>> On Fri, Apr 9, 2010 at 8:25 PM, Edward A. Berry
>> wrote:
>> > Has anyone looked theoretically at how ncs-restraints affect
>> > the expected Rfree/R ratio?
>> >
>> > Tickle et al., Acta Cryst. (1998). D54, 547-557
>> > concluded Rfree/R = sqrt(Nobs/(Nobs-Nparam)).
>> > He suggested that, with restrained refinement of coordinates
>> > plus individual isotropic B-factors, the effective number
>> > of parameters per atom is two. If we add strong N-fold NCS
>> > restraints on coordinates and B-factor, does that effectively
>> > reduce the number of parameters by a factor of N?
>> > Giving 2/N for parameters per atom?
>> >
>> > I'm curious how much of the drop in the r-free ratio observed
>> > on enforcing NCS is due to the reduction in the effective
>> > number of parameters, and how much is due to linking reflections
>> > in the free set with the working set. Given an expression to
>> > predict the effect of reducing number of parameters, seeing
>> > how much of the actual drop in Rfree/R it accounts for
>> > would let us see how severe the linkage problem is.
>> >
>> > Ed
>> >
>
> --
> Mark J van Raaij
> http://webspersoais.usc.es/mark.vanraaij
> http://www.ibmb.csic.es
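
To see how the expected Rfree/R in Ian's reply at the top of the thread behaves as x tends towards unity and towards infinity, here is a minimal Python sketch of the two expressions given there, m_eff = m - r + Drest and sqrt((f + m_eff) / (f - m_eff)). The function names and all the numbers are made up purely for illustration; they do not come from the thread, the paper or any refinement program. Drest is simply passed in as a number, because the message does not say how to compute it.

```python
import math


def effective_parameters(m, r, d_rest):
    """Effective number of parameters, m_eff = m - r + Drest.

    m      : actual number of parameters
    r      : number of restraints
    d_rest : restraint correction term (Drest), taken here as a given value
    """
    return m - r + d_rest


def expected_rfree_ratio(f, m_eff):
    """Expected Rfree/R = sqrt((f + m_eff) / (f - m_eff)).

    f is the size of the working set; the expression needs f > m_eff >= 0.
    """
    if m_eff < 0 or f <= m_eff:
        raise ValueError("need 0 <= m_eff < f")
    return math.sqrt((f + m_eff) / (f - m_eff))


def expected_rfree_ratio_from_x(x):
    """Same quantity written in terms of x = f / m_eff: sqrt((x + 1) / (x - 1))."""
    if x <= 1:
        raise ValueError("need x > 1")
    return math.sqrt((x + 1) / (x - 1))


if __name__ == "__main__":
    # Hypothetical values of the effective obs/param ratio x.
    for x in (1.01, 1.5, 2.0, 4.0, 10.0, 100.0):
        print(f"x = {x:7.2f}   expected Rfree/R = {expected_rfree_ratio_from_x(x):.3f}")
```

The loop prints a ratio that grows without bound as x approaches 1 and approaches 1 as x becomes large, which is the behaviour the message alludes to.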
| ProteinCrystallography.org: Copyright 2006-2010 by Quid United Ltd |