[RASMB] RE: DC/Dt vs. sedfit
Peter Schuck
pschuck at helix.nih.gov
Fri Feb 17 07:16:35 PST 2006
Borries,
For this to be a forum of scientific discussion, rather than a voicing of
sentiments, I won't bother responding to your specific remarks about c(s),
because, to be frank, they are absolutely baseless. I challenge you to
provide data and sound arguments, consistent with sedimentation theory, that
support your statements. The comparison of c(s) with "calibration of
columns", and the concerns about non-spherical molecules, are, to be
polite, completely unfounded.
As you know, one has to consider two basic questions in any fit:
1) What do we know about the sample? In this case we adjust the model to
match it. In the c(s) implementation in SEDFIT there are a variety of
options to make use of prior knowledge: not only segmented f/f0 values in
different s-ranges, or discrete Lamm equation solutions replacing peaks,
but also a pre-existing, known relationship between M and s. You
mentioned fibril formation - one case where the
relationship between s and M of fibrils was established is MacRaild et al.
(http://www.biophysj.org/cgi/content/full/84/4/2562), and this can be used
as prior knowledge in c(s).
Generally, if one argues about failures of c(s) but is not using the best
models for the given cases, based on the knowledge about the samples, this
can't be helped.
2) In the absence of additional knowledge (or, as in most situations,
when the default c(s) with a single f/f0 value is perfectly adequate) -
what is the information we can extract from the noisy data? Here, we
look at the residuals of the fit, and all models that otherwise seem
possible and fit the data statistically well must be accepted equally as
possible interpretations. Regularization provides the simplest (in the
sense of the broadest) of these solutions. As a consequence, one has to
examine the residuals of the fit to the raw data. For historical reasons,
this has not always been done, because the dcdt transformation did not
provide that information. However, it is an important aspect of getting a
reliable fit with c(s).
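To make the residual criterion concrete, here is a minimal synthetic illustration (my own example, not SEDFIT code): two different boundary shapes that both leave residuals at the level of the experimental noise are, on the data alone, equally acceptable interpretations.

```python
# Synthetic illustration: two models whose residuals are both at the noise
# level cannot be distinguished by the data alone.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 200)
sigma = 0.01                                   # assumed noise level
data = np.tanh(5.0 * (x - 0.5)) + rng.normal(0.0, sigma, x.size)

# Model A: the single boundary shape the data were generated from.
model_a = np.tanh(5.0 * (x - 0.5))
# Model B: a 50/50 mixture of two slightly different boundaries.
model_b = 0.5 * (np.tanh(4.6 * (x - 0.5)) + np.tanh(5.4 * (x - 0.5)))

for name, m in [("A", model_a), ("B", model_b)]:
    rmsd = np.sqrt(np.mean((data - m) ** 2))
    print(name, "rmsd =", round(rmsd, 4))      # both close to sigma
```

Both fits are statistically indistinguishable, which is exactly the situation in which regularization is needed to select the simplest of the acceptable solutions.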
By questioning whether this can be done by SEDFIT users, I think you are
actually insulting the intelligence of those you call "the regular user",
who produced more than 50 publications in 2005 alone using
c(s), in top peer-reviewed journals. (You can find some at
http://www.analyticalultracentrifugation.com/references.htm)
Your conclusions about c(s) may be biased by your own implementation of
this method, which, as far as you have reported on it, does not include
any regularization and is therefore highly susceptible to the well-known
ill-conditioning of boundary modeling, which we described in detail in the
original work introducing c(s) in 2000. In the absence of regularization,
I would propose calling such distributions "pseudo-c(s)".
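A generic sketch of that ill-conditioning (a simple Tikhonov example with made-up Gaussian basis functions, not the SEDFIT implementation): with strongly overlapping kernels, unregularized least squares amplifies the noise into wildly oscillating coefficients, while a small penalty term restores a stable solution.

```python
# Illustration of ill-conditioned boundary modeling with and without
# regularization, using hypothetical overlapping Gaussian kernels.
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 100)
centers = np.linspace(0.2, 0.8, 30)
width = 0.15                                   # broad, heavily overlapping kernels
A = np.exp(-((x[:, None] - centers[None, :]) ** 2) / (2 * width ** 2))

true_c = np.exp(-((centers - 0.5) ** 2) / (2 * 0.1 ** 2))  # smooth "distribution"
data = A @ true_c + rng.normal(0.0, 0.01, x.size)

# Unregularized least squares: noise is amplified by the tiny singular values.
c_ls, *_ = np.linalg.lstsq(A, data, rcond=None)

# Tikhonov regularization: minimize ||A c - b||^2 + alpha * ||c||^2.
alpha = 1e-3
c_reg = np.linalg.solve(A.T @ A + alpha * np.eye(A.shape[1]), A.T @ data)

print("unregularized coefficient norm:", np.linalg.norm(c_ls))
print("regularized coefficient norm:  ", np.linalg.norm(c_reg))
```

(The actual c(s) implementation uses maximum entropy or Tikhonov-Phillips regularization on a fine s-grid; the plain ridge penalty here only illustrates the principle.)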
Regarding the methods you are referring to:
1) The van Holde-Weischet method you advertise is based on a single-species
Faxen approximation of the Lamm equation, and the extrapolation procedure
is based on the notion of inverting the error function. As we pointed
out a few years ago
(http://www.biophysj.org/cgi/content/abstract/82/2/1096), inverse error
functions are *not linear in their arguments*; i.e., the inverse error
function inverts a single error function but not a sum of error functions,
which means that you can deconvolute diffusion only from single species,
not from mixtures. This is why this method produces the artifactual
diagonal lines,
except for cases where each species' sedimentation is reflected in a
separate boundary. A detailed analysis of this problem and comparison of
the methods can be found in the 2002 BJ paper mentioned.
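The nonlinearity argument is easy to check numerically. The following stand-alone illustration (my own, with arbitrarily chosen arguments) inverts erf by bisection and shows that the inverse error function recovers the argument of a single error function, but applied to a 50/50 sum of two it does not return any simple combination of the individual arguments.

```python
# erfinv undoes one erf, but applying it to a sum of erfs does not
# recover the individual boundary arguments.
import math

def erfinv(y, lo=-6.0, hi=6.0):
    """Invert math.erf on [lo, hi] by bisection (enough for this demo)."""
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if math.erf(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

x1, x2 = 0.3, 1.5                       # two single-species boundary arguments
single = erfinv(math.erf(x1))           # recovers x1
mixture = erfinv(0.5 * math.erf(x1) + 0.5 * math.erf(x2))

print(single)                           # ~0.3
print(mixture, "!=", 0.5 * (x1 + x2))   # far from the average 0.9
```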
So far, you have not offered any comment on, or solution to, this
important theoretical and practical problem, which is crucial if you want
to claim a scientific foundation for the statement that this approach
deconvolutes heterogeneity in mixtures.
Further, there is no statistically well-balanced and rigorous way to apply
this to interference optical data. There may be semi-empirical schemes
that work in certain cases, but I am really not sure how this problem is
dealt with.
You have reported a method to turn G(s) into a "more familiar"
differential sedimentation coefficient distribution g(s) by smoothing the
G(s) histogram - arbitrarily, I might say - with Gaussians of
USER-SELECTED WIDTH. Are you referring to this method as a more rigorous
alternative to c(s)?
2) No, I've certainly not been alluding to your two-dimensional
analysis. I have been talking about a c(s,f/f0) distribution which is a
(single data set) special case of the work on global size-and-shape
distribution that many of you might remember me talking about in the AUC
Euroconference in Grenoble several years ago. Like the original method,
the c(s,f/f0) approach applies regularization to stabilize the analysis.
A further extension eliminates the ill-conditioned diffusion (or molar
mass) information and extracts the reliable aspects. You will hopefully
see the details shortly.
3) You are talking about the need to use remote supercomputers for your
kind of analysis - you may be surprised to hear that I've done all the work
on our two-dimensional size and shape distribution c(s,f/f0) at home on my
sofa, using only my laptop. As everybody is in a position to verify, this
usually takes on the order of a few minutes on a reasonably fast PC. We're
not talking here about the protein folding problem, molecular dynamics, or
simulating a nuclear explosion! Of course one can implement any problem
in such a way that it requires supercomputers, but whether that is really
necessary is a different story.
Peter