[RASMB] RE: DC/Dt vs. sedfit
Peter Schuck
pschuck at helix.nih.gov
Fri Feb 17 07:16:35 PST 2006
Borries,
For this to be a forum of scientific discussion, rather than a voicing of
sentiments, I won't bother responding to your specific remarks about c(s),
because, to be frank, they are absolutely baseless. I challenge you to
provide data and sound arguments, consistent with sedimentation theory, that
support your statements. The comparison of c(s) with "calibration of
columns", and the concerns about non-spherical molecules, are, to be
polite, completely unfounded.
As you know, one has to consider two basic questions in any fit:
1) What do we know about the sample? In this case we adjust the model to
match it. In the c(s) implementation in SEDFIT there are a variety of
options to make use of prior knowledge: not only segmented f/f0 values in
different s-ranges, or discrete Lamm equation solutions replacing peaks,
but also a pre-existing, known relationship between M and s. You
mentioned fibril formation - one case where the
relationship between s and M of fibrils was established is MacRaild et al.
(http://www.biophysj.org/cgi/content/full/84/4/2562), and this can be used
as prior knowledge in c(s).
Generally, if one argues about failures of c(s) but is not using the best
models for the given cases, based on the knowledge about the samples, this
can't be helped.
2) In the absence of additional knowledge (or, as in most situations,
when the default c(s) with a single f/f0 value is perfectly adequate) -
what is the information we can extract from the noisy data? Here, we
look at the residuals of the fit, and all models that otherwise seem
possible and fit the data statistically well must be accepted equally as
possible interpretations. Regularization provides the simplest (in the
sense of the broadest) of these solutions. As a consequence, one has to
examine the residuals of the fit to the raw data. For historical reasons,
this has not always been done, because the dcdt transformation did not
provide that information. However, it is an important aspect of getting a
reliable fit with c(s).
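To make the residual criterion concrete, here is a minimal synthetic illustration (my own example, not SEDFIT code): two different boundary shapes that both leave residuals at the level of the experimental noise are, on the data alone, equally acceptable interpretations.

```python
# Synthetic illustration: two models whose residuals are both at the noise
# level cannot be distinguished by the data alone.
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 200)
sigma = 0.01                                   # assumed noise level
data = np.tanh(5.0 * (x - 0.5)) + rng.normal(0.0, sigma, x.size)

# Model A: the single boundary shape the data were generated from.
model_a = np.tanh(5.0 * (x - 0.5))
# Model B: a 50/50 mixture of two slightly different boundaries.
model_b = 0.5 * (np.tanh(4.6 * (x - 0.5)) + np.tanh(5.4 * (x - 0.5)))

for name, m in [("A", model_a), ("B", model_b)]:
    rmsd = np.sqrt(np.mean((data - m) ** 2))
    print(name, "rmsd =", round(rmsd, 4))      # both close to sigma
```

Both fits are statistically indistinguishable, which is exactly the situation in which regularization is needed to select the simplest of the acceptable solutions.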
By questioning whether this can be done by SEDFIT users, I think you are
actually insulting the intelligence of those you call "the regular user",
who produced more than 50 publications in 2005 alone using
c(s), in top peer-reviewed journals. (You can find some at
http://www.analyticalultracentrifugation.com/references.htm)
Your conclusions about c(s) may be biased by your own implementation of
this method, which, as far as you have reported on it, does not include
any regularization and is therefore highly susceptible to the well-known
ill-conditioning of boundary modeling, which we described in detail in the
original work introducing c(s) in 2000. In the absence of regularization,
I would propose calling such distributions "pseudo-c(s)".
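A generic sketch of that ill-conditioning (a simple Tikhonov example with made-up Gaussian basis functions, not the SEDFIT implementation): with strongly overlapping kernels, unregularized least squares amplifies the noise into wildly oscillating coefficients, while a small penalty term restores a stable solution.

```python
# Illustration of ill-conditioned boundary modeling with and without
# regularization, using hypothetical overlapping Gaussian kernels.
import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0.0, 1.0, 100)
centers = np.linspace(0.2, 0.8, 30)
width = 0.15                                   # broad, heavily overlapping kernels
A = np.exp(-((x[:, None] - centers[None, :]) ** 2) / (2 * width ** 2))

true_c = np.exp(-((centers - 0.5) ** 2) / (2 * 0.1 ** 2))  # smooth "distribution"
data = A @ true_c + rng.normal(0.0, 0.01, x.size)

# Unregularized least squares: noise is amplified by the tiny singular values.
c_ls, *_ = np.linalg.lstsq(A, data, rcond=None)

# Tikhonov regularization: minimize ||A c - b||^2 + alpha * ||c||^2.
alpha = 1e-3
c_reg = np.linalg.solve(A.T @ A + alpha * np.eye(A.shape[1]), A.T @ data)

print("unregularized coefficient norm:", np.linalg.norm(c_ls))
print("regularized coefficient norm:  ", np.linalg.norm(c_reg))
```

(The actual c(s) implementation uses maximum entropy or Tikhonov-Phillips regularization on a fine s-grid; the plain ridge penalty here only illustrates the principle.)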
Regarding the methods you are referring to:
1) The van Holde-Weischet method you advertise is based on a single-species
Faxen approximation of the Lamm equation, and the extrapolation procedure
is based on the notion of inverting the error function. As we pointed
out a few years ago
(http://www.biophysj.org/cgi/content/abstract/82/2/1096), inverse error
functions are *not linear in their arguments*; i.e., the inverse error
function inverts a single error function but not a sum of error functions,
which means that you can deconvolute diffusion only from single species,
not from mixtures. This is why this method produces the artifactual
diagonal lines,
except for cases where each species' sedimentation is reflected in a
separate boundary. A detailed analysis of this problem and comparison of
the methods can be found in the 2002 BJ paper mentioned.
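The nonlinearity argument is easy to check numerically. The following stand-alone illustration (my own, with arbitrarily chosen arguments) inverts erf by bisection and shows that the inverse error function recovers the argument of a single error function, but applied to a 50/50 sum of two it does not return any simple combination of the individual arguments.

```python
# erfinv undoes one erf, but applying it to a sum of erfs does not
# recover the individual boundary arguments.
import math

def erfinv(y, lo=-6.0, hi=6.0):
    """Invert math.erf on [lo, hi] by bisection (enough for this demo)."""
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if math.erf(mid) < y:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

x1, x2 = 0.3, 1.5                       # two single-species boundary arguments
single = erfinv(math.erf(x1))           # recovers x1
mixture = erfinv(0.5 * math.erf(x1) + 0.5 * math.erf(x2))

print(single)                           # ~0.3
print(mixture, "!=", 0.5 * (x1 + x2))   # far from the average 0.9
```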
So far, you have not offered any comment on, or solution to, this
important theoretical and practical problem, which is crucial if you want
to claim a scientific foundation for the statement that this approach
deconvolutes heterogeneity in mixtures.
Further, there is no statistically well-balanced and rigorous way to apply
this to interference optical data. There may be semi-empirical schemes
that work in certain cases, but I am really not sure how this problem is
dealt with.
You have reported a method to turn G(s) into a "more familiar"
differential sedimentation coefficient distribution g(s) by smoothing the
G(s) histogram - arbitrarily, I might say - with Gaussians of
USER-SELECTED WIDTH. Are you referring to this method as a more rigorous
alternative to c(s)?
2) No, I've certainly not been alluding to your two-dimensional
analysis. I have been talking about a c(s,f/f0) distribution which is a
(single data set) special case of the work on global size-and-shape
distribution that many of you might remember me talking about in the AUC
Euroconference in Grenoble several years ago. Like the original method,
the c(s,f/f0) approach applies regularization to stabilize the analysis.
A further extension eliminates the ill-conditioned diffusion (or molar
mass) information and extracts the reliable aspects. You will hopefully
see the details shortly.
3) You are talking about the need to use remote supercomputers for your
kind of analysis - you may be surprised to hear that I've done all the work
on our two-dimensional size and shape distribution c(s,f/f0) at home on my
sofa, using only my laptop. As everybody is in a position to verify, this
usually takes on the order of a few minutes on a reasonably fast PC. We're
not talking here about the protein folding problem, molecular dynamics, or
simulating a nuclear explosion! Of course one can implement any problem
in such a way that it requires supercomputers, but whether that is really
necessary is a different story.
Peter