FW: [RASMB] Origin software and vbar

Wed May 11 14:49:00 PDT 2005

Hi all,
this is a delayed contribution to the issue of sedimentation equilibrium 
analysis (which by accident was sent originally to 
rasmb-admin at server1.bbri.org instead of  RASMB at server1.bbri.org).

Writing your own code is very nice, of course, and can also be fun, as it
has a way of taking you quickly beyond simple scripts or C code for the
sedimentation equilibrium function to include more features of particular
interest.  I also completely agree with John that AUC users are smart
enough to figure out what they need in order to answer their questions.
If a simple program does the job - great, it means it was a good experiment 
and you have a nice system!  There's obviously no reason to change 
something if it works.

For me, the most important question for sedimentation equilibrium in 
practice is usually to determine the stoichiometry and the binding 
constants of self-associating and hetero-associating systems.  We switch 
biological systems frequently and some of them don't lend themselves 
particularly well to AUC analysis (even though in some cases AUC may still 
be the best tool).  For the software the key for me is the range of binding 
constants one can determine, and if one can exploit all the information 
from the experiment.  Convenience and a slick software interface are nice, 
but the bottom line is the result.  Obviously I'm not a believer in NONLIN, 
mostly because there are a number of features more recently developed which 
it doesn't have, but which will substantially improve the capability of 
sedimentation equilibrium analysis for associating systems.  Unfortunately, 
they are also beyond quickly writing a line of code.

* flexible mass conservation analysis
* global multi-signal and multi-wavelength analysis for heterogeneous 
protein interactions
* elimination of baseline signals (including radial-dependent features from 
IF optics)
* correlation of data from dilution or titration series in different cells

Taken together, these allow you to extend the analysis to weaker and 
stronger interactions, and for heterogeneous interactions eliminate the 
problem of too similar molar mass, or of low extinction coefficients.
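
To make the multi-signal point a bit more concrete, here is a minimal Python
sketch of the underlying signal model only.  It assumes two components A and
B whose signals add according to Beer-Lambert at two wavelengths; the
function names, the extinction coefficients and the 1.2 cm path length are
illustrative assumptions, not values from any real experiment or program.
A real multi-signal analysis fits all radial points and cells globally
rather than unmixing point by point.

```python
def forward_signals(c_a, c_b, eps, path_cm=1.2):
    """Signals at two wavelengths for molar concentrations c_a and c_b.

    eps = ((eA1, eB1), (eA2, eB2)): hypothetical molar extinction
    coefficients of A and B at wavelength 1 and wavelength 2.
    """
    (a1, b1), (a2, b2) = eps
    return (path_cm * (a1 * c_a + b1 * c_b),
            path_cm * (a2 * c_a + b2 * c_b))


def unmix_two_components(signals, eps, path_cm=1.2):
    """Solve the 2x2 linear system for (c_a, c_b) by Cramer's rule."""
    (a1, b1), (a2, b2) = eps
    det = path_cm * (a1 * b2 - a2 * b1)
    if det == 0:
        # Identical spectral ratios: the two signals carry no
        # independent information about A and B.
        raise ValueError("components are spectrally indistinguishable")
    s1, s2 = signals
    return (s1 * b2 - s2 * b1) / det, (a1 * s2 - a2 * s1) / det


# Hypothetical extinction coefficients (M^-1 cm^-1) at two wavelengths.
eps = ((50000.0, 10000.0), (20000.0, 30000.0))
signals = forward_signals(2e-6, 5e-6, eps)
print(unmix_two_components(signals, eps))
```

The concentrations are separable purely from the spectral difference, which
is why similar molar masses stop being a problem; if the two components had
the same ratio of extinction coefficients, the determinant would vanish and
the second wavelength would add nothing.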

Mass conservation analysis goes back quite a bit; many researchers have
used it in the past, for example Marc Lewis quite intensively.  More
recently John Philo has shown how it can be used as a soft, scalable 
constraint in the analysis (Methods Enzymol 321 (2000) 100-120), and he has 
published a number of sophisticated applications with it.  He also
originally suggested using constraints on the fitted total amount of
material between different cells, depending on the experimental
design.  We've combined this idea with an approach of Roark of using data 
from multiple rotor speeds from each cell (Biophys Chem 5 (1976) 185-196), 
which we modified to allow fitting for an unknown bottom position 
(constrained among some of the data sets, e.g. absorbance data of the same 
cell at different wavelengths).  This combination lets us solve the
problem that we may know neither the real loading concentration nor the 
bottom position of the cell precisely, by restricting the analysis to the 
material that is redistributing between equilibria at different rotor 
speeds.  Baseline signals, including radial-dependent offsets, may be
determined, too.  In the end, we can design experiments in different cells 
as dilution series or titration series, and use our knowledge of the 
constancy of the molar ratio or the constancy of the 'effective' loading 
concentration of one component.  Technically, it relieves us a bit from the 
typical nightmare of fitting noisy exponentials.  This can substantially 
stabilize the analysis, meaning one can get a better handle on the binding 
constants, strong or weak.
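
As a rough illustration of what mass conservation across rotor speeds means
(a sketch for a single ideal species in a sector-shaped cell, not the
SEDPHAT implementation), the Python below fixes the meniscus concentration
at each speed from one shared loading concentration instead of fitting it
freely.  All function names and parameter values are assumptions made for
the example.

```python
import math

R_GAS = 8.314e7  # gas constant in erg / (mol K), CGS units


def sigma(molar_mass, vbar, rho, rpm, temp_k=293.15):
    """M (1 - vbar*rho) omega^2 / (R T), in cm^-2."""
    omega = rpm * 2.0 * math.pi / 60.0
    return molar_mass * (1.0 - vbar * rho) * omega ** 2 / (R_GAS * temp_k)


def equilibrium_profile(radii, c_load, meniscus, bottom, sig):
    """Equilibrium c(r) whose total mass equals that of c_load.

    Mass conservation in a sector cell: the integral of c(r) r dr from
    meniscus to bottom must equal c_load * (bottom^2 - meniscus^2) / 2.
    """
    half_span = 0.5 * (bottom ** 2 - meniscus ** 2)
    if sig == 0:
        return [c_load for _ in radii]  # no redistribution at rest
    c_meniscus = c_load * sig * half_span / math.expm1(sig * half_span)
    return [c_meniscus * math.exp(0.5 * sig * (r ** 2 - meniscus ** 2))
            for r in radii]


def sector_mass(radii, conc):
    """Trapezoidal estimate of the integral of c(r) r dr."""
    total = 0.0
    for i in range(len(radii) - 1):
        f0 = conc[i] * radii[i]
        f1 = conc[i + 1] * radii[i + 1]
        total += 0.5 * (f0 + f1) * (radii[i + 1] - radii[i])
    return total


# Hypothetical cell geometry (cm) and a hypothetical 50 kDa protein.
meniscus, bottom, c_load = 6.9, 7.2, 0.5
radii = [meniscus + (bottom - meniscus) * i / 2000 for i in range(2001)]
for rpm in (10000, 15000, 20000):
    sig = sigma(50000.0, 0.73, 1.0, rpm)
    conc = equilibrium_profile(radii, c_load, meniscus, bottom, sig)
    print(rpm, sector_mass(radii, conc))
```

The integrated amount comes out the same at every speed, so the three
profiles share a single amplitude parameter.  In a real analysis the bottom
position and the loading concentration would be fitted rather than assumed
known, but the constraint couples the data sets in the same way.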

This approach is described in detail in Anal. Biochem. 326 (2004) 234-256, and
a couple of applications can be found already in the literature.  How to go 
from the design of an experiment to conducting the AUC run and to the data 
analysis with SEDPHAT is described in our step-by-step protocol (in press 
in CSHLP; preprints are available from the sedfit website), and we'll
shortly have a tutorial for this on the sedphat website.

However, this approach is not completely SEDPHAT specific.  Most recently, 
some of these ideas have been incorporated in other shared software, too
(with or without reference), and I expect more will join soon.  In the end
it doesn't matter what the name of the exe-file is you use to fit, but it's 
the conceptual approach of the analysis that makes the difference.
Therefore, if you are contemplating migrating to a different equilibrium 
analysis platform with self-associating or hetero-associating systems in 
mind, mass conservation and the ability to correlate multiple signals from 
different cells (and for convenience perhaps systematic noise composition 
if you are using interference optical data as well) are what I would
recommend looking for.

Peter

P.S.  Regarding Borries' idea of open source collaborative development of 
software, for me it does not make sense.  1) SEDPHAT and SEDFIT both have
more than 1,000,000 lines of code, and it will be a nightmare to migrate to a
different platform. 2) The overhead in getting familiar with common 
structures, module interfaces, languages, or libraries, and in the end 
documenting for others what I wrote will be much larger for me than simply 
to add stuff to my existing code, even if it would mean writing some 
aspects from scratch (there's nothing wrong with duplicating things, if it 
takes less time). 3) Every once in a while there are ideas for how to do
something different that requires a fundamental change in how data and
functions are organized.  There would be either a significantly higher
hurdle to doing this, or the requirement of an enormous amount of overhead
(and impossible foresight) to provide general enough structures from the
outset.  4) Frankly, I've had bad experiences sharing code.  5) Despite 
what I said above about going beyond quickly writing a fitting function, 
after all, we're not writing a piece of code for sending somebody to the 
moon.  Seriously, I think the development of multiple independent software
packages will be helpful for the community, even if they duplicate some
functions.

If a task can be easily modularized, one may be able to write a separate
standalone program, or perhaps allow the result to be output in the common
XLA format which everybody can read (e.g. after editing files).  An example
of something that can be done as a module is the prediction of buffer
density and viscosity.  SEDNTERP is a great utility which most of us use 
quite extensively. Another example of a stand-alone utility that we use a 
lot is WINMATCH to assess if equilibrium is reached.   Considering that 
setting up an experiment takes me a couple of hours (not including the time 
to think about it), and running it a couple of days, I'll be happy taking 
10 sec to copy the values from SEDNTERP over to another program or even
writing them by hand into a notebook.