[RASMB] Sedfit Workshop

Wed Oct 5 10:13:12 PDT 2005

Hi All,

Since Peter Schuck advertised his workshop through the RASMB mail list, and 
since I attended, I thought it appropriate to thank Peter for this workshop 
and make some general comments about teaching Sedimentation Data Analysis.

Thank you, Peter (and group and NIH helpers), for all the work that went 
into the workshop.  I was really impressed with the 380 MB of sample data 
that had been prepared.  And thank you for the time spent during that week.

I believe there is a need for workshops at this level.  I had attended the 
workshop at UConn when Peter did a single day for sedfit, but it seemed to 
me that the 3 days were needed to really get the hang of the program.  Most 
of the people at the workshop were already using sedfit and other analysis 
programs, but everyone seemed to come away with a better understanding of 
data analysis.  And since Peter and some of the instructors have a much 
better grasp mathematically of the data analysis than I do, it helped my 
understanding to have some things re-explained.  I think I understand even 
l(s)g(s) better.

Peter had people bring some data and make mini-presentations during the 
afternoons.  I thought this was an excellent idea and really added to the 
depth of the workshop.  It helped also to know what people were struggling 
with in their research.  There did seem to be a divide between the 
interests of people in industry and the interests of academic 
researchers.  Industry researchers were much more concerned about 
reproducibility, quantification of small amounts of aggregates, and 
robustness of the analysis, while academic researchers often want to ignore 
low percentage impurities and get more thermodynamic information about the 
major components from the data. I would like to suggest to all centrifuge 
education centers to try to fit in some similar participatory sessions.  I 
know this happens at the UConn workshop, where the teachers often help 
various people with their data sets, but it is done informally, and other 
participants would often have benefited from seeing a real struggle with 
the data.  In one case at UConn, the participant was told by the 
experienced NonLin user that his data was just too noisy to yield more 
information.  I found this out because the participant later asked to run 
his samples here at BBRI, having moved from a place that had an 
ultracentrifuge to one that did not.

One big take-home lesson is that curve fitting ultracentrifuge data with 
any program is influenced and limited by the data itself.  For instance, 
Peter mentioned that "analyzing equilibrium data is 'veird'".  Having 
studied with Tom Laue, I had heard before that analyzing equilibrium data 
is weird, but somehow this information had more impact when Peter said it, 
partly because it confirmed that the data and the actual error surfaces are 
weird, and not just the program NonLin.  During the workshop, a collaborator 
and I looked at some of his equilibrium data with Sedphat, Sedanal, and 
HeteroAnalysis.  At one point the collaborator walked away in 
frustration.  An hour later, he came back and was shocked to find that now 
the data was fitting.  We had to float the concentrations to get a good 
fit, but he thought the concentrations were the one solid piece of 
information we had about the experiment; he wanted to float the baseline 
offsets.  At first the concentrations floated to about 2-fold above what he 
had measured, but with the best fit they floated back down to within 10% 
of his measured concentrations.  I know from even a little bit of 
experience that if we had kept the concentrations fixed, the programs were 
never going to find the best fit.  By the way, HeteroAnalysis (the 
re-incarnation of NonLin) was the program that finally converged to the 
best fit, though the other programs, when given the good guesses from 
HeteroAnalysis, converged to the same good answer.  (This supports other 
anecdotal evidence I have heard favoring the use of HeteroAnalysis for 
initial fitting.)  Of course, like NonLin of old, it crashed a bit and got 
lost occasionally, and at one point needed to have data sets added one by 
one.  People would like to load the data into a program, push a button, 
and get the best fit, but I do not think that is possible with any 
software.
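
The point about fixed versus floated concentrations can be illustrated with a toy calculation.  The sketch below is not how Sedphat, Sedanal, or HeteroAnalysis work.  It assumes a single ideal species with c(r) = c0*exp(sigma*(r^2 - r0^2)/2) + offset, synthetic noisy data, and a brute-force scan over sigma.  All names and numbers (radii, a true c0 of 0.30, a "measured" c0 that is 10% high, the noise level) are made up for illustration.

```python
import math
import random

R0 = 6.9  # reference radius in cm (assumed for this toy example)


def basis(radii, sigma):
    """Exponential term of an ideal single-species equilibrium gradient."""
    return [math.exp(sigma * (r * r - R0 * R0) / 2.0) for r in radii]


def fit(radii, signal, fixed_c0=None):
    """Scan sigma on a grid; solve offset (and c0, if floated) by linear
    least squares at each grid point. Returns (rmsd, c0, sigma, offset)."""
    n = len(radii)
    sy = sum(signal)
    best = None
    for i in range(1000, 3001):  # sigma from 1.000 to 3.000 cm^-2
        sigma = i / 1000.0
        e = basis(radii, sigma)
        se = sum(e)
        see = sum(v * v for v in e)
        sey = sum(v * y for v, y in zip(e, signal))
        if fixed_c0 is None:
            # Floated concentration: closed-form two-parameter solution.
            c0 = (n * sey - se * sy) / (n * see - se * se)
        else:
            # Concentration held at the (possibly wrong) measured value.
            c0 = fixed_c0
        offset = (sy - c0 * se) / n
        sq = sum((c0 * v + offset - y) ** 2 for v, y in zip(e, signal))
        err = math.sqrt(sq / n)
        if best is None or err < best[0]:
            best = (err, c0, sigma, offset)
    return best


# Synthetic data: true c0 = 0.30, sigma = 2.0, offset = 0.02, small noise.
rng = random.Random(0)
radii = [6.9 + 0.005 * i for i in range(41)]
signal = [0.30 * v + 0.02 + rng.gauss(0.0, 0.002) for v in basis(radii, 2.0)]

fixed = fit(radii, signal, fixed_c0=0.33)  # "measured" value, 10% too high
floated = fit(radii, signal)               # concentration allowed to float

print("fixed   rmsd=%.4f c0=%.3f sigma=%.3f offset=%.3f" % fixed)
print("floated rmsd=%.4f c0=%.3f sigma=%.3f offset=%.3f" % floated)
```

In this toy case, holding c0 at a value that is only 10% off forces sigma and the offset to absorb the error, so the residuals stay well above the noise level, while the floated fit returns close to the true concentration.  Real multi-species, multi-data-set fits have far more complicated error surfaces than this one-dimensional scan suggests.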

Sincerely,

David Hayes