Re: [sdpd] Re: UPPW-5 solution - UPPW-6 problem
Hi Alan (and sdpd-ers)
I'd like to respond to some of the points raised in your email:
1)
> it is my opinion that a combined ITO and DICVOL probably solves more
> than 90% of every thing thrown at it. Thus new algorithms are only
> filling a small gap
Although I respect and use both ITO and DICVOL (and had a hand in the
development of the former), in my experience this statement claims too
much for them.
It was more nearly true 10 years ago when sdpd addressed rather simpler
problems, but programs like those were never designed to tackle the more
difficult and complex (but generally better measured) datasets that
modern lab diffractometers and synchrotrons are producing.
Thus, for example, ITO is reasonably tolerant of modest amounts of
impurity lines, but not of dominant zones, while DICVOL is liable to
struggle with today's complex high-volume datasets and can be completely
blocked by the presence of a single unidentified impurity line.
That's one reason why the Crysfire suite contains many more than the
classic trio of ITO, DICVOL and TREOR among its repertoire of 10+
supported indexing programs. For example, I would advise anyone using it
to run at least KOHL and LZON routinely as well.
These new and more ambitious problems have much to do with why there is
a demand for new global optimisation and joint probability methods, such
as SVD-index, McMaille, AUTOX and Hmap (with more in the pipeline).
2)
> I have heard the term "exhaustive" used so much in indexing that I am
> beginning to believe that I got something wrong. So please enlighten me
> if you can.
An exhaustive search is one that definitively reports all the possible
solutions within its search domain, so that one never has to search that
particular domain again.
Here are some examples of methods that are exhaustive (though this is
not necessarily exactly true of all their implementations in programs):
a) Binary search = successive dichotomy.
Binary search requires an existence test for solutions, usually provided
by specifying hard +- limits in 2Theta or d-spacing for each observed
line (peak) position. A binary search with a particular dataset and
specified +- line-position limits, and specified cell parameter and
volume limits, should be completely definitive and should never require
repeating.
In my experience, DICVOL and LOSH (and hence LZON) approach these
requirements but do not completely achieve them, since from time to time
I have found them to produce hopefully small logical inconsistencies.
For example, solutions which are known from other methods to be present
within their declared search space are sometimes not reported by these
programs. Similarly, sometimes solutions that are reported by one run,
are not reported by another run where the conditions are slightly
different, though apparently not enough to account for the different
behaviour.
I haven't investigated the reasons for these occasional anomalies, but
assume that they arise as subtle side effects of internal optimisations
made by the author(s) to improve execution speed.
I can't comment on the behaviour of X-Cell, which is also reported to
use dichotomy (binary search), since it is a commercial program that I
haven't personally used.
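As an illustration of the existence test described under (a), here is a
minimal Python sketch of successive dichotomy for the simplest possible
case, a primitive cubic cell with no extinctions. It is not how DICVOL,
LOSH or X-Cell are implemented. The function names, the hard +- limit
dq on Q = 1/d^2 and the termination width are all assumptions made for
the example.

```python
from itertools import product


def allowed_n(max_index=6):
    """Sorted N = h^2 + k^2 + l^2 values (primitive cubic, no extinctions)."""
    values = {h * h + k * k + l * l
              for h, k, l in product(range(max_index + 1), repeat=3)}
    return sorted(values - {0})


def interval_may_contain_solution(a_lo, a_hi, q_obs, dq, n_values):
    """Existence test: can every observed line be matched within [a_lo, a_hi]?"""
    for q in q_obs:
        # For a given N, calculated Q spans [N/a_hi^2, N/a_lo^2] over the
        # interval; the line is matchable if that span overlaps q +- dq.
        if not any(n / a_hi ** 2 <= q + dq and n / a_lo ** 2 >= q - dq
                   for n in n_values):
            return False
    return True


def dichotomy_cubic(q_obs, dq, a_min, a_max, min_width=1e-3, n_values=None):
    """Return all surviving intervals (a_lo, a_hi) no wider than min_width."""
    n_values = n_values or allowed_n()
    solutions = []
    stack = [(a_min, a_max)]
    while stack:
        a_lo, a_hi = stack.pop()
        if not interval_may_contain_solution(a_lo, a_hi, q_obs, dq, n_values):
            continue  # whole interval discarded in a single test
        if a_hi - a_lo <= min_width:
            solutions.append((a_lo, a_hi))
        else:
            mid = 0.5 * (a_lo + a_hi)
            stack.extend([(a_lo, mid), (mid, a_hi)])
    return sorted(solutions)
```

Called with lines simulated from a = 5 A and a search range of 3 to 8 A,
the sketch returns an interval around a = 5 and also one around
a = 5*sqrt(2), a larger cell that indexes the same lines. The interval
test here is only a necessary condition, so a real program would still
refine and rank whatever survives.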
b) Index permutation
Full index permutation in index space should also be exhaustive, within
its declared bounds.
Taupin's program (TAUP [=Powder]) makes quite a close approach to this,
though with a limited set of base lines. A consequence is that it can
become incredibly slow in low symmetry.
c) Grid search
Grid search methods are formally exhaustive for the particular
grid-point array used, and become fully exhaustive (within their various
other bounds, such as limits on cell volume, Miller indices, etc.) if
the step size is made sufficiently small.
Whether they will succeed in exhaustively flushing out all the solutions
present within their search space depends also on the power of the merit
criteria used.
My Mmap and Hmap programs achieve this reasonably reliably for the
2-dimensional (most-dominant-zone-based) SIW sections of solution space
for which they are designed. Their successor PEURIST, when it eventually
appears, will incorporate an extension of these methods.
Another grid-search program, which operates directly in up to three
dimensions (i.e. up to orthorhombic) is SCANIX by Wojtek Paszkowicz.
Grid search is inherently a relatively inefficient method (though its
calculations sometimes incorporate sophisticated optimisations), but it
can be very robust.
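The grid-search idea in (c) can be sketched in a few lines of Python.
This is only an illustration under stated assumptions: a tetragonal
cell, a fixed step in a and c, and a deliberately naive merit (mean
distance in Q from each observed line to the nearest calculated line,
lower being better). It does not reproduce the merit criteria or the
optimisations of Mmap, Hmap or SCANIX, and all the names are
hypothetical.

```python
from itertools import product


def calc_q_tetragonal(a, c, max_index=3):
    """Sorted calculated Q = (h^2 + k^2)/a^2 + l^2/c^2 values."""
    return sorted({(h * h + k * k) / a ** 2 + l * l / c ** 2
                   for h, k, l in product(range(max_index + 1), repeat=3)
                   if (h, k, l) != (0, 0, 0)})


def merit(q_obs, q_calc):
    """Mean distance from each observed line to its nearest calculated line."""
    return sum(min(abs(q - qc) for qc in q_calc) for q in q_obs) / len(q_obs)


def grid_points(lo, hi, step):
    """Grid values from lo to hi, built from an integer count to avoid drift."""
    count = int(round((hi - lo) / step)) + 1
    return [lo + i * step for i in range(count)]


def grid_search(q_obs, a_range, c_range, step):
    """Evaluate every grid point; return (merit, a, c) sorted best first."""
    results = []
    for a in grid_points(a_range[0], a_range[1], step):
        for c in grid_points(c_range[0], c_range[1], step):
            results.append((merit(q_obs, calc_q_tetragonal(a, c)), a, c))
    return sorted(results)
```

Every grid point is visited, which is what makes the method exhaustive
for that grid and also what makes it slow. A merit as crude as this one
tends to favour large cells with many calculated lines, which is one
reason the power of the merit criteria matters so much.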
To summarise:
Truly exhaustive methods do exist, and there are already a number of
programs which come close to implementing them. However, indexing's
solution space is simply too large in low symmetry for us to rely on
them exclusively. There are reasons to suspect that the optimisations in
some of the existing programs can also occasionally produce logical
inconsistencies that make them fall short of complete exhaustiveness.
Perhaps, now that processor power is far more accessible than when those
programs were written, re-implementations will be developed ab initio
which, though perhaps not as fast, do not contain such compromises.
3)
> in my view an iterative least squares estimate between the observed
> and calculated d-spacings is the best solution choice within a
> particular range.
I'd prefer to substitute "2Thetas" for "d-spacings" in that sentence, at
least for data obtained with angular dispersion instruments.
Least-squares refinement vs 2Theta (as with Celref within Chekcell) may
not produce as high figures of merit as LS refinement vs d (or, more
usually, Q=1/dsq), but it is likely to produce a cell that more nearly
approaches physical reality.
Refinement against 2Theta in expanding shells of 2Theta (and hence d*) is
also particularly powerful at releasing a trial solution that is stuck in
a local minimum reasonably close to the physical solution.
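To make the distinction in (3) concrete, the following Python sketch
refines a single cubic cell edge in two ways: by linear least squares
against Q, and by Gauss-Newton iteration against 2Theta. It is a toy
under stated assumptions (cubic cell, unit weights, an assumed
Cu K-alpha1 wavelength, known N = h^2 + k^2 + l^2 for each line) and is
not the procedure used by Celref or any other program named here.

```python
import math

WAVELENGTH = 1.5406  # angstrom; Cu K-alpha1, an assumed value for the example


def two_theta_deg(a, n, wavelength=WAVELENGTH):
    """2Theta in degrees for a cubic line with N = h^2 + k^2 + l^2."""
    sin_theta = wavelength * math.sqrt(n) / (2.0 * a)
    return 2.0 * math.degrees(math.asin(sin_theta))


def refine_vs_q(lines, wavelength=WAVELENGTH):
    """Linear least squares on Q = N / a^2 with unit weights in Q."""
    sum_nq = sum_nn = 0.0
    for n, tt_obs in lines:
        q_obs = (2.0 * math.sin(math.radians(tt_obs) / 2.0) / wavelength) ** 2
        sum_nq += n * q_obs
        sum_nn += n * n
    return math.sqrt(sum_nn / sum_nq)  # a = 1 / sqrt(A), A = sum(NQ)/sum(N^2)


def refine_vs_two_theta(lines, a_start, cycles=20, wavelength=WAVELENGTH):
    """Gauss-Newton least squares on 2Theta (degrees) with unit weights."""
    a = a_start
    for _ in range(cycles):
        num = den = 0.0
        for n, tt_obs in lines:
            tt_calc = two_theta_deg(a, n, wavelength)
            theta = math.radians(tt_calc) / 2.0
            # d(2Theta)/da = -2 tan(Theta) / a in radians, converted to degrees
            slope = math.degrees(-2.0 * math.tan(theta) / a)
            num += slope * (tt_obs - tt_calc)
            den += slope * slope
        a += num / den
    return a
```

On error-free lines both functions recover the same cell edge. Once a
systematic 2Theta error such as a small zero shift is added they give
slightly different answers, because a constant error in 2Theta is not a
constant error in Q and the two fits weight the lines differently.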
I'd add that there are circumstances in which least-squares itself can
become unstable. In such situations one can still fall back on parabolic
refinement against a merit surface (the 3-point fitting of a parabola
cyclically to each variable parameter in turn, until convergence is
reached). This is a slower but incredibly robust and general method,
which can often succeed when others fail.
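The parabolic refinement just described can be sketched in Python as
follows. The sketch assumes a merit function that is to be minimised
(for a figure of merit that should be maximised, pass its negative), and
the step-halving rule and convergence test are choices made for the
example rather than anything taken from an existing program.

```python
def parabolic_refine(merit, start, steps, tol=1e-8, max_cycles=200):
    """Minimise merit(params) by cyclic three-point parabola fits."""
    params, steps = list(start), list(steps)
    for _ in range(max_cycles):
        converged = True
        for i in range(len(params)):
            def along(x):
                # Merit with parameter i set to x, all others held fixed.
                return merit(params[:i] + [x] + params[i + 1:])

            x0, h = params[i], steps[i]
            f_minus, f_zero, f_plus = along(x0 - h), along(x0), along(x0 + h)
            curvature = f_minus - 2.0 * f_zero + f_plus
            if curvature > 0.0:
                # Vertex of the parabola through the three sampled points.
                x_new = x0 + 0.5 * h * (f_minus - f_plus) / curvature
            else:
                # Fitted parabola has no minimum: step to the better neighbour.
                x_new = x0 - h if f_minus < f_plus else x0 + h
            if along(x_new) < f_zero:
                params[i] = x_new  # accept only moves that improve the merit
                if abs(x_new - x0) > tol:
                    converged = False
            elif h > tol:
                steps[i] = 0.5 * h  # rejected move: sample more finely next time
                converged = False
        if converged:
            break
    return params
```

Because a move is accepted only when it lowers the merit, the procedure
cannot diverge, which is the source of its robustness. The price is one
parameter at a time and at least four merit evaluations per parameter
per cycle, hence the slowness.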
4)
> it's a lot of fun in any case.
You bet! At least if one enjoys exploring really knurly search spaces
that can contain tens of thousands of possible solutions!
With best wishes to all UPPW indexers and spectators
Robin Shirley
-----------------------------------------
To: sdpd...@yahoogroups.com
From: Alan Coelho <alan.coelho...@attglobal.net>
Date: Tue, 25 Nov 2003 16:43:44 +0100
Subject: [sdpd] Re: UPPW-5 solution - UPPW-6 problem
Reply-to: sdpd...@yahoogroups.com
hello to all
First I would like to praise Armel for bringing indexing into the
spotlight. Now I would like to comment on Armel's statement "Have the
most recent indexing software outmatched the old established ones ?
Perhaps, hard to say".
I don't think that these examples are going to show whether new
algorithms succeed; they do, however, show when they fail. I
do enjoy the challenge so don't stop Armel. The reason for being
pessimistic is the fact that if a new method does find a correct
solution to a difficult unknown then who is to know if the correct
solution was indeed found. As much as I think that "real" data is
necessary for testing methods I do think that there is no substitute for
simulated test data where the solutions are known. There is also no
substitute for understanding the methods rather than trusting their
implementations.
UPPW-5 is a case where powder data does not yield a unique solution.
When multiple solutions yield similar "perfect" Pawley/Le Bail fits with
similar de Wolff values then it is not a matter of failure of the
programs/methods but rather a failure of the data to yield a unique
solution. In my view it is therefore not possible for any indexing
method to resolve the ambiguity. This is not to say however that the
door should be closed to new methods. The way forward is to go
backwards. Back tracking could mean recollecting the data on a higher
resolution instrument (i.e. Peter Stephens), annealing the sample or
trying some SEM/TEM analysis. If all this fails then it is really a
matter of trying structure solution for each of the possible lattice
parameters.
Excuse the long mail but while I am at it I would like to correct a
misconception regarding the idea of an "exhaustive" search and in the
process state the reason why I developed an indexing algorithm. I have
heard the term "exhaustive" used so much in indexing that I am beginning
to believe that I got something wrong. So please enlighten me if you can.
On data with small 2Th errors a method can claim to be exhaustive.
However, on data with large errors due to, say, peak overlap on a dominant
zone problem the term "exhaustive" loses meaning. The successive
dichotomy method, a stroke of genius by Daniel Louër to use it, is often
regarded as being exhaustive. For data with large errors the delta-2Th
values would need to be set large for the dichotomy method to proceed to
the correct solution. If the delta-2Th were indeed set large enough then
there would be many solution ranges returned (note I am defining a
solution range as a solution with +- delta-2Th). Thus sure enough the
solution range would be there but the correct range would be impossible
to define. If the correct range could somehow be identified then in my
view an iterative least squares estimate between the observed and
calculated d-spacings is the best solution choice within a particular
range. Note multiple Pawley/Le Bail fits would not be feasible if the
delta-2Th were large; this brings me to my own algorithm (dare I say it's
Topas) which returns iterative least squares solutions. Now having said
that no method is going to resolve ambiguity, it is my opinion that a
combined ITO and DICVOL probably solves more than 90% of every thing
thrown at it. Thus new algorithms are only filling a small gap and to
find this gap is presumably what UPPW is all about - or is it? If not
then it's a lot of fun in any case.
cheers
alan
Your use of Yahoo! Groups is subject to http://docs.yahoo.com/info/terms/