On 2005-06-27, Greg Stark <gsstark@mit.edu> wrote:
> I believe all the picksplit functions are based on (apparently via
> copy/paste) a single algorithm that depends on a single operator: a kind
> of "distance" function. Usually it's the same function underlying the
> penalty gist api function.
That's not quite true. Among the contrib/* modules that I've studied,
there are at least two quite different picksplit algorithms, and in
general I do not think it is possible to provide a single generic
picksplit that will work efficiently for _all_ data types. (And it is of
course important not to constrain the types of data that are allowed...)

It might be reasonable to implement a "default" picksplit based on a
user-supplied metric function (_not_ the same metric as "penalty"). But
I think there always needs to be scope for users to provide their own
split function.
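
To make the "default" idea concrete, here is a rough sketch of what a
metric-driven split could look like. It is standalone C, not code against
the real GiST interface: metric_fn, default_picksplit and double_metric
are hypothetical names, the keys are opaque pointers, and the
"farthest-pair seeds, then nearest seed" strategy is just one plausible
choice, not what any existing contrib module does. The metric is assumed
to be non-negative and symmetric.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-supplied metric: distance between two opaque keys. */
typedef double (*metric_fn)(const void *a, const void *b);

/*
 * Sketch of a metric-driven split: pick the two entries that are
 * farthest apart as seeds, then put each entry on the side of the
 * nearer seed.  side[i] is set to 0 (left) or 1 (right).
 */
static void
default_picksplit(const void *const *keys, size_t n,
                  metric_fn metric, int *side)
{
    size_t seed_l = 0, seed_r = 1;
    double worst = -1.0;    /* below any non-negative distance */

    assert(n >= 2);

    /* Quadratic scan for the most distant pair of entries. */
    for (size_t i = 0; i < n; i++)
        for (size_t j = i + 1; j < n; j++) {
            double d = metric(keys[i], keys[j]);
            if (d > worst) {
                worst = d;
                seed_l = i;
                seed_r = j;
            }
        }

    /* Assign every entry to the nearer seed; ties go left. */
    for (size_t i = 0; i < n; i++) {
        double dl = metric(keys[i], keys[seed_l]);
        double dr = metric(keys[i], keys[seed_r]);
        side[i] = (dr < dl) ? 1 : 0;
    }
}

/* Example metric for keys that are plain doubles. */
static double
double_metric(const void *a, const void *b)
{
    double d = *(const double *) a - *(const double *) b;
    return d < 0 ? -d : d;
}
```

Note that nothing here balances the two sides, and if every key is at
distance zero from every other, everything lands on the left. That is
the sort of type-specific problem a custom split function is needed for.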
--
Andrew, Supernews
http://www.supernews.com - individual and corporate NNTP services