[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index][Thread Index][Top&Search][Original]

Re: (unfortunate) performance of sort



[A complimentary Cc of this posting was sent to John P. Linderman
<jpl@research.att.com>],
who wrote in article <387DEC36.FC9DD18C@research.att.com>:
>     abe> I don't agree:
>     abe> when changing: sort {$a <=> $b} @p; #pipe2 (original)
>     abe> into: sort @p; #pipe1
>     abe> The benchmarks change dramaticaly in favour of the latter:

> I should clarify a couple points.  I meant that the poor performance
> of the sort was not due to the fact that a subroutine was being
> invoked, although that can, indeed, exact a (less terrifying)
> performance penalty.

I think sort {$a <=> $b} @p does not envoke any subroutine with recent
Perls.

> The behavior is not limited to organ-pipe (increasing then decreasing)
> input.  It is equally bad for saw-tooth input, which completes the
> initialization of @p using
>     push(@p, @p);       # rather than push(@p, reverse(@p))
> or truncated sawtooth input
>     push(@p, @p[0 .. int($size/4)]);
> It also doesn't rely on the monotonicity of the input...
> a little bit of disorder won't prevent the chaos,
> although this is harder to demonstrate in a few lines.
> We are talking large, unhappy, families of nasty inputs,
> not just a single delicately tuned pathological input.

What about using 3 different splitters (say,

  median of 1, n/2, n
  median of n/2-1, n/2, n/2+1
  median of n/4, n/2, 3n/4

) in turn on consequent iterations of the algo? Would not it make it
more robust on expected inputs?

> To put things into some kind of perspective, note that both Abe's
> benchmarks and mine don't look scary for arrays with less than
> 1000 elements.  The quadratic performance, 1000 * 1000,
> is 100 times worse than the N log N performance, 1000 * 10,
> but with the well-behaved sort time under 1/100 of a second,
> the difference can overlooked (unless you do it many times,
> or need rapid responses... can we step away from the nuclear
> reactor while we discuss this :-).  So for small (<1000 element)
> arrays, don't worry.

But for these guys one would not worry with rewriting qsort at all, as
p5p did...

Ilya


[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index][Thread Index][Top&Search][Original]