On 1/8/08, Erik Jones <erik@myemma.com> wrote:
> > Hmm so rather than NOT IN ( .. LIMIT 50) would you suggest IN ( ...
> > OFFSET 50) like in Erik's example? Or something else entirely?
>
> Well, that would give you some gain. Think about it like this: once
> a given user's history records are at 50 and you insert a row, if you
> use the NOT IN clause your comparing each of 51 rows to each of the
> 50 you want to keep to find the one that can go while with the IN
> version your comparing each of the 51 rows to the 1 that can go. Now
> how much of a gain that will be I can't say, YMMV. I don't remember
> you saying anything about it so I'll also go ahead and point out that
> you most likely will want an index on user_id if you don't already.
Thanks for the explanation Erik. I did already have the index, but
I've reimplemented using IN/OFFSET instead of NOT IN/LIMIT and it does
indeed seem to be faster.
> > Do you think a regular batch process to delete rows might be more
> > appropriate than a trigger in this scenario?
>
> That depends on your usage pattern. Assuming you aren't running user
> history report queries constantly that's probably what I'd do. Also,
> if you're sure you won't need anything but the last 50 records per
> user, I'd definitely agree with cleaning out data that's not needed.
OK cool, thanks for your advice Erik.
Cheers,
Jamie