Re: count(*) and bad design was: Experiences with extensibility

Поиск
Список
Период
Сортировка
От Alban Hertroys
Тема Re: count(*) and bad design was: Experiences with extensibility
Дата
Msg-id 90E693F2-125B-4554-A919-36C839F29F6C@solfertje.student.utwente.nl
обсуждение исходный текст
Ответ на Re: count(*) and bad design was: Experiences with extensibility  (Ivan Sergio Borgonovo <mail@webthatworks.it>)
Список pgsql-general
On Jan 15, 2008, at 3:03 PM, Ivan Sergio Borgonovo wrote:

> On Tue, 15 Jan 2008 14:43:35 +0100
> Alban Hertroys <dalroi@solfertje.student.utwente.nl> wrote:
>
>> You need to scroll to the last row to find the size of the result
>> set, but after that it's pretty easy to return random rows by
>> scrolling to them (and marking them 'read' in some way to prevent
>> accidentally returning the same row again).
>
> Could you post a snippet of code or something giving a more detailed
> idea of it?
>
> BTW since cursors support offset if you're not interested if the
> order of the retrieved rows is random too you don't even have to
> remember which one you read I think.

I posted it on this list a while ago when I came up with this
solution. I had some trouble finding my old post in the pgsql-general
archives though - I could find the thread, just not my final posting,
and searching didn't even turn up the thread.

I did find it here: http://www.mail-archive.com/pgsql-
general@postgresql.org/msg103670.html
The thread contains several other approaches to the problem, it
really depends on your problem domain which one fits your bill.

I think the function in my original posting could do with clearer
comments though, so here's the function again:

/*
  * Return $limit random rows from the result set of SQL query $query
  */
function randomSet(
    $query,        // The query to execute
    $limit        // The (max) number of random rows required
) {
         // SQL to declare the cursor
         query("DECLARE _cur SCROLL CURSOR WITHOUT HOLD FOR $query");

         /* Get the range for random(1, n)
     *
     * Determined by scrolling the cursor to the last row.
     * Equivalent to select count(*), but without a separate query.
     */
         query("MOVE FORWARD ALL IN _cur");
         $count = pg_affected_rows();

         $uniques = array(); // A list of used cursor offsets
         $resultSet = array();

    // Fetch random rows until we have enough or there are no more
         while ($limit > 0 && count($uniques) < $count) {
        // Determine random scroll offset
                 $idx = random(1, $count);

                 // Skip records with an index we already used
                 if (in_array($idx, $uniques))
                         continue;

                 //Fetch the random row
                 $record = query("FETCH ABSOLUTE $idx FROM _cur");

        // Add the row offset to the list of used offsets
                 $uniques[] = $idx;

                 $resultSet[] = $record;
                 $limit--;
         }

         // query
         query("CLOSE _cur");

         return $resultSet;
}



Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.


!DSPAM:737,478f32e59497683469944!



В списке pgsql-general по дате отправления:

Предыдущее
От: Reg Me Please
Дата:
Сообщение: Accessing composite type columns from C
Следующее
От: Sebastián Baioni
Дата:
Сообщение: Can't make backup (again)