Re: Sequence vs UUID

Поиск
Список
Период
Сортировка
От Ron
Тема Re: Sequence vs UUID
Дата
Msg-id c140b8ee-941b-383a-986d-de1b924a0b65@gmail.com
обсуждение исходный текст
Ответ на Re: Sequence vs UUID  (Miles Elam <miles.elam@productops.com>)
Ответы Re: Sequence vs UUID  (Miles Elam <miles.elam@productops.com>)
Список pgsql-general
Then it's not a Type 4 UUID, which is perfectly fine; just not random.

Also, should now() be replaced by clock_timestamp(), so that it can be 
called multiple times in the same transaction?

On 1/28/23 21:28, Miles Elam wrote:
> On Sat, Jan 28, 2023 at 6:02 PM Ron <ronljohnsonjr@gmail.com> wrote:
>> Type 4 UUIDs are sub-optimal for big table because cache hit rates drop through the floor.
>>
>> This lesson was burned into my psyche waaaay back in the Clinton administration.  It was my task to speed up a five
hourbatch job which read input records from a flat file, did some validations and then inserted them.  Simply sorting
theinput file on the primary key fields -- we used natural keys, not synthetics -- dropped the run time to two hours.
(VMSSORT saved the day, because you could tell it the sort order you wanted; thus, I could preserve the header record
atthe top of the file, and the trailer record at the end of the file without jumping through a bunch of hoops.)
 
> This can be mitigated with judicious use of a sequence at the front of
> the uuidv4.
>
>      https://www.2ndquadrant.com/en/blog/sequential-uuid-generators/
>
> More effort than just calling the built-in gen_random_uuid() or
> equivalent in app code, but a substantial performance gain for your
> effort.
>
>      https://github.com/tvondra/sequential-uuids
>
> And in a managed environment where you can't install custom
> extensions, a fairly simple function with divide on unix epoch seconds
> combined with a call to overlay(...) should suffice performance-wise.
> At 60 seconds, this will loop every 45 days or so, and you can choose
> how much "sequentialness" works for you, from 1 to 4 bytes at the
> expense of pseudo-randomness.
>
> -----------------------------
>
> -- Generate time interval UUID
> CREATE OR REPLACE FUNCTION gen_interval_uuid(interval_seconds int4 =
> 60, block_num_bytes int2 = 2)
> RETURNS uuid LANGUAGE sql VOLATILE PARALLEL SAFE AS $$
>    SELECT encode(
>      overlay(
>        -- convert the uuid to byte array
>        uuid_send(gen_random_uuid())
>        -- use only the bottom bytes
>        PLACING substring(
>            int4send((extract(epoch FROM now()) / interval_seconds)::int4)
>            FROM (5 - block_num_bytes)
>        )
>        -- place at the front two bytes of the uuid
>        FROM 1
>      )
>      -- convert the resulting byte array to hex for conversion to uuid
>      , 'hex')::uuid
>    WHERE interval_seconds > 0 AND block_num_bytes BETWEEN 1 AND 4
> $$;
>
> -----------------------------
>
> Technically affecting the v4 spec. You could always convert to a
> UUIDv8, which is the intention behind that new version even though the
> standard hasn't been ratified yet.
>
>
> Cheers,
>
> Miles Elam
>
>

-- 
Born in Arizona, moved to Babylonia.



В списке pgsql-general по дате отправления:

Предыдущее
От: Miles Elam
Дата:
Сообщение: Re: Sequence vs UUID
Следующее
От: Miles Elam
Дата:
Сообщение: Re: Sequence vs UUID