Re: Sequence vs UUID
От | Ron |
---|---|
Тема | Re: Sequence vs UUID |
Дата | |
Msg-id | c140b8ee-941b-383a-986d-de1b924a0b65@gmail.com обсуждение исходный текст |
Ответ на | Re: Sequence vs UUID (Miles Elam <miles.elam@productops.com>) |
Ответы |
Re: Sequence vs UUID
(Miles Elam <miles.elam@productops.com>)
|
Список | pgsql-general |
Then it's not a Type 4 UUID, which is perfectly fine; just not random. Also, should now() be replaced by clock_timestamp(), so that it can be called multiple times in the same transaction? On 1/28/23 21:28, Miles Elam wrote: > On Sat, Jan 28, 2023 at 6:02 PM Ron <ronljohnsonjr@gmail.com> wrote: >> Type 4 UUIDs are sub-optimal for big table because cache hit rates drop through the floor. >> >> This lesson was burned into my psyche waaaay back in the Clinton administration. It was my task to speed up a five hourbatch job which read input records from a flat file, did some validations and then inserted them. Simply sorting theinput file on the primary key fields -- we used natural keys, not synthetics -- dropped the run time to two hours. (VMSSORT saved the day, because you could tell it the sort order you wanted; thus, I could preserve the header record atthe top of the file, and the trailer record at the end of the file without jumping through a bunch of hoops.) > This can be mitigated with judicious use of a sequence at the front of > the uuidv4. > > https://www.2ndquadrant.com/en/blog/sequential-uuid-generators/ > > More effort than just calling the built-in gen_random_uuid() or > equivalent in app code, but a substantial performance gain for your > effort. > > https://github.com/tvondra/sequential-uuids > > And in a managed environment where you can't install custom > extensions, a fairly simple function with divide on unix epoch seconds > combined with a call to overlay(...) should suffice performance-wise. > At 60 seconds, this will loop every 45 days or so, and you can choose > how much "sequentialness" works for you, from 1 to 4 bytes at the > expense of pseudo-randomness. > > ----------------------------- > > -- Generate time interval UUID > CREATE OR REPLACE FUNCTION gen_interval_uuid(interval_seconds int4 = > 60, block_num_bytes int2 = 2) > RETURNS uuid LANGUAGE sql VOLATILE PARALLEL SAFE AS $$ > SELECT encode( > overlay( > -- convert the uuid to byte array > uuid_send(gen_random_uuid()) > -- use only the bottom bytes > PLACING substring( > int4send((extract(epoch FROM now()) / interval_seconds)::int4) > FROM (5 - block_num_bytes) > ) > -- place at the front two bytes of the uuid > FROM 1 > ) > -- convert the resulting byte array to hex for conversion to uuid > , 'hex')::uuid > WHERE interval_seconds > 0 AND block_num_bytes BETWEEN 1 AND 4 > $$; > > ----------------------------- > > Technically affecting the v4 spec. You could always convert to a > UUIDv8, which is the intention behind that new version even though the > standard hasn't been ratified yet. > > > Cheers, > > Miles Elam > > -- Born in Arizona, moved to Babylonia.
В списке pgsql-general по дате отправления: