Discussion: what checksum algo?

what checksum algo?

From
Scott Ribe
Date:
What checksum algorithm wound up in 9.3?

(I found Simon Riggs' 12/2011 submittal using Fletcher's, Michael Paquier's 7/2013 post stating CRC32 reduced to 16, and
another post online claiming that it was changed from CRC before release but not stating what it was changed to.)

--
Scott Ribe
scott_ribe@elevated-dev.com
http://www.elevated-dev.com/
(303) 722-0567 voice

Re: what checksum algo?

From
Michael Paquier
Date:
On Thu, Nov 14, 2013 at 12:58 AM, Scott Ribe
<scott_ribe@elevated-dev.com> wrote:
> What checksum algorithm wound up in 9.3?
>
> (I found Simon Riggs 12/2011 submittal using Fletcher's, Michael Paquier's 7/2013 post stating CRC32 reduced to 16,
> and another post online claiming that it was changed from CRC before release but not stating what it was changed to.)

CRC16 is used. It was introduced with this commit:
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=43e7a668499b8a69a62cc539a0fbe6983384339c
It was then moved entirely to src/include/storage/checksum_impl.h with
this commit, to ease external use of the algorithm:
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=f04216341dd1cc235e975f93ac806d9d3729a344
Regards,
--
Michael


Re: what checksum algo?

From
Peter Geoghegan
Date:
On Wed, Nov 13, 2013 at 4:39 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> CRC16 is used.

Actually, subsequently another algorithm was introduced - see commit
43e7a668499b8a69a62cc539a0fbe6983384339c .


--
Regards,
Peter Geoghegan


Re: what checksum algo?

From
Tatsuo Ishii
Date:
Hi,

It was good to see you in Japan.

PostgreSQL Enterprise Consortium (a non-profit PostgreSQL-related
organization in Japan, http://www.pgecons.org) is about to measure the
performance impact of checksums using a high-end PC server (80 real
cores with 2TB of memory). What I have in mind is using pgbench with a
custom query (purely SELECT). Are there any recommendations or
suggestions for doing that?

(The results will be made public, of course.)
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp

> On Wed, Nov 13, 2013 at 4:39 PM, Michael Paquier
> <michael.paquier@gmail.com> wrote:
>> CRC16 is used.
>
> Actually, subsequently another algorithm was introduced - see commit
> 43e7a668499b8a69a62cc539a0fbe6983384339c .
>
>
> --
> Regards,
> Peter Geoghegan
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general


Re: what checksum algo?

From
Peter Geoghegan
Date:
On Wed, Nov 13, 2013 at 5:53 PM, Tatsuo Ishii <ishii@postgresql.org> wrote:
> It was good to see you in Japan.

Likewise.

> PostgreSQL Enterprise Consortium (a non-profit PostgreSQL-related
> organization in Japan, http://www.pgecons.org) is about to measure the
> performance impact of checksums using a high-end PC server (80 real
> cores with 2TB of memory). What I have in mind is using pgbench with a
> custom query (purely SELECT). Are there any recommendations or
> suggestions for doing that?
>
> (The results will be made public, of course.)

Well, off the top of my head I would of course be sure to build
Postgres to take advantage of this:

 * Vectorization of the algorithm requires 32bit x 32bit -> 32bit integer
 * multiplication instruction. As of 2013 the corresponding instruction is
 * available on x86 SSE4.1 extensions (pmulld) and ARM NEON (vmul.i32).
 * Vectorization requires a compiler to do the vectorization for us. For recent
 * GCC versions the flags -msse4.1 -funroll-loops -ftree-vectorize are enough
 * to achieve vectorization.
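
To make the point of that comment concrete, here is a rough Python model of the lane structure it refers to. This is a sketch under assumptions, not the code in checksum_impl.h: the names mix, lane_sums and fold are hypothetical, the seeds are made up, and the prime, the 17-bit shift and the 32-lane width are quoted from memory of that header and should be checked against the source. The only thing it is meant to show is that each lane's running sum depends on its own previous value and one input word, so the inner loop has no cross-lane dependency and the 32-bit multiply can be done for several lanes at once.

```python
FNV_PRIME = 16777619      # 32-bit FNV prime
MASK32 = 0xFFFFFFFF
N_LANES = 32              # number of parallel partial sums (assumed)

def mix(acc, value):
    """One FNV-1a-style step with an extra shift-xor, in 32-bit arithmetic."""
    tmp = (acc ^ value) & MASK32
    # The 32bit x 32bit -> 32bit multiply is the operation pmulld provides.
    return ((tmp * FNV_PRIME) & MASK32) ^ (tmp >> 17)

def lane_sums(words, seeds):
    """Feed 32-bit words row by row into len(seeds) independent lanes."""
    n = len(seeds)
    if len(words) % n != 0:
        raise ValueError("word count must be a multiple of the lane count")
    sums = list(seeds)
    for row in range(0, len(words), n):
        # No lane reads another lane's sum, so a vectorizing compiler can
        # perform this inner loop as a few SIMD multiplies and xors.
        for lane in range(n):
            sums[lane] = mix(sums[lane], words[row + lane])
    return sums

def fold(sums):
    """Combine the per-lane sums into one 32-bit value by xor."""
    result = 0
    for s in sums:
        result ^= s
    return result

if __name__ == "__main__":
    seeds = list(range(1, N_LANES + 1))                      # made-up seeds
    page = [(i * 2654435761) & MASK32 for i in range(2048)]  # 8 kB of words
    print(hex(fold(lane_sums(page, seeds))))
```

Changing one input word disturbs only the lane it lands in until the final fold, which is the property that lets a compiler process a whole row of lanes with vector instructions when built with the flags quoted above.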

Unfortunately I have no idea what packagers are currently doing about
this. Could you please enlighten me, Devrim?

It also occurs to me that pgbench will be pretty unsympathetic to
checksums as compared to a non-checksummed baseline here, because of
course as always it uses a uniform distribution, and that's going to
literally maximize the amount of verification that must occur. Maybe
that's something you're interested in, because you want to
characterize the worst case. If the average case is more interesting,
you could try applying this patch:

https://commitfest.postgresql.org/action/patch_view?id=1240

I don't know if the patch is any good, having not looked at the code,
but surely as the original author of pgbench you are eminently
qualified to judge this. I think that in general I prefer a uniform
distribution, because most often I look to pgbench to satisfy myself
that certain types of regressions have not occurred. That's quite a
different thing to a representative workload, obviously.
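
The worst-case argument above can be illustrated with a toy buffer-cache model. This is a sketch, not a benchmark: it assumes a page is checksum-verified once each time it is read into the cache (that is, on a miss), uses a plain LRU cache, and invents a skew in which 90% of accesses go to 5% of the pages purely for illustration. count_verifications and run are hypothetical names, and the numbers say nothing about real pgbench throughput or about what the patch linked above actually implements.

```python
import random
from collections import OrderedDict

def count_verifications(pick_page, accesses, cache_pages):
    """Count misses in an LRU cache; each miss stands for one page read
    from storage and therefore one checksum verification."""
    cache = OrderedDict()
    misses = 0
    for _ in range(accesses):
        page = pick_page()
        if page in cache:
            cache.move_to_end(page)        # hit: mark as most recently used
        else:
            misses += 1                    # miss: page is read and verified
            cache[page] = True
            if len(cache) > cache_pages:
                cache.popitem(last=False)  # evict the least recently used
    return misses

def run(total_pages=10_000, cache_pages=1_000, accesses=50_000, seed=1):
    rng = random.Random(seed)

    def uniform():
        return rng.randrange(total_pages)

    hot_pages = total_pages // 20          # 5% of pages are "hot" (assumed)

    def skewed():
        if rng.random() < 0.9:             # 90% of accesses are hot (assumed)
            return rng.randrange(hot_pages)
        return rng.randrange(total_pages)

    return (count_verifications(uniform, accesses, cache_pages),
            count_verifications(skewed, accesses, cache_pages))

if __name__ == "__main__":
    uniform_misses, skewed_misses = run()
    print("uniform:", uniform_misses, "skewed:", skewed_misses)
```

With a data set larger than the cache, the uniform picker misses on most accesses while the skewed one keeps its hot set resident, so the uniform run pays for far more verifications per query.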

--
Regards,
Peter Geoghegan


Re: what checksum algo?

From
Tatsuo Ishii
Date:
> Well, off the top of my head I would of course be sure to build
> Postgres to take advantage of this:
>
>  * Vectorization of the algorithm requires 32bit x 32bit -> 32bit integer
>  * multiplication instruction. As of 2013 the corresponding instruction is
>  * available on x86 SSE4.1 extensions (pmulld) and ARM NEON (vmul.i32).
>  * Vectorization requires a compiler to do the vectorization for us. For recent
>  * GCC versions the flags -msse4.1 -funroll-loops -ftree-vectorize are enough
>  * to achieve vectorization.
>
> Unfortunately I have no idea what packagers are currently doing about
> this. Could you please enlighten me, Devrim?

No problem. We will install PostgreSQL from source code anyway. I
tried this in my local environment: PostgreSQL compiles fine with the
additional arguments you gave me and passes the regression tests (with
pg_regress.c modified, of course, to add the initdb -k flag).

> It also occurs to me that pgbench will be pretty unsympathetic to
> checksums as compared to a non-checksummed baseline here, because of
> course as always it uses a uniform distribution, and that's going to
> literally maximize the amount of verification that must occur. Maybe
> that's something you're interested in, because you want to
> characterize the worst case. If the average case is more interesting,
> you could try applying this patch:
>
> https://commitfest.postgresql.org/action/patch_view?id=1240
>
> I don't know if the patch is any good, having not looked at the code,
> but surely as the original author of pgbench you are eminently
> qualified to judge this. I think that in general I prefer a uniform
> distribution, because most often I look to pgbench to satisfy myself
> that certain types of regressions have not occurred. That's quite a
> different thing to a representative workload, obviously.

Ok, I will look into this when I have enough time.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp