Discussion: Re: [HACKERS] [COMMITTERS] pgsql: pageinspect: Try to fix some bugs in previous commit.
On Fri, Feb 3, 2017 at 10:28 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Thu, Feb 2, 2017 at 11:16 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> I just made the C code agree with what the SQL declarations for the
>>> functions say.
>
>> Doesn't look like it to me. You changed a bunch of places that say
>> UInt32GetDatum to UInt64GetDatum, but the SQL type certainly isn't
>> unsigned.
>
> The machines don't care about that. They do care about the width of
> the datum. Particularly on 32-bit hardware, where one width is
> pass-by-val and the other isn't. (Also, if your point is that you
> wish we had a uint64 SQL type, I doubt we're going there.)

I know the machines don't care about that, but I still think it'd be a
better idea to be consistent with the SQL types throughout rather than
only in places where failing to do so actually breaks something. It's
not much of an abstraction layer if we're only kinda-sorta rigorous
about using it correctly. If nothing else, it obscures best practice
for new patch authors.

> What needs to be resolved to decide if any of this is actually sane is to
> figure out which of these values need to be int64 on the SQL side because
> (a) they could practically exceed the range of signed int32 and (b) it
> would bother us to show such values as negative rather than large
> positive. I suspect that not all the things currently declared as int64
> really need to be. I also remain unhappy that we can't manage to be
> consistent about what a BlockNumber parameter is represented as.

The existing usage is mixed. For example, gin_metapage_info() returns
pending_head and pending_tail as bigint, and those are BlockNumber at
the C level. But bt_metap returns fastroot as int4, and that's also a
BlockNumber at the C level. For my money, int8 is a better choice
because I don't like the idea of the block number rolling over to a
negative value for a large relation, but I probably wouldn't bother
breaking compatibility for the existing cases where it's been done
otherwise, because relations of at least 16TB in size are not yet very
common. At some point we may have to bite that bullet but maybe not
today.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
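The rollover concern in the message above can be illustrated with a small standalone sketch. It does not use PostgreSQL headers: the `BlockNumber` typedef is a stand-in for the real uint32 type, the helper names are hypothetical, and the 8 kB block size is an assumption (the default build setting).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Stand-in for PostgreSQL's BlockNumber, which is a uint32 at the C level. */
typedef uint32_t BlockNumber;

/* Assumption: the default 8 kB block size. */
#define ASSUMED_BLCKSZ 8192

/* Value an int4 column would display: the same 32 bits read as signed. */
int32_t
block_as_int4(BlockNumber blkno)
{
    int32_t shown;

    /* memcpy reinterprets the bits without relying on an out-of-range cast. */
    memcpy(&shown, &blkno, sizeof(shown));
    return shown;
}

/* Value an int8 column would display: zero-extended, so never negative. */
int64_t
block_as_int8(BlockNumber blkno)
{
    int64_t shown = (int64_t) blkno;

    assert(shown >= 0);
    return shown;
}

/* Relation size in bytes at which block numbers first exceed INT32_MAX. */
uint64_t
int4_rollover_bytes(void)
{
    return ((uint64_t) INT32_MAX + 1) * ASSUMED_BLCKSZ;
}
```

Under these assumptions, block number 2147483648 displays as -2147483648 through `block_as_int4` but stays positive through `block_as_int8`, and `int4_rollover_bytes()` comes out to 2^44 bytes, which matches the 16TB figure in the message.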
On Fri, Feb 3, 2017 at 10:54 AM, Robert Haas <robertmhaas@gmail.com> wrote:
>> What needs to be resolved to decide if any of this is actually sane is to
>> figure out which of these values need to be int64 on the SQL side because
>> (a) they could practically exceed the range of signed int32 and (b) it
>> would bother us to show such values as negative rather than large
>> positive. I suspect that not all the things currently declared as int64
>> really need to be. I also remain unhappy that we can't manage to be
>> consistent about what a BlockNumber parameter is represented as.
>
> The existing usage is mixed. For example, gin_metapage_info() returns
> pending_head and pending_tail as bigint, and those are BlockNumber at
> the C level. But bt_metap returns fastroot as int4, and that's also a
> BlockNumber at the C level. For my money, int8 is a better choice
> because I don't like the idea of the block number rolling over to a
> negative value for a large relation, but I probably wouldn't bother
> breaking compatibility for the existing cases where it's been done
> otherwise, because relations of at least 16TB in size are not yet very
> common. At some point we may have to bite that bullet but maybe not
> today.

So based on that theory, here's a patch.

- In hash_page_items, the returned columns are declared as itemoffset
  int2, ctid tid, and data int8 at the SQL level. There's existing
  precedent for returning itemoffset as either int2 or int4, but since
  at the C level it is uint16 it seems best to go with int4, so the
  patch changes that. The other types seem OK: data is an int8 because
  it's reporting the hash code, and that's uint32 at the C level.

- In hash_bitmap_info, the returned columns are declared as bitmapblkno
  int8, bitmapbit int4, and bitstatus bool. Since a block number is a
  uint32 at the C level, it seems right to use int8 at the SQL level
  per the above, because of signedness. The other two return columns
  are fine also. The BlahGetDatum macros mostly match the types,
  although in the case of bitmapblkno it differs in signedness
  (UInt64GetDatum rather than Int64GetDatum).

- hash_metapage_info() returns a whole bunch of columns, all of which
  are unsigned quantities, except for hashm_ntuples, which is a double.
  With the attached patch, all of those unsigned quantities get
  promoted to the next larger signed type at the SQL level (uint32 ->
  int8, uint16 -> int4); exceptionally, hashm_procid is reported as an
  OID, since it is.

In short, this patch makes hashfuncs.c consistent about (1) using the
next wider signed type to report unsigned values and (2) using the
GetDatum macro that matches the SQL return type in width and
signedness. Objections?

--
Robert Haas
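As a rough illustration of rule (1) from the message above, here is a standalone sketch of the "next wider signed type" promotion for two of the hash_page_items columns. It is not the patch itself and does not use PostgreSQL headers; `HashItemRow`, `make_hash_item_row`, and `offset_as_int2` are hypothetical names chosen for this example, and the ctid column is left out.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical row layout: each field uses the C type matching its SQL type. */
typedef struct HashItemRow
{
    int32_t itemoffset;     /* SQL int4; uint16 at the C level */
    int64_t data;           /* SQL int8; uint32 hash code at the C level */
} HashItemRow;

/* Widen each unsigned value to the next larger signed type. */
HashItemRow
make_hash_item_row(uint16_t offset, uint32_t hashcode)
{
    HashItemRow row;

    row.itemoffset = (int32_t) offset;
    row.data = (int64_t) hashcode;

    /* Widening an unsigned value this way can never yield a negative number. */
    assert(row.itemoffset >= 0 && row.data >= 0);
    return row;
}

/* For contrast: what an int2 column would display for the same 16 bits. */
int16_t
offset_as_int2(uint16_t offset)
{
    int16_t shown;

    memcpy(&shown, &offset, sizeof(shown));
    return shown;
}
```

In this sketch, any uint16 value above 32767 comes back negative from `offset_as_int2` but keeps its value in the int4-style `itemoffset` field. In the real code, the matching conversion macro for each field would be the signed one of the same width as the SQL type (for example Int32GetDatum for int4 and Int64GetDatum for int8), which is rule (2).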
Robert Haas <robertmhaas@gmail.com> writes:
> So based on that theory, here's a patch.
> ...
> In short, this patch makes hashfuncs.c consistent about (1) using the
> next wider signed type to report unsigned values and (2) using the
> GetDatum macro that matches the SQL return type in width and
> signedness. Objections?

I haven't actually reviewed the patch, but your description of it
sounds sane.

One thing to think about is what will happen if someday we want to use
64-bit hash codes (a day I think is not that far away). It sounds like
you've already chosen bigint for any output field that represents a
hash code or a related value such as a mask ... but it wouldn't hurt
to look through the fields with that in mind.

regards, tom lane
On Fri, Feb 3, 2017 at 12:04 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> One thing to think about is what will happen if someday we want to use
> 64-bit hash codes (a day I think is not that far away). It sounds like
> you've already chosen bigint for any output field that represents a
> hash code or a related value such as a mask ... but it wouldn't hurt
> to look through the fields with that in mind.

Yeah, I think we're fine on that score.

--
Robert Haas