Обсуждение: NUMERIC needs OID's
Hi all, the exact NUMERIC datatype materializes more and more. What I got so far are the four arithmetic base operators, all six comparision operators and these functions: ABS(n) CEIL(n) EXP(n) FLOOR(n) LN(n) MOD(m,n) ROUND(n[,scale]) SIGN(n) SQRT(n) TRUNC(n[,scale]) The trigonometric ones I left out for now, but since SQRT(), EXP() and LN() work, it wouldn't be that hard to do them later (the former produce the same results as bc(1) does - shortly tested up to 400 digits after the decimal point). The speed of the complex functions is IMHO acceptable. For example EXP() is 25% better than bc(1) on small numbers but needs up to 3 times on very big ones (999). Who ever needs EXP(999) or more? The result is about 7.2e433! SQRT() is a bit slow - so be it. Postgres shouldn't become a substitute for arbitrary precision calculators. So I think it's time now to move the stuff into the backend. Therefor I first need a bunch of OID's (about 30 C functions, 20 SQL functions and 10 operators for now). Should I fill up all the holes? For the next step then I need some help in the parser. I think it would be good to have anything that looks like a double/float and any integer that doesn't fit into an int4 parsed into a NUMERIC, and if it really should be one of the others type-coerced back later. This is the only way to get rid of the single quotes for literal NUMERIC values without the danger of loosing digits (due to double's limited precision). But I'm not sure about all the things that might break if doing so. Rereading this mail I see that I forgot the aggregates. Just some more OID's that I need :-). Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #======================================== jwieck@debis.com (Jan Wieck) #
> Hi all, > > the exact NUMERIC datatype materializes more and more. What I > got so far are the four arithmetic base operators, all six > comparision operators and these functions: > > ABS(n) > CEIL(n) > EXP(n) > FLOOR(n) > LN(n) > MOD(m,n) > ROUND(n[,scale]) > SIGN(n) > SQRT(n) > TRUNC(n[,scale]) > > The trigonometric ones I left out for now, but since SQRT(), > EXP() and LN() work, it wouldn't be that hard to do them > later (the former produce the same results as bc(1) does - > shortly tested up to 400 digits after the decimal point). > > The speed of the complex functions is IMHO acceptable. For > example EXP() is 25% better than bc(1) on small numbers but > needs up to 3 times on very big ones (999). Who ever needs > EXP(999) or more? The result is about 7.2e433! SQRT() is a > bit slow - so be it. Postgres shouldn't become a substitute > for arbitrary precision calculators. > > So I think it's time now to move the stuff into the backend. > Therefor I first need a bunch of OID's (about 30 C functions, > 20 SQL functions and 10 operators for now). Should I fill up > all the holes? No, I don't recommend it. I recommend getting a range of oid's. Oids are confusing enough, without trying to collect them scattered all over a range of values. The problem is that we don't have a range of 30 left anymore, and the needs of future development are surely going to eat up the rest. One idea is to remove some of the conversion functions we have defined that are rarely used. We don't NEED them with Thomas's conversion stuff. Thomas says native conversion is faster, but if we find some that we almost never use, we could rip them out and use those. However, my recommendation is that we have to start thinking about increasing the maximum allowable system oid. 1 - 10100 - 1017548429491288 - 12951597 - 15991608 - 16101619 - 16391644 - The max system oid is stored in transam.h as: /* ---------- * note: we reserve the first 16384 object ids for internal use. * oid's less than this appear inthe .bki files. the choice of * 16384 is completely arbitrary. * ---------- */#define BootstrapObjectIdData 16384 This is 2^14. We can increase this to 32k without any noticable problem, as far as I know. That will give us a nice range of availble oids, and allow renumbering if people want to clean up some of current oid mess. The only problem is that loading a pg_dump -o is going to probably cause rows with oids in the 16k-32k range to duplicate those in the system tables. Is that a problem? I am not sure. As far as I know, there is no reason oid's have to be unique, especially if they are in different tables. contrib/findoidjoins will get messed up by this, but I am not sure if that is a serious problem. I can't figure out another way around it. We could expand by going very high, near 2 billion, assuming no one is up there yet, but the code will get very messy doing that, and I don't recommend it. I think going to 2^15 is going to become necessary someday. The question is, do we do it for 6.5, and if so, how will the duplicate oids affect our users? A new system would start assigning rows with oids > 2^15, so only system table oids installed via pg_dump -o or COPY WITH OIDS FROM would be in the range 2^14-2^15. pg_dump uses the max system oid to determine of a function is a user function, but because it gets the max oid from the template1 table oid, this should be portable across the two oid systems. -- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Neat stuff! > > So I think it's time now to move the stuff into the backend. > > Therefor I first need a bunch of OID's (about 30 C functions, > > 20 SQL functions and 10 operators for now). Should I fill up > > all the holes? > No, I don't recommend it. I recommend getting a range of oid's. Oids > are confusing enough, without trying to collect them scattered all > over a range of values. > The problem is that we don't have a range of 30 left anymore, and the > needs of future development are surely going to eat up the rest. One > idea is to remove some of the conversion functions we have defined > that are rarely used. We don't NEED them with Thomas's conversion > stuff. Thomas says native conversion is faster, but if we find some > that we almost never use, we could rip them out and use those. That won't get us very many OIDs, so doesn't solve the problem in the long run. > However, my recommendation is that we have to start thinking about > increasing the maximum allowable system oid. > The max system oid is stored in transam.h as: > #define BootstrapObjectIdData 16384 Jan is asking for 20-50 OIDs. That is not enough to significantly affect the size of the available "reserved OID" space. I'd recommend taking a block of OIDs from somewhere just above the largest current pre-assigned OID. > I think going to 2^15 is going to become necessary someday. The > question is, do we do it for 6.5, and if so, how will the duplicate > oids affect our users? This seems like a good subject for v7.0, when users might expect a more troublesome transition. Anyway, by then we might have a different solution. - Tom
> For the next step then I need some help in the parser. I > think it would be good to have anything that looks like a > double/float and any integer that doesn't fit into an int4 > parsed into a NUMERIC, and if it really should be one of the > others type-coerced back later. This is the only way to get > rid of the single quotes for literal NUMERIC values without > the danger of loosing digits (due to double's limited > precision). But I'm not sure about all the things that might > break if doing so. I can help with this. I would suggest starting conservatively, converting to NUMERIC only when one of the other existing types does not work. Later, we can more tightly couple the numeric types to make this more transparent. When you have the numeric stuff ready to go, commit it to the source tree and we can start working on the parsing issues. - Tom
> > The problem is that we don't have a range of 30 left anymore, and the > > needs of future development are surely going to eat up the rest. One > > idea is to remove some of the conversion functions we have defined > > that are rarely used. We don't NEED them with Thomas's conversion > > stuff. Thomas says native conversion is faster, but if we find some > > that we almost never use, we could rip them out and use those. > > That won't get us very many OIDs, so doesn't solve the problem in the > long run. > > > However, my recommendation is that we have to start thinking about > > increasing the maximum allowable system oid. > > The max system oid is stored in transam.h as: > > #define BootstrapObjectIdData 16384 I am a dope. The first user oid is 16k, and unused_oids says: $ unused_oids1 - 10100 - 1017548429491288 - 12951597 - 15991608 - 16101619 - 16391644 - There is nothing magical about 1644. We have always been packing the oids into that range, while we have never used oids over 1644. Just start at 1650 and head toward 16384. We have roughly 16200 unused system oids left. -- Bruce Momjian | http://www.op.net/~candle maillist@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Bruce Momjian wrote: > [...] > oids into that range, while we have never used oids over 1644. Just > start at 1650 and head toward 16384. We have roughly 16200 unused > system oids left. I've just committed the NUMERIC stuff to the CURRENT tree. Used OID's 1700-1767 up to now (some more will follow when implementing the trigonometric functions). Well folks - it's there now and can be stressed! Don't forget to initdb. I'll add a first precision check using some of the complex math functions with scale 1000 to the regression soon. Jan -- #======================================================================# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #======================================== jwieck@debis.com (Jan Wieck) #