Discussion: alpha/64bit weirdness
I've made a little headway -- it can't find the mkoidname function because the attributes that it looks up for the argument types have an atttypid of 0 (see the following example).

Other information that should be in there is missing as well, which makes me suspect something is wrong with the insertion of attributes. I don't know enough to tell whether this affects all attributes or just some of them.

Does anyone have any pointers on where to check this problem out?

$4 = {attrelid = 1249, attname = {
    data = "\000\000\000\000attrelid", '\000' <repeats 19 times>},
  atttypid = 0, attdisbursion = 6.89648632e-314, attlen = 0, attnum = 0,
  attnelems = 65540, attcacheoff = 0, atttypmod = -1, attbyval = -1 ',
  attisset = -1 ', attalign = -1 ', attnotnull = -1 ',
  atthasdef = 1 '\001'}

Okay, it would appear that the bootstrap process has access to information about attributes that it shouldn't have, and it has it right away. That is, when I set a breakpoint in DataFill, it already has valid attribute structures, but they haven't been inserted yet (?!?). I'm not quite sure how this works.

On Wed, 4 March 1998, at 15:24:00, Brett McCormick wrote:

> I've made a little headway -- it can't find the mkoidname function
> because the attributes that it looks up for the argument types have a
> atttypid of 0 (see the following example):
> [...]

> I've made a little headway -- it can't find the mkoidname function
> because the attributes that it looks up for the argument types have a
> atttypid of 0 (see the following example):
> [...]
> Does anyone have any pointers to where to check this problem out?

I have an idea. Edit initdb and add a '-d 3' option to each 'postgres', then run initdb. You will see dumps of all the structures as things are happening, I think. Give it a try.

-- 
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)

> $4 = {attrelid = 1249, attname = {
>     data = "\000\000\000\000attrelid", '\000' <repeats 19 times>},
>   atttypid = 0, attdisbursion = 6.89648632e-314, attlen = 0, attnum = 0,
> [...]

Now that I am looking at this, I see that the attname has four null bytes before it. This looks like some kind of alignment error, perhaps, as if the previous entry is writing past its end and into this one. Everything after the 'data' element shows garbage because it is all shifted over.

I did add the atttypmod field to the pg_attribute structure, and it is an int2/short. I wonder if that threw off some alignment, and only Alpha has a problem with it.

Please try with Assert on:

    configure --enable-cassert

Man, if I introduced this problem somehow, I am going to be upset with myself, and I am sure a few Alpha users will join me.

-- 
Bruce Momjian

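A minimal sketch of the alignment effect being speculated about here, using a simplified stand-in for a catalog row. The struct names DemoBefore and DemoAfter and their fields are hypothetical and do not reproduce the real pg_attribute layout. The sketch shows that inserting a 2-byte field can introduce compiler padding, while the offsets of the fields declared before it do not move.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical row layout before a 2-byte field is added. */
typedef struct DemoBefore {
    uint32_t relid;      /* stand-in for an Oid */
    char     name[32];   /* stand-in for a fixed-width name */
    uint32_t typid;
    int32_t  cacheoff;
    char     byval;
} DemoBefore;

/* Same layout with a 2-byte field inserted after cacheoff. */
typedef struct DemoAfter {
    uint32_t relid;
    char     name[32];
    uint32_t typid;
    int32_t  cacheoff;
    int16_t  typmod;     /* the new short */
    char     byval;
} DemoAfter;

/* The compiler places the short on a 2-byte boundary. */
static_assert(offsetof(DemoAfter, typmod) % sizeof(int16_t) == 0,
              "typmod is expected to be 2-byte aligned");

/* Bytes the compiler added beyond the sum of the field sizes. */
size_t demo_after_padding(void)
{
    size_t packed = sizeof(uint32_t) + 32 + sizeof(uint32_t)
                  + sizeof(int32_t) + sizeof(int16_t) + sizeof(char);
    return sizeof(DemoAfter) - packed;
}

/* Returns 1 if every field declared before the new one kept its offset. */
int demo_prefix_unchanged(void)
{
    return offsetof(DemoBefore, relid)    == offsetof(DemoAfter, relid)
        && offsetof(DemoBefore, name)     == offsetof(DemoAfter, name)
        && offsetof(DemoBefore, typid)    == offsetof(DemoAfter, typid)
        && offsetof(DemoBefore, cacheoff) == offsetof(DemoAfter, cacheoff);
}
```

Printing demo_after_padding() and demo_prefix_unchanged() from a small main() on each platform is one way to compare layouts. Any code that computes sizes or offsets by hand, rather than through the struct, has to account for that padding.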
I just noticed that. I recompiled and reran initdb with assert checking on, and, well, no change in output. Here it is. I've stuck in my own elog check for a value of 0 for atttypid. Suggestions?

initdb: using /usr/local/pgsql.test/lib/local1_template1.bki.source as input to create the template database.
initdb: using /usr/local/pgsql.test/lib/global1.bki.source as input to create the global classes.
initdb: using /usr/local/pgsql.test/lib/pg_hba.conf.sample as the host-based authentication control file.

We are initializing the database system with username postgres (uid=1706).
This user will own all the files and must also own the server process.

initdb: creating template database in /usr/local/pgsql.test/data/base/template1
Running: postgres -boot -C -F -D/usr/local/pgsql.test/data -Q template1
ERROR: DefineIndex: woah, att->atttypid = 0 for attribute "attrelid"
ERROR: DefineIndex: woah, att->atttypid = 0 for attribute "attrelid"
longjmp or siglongjmp function used outside of saved context
initdb: could not create template database
initdb: cleaning up by wiping out /usr/local/pgsql.test/data/base/template1

On Wed, 4 March 1998, at 20:33:42, Bruce Momjian wrote:

> Now that I am looking at this, I see that the attname has four bytes of
> NULL's before it. This looks like some kind of alignment error,
> [...]
> Please try with Assert on:
>
> configure --enable-cassert

Why would the atttypmod affect anything before it in the struct? I have verified that everything is shifted over four bytes, but that would lead me to believe that somewhere the length of the first attribute (Oid) is being miscalculated.

Where would the code write to this data structure without using a pointer to the actual struct to obtain the correct memory layout? I checked for offsetof macro calls that might cause this effect, to no avail.

We're a lot closer, though, right?

On Wed, 4 March 1998, at 20:33:42, Bruce Momjian wrote:

> Now that I am looking at this, I see that the attname has four bytes of
> NULL's before it. This looks like some kind of alignment error,
> [...]
> I did add the atttypmod field to the pg_attribute
> structure, and it is an int2/short. Wonder is that threw off some
> alignment, and only Alpha has a problem with it.

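To make the "first field length is miscalculated" idea concrete, here is a small sketch under stated assumptions. WideRow, NarrowRow and read_wide_as_narrow are hypothetical names, and nothing here is taken from the PostgreSQL source. It only illustrates the symptom: if one piece of code lays out the leading id as 8 bytes and another reads it as 4 bytes, the name appears four bytes later than expected. It is not a diagnosis of the bug in this thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical writer view: the leading id is treated as 8 bytes wide. */
typedef struct WideRow {
    uint64_t relid;
    char     name[32];
} WideRow;

/* Hypothetical reader view: the leading id is 4 bytes wide. */
typedef struct NarrowRow {
    uint32_t relid;
    char     name[32];
} NarrowRow;

/* How far the name moves between the two views, in bytes. */
size_t name_shift_bytes(void)
{
    return offsetof(WideRow, name) - offsetof(NarrowRow, name);
}

/* Fill a WideRow, then reinterpret its leading bytes as a NarrowRow. */
NarrowRow read_wide_as_narrow(uint64_t relid, const char *name)
{
    WideRow   w;
    NarrowRow n;

    memset(&w, 0, sizeof(w));
    w.relid = relid;
    strncpy(w.name, name, sizeof(w.name) - 1);

    assert(sizeof(n) <= sizeof(w));
    memcpy(&n, &w, sizeof(n));   /* same bytes, different field widths */
    return n;
}
```

On a little-endian machine, calling read_wide_as_narrow(1249, "attrelid") leaves relid readable as 1249 while the name field starts with the four zero bytes from the upper half of the 8-byte id, followed by the text. On a big-endian machine the leading bytes would differ, so that detail is platform-dependent.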
I did that.. Postgres doesn't take the option -d 3 however, just '-d'.. Have any idea when the pg_attribute cache is populated?

On Wed, 4 March 1998, at 18:31:34, Bruce Momjian wrote:

> I have an idea. Edit initdb and add a '-d 3' option to each
> 'postgres', then run initdb. You will see dumps of all the structures
> as things are happening, I think. Give it a try.

> I did that.. Postgres doesn't take the option -d 3 however, just
> '-d'.. Have any idea when the pg_attribute cache is populated?

Looks like it should:

$ postgres -d 3 -D /u/pg/data test
 ---debug info---
 Quiet = f
 Noversion = f
 timings = f
 dates = Normal
 bufsize = 64
 sortmem = 512
 query echo = f
 DatabaseName = [test]
 ----------------
 InitPostgres()..
POSTGRES backend interactive interface
$Revision: 1.67 $ $Date: 1998/02/26 04:36:31 $

Not sure when it is initialized.

-- 
Bruce Momjian

> Why would the atttypmod affect anything before it in the struct?
> [...]
> Where would the code write to
> this data structure without using a pointer to actual struct for
> obtaining the correct memory structure? I checked for offsetof macro
> calls that might cause this effect, to no avail.

Just speculating here, but I do know that the Alpha will force alignment within structures. So, if the structure is filled by reading a byte stream from a file, rather than filled field-by-field, it will misalign if it has integers < 4 bytes. During the initialization phase, the backend probably does not go through the file manager, but does some brute-force reading of each file on disk.

- Tom

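The difference between the two ways of filling a structure can be sketched as follows. DemoRec, the 10-byte packed format and all function names are hypothetical and chosen only for illustration. The sketch contrasts a field-by-field read with a brute-force copy of packed bytes over a struct that the compiler may have padded.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical in-memory record; the compiler may pad after `len`. */
typedef struct DemoRec {
    uint32_t id;
    int16_t  len;
    int32_t  count;
} DemoRec;

/* Hypothetical packed form: 4 + 2 + 4 bytes with no padding. */
#define DEMO_PACKED_SIZE 10

/* Write the fields back to back, with no padding between them. */
void demo_pack(const DemoRec *rec, unsigned char *buf)
{
    memcpy(buf,     &rec->id,    sizeof(rec->id));
    memcpy(buf + 4, &rec->len,   sizeof(rec->len));
    memcpy(buf + 6, &rec->count, sizeof(rec->count));
}

/* Field-by-field read: each field is copied from its packed offset. */
DemoRec demo_unpack_fieldwise(const unsigned char *buf)
{
    DemoRec rec;

    memset(&rec, 0, sizeof(rec));
    memcpy(&rec.id,    buf,     sizeof(rec.id));
    memcpy(&rec.len,   buf + 4, sizeof(rec.len));
    memcpy(&rec.count, buf + 6, sizeof(rec.count));
    return rec;
}

/* Brute-force read: copy the packed bytes straight over the struct. */
DemoRec demo_unpack_bulk(const unsigned char *buf)
{
    DemoRec rec;

    memset(&rec, 0, sizeof(rec));
    assert(DEMO_PACKED_SIZE <= sizeof(rec));
    memcpy(&rec, buf, DEMO_PACKED_SIZE);
    return rec;
}
```

Where the compiler places count at offset 8 rather than 6, the bulk copy still gets id and len right but delivers the wrong bytes to count, while the field-by-field version is unaffected by padding. Whether the backend does anything like the bulk copy during bootstrap is exactly the open question in this message.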
It couldn't! The initial values (as far as I can tell) are either compiled in or fed in through the bootstrap process (as text). If that were the case, it would have caused problems in previous releases as well.

On Thu, 5 March 1998, at 03:47:25, Thomas G. Lockhart wrote:

> Just speculating here, but I do know that the Alpha will force alignment
> within structures. So, if the structure is filled by reading a byte stream
> from a file, rather than filled field-by-field, it will misalign if it has
> integers < 4 bytes.
> [...]

> Why would the atttypmod affect anything before it in the struct?
> [...]
> We're a lot closer, though.. right?

I think your dump tells us something. Can you put a break on the failure line, then do a backtrace after the elog(), and start putting breaks in the functions called on that structure, and see if we can find how that relation name is getting messed up.

-- 
Bruce Momjian

I suspect that it is getting messed up in a function that has since been called and returned.. I'll give it a shot though.

On Wed, 4 March 1998, at 23:09:05, Bruce Momjian wrote:

> I think your dump tells up something. Can you put a beak on the failure
> line, then do a backtrace after the elog(), and start putting breaks in
> the functions called on that structure, and see if we can find how that
> relation name is getting messed up.

Can you post a backtrace of the problem area?

> I suspect that it is getting messed up in a function that has since
> been called and returned.. I'll give it a shot though

-- 
Bruce Momjian

> Can you post a backtrace of the problem area.

If the bad value is coming from the cache, can you add an elog(NOTICE) to the cache insert code, so you can see where the bad value is going in or out?

-- 
Bruce Momjian

> If the bad value is coming from the cache, can you add an elog(NOTICE)
> to the cache insert code, so you can see where the bad value is going
> in or out.

Another thing you could try is to add an Assert() in the cache input/output functions, to check for a leading null in the name, and generate the error at that point. Would help track it down.

-- 
Bruce Momjian

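The kind of check suggested above might look roughly like this. It is a standalone sketch that uses the standard C assert() rather than the backend's Assert() or elog(), and DemoAttr, DEMO_NAMELEN and the function names are hypothetical stand-ins for the real attribute structure and cache entry points.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_NAMELEN 32   /* hypothetical fixed name width */

/* Hypothetical, cut-down attribute row. */
typedef struct DemoAttr {
    uint32_t relid;
    char     name[DEMO_NAMELEN];
    uint32_t typid;
} DemoAttr;

/* Returns 1 if the name starts with NUL but has text later on. */
int demo_name_is_shifted(const char *name)
{
    size_t i;

    if (name[0] != '\0')
        return 0;
    for (i = 1; i < DEMO_NAMELEN; i++)
        if (name[i] != '\0')
            return 1;
    return 0;
}

/* Returns 1 if the row passes both sanity checks. */
int demo_attr_is_sane(const DemoAttr *att)
{
    return att->typid != 0 && !demo_name_is_shifted(att->name);
}

/* Intended to be called where rows enter or leave the cache. */
void demo_check_attr(const DemoAttr *att)
{
    assert(demo_attr_is_sane(att));
}
```

Placing a call like demo_check_attr() at both the insert and lookup sides would show whether a row is already damaged when it goes in or only when it comes back out.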