Обсуждение: Contributing some code
Hi,
This is my first post on the list. My name is Antonio. I am a CS grad student and my field of study is about databases and information retrieval. To get some practical knowledge, I've been studying Postgresql codebase for a while.
Now I would like to contribute with some code and I've chosen the following topic of the TODO list :
- Allow reporting of which objects are in which tablespaces
- This item is difficult because a tablespace can contain objects from multiple databases. There is a server-side function that returns the databases which use a specific tablespace, so this requires a tool that will call that function and connect to each database to find the objects in each database for that tablespace.
- The topic suggests to use the pg_tablespace_databases to discover which database is using a specific tablespace and then connect to each database and find the objects in the tablespaces.
- I checked the code of pg_tablespace_databases, defined in src/backend/utils/adt/misc.c, and see that it uses a much simpler approach : It just reads the tablespaces directories and return the name of the directories that represents databases OIDs.
- Although the function works as expected, I can see some issues not addressed in the code :
- - It does not check for permissions. Any user can execute it;
- - It does not check if the platform supports symlinks, which can cause an error because the function is trying to follow the links defined in base/pg_tblspc.
- I could use the same approach and write a function that goes down one more level in the directory structure and find the objects' OIDs inside each database directory, but I don't know if this is the better way to do that.
- Please, could someone give me feedback and help me with this topic ?
- Regards,
- Antonio Belloni
On 28 December 2017 at 01:40, Antonio Belloni <antonio.belloni@gmail.com> wrote:
Hi,This is my first post on the list. My name is Antonio. I am a CS grad student and my field of study is about databases and information retrieval. To get some practical knowledge, I've been studying Postgresql codebase for a while.Now I would like to contribute with some code and I've chosen the following topic of the TODO list :
- Allow reporting of which objects are in which tablespaces
- This item is difficult because a tablespace can contain objects from multiple databases. There is a server-side function that returns the databases which use a specific tablespace, so this requires a tool that will call that function and connect to each database to find the objects in each database for that tablespace.
- The topic suggests to use the pg_tablespace_databases to discover which database is using a specific tablespace and then connect to each database and find the objects in the tablespaces.
- I checked the code of pg_tablespace_databases, defined in src/backend/utils/adt/misc.c, and see that it uses a much simpler approach : It just reads the tablespaces directories and return the name of the directories that represents databases OIDs.
- Although the function works as expected, I can see some issues not addressed in the code :
- - It does not check for permissions. Any user can execute it;
- - It does not check if the platform supports symlinks, which can cause an error because the function is trying to follow the links defined in base/pg_tblspc.
- I could use the same approach and write a function that goes down one more level in the directory structure and find the objects' OIDs inside each database directory, but I don't know if this is the better way to do that.
There's a bit of a trap hidden here. The names of relation extents look like oids, possibly with an extent number for relations bigger than 1GB. But they aren't. They're relfilenode numbers.
PostgreSQL maps relation oids to relfilenodes. By default on a new system, relations will often have the same relfilenode as oid. That's a pity IMO; it'd be way less confusing if we allocated relfilenodes from a wholly different counter, because as it is, it gives people the false impression they can expect the filename relfilenode to be the relation oid.
In fact, what happens (per my probably imperfect understanding) is that PostgreSQL checks pg_class (via the relcache) for the oid of the table. It then uses RelationIsMapped to see if it's a normal relation with the filenode number in pg_class or not. If it's a normal (non-mapped) relation, it uses the Relation's rd_node to find the relation's physical address tablespace, dboid, and relfilenode. If it's a mapped relation, it instead consults the relmapper to find the relation's storage; see src/backend/utils/cache/relmapper.c .
See also src/backend/storage/smgr/README,
Hi Craig,
Ok. I see your point.
I'll study the storage internals a little bit more and try to figure out a better approach using the directions you have made available.
Regards,
Antonio Belloni
On Thu, Dec 28, 2017 at 12:51 AM, Craig Ringer <craig@2ndquadrant.com> wrote:
On 28 December 2017 at 01:40, Antonio Belloni <antonio.belloni@gmail.com> wrote:Hi,This is my first post on the list. My name is Antonio. I am a CS grad student and my field of study is about databases and information retrieval. To get some practical knowledge, I've been studying Postgresql codebase for a while.Now I would like to contribute with some code and I've chosen the following topic of the TODO list :
- Allow reporting of which objects are in which tablespaces
- This item is difficult because a tablespace can contain objects from multiple databases. There is a server-side function that returns the databases which use a specific tablespace, so this requires a tool that will call that function and connect to each database to find the objects in each database for that tablespace.
- The topic suggests to use the pg_tablespace_databases to discover which database is using a specific tablespace and then connect to each database and find the objects in the tablespaces.
- I checked the code of pg_tablespace_databases, defined in src/backend/utils/adt/misc.c, and see that it uses a much simpler approach : It just reads the tablespaces directories and return the name of the directories that represents databases OIDs.
- Although the function works as expected, I can see some issues not addressed in the code :
- - It does not check for permissions. Any user can execute it;
- - It does not check if the platform supports symlinks, which can cause an error because the function is trying to follow the links defined in base/pg_tblspc.
- I could use the same approach and write a function that goes down one more level in the directory structure and find the objects' OIDs inside each database directory, but I don't know if this is the better way to do that.
There's a bit of a trap hidden here. The names of relation extents look like oids, possibly with an extent number for relations bigger than 1GB. But they aren't. They're relfilenode numbers.PostgreSQL maps relation oids to relfilenodes. By default on a new system, relations will often have the same relfilenode as oid. That's a pity IMO; it'd be way less confusing if we allocated relfilenodes from a wholly different counter, because as it is, it gives people the false impression they can expect the filename relfilenode to be the relation oid.In fact, what happens (per my probably imperfect understanding) is that PostgreSQL checks pg_class (via the relcache) for the oid of the table. It then uses RelationIsMapped to see if it's a normal relation with the filenode number in pg_class or not. If it's a normal (non-mapped) relation, it uses the Relation's rd_node to find the relation's physical address tablespace, dboid, and relfilenode. If it's a mapped relation, it instead consults the relmapper to find the relation's storage; see src/backend/utils/cache/relmapper.c . See also src/backend/storage/smgr/README, This means you can't determine relation oids from ondisk state without scanning pg_class. And pg_class is per-database, not a shared relation, so you must look at each db in turn, since Pg doesn't support cross-DB queries. Logical decoding handles this with the RelidByRelfilenode function, but there are issues there around making sure you have the right snapshot etc.--