Обсуждение: general questions on Postgresql and deployment on win32 platform
I have gone through the documentation that come with version 8 beta 4 and I have a number of questions. (1) backup/restore I notice that in the documentation, it seems to suggest that an online backup, made via pg_start_backup() and pg_stop_backup() functions would back up all databases running on the server, rather than any specific instance. If that's the case, is it correct to state that WAL is instance specific rather than database specific? Does it also mean that I must back up and restore all the databases (or database cluster in Postgresql Speak) even if I am only interested in 1 database? (2) WAL location I can't seem to find a way to move WAL log to a different location other than data/pg_xlog. On *nix platform, it's a simple matter of creating a symbolic link. But on win32, there is no equivalent. However, even under *nix system, I believe symbolic link can only be created for directories on the same hard drive. This seems less than optimal. Typically, one would place database files on RAID 5 drives (to maximize random access speed) and log files on mirrored drives (to maximize sequential access speed). (3) Trigger Is it correct to state that old and new keywords are only available to PL/pgSQL function (but not a SQL function) and to row level trigger? Is there anyway to refer to old and new rows for statement level trigger? The visibility rule is also quite confusing. Can someone please clarify it in terms of before/after and row/statement level triggers (i.e., 4 possible permutations). (4) Process/thread on win32 platform On win32 platform, a process simply provides environment for a thread to run (whereas a thread is more like a *nix process, win32 has something called fibre that is more lightweight than thread). Typically, it's not recommended to start a new process per connection for DBMS running on win32 lest its performance would suffer. For example, SQL Server actually spawns threads to manage connections rather than processes. Does the Postgresql on win32 platform use process or thread when a connection is made? I notice that the configuration file lets you specify maximum number of connections. If I have a very high number of persistent concurrent users (thus preventing connection pooling from helping much), would performance suffer significantly? Thanks for any responses.
Calvin Wood wrote: > Does it also mean that I must back up and restore all the databases (or > database cluster in Postgresql Speak) even if I am only interested in 1 > database? You can use pg_dump to backup individual databases (or tables etc). A file-level backup does require all databases in one "cluster" (as we call it). In fact, if you can use pg_dump to backup your databases I would recommend doing so. It gives you much more control over restoring partial databases or to other systems. > (2) WAL location > I can't seem to find a way to move WAL log to a different location other > than data/pg_xlog. You can move everything else (using tablespaces). There is a symlink maker tool for Windows (sorry, can't remember where from) but installing to the logging drivers and then setting the default tablespace to the RAID drives is probably simplest. > (3) Trigger > Is it correct to state that old and new keywords are only available to > PL/pgSQL function (but not a SQL function) and to row level trigger? Is > there anyway to refer to old and new rows for statement level trigger? The > visibility rule is also quite confusing. Can someone please clarify it in > terms of before/after and row/statement level triggers (i.e., 4 possible > permutations). There has been some discussion of what OLD/NEW would mean in a statement level trigger. I don't believe any work has been done on this yet though. Trigger functions can be written in plpgsql, plperl, pltcl, plpython or C (although I'm not sure which of these are supported on Win32 yet). As far as change visibility is concerned, statement triggers are simple. Statement/Before = no changes Statement/After = all changes For row triggers Row/Before = some changes (as the statement progresses) Row/After = all changes So - for row/before, the first row affected in a 1000-row update will see the old version of all 1000 rows. The last row affected will see the new version of 999 rows. Which order the rows are updated in is not defined. > (4) Process/thread on win32 platform > Does the Postgresql on win32 > platform use process or thread when a connection is made? I notice that the > configuration file lets you specify maximum number of connections. If I have > a very high number of persistent concurrent users (thus preventing > connection pooling from helping much), would performance suffer > significantly? PostgreSQL is process-based and this is unlikely to change (you can read the hackers mailing list archives for details). If you have a very high number of persistent concurrent users then win32 is probably not the best platform to run on - your hardware is too constrained. However, in your scenario process-based systems don't really suffer - the real bottleneck is in starting up a new process which is noticably slower on Win32 than most other platforms. Since 8.0 will be the first release of PG on Windows, I wouldn't expect performance to be as good as on the various *nix ports. It's probably too early to say where bottlenecks will be - I would recommend testing against your specific requirements. If you could share the results of such tests, I'm sure they'd be gratefully received by the community. -- Richard Huxton Archonet Ltd
> I have gone through the documentation that come with version > 8 beta 4 and I have a number of questions. > > (1) backup/restore > I notice that in the documentation, it seems to suggest that > an online backup, made via pg_start_backup() and > pg_stop_backup() functions would back up all databases > running on the server, rather than any specific instance. > If that's the case, is it correct to state that WAL is > instance specific rather than database specific? > > Does it also mean that I must back up and restore all the > databases (or database cluster in Postgresql Speak) even if I > am only interested in 1 database? If you want to use PITR based backup/recovery, yes. If you use the "classic" way of doing pg_dump of the database, then you can do it on individual database location. > (2) WAL location > I can't seem to find a way to move WAL log to a different > location other than data/pg_xlog. On *nix platform, it's a > simple matter of creating a symbolic link. But on win32, > there is no equivalent. However, even under *nix system, I > believe symbolic link can only be created for directories on > the same hard drive. This seems less than optimal. Typically, > one would place database files on RAID 5 drives (to maximize > random access speed) and log files on mirrored drives (to > maximize sequential access speed). You can use either Directory Junctions (use the junction tool from www.sysinternals.com for example), or you can mount the new disk directly into the WAL directory (this can be done using the disk administrator mmc snapin, or using the diskpart.exe utility) > (3) Trigger > Is it correct to state that old and new keywords are only > available to PL/pgSQL function (but not a SQL function) and > to row level trigger? Is there anyway to refer to old and new > rows for statement level trigger? The visibility rule is also > quite confusing. Can someone please clarify it in terms of > before/after and row/statement level triggers (i.e., 4 > possible permutations). Can't speak for this one specifically. > (4) Process/thread on win32 platform > On win32 platform, a process simply provides environment for > a thread to run (whereas a thread is more like a *nix > process, win32 has something called fibre that is more > lightweight than thread). Typically, it's not recommended to > start a new process per connection for DBMS running on win32 > lest its performance would suffer. For example, SQL Server > actually spawns threads to manage connections rather than > processes. Does the Postgresql on win32 platform use process > or thread when a connection is made? Postgresql on win32 uses a process/connection model. This is far from ideal, but it's done to make sure you actually use the same database as you do on Unix. You can regain a large part of the loss by using pooled connections. (BTW, SQL Server doesn't necessarily use a thread/connection model either - it can use both fibres and some other hybrids. But you're right in that it only uses one process) > I notice that the > configuration file lets you specify maximum number of > connections. If I have a very high number of persistent > concurrent users (thus preventing connection pooling from > helping much), would performance suffer significantly? It would suffer some, but not very much I think. It uses more shared memory, but that's not a big issue on win32. Some scans have to be made over larger memory areas, and that's where you get a loss. I don't *think* it would be significant, but no testing has been done for that AFAIK. //Magnus
Calvin Wood <calvin.wood@gmail.com> writes: > symbolic link. But on win32, there is no equivalent. However, even under > *nix system, I believe symbolic link can only be created for directories on > the same hard drive. This seems less than optimal. Typically, one would > place database files on RAID 5 drives (to maximize random access speed) and > log files on mirrored drives (to maximize sequential access speed). On Unix, a symbolic link can point anywhere. It's hard links that are limited to the same filesystem (not necessarily a disk drive in the hardware sense). -Doug