Обсуждение: general questions on Postgresql and deployment on win32 platform

Поиск
Список
Период
Сортировка

general questions on Postgresql and deployment on win32 platform

От
Calvin Wood
Дата:
I have gone through the documentation that come with version 8 beta 4 and I
have a number of questions.

(1) backup/restore
I notice that in the documentation, it seems to suggest that an online
backup, made via pg_start_backup() and pg_stop_backup() functions would back
up all databases running on the server, rather than any specific instance.
If that's the case, is it correct to state that WAL is instance specific
rather than database specific?

Does it also mean that I must back up and restore all the databases (or
database cluster in Postgresql Speak) even if I am only interested in 1
database?

(2) WAL location
I can't seem to find a way to move WAL log to a different location other
than data/pg_xlog. On *nix platform, it's a simple matter of creating a
symbolic link. But on win32, there is no equivalent. However, even under
*nix system, I believe symbolic link can only be created for directories on
the same hard drive. This seems less than optimal. Typically, one would
place database files on RAID 5 drives (to maximize random access speed) and
log files on mirrored drives (to maximize sequential access speed).

(3) Trigger
Is it correct to state that old and new keywords are only available to
PL/pgSQL function (but not a SQL function) and to row level trigger? Is
there anyway to refer to old and new rows for statement level trigger? The
visibility rule is also quite confusing. Can someone please clarify it in
terms of before/after and row/statement level triggers (i.e., 4 possible
permutations).

(4) Process/thread on win32 platform
On win32 platform, a process simply provides environment for a thread to run
(whereas a thread is more like a *nix process, win32 has something called
fibre that is more lightweight than thread). Typically, it's not recommended
to start a new process per connection for DBMS running on win32 lest its
performance would suffer. For example, SQL Server actually spawns threads to
manage connections rather than processes. Does the Postgresql on win32
platform use process or thread when a connection is made? I notice that the
configuration file lets you specify maximum number of connections. If I have
a very high number of persistent concurrent users (thus preventing
connection pooling from helping much), would performance suffer
significantly?

Thanks for any responses.

Re: general questions on Postgresql and deployment on win32

От
Richard Huxton
Дата:
Calvin Wood wrote:
> Does it also mean that I must back up and restore all the databases (or
> database cluster in Postgresql Speak) even if I am only interested in 1
> database?

You can use pg_dump to backup individual databases (or tables etc). A
file-level backup does require all databases in one "cluster" (as we
call it).
In fact, if you can use pg_dump to backup your databases I would
recommend doing so. It gives you much more control over restoring
partial databases or to other systems.

> (2) WAL location
> I can't seem to find a way to move WAL log to a different location other
> than data/pg_xlog.

You can move everything else (using tablespaces). There is a symlink
maker tool for Windows (sorry, can't remember where from) but installing
to the logging drivers and then setting the default tablespace to the
RAID drives is probably simplest.

> (3) Trigger
> Is it correct to state that old and new keywords are only available to
> PL/pgSQL function (but not a SQL function) and to row level trigger? Is
> there anyway to refer to old and new rows for statement level trigger? The
> visibility rule is also quite confusing. Can someone please clarify it in
> terms of before/after and row/statement level triggers (i.e., 4 possible
> permutations).

There has been some discussion of what OLD/NEW would mean in a statement
level trigger. I don't believe any work has been done on this yet
though. Trigger functions can be written in plpgsql, plperl, pltcl,
plpython or C (although I'm not sure which of these are supported on
Win32 yet).

As far as change visibility is concerned, statement triggers are simple.
  Statement/Before = no changes
  Statement/After  = all changes
For row triggers
  Row/Before = some changes (as the statement progresses)
  Row/After  = all changes
So - for row/before, the first row affected in a 1000-row update will
see the old version of all 1000 rows. The last row affected will see the
new version of 999 rows. Which order the rows are updated in is not defined.

> (4) Process/thread on win32 platform
> Does the Postgresql on win32
> platform use process or thread when a connection is made? I notice that the
> configuration file lets you specify maximum number of connections. If I have
> a very high number of persistent concurrent users (thus preventing
> connection pooling from helping much), would performance suffer
> significantly?

PostgreSQL is process-based and this is unlikely to change (you can read
the hackers mailing list archives for details). If you have a very high
number of persistent concurrent users then win32 is probably not the
best platform to run on - your hardware is too constrained. However, in
your scenario process-based systems don't really suffer - the real
bottleneck is in starting up a new process which is noticably slower on
Win32 than most other platforms.

Since 8.0 will be the first release of PG on Windows, I wouldn't expect
performance to be as good as on the various *nix ports. It's probably
too early to say where bottlenecks will be - I would recommend testing
against your specific requirements. If you could share the results of
such tests, I'm sure they'd be gratefully received by the community.

--
   Richard Huxton
   Archonet Ltd

Re: general questions on Postgresql and deployment on win32 platform

От
"Magnus Hagander"
Дата:
> I have gone through the documentation that come with version
> 8 beta 4 and I have a number of questions.
>
> (1) backup/restore
> I notice that in the documentation, it seems to suggest that
> an online backup, made via pg_start_backup() and
> pg_stop_backup() functions would back up all databases
> running on the server, rather than any specific instance.
> If that's the case, is it correct to state that WAL is
> instance specific rather than database specific?
>
> Does it also mean that I must back up and restore all the
> databases (or database cluster in Postgresql Speak) even if I
> am only interested in 1 database?

If you want to use PITR based backup/recovery, yes. If you use the
"classic" way of doing pg_dump of the database, then you can do it on
individual database location.


> (2) WAL location
> I can't seem to find a way to move WAL log to a different
> location other than data/pg_xlog. On *nix platform, it's a
> simple matter of creating a symbolic link. But on win32,
> there is no equivalent. However, even under *nix system, I
> believe symbolic link can only be created for directories on
> the same hard drive. This seems less than optimal. Typically,
> one would place database files on RAID 5 drives (to maximize
> random access speed) and log files on mirrored drives (to
> maximize sequential access speed).

You can use either Directory Junctions (use the junction tool from
www.sysinternals.com for example), or you can mount the new disk
directly into the WAL directory (this can be done using the disk
administrator mmc snapin, or using the diskpart.exe utility)


> (3) Trigger
> Is it correct to state that old and new keywords are only
> available to PL/pgSQL function (but not a SQL function) and
> to row level trigger? Is there anyway to refer to old and new
> rows for statement level trigger? The visibility rule is also
> quite confusing. Can someone please clarify it in terms of
> before/after and row/statement level triggers (i.e., 4
> possible permutations).

Can't speak for this one specifically.


> (4) Process/thread on win32 platform
> On win32 platform, a process simply provides environment for
> a thread to run (whereas a thread is more like a *nix
> process, win32 has something called fibre that is more
> lightweight than thread). Typically, it's not recommended to
> start a new process per connection for DBMS running on win32
> lest its performance would suffer. For example, SQL Server
> actually spawns threads to manage connections rather than
> processes. Does the Postgresql on win32 platform use process
> or thread when a connection is made?

Postgresql on win32 uses a process/connection model. This is far from
ideal, but it's done to make sure you actually use the same database as
you do on Unix.
You can regain a large part of the loss by using pooled connections.

(BTW, SQL Server doesn't necessarily use a thread/connection model
either - it can use both fibres and some other hybrids. But you're right
in that it only uses one process)


> I notice that the
> configuration file lets you specify maximum number of
> connections. If I have a very high number of persistent
> concurrent users (thus preventing connection pooling from
> helping much), would performance suffer significantly?

It would suffer some, but not very much I think. It uses more shared
memory, but that's not a big issue on win32. Some scans have to be made
over larger memory areas, and that's where you get a loss. I don't
*think* it would be significant, but no testing has been done for that
AFAIK.

//Magnus

Re: general questions on Postgresql and deployment on

От
Doug McNaught
Дата:
Calvin Wood <calvin.wood@gmail.com> writes:

> symbolic link. But on win32, there is no equivalent. However, even under
> *nix system, I believe symbolic link can only be created for directories on
> the same hard drive. This seems less than optimal. Typically, one would
> place database files on RAID 5 drives (to maximize random access speed) and
> log files on mirrored drives (to maximize sequential access speed).

On Unix, a symbolic link can point anywhere.  It's hard links that are
limited to the same filesystem (not necessarily a disk drive in the
hardware sense).

-Doug