Re: unorthodox use of PG for a customer

Поиск
Список
Период
Сортировка
От Gary M
Тема Re: unorthodox use of PG for a customer
Дата
Msg-id CAGwOJnzCPyO1_an1NEhOcW4NZ8Di5eW+nOEgPq_3nZTxX8jPfw@mail.gmail.com
обсуждение исходный текст
Ответ на unorthodox use of PG for a customer  (David Gauthier <davegauthierpg@gmail.com>)
Список pgsql-general
Hmmm  I usually don't participate in forums and accidentally sent my reply to  David Gauthier directly. This issue sounds interesting I'll give it another try and send the reply to the group.
 
From my simple, engineering hat perspective I feel this question is difficult to answer without hard numbers which include:

1) the amount of data stored in the db including indexes
2) Expected peak number and type of db operations per second
3) Max latency the operations can consume
4) Number and type of CPUs on each workstation
5) Available RAM for each workstation
6) Data storage utilization (MBs, IOPs)
7) Data storage capacity and type (SSD/spinning/network)
8) Type of network ( number of networks and bw)
9)  Network bw utilization 
10) during operation what is the swap space utilization

From these numbers the approach and architecture can be determined.

For example, if the available ram has sufficient capacity, file system cache and pgcache can be tuned to keep all the data in ram likely to significantly improve write and query performance.  Another option is to have a dedicate SSD for data and/or swap space. 

I prefer these analytical "rules of 10" as  starting point to guide alternatives

b/r
gary 


On Fri, Aug 24, 2018 at 12:18 PM, David Gauthier <davegauthierpg@gmail.com> wrote:
Hi Everyone:

I'm going to throw this internal customer request out for ideas, even though I think it's a bit crazy.  I'm on the brink of telling him it's impractical and/or inadvisable.  But maybe someone has a solution.

He's writing a script/program that runs on a workstation and needs to write data to a DB.  This process also sends work to a batch system on a server farm external to the workstation that will create multiple, parallel jobs/processes that also have to write to the DB as well. The workstation may have many of these jobs running at the same time.  And there are 58 workstation which all have/use locally mounted disks for this work. 

At first blush, this is easy.  Just create a DB on a server and have all those clients work with it.  But he's also adamant about having the DB on the same server(s) that ran the script AND on the locally mounted disk.  He said he doesn't want the overhead, dependencies and worries of anything like an external DB with a DBA, etc... . He also wants this to be fast.
 
My first thought was SQLite.  Apparently, they now have some sort of multiple, concurrent write ability.  But there's no way those batch jobs on remote machines are going to be able to get at the locally mounted disk on the workstation. So I dismissed that idea.  Then I thought about having 58 PG installs, one per workstation, each serving all the jobs pertaining to that workstation.  That could work.  But 58 DB instances ?  If he didn't like the ideal of one DBA, 58 can't be good.  Still, the DB would be on the workstation which seems to be what he wants.
 
I can't think of anything better.  Does anyone have any ideas?

Thanks in Advance !


В списке pgsql-general по дате отправления:

Предыдущее
От: TalGloz
Дата:
Сообщение: Re: Returning Vector of Pairs with a PostgreSQL C ExtensionFunction
Следующее
От: TalGloz
Дата:
Сообщение: Re: Linker errors while creating a PostgreSQL C extension function.