Thread: [GENERAL] Multiple Schemas vs. Multiple Databases

[GENERAL] Multiple Schemas vs. Multiple Databases

From
"Igal @ Lucee.org"
Date:

Hello,

I have read quite a few articles about multiple schemas vs. multiple databases, but they are all very generic so I wanted to ask here for a specific use case:

I am migrating a Web Application from MS SQL Server to PostgreSQL.  For the sake of easier maintenance, on SQL Server I have two separate databases:

  1) Primary database containing the data for the application

  2) Secondary database containing "transient" data, e.g. logging of different activities on the website in order to generate statistics etc.

Both databases belong to the same application with the same roles and permissions.

The secondary database grows much faster, but the data in it is not mission-critical, so the data is aggregated daily and the summaries are posted to the primary database, because only the aggregates are important here.

To keep the database sizes from growing too large, I periodically delete old data from the secondary database since the data becomes obsolete after a certain period of time.

At first I thought of doing the same in Postgres, but now it seems like the better way to go would be to keep one database with two schemas: primary and transient.

The main things that I need to do are:

  a) Be able to backup/restore each "part" separately.  Looks like pg_dump allows that for schemas via the --schema=schema argument.

  b) Be able to query aggregates from the secondary "part" and store the results in the primary one, which also seems easier with multiple schemas than multiple databases.
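
For point (a), a schema-level dump and restore can be sketched as follows. This is a minimal sketch, not a tested backup procedure: the database name `appdb`, the schema name `transient`, and the file name are hypothetical, and the helpers only build the command lines rather than run them. `--format=custom`, `--schema`, and `--file` are standard `pg_dump` options, and `pg_restore` accepts `--dbname` and `--schema`.

```python
import shlex
from typing import List


def pg_dump_schema_cmd(dbname: str, schema: str, outfile: str) -> List[str]:
    """Build a pg_dump command line that dumps a single schema."""
    return [
        "pg_dump",
        "--format=custom",      # archive format that pg_restore can read
        f"--schema={schema}",   # dump only objects in this schema
        f"--file={outfile}",
        dbname,
    ]


def pg_restore_schema_cmd(dbname: str, schema: str, infile: str) -> List[str]:
    """Build a pg_restore command line that restores a single schema."""
    return ["pg_restore", f"--dbname={dbname}", f"--schema={schema}", infile]


# Hypothetical names, printed as shell commands instead of being executed.
print(shlex.join(pg_dump_schema_cmd("appdb", "transient", "transient.dump")))
print(shlex.join(pg_restore_schema_cmd("appdb", "transient", "transient.dump")))
```

One caveat from the `pg_dump` documentation: a schema-only dump does not include objects in other schemas that the selected schema depends on, so it is not guaranteed to restore cleanly into an empty database.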

Am I right to think that two schemas are better in this use case or am I missing something important?

Thanks,

Igal Sapir
Lucee Core Developer
Lucee.org

Re: [GENERAL] Multiple Schemas vs. Multiple Databases

From
Melvin Davidson
Date:


On Fri, Oct 13, 2017 at 3:29 PM, Igal @ Lucee.org <igal@lucee.org> wrote:

> b) Be able to query aggregates from the secondary "part" and store the results in the primary one, which also seems easier with multiple schemas than multiple databases.

If that is what you need to do, then definitely use multiple schemas. In PostgreSQL, the only way to do cross db queries / DML, is with the dblink extension, and from personal use, it is a PIA to use.
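
With both parts in one database, point (b) reduces to a single `INSERT ... SELECT` across schemas. The following is a minimal sketch under several assumptions: the table and column names are hypothetical, the `%(name)s` placeholders assume a psycopg2-style DB-API driver, and `conn` is an already open connection. The schema is called `app` here because `PRIMARY` is a reserved word in PostgreSQL, so a schema literally named `primary` would have to be double-quoted.

```python
from datetime import date, timedelta

# Hypothetical tables: transient.page_log (raw rows) and
# app.daily_page_stats (daily summaries).
ROLLUP_SQL = """
INSERT INTO app.daily_page_stats (day, page, hits)
SELECT %(day)s, page, count(*)
FROM transient.page_log
WHERE logged_at >= %(day)s AND logged_at < %(next_day)s
GROUP BY page
"""

PURGE_SQL = "DELETE FROM transient.page_log WHERE logged_at < %(cutoff)s"


def rollup_and_purge(conn, day: date, keep_days: int = 30) -> None:
    """Summarize one day of log rows into the app schema, then drop old rows."""
    params = {
        "day": day,
        "next_day": day + timedelta(days=1),
        "cutoff": day - timedelta(days=keep_days),
    }
    cur = conn.cursor()
    cur.execute(ROLLUP_SQL, params)   # cross-schema insert, one statement
    cur.execute(PURGE_SQL, params)    # periodic cleanup of obsolete rows
    conn.commit()                     # both steps commit together
```

Because both schemas live in the same database, the aggregation and the cleanup run in one transaction, which is not possible with two separate databases without extra tooling.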

--
Melvin Davidson
I reserve the right to fantasize.  Whether or not you
wish to share my fantasy is entirely up to you.

Re: [GENERAL] Multiple Schemas vs. Multiple Databases

From
John R Pierce
Date:
On 10/13/2017 12:29 PM, Igal @ Lucee.org wrote:
>
> The main things that I need to do are:
>
>   a) Be able to backup/restore each "part" separately.  Looks like 
> pg_dump allows that for schemas via the --schema=schema argument.
>
>   b) Be able to query aggregates from the secondary "part" and store 
> the results in the primary one, which also seems easier with multiple 
> schemas than multiple databases.
>
> Am I right to think that two schemas are better in this use case or am 
> I missing something important?
>

generally, yeah, unless you eventually decide to split off the two 
databases onto separate servers for performance reasons.   Of course, to 
access the 'other' database, you'd need to use postgres_fdw or dblink.


-- 
john r pierce, recycling bits in santa cruz



-- 
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Re: [GENERAL] Multiple Schemas vs. Multiple Databases

From
"Igal @ Lucee.org"
Date:
On 10/13/2017 12:47 PM, John R Pierce wrote:
> On 10/13/2017 12:29 PM, Igal @ Lucee.org wrote:
>> The main things that I need to do are:
>>
>>   a) Be able to backup/restore each "part" separately.  Looks like pg_dump allows that for schemas via the --schema=schema argument.
>>
>>   b) Be able to query aggregates from the secondary "part" and store the results in the primary one, which also seems easier with multiple schemas than multiple databases.
>>
>> Am I right to think that two schemas are better in this use case or am I missing something important?
>
> generally, yeah, unless you eventually decide to split off the two databases onto separate servers for performance reasons.   Of course, to access the 'other' database, you'd need to use postgres_fdw or dblink.

Thank you both for confirming,


Igal Sapir
Lucee Core Developer
Lucee.org

Re: [GENERAL] Multiple Schemas vs. Multiple Databases

From
Thomas Kellerer
Date:
Melvin Davidson wrote on 13.10.2017 at 21:42:
> If that is what you need to do, then definitely use multiple schemas.
> In PostgreSQL, the only way to do cross db queries / DML, is with the
> dblink extension, and from personal use, it is a PIA to use.

dblink is not the only way to do that.

Nowadays, cross-DB queries can quite easily be done using foreign tables (and they are quite efficient as well - much
more efficient than dblink)
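
A minimal sketch of the foreign-table approach, assuming the `postgres_fdw` extension is available and that the second database is reachable on the same host. Every object name, the host, and the credentials below are hypothetical placeholders, and `conn` is assumed to be an open DB-API connection to the database that should see the remote tables.

```python
# One-time setup statements; all names and credentials are placeholders.
FDW_SETUP = [
    "CREATE EXTENSION IF NOT EXISTS postgres_fdw",
    "CREATE SERVER transient_srv FOREIGN DATA WRAPPER postgres_fdw "
    "OPTIONS (host 'localhost', dbname 'transient_db')",
    "CREATE USER MAPPING FOR CURRENT_USER SERVER transient_srv "
    "OPTIONS (user 'app_user', password 'change-me')",
    "CREATE SCHEMA transient_remote",
    # Expose the tables of the remote public schema as local foreign tables.
    "IMPORT FOREIGN SCHEMA public FROM SERVER transient_srv INTO transient_remote",
]


def setup_foreign_tables(conn) -> None:
    """Run the one-time postgres_fdw setup on an open connection."""
    cur = conn.cursor()
    for statement in FDW_SETUP:
        cur.execute(statement)
    conn.commit()
```

After this setup, a remote table can be queried like a local one, for example `SELECT count(*) FROM transient_remote.page_log` (table name again hypothetical).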
 

Thomas



