Discussion: Clustered index to preserve data locality in a multitenant application?
On Tue, Aug 30, 2016 at 7:10 AM, Nicolas Grilly <nicolas@vocationcity.com> wrote:
> Let's say we have a table containing data for 10,000 tenants and 10,000 rows
> per tenant, for a total of 100,000,000 rows. Let's say each 8 KB block
> contains ~10 rows. Let's say we want to compute the sum of an integer column
> for all rows belonging to a given tenant ID.

I'll assume you have an index on the tenant ID. In that case, your
queries will be pretty fast.

On some instances, we have multi-column indexes starting with the
tenant ID, and those are used very effectively as well.

I never worry about data locality.

Depending on your data distribution, you may want to consider table
partitions based on the tenant ID. I personally never bother with
that, but split based on some other key in the data.
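A multi-column index led by the tenant ID, as described above, might look like this (the table and column names are hypothetical, for illustration only):

```sql
-- Hypothetical schema: every tenant-scoped lookup leads with tenant_id,
-- so one composite index serves both the tenant filter and the sort/range.
CREATE INDEX invoices_tenant_created_idx
    ON invoices (tenant_id, created_at);

-- A query like this can then be answered by an index range scan
-- confined to a single tenant's entries:
SELECT sum(amount) FROM invoices WHERE tenant_id = 42;
```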
Re: Clustered index to preserve data locality in a multitenant application?
On Tue, Aug 30, 2016 at 07:51:58PM +0200, Nicolas Grilly wrote:
> On Tue, Aug 30, 2016 at 7:26 PM, Vick Khera <vivek@khera.org> wrote:
>> I'll assume you have an index on the tenant ID. In that case, your
>> queries will be pretty fast.
>>
>> On some instances, we have multi-column indexes starting with the
>> tenant ID, and those are used very effectively as well.
>>
>> I never worry about data locality.
>
> Yes, we have an index starting with the tenant ID, and the query uses the
> index.
>
> But I'm still worried about PostgreSQL having to fetch 10 times more pages
> from disk than MySQL. If each 8 KB page contains approximately 10 rows,
> fetching one page in MySQL will return 10 "useful" rows belonging to the
> tenant. By comparison, fetching one page in PostgreSQL will probably return
> only 1 "useful" row belonging to the tenant. In terms of I/O, it's a big
> difference.
>
>> Depending on your data distribution, you may want to consider table
>> partitions based on the tenant id. I personally never bother with
>> that, but split based on some other key in the data.
>
> You're right. I don't really need a clustered index (like InnoDB). What I
> need is to store rows belonging to the same tenant close to each other.
> Partitioning can help with that. But the lack of declarative partitioning
> makes it cumbersome (I've read this is being worked on).

Hi,

We have been using the extension pg_repack to keep a table groomed into
cluster order. With an appropriate FILLFACTOR to keep updates on the same
page, it works well. The issue is that it needs space to rebuild the new
index/table. If you have that, it works well.

Regards,
Ken
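The FILLFACTOR-plus-clustering approach Ken describes could be sketched like this (table and index names are made up for illustration):

```sql
-- Leave ~10% of each page free so updates can stay on the same page
-- (HOT updates), preserving the clustered order longer between repacks.
ALTER TABLE invoices SET (fillfactor = 90);

-- One-time physical reordering by a tenant-leading index. CLUSTER takes
-- an exclusive lock for its whole duration; pg_repack performs the same
-- reordering online, which is why it is preferred for routine grooming.
CLUSTER invoices USING invoices_tenant_created_idx;
```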
Interesting!
Check out BRIN indexes. They are "designed for handling very large tables in
which certain columns have some natural correlation with their physical
location within the table"; I think they fit your needs.
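Assuming the heap is kept roughly ordered by tenant, a BRIN index on the tenant ID could be declared like this (the table name is hypothetical):

```sql
-- BRIN stores only the min/max tenant_id per range of heap pages, so it
-- stays tiny even on a 100M-row table. The caveat, raised later in this
-- thread, is that it only helps if rows for the same tenant are
-- physically close together on disk.
CREATE INDEX invoices_tenant_brin ON invoices USING brin (tenant_id);
```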
For Nicolas's situation, that would require 10,000 partitions, which is not very practical, and each partition would be very small.
In Postgres, as you mentioned, clustering is a "one-time" operation, but only in the sense that after you add more rows, you'll need to re-cluster the table. Depending on the activity model for that table, that may be feasible. For example, if you load it via regular batch scripts, the clustering could be done after those loads. If you add rows only rarely but then do lots of updates, clustering would work great. If this is an active real-time data table, clustering would not be viable.
Re: Clustered index to preserve data locality in a multitenant application?
On Wed, Aug 31, 2016 at 05:23:50PM +0200, Nicolas Grilly wrote:
> On Tue, Aug 30, 2016 at 8:17 PM, Kenneth Marshall <ktm@rice.edu> wrote:
>> We have been using the extension pg_repack to keep a table groomed into
>> cluster order. With an appropriate FILLFACTOR to keep updates on the same
>> page, it works well. The issue is that it needs space to rebuild the new
>> index/table. If you have that, it works well.
>
> Interesting!
>
> Do you run pg_repack on a regular schedule using something like cron, or
> does it run automatically in the background?
>
> Is it compatible with PostgreSQL 9.6?

Hi,

We just run it via cron. In our case, we run it once a day, but depending on
your churn, it could be run once a week or more.

Regards,
Ken
Could you provide some numbers: What is the size of the table or tables that are repacked? How long does it take? Is there a performance impact during the repack?
Re: Clustered index to preserve data locality in a multitenant application?
On Wed, Aug 31, 2016 at 06:06:54PM +0200, Nicolas Grilly wrote:
> On Wed, Aug 31, 2016 at 6:05 PM, Kenneth Marshall <ktm@rice.edu> wrote:
>> We just run it via cron. In our case, we run it once a day, but depending
>> on your churn, it could be run once a week or more.
>
> Could you provide some numbers: What is the size of the table or tables
> that are repacked? How long does it take? Is there a performance impact
> during the repack?

Hi,

The key is that there must be enough I/O capacity to handle your routine DB
access needs plus the additional usage by the repack operation. Running the
repack during a slow period does not tangibly impact DB performance, and it
completes in a reasonable period of time. Your best bet would be to set up a
test environment. By way of example, running our repack during peak usage
can take over 5 hours, but less than 30 minutes outside of peak, when there
is excess I/O capacity. You should be able to use your I/O monitoring system
to get an idea of your I/O headroom.

Regards,
Ken
> We have been using the extension pg_repack to keep a table groomed into
> cluster order. With an appropriate FILLFACTOR to keep updates on the same
> page, it works well. The issue is that it needs space to rebuild the new
> index/table. If you have that, it works well.

It looks like Instagram has been using pg_reorg (the ancestor of pg_repack)
to keep all likes from the same user contiguous on disk, in order to
minimize disk seeks. This is very similar to what I'm trying to achieve.

The article is 3 years old. I'd be curious to know if they still do that.
If what they did 3 years ago is similar to what you are trying to do today, who cares what they are doing today? (Besides using pg_repack instead of pg_reorg, of course.)
For what it's worth, we use pg_repack on a regular basis and it works exactly as advertised.
Thanks.
> We have been using the extension pg_repack to keep a table groomed into
> cluster order. With an appropriate FILLFACTOR to keep updates on the same
> page, it works well. The issue is that it needs space to rebuild the new
> index/table. If you have that, it works well.

In DB2, it seems possible to define a "clustering index" that determines how
rows are physically ordered in the "table space" (the heap).

The documentation says: "When a table has a clustering index, an INSERT
statement causes DB2 to insert the records as nearly as possible in the
order of their index values."

It looks like a kind of "continuous CLUSTER/pg_repack". Is there something
similar available or planned for PostgreSQL?
From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Nicolas Grilly
Sent: Wednesday, August 31, 2016 6:32 PM
To: Kenneth Marshall <ktm@rice.edu>
Cc: Vick Khera <vivek@khera.org>; pgsql-general <pgsql-general@postgresql.org>
Subject: Re: [GENERAL] Clustered index to preserve data locality in a multitenant application?
On Tue, Aug 30, 2016 at 8:17 PM, Kenneth Marshall <ktm@rice.edu> wrote:
> We have been using the extension pg_repack to keep a table groomed into
> cluster order. With an appropriate FILLFACTOR to keep updates on the same
> page, it works well. The issue is that it needs space to rebuild the new
> index/table. If you have that, it works well.
In DB2, it seems possible to define a "clustering index" that determines how rows are physically ordered in the "table space" (the heap).
The documentation says: "When a table has a clustering index, an INSERT statement causes DB2 to insert the records as nearly as possible in the order of their index values."
It looks like a kind of "continuous CLUSTER/pg_repack". Is there something similar available or planned for PostgreSQL?
I don't know about plans to implement clustered indexes in PostgreSQL.

Not sure if this was mentioned, but MS SQL Server has clustered indexes,
where the heap row is stored at the leaf level of the index.

Oracle also has a similar feature: the IOT, or Index-Organized Table.

It seems to me (though I may be wrong) that a clustered index (with the heap
row stored in the index leaf) would be much harder to implement in
PostgreSQL because of the way MVCC is implemented: multiple row versions are
stored in the table itself (Oracle, by contrast, keeps the table "clean" and
stores old row versions in the UNDO tablespace/segment).
Regards,
Igor Neyman
On Wed, 31 Aug 2016 17:33:18 +0200, Nicolas Grilly <nicolas@vocationcity.com> wrote:
> Eduardo Morras wrote:
>> Check out BRIN indexes. They are "designed for handling very large
>> tables in which certain columns have some natural correlation with
>> their physical location within the table"; I think they fit your
>> needs.
>
> Yes, a BRIN index on the tenant ID would be very useful if the rows
> in the heap were naturally sorted by the tenant ID, but they are not.
> They are naturally sorted by their order of insertion, which is
> completely unrelated. The first step in solving this is to find a way
> to keep rows belonging to the same tenant close to each other. The
> second step could be to use a BRIN index.

Then you can create multiple partial indexes:

CREATE INDEX CONCURRENTLY tenant_01_idx
  ON big_tenant_table (the_columns_with_data_you_need, tenant_id)
  WHERE tenant_id = 1;

CREATE INDEX CONCURRENTLY tenant_02_idx
  ON big_tenant_table (the_columns_with_data_you_need, tenant_id)
  WHERE tenant_id = 2;

This way each index holds the data for one tenant, is updated only when the
data for that tenant is updated, and has its own files, so you can reindex
it to clean out its content and debloat:

REINDEX INDEX tenant_01_idx;

Or, if that would create too many indexes, group tenants into ranges:

CREATE INDEX CONCURRENTLY tenant_01_idx
  ON big_tenant_table (the_columns_with_data_you_need, tenant_id)
  WHERE tenant_id <= 300;

CREATE INDEX CONCURRENTLY tenant_02_idx
  ON big_tenant_table (the_columns_with_data_you_need, tenant_id)
  WHERE tenant_id > 300 AND tenant_id <= 600;

---
Eduardo Morras <emorrasg@yahoo.es>
We are developing a multitenant application which is currently based on
MySQL, but we're thinking of migrating to PostgreSQL.

We rely on clustered indexes to preserve data locality for each tenant.
Primary keys start with the tenant ID. This way, rows belonging to the same
tenant are stored next to each other. Because all requests hit only one
tenant, this is a great performance improvement.

PostgreSQL doesn't have clustered indexes (I'm aware of the CLUSTER command,
but it's a one-time operation), and I'm wondering if this can be a problem
or not.

Let's say we have a table containing data for 10,000 tenants and 10,000 rows
per tenant, for a total of 100,000,000 rows. Let's say each 8 KB block
contains ~10 rows. Let's say we want to compute the sum of an integer column
for all rows belonging to a given tenant ID.

In MySQL/InnoDB, rows are stored in the leaf nodes of a B-tree. To compute
the sum, MySQL has to read at least 1,000 blocks (each block containing ~10
rows). I deliberately neglect the cost of walking the B-tree's intermediate
nodes.

By comparison, PostgreSQL has to read at least 10,000 blocks (each block
containing ~10 rows, but most of the time only one row will match the tenant
ID, the other rows belonging to other tenants).

A few questions:

- Am I missing something?
- Am I overestimating the benefit of a clustered index in our case, and the
  cost of not having one in PostgreSQL?
- Is there another technical solution to this problem?
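One way to check the block-count estimate above empirically, rather than reasoning about page layouts, is to ask PostgreSQL how many buffers a query actually touches (table and column names here are hypothetical):

```sql
-- The "Buffers: shared hit=... read=..." line in the output counts the
-- 8 KB pages the query touched. On a table freshly clustered by tenant,
-- this should be close to ~1,000 for a 10,000-row tenant; on an
-- insertion-ordered heap it can approach ~10,000.
EXPLAIN (ANALYZE, BUFFERS)
SELECT sum(amount) FROM events WHERE tenant_id = 1234;
```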
On 9/6/16 11:21 AM, Nicolas Grilly wrote:
> It looks like I can move forward with our migration from MySQL to
> PostgreSQL, without worrying about the lack of clustered indexes,
> because there are better solutions to keep tenant data contiguous!

First rule of performance tuning: don't. :) There are lots of areas where
Postgres can be expected to perform better than MySQL, so without testing
your app you really don't know how it's going to fare.

There's also another option: use a logical replication system (such as
pg_logical, BDR, londiste or Slony) to maintain at least one replica. You
can take that replica down to perform maintenance (such as a database-wide
CLUSTER) as needed, and let replication catch up once you bring it back
online. That, combined with scripted failover, makes a lot of database
maintenance items far easier, at the cost of having to maintain the
replication.

Depending on your needs, a major benefit of this method is that it makes
major version upgrades very simple: you just stand up a new replica on the
new version and then fail over to it. If anything goes wrong, you can fail
back to the old version without losing any data.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532) mobile: 512-569-9461