Обсуждение: [PROPOSAL] Covering + unique indexes.
Hi, hackers!<br /><div class="moz-forward-container"><br /> Use case:<br /> Index-only scans is a wonderful feature thatallows to speed up select queries of indexed columns.<br /> Therefore some users want to create multicolumn indexes oncolumns which are queried often.<br /> But if there's unique constraint on some column, they have to maintain another uniqueindex.<br /> Even if the column is already in covering index.<br /> This adds overhead to data manipulation operationsand database size.<br /><br /> I've started work on a patch that allows to combine covering and unique functionality.<br/> The main idea is to allow user create multicolumn indexes with a definite number of unique columns.<br/> For example (don't mind SQL syntax here, please):<br /> CREATE INDEX index ON table (c1, c2, c3) UNIQUE ON(c1, c2);<br /> Created index has three columns, but only two of them have unique constraint.<br /><br /> This idea hasobvious restriction. We can set unique only for first index columns.<br /> There is no clear way to maintain followingindex.<br /> CREATE INDEX index ON table (c1, c2, c3) UNIQUE ON (c1, c3);<br /><br /> So I suggest following syntax:<br/> CREATE [UNIQUE {ON FIRST {COLUMN | n_unique_column COLUMNS}} INDEX ON table_name (column_name1, column_name2...);<br /><br /> Examples:<br /> CREATE UNIQUE INDEX ON table_name (c1, c2, c3); // (c1,c2, c3) must be UNIQUE.That's how it works now.<br /><br /> CREATE UNIQUE ON FIRST COLUMN INDEX ON table_name (c1, c2, c3); // (c1) mustbe UNIQUE<br /> CREATE UNIQUE ON FIRST 2 COLUMNS INDEX ON table_name (c1, c2, c3); // (c1,c2) must be UNIQUE<br /> CREATEUNIQUE ON FIRST 3 COLUMNS INDEX ON table_name (c1, c2, c3); // (c1,c2, c3) must be UNIQUE<br /><br /><div class="line">Nextissue is pg_index changes.<br /> Now there's only a boolean flag<br /><div class="line"><ul><li><span class="keywordtype">bool</span>indisunique; <span class="comment">/* is this a unique index? */</span></ul><span class="comment">Butnew algorithm requires to store a single number<br /></span><ul><li>unit16<span class="comment"> n_unique_columns;/* number of first columns of index which has unique constrains. */<br /></span></ul> I think, that numbersof all attributes themselves are not needed. Is it right?<br /></div></div><br /> I'd like to see your suggestionsabout syntax changes. <br /> And of course any other comments are welcome.<br /><pre class="moz-signature" cols="72">-- Anastasia Lubennikova Postgres Professional: <a class="moz-txt-link-freetext" href="http://www.postgrespro.com" moz-do-not-send="true">http://www.postgrespro.com</a> The Russian Postgres Company</pre></div>
On 9/11/15 7:45 AM, Anastasia Lubennikova wrote: > This idea has obvious restriction. We can set unique only for first > index columns. > There is no clear way to maintain following index. > CREATE INDEX index ON table (c1, c2, c3) UNIQUE ON (c1, c3); > > So I suggest following syntax: > CREATE [UNIQUE {ON FIRST {COLUMN | n_unique_column COLUMNS}} INDEX ON > table_name (column_name1, column_name2 ...); I would use the first (simple) syntax and just throw an error if the user tries to skip a column on the UNIQUE clause. Have you by chance looked to see what other databases have done for syntax? I'm guessing this isn't covered by ANSI but maybe there's already an industry consensus. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com
>> CREATE INDEX index ON table (c1, c2, c3) UNIQUE ON (c1, c3); >> >> CREATE [UNIQUE {ON FIRST {COLUMN | n_unique_column COLUMNS}} INDEX ON >> table_name (column_name1, column_name2 ...); > > I would use the first (simple) syntax and just throw an error if the > user tries to skip a column on the UNIQUE clause. Seems, second option looks as more natural extension of CREATE UNIQUE INDEX > > Have you by chance looked to see what other databases have done for > syntax? I'm guessing this isn't covered by ANSI but maybe there's > already an industry consensus. MS SQL and DB/2 suggests (with changes for postgresql): CREATE UNIQUE INDEX i ON t (a,b) INCLUDE (c) MS SQL supports both unique and non-unique indexes, DB/2 only unique indexes. Oracle/MySQL doesn't support covering indexes. Readed at http://use-the-index-luke.com/sql/clustering/index-only-scan-covering-index MSSQL/DB/2 variants looks good too. -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
On Tue, Sep 15, 2015 at 6:08 AM, Teodor Sigaev <teodor@sigaev.ru> wrote:
-- Seems, second option looks as more natural extension of CREATE UNIQUE INDEXCREATE INDEX index ON table (c1, c2, c3) UNIQUE ON (c1, c3);
CREATE [UNIQUE {ON FIRST {COLUMN | n_unique_column COLUMNS}} INDEX ON
table_name (column_name1, column_name2 ...);
I would use the first (simple) syntax and just throw an error if the
user tries to skip a column on the UNIQUE clause.
Have you by chance looked to see what other databases have done for
syntax? I'm guessing this isn't covered by ANSI but maybe there's
already an industry consensus.
MS SQL and DB/2 suggests (with changes for postgresql):
CREATE UNIQUE INDEX i ON t (a,b) INCLUDE (c)
MS SQL supports both unique and non-unique indexes, DB/2 only unique indexes. Oracle/MySQL doesn't support covering indexes. Readed at http://use-the-index-luke.com/sql/clustering/index-only-scan-covering-index
It surprised me that you can INCLUDE extra columns on non-UNIQUE indexes, since you could just add them as regular indexed columns for the same effect. It looks like when you do that in SQL Server, the extra columns are only stored on btree leaf pages and so can't be used for searching or ordering. I don't know how useful that is or if we would ever want it... but I just wanted to note that difference, and that the proposed UNIQUE ON FIRST n COLUMNS syntax and catalog change can't express that.
http://sqlperformance.com/2014/07/sql-indexes/new-index-columns-key-vs-include
http://sqlperformance.com/2014/07/sql-indexes/new-index-columns-key-vs-include
Thomas Munro
http://www.enterprisedb.com
http://www.enterprisedb.com
> It surprised me that you can INCLUDE extra columns on non-UNIQUE > indexes, since you could just add them as regular indexed columns for > the same effect. It looks like when you do that in SQL Server, the > extra columns are only stored on btree leaf pages and so can't be used > for searching or ordering. I don't know how useful that is or if we > would ever want it... but I just wanted to note that difference, and Agree > that the proposed UNIQUE ON FIRST n COLUMNS syntax and catalog change > can't express that. Proposal suggests to work only with unique index by exactly your reasons above. -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
On 9/14/15 1:50 PM, Thomas Munro wrote: > CREATE [UNIQUE {ON FIRST {COLUMN | n_unique_column COLUMNS}} > INDEX ON > table_name (column_name1, column_name2 ...); > > > I would use the first (simple) syntax and just throw an error if the > user tries to skip a column on the UNIQUE clause. > > Seems, second option looks as more natural extension of CREATE > UNIQUE INDEX True, but it's awefully verbose. :( And... > It surprised me that you can INCLUDE extra columns on non-UNIQUE > indexes, since you could just add them as regular indexed columns for > the same effect. It looks like when you do that in SQL Server, the > extra columns are only stored on btree leaf pages and so can't be used > for searching or ordering. I don't know how useful that is or if we > would ever want it... but I just wanted to note that difference, and > that the proposed UNIQUE ON FIRST n COLUMNS syntax and catalog change > can't express that. ... we might want to support INCLUDE at some point. It enhances covering scans without bloating the heck out of the btree. (I'm not sure if it would help other index types...) So it seems like a bad idea to preclude that. I don't see that UNIQUE ON FIRST precludes also supporting INCLUDE. Presumably we could do either CREATE INDEX ... ON table (f1, f2, f3) UNIQUE(f1, f2) INCLUDE(f4); or CREATE UNIQUE ON FIRST 2 COLUMNS INDEX ... ON table (f1, f2, f3) INCLUDE(f4); Personally, I find the first form easier to read. Are we certain that no index type could ever support an index on (f1, f2, f3) UNIQUE(f1, f3)? Even if it doesn't make sense for btree, perhaps some other index could handle it. -- Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX Experts in Analytics, Data Architecture and PostgreSQL Data in Trouble? Get it in Treble! http://BlueTreble.com
On Tue, Sep 15, 2015 at 12:44 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
On 9/14/15 1:50 PM, Thomas Munro wrote:CREATE [UNIQUE {ON FIRST {COLUMN | n_unique_column COLUMNS}}
INDEX ON
table_name (column_name1, column_name2 ...);
I would use the first (simple) syntax and just throw an error if the
user tries to skip a column on the UNIQUE clause.
Seems, second option looks as more natural extension of CREATE
UNIQUE INDEX
True, but it's awefully verbose. :( And...It surprised me that you can INCLUDE extra columns on non-UNIQUE
indexes, since you could just add them as regular indexed columns for
the same effect. It looks like when you do that in SQL Server, the
extra columns are only stored on btree leaf pages and so can't be used
for searching or ordering. I don't know how useful that is or if we
would ever want it... but I just wanted to note that difference, and
that the proposed UNIQUE ON FIRST n COLUMNS syntax and catalog change
can't express that.
... we might want to support INCLUDE at some point. It enhances covering scans without bloating the heck out of the btree. (I'm not sure if it would help other index types...) So it seems like a bad idea to preclude that.
I don't see that UNIQUE ON FIRST precludes also supporting INCLUDE. Presumably we could do either
CREATE INDEX ... ON table (f1, f2, f3) UNIQUE(f1, f2) INCLUDE(f4);
or
CREATE UNIQUE ON FIRST 2 COLUMNS INDEX ... ON table (f1, f2, f3) INCLUDE(f4);
Personally, I find the first form easier to read.
Why not normal syntax with optional INCLUDE ?
CREATE UNIQUE INDEX ON table (f1,f2,f3) INCLUDE (f4)
Are we certain that no index type could ever support an index on (f1, f2, f3) UNIQUE(f1, f3)? Even if it doesn't make sense for btree, perhaps some other index could handle it.
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 14 September 2015 at 22:44, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
+1
On 9/14/15 1:50 PM, Thomas Munro wrote:CREATE [UNIQUE {ON FIRST {COLUMN | n_unique_column COLUMNS}}
INDEX ON
table_name (column_name1, column_name2 ...);
I would use the first (simple) syntax and just throw an error if the
user tries to skip a column on the UNIQUE clause.
Seems, second option looks as more natural extension of CREATE
UNIQUE INDEX
True, but it's awefully verbose. :( And...It surprised me that you can INCLUDE extra columns on non-UNIQUE
indexes, since you could just add them as regular indexed columns for
the same effect. It looks like when you do that in SQL Server, the
extra columns are only stored on btree leaf pages and so can't be used
for searching or ordering. I don't know how useful that is or if we
would ever want it... but I just wanted to note that difference, and
that the proposed UNIQUE ON FIRST n COLUMNS syntax and catalog change
can't express that.
... we might want to support INCLUDE at some point. It enhances covering scans without bloating the heck out of the btree. (I'm not sure if it would help other index types...) So it seems like a bad idea to preclude that.
I don't see that UNIQUE ON FIRST precludes also supporting INCLUDE. Presumably we could do either
CREATE INDEX ... ON table (f1, f2, f3) UNIQUE(f1, f2) INCLUDE(f4);
or
CREATE UNIQUE ON FIRST 2 COLUMNS INDEX ... ON table (f1, f2, f3) INCLUDE(f4);
Personally, I find the first form easier to read.
+1
I guess the standard CREATE UNIQUE INDEX can be seen as shorthand for CREATE INDEX with all columns listed in the UNIQUE clause.
Are we certain that no index type could ever support an index on (f1, f2, f3) UNIQUE(f1, f3)? Even if it doesn't make sense for btree, perhaps some other index could handle it.
That's certainly an interesting question. At the moment, only btree is capable of enforcing uniqueness, but that's not to say it will always be that way. But I guess you'd need a way for the access method list of defining whether it's capable of multi-column indexes with out-of-order unique columns. (or some more sensible way of describing it)
Thom
On 14 September 2015 at 23:12, Oleg Bartunov <obartunov@gmail.com> wrote:
On Tue, Sep 15, 2015 at 12:44 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:On 9/14/15 1:50 PM, Thomas Munro wrote:CREATE [UNIQUE {ON FIRST {COLUMN | n_unique_column COLUMNS}}
INDEX ON
table_name (column_name1, column_name2 ...);
I would use the first (simple) syntax and just throw an error if the
user tries to skip a column on the UNIQUE clause.
Seems, second option looks as more natural extension of CREATE
UNIQUE INDEX
True, but it's awefully verbose. :( And...It surprised me that you can INCLUDE extra columns on non-UNIQUE
indexes, since you could just add them as regular indexed columns for
the same effect. It looks like when you do that in SQL Server, the
extra columns are only stored on btree leaf pages and so can't be used
for searching or ordering. I don't know how useful that is or if we
would ever want it... but I just wanted to note that difference, and
that the proposed UNIQUE ON FIRST n COLUMNS syntax and catalog change
can't express that.
... we might want to support INCLUDE at some point. It enhances covering scans without bloating the heck out of the btree. (I'm not sure if it would help other index types...) So it seems like a bad idea to preclude that.
I don't see that UNIQUE ON FIRST precludes also supporting INCLUDE. Presumably we could do either
CREATE INDEX ... ON table (f1, f2, f3) UNIQUE(f1, f2) INCLUDE(f4);
or
CREATE UNIQUE ON FIRST 2 COLUMNS INDEX ... ON table (f1, f2, f3) INCLUDE(f4);
Personally, I find the first form easier to read.Why not normal syntax with optional INCLUDE ?CREATE UNIQUE INDEX ON table (f1,f2,f3) INCLUDE (f4)
That's not the same thing. Then you're including f3 in the unique constraint, which you may not want for uniqueness purposes, but may want in the index for sorting. But then saying that, if f1 and f2 are unique, you'd only get 1 value of f3 for each combination of f1 and f2, so sorting probably isn't useful. You might as well only INCLUDE f3 rather than have it in the multi-column index for sorting. So to adjust your example:
CREATE UNIQUE INDEX ON table (f1,f2) INCLUDE (f3, f4);
CREATE UNIQUE INDEX ON table (f1,f2) INCLUDE (f3, f4);
Is there a scenario anyone can think of where f3 here:
CREATE INDEX ... ON table (f1, f2, f3) UNIQUE(f1, f2) INCLUDE(f4);
CREATE INDEX ... ON table (f1, f2, f3) UNIQUE(f1, f2) INCLUDE(f4);
would be advantageous outside of INCLUDE?
Out of curiosity, why is this only being discussed for unique indexes? What if you want additional columns included on non-unique indexes?
Thom
On 15/09/15 09:44, Jim Nasby wrote: > On 9/14/15 1:50 PM, Thomas Munro wrote: >> CREATE [UNIQUE {ON FIRST {COLUMN | n_unique_column COLUMNS}} >> INDEX ON >> table_name (column_name1, column_name2 ...); >> >> >> I would use the first (simple) syntax and just throw an error >> if the >> user tries to skip a column on the UNIQUE clause. >> >> Seems, second option looks as more natural extension of CREATE >> UNIQUE INDEX > > True, but it's awefully verbose. :( And... > >> It surprised me that you can INCLUDE extra columns on non-UNIQUE >> indexes, since you could just add them as regular indexed columns for >> the same effect. It looks like when you do that in SQL Server, the >> extra columns are only stored on btree leaf pages and so can't be used >> for searching or ordering. I don't know how useful that is or if we >> would ever want it... but I just wanted to note that difference, and >> that the proposed UNIQUE ON FIRST n COLUMNS syntax and catalog change >> can't express that. > > ... we might want to support INCLUDE at some point. It enhances > covering scans without bloating the heck out of the btree. (I'm not > sure if it would help other index types...) So it seems like a bad > idea to preclude that. > > I don't see that UNIQUE ON FIRST precludes also supporting INCLUDE. > Presumably we could do either > > CREATE INDEX ... ON table (f1, f2, f3) UNIQUE(f1, f2) INCLUDE(f4); Of the formats I've seen so far, I prefer this one. I think using "[ALSO] INCLUDE(f4)" - might be potentially more readable than using just "INCLUDE(f4)". even if not used, the noise word also would help people understand that the other fields mentioned are already covered. If not too difficult then allowing the unique fields to be separated by other fields could be useful - in the example allowing "UNIQUE(f1, f3)". Especially if the index is likely to be used to CLUSTER a table, where the order f1, f2, ... is important. > or > CREATE UNIQUE ON FIRST 2 COLUMNS INDEX ... ON table (f1, f2, f3) > INCLUDE(f4); > > Personally, I find the first form easier to read. > > Are we certain that no index type could ever support an index on (f1, > f2, f3) UNIQUE(f1, f3)? Even if it doesn't make sense for btree, > perhaps some other index could handle it.
> CREATE INDEX ... ON table (f1, f2, f3) UNIQUE(f1, f2) INCLUDE(f4); I don't see an advantage this form. What is f3 column? just order? and f4 will not be used for compare? At least now it requires additional checks that UNIQUE() fields are the same as first columns in definition. Non ordering field f4 will require invasive intervention in planner because now it believes that all columns in btree are ordered. I agree, that form CREATE UNIQUE INDEX i ON t (f1, f2, f3) INCLUDE (f4) is clear. f4 will be used in row compare and actually planner will be able to use it as unique index (f1, f2, f3) with additional f4 or as as unique index (f1, f2, f3, f4), because uniqueness on (f1, f2, f3) gives warranty of uniqueness on (f1, f2, f3, f4) -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
> Why not normal syntax with optional INCLUDE ? > > CREATE UNIQUE INDEX ON table (f1,f2,f3) INCLUDE (f4) > > > That's not the same thing. Then you're including f3 in the unique > constraint, which you may not want for uniqueness purposes, but may want > in the index for sorting. But then saying that, if f1 and f2 are > unique, you'd only get 1 value of f3 for each combination of f1 and f2, > so sorting probably isn't useful. You might as well only INCLUDE f3 > rather than have it in the multi-column index for sorting. So to adjust > your example: > > CREATE UNIQUE INDEX ON table (f1,f2) INCLUDE (f3, f4); > > Is there a scenario anyone can think of where f3 here: > > CREATE INDEX ... ON table (f1, f2, f3) UNIQUE(f1, f2) INCLUDE(f4); > > would be advantageous outside of INCLUDE? > > Out of curiosity, why is this only being discussed for unique indexes? > What if you want additional columns included on non-unique indexes? Because there is no difference for non-unique indexes between (f1,f2,f3) and (f1, f2) INCLUDE (f3). In second case we just got index with unordered f3 column. Oh no, it's possible that f3 column type has not btree operator class... If we want to support this case then intervention in planner will be a bit invasive. BTW, I don't see in foreseen future another unique access methods. -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
On 15 September 2015 at 18:16, Teodor Sigaev <teodor@sigaev.ru> wrote:
CREATE INDEX ... ON table (f1, f2, f3) UNIQUE(f1, f2) INCLUDE(f4);
I don't see an advantage this form. What is f3 column? just order? and f4 will not be used for compare? At least now it requires additional checks that UNIQUE() fields are the same as first columns in definition. Non ordering field f4 will require invasive intervention in planner because now it believes that all columns in btree are ordered.
I'm also a bit confused where f3 comes in here. If it's UNIQUE on (f1,f2) and we include f4. Where's f3?
I agree, that form
CREATE UNIQUE INDEX i ON t (f1, f2, f3) INCLUDE (f4)
is clear. f4 will be used in row compare and actually planner will be able to use it as unique index (f1, f2, f3) with additional f4 or as
as unique index (f1, f2, f3, f4), because uniqueness on (f1, f2, f3) gives warranty of uniqueness on (f1, f2, f3, f4)
I'd vote for this too. However, INCLUDE does not seem to be a reserved word at the moment.
I think this syntax fits in nicely to with non-unique indexes too.
Regards
David Rowley
--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 09/15/2015 10:57 AM, David Rowley wrote: >> I agree, that form >> > CREATE UNIQUE INDEX i ON t (f1, f2, f3) INCLUDE (f4) >> > is clear. f4 will be used in row compare and actually planner will be able >> > to use it as unique index (f1, f2, f3) with additional f4 or as >> > as unique index (f1, f2, f3, f4), because uniqueness on (f1, f2, f3) gives >> > warranty of uniqueness on (f1, f2, f3, f4) >> > >> > > I'd vote for this too. However, INCLUDE does not seem to be a reserved word > at the moment. What about CREATE UNIQUE INDEX i ON t (f1, f2, f3) WITH (f4); ? -- Vik Fearing +33 6 46 75 15 36 http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
On 12 September 2015 at 00:45, Anastasia Lubennikova <a.lubennikova@postgrespro.ru> wrote:
I've started work on a patch that allows to combine covering and unique functionality.
Great to hear someone is working on this!
Next issue is pg_index changes.
Now there's only a boolean flagBut new algorithm requires to store a single number
- bool indisunique; /* is this a unique index? */
I think, that numbers of all attributes themselves are not needed. Is it right?
- unit16 n_unique_columns; /* number of first columns of index which has unique constrains. */
I think the total number of attributes is already in indnatts.
I imagine you're planning to keep the indexed columns at the start of the indkey[] array, and just use n_unique_columns to mark how many of the indkey[] attributes to check as indexed columns. I'd imagine the change would be fairly simple from a planner point of view as you'd just need to check columns 1 to n_unique_columns instead of 1 to indnatts. Although I have a tendency to under estimate these things :(
I imagine you don't want to name the new column n_unique_columns, since it does not fit too well with non-unique indexes.
Perhaps just indindexedatts, or something slightly along those lines. But perhaps it would be a good idea to also rename "ncolumns" in code, to ensure that any non-updated code does not even compile. Perhaps including "tot" or "total" in there might help indicate it's new meaning.
Regards
David Rowley
--
David Rowley http://www.2ndQuadrant.com/
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
15.09.2015 12:18, David Rowley:
Thank you! It looks like very interesting project to me)
I've asked that for the same reason. I'm not too deep in access method and btree code, so I don't want to miss any hidden dependencies.
As I see it, changes are required in _bt_isequal(). Instead of
for (i = 1; i <= keysz; i++) {} // where keysz is actually natts = rel->rd_rel->relnatts;
Look at _bt_check_uinque and pg_class for more info.
cycle should be
for (i = 1; i <= nuinique; i++) {} // where nunique is value of rel->rd_index->indnuinque
indnuinque is a new field, which I suggest to add to pg_index instead of indisunique (or may be in addition to it).
But I'm doubt about the problem which Jim has mentioned.
>Are we certain that no index type could ever support an index on (f1, f2, f3) UNIQUE(f1, f3)? Even if it >doesn't make sense for btree, perhaps some other index could handle it.
If it's really so, we certainly have to keep in pg_index not just number of unique columns (indnunique), but their numbers themselves (an array indnuniqueatts on the model of indnatts).
Hm, I think that it would be quite clear to set it to zero for non-unique indexes.
(nunique == 0) is equal to (indisunique==false).
But maybe I've missed some reason why we should to save indisunique flag.
On 12 September 2015 at 00:45, Anastasia Lubennikova <a.lubennikova@postgrespro.ru> wrote:I've started work on a patch that allows to combine covering and unique functionality.Great to hear someone is working on this!
Thank you! It looks like very interesting project to me)
Next issue is pg_index changes.
Now there's only a boolean flagBut new algorithm requires to store a single number
- bool indisunique; /* is this a unique index? */
I think, that numbers of all attributes themselves are not needed. Is it right?
- unit16 n_unique_columns; /* number of first columns of index which has unique constrains. */
I think the total number of attributes is already in indnatts.I imagine you're planning to keep the indexed columns at the start of the indkey[] array, and just use n_unique_columns to mark how many of the indkey[] attributes to check as indexed columns. I'd imagine the change would be fairly simple from a planner point of view as you'd just need to check columns 1 to n_unique_columns instead of 1 to indnatts. Although I have a tendency to under estimate these things :(
I've asked that for the same reason. I'm not too deep in access method and btree code, so I don't want to miss any hidden dependencies.
As I see it, changes are required in _bt_isequal(). Instead of
for (i = 1; i <= keysz; i++) {} // where keysz is actually natts = rel->rd_rel->relnatts;
Look at _bt_check_uinque and pg_class for more info.
cycle should be
for (i = 1; i <= nuinique; i++) {} // where nunique is value of rel->rd_index->indnuinque
indnuinque is a new field, which I suggest to add to pg_index instead of indisunique (or may be in addition to it).
But I'm doubt about the problem which Jim has mentioned.
>Are we certain that no index type could ever support an index on (f1, f2, f3) UNIQUE(f1, f3)? Even if it >doesn't make sense for btree, perhaps some other index could handle it.
If it's really so, we certainly have to keep in pg_index not just number of unique columns (indnunique), but their numbers themselves (an array indnuniqueatts on the model of indnatts).
I imagine you don't want to name the new column n_unique_columns, since it does not fit too well with non-unique indexes.
Hm, I think that it would be quite clear to set it to zero for non-unique indexes.
(nunique == 0) is equal to (indisunique==false).
But maybe I've missed some reason why we should to save indisunique flag.
I hope, that all these changes are not needed. Just continue to use ncolumns for all indexed columns. And new variable nuinique (or something like that) is for number of unique columns in the index.Perhaps just indindexedatts, or something slightly along those lines. But perhaps it would be a good idea to also rename "ncolumns" in code, to ensure that any non-updated code does not even compile. Perhaps including "tot" or "total" in there might help indicate it's new meaning.
-- Anastasia Lubennikova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
15.09.2015 12:11, Vik Fearing: > On 09/15/2015 10:57 AM, David Rowley wrote: >>> I agree, that form >>>> CREATE UNIQUE INDEX i ON t (f1, f2, f3) INCLUDE (f4) >>>> is clear. f4 will be used in row compare and actually planner will be able >>>> to use it as unique index (f1, f2, f3) with additional f4 or as >>>> as unique index (f1, f2, f3, f4), because uniqueness on (f1, f2, f3) gives >>>> warranty of uniqueness on (f1, f2, f3, f4) >>>> >>>> >> I'd vote for this too. However, INCLUDE does not seem to be a reserved word >> at the moment. > What about CREATE UNIQUE INDEX i ON t (f1, f2, f3) WITH (f4); ? WITH seems ambiguity to me. It refers to CTE, so I expect to see after that a kind of query expression. But maybe that's just matter of habit. BTW, that's the first syntax change I'm working with. Is there any convention in PostgreSQL about new keywords and so on? Where can I find it? -- Anastasia Lubennikova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
On 15 September 2015 at 22:20, Anastasia Lubennikova <a.lubennikova@postgrespro.ru> wrote:
Hm, I think that it would be quite clear to set it to zero for non-unique indexes.
(nunique == 0) is equal to (indisunique==false).
But maybe I've missed some reason why we should to save indisunique flag.
I'd say that Jim summed this one up well, with:
>... we might want to support INCLUDE at some point. It enhances covering scans without bloating the heck out of the btree. (I'm not sure if it would help other index types...) So it seems like a bad idea to preclude that.
Which I take to mean non-unique indexes.
So if you just kept the indisunique flag, and added a column to state the number of columns that are actually in the "index" (not INCLUDE columns). Then your code would probably work for both unique and non-unique index. This way users don't have to pay the price of index bloat if they tag on high cardinality columns onto the end of the index's column list.
Perhaps it would be easier just to add a new column to pg_index which stores the total attrs, that way you could get away with not having to edit each of the existing for() loop that go over the index attributes. This would just store the idxnattrs + number of included columns. Perhaps something named idxtotnatts or idxtotalnatts.
Regards
David Rowley
--
David Rowley http://www.2ndQuadrant.com/
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Seems, final form is CREATE INDEX idx ON tbl (f1, f2, f3) [UNIQUE ON (f1, f2)] [INCLUDE (f4)] f1, f2, f3 are participated in index row comparence (btre, gist etc) f1, f2 are participated in unique constrain and it gives warranty for (f1, f2, f3[, f4]) uniqueness. Now supported by Btreeonly f4 doesn't participate in row comparence and could even do not have an operator class. Btree and GiST could support that. The form CREATE UNIQUE INDEX ON tbl (f1, f2, f3) is exact equivalent of form CREATE INDEX idx ON tbl (f1, f2, f3) UNIQUE ON (f1, f2, f3) I hope, that it's doible without a lot of difficulties. -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
On 09/15/2015 12:45 PM, Anastasia Lubennikova wrote: > 15.09.2015 12:11, Vik Fearing: >> On 09/15/2015 10:57 AM, David Rowley wrote: >>>> I agree, that form >>>>> CREATE UNIQUE INDEX i ON t (f1, f2, f3) INCLUDE (f4) >>>>> is clear. f4 will be used in row compare and actually planner will >>>>> be able >>>>> to use it as unique index (f1, f2, f3) with additional f4 or as >>>>> as unique index (f1, f2, f3, f4), because uniqueness on (f1, f2, >>>>> f3) gives >>>>> warranty of uniqueness on (f1, f2, f3, f4) >>>>> >>>>> >>> I'd vote for this too. However, INCLUDE does not seem to be a >>> reserved word >>> at the moment. >> What about CREATE UNIQUE INDEX i ON t (f1, f2, f3) WITH (f4); ? > > WITH seems ambiguity to me. It refers to CTE, so I expect to see after > that a kind of query expression. But maybe that's just matter of habit. Not necessarily. See WITH ORDINALITY, for example. However, now that I've looked at it, index creation already takes a WITH clause for storage parameters, so that's out. > BTW, that's the first syntax change I'm working with. > Is there any convention in PostgreSQL about new keywords and so on? > Where can I find it? I don't think it's written anywhere except peppered throughout the archives. New keywords are greatly frowned upon. INCLUDING is already an unreserved keyword, and sounds more natural than INCLUDE anyway. Maybe that could work? -- Vik Fearing +33 6 46 75 15 36 http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
2015-09-15 David Rowley <david.rowley@2ndquadrant.com>: > I'm also a bit confused where f3 comes in here. If it's UNIQUE on (f1,f2) > and we include f4. Where's f3? Columns f1, f2, f3 are in the internal nodes of the tree (i.e., they are used to find the ultimate leaf nodes). f4 is only in the leaf nodes. If f4 are typically big values, and they are typically not used in the search predicate, it makes the upper part of the index (which determines how many levels the index has) larger for no good reason. f4 can still be retrieved without going to the heap, so including it in the leaf nodes makes it possible to do index-only scans more often. Nicolas -- A. Because it breaks the logical sequence of discussion. Q. Why is top posting bad?
On 15 September 2015 at 23:51, Nicolas Barbier <nicolas.barbier@gmail.com> wrote:
2015-09-15 David Rowley <david.rowley@2ndquadrant.com>:
> I'm also a bit confused where f3 comes in here. If it's UNIQUE on (f1,f2)
> and we include f4. Where's f3?
Columns f1, f2, f3 are in the internal nodes of the tree (i.e., they
are used to find the ultimate leaf nodes). f4 is only in the leaf
nodes. If f4 are typically big values, and they are typically not used
in the search predicate, it makes the upper part of the index (which
determines how many levels the index has) larger for no good reason.
f4 can still be retrieved without going to the heap, so including it
in the leaf nodes makes it possible to do index-only scans more often.
Hmm, ok, I guess I was unable to see any advantage to having f3 in the btree, if it's not to be enforced as part of the unique constraint.
I now see that this is probably to allow pre-sorted paths without having to enforce uniqueness over all of the indexed columns.
If that's the case then I assume that we'd also want something to allow that to be done when creating a PRIMARY KEY constraint
Regards
David Rowley
--
David Rowley http://www.2ndQuadrant.com/
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Proposal Clarification.<br /> I see that discussion become too <span class="b-translation__text">complicated. So, I'd liketo clarify what we are talking about.<br /><br /> We are discussing 2 different improvements of index.<br /> The one is "partially unique index" and the other "index with included columns".<br /> Let's look at example.<br /><br /> -We have a table tbl(f1, f2, f3, f4).<br /> - We want to have an unique index on (f1,f2).<br /> - We want to have an indexon (f1, f2, f3) which allow us to use index for complex "where" clauses.<br /> - We would like to have a covering indexon all columns to avoid reading of heap pages.<br /><br /> What are we doing now:<br /> CREATE UNIQUE INDEX on tbl(f1,f2);<br/> CREATE INDEX on tbl(f1, f2, f3, f4);<br /><br /> What's wrong? <br /> - Two indexes with repeated data.</span>Overhead to data manipulation operations and database size.<br /> - We don't need f4 as index key. But we haveto...<br /> - Problem related to previous. It's possible that f4 has no opclass for our index. So there's no way to includeit to index.<br /> While we don't need any opclass at all.<br /><br /> Suggestions.<br /> CREATE INDEX idx ON tbl(f1, f2, f3) [UNIQUE ON (f1, f2)] [INCLUDE (f4)];<br /> Let's review it step<span class="b-translation__translation-words"><spanclass="b-translation__text"> by step.<br /><br /></span></span>1. "<span class="b-translation__text">partiallyunique index</span>"<br /> CREATE INDEX idx ON tbl (f1, f2, f3) UNIQUE ON (f1, f2);<br/> It means that we want to have columns (f1, f2, f3) as index keys in our index. <br /> But we want enforce uniquenessonly on first two.<br /> It allows us insert (1,1,1), (1,2,1) and restricts insert (1,1,1), (1,1,2).<br /><br />It doesn't affect "select" queries.<br /> Following query can use index-only scan.<br /> SELECT f1,f2, f3 where f1 ...and f2 ... and f3 ....;<br /><br /> We haven't to maintain two indexes now. Just one!<br /><br /><a href="http://doxygen.postgresql.org/nbtinsert_8c.html#abcfb7a3dcd40a5d1759652239f3a0115">_bt_iseual()</a>works with (f1,f2) <br /><a href="http://doxygen.postgresql.org/nbtsearch_8c.html#af689dabb25e99f551b68aa9b7a1e6ea6">_bt_compare()</a>works with (f1,f2,f3)<br/><br /> 2. <span class="b-translation__text">"index with included columns" It goes well with both unique andnon-unique indexes.</span><br /> CREATE INDEX idx ON tbl (f1, f2, f3) INCLUDE (f4);<br /> What we get here:<br /> - f4is not search key.<br /> - f4 could not have opclass for our index<br /> - f4 is only in the leaf pages and it's not bloatinginternal nodes of b-tree.<br /> - f4 can still be retrieved without going to the heap, so including it<br /> in theleaf nodes makes it possible to do index-only scans more often<br /><br /> Following query can use index-only scan:<br/> SELECT f1,f2, f3, f4 where f1 ... and f2 ... and f3 ....;<br /><br /> And this one wouldn't use index-only scanbecause recheck on f4 is required.<br /> SELECT f1,f2, f3, f4 where f1 ... and f2 ... and f3 .... and f4;<br /><br /><br/> Catalog changes:<br /> Now:<span class="comment"></span><br /><a href="http://doxygen.postgresql.org/pg__index_8h.html#a5065be0408f03576083a30c97b43a09a">pg_index</a><br/><span class="lineno"></span>int16 indnatts; <span class="comment">/* number of columns in index */</span><span class="lineno"><br/></span><span class="keywordtype">bool</span> indisunique; <span class="comment">/* is this a unique index?*/</span><span class="comment"></span><br /><br /> New:<br /> pg_index<br /> int16 ind_n_unique_atts; /* number ofunique columns in index. counted from begin of index. 0 means that index is not unique */<br /> int16 ind_n_key_atts; /*number of index key columns in index. counted from begin of index.*/<br /> int16 ind_n_total_atts; /* number of columnsin index.*/<br /><br /> In our case:<br /> ind_n_unique_atts = 2; // f1, f2<br /> ind_n_key_atts = 3; // f1, f2, f3<br/> ind_n_total_atts = 4; // f1, f2, f3, f4<br /><br /><br /> P.S. <span class="b-translation__text">I use many ideasfrom discussion without quotations just because I'd like to keep this message readable. Thanks to everyone.</span> <preclass="moz-signature" cols="72">-- Anastasia Lubennikova Postgres Professional: <a class="moz-txt-link-freetext" href="http://www.postgrespro.com">http://www.postgrespro.com</a> The Russian Postgres Company</pre>
<div dir="ltr"><br /><div class="gmail_extra"><br /><div class="gmail_quote">On Tue, Sep 15, 2015 at 12:57 PM, AnastasiaLubennikova <span dir="ltr"><<a href="mailto:a.lubennikova@postgrespro.ru" target="_blank">a.lubennikova@postgrespro.ru</a>></span>wrote:<br /><blockquote class="gmail_quote" style="margin:0px0px 0px 0.8ex;border-left:1px solid rgb(204,204,204);padding-left:1ex"><br /><div bgcolor="#FFFFFF" text="#000000">Proposal Clarification.<br /> I see that discussion become too <span>complicated. So, I'd like to clarifywhat we are talking about.<br /><br /> We are discussing 2 different improvements of index.<br /> The one is "partiallyunique index" and the other "index with included columns".<br /> Let's look at example.<br /><br /> - We havea table tbl(f1, f2, f3, f4).<br /> - We want to have an unique index on (f1,f2).<br /> - We want to have an index on(f1, f2, f3) which allow us to use index for complex "where" clauses.<br /></span><span class=""></span></div></blockquote></div><br/><br /></div><div class="gmail_extra">Can someone write a query where F3 beingordered is a contribution?<br /><br /></div><div class="gmail_extra">If F1 and F2 are unique, adding F3 to a where ororder by clause doesn't seem to contribute anything.<br /><br /> -- Already fully ordered by F1,F2<br /></div><div class="gmail_extra">SELECT... ORDER BY F1, F2, F3;<br /><br /><br />-- F3 isn't in a known order without specifying F2<br/></div><div class="gmail_extra">SELECT ... WHERE F1 = ? ORDER BY F1, F3;<br /></div><div class="gmail_extra"><br /><br/>-- Index resolves to a single record; nothing to order<br /></div><div class="gmail_extra">SELECT ... WHERE F1 = ?AND F2 = ? ORDER BY F3;<br /><br /></div><div class="gmail_extra"><br /></div><div class="gmail_extra">-- Without a whereclause, the index isn't helpful unless F3 is the first column<br /></div><div class="gmail_extra">SELECT ... ORDER BYF3; <br /></div><div class="gmail_extra"><br /></div><div class="gmail_extra"><br /></div><div class="gmail_extra">Whatis it that I'm missing?<br /></div><div class="gmail_extra"><br /></div></div>
On 16 September 2015 at 10:38, Rod Taylor <rod.taylor@gmail.com> wrote:
On Tue, Sep 15, 2015 at 12:57 PM, Anastasia Lubennikova <a.lubennikova@postgrespro.ru> wrote:Proposal Clarification.
I see that discussion become too complicated. So, I'd like to clarify what we are talking about.
We are discussing 2 different improvements of index.
The one is "partially unique index" and the other "index with included columns".
Let's look at example.
- We have a table tbl(f1, f2, f3, f4).
- We want to have an unique index on (f1,f2).
- We want to have an index on (f1, f2, f3) which allow us to use index for complex "where" clauses.Can someone write a query where F3 being ordered is a contribution?If F1 and F2 are unique, adding F3 to a where or order by clause doesn't seem to contribute anything.
-- Already fully ordered by F1,F2SELECT ... ORDER BY F1, F2, F3;
-- F3 isn't in a known order without specifying F2SELECT ... WHERE F1 = ? ORDER BY F1, F3;
-- Index resolves to a single record; nothing to orderSELECT ... WHERE F1 = ? AND F2 = ? ORDER BY F3;-- Without a where clause, the index isn't helpful unless F3 is the first columnSELECT ... ORDER BY F3;What is it that I'm missing?
Joining relations may have more than one matching tuple for any given unique tuple, therefore the tuples may no longer be unique on the columns which are in the unique index.
https://commitfest.postgresql.org/6/129/ takes steps to add infrastructure to the planner to allow it to know when this happens. Although I'm currently "selling" it as a performance improvement patch.
Regards
David Rowley
--
David Rowley http://www.2ndQuadrant.com/
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
<div class="moz-cite-prefix">On 09/15/2015 06:57 PM, Anastasia Lubennikova wrote:<br /></div><blockquote cite="mid:55F84DF4.5030207@postgrespro.ru"type="cite"> Proposal Clarification.<br /> I see that discussion become too <spanclass="b-translation__text">complicated. So, I'd like to clarify what we are talking about.<br /><br /> [snip]<br />What are we doing now:<br /> CREATE UNIQUE INDEX on tbl(f1,f2);<br /> CREATE INDEX on tbl(f1, f2, f3, f4);<br /><br />[snip]</span><br /> Suggestions.<br /> CREATE INDEX idx ON tbl (f1, f2, f3) [UNIQUE ON (f1, f2)] [INCLUDE (f4)];<br /></blockquote><br/> Summarizing some suggestions upthread, it seems like the "best" syntax would be something similar to:<br/><br /> -- Non-unique index + "leaf" information (f4)<br /> CREATE INDEX idx ON tbl (f1, f2, f3) [INCLUDING (f4)]<br/><br /> -- Unique index on f1,f2, + leaf information (f3)<br /> CREATE UNIQUE INDEX idx ON tbl (f1, f2) [INCLUDING(f3)]<br /><br /> And, even:<br /> ALTER INDEX idx INCLUDING (f4)<br /> ... which would trigger a REINDEX CONCURRENTLYinternally ?<br /><br /><br /> FWIW this would include also the functionality provided by the suggested<br /> CREATE INDEX idx ON tbl (f1, f2, f3) UNIQUE ON (f1, f2);<br /><br /> while being less verbose, IMHO.<br /><br /><br/> The obvious inconvenient being that the indexes will grow a bit, so fewer entries will fit in memory.<br /><br /><br/> Also, we don't meddle with WITH clauses (for smgr parameters or the like) nor USING <method> clauses.<br />I reckon that implementation might be a bit intrusive (requiring changes to most index AMs even?)<br /><br /><br /><br/> Thanks!<br /><br /> / J.L.<br /><br />
On 16 September 2015 at 14:03, José Luis Tallón <jltallon@adv-solutions.net> wrote:
-- On 09/15/2015 06:57 PM, Anastasia Lubennikova wrote:Proposal Clarification.
I see that discussion become too complicated. So, I'd like to clarify what we are talking about.
[snip]
What are we doing now:
CREATE UNIQUE INDEX on tbl(f1,f2);
CREATE INDEX on tbl(f1, f2, f3, f4);
[snip]
Suggestions.
CREATE INDEX idx ON tbl (f1, f2, f3) [UNIQUE ON (f1, f2)] [INCLUDE (f4)];
Summarizing some suggestions upthread, it seems like the "best" syntax would be something similar to:
-- Non-unique index + "leaf" information (f4)
CREATE INDEX idx ON tbl (f1, f2, f3) [INCLUDING (f4)]
I guess this would possibly be a path to create indexes... without any actual data in the table. Sequential scans would be costly, but queries with a predicate matching the sorted columns would be very quick. Not sure how REINDEX could be made to work with that though. And might end up requiring something like partitioned indexes.
Disclaimer: I *really* haven't thought this through.
-- Unique index on f1,f2, + leaf information (f3)
CREATE UNIQUE INDEX idx ON tbl (f1, f2) [INCLUDING (f3)]
And, even:
ALTER INDEX idx INCLUDING (f4)
... which would trigger a REINDEX CONCURRENTLY internally ?
We don't currently have REINDEX CONCURRENTLY.
Thom
2015-09-16 Rod Taylor <rod.taylor@gmail.com>: > 2015-09-15 Anastasia Lubennikova <a.lubennikova@postgrespro.ru>: > >> - We have a table tbl(f1, f2, f3, f4). >> - We want to have an unique index on (f1,f2). >> - We want to have an index on (f1, f2, f3) which allow us to use index for >> complex "where" clauses. > > Can someone write a query where F3 being ordered is a contribution? > > If F1 and F2 are unique, adding F3 to a where or order by clause doesn't > seem to contribute anything. After thinking about it a bit more, it indeed seems never useful to have f3 in the internal nodes if it is not part of the columns that determine the UNIQUE property. It could as well be pushed out of the internal nodes and only appear in the leaf nodes. In other words: It seems only useful to have a list of columns that appear in the internal nodes AND to which the UNIQUE property applies, plus an addition list of columns whose values are only stored in the leaf nodes (to create a “covering index”). For non-UNIQUE indexes, there is also only need for two lists of columns. I don’t understand the case where it is useful anyway, according to David: 2015-09-16 David Rowley <david.rowley@2ndquadrant.com>: > Joining relations may have more than one matching tuple for any given unique > tuple, therefore the tuples may no longer be unique on the columns which are > in the unique index. Could you elaborate a bit on how this is relevant to Rod’s question? I seem to be missing something here. greetings, Nicolas -- A. Because it breaks the logical sequence of discussion. Q. Why is top posting bad?
On Wed, Sep 16, 2015 at 8:53 AM, Nicolas Barbier <nicolas.barbier@gmail.com> wrote: > After thinking about it a bit more, it indeed seems never useful to > have f3 in the internal nodes if it is not part of the columns that > determine the UNIQUE property. It could as well be pushed out of the > internal nodes and only appear in the leaf nodes. Correct. That's a standard technique in B-Tree implementations like our own; suffix truncation can remove unneeded information from the end of values, possibly including entire attributes, which can be truncated in a way that is similar to this patch. The difference here is only that this patch does not dynamically determine which attributes can be removed while re-finding the parent downlink in the second phase of a page split (the usual place it happens with standard suffix truncation). Rather, this patch knows "a priori" that it can truncate attributes that are merely "included" attributes. That means that this patch has as much to do with increasing B-Tree fan-out for these indexes as it does for making unique indexes more usable for index-only scans. Both of those goals are important, IMV. This patch seems pretty cool. I noticed some issues following a quick read though the patch including_columns_6.0.patch that Anastasia should look into: * You truncate (remove suffix attributes -- the "included" attributes) within _bt_insertonpg(): - right_item = CopyIndexTuple(item); + indnatts = IndexRelationGetNumberOfAttributes(rel); + indnkeyatts = IndexRelationGetNumberOfKeyAttributes(rel); + + if (indnatts != indnkeyatts) + { + right_item = index_reform_tuple(rel, item, indnatts, indnkeyatts); + right_item_sz = IndexTupleDSize(*right_item); + right_item_sz = MAXALIGN(right_item_sz); + } + else + right_item = CopyIndexTuple(item); ItemPointerSet(&(right_item->t_tid), rbkno, P_HIKEY); I suggest that you do this within _bt_insert_parent(), instead, iff the original target page is know to be a leaf page. That's where it needs to happen for conventional suffix truncation, which has special considerations when determining which attributes are safe to truncate (or even which byte in the first distinguishing attribute it is okay to truncate past). Conventional suffix truncation (not this patch) could truncate, say, "C" collation text past the first distinguishing byte, where the byte distinguishes the new downlink being inserted into the parent page compared to both the left downlink and right downlink in the parent page-- the minimum amount of information to correctly guide later index scans is only stored. But it isn't correct (again, with conventional suffix truncation) to do this passed the leaf level. It must end there. It isn't safe past the leaf level (by which I mean when inserting a downlink into its parent, one level up) because applying suffix truncation again for the next level up might guide a search to the highest node in the left sub-tree rather than to the lowest node in the right sub-tree, or vice versa. In general, we must be careful about "cousin" nodes, that are beside each other but are not "siblings" due to not sharing the same parent. It doesn't really matter that this restriction exists, because you get almost all the benefit at the leaf -> immediate parent level anyway. Higher levels will reuse already truncated Index Tuples, that are typically "truncated enough". So, this should work in a similar way to conventional suffix truncation (BTW, you should work on that later). And so, it should just do it there. Besides, checking it only where it could possibly help is clearer, since as written the code in _bt_insertonpg() will never need to truncate following a non-leaf/internal page split. * I think the comparison logic may have a bug. Does this work with amcheck? Maybe it works with bt_index_check(), but not bt_index_parent_check()? I think that you need to make sure that _bt_compare() knows about this, too. That's because it isn't good enough to let a truncated internal IndexTuple compare equal to a scankey when non-truncated attributes are equal. I think you need to have an imaginary "minus infinity" attribute past the first non-truncated attribute (i.e. "minus infinity value" for the first *truncated* attribute). That way, the downlinks will always be lower bounds when the non-"included"/truncated attributes are involved. This seems necessary. No? It's necessary because you aren't storing any attributes, so it's not acceptable to even attempt a comparison -- I think that will segfault (doesn't matter that the index scan wouldn't have returned anything anyway). I think it's also necessary because of issues with "cousin" nodes making index scans lose their way. That's all I have right now. Nice work. -- Peter Geoghegan
On Mon, Mar 14, 2016 at 8:43 PM, Peter Geoghegan <pg@heroku.com> wrote: > > Does this work with amcheck? Maybe it works with bt_index_check(), but > not bt_index_parent_check()? I think that you need to make sure that > _bt_compare() knows about this, too. That's because it isn't good > enough to let a truncated internal IndexTuple compare equal to a > scankey when non-truncated attributes are equal. I think you need to > have an imaginary "minus infinity" attribute past the first > non-truncated attribute (i.e. "minus infinity value" for the first > *truncated* attribute). That way, the downlinks will always be lower > bounds when the non-"included"/truncated attributes are involved. This > seems necessary. No? Maybe can store information about minus infinity attributes in "itup->t_tid.ip_posid". As you know, this is unused within internal/non-leaf pages, whose downlink items only need a block number (the child's block number/location on disk for that particular downlink). That's a bit ugly, but there are plenty of bits available from there, so use them if you need them. -- Peter Geoghegan
On Mon, Mar 14, 2016 at 8:43 PM, Peter Geoghegan <pg@heroku.com> wrote: > * I think the comparison logic may have a bug. > > Does this work with amcheck? Maybe it works with bt_index_check(), but > not bt_index_parent_check()? I think that you need to make sure that > _bt_compare() knows about this, too. That's because it isn't good > enough to let a truncated internal IndexTuple compare equal to a > scankey when non-truncated attributes are equal. I think you need to > have an imaginary "minus infinity" attribute past the first > non-truncated attribute (i.e. "minus infinity value" for the first > *truncated* attribute). That way, the downlinks will always be lower > bounds when the non-"included"/truncated attributes are involved. This > seems necessary. No? Oh, BTW: You probably need to worry about high key items as a special case, too. Note that there is a special case when the ScanKey is equal to the high key on a page during insertion. As the nbtree README puts it: """ An insertion that sees the high key of its target page is equal to the key to be inserted has a choice whether or not to move right, since the new key could go on either page. (Currently, we try to find a page where there is room for the new key without a split.) """ Just something to watch out for if you add "minus infinity" attributes as I suggested. Not exactly sure what to do about this other problem, but it seems manageable. -- Peter Geoghegan
On Mon, Mar 14, 2016 at 8:43 PM, Peter Geoghegan <pg@heroku.com> wrote: > Does this work with amcheck? Maybe it works with bt_index_check(), but > not bt_index_parent_check()? I think that you need to make sure that > _bt_compare() knows about this, too. That's because it isn't good > enough to let a truncated internal IndexTuple compare equal to a > scankey when non-truncated attributes are equal. I think you need to > have an imaginary "minus infinity" attribute past the first > non-truncated attribute (i.e. "minus infinity value" for the first > *truncated* attribute). That way, the downlinks will always be lower > bounds when the non-"included"/truncated attributes are involved. This > seems necessary. No? Actually, I now think I got this slightly wrong. What's at issue is this (from nbtree README): """ Lehman and Yao assume that the key range for a subtree S is described by Ki < v <= Ki+1 where Ki and Ki+1 are the adjacent keys in the parent page. This does not work for nonunique keys (for example, if we have enough equal keys to spread across several leaf pages, there *must* be some equal bounding keys in the first level up). Therefore we assume Ki <= v <= Ki+1 instead. A search that finds exact equality to a bounding key in an upper tree level must descend to the left of that key to ensure it finds any equal keys in the preceding page. """ Today, nbtree needs to check the page to the left of an equal internal page downlink child anyway. That isn't hard-coded into _bt_compare(), though. If it was, it would be a "positive infinity" attribute, not "negative infinity" as I incorrectly said. This is because the equal IndexTuples might easily not begin exactly at the beginning of the downlink's child page (often, we should begin in the left page instead, by following the previous downlink in the parent instead -- just in case). Any kind of "infinity" attribute probably isn't necessary for your patch today, since, as referenced in the README extract above, our slightly non-standard L&Y does this in _bt_binsrch(): /* * At this point we have high == low, but be careful: they could point * past the last slot on the page. * * On a leaf page, we always return the first key >= scan key (resp. > * scan key), which could be the last slot + 1. */ if (P_ISLEAF(opaque)) return low; However, I think it's still a good idea to have a special integer in the IndexTuple explicitly indicating the attribute at which the "suffix" is truncated, even if the "suffix truncation" happens at a consistent point based on an index in your patch. That will change in the future, and we should be prepared. Even though I was partially mistaken, clearly it still wasn't okay to even try to compare non-existent attributes in internal pages, since that segfaulted. So a (mostly imaginary) "positive infinity" attribute can still exist, initially just to make _bt_compare() not crash. This attribute number (stored in "itup->t_tid.ip_posid") effectively tells the binary search code to look at the child to the left of the compared downlink (not the downlink child itself), even though that's already going to happen per the code above. So, thinking about it once more (uh, sorry), _bt_compare() has to "indicate equality"/return 0, *despite* being *logically* a "positive infinity" comparison from a higher level, in order to let the code above to handle it instead, so it isn't handled more than once. Also, not sure if less common "nextkey == true" case needs some further consideration (lets forget that detail for time being, though). Phew! So, as I said, _bt_binsrch() and/or _bt_compare() can be fixed to make sure that the scan arrives on the correct leaf page (the first leaf page that an matching IndexTuple could be on). What then, though? What about leaf pages, that *do* have the extra attributes ("INCLUDING" attributes) represented in their tuples, and *don't* "return OffsetNumberPrev(low)" at the end of _bt_binsrch() (they do the P_LEAF() thing quoted above)? Are they safe? Remember: * For nextkey=false (cmpval=1), the loop invariant is: all slots before * 'low' are < scan key, all slots at or after 'high' are >= scan key. I think this means that you need to be very careful about leaf pages, too. Speculative insertion (used by UPSERT) may not even have the extra attributes, during the precheck that starts from within check_exclusion_or_unique_constraint() -- it needs to start from the very start at the leaf level, without regard for the particular details of the non-constrained extra columns. I see that you take the number of attributes a new way, so that ultimately _bt_compare()'s "keysz" argument only includes "full" attributes for cases like the UPSERT one. That seems okay. However, what about the case where a tuple is inserted? We need to do unique enforcement on the one hand, but need to start at the right place for insertion on the other hand. Is it okay that _bt_findinsertloc() is passed "indnkeyatts" rather than "natts" now? Now, it looks okay that _bt_check_unique() has also been directly fixed to do the same, but I don't think that _bt_findinsertloc() should have received this treatment. Otherwise, the ordering on leaf pages is subtly broken, which I don't think is okay. I don't see tests for NULL values, which are a little special. Does _bt_isequal() really not require additional checks? It looks correct at a quick glance, but you should definitely have tests for this. Lots of them. Testing ------- Maybe you should add this to your tools used for testing, just customized a bit to use your new feature: https://github.com/petergeoghegan/jjanes_upsert . This stress-testing tool could be very interesting -- it was extremely valuable during the development of UPSERT (code is a bit rough, though -- I don't really know Perl). It should really stress the implementation, and show bugs. I suggest hacking amcheck to only check the leaf level, and make sure it alone is sane/in full order, correcting the patch as needed. This is a small step, and so possibly a good next step. Once we know the keyspace is at least sane at the leaf level (which I think it isn't right now, due to the _bt_findinsertloc() issue), we can build on it. After all, the keyspace at the leaf level should be the same with or without this patch -- right? We can fix that *in isolation*, gaining confidence with amcheck (hacked to just care about leaf pages), which is relatively straightforward. Afterwards, we move on to the more complicated matter of internal pages. Also, I attach a file with a selection of SQL queries that use pageinspect. I wrote these to get a quick sense of the keyspace of a B-Tree index, and a few other interesting things -- seeing how the keyspace looks by looking at internal page highkeys (the last query) is probably particularly valuable for testing a patch like this, or a more general suffix truncation patch. You might have written queries like this yourself already, and if not you can start with these. I was actually thinking of putting them on Github or something myself, but they're a bit rough at the moment. Getting a "quick sense of the keyspace" with the last query lets you see when it has suddenly become unbalanced during development -- it's a good thing to keep an eye on. That's all I have for today. I still haven't actually built the patch myself -- this feedback is all based on reading the code. So hope I missed nothing obvious due to that. -- Peter Geoghegan