Обсуждение: patch: xmltable - proof concept

Поиск
Список
Период
Сортировка

patch: xmltable - proof concept

От
Pavel Stehule
Дата:
Hi

I am sending a initial implementation of xmltable function:

The code is not clean now, but it does almost of expected work. The usage is simple. It is fast - 16K entries in 400ms.

I invite any help with documentation and testing.

The full ANSI/SQL, or Oracle compatible implementation is not possible due limits of libxml2, but for typical usage it should to work well. It doesn't need any new reserved keyword, so there should not be hard barriers for accepting (when this work will be complete).

Example:

postgres=# SELECT * FROM xmldata;
┌──────────────────────────────────────────────────────────────────┐
│                               data                               │
╞══════════════════════════════════════════════════════════════════╡
│ <ROWS>                                                          ↵│
│ <ROW id="1">                                                    ↵│
│   <COUNTRY_ID>AU</COUNTRY_ID>                                   ↵│
│   <COUNTRY_NAME>Australia</COUNTRY_NAME>                        ↵│
│   <REGION_ID>3</REGION_ID>                                      ↵│
│ </ROW>                                                          ↵│
│ <ROW id="2">                                                    ↵│
│   <COUNTRY_ID>CN</COUNTRY_ID>                                   ↵│
│   <COUNTRY_NAME>China</COUNTRY_NAME>                            ↵│
│   <REGION_ID>3</REGION_ID>                                      ↵│
│ </ROW>                                                          ↵│
│ <ROW id="3">                                                    ↵│
│   <COUNTRY_ID>HK</COUNTRY_ID>                                   ↵│
│   <COUNTRY_NAME>HongKong</COUNTRY_NAME>                         ↵│
│   <REGION_ID>3</REGION_ID>                                      ↵│
│ </ROW>                                                          ↵│
│ <ROW id="4">                                                    ↵│
│   <COUNTRY_ID>IN</COUNTRY_ID>                                   ↵│
│   <COUNTRY_NAME>India</COUNTRY_NAME>                            ↵│
│   <REGION_ID>3</REGION_ID>                                      ↵│
│ </ROW>                                                          ↵│
│ <ROW id="5">                                                    ↵│
│   <COUNTRY_ID>JP</COUNTRY_ID>                                   ↵│
│   <COUNTRY_NAME>Japan</COUNTRY_NAME>                            ↵│
│   <REGION_ID>3</REGION_ID><PREMIER_NAME>Sinzo Abe</PREMIER_NAME>↵│
│ </ROW>                                                          ↵│
│ <ROW id="6">                                                    ↵│
│   <COUNTRY_ID>SG</COUNTRY_ID>                                   ↵│
│   <COUNTRY_NAME>Singapore</COUNTRY_NAME>                        ↵│
│   <REGION_ID>3</REGION_ID><SIZE unit="km">791</SIZE>            ↵│
│ </ROW>                                                          ↵│
│ </ROWS>                                                          │
└──────────────────────────────────────────────────────────────────┘
(1 row)

postgres=# SELECT  xmltable.*
postgres-#    FROM (SELECT data FROM xmldata) x,
postgres-#         LATERAL xmltable('/ROWS/ROW'
postgres(#                          PASSING data
postgres(#                          COLUMNS id int PATH '@id',
postgres(#                                   country_name text PATH 'COUNTRY_NAME',
postgres(#                                   country_id text PATH 'COUNTRY_ID',
postgres(#                                   region_id int PATH 'REGION_ID',
postgres(#                                   size float PATH 'SIZE',
postgres(#                                   unit text PATH 'SIZE/@unit',
postgres(#                                   premier_name text PATH 'PREMIER_NAME' DEFAULT 'not specified');
┌────┬──────────────┬────────────┬───────────┬──────┬──────┬───────────────┐
│ id │ country_name │ country_id │ region_id │ size │ unit │ premier_name  │
╞════╪══════════════╪════════════╪═══════════╪══════╪══════╪═══════════════╡
│  1 │ Australia    │ AU         │         3 │    ¤ │ ¤    │ not specified │
│  2 │ China        │ CN         │         3 │    ¤ │ ¤    │ not specified │
│  3 │ HongKong     │ HK         │         3 │    ¤ │ ¤    │ not specified │
│  4 │ India        │ IN         │         3 │    ¤ │ ¤    │ not specified │
│  5 │ Japan        │ JP         │         3 │    ¤ │ ¤    │ Sinzo Abe     │
│  6 │ Singapore    │ SG         │         3 │  791 │ km   │ not specified │
└────┴──────────────┴────────────┴───────────┴──────┴──────┴───────────────┘
(6 rows)

Regards

Pavel
Вложения

Re: patch: xmltable - proof concept

От
Pavel Stehule
Дата:
Hi

2016-08-07 11:15 GMT+02:00 Pavel Stehule <pavel.stehule@gmail.com>:
Hi

I am sending a initial implementation of xmltable function:

The code is not clean now, but it does almost of expected work. The usage is simple. It is fast - 16K entries in 400ms.

I invite any help with documentation and testing.

The full ANSI/SQL, or Oracle compatible implementation is not possible due limits of libxml2, but for typical usage it should to work well. It doesn't need any new reserved keyword, so there should not be hard barriers for accepting (when this work will be complete).

Example:

postgres=# SELECT * FROM xmldata;
┌──────────────────────────────────────────────────────────────────┐
│                               data                               │
╞══════════════════════════════════════════════════════════════════╡
│ <ROWS>                                                          ↵│
│ <ROW id="1">                                                    ↵│
│   <COUNTRY_ID>AU</COUNTRY_ID>                                   ↵│
│   <COUNTRY_NAME>Australia</COUNTRY_NAME>                        ↵│
│   <REGION_ID>3</REGION_ID>                                      ↵│
│ </ROW>                                                          ↵│
│ <ROW id="2">                                                    ↵│
│   <COUNTRY_ID>CN</COUNTRY_ID>                                   ↵│
│   <COUNTRY_NAME>China</COUNTRY_NAME>                            ↵│
│   <REGION_ID>3</REGION_ID>                                      ↵│
│ </ROW>                                                          ↵│
│ <ROW id="3">                                                    ↵│
│   <COUNTRY_ID>HK</COUNTRY_ID>                                   ↵│
│   <COUNTRY_NAME>HongKong</COUNTRY_NAME>                         ↵│
│   <REGION_ID>3</REGION_ID>                                      ↵│
│ </ROW>                                                          ↵│
│ <ROW id="4">                                                    ↵│
│   <COUNTRY_ID>IN</COUNTRY_ID>                                   ↵│
│   <COUNTRY_NAME>India</COUNTRY_NAME>                            ↵│
│   <REGION_ID>3</REGION_ID>                                      ↵│
│ </ROW>                                                          ↵│
│ <ROW id="5">                                                    ↵│
│   <COUNTRY_ID>JP</COUNTRY_ID>                                   ↵│
│   <COUNTRY_NAME>Japan</COUNTRY_NAME>                            ↵│
│   <REGION_ID>3</REGION_ID><PREMIER_NAME>Sinzo Abe</PREMIER_NAME>↵│
│ </ROW>                                                          ↵│
│ <ROW id="6">                                                    ↵│
│   <COUNTRY_ID>SG</COUNTRY_ID>                                   ↵│
│   <COUNTRY_NAME>Singapore</COUNTRY_NAME>                        ↵│
│   <REGION_ID>3</REGION_ID><SIZE unit="km">791</SIZE>            ↵│
│ </ROW>                                                          ↵│
│ </ROWS>                                                          │
└──────────────────────────────────────────────────────────────────┘
(1 row)

postgres=# SELECT  xmltable.*
postgres-#    FROM (SELECT data FROM xmldata) x,
postgres-#         LATERAL xmltable('/ROWS/ROW'
postgres(#                          PASSING data
postgres(#                          COLUMNS id int PATH '@id',
postgres(#                                   country_name text PATH 'COUNTRY_NAME',
postgres(#                                   country_id text PATH 'COUNTRY_ID',
postgres(#                                   region_id int PATH 'REGION_ID',
postgres(#                                   size float PATH 'SIZE',
postgres(#                                   unit text PATH 'SIZE/@unit',
postgres(#                                   premier_name text PATH 'PREMIER_NAME' DEFAULT 'not specified');
┌────┬──────────────┬────────────┬───────────┬──────┬──────┬───────────────┐
│ id │ country_name │ country_id │ region_id │ size │ unit │ premier_name  │
╞════╪══════════════╪════════════╪═══════════╪══════╪══════╪═══════════════╡
│  1 │ Australia    │ AU         │         3 │    ¤ │ ¤    │ not specified │
│  2 │ China        │ CN         │         3 │    ¤ │ ¤    │ not specified │
│  3 │ HongKong     │ HK         │         3 │    ¤ │ ¤    │ not specified │
│  4 │ India        │ IN         │         3 │    ¤ │ ¤    │ not specified │
│  5 │ Japan        │ JP         │         3 │    ¤ │ ¤    │ Sinzo Abe     │
│  6 │ Singapore    │ SG         │         3 │  791 │ km   │ not specified │
└────┴──────────────┴────────────┴───────────┴──────┴──────┴───────────────┘
(6 rows)

Regards


I am sending updated version - the code is not better, but there is full functionality implemented.

* xmlnamespaces,
* default xmlnamespace,
* ordinality column,
* NOT NULL constraint,
* mode without explicitly defined columns.

Lot of bugs was fixed - it is ready for some playing.

tests, comments, notes, comparing with other db are welcome. Some behave is based by libxml2 possibilities - so only XPath is supported. 

Regards

Pavel
 
Pavel

Вложения

Re: patch: xmltable - proof concept

От
Alvaro Herrera
Дата:
Pavel Stehule wrote:

> postgres=# SELECT  xmltable.*
> postgres-#    FROM (SELECT data FROM xmldata) x,
> postgres-#         LATERAL xmltable('/ROWS/ROW'
> postgres(#                          PASSING data
> postgres(#                          COLUMNS id int PATH '@id',
> postgres(#                                   country_name text PATH
> 'COUNTRY_NAME',
> postgres(#                                   country_id text PATH
> 'COUNTRY_ID',
> postgres(#                                   region_id int PATH 'REGION_ID',
> postgres(#                                   size float PATH 'SIZE',
> postgres(#                                   unit text PATH 'SIZE/@unit',
> postgres(#                                   premier_name text PATH
> 'PREMIER_NAME' DEFAULT 'not specified');
> ┌────┬──────────────┬────────────┬───────────┬──────┬──────┬───────────────┐
> │ id │ country_name │ country_id │ region_id │ size │ unit │ premier_name  │
> ╞════╪══════════════╪════════════╪═══════════╪══════╪══════╪═══════════════╡
> │  1 │ Australia    │ AU         │         3 │    ¤ │ ¤    │ not specified │
> │  2 │ China        │ CN         │         3 │    ¤ │ ¤    │ not specified │
> │  3 │ HongKong     │ HK         │         3 │    ¤ │ ¤    │ not specified │
> │  4 │ India        │ IN         │         3 │    ¤ │ ¤    │ not specified │
> │  5 │ Japan        │ JP         │         3 │    ¤ │ ¤    │ Sinzo Abe     │
> │  6 │ Singapore    │ SG         │         3 │  791 │ km   │ not specified │
> └────┴──────────────┴────────────┴───────────┴──────┴──────┴───────────────┘
> (6 rows)

Nice work!

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: patch: xmltable - proof concept

От
Pavel Stehule
Дата:


2016-08-09 19:30 GMT+02:00 Alvaro Herrera <alvherre@2ndquadrant.com>:
Pavel Stehule wrote:

> postgres=# SELECT  xmltable.*
> postgres-#    FROM (SELECT data FROM xmldata) x,
> postgres-#         LATERAL xmltable('/ROWS/ROW'
> postgres(#                          PASSING data
> postgres(#                          COLUMNS id int PATH '@id',
> postgres(#                                   country_name text PATH
> 'COUNTRY_NAME',
> postgres(#                                   country_id text PATH
> 'COUNTRY_ID',
> postgres(#                                   region_id int PATH 'REGION_ID',
> postgres(#                                   size float PATH 'SIZE',
> postgres(#                                   unit text PATH 'SIZE/@unit',
> postgres(#                                   premier_name text PATH
> 'PREMIER_NAME' DEFAULT 'not specified');
> ┌────┬──────────────┬────────────┬───────────┬──────┬──────┬───────────────┐
> │ id │ country_name │ country_id │ region_id │ size │ unit │ premier_name  │
> ╞════╪══════════════╪════════════╪═══════════╪══════╪══════╪═══════════════╡
> │  1 │ Australia    │ AU         │         3 │    ¤ │ ¤    │ not specified │
> │  2 │ China        │ CN         │         3 │    ¤ │ ¤    │ not specified │
> │  3 │ HongKong     │ HK         │         3 │    ¤ │ ¤    │ not specified │
> │  4 │ India        │ IN         │         3 │    ¤ │ ¤    │ not specified │
> │  5 │ Japan        │ JP         │         3 │    ¤ │ ¤    │ Sinzo Abe     │
> │  6 │ Singapore    │ SG         │         3 │  791 │ km   │ not specified │
> └────┴──────────────┴────────────┴───────────┴──────┴──────┴───────────────┘
> (6 rows)

Nice work!

Thank you

Pavel
 

--
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services