Thread: json api WIP patch
Here is a patch for the first part of the JSON API that was recently discussed. It includes the json parser hook infrastructure and functions for json_get and friends, plus json_keys.

As is, this exposes the json lexer fully for use by the hook functions. But I could easily be persuaded that this should be an opaque structure with some constructor and getter functions - I don't think the hook functions need or should be able to set anything in the lexer.

Work is proceeding on some of the more advanced functionality discussed.

cheers

andrew
Attachments
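The accessors named in the message above (json_get and friends, json_keys) are not shown in use anywhere in it. A sketch of what calls might look like, based only on the names mentioned in this thread - the exact signatures are those of the posted patch and could differ:

```sql
-- Hypothetical usage of the accessors named above.
-- json_get fetches an object field (or array element), returning json;
-- json_keys returns the keys of the top-level object as a set of text.
SELECT json_get('{"a": {"b": 1}}', 'a');   -- {"b": 1}
SELECT json_keys('{"a": 1, "b": 2}');      -- a and b, one per row
```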
On Wed, Dec 26, 2012 at 3:33 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
> Here is a patch for the first part of the JSON API that was recently
> discussed. It includes the json parser hook infrastructure and functions
> for json_get and friends, plus json_keys.
>
> As is, this exposes the json lexer fully for use by the hook functions. But
> I could easily be persuaded that this should be an opaque structure with
> some constructor and getter functions - I don't think the hook functions
> need or should be able to set anything in the lexer.
>
> Work is proceeding on some of the more advanced functionality discussed.

This seems to contain a large number of spurious whitespace changes. And maybe some other spurious changes. For example, I'm not sure why this comment is truncated:

      }
      }
!     /*
!      * Check for trailing garbage. As in json_lex(), any alphanumeric stuff
!      * here should be considered part of the token for error-reporting
!      * purposes.
!      */
      for (p = s; JSON_ALPHANUMERIC_CHAR(*p); p++)
          error = true;
      lex->token_terminator = p;
      if (error)
          report_invalid_token(lex);
--- 730,739 ----
      }
      }
!     /* Check for trailing garbage. */
      for (p = s; JSON_ALPHANUMERIC_CHAR(*p); p++)
          error = true;
+     lex->prev_token_terminator = lex->token_terminator;
      lex->token_terminator = p;
      if (error)
          report_invalid_token(lex);

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 01/02/2013 04:45 PM, Robert Haas wrote:
> On Wed, Dec 26, 2012 at 3:33 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
>> Here is a patch for the first part of the JSON API that was recently
>> discussed. It includes the json parser hook infrastructure and functions
>> for json_get and friends, plus json_keys.
>>
>> As is, this exposes the json lexer fully for use by the hook functions. But
>> I could easily be persuaded that this should be an opaque structure with
>> some constructor and getter functions - I don't think the hook functions
>> need or should be able to set anything in the lexer.
>>
>> Work is proceeding on some of the more advanced functionality discussed.
> This seems to contain a large number of spurious whitespace changes.

I'm glad you're looking at it :-)

I did do a run of pgindent on the changed files before I cut the patch, which might have made some of those changes. I notice a couple of other infelicities too, which are undoubtedly my fault. The original prototype of this was cut against some older code, and I then tried to merge it with the current code base, and make sure that all the regression tests passed. That might well have resulted in a number of things that need review.

> And maybe some other spurious changes. For example, I'm not sure why
> this comment is truncated:

Another good question. I'll make another pass over this and try to remove some of what's annoying you.

cheers

andrew
On 01/02/2013 05:51 PM, Andrew Dunstan wrote:
>
> On 01/02/2013 04:45 PM, Robert Haas wrote:
>> On Wed, Dec 26, 2012 at 3:33 PM, Andrew Dunstan <andrew@dunslane.net>
>> wrote:
>>> Here is a patch for the first part of the JSON API that was recently
>>> discussed. It includes the json parser hook infrastructure and
>>> functions
>>> for json_get and friends, plus json_keys.

Updated patch that contains most of the functionality I'm after. One piece left is populate_recordset (populate a set of records from a single json datum which is an array of objects, in one pass). That requires a bit of thought.

I hope most of the whitespace issues are fixed.

cheers

andrew
Attachments
Hello

2013/1/4 Andrew Dunstan <andrew@dunslane.net>:
>
> On 01/02/2013 05:51 PM, Andrew Dunstan wrote:
>>
>> On 01/02/2013 04:45 PM, Robert Haas wrote:
>>>
>>> On Wed, Dec 26, 2012 at 3:33 PM, Andrew Dunstan <andrew@dunslane.net>
>>> wrote:
>>>>
>>>> Here is a patch for the first part of the JSON API that was recently
>>>> discussed. It includes the json parser hook infrastructure and
>>>> functions
>>>> for json_get and friends, plus json_keys.
>
> Updated patch that contains most of the functionality I'm after. One piece
> left is populate_recordset (populate a set of records from a single json
> datum which is an array of objects, in one pass). That requires a bit of
> thought.
>
> I hope most of the whitespace issues are fixed.

it is looking well

I have one note - is "json_each" a good name?

Regards

Pavel

> cheers
>
> andrew
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
On 01/04/2013 03:36 PM, Pavel Stehule wrote:
> Hello
>
> 2013/1/4 Andrew Dunstan <andrew@dunslane.net>:
>> On 01/02/2013 05:51 PM, Andrew Dunstan wrote:
>>>
>>> On 01/02/2013 04:45 PM, Robert Haas wrote:
>>>> On Wed, Dec 26, 2012 at 3:33 PM, Andrew Dunstan <andrew@dunslane.net>
>>>> wrote:
>>>>> Here is a patch for the first part of the JSON API that was recently
>>>>> discussed. It includes the json parser hook infrastructure and
>>>>> functions
>>>>> for json_get and friends, plus json_keys.
>>
>> Updated patch that contains most of the functionality I'm after. One piece
>> left is populate_recordset (populate a set of records from a single json
>> datum which is an array of objects, in one pass). That requires a bit of
>> thought.
>>
>> I hope most of the whitespace issues are fixed.
>
> it is looking well
>
> I have one note - is "json_each" a good name?

Possibly not, although hstore has each(). json_unnest might be even less felicitous.

I'm expecting a good deal of bikeshedding - I'm trying to ignore those issues for the most part because the functionality seems much more important to me than the names.

The more people that pound on it and try to break it the happier I'll be.

cheers

andrew
2013/1/4 Andrew Dunstan <andrew@dunslane.net>:
>
> On 01/04/2013 03:36 PM, Pavel Stehule wrote:
>>
>> Hello
>>
>> 2013/1/4 Andrew Dunstan <andrew@dunslane.net>:
>>>
>>> On 01/02/2013 05:51 PM, Andrew Dunstan wrote:
>>>>
>>>> On 01/02/2013 04:45 PM, Robert Haas wrote:
>>>>>
>>>>> On Wed, Dec 26, 2012 at 3:33 PM, Andrew Dunstan <andrew@dunslane.net>
>>>>> wrote:
>>>>>>
>>>>>> Here is a patch for the first part of the JSON API that was recently
>>>>>> discussed. It includes the json parser hook infrastructure and
>>>>>> functions
>>>>>> for json_get and friends, plus json_keys.
>>>
>>> Updated patch that contains most of the functionality I'm after. One piece
>>> left is populate_recordset (populate a set of records from a single json
>>> datum which is an array of objects, in one pass). That requires a bit of
>>> thought.
>>>
>>> I hope most of the whitespace issues are fixed.
>>>
>> it is looking well
>>
>> I have one note - is "json_each" a good name?
>
> Possibly not, although hstore has each(). json_unnest might be even less
> felicitous.

I understand - but hstore isn't in core - so it should not be a precedent.

regexp_split_to_table

I am not a native speaker, and it sounds a little strange to me - but maybe that is just because I am not a native speaker :)

Regards

Pavel

> I'm expecting a good deal of bikeshedding - I'm trying to ignore those
> issues for the most part because the functionality seems much more important
> to me than the names.
>
> The more people that pound on it and try to break it the happier I'll be.
>
> cheers
>
> andrew
On Fri, Jan 4, 2013 at 3:03 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
> I understand - but hstore isn't in core - so it should not be a precedent
>
> regexp_split_to_table
>
> I am not a native speaker, it sounds a little strange - but maybe
> because I am not a native speaker :)

it's common usage: http://api.jquery.com/jQuery.each/

the patch looks fabulous. There are a few trivial whitespace issues yet and I noticed a leaked hstore comment @ 2440:

+ /*
+  * if the input hstore is empty, we can only skip the rest if we were
+  * passed in a non-null record, since otherwise there may be issues with
+  * domain nulls.
+  */

merlin
2013/1/7 Merlin Moncure <mmoncure@gmail.com>:
> On Fri, Jan 4, 2013 at 3:03 PM, Pavel Stehule <pavel.stehule@gmail.com> wrote:
>> I understand - but hstore isn't in core - so it should not be a precedent
>>
>> regexp_split_to_table
>>
>> I am not a native speaker, it sounds a little strange - but maybe
>> because I am not a native speaker :)
>
> it's common usage: http://api.jquery.com/jQuery.each/

ok

Regards

Pavel

> the patch looks fabulous. There are a few trivial whitespace issues
> yet and I noticed a leaked hstore comment @ 2440:
> + /*
> +  * if the input hstore is empty, we can only skip the rest if we were
> +  * passed in a non-null record, since otherwise there may be issues with
> +  * domain nulls.
> +  */
>
> merlin
On 01/07/2013 10:25 AM, Merlin Moncure wrote:
> the patch looks fabulous. There are a few trivial whitespace issues
> yet and I noticed a leaked hstore comment @ 2440:
> + /*
> +  * if the input hstore is empty, we can only skip the rest if we were
> +  * passed in a non-null record, since otherwise there may be issues with
> +  * domain nulls.
> +  */

Here is a patch that has all the functionality I'm intending to provide. The API is improved some, with the parser now keeping track of the nesting levels instead of callers having to do so, and a constructor function provided for JsonLexContext objects.

The processing functions have been extended to provide populate_record() and populate_recordset() functions. The latter in particular could be useful in decomposing a piece of json representing an array of flat objects (a fairly common pattern) into a set of Postgres records in a single pass.

The main thing I'm going to concentrate on now is making sure that this doesn't leak memory. I'm sure there's some tightening required in that area. Any eyeballs there will be greatly appreciated. There are also a couple of very minor things to clean up.

You (Merlin) have kindly volunteered to work on documentation, so before we go too far with that any bikeshedding on names, or on the functionality being provided, should now take place.

cheers

andrew
Attachments
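The populate_recordset idea announced above - one json array of flat objects decomposed into typed records in a single pass - would be used along these lines (a sketch; the composite type and the data are invented for illustration, and the signature is the one listed later in this thread):

```sql
-- Hypothetical composite type matching the object shape.
CREATE TYPE person AS (name text, age int);

-- One pass over the array yields one typed record per object.
SELECT *
FROM json_populate_recordset(
    null::person,
    '[{"name":"alice","age":30},{"name":"bob","age":41}]'
);
--  name  | age
-- -------+-----
--  alice |  30
--  bob   |  41
```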
> The processing functions have been extended to provide populate_record() and populate_recordset() functions. The latter
> in particular could be useful in decomposing a piece of json representing an array of flat objects (a fairly common
> pattern) into a set of Postgres records in a single pass.

So this would allow an 'insert into ... select ... from <unpack-the-JSON>(...)'?

I had been wondering how to do such an insertion efficiently in the context of SPI, but it seems that there is no SPI_copy equiv that would allow a query parse and plan to be avoided.

Is this mechanism likely to be as fast as we can get at the moment in contexts where copy is not feasible?
On 01/08/2013 01:45 AM, james wrote:
>> The processing functions have been extended to provide
>> populate_record() and populate_recordset() functions. The latter in
>> particular could be useful in decomposing a piece of json
>> representing an array of flat objects (a fairly common pattern) into
>> a set of Postgres records in a single pass.
>
> So this would allow an 'insert into ... select ... from
> <unpack-the-JSON>(...)'?

Yes.

> I had been wondering how to do such an insertion efficiently in the
> context of SPI, but it seems that there is no SPI_copy equiv that
> would allow a query parse and plan to be avoided.

Your query above would need to be planned too, although the plan will be trivial.

> Is this mechanism likely to be as fast as we can get at the moment in
> contexts where copy is not feasible?

You should not try to use it as a general bulk load facility. And it will not be as fast as COPY for several reasons, including that the Json parsing routines are necessarily much heavier than the COPY parse routines, which have in any case been optimized over quite a long period.

Also, a single json datum is limited to no more than 1Gb. If you have such a datum, parsing it involves having it in memory and then taking a copy (I wonder if we could avoid that step - will take a look). Then each object is decomposed into a hash table of key value pairs, which is then used to construct the record datum. Each field name in the result record is used to look up the value in the hash table - this happens once in the case of populate_record() and once per object in the array in the case of populate_recordset(). In the latter case the resulting records are put into a tuplestore structure (which spills to disk if necessary) which is then returned to the caller when all the objects in the json array are processed.

COPY doesn't have these sorts of issues. It knows without having to look things up where each datum is in each record, and it stashes the result straight into the target table. It can read and insert huge numbers of rows without significant memory implications.

Both these routines and COPY in non-binary mode use the data type input routines to convert text values. In some cases (very notably timestamps) these routines can easily be shown to be fantastically expensive compared to binary input. This is part of what has led to the creation of utilities like pg_bulkload.

Perhaps if you give us a higher level view of what you're trying to achieve we can help you better.

cheers

andrew
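The 'insert into ... select ... from <unpack-the-JSON>' shape james asked about, spelled out as a sketch (the target table and its rowtype are hypothetical; as noted above, the INSERT still gets planned, but the plan is trivial and there is one statement per batch rather than per row):

```sql
-- Hypothetical target table; its rowtype doubles as the record template.
CREATE TABLE people (name text, age int);

-- Unpack a json array of objects and insert all rows in one statement.
INSERT INTO people (name, age)
SELECT name, age
FROM json_populate_recordset(
    null::people,
    '[{"name":"alice","age":30},{"name":"bob","age":41}]'
);
```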
>> I had been wondering how to do such an insertion efficiently in the context of SPI, but it seems that there is no
>> SPI_copy equiv that would allow a query parse and plan to be avoided.
>
> Your query above would need to be planned too, although the plan will be trivial.

Ah yes, I meant that I had not found a way to avoid it (for multi-row inserts etc) from a stored proc context where I have SPI functions available.

> You should not try to use it as a general bulk load facility. And it will not be as fast as COPY for several reasons,
> including that the Json parsing routines are necessarily much heavier than the COPY parse routines, which have in any
> case been optimized over quite a long period. Also, a single json datum is limited to no more than 1Gb. If you have
> such a datum, parsing it involves having it in memory and then taking a copy (I wonder if we could avoid that step -
> will take a look). Then each object is decomposed into a hash table of key value pairs, which is then used to
> construct the record datum. Each field name in the result record is used to look up the value in the hash table -
> this happens once in the case of populate_record() and once per object in the array in the case of
> populate_recordset(). In the latter case the resulting records are put into a tuplestore structure (which spills to
> disk if necessary) which is then returned to the caller when all the objects in the json array are processed. COPY
> doesn't have these sorts of issues. It knows without having to look things up where each datum is in each record,
> and it stashes the result straight into the target table. It can read and insert huge numbers of rows without
> significant memory implications.

Yes - but I don't think I can use COPY from a stored proc context, can I? If I could use binary COPY from a stored proc that has received a binary param and unpacked the data, it would be handy.

If SPI provided a way to perform a copy to a temp table and then some callback on an iterator that yields rows to it, that would do the trick I guess.

> Perhaps if you give us a higher level view of what you're trying to achieve we can help you better.

I had been trying to identify a way to work with record sets where the records might be used for insert, or for update or deletion statements, preferably without forming a large custom SQL statement that must then be parsed and planned (and which would be a PITA if I wanted to use the SQL-C preprocessor or some language bindings that like to prepare a statement and execute with params).

The data I work with has a master-detail structure and insertion performance matters, so I'm trying to limit manipulations to one statement per table per logical operation even where there are multiple detail rows.

Sometimes the network latency can be a pain too and that also suggests an RPC with unpack and insert locally.

Cheers

James
On 01/08/2013 09:58 AM, Andrew Dunstan wrote:
> If you have such a datum, parsing it involves having it in memory and
> then taking a copy (I wonder if we could avoid that step - will take a
> look).

Here is a Proof Of Concept patch against my development tip on what's involved in getting the JSON lexer not to need a nul-terminated string to parse. This passes regression, incidentally.

The downside is that processing is very slightly more complex, and that json_in() would need to call strlen() on its input. The upside would be that the processing routines I've been working on would no longer need to create copies of their json arguments using text_to_cstring() just so they can get a null-terminated string to process.

Consequent changes would modify the signature of makeJsonLexContext() so its first argument would be a text* instead of a char* (and of course its logic would change accordingly).

I could go either way. Thoughts?

cheers

andrew
On 01/08/2013 03:12 PM, Andrew Dunstan wrote:
>
> On 01/08/2013 09:58 AM, Andrew Dunstan wrote:
>>
>> If you have such a datum, parsing it involves having it in memory and
>> then taking a copy (I wonder if we could avoid that step - will take
>> a look).
>
> Here is a Proof Of Concept patch against my development tip on what's
> involved in getting the JSON lexer not to need a nul-terminated string
> to parse. This passes regression, incidentally. The downside is that
> processing is very slightly more complex, and that json_in() would
> need to call strlen() on its input. The upside would be that the
> processing routines I've been working on would no longer need to
> create copies of their json arguments using text_to_cstring() just so
> they can get a null-terminated string to process.
>
> Consequent changes would modify the signature of makeJsonLexContext()
> so its first argument would be a text* instead of a char* (and of
> course its logic would change accordingly).
>
> I could go either way. Thoughts?

this time with patch ...
Attachments
On 01/08/2013 03:07 PM, james wrote:
> Yes - but I don't think I can use COPY from a stored proc context, can
> I? If I could use binary COPY from a stored proc that has received a
> binary param and unpacked the data, it would be handy.

You can use COPY from a stored procedure, but only to and from files.

> If SPI provided a way to perform a copy to a temp table and then some
> callback on an iterator that yields rows to it, that would do the
> trick I guess.

SPI is useful, but it's certainly possible to avoid its use. After all, that's what almost the whole backend does, including the COPY code. Of course, it's a lot harder to write that way, which is part of why SPI exists. Efficiency has its price.

cheers

andrew
On 1/7/13 5:15 PM, Andrew Dunstan wrote:
> You (Merlin) have kindly volunteered to work on documentation, so before
> we go too far with that any bikeshedding on names, or on the
> functionality being provided, should now take place.

Hmm, I was going to say, this patch contains no documentation, so I have no idea what it is supposed to do. "Recently discussed" isn't a good substitute for describing what the patch is supposed to accomplish.
> You can use COPY from a stored procedure, but only to and from files.

I think that's in the chocolate fireguard realm though as far as efficiency for this sort of scenario goes, even if it's handled by retaining an mmap'd file as workspace.

>> If SPI provided a way to perform a copy to a temp table and then some callback on an iterator that yields rows to it,
>> that would do the trick I guess.
>
> SPI is useful, but it's certainly possible to avoid its use. After all, that's what almost the whole backend does,
> including the COPY code. Of course, it's a lot harder to write that way, which is part of why SPI exists. Efficiency
> has its price.

So it is possible to use a lower level interface from a C stored proc? SPI is the (only) documented direct function extension API, isn't it?

Is the issue with using the JSON data-to-record set that the parsing can be costly? Perhaps it can be achieved with B64 of compressed protobuf, or such. I don't mind if it seems a bit messy - the code can be generated from the table easily enough, especially if I can use C++.

I guess an allocator that uses SPI_palloc would solve issues with memory management on error?
On Tue, Jan 8, 2013 at 3:19 PM, Peter Eisentraut <peter_e@gmx.net> wrote:
> On 1/7/13 5:15 PM, Andrew Dunstan wrote:
>> You (Merlin) have kindly volunteered to work on documentation, so before
>> we go too far with that any bikeshedding on names, or on the
>> functionality being provided, should now take place.
>
> Hmm, I was going to say, this patch contains no documentation, so I have
> no idea what it is supposed to do. "Recently discussed" isn't a good
> substitute for describing what the patch is supposed to accomplish.

Why not? There are functional examples in the docs and the purpose of the various functions was hashed out pretty well a couple weeks back, deficiencies corrected, etc.

reference: http://postgresql.1045698.n5.nabble.com/json-accessors-td5733929.html

merlin
On 01/08/2013 04:32 PM, Merlin Moncure wrote:
> On Tue, Jan 8, 2013 at 3:19 PM, Peter Eisentraut <peter_e@gmx.net> wrote:
>> On 1/7/13 5:15 PM, Andrew Dunstan wrote:
>>> You (Merlin) have kindly volunteered to work on documentation, so before
>>> we go too far with that any bikeshedding on names, or on the
>>> functionality being provided, should now take place.
>> Hmm, I was going to say, this patch contains no documentation, so I have
>> no idea what it is supposed to do. "Recently discussed" isn't a good
>> substitute for describing what the patch is supposed to accomplish.
> Why not? There are functional examples in the docs and the purpose of
> the various functions was hashed out pretty well a couple weeks back,
> deficiencies corrected, etc.
>
> reference: http://postgresql.1045698.n5.nabble.com/json-accessors-td5733929.html

Well, at a high level the patch is meant to do two things: provide an API that can be used to build JSON processing functions easily, and provide some basic json processing functions built on the API. Those functions provide similar capabilities to the accessor functions that hstore has.

Perhaps also this will help. Here is the list of functions and operators as currently implemented. I also have working operators for the get_path functions which will be in a future patch. All these are used in the included regression tests.
           Name           | Result data type |                          Argument data types
--------------------------+------------------+-------------------------------------------------------------------------
 json_array_length        | integer          | json
 json_each                | SETOF record     | from_json json, OUT key text, OUT value json
 json_each_as_text        | SETOF record     | from_json json, OUT key text, OUT value text
 json_get                 | json             | json, integer
 json_get                 | json             | json, text
 json_get_as_text         | text             | json, integer
 json_get_as_text         | text             | json, text
 json_get_path            | json             | from_json json, VARIADIC path_elems text[]
 json_get_path_as_text    | text             | from_json json, VARIADIC path_elems text[]
 json_object_keys         | SETOF text       | json
 json_populate_record     | anyelement       | anyelement, json
 json_populate_recordset  | SETOF anyelement | base anyelement, from_json json, use_json_as_text boolean DEFAULT false
 json_unnest              | SETOF json       | from_json json, OUT value json

 Name | Left arg type | Right arg type | Result type |          Description
------+---------------+----------------+-------------+--------------------------------
 ->   | json          | integer        | json        | get json array element
 ->   | json          | text           | json        | get json object field
 ->>  | json          | integer        | text        | get json array element as text
 ->>  | json          | text           | text        | get json object field as text

cheers

andrew
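Reading the operator table above together with the zero-based array indexing discussed elsewhere in this thread, usage would look roughly like this (a sketch against the posted patch, not a released PostgreSQL):

```sql
SELECT '[10,20,30]'::json -> 1;      -- second element, since indexes are zero-based
SELECT '{"a":"x"}'::json -> 'a';     -- the field as json: "x"
SELECT '{"a":"x"}'::json ->> 'a';    -- the field as de-escaped text: x
```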
On Mon, Jan 7, 2013 at 4:15 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
> You (Merlin) have kindly volunteered to work on documentation, so before we
> go too far with that any bikeshedding on names, or on the functionality
> being provided, should now take place.

Barring comment/complaint, I'm just going to roll with what we've got. It seems pretty good to me.

merlin
On 01/04/2013 03:23 PM, Andrew Dunstan wrote:
>
> On 01/02/2013 05:51 PM, Andrew Dunstan wrote:
>>
>> On 01/02/2013 04:45 PM, Robert Haas wrote:
>>> On Wed, Dec 26, 2012 at 3:33 PM, Andrew Dunstan
>>> <andrew@dunslane.net> wrote:
>>>> Here is a patch for the first part of the JSON API that was recently
>>>> discussed. It includes the json parser hook infrastructure and
>>>> functions
>>>> for json_get and friends, plus json_keys.
>
> Updated patch that contains most of the functionality I'm after. One
> piece left is populate_recordset (populate a set of records from a
> single json datum which is an array of objects, in one pass). That
> requires a bit of thought.
>
> I hope most of the whitespace issues are fixed.

This updated patch contains all the intended functionality, including operators for the json_get_path functions, so you can say things like

    select jsonval->array['f1','0','f2'] ...

It also removes any requirement to copy the json value before setting up the lexer by removing the lexer's requirement to have a nul terminated string. Instead the lexer is told the input length and relies on that. For this reason, json_in() now calls cstring_to_text() before rather than after calling the validation routine, but that's really not something worth bothering about.

A couple of points worth noting: it's a pity that we have to run CREATE OR REPLACE FUNCTION in system_views.sql in order to set up default values for builtin functions. That feels very kludgy. Also, making operators for variadic functions is a bit of a pain. I had to set up non-variadic versions of the same functions (see json_get_path_op and json_get_path_as_text_op) just so I could set up the operators. Neither of these are exactly showstopper items, just mild annoyances.

I will continue hunting memory leaks, but when Merlin gets done with docco I think we'll be far enough advanced to add this to the commitfest.

cheers

andrew
Attachments
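The jsonval->array[...] form shown in the message above walks a path one element at a time, with text elements serving as object keys and numeric-looking elements as array indexes. A concrete (hypothetical) example of the same shape:

```sql
-- 'f1' selects an object field, '0' the first array element (zero-based),
-- 'f2' a field of that element.
SELECT '{"f1": [{"f2": 42}]}'::json -> array['f1','0','f2'];
-- would yield 42 as json
```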
On Thu, Jan 10, 2013 at 6:42 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
>> Updated patch that contains most of the functionality I'm after. One piece
>> left is populate_recordset (populate a set of records from a single json
>> datum which is an array of objects, in one pass). That requires a bit of
>> thought.
>>
>> I hope most of the whitespace issues are fixed.
>
> This updated patch contains all the intended functionality, including
> operators for the json_get_path functions, so you can say things like
>
>     select jsonval->array['f1','0','f2'] ...
>
> It also removes any requirement to copy the json value before setting up the
> lexer by removing the lexer's requirement to have a nul terminated string.
> Instead the lexer is told the input length and relies on that. For this
> reason, json_in() now calls cstring_to_text() before rather than after
> calling the validation routine, but that's really not something worth
> bothering about.
>
> A couple of points worth noting: it's a pity that we have to run CREATE OR
> REPLACE FUNCTION in system_views.sql in order to set up default values for
> builtin functions. That feels very kludgy. Also, making operators for
> variadic functions is a bit of a pain. I had to set up non-variadic versions
> of the same functions (see json_get_path_op and json_get_path_as_text_op)
> just so I could set up the operators. Neither of these are exactly
> showstopper items, just mild annoyances.
>
> I will continue hunting memory leaks, but when Merlin gets done with docco I
> think we'll be far enough advanced to add this to the commitfest.

So, how much performance does this lose on json_in() on a large cstring, as compared with master?

I can't shake the feeling that this is adding a LOT of unnecessary data copying. For one thing, instead of copying every single lexeme (including the single-character ones?) out of the original object, we could just store a pointer to the offset where the object starts and a length, instead of copying it.

This is also remarkably thin on comments.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 01/14/2013 11:32 AM, Robert Haas wrote:
> So, how much performance does this lose on json_in() on a large
> cstring, as compared with master?

That's a good question. I'll try to devise a test.

> I can't shake the feeling that this is adding a LOT of unnecessary
> data copying. For one thing, instead of copying every single lexeme
> (including the single-character ones?) out of the original object, we
> could just store a pointer to the offset where the object starts and a
> length, instead of copying it.

In the pure parse case (json_in, json_recv) nothing extra should be copied. On checking this after reading the above I found that wasn't quite the case, and some lexemes (scalars and field names, but not punctuation) were being copied when not needed. I have made a fix (see <https://bitbucket.org/adunstan/pgdevel/commits/139043dba7e6b15f1f9f7675732bd9dae1fb6497>) which I will include in the next version I publish.

In the case of string lexemes, we are passing back a de-escaped version, so just handing back pointers to the beginning and end in the input string doesn't work.

> This is also remarkably thin on comments.

Fair criticism. I'll work on that. Thanks for looking at this.

cheers

andrew
On Thu, Jan 10, 2013 at 5:42 PM, Andrew Dunstan <andrew@dunslane.net> wrote: > > On 01/04/2013 03:23 PM, Andrew Dunstan wrote: >> >> >> On 01/02/2013 05:51 PM, Andrew Dunstan wrote: >>> >>> >>> On 01/02/2013 04:45 PM, Robert Haas wrote: >>>> >>>> On Wed, Dec 26, 2012 at 3:33 PM, Andrew Dunstan <andrew@dunslane.net> >>>> wrote: >>>>> >>>>> Here is a patch for the first part of the JSON API that was recently >>>>> discussed. It includes the json parser hook infrastructure and >>>>> functions >>>>> for json_get and friends, plus json_keys. >> >> >> >> Udated patch that contains most of the functionality I'm after. One piece >> left is populate_recordset (populate a set of records from a single json >> datum which is an array of objects, in one pass). That requires a bit of >> thought. >> >> I hope most of the whitespace issues are fixed. >> >> > > > This updated patch contains all the intended functionality, including > operators for the json_get_path functions, so you can say things like > > select jsonval->array['f1','0','f2] ... > > It also removes any requirement to copy the json value before setting up the > lexer by removing the lexer's requirement to have a nul terminated string. > Instead the lexer is told the input length and relies on that. For this > reason, json_in() now calls cstring_get_text() before rather than after > calling the validation routine, but that's really not something worth > bothering about. > > A couple of points worth noting: it's a pity that we have to run CREATE OR > REPLACE FUNCTION in system_views.sql in order to set up default values for > builtin functions. That feels very kludgy. Also, making operators for > variadic functions is a bit of a pain. I had to set up non-variadic version > of the same functions (see json_get_path_op and json_get_path_as_text_op) > just so I could set up the operators. Neither of these are exactly > showstopper items, just mild annoyances. 
> I will continue hunting memory leaks, but when Merlin gets done with
> docco I think we'll be far enough advanced to add this to the
> commitfest.

While testing this I noticed that integer based 'get' routines are
zero based -- was this intentional? Virtually all other aspects of
SQL are 1 based:

postgres=# select json_get('[1,2,3]', 1);
 json_get
----------
 2
(1 row)

postgres=# select json_get('[1,2,3]', 0);
 json_get
----------
 1
(1 row)

merlin
On 01/14/2013 07:36 PM, Merlin Moncure wrote:
> While testing this I noticed that integer based 'get' routines are
> zero based -- was this intentional? Virtually all other aspects of
> SQL are 1 based:
>
> postgres=# select json_get('[1,2,3]', 1);
>  json_get
> ----------
>  2
> (1 row)
>
> postgres=# select json_get('[1,2,3]', 0);
>  json_get
> ----------
>  1
> (1 row)

Yes, it's intentional. SQL arrays might be 1-based by default, but
JavaScript arrays are not. JsonPath and similar gadgets treat the
arrays as zero-based. I suspect the JSON-using community would not
thank us for being overly SQL-centric on this - and I say that as
someone who has always thought zero-based arrays were a major design
mistake, responsible for countless off-by-one errors.

cheers

andrew
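[Editor's note: the convention being defended here is simply the JavaScript/JSON one. The snippet below illustrates it with Python's json module standing in for any JSON-consuming client; it is not the patch's code.]

```python
# JSON arrays, like JavaScript arrays, are zero-based, whereas SQL
# arrays are 1-based by default - hence the mismatch Merlin noticed.
import json

arr = json.loads('[1,2,3]')
assert arr[0] == 1   # zero-based: the first element is at index 0
assert arr[1] == 2   # matches what json_get('[1,2,3]', 1) returned above
```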
On 01/14/2013 12:52 PM, Andrew Dunstan wrote: > > On 01/14/2013 11:32 AM, Robert Haas wrote: >> >> So, how much performance does this lose on json_in() on a large >> cstring, as compared with master? > > That's a good question. I'll try to devise a test. > >> >> I can't shake the feeling that this is adding a LOT of unnecessary >> data copying. For one thing, instead of copying every single lexeme >> (including the single-character ones?) out of the original object, we >> could just store a pointer to the offset where the object starts and a >> length, instead of copying it. > > In the pure pares case (json_in, json_reccv) nothing extra should be > copied. On checking this after reading the above I found that wasn't > quite the case, and some lexemes (scalars and field names, but not > punctuation) were being copied when not needed. I have made a fix (see > <https://bitbucket.org/adunstan/pgdevel/commits/139043dba7e6b15f1f9f7675732bd9dae1fb6497>) > which I will include in the next version I publish. > > In the case of string lexemes, we are passing back a de-escaped > version, so just handing back pointers to the beginning and end in the > input string doesn't work. After a couple of iterations, some performance enhancements to the json parser and lexer have ended up with a net performance improvement over git tip. On our test rig, the json parse test runs at just over 13s per 10000 parses on git tip and approx 12.55s per 10000 parses with the attached patch. Truth be told, I think the lexer changes have more than paid for the small cost of the switch to an RD parser. But since the result is a net performance win PLUS some enhanced functionality, I think we should be all good. cheers andrew
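[Editor's note: for readers unfamiliar with the term, an "RD parser" is a recursive-descent parser - one function per grammar production, calling each other recursively. Below is a deliberately tiny Python sketch of the style; the actual patch implements this in C inside the backend, with proper error reporting and string de-escaping, both of which this toy omits.]

```python
# A toy recursive-descent JSON parser, sketched only to illustrate the
# parsing style referred to above; every parse_* method corresponds to
# one production of the JSON grammar.
import re

class Parser:
    TOKEN = re.compile(r'\s*("(?:[^"\\]|\\.)*"|-?\d+(?:\.\d+)?'
                       r'|true|false|null|[{}\[\],:])')

    def __init__(self, text):
        self.tokens = self.TOKEN.findall(text)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos]

    def next(self):
        tok = self.tokens[self.pos]
        self.pos += 1
        return tok

    def parse_value(self):
        tok = self.next()
        if tok == '{':
            return self.parse_object()
        if tok == '[':
            return self.parse_array()
        if tok.startswith('"'):
            return tok[1:-1]          # (de-escaping omitted in this toy)
        if tok in ('true', 'false', 'null'):
            return {'true': True, 'false': False, 'null': None}[tok]
        return float(tok) if '.' in tok else int(tok)

    def parse_object(self):
        obj = {}
        if self.peek() == '}':
            self.next()
            return obj
        while True:
            key = self.next()[1:-1]   # strip quotes from the field name
            assert self.next() == ':'
            obj[key] = self.parse_value()
            if self.next() == '}':
                return obj            # otherwise the token was ','

    def parse_array(self):
        arr = []
        if self.peek() == ']':
            self.next()
            return arr
        while True:
            arr.append(self.parse_value())
            if self.next() == ']':
                return arr

print(Parser('{"a": [1, 2, {"b": true}]}').parse_value())
# → {'a': [1, 2, {'b': True}]}
```

The attraction of the style, as the thread suggests, is that the parser's structure follows the grammar directly, which makes it easy to attach per-production hooks - at the cost of one function call per nesting level.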
Attachments
On 01/14/2013 11:02 PM, Andrew Dunstan wrote: > > On 01/14/2013 12:52 PM, Andrew Dunstan wrote: >> >> On 01/14/2013 11:32 AM, Robert Haas wrote: >>> >>> So, how much performance does this lose on json_in() on a large >>> cstring, as compared with master? >> >> That's a good question. I'll try to devise a test. >> >>> >>> I can't shake the feeling that this is adding a LOT of unnecessary >>> data copying. For one thing, instead of copying every single lexeme >>> (including the single-character ones?) out of the original object, we >>> could just store a pointer to the offset where the object starts and a >>> length, instead of copying it. >> >> In the pure pares case (json_in, json_reccv) nothing extra should be >> copied. On checking this after reading the above I found that wasn't >> quite the case, and some lexemes (scalars and field names, but not >> punctuation) were being copied when not needed. I have made a fix >> (see >> <https://bitbucket.org/adunstan/pgdevel/commits/139043dba7e6b15f1f9f7675732bd9dae1fb6497>) >> which I will include in the next version I publish. >> >> In the case of string lexemes, we are passing back a de-escaped >> version, so just handing back pointers to the beginning and end in >> the input string doesn't work. > > > After a couple of iterations, some performance enhancements to the > json parser and lexer have ended up with a net performance improvement > over git tip. On our test rig, the json parse test runs at just over > 13s per 10000 parses on git tip and approx 12.55s per 10000 parses > with the attached patch. > > Truth be told, I think the lexer changes have more than paid for the > small cost of the switch to an RD parser. But since the result is a > net performance win PLUS some enhanced functionality, I think we > should be all good. > > Latest version of this patch, including some documentation, mainly from Merlin Moncure but tweaked by me. cheers andrew
Attachments
On Mon, Jan 14, 2013 at 07:52:56PM -0500, Andrew Dunstan wrote: > > On 01/14/2013 07:36 PM, Merlin Moncure wrote: > >While testing this I noticed that integer based 'get' routines are > >zero based -- was this intentional? Virtually all other aspects of > >SQL are 1 based: > > > >postgres=# select json_get('[1,2,3]', 1); > > json_get > >---------- > > 2 > >(1 row) > > > >postgres=# select json_get('[1,2,3]', 0); > > json_get > >---------- > > 1 > >(1 row) > > Yes. it's intentional. SQL arrays might be 1-based by default, but > JavaScript arrays are not. JsonPath and similar gadgets treat the > arrays as zero-based. I suspect the Json-using community would not > thank us for being overly SQL-centric on this - and I say that as > someone who has always thought zero based arrays were a major design > mistake, responsible for countless off-by-one errors. Perhaps we could compromise by making arrays 0.5-based. Cheers, David. -- David Fetter <david@fetter.org> http://fetter.org/ Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter Skype: davidfetter XMPP: david.fetter@gmail.com iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics Remember to vote! Consider donating to Postgres: http://www.postgresql.org/about/donate
On Tue, Jan 15, 2013 at 1:04 PM, David Fetter <david@fetter.org> wrote: > On Mon, Jan 14, 2013 at 07:52:56PM -0500, Andrew Dunstan wrote: >> >> On 01/14/2013 07:36 PM, Merlin Moncure wrote: >> >While testing this I noticed that integer based 'get' routines are >> >zero based -- was this intentional? Virtually all other aspects of >> >SQL are 1 based: >> > >> >postgres=# select json_get('[1,2,3]', 1); >> > json_get >> >---------- >> > 2 >> >(1 row) >> > >> >postgres=# select json_get('[1,2,3]', 0); >> > json_get >> >---------- >> > 1 >> >(1 row) >> >> Yes. it's intentional. SQL arrays might be 1-based by default, but >> JavaScript arrays are not. JsonPath and similar gadgets treat the >> arrays as zero-based. I suspect the Json-using community would not >> thank us for being overly SQL-centric on this - and I say that as >> someone who has always thought zero based arrays were a major design >> mistake, responsible for countless off-by-one errors. > > Perhaps we could compromise by making arrays 0.5-based. Well, I'm not prepared to argue with Andrew in this one. It was surprising behavior to me, but that's sample size one. merlin
On 01/15/2013 02:47 PM, Merlin Moncure wrote: > On Tue, Jan 15, 2013 at 1:04 PM, David Fetter <david@fetter.org> wrote: >> On Mon, Jan 14, 2013 at 07:52:56PM -0500, Andrew Dunstan wrote: >>> On 01/14/2013 07:36 PM, Merlin Moncure wrote: >>>> While testing this I noticed that integer based 'get' routines are >>>> zero based -- was this intentional? Virtually all other aspects of >>>> SQL are 1 based: >>>> >>>> postgres=# select json_get('[1,2,3]', 1); >>>> json_get >>>> ---------- >>>> 2 >>>> (1 row) >>>> >>>> postgres=# select json_get('[1,2,3]', 0); >>>> json_get >>>> ---------- >>>> 1 >>>> (1 row) >>> Yes. it's intentional. SQL arrays might be 1-based by default, but >>> JavaScript arrays are not. JsonPath and similar gadgets treat the >>> arrays as zero-based. I suspect the Json-using community would not >>> thank us for being overly SQL-centric on this - and I say that as >>> someone who has always thought zero based arrays were a major design >>> mistake, responsible for countless off-by-one errors. >> Perhaps we could compromise by making arrays 0.5-based. > Well, I'm not prepared to argue with Andrew in this one. It was > surprising behavior to me, but that's sample size one. > I doubt I'm very representative either. People like David Wheeler, Taras Mitran, Joe Van Dyk, and the Heroku guys would be better people to ask than me. I'm quite prepared to change it if that's the consensus. cheers andrew
On Tue, Jan 15, 2013 at 12:17 PM, Andrew Dunstan <andrew@dunslane.net> wrote: > > On 01/15/2013 02:47 PM, Merlin Moncure wrote: >> >> On Tue, Jan 15, 2013 at 1:04 PM, David Fetter <david@fetter.org> wrote: >>> >>> On Mon, Jan 14, 2013 at 07:52:56PM -0500, Andrew Dunstan wrote: >>>> >>>> On 01/14/2013 07:36 PM, Merlin Moncure wrote: >>>>> >>>>> While testing this I noticed that integer based 'get' routines are >>>>> zero based -- was this intentional? Virtually all other aspects of >>>>> SQL are 1 based: >>>>> >>>>> postgres=# select json_get('[1,2,3]', 1); >>>>> json_get >>>>> ---------- >>>>> 2 >>>>> (1 row) >>>>> >>>>> postgres=# select json_get('[1,2,3]', 0); >>>>> json_get >>>>> ---------- >>>>> 1 >>>>> (1 row) >>>> >>>> Yes. it's intentional. SQL arrays might be 1-based by default, but >>>> JavaScript arrays are not. JsonPath and similar gadgets treat the >>>> arrays as zero-based. I suspect the Json-using community would not >>>> thank us for being overly SQL-centric on this - and I say that as >>>> someone who has always thought zero based arrays were a major design >>>> mistake, responsible for countless off-by-one errors. >>> >>> Perhaps we could compromise by making arrays 0.5-based. >> >> Well, I'm not prepared to argue with Andrew in this one. It was >> surprising behavior to me, but that's sample size one. > > I doubt I'm very representative either. People like David Wheeler, Taras > Mitran, Joe Van Dyk, and the Heroku guys would be better people to ask than > me. I'm quite prepared to change it if that's the consensus. Hello. I'm inclined to go with the same gut feeling you had (zero-based-indexing). Here is the background for my reasoning: The downside of zero-based-indexing is that people who want to use multiple sequential container types will inevitably have to deal with detailed and not easily type-checked integer coordinates that mean different things in each domain that will, no doubt, lead to a number of off-by-one errors. 
Nevertheless, this cost is already paid because one of the first things
many people will do in programs generating SQL queries is try to
zero-index a SQL array, swear a bit after figuring things out (because a
NULL will be generated, not an error), and then adjust all the offsets.
So, this is not a new problem. On many occasions I'm sure this has
caused off-by-one bugs, or the NULLs slipped through testing and
delivered funny results, yet the world moves on.

On the other hand, the downside of going down the road of 1-based
indexing and attempting to attain relative sameness to SQL arrays is
that it would also feel like one would be obliged to implement SQL array
infelicities like 'out of bounds' being SQL NULL rather than an error,
related to other spectres like non-rectangular nested arrays. SQL array
semantics are complex and The Committee can change them or -- slightly
more likely -- add interactions, so it seems like a general expectation
that Postgres container types that happen to have any reasonable ordinal
addressing will implement some level of same-ness with SQL arrays is a
very messy one. As such, if it becomes customary to implement one-based
indexing of containers, I think such customs are best carefully
circumscribed so that attempts to be 'like' SQL arrays are only as
superficial as that.

What made me come down on the side of zero-based indexing in spite of
the weaknesses are these two reasons:

* The number of people who use JSON and zero-based indexing is very
  large, and furthermore, within that set the number that know that SQL
  even defines array support -- much less that Postgres implements it --
  is much smaller. Thus, one is targeting cohesion with a fairly alien
  concept that is not understood by the audience.

* Maintaining PL-integrated code that uses both 1-based indexing in PG
  functions and 0-based indexing in embedded languages that are likely
  to be combined with JSON -- doesn't sound very palatable, and the use
  of such PLs (e.g.
plv8) seems pretty likely, too. That can probably be a rich source of
bugs and frustration.

If one wants SQL array semantics, it seems like the right way to get
them is coercion to a SQL array value. Then one will receive SQL array
semantics exactly.

-- fdr
<div class="moz-cite-prefix">On 16/01/13 08:04, David Fetter wrote:<br /></div><blockquote cite="mid:20130115190424.GB32407@fetter.org"type="cite"><pre wrap="">On Mon, Jan 14, 2013 at 07:52:56PM -0500, Andrew Dunstanwrote: </pre><blockquote type="cite"><pre wrap=""> On 01/14/2013 07:36 PM, Merlin Moncure wrote: </pre><blockquote type="cite"><pre wrap="">While testing this I noticed that integer based 'get' routines are zero based -- was this intentional? Virtually all other aspects of SQL are 1 based: postgres=# select json_get('[1,2,3]', 1);json_get ----------2 (1 row) postgres=# select json_get('[1,2,3]', 0);json_get ----------1 (1 row) </pre></blockquote><pre wrap=""> Yes. it's intentional. SQL arrays might be 1-based by default, but JavaScript arrays are not. JsonPath and similar gadgets treat the arrays as zero-based. I suspect the Json-using community would not thank us for being overly SQL-centric on this - and I say that as someone who has always thought zero based arrays were a major design mistake, responsible for countless off-by-one errors. </pre></blockquote><pre wrap=""> Perhaps we could compromise by making arrays 0.5-based. Cheers, David. </pre></blockquote><font size="-1">I think that <font size="-1">is far to rational, perhaps the reciprocal of the goldenratio<font size="-1"> (0.618033...)</font> </font></font>would be more appropriate?<br /><br /> I used to be insistentthat arrays should start with 1, now I find starting at 0 far more natural - because evrytime you start an arrayat 1, the computer has to subtract 1 in order to calculate the entry. Also both Java & C are zero based.<br /><br/> I first learnt FORTRAN IV which is 1 based, had a shock when I was learning Algol and found it was 0 based - manymoons ago...<br /><br /> <br /> Cheers,<br /> Gavin<br />
On Jan 15, 2013, at 12:17 PM, Andrew Dunstan <andrew@dunslane.net> wrote:

> I doubt I'm very representative either. People like David Wheeler,
> Taras Mitran, Joe Van Dyk, and the Heroku guys would be better people
> to ask than me. I'm quite prepared to change it if that's the
> consensus.

They're JSON arrays, not SQL arrays, and JSON arrays are based on
JavaScript, where they are 0-based.

Best,

David
On 01/15/2013 11:31 AM, Andrew Dunstan wrote: > > On 01/14/2013 11:02 PM, Andrew Dunstan wrote: >> >> On 01/14/2013 12:52 PM, Andrew Dunstan wrote: >>> >>> On 01/14/2013 11:32 AM, Robert Haas wrote: >>>> >>>> So, how much performance does this lose on json_in() on a large >>>> cstring, as compared with master? >>> >>> That's a good question. I'll try to devise a test. >>> >>>> >>>> I can't shake the feeling that this is adding a LOT of unnecessary >>>> data copying. For one thing, instead of copying every single lexeme >>>> (including the single-character ones?) out of the original object, we >>>> could just store a pointer to the offset where the object starts and a >>>> length, instead of copying it. >>> >>> In the pure pares case (json_in, json_reccv) nothing extra should be >>> copied. On checking this after reading the above I found that wasn't >>> quite the case, and some lexemes (scalars and field names, but not >>> punctuation) were being copied when not needed. I have made a fix >>> (see >>> <https://bitbucket.org/adunstan/pgdevel/commits/139043dba7e6b15f1f9f7675732bd9dae1fb6497>) >>> which I will include in the next version I publish. >>> >>> In the case of string lexemes, we are passing back a de-escaped >>> version, so just handing back pointers to the beginning and end in >>> the input string doesn't work. >> >> >> After a couple of iterations, some performance enhancements to the >> json parser and lexer have ended up with a net performance >> improvement over git tip. On our test rig, the json parse test runs >> at just over 13s per 10000 parses on git tip and approx 12.55s per >> 10000 parses with the attached patch. >> >> Truth be told, I think the lexer changes have more than paid for the >> small cost of the switch to an RD parser. But since the result is a >> net performance win PLUS some enhanced functionality, I think we >> should be all good. 
>> >> > > Latest version of this patch, including some documentation, mainly > from Merlin Moncure but tweaked by me. > Now with more comments. cheers andrew
Attachments
On Mon, Jan 14, 2013 at 11:02 PM, Andrew Dunstan <andrew@dunslane.net> wrote: > After a couple of iterations, some performance enhancements to the json > parser and lexer have ended up with a net performance improvement over git > tip. On our test rig, the json parse test runs at just over 13s per 10000 > parses on git tip and approx 12.55s per 10000 parses with the attached > patch. > > Truth be told, I think the lexer changes have more than paid for the small > cost of the switch to an RD parser. But since the result is a net > performance win PLUS some enhanced functionality, I think we should be all > good. Yeah, that sounds great. Thanks for putting in the effort. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 01/15/2013 05:45 PM, Andrew Dunstan wrote: >>> >> >> Latest version of this patch, including some documentation, mainly >> from Merlin Moncure but tweaked by me. >> > > > Now with more comments. > New version attached. The only change is to remove some unnecessary uses of PG_FUNCTION_INFO_V1 that were the result of overenthusiastic copy and paste. cheers andrew
Attachments
On 1/10/13 6:42 PM, Andrew Dunstan wrote:
> This updated patch contains all the intended functionality, including
> operators for the json_get_path functions, so you can say things like
>
> select jsonval->array['f1','0','f2'] ...

I would like to not create any -> operators, so that that syntax could
be used in the future for method invocation or something similar (it's
in the SQL standard).

I also don't find the proposed use to be very intuitive. You invented
lots of other function names -- why not invent a few more for this
purpose that are clearer?
On 01/31/2013 05:06 PM, Peter Eisentraut wrote: > On 1/10/13 6:42 PM, Andrew Dunstan wrote: >> This updated patch contains all the intended functionality, including >> operators for the json_get_path functions, so you can say things like >> >> select jsonval->array['f1','0','f2] ... > I would like to not create any -> operators, so that that syntax could > be used in the future for method invocation or something similar (it's > in the SQL standard). This is the first time I have heard that we should stay away from this. We have operators with this name in hstore, which is why I chose it. Have we officially deprecated '->'? I know we deprecated "=>", but I simply don't recall anything about '->'. > > I also don't find the proposed use to be very intuitive. You invented > lots of other function names -- why not invent a few more for this > purpose that are clearer? > > I'm happy to take opinions about this, and I expected some bikeshedding, but your reaction is contrary to everything others have told me. Mostly they love the operators. I guess that '~>' and '~>>' would work as well as '->' and '->>'. cheers andrew
On Thu, Jan 31, 2013 at 4:20 PM, Andrew Dunstan <andrew@dunslane.net> wrote: > > On 01/31/2013 05:06 PM, Peter Eisentraut wrote: >> >> On 1/10/13 6:42 PM, Andrew Dunstan wrote: >>> >>> This updated patch contains all the intended functionality, including >>> operators for the json_get_path functions, so you can say things like >>> >>> select jsonval->array['f1','0','f2] ... >> >> I would like to not create any -> operators, so that that syntax could >> be used in the future for method invocation or something similar (it's >> in the SQL standard). > > > > This is the first time I have heard that we should stay away from this. We > have operators with this name in hstore, which is why I chose it. > > Have we officially deprecated '->'? I know we deprecated "=>", but I simply > don't recall anything about '->'. > > >> >> I also don't find the proposed use to be very intuitive. You invented >> lots of other function names -- why not invent a few more for this >> purpose that are clearer? >> >> > > > I'm happy to take opinions about this, and I expected some bikeshedding, but > your reaction is contrary to everything others have told me. Mostly they > love the operators. > > I guess that '~>' and '~>>' would work as well as '->' and '->>'. also hstore implements -> quick off-topic aside: is colon (:) reserved for any purpose as an operator in SQL? merlin
Merlin Moncure <mmoncure@gmail.com> writes: > On Thu, Jan 31, 2013 at 4:20 PM, Andrew Dunstan <andrew@dunslane.net> wrote: >> On 01/31/2013 05:06 PM, Peter Eisentraut wrote: >>> I would like to not create any -> operators, so that that syntax could >>> be used in the future for method invocation or something similar (it's >>> in the SQL standard). >> This is the first time I have heard that we should stay away from this. We >> have operators with this name in hstore, which is why I chose it. I'm not happy about this either. It's bad enough that we're thinking about taking away =>, but to disallow -> as well? My inclination is to just say no, we're not implementing that. Even if we remove the contrib operators named that way, it's insane to suppose that nobody has chosen these names for user-defined operators in their applications. > quick off-topic aside: is colon (:) reserved for any purpose as an > operator in SQL? We disallow it as an operator character, because of the conflict with parameter/variable syntax in ecpg and psql. It was allowed before PG 7.0, IIRC. regards, tom lane
On Jan 31, 2013, at 2:20 PM, Andrew Dunstan <andrew@dunslane.net> wrote:

> I'm happy to take opinions about this, and I expected some
> bikeshedding, but your reaction is contrary to everything others have
> told me. Mostly they love the operators.
>
> I guess that '~>' and '~>>' would work as well as '->' and '->>'.

Or +> and +>>, since ~ is set very high and small by some fonts (where
the fontmakers thought of it as a kind of superscript character).

I suppose that := is out of the question?

Best,

David
On 01/31/2013 07:16 PM, David E. Wheeler wrote: > On Jan 31, 2013, at 2:20 PM, Andrew Dunstan <andrew@dunslane.net> wrote: > >> I'm happy to take opinions about this, and I expected some bikeshedding, but your reaction is contrary to everything othershave told me. Mostly they love the operators. >> >> I guess that '~>' and '~>>' would work as well as '->' and '->>'. > Or +> and +>>, since ~ is set very high and small by some fonts (where the fontmakers though of it as a kind of superscriptcharacter). > > I suppose that := is out of the question? > Even if it were I would not on any account use it. As an old Ada programmer my mind just revolts at the idea of using this for anything but assignment. cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > On 01/31/2013 07:16 PM, David E. Wheeler wrote: >> I suppose that := is out of the question? > Even if it were I would not on any account use it. As an old Ada > programmer my mind just revolts at the idea of using this for anything > but assignment. Ada or no, its use in plpgsql would render that a seriously bad idea. regards, tom lane
On Jan 31, 2013, at 4:32 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Ada or no, its use in plpgsql would render that a seriously bad idea. I assumed that its use in function params would be the main reason not to use it. David
<div class="moz-cite-prefix">On 01/02/13 13:26, Andrew Dunstan wrote:<br /></div><blockquote cite="mid:510B0BD2.4010000@dunslane.net"type="cite"><br /> On 01/31/2013 07:16 PM, David E. Wheeler wrote: <br /><blockquotetype="cite">On Jan 31, 2013, at 2:20 PM, Andrew Dunstan <a class="moz-txt-link-rfc2396E" href="mailto:andrew@dunslane.net"><andrew@dunslane.net></a>wrote: <br /><br /><blockquote type="cite">I'm happy totake opinions about this, and I expected some bikeshedding, but your reaction is contrary to everything others have toldme. Mostly they love the operators. <br /><br /> I guess that '~>' and '~>>' would work as well as '->' and'->>'. <br /></blockquote> Or +> and +>>, since ~ is set very high and small by some fonts (where the fontmakersthough of it as a kind of superscript character). <br /><br /> I suppose that := is out of the question? <br /><br/></blockquote><br /><br /> Even if it were I would not on any account use it. As an old Ada programmer my mind justrevolts at the idea of using this for anything but assignment. <br /><br /> cheers <br /><br /> andrew <br /><br /><br/></blockquote><font size="-1">Ancient Al<font size="-1">gol 60 programmer here, otherwise ditto!</font></font><br />
On Thu, Jan 31, 2013 at 7:12 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Merlin Moncure <mmoncure@gmail.com> writes: >> On Thu, Jan 31, 2013 at 4:20 PM, Andrew Dunstan <andrew@dunslane.net> wrote: >>> On 01/31/2013 05:06 PM, Peter Eisentraut wrote: >>>> I would like to not create any -> operators, so that that syntax could >>>> be used in the future for method invocation or something similar (it's >>>> in the SQL standard). > >>> This is the first time I have heard that we should stay away from this. We >>> have operators with this name in hstore, which is why I chose it. > > I'm not happy about this either. It's bad enough that we're thinking > about taking away =>, but to disallow -> as well? My inclination is to > just say no, we're not implementing that. Even if we remove the contrib > operators named that way, it's insane to suppose that nobody has chosen > these names for user-defined operators in their applications. I think it's smarter for us to ship functions, and let users wrap them in operators if they so choose. It's not difficult for people who want it to do it, and that way we dodge this whole mess. The thing that was really awful about hstore's => operator is that it was =>(text, text), meaning that if somebody else invented, say, xstore, they could not do the same thing that hstore did without colliding with hstore. I think we ought to have an ironclad rule that any operators introduced in our core distribution for particular types must have at least one argument of that type. Otherwise, we'll end up with a free-for-all where everyone tries to grab the "good" operator names (of which there are only a handful) for their type, and types we're adding five years from now will be stuck with no operator names at all, or dumb stuff like ~~~~>. But even leaving that aside, I'm surprised to hear so many people dismissing SQL standards compliance so blithely. 
We've certainly spent a lot of blood, sweat, and tears on minor
standards-compliance issues over the years - why is it OK to not care
about this particular issue when we've spent so much effort caring
about other ones?

-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Fri, Feb 1, 2013 at 2:08 PM, Robert Haas <robertmhaas@gmail.com> wrote: > On Thu, Jan 31, 2013 at 7:12 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Merlin Moncure <mmoncure@gmail.com> writes: >>> On Thu, Jan 31, 2013 at 4:20 PM, Andrew Dunstan <andrew@dunslane.net> wrote: >>>> On 01/31/2013 05:06 PM, Peter Eisentraut wrote: >>>>> I would like to not create any -> operators, so that that syntax could >>>>> be used in the future for method invocation or something similar (it's >>>>> in the SQL standard). >> >>>> This is the first time I have heard that we should stay away from this. We >>>> have operators with this name in hstore, which is why I chose it. >> >> I'm not happy about this either. It's bad enough that we're thinking >> about taking away =>, but to disallow -> as well? My inclination is to >> just say no, we're not implementing that. Even if we remove the contrib >> operators named that way, it's insane to suppose that nobody has chosen >> these names for user-defined operators in their applications. > > I think it's smarter for us to ship functions, and let users wrap them > in operators if they so choose. It's not difficult for people who > want it to do it, and that way we dodge this whole mess. Normally I'd agree with you, but I think there's a complexity here that is very frown-inducing, although I'm not positively inclined to state that it's so great that your suggested solution is not the least of all evils: http://www.postgresql.org/message-id/8551.1331580169@sss.pgh.pa.us The problem being: even though pg_operator resolves to functions in pg_proc, they have distinct identities as far as the planner is concerned w.r.t selectivity estimation and index selection. Already there is a slight hazard that some ORMs that want to grow hstore support will spell it "fetchval" and others "->". So far, infix syntax seems to be the common default, but a minor disagreement among ORMs or different user preferences will be destructive. 
Another way to look at this is that by depriving people multiple choices in the default install, that hazard goes away. But it also means that, practically, CREATE OPERATOR is going to be hazardous to use because almost all software is probably not going to assume the existence of any non-default installed operators for JSON, and those that do will not receive the benefit of indexes from software using the other notation. So, I think that if one takes the 'when in doubt, leave it out' approach you seem to be proposing, I also think that very little profitable use of CREATE OPERATOR will take place -- ORMs et al will choose the lowest common denominator (for good sensible reason) and flirting with CREATE OPERATOR will probably cause surprising plans, so I doubt it'll take hold. -- fdr
Daniel Farina <daniel@heroku.com> writes: > On Fri, Feb 1, 2013 at 2:08 PM, Robert Haas <robertmhaas@gmail.com> wrote: >> I think it's smarter for us to ship functions, and let users wrap them >> in operators if they so choose. It's not difficult for people who > The problem being: even though pg_operator resolves to functions in > pg_proc, they have distinct identities as far as the planner is > concerned w.r.t selectivity estimation and index selection. Yeah, this is surely not a workable policy unless we first move all those planner smarts to apply to functions not operators. And rewrite all the index AM APIs to use functions not operators, too. Now this is something that's been a wish-list item right along, but actually doing it has always looked like a great deal of work for rather small reward. regards, tom lane
On Fri, Feb 1, 2013 at 6:03 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Daniel Farina <daniel@heroku.com> writes: >> On Fri, Feb 1, 2013 at 2:08 PM, Robert Haas <robertmhaas@gmail.com> wrote: >>> I think it's smarter for us to ship functions, and let users wrap them >>> in operators if they so choose. It's not difficult for people who > >> The problem being: even though pg_operator resolves to functions in >> pg_proc, they have distinct identities as far as the planner is >> concerned w.r.t selectivity estimation and index selection. > > Yeah, this is surely not a workable policy unless we first move all > those planner smarts to apply to functions not operators. And rewrite > all the index AM APIs to use functions not operators, too. Now this is > something that's been a wish-list item right along, but actually doing > it has always looked like a great deal of work for rather small reward. Hmm. Well, if the operators are going to be indexable, then I agree that's an issue, but isn't -> just a key-extraction operator? That wouldn't be something you could index anyway. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 02/03/2013 08:20 PM, Robert Haas wrote: > On Fri, Feb 1, 2013 at 6:03 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Daniel Farina <daniel@heroku.com> writes: >>> On Fri, Feb 1, 2013 at 2:08 PM, Robert Haas <robertmhaas@gmail.com> wrote: >>>> I think it's smarter for us to ship functions, and let users wrap them >>>> in operators if they so choose. It's not difficult for people who >>> The problem being: even though pg_operator resolves to functions in >>> pg_proc, they have distinct identities as far as the planner is >>> concerned w.r.t selectivity estimation and index selection. >> Yeah, this is surely not a workable policy unless we first move all >> those planner smarts to apply to functions not operators. And rewrite >> all the index AM APIs to use functions not operators, too. Now this is >> something that's been a wish-list item right along, but actually doing >> it has always looked like a great deal of work for rather small reward. > Hmm. Well, if the operators are going to be indexable, then I agree > that's an issue, but isn't -> just a key-extraction operator? That > wouldn't be something you could index anyway. > Er, what? It gives you the value corresponding to a key (or the numbered array element). With the Json operators I provided you're more likely to use ->> in an index, because it returns de-escaped text rather than json, but I don't see any reason in principle why -> couldn't be used. cheers andrew
On Fri, Feb 1, 2013 at 4:08 PM, Robert Haas <robertmhaas@gmail.com> wrote: > But even leaving that aside, I'm surprised to hear so many people > dismissing SQL standards compliance so blithely. We've certainly > spent a lot of blood, sweat, and tears on minor standards-compliance > issues over the years - why is it OK to not care about this > particular issue when we've spent so much effort caring about other > ones? Does the SQL Standard suggest you can't extend the language with operators? Or does it reserve certain characters for future use? And if so, is there a list? merlin
On Sun, Feb 3, 2013 at 9:05 PM, Andrew Dunstan <andrew@dunslane.net> wrote: >>> Yeah, this is surely not a workable policy unless we first move all >>> those planner smarts to apply to functions not operators. And rewrite >>> all the index AM APIs to use functions not operators, too. Now this is >>> something that's been a wish-list item right along, but actually doing >>> it has always looked like a great deal of work for rather small reward. >> >> Hmm. Well, if the operators are going to be indexable, then I agree >> that's an issue, but isn't -> just a key-extraction operator? That >> wouldn't be something you could index anyway. > > Er, what? It gives you the value corresponding to a key (or the numbered > array element). That's what I figured. > With the Json operators I provided you're more likely to use ->> in an > index, because it returns de-escaped text rather than json, but I don't see > any reason in principle why -> couldn't be used. The point is that if you're talking about indexing something like myeav->'andrew' you could equally well index json_get(myeav, 'andrew'). So there's no real need for it to be an operator rather than a function. The case in which it would matter is if it were something that could be used as an index predicate, like: Index Scan -> Index Cond: myeav->'andrew' As of now, the query planner won't consider such a plan if it's only a function and not an operator. So if we had a case like that, the use of operator notation could be justified on performance grounds. If we don't, I argue that it's better to stick with functional notation, because the number of sensible function names is much larger than the number of sensible operator names, and if we start using operator notation every time someone thinks it will look nicer that way, we will very quickly either run out of nice-looking operator names or start overloading them heavily. The SQL standards considerations seem worth thinking about, too. 
We've certainly gone through a lot of pain working toward eliminating => as an operator name, and if the SQL standard has commandeered -> for some purpose or other, I'd really rather not add to the headaches involved should we ever decide to reclaim it. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
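The indexing distinction Robert draws can be sketched in SQL. (This is a sketch only: table and column names are invented, `->>` is the operator proposed in this thread, and `json_get_as_text` stands in for whatever function name the patch settles on, assuming it is declared IMMUTABLE.)

```sql
-- Either spelling works for an expression index, so no operator is
-- needed for that case (names hypothetical):
CREATE INDEX docs_andrew_op_idx   ON docs ((myeav ->> 'andrew'));
CREATE INDEX docs_andrew_fn_idx   ON docs (json_get_as_text(myeav, 'andrew'));

-- Operators matter when they are members of an operator class, as
-- with hstore's GIN support: the planner matches the operator itself
-- as an index condition.
CREATE INDEX t_attrs_idx ON t USING gin (attrs);
SELECT * FROM t WHERE attrs @> 'name=>andrew';  -- can become an index scan
-- There is no purely functional spelling the planner would match here.
```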
On 02/04/2013 10:47 AM, Robert Haas wrote: > > The SQL standards considerations seem worth thinking about, too. > We've certainly gone through a lot of pain working toward eliminating > => as an operator name, and if the SQL standard has commandeered -> > for some purpose or other, I'd really rather not add to the headaches > involved should we ever decide to reclaim it. OK, but I'd like to know what is going to be safe. There's no way to future-proof the language. I'm quite prepared to replace -> with something else, and if I do then ->> will need to be adjusted accordingly, I think. My suggestion would be ~> and ~>>. I know David Wheeler didn't like that on the ground that some fonts elevate ~ rather than aligning it in the middle as most monospaced fonts do, but I'm tempted just to say "then use a different font." Other possibilities that come to mind are +> and +>>, although I think they're less attractive. But I'll be guided by the consensus, assuming there is one ;-) cheers andrew
On Mon, Feb 4, 2013 at 4:10 PM, Andrew Dunstan <andrew@dunslane.net> wrote:
OK, but I'd like to know what is going to be safe. There's no way to future-proof the language. I'm quite prepared to replace -> with something else, and if I do then ->> will need to be adjusted accordingly, I think.
On 02/04/2013 10:47 AM, Robert Haas wrote:
The SQL standards considerations seem worth thinking about, too.
We've certainly gone through a lot of pain working toward eliminating
=> as an operator name, and if the SQL standard has commandeered ->
for some purpose or other, I'd really rather not add to the headaches
involved should we ever decide to reclaim it.
My suggestion would be ~> and ~>>. I know David Wheeler didn't like that on the ground that some fonts elevate ~ rather than aligning it in the middle as most monospaced fonts do, but I'm tempted just to say "then use a different font." Other possibilities that come to mind are +> and +>>, although I think they're less attractive. But I'll be guided by the consensus, assuming there is one ;-)
As a user I would be much in favor of just functions and no additional operators if the sole difference is syntactical. I think custom operators are much harder to remember than function names (assuming reasonably well chosen function names).
Now Robert seems to suggest that there will also be a speed / planner difference, which seems sad (I would have expected operators to be just syntactical sugar for specially named functions, and once we are past the parser there should be no difference).
On Feb 4, 2013, at 8:10 AM, Andrew Dunstan <andrew@dunslane.net> wrote: > My suggestion would be ~> and ~>>. I know David Wheeler didn't like that on the ground that some fonts elevate ~ rather than aligning it in the middle as most monospaced fonts do, but I'm tempted just to say "then use a different font." Other possibilities that come to mind are +> and +>>, although I think they're less attractive. But I'll be guided by the consensus, assuming there is one ;-) On the contrary, I quite like ~>. I've used it in pair. http://pgxn.org/extension/pair But others have complained about the font issue when I've suggested it for things in the past. My fonts don't suck. :-) I can live with +> and +>>. David
On Mon, Feb 4, 2013 at 10:18 AM, Benedikt Grundmann <bgrundmann@janestreet.com> wrote: > > > > On Mon, Feb 4, 2013 at 4:10 PM, Andrew Dunstan <andrew@dunslane.net> wrote: >> >> >> On 02/04/2013 10:47 AM, Robert Haas wrote: >>> >>> >>> The SQL standards considerations seem worth thinking about, too. >>> We've certainly gone through a lot of pain working toward eliminating >>> => as an operator name, and if the SQL standard has commandeered -> >>> for some purpose or other, I'd really rather not add to the headaches >>> involved should we ever decide to reclaim it. >> >> >> >> OK, but I'd like to know what is going to be safe. There's no way to >> future-proof the language. I'm quite prepared to replace -> with something >> else, and if I do then ->> will need to be adjusted accordingly, I think. >> >> My suggestion would be ~> and ~>>. I know David Wheeler didn't like that >> on the ground that some fonts elevate ~ rather than aligning it in the >> middle as most monospaced fonts do, but I'm tempted just to say "then use a >> different font." Other possibilities that come to mind are +> and +>>, >> although I think they're less attractive. But I'll be guided by the >> consensus, assuming there is one ;-) >> > As a user I would be much in favor of just functions and no additional > operators if the sole difference is syntactical. I think custom operators > are much harder to remember than function names (assuming reasonably well > chosen function names). couple quick observations: *) just about all postgres extension types expose operators -- problem is not specific to json (and therefore IMO irrelevant to 9.3 release of enhancements) *) hstore exposes ->. I use it all over the place. I find operator to be terse and readable -- much more so than function definition. Ok, operator such as "@-@" is pretty silly, but "->" for get is natural. The cat is out of the bag, so removing -> for 9.3 for production seems pretty fruitless. 
*) Removing -> (breaking all my and countless others' hstore dependent code) should not happen until there is a very good reason. This was extensively discussed in development of hstore. Breaking compatibility sucks -- my company is just wrapping up a 12 month code overhaul so we could get off 8.1. *) it's bad enough that we drift from sql naming conventions and prefix all type manipulation functions (except in hstore) with the type name. json_get etc. At least using operators allows avoiding some of that unnecessary verbosity. What's next: text_concatenate('foo', 'bar')? merlin
On 02/04/2013 12:57 PM, Merlin Moncure wrote: > *) it's bad enough that we drift from sql naming conventions and all > type manipulation functions (except in hstore) with type name. > json_get etc. at least using operators allow avoiding some of that > unnecessary verbosity. what's next: text_concatenate('foo', 'bar')? > These names aren't set in stone either. I've been expecting some bikeshedding there, and I'm surprised there hasn't been more. cheers andrew
On Mon, Feb 4, 2013 at 11:10 AM, Andrew Dunstan <andrew@dunslane.net> wrote: > On 02/04/2013 10:47 AM, Robert Haas wrote: >> >> >> The SQL standards considerations seem worth thinking about, too. >> We've certainly gone through a lot of pain working toward eliminating >> => as an operator name, and if the SQL standard has commandeered -> >> for some purpose or other, I'd really rather not add to the headaches >> involved should we ever decide to reclaim it. > > OK, but I'd like to know what is going to be safe. There's no way to > future-proof the language. I'm quite prepared to replace -> with something > else, and if I do then ->> will need to be adjusted accordingly, I think. > > My suggestion would be ~> and ~>>. I know David Wheeler didn't like that on > the ground that some fonts elevate ~ rather than aligning it in the middle > as most monospaced fonts do, but I'm tempted just to say "then use a > different font." Other possibilities that come to mind are +> and +>>, > although I think they're less attractive. But I'll be guided by the > consensus, assuming there is one ;-) I suspect both of those are pretty safe from an SQL standards point of view. Of course, as Tom is often wont to point out, the SQL standards committee sometimes does bizarre things, so nothing's perfect, but I'd be rather shocked if any of those got tapped to mean something else. That having been said, I still don't see value in adding operators at all. Good old function call notation seems perfectly adequate from where I sit. Sure, it's a little more verbose, but when you try too hard to make things concise then you end up having to explain to your users why \ditS is a sensible thing for them to type into psql, or why s@\W@sprintf"%%%02x",ord($&)@e in Perl. I recognize that I may lose this argument, but I've worked with a couple of languages where operators can be overloaded (C++) or defined (ML) and it's just never seemed to work out very well. YMMV, of course. 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Mon, Feb 4, 2013 at 12:07 PM, Andrew Dunstan <andrew@dunslane.net> wrote: > > On 02/04/2013 12:57 PM, Merlin Moncure wrote: > >> *) it's bad enough that we drift from sql naming conventions and all >> type manipulation functions (except in hstore) with type name. >> json_get etc. at least using operators allow avoiding some of that >> unnecessary verbosity. what's next: text_concatenate('foo', 'bar')? >> > > This names aren't set in stone either. I've been expecting some bikeshedding > there, and I'm surprised there hasn't been more. Well -- heh (asked to bikeshed: joy!) -- I felt like my objections were noted and am more interested in getting said functionality out the door than splitting hairs over names. Type prefix issue is under the same umbrella as use of the -> operator, that is, *not specifically related to this patch, and not worth holding up this patch over*. In both cases it's essentially crying over spilt milk. My only remaining nit with the proposal is with json_unnest(). SQL unnest() produces list of scalars regardless of dimensionality -- json unnest unwraps one level only (contrast: pl/pgsql array 'slice'). So I think 'json_array_each', or perhaps json_slice(), is a better fit there. merlin
On Mon, Feb 4, 2013 at 11:38 AM, Robert Haas <robertmhaas@gmail.com> wrote:
I suspect both of those are pretty safe from an SQL standards point of
view. Of course, as Tom is often wont to point out, the SQL standards
committee sometimes does bizarre things, so nothing's perfect, but I'd
be rather shocked if any of those got tapped to mean something else.
That having been said, I still don't see value in adding operators at
all. Good old function call notation seems perfectly adequate from
where I sit. Sure, it's a little more verbose, but when you try
too hard to make things concise then you end up having to explain to your
users why \ditS is a sensible thing for them to type into psql, or why
s@\W@sprintf"%%%02x",ord($&)@e in Perl. I recognize that I may lose
this argument, but I've worked with a couple of languages where
operators can be overloaded (C++) or defined (ML) and it's just never
seemed to work out very well. YMMV, of course.
For what my opinion is worth I absolutely agree with just having function names. The -> in hstore is kind of nice, but it led me to a whole lot of grief when I couldn't figure out how to create an index using it (turns out you have to use _double_ parens, who knew?), but could create an index on fetchval and assumed that postgres would figure it out.
Also, for quite a while it felt just like incantation as to when I'd need parens around those operators or not. Now that I sorta-kinda-not-really understand the operator precedence rules in postgres/sql standard, I've mostly given up on using cute operators because they're much more of a pain on a day-to-day basis.
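The parenthesization gotcha described above can be sketched with hstore (a minimal example; fetchval is the function behind hstore's -> and is declared IMMUTABLE, so it is usable in an index expression):

```sql
CREATE EXTENSION hstore;
CREATE TABLE t (attrs hstore);

-- An operator expression used as an index key needs its own set of
-- parentheses, hence the "double parens":
CREATE INDEX t_name_op_idx ON t ((attrs -> 'name'));
-- CREATE INDEX t_name_op_idx ON t (attrs -> 'name');  -- syntax error

-- The functional spelling needs no extra parentheses:
CREATE INDEX t_name_fn_idx ON t (fetchval(attrs, 'name'));

-- Note: the planner treats the two spellings as distinct expressions,
-- so a query written with -> will not use t_name_fn_idx, and a query
-- written with fetchval() will not use t_name_op_idx.
```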
On 02/04/2013 02:59 PM, Merlin Moncure wrote: > On Mon, Feb 4, 2013 at 12:07 PM, Andrew Dunstan <andrew@dunslane.net> wrote: >> On 02/04/2013 12:57 PM, Merlin Moncure wrote: >> >>> *) it's bad enough that we drift from sql naming conventions and all >>> type manipulation functions (except in hstore) with type name. >>> json_get etc. at least using operators allow avoiding some of that >>> unnecessary verbosity. what's next: text_concatenate('foo', 'bar')? >>> >> This names aren't set in stone either. I've been expecting some bikeshedding >> there, and I'm surprised there hasn't been more. > Well -- heh (asked to bikeshed: joy!) -- I felt like my objections > were noted and am more interested in getting said functionality out > the door than splitting hairs over names. Type prefix issue is under > the same umbrella as use of the -> operator, that is, *not > specifically related to this patch, and not worth holding up this > patch over*. In both cases it's essentially crying over spilt milk. > > My only remaining nit with the proposal is with json_unnest(). > > SQL unnest() produces list of scalars regardless of dimensionality -- > json unnest unwraps one level only (contrast: pl/pgsql array 'slice'). > So I think 'json_array_each', or perhaps json_slice() is a better fit > there. > how about json_array_elements()? cheers andrew
On Mon, Feb 4, 2013 at 11:38 AM, Robert Haas <robertmhaas@gmail.com> wrote: > On Mon, Feb 4, 2013 at 11:10 AM, Andrew Dunstan <andrew@dunslane.net> wrote: >> On 02/04/2013 10:47 AM, Robert Haas wrote: >>> >>> >>> The SQL standards considerations seem worth thinking about, too. >>> We've certainly gone through a lot of pain working toward eliminating >>> => as an operator name, and if the SQL standard has commandeered -> >>> for some purpose or other, I'd really rather not add to the headaches >>> involved should we ever decide to reclaim it. >> >> OK, but I'd like to know what is going to be safe. There's no way to >> future-proof the language. I'm quite prepared to replace -> with something >> else, and if I do then ->> will need to be adjusted accordingly, I think. >> >> My suggestion would be ~> and ~>>. I know David Wheeler didn't like that on >> the ground that some fonts elevate ~ rather than aligning it in the middle >> as most monospaced fonts do, but I'm tempted just to say "then use a >> different font." Other possibilities that come to mind are +> and +>>, >> although I think they're less attractive. But I'll be guided by the >> consensus, assuming there is one ;-) > > I suspect both of those are pretty safe from an SQL standards point of > view. Of course, as Tom is often wont to point out, the SQL standards > committee sometimes does bizarre things, so nothing's perfect, but I'd > be rather shocked if any of those got tapped to mean something else. > > That having been said, I still don't see value in adding operators at > all. Good old function call notation seems perfectly adequate from > where I sit. Sure, it's a little more verbose, but when you try to > too hard make things concise then you end up having to explain to your > users why \ditS is a sensible thing for them to type into psql, or why > s@\W@sprintf"%%%02x",ord($&)@e in Perl. 
I recognize that I may lose > this argument, but I've worked with a couple of languages where > operators can be overloaded (C++) or defined (ML) and it's just never > seemed to work out very well. YMMV, of course. I also basically feel this way, although I know I tend more towards notational brutalism than many. I think we shouldn't kid ourselves that non-default operators will be used, and for current-implementation reasons (that maybe could be fixed by someone determined) it's not really at the pleasure of the author to use them via CREATE OPERATOR either. So, I basically subscribe to the view that we should investigate what total reliance on prefix syntax looks like. I guess it'll make nested navigation horribly ugly, though...positively lisp-esque. That's one consideration hstore doesn't have that may make infix notation considerably more useful for json than hstore. -- fdr
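A sketch of what the two notations look like for nested access (function name hypothetical, following the json_get style from the patch under discussion):

```sql
-- Prefix (function-call) syntax nests inside out:
SELECT json_get(json_get(json_get(myjson, 'authors'), 0), 'name') FROM docs;

-- Infix chaining with the proposed operators reads left to right:
SELECT myjson -> 'authors' -> 0 ->> 'name' FROM docs;
```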
On 02/04/2013 03:16 PM, Daniel Farina wrote: > On Mon, Feb 4, 2013 at 11:38 AM, Robert Haas <robertmhaas@gmail.com> wrote: >> On Mon, Feb 4, 2013 at 11:10 AM, Andrew Dunstan <andrew@dunslane.net> wrote: >>> On 02/04/2013 10:47 AM, Robert Haas wrote: >>>> >>>> The SQL standards considerations seem worth thinking about, too. >>>> We've certainly gone through a lot of pain working toward eliminating >>>> => as an operator name, and if the SQL standard has commandeered -> >>>> for some purpose or other, I'd really rather not add to the headaches >>>> involved should we ever decide to reclaim it. >>> OK, but I'd like to know what is going to be safe. There's no way to >>> future-proof the language. I'm quite prepared to replace -> with something >>> else, and if I do then ->> will need to be adjusted accordingly, I think. >>> >>> My suggestion would be ~> and ~>>. I know David Wheeler didn't like that on >>> the ground that some fonts elevate ~ rather than aligning it in the middle >>> as most monospaced fonts do, but I'm tempted just to say "then use a >>> different font." Other possibilities that come to mind are +> and +>>, >>> although I think they're less attractive. But I'll be guided by the >>> consensus, assuming there is one ;-) >> I suspect both of those are pretty safe from an SQL standards point of >> view. Of course, as Tom is often wont to point out, the SQL standards >> committee sometimes does bizarre things, so nothing's perfect, but I'd >> be rather shocked if any of those got tapped to mean something else. >> >> That having been said, I still don't see value in adding operators at >> all. Good old function call notation seems perfectly adequate from >> where I sit. Sure, it's a little more verbose, but when you try to >> too hard make things concise then you end up having to explain to your >> users why \ditS is a sensible thing for them to type into psql, or why >> s@\W@sprintf"%%%02x",ord($&)@e in Perl. 
I recognize that I may lose >> this argument, but I've worked with a couple of languages where >> operators can be overloaded (C++) or defined (ML) and it's just never >> seemed to work out very well. YMMV, of course. > I also basically feel this way, although I know I tend more towards > notational brutalism than many. I think we shouldn't kid ourselves > that non-default operators will be used, and for > current-implementation reasons (that maybe could be fixed by someone > determined) it's not really at the pleasure of the author to use them > via CREATE OPERATOR either. > > So, I basically subscribe to view that we should investigate what > total reliance on prefix syntax looks like. I guess it'll make nested > navigation horribly ugly, though...positively lisp-esque. That' s one > consideration hstore doesn't have that may make use of infix notations > considerably more useful for json than hstore. > We don't have the luxury of designing things like this in or out from scratch. Creation of operators has been a part of PostgreSQL for a good while longer than my involvement, and a great many people expect to be able to use it. I can just imagine the outrage at any suggestion of removing it. So, please, let's get real. A "total reliance on prefix syntax" isn't going to happen, and investigating what it would look like seems to me just so much wasted time and effort. cheers andrew
On Mon, Feb 4, 2013 at 2:10 PM, Andrew Dunstan <andrew@dunslane.net> wrote: > >> My only remaining nit with the proposal is with json_unnest(). >> >> SQL unnest() produces list of scalars regardless of dimensionality -- >> json unnest unwraps one level only (contrast: pl/pgsql array 'slice'). >> So I think 'json_array_each', or perhaps json_slice() is a better fit >> there. >> > > > how about json_array_elements()? that works (although it's a little verbose for my taste). maybe json_unwrap, json_array_unwrap, etc. merlin
On Mon, Feb 4, 2013 at 12:37 PM, Andrew Dunstan <andrew@dunslane.net> wrote: > > On 02/04/2013 03:16 PM, Daniel Farina wrote: >> >> On Mon, Feb 4, 2013 at 11:38 AM, Robert Haas <robertmhaas@gmail.com> >> wrote: >>> >>> On Mon, Feb 4, 2013 at 11:10 AM, Andrew Dunstan <andrew@dunslane.net> >>> wrote: >>>> >>>> On 02/04/2013 10:47 AM, Robert Haas wrote: >>>>> >>>>> >>>>> The SQL standards considerations seem worth thinking about, too. >>>>> We've certainly gone through a lot of pain working toward eliminating >>>>> => as an operator name, and if the SQL standard has commandeered -> >>>>> for some purpose or other, I'd really rather not add to the headaches >>>>> involved should we ever decide to reclaim it. >>>> >>>> OK, but I'd like to know what is going to be safe. There's no way to >>>> future-proof the language. I'm quite prepared to replace -> with >>>> something >>>> else, and if I do then ->> will need to be adjusted accordingly, I >>>> think. >>>> >>>> My suggestion would be ~> and ~>>. I know David Wheeler didn't like that >>>> on >>>> the ground that some fonts elevate ~ rather than aligning it in the >>>> middle >>>> as most monospaced fonts do, but I'm tempted just to say "then use a >>>> different font." Other possibilities that come to mind are +> and +>>, >>>> although I think they're less attractive. But I'll be guided by the >>>> consensus, assuming there is one ;-) >>> >>> I suspect both of those are pretty safe from an SQL standards point of >>> view. Of course, as Tom is often wont to point out, the SQL standards >>> committee sometimes does bizarre things, so nothing's perfect, but I'd >>> be rather shocked if any of those got tapped to mean something else. >>> >>> That having been said, I still don't see value in adding operators at >>> all. Good old function call notation seems perfectly adequate from >>> where I sit. 
Sure, it's a little more verbose, but when you try to >>> too hard make things concise then you end up having to explain to your >>> users why \ditS is a sensible thing for them to type into psql, or why >>> s@\W@sprintf"%%%02x",ord($&)@e in Perl. I recognize that I may lose >>> this argument, but I've worked with a couple of languages where >>> operators can be overloaded (C++) or defined (ML) and it's just never >>> seemed to work out very well. YMMV, of course. >> >> I also basically feel this way, although I know I tend more towards >> notational brutalism than many. I think we shouldn't kid ourselves >> that non-default operators will be used, and for >> current-implementation reasons (that maybe could be fixed by someone >> determined) it's not really at the pleasure of the author to use them >> via CREATE OPERATOR either. >> >> So, I basically subscribe to view that we should investigate what >> total reliance on prefix syntax looks like. I guess it'll make nested >> navigation horribly ugly, though...positively lisp-esque. That' s one >> consideration hstore doesn't have that may make use of infix notations >> considerably more useful for json than hstore. >> > > > We don't have the luxury of designing things like this in or out from > scratch. Creation of operators has been a part of PostgreSQL for a good > while longer than my involvement, and a great many people expect to be able > to use it. I can just imagine the outrage at any suggestion of removing it. I am only referring to the restriction that the planner can't understand that fetchval() and '->' mean the same thing for, say, hstore. Hence, use of non-default CREATE OPERATOR may become more useful some day, instead of basically being a pitfall when someone reasonably thinks they could use either spelling of the same functionality and the optimizer will figure it out. I'm not suggesting removal of any feature. 
My reference to "total reliance of prefix syntax" refers only to the JSON operators, since the previous correspondence from Robert was about how function call syntax alone may be sufficient. This phrase refers to the same idea he is proposing. I also included a weakness to that idea, which is that nesting in JSON makes the situation worse than the common compared case, hstore. -- fdr
On 02/04/2013 04:19 PM, Daniel Farina wrote: > On Mon, Feb 4, 2013 at 12:37 PM, Andrew Dunstan <andrew@dunslane.net> wrote: >> On 02/04/2013 03:16 PM, Daniel Farina wrote: >>> On Mon, Feb 4, 2013 at 11:38 AM, Robert Haas <robertmhaas@gmail.com> >>> wrote: >>>> On Mon, Feb 4, 2013 at 11:10 AM, Andrew Dunstan <andrew@dunslane.net> >>>> wrote: >>>>> On 02/04/2013 10:47 AM, Robert Haas wrote: >>>>>> >>>>>> The SQL standards considerations seem worth thinking about, too. >>>>>> We've certainly gone through a lot of pain working toward eliminating >>>>>> => as an operator name, and if the SQL standard has commandeered -> >>>>>> for some purpose or other, I'd really rather not add to the headaches >>>>>> involved should we ever decide to reclaim it. >>>>> OK, but I'd like to know what is going to be safe. There's no way to >>>>> future-proof the language. I'm quite prepared to replace -> with >>>>> something >>>>> else, and if I do then ->> will need to be adjusted accordingly, I >>>>> think. >>>>> >>>>> My suggestion would be ~> and ~>>. I know David Wheeler didn't like that >>>>> on >>>>> the ground that some fonts elevate ~ rather than aligning it in the >>>>> middle >>>>> as most monospaced fonts do, but I'm tempted just to say "then use a >>>>> different font." Other possibilities that come to mind are +> and +>>, >>>>> although I think they're less attractive. But I'll be guided by the >>>>> consensus, assuming there is one ;-) >>>> I suspect both of those are pretty safe from an SQL standards point of >>>> view. Of course, as Tom is often wont to point out, the SQL standards >>>> committee sometimes does bizarre things, so nothing's perfect, but I'd >>>> be rather shocked if any of those got tapped to mean something else. >>>> >>>> That having been said, I still don't see value in adding operators at >>>> all. Good old function call notation seems perfectly adequate from >>>> where I sit. 
Sure, it's a little more verbose, but when you try to >>>> too hard make things concise then you end up having to explain to your >>>> users why \ditS is a sensible thing for them to type into psql, or why >>>> s@\W@sprintf"%%%02x",ord($&)@e in Perl. I recognize that I may lose >>>> this argument, but I've worked with a couple of languages where >>>> operators can be overloaded (C++) or defined (ML) and it's just never >>>> seemed to work out very well. YMMV, of course. >>> I also basically feel this way, although I know I tend more towards >>> notational brutalism than many. I think we shouldn't kid ourselves >>> that non-default operators will be used, and for >>> current-implementation reasons (that maybe could be fixed by someone >>> determined) it's not really at the pleasure of the author to use them >>> via CREATE OPERATOR either. >>> >>> So, I basically subscribe to view that we should investigate what >>> total reliance on prefix syntax looks like. I guess it'll make nested >>> navigation horribly ugly, though...positively lisp-esque. That' s one >>> consideration hstore doesn't have that may make use of infix notations >>> considerably more useful for json than hstore. >>> >> >> We don't have the luxury of designing things like this in or out from >> scratch. Creation of operators has been a part of PostgreSQL for a good >> while longer than my involvement, and a great many people expect to be able >> to use it. I can just imagine the outrage at any suggestion of removing it. > I am only referring to referring the restriction that the planner > can't understand that fetchval() and '->' mean the same thing for, > say, hstore. Hence, use of non-default CREATE OPERATOR may become > more useful some day, instead of basically being a pitfall when > someone reasonably thinks they could use either spelling of the same > functionality and the optimizer will figure it out. > > I'm not suggesting removal of any feature. 
> > My reference to "total reliance of prefix syntax" refers only to the > JSON operators, since the previous correspondence from Robert was > about how function call syntax alone may be sufficient. This phrase > refers to the same idea he is proposing. > > I also included a weakness to that idea, which is that nesting in JSON > makes the situation worse than the common compared case, hstore. I see. OK, sorry for misunderstanding. I suspect, BTW that mostly people will use get_path*() (or rather, its equivalent operator ;-) ) rather than operator chaining: select myjson->>'{"authors",0,"name"}'::text[]; cheers andrew
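For comparison, Andrew's path-based lookup can be spelled either with the proposed operator or with the get_path* functions he mentions (exact names and signatures are still under discussion in this thread, so treat this as a sketch):

```sql
-- Operator form, taking a text[] path (Andrew's example above):
SELECT myjson ->> '{"authors",0,"name"}'::text[] FROM docs;

-- Equivalent get_path* spelling (function name hypothetical):
SELECT json_get_path_as_text(myjson, 'authors', '0', 'name') FROM docs;
```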
On 01/31/2013 11:20 PM, Andrew Dunstan wrote:
>
> I'm happy to take opinions about this, and I expected
> some bikeshedding, but your reaction is contrary to
> everything others have told me. Mostly they love the operators.

What I would really like is if we extended the PostgreSQL core and made
a few more constructs definable as overloadable operators:

1) array / dictionary element lookup

   a[b]

   CREATE OPERATOR [] (...)

2) attribute lookup

   a.b

   CREATE OPERATOR . (...)

then you could make json lookups either step-by-step using

CREATE OPERATOR [] (
    PROCEDURE = json_array_lookup, LEFTARG = json, RIGHTARG = int)

and

CREATE OPERATOR [] (
    PROCEDURE = json_dict_lookup, LEFTARG = json, RIGHTARG = text)

fourthname = myjson[4]['name']

or perhaps a single

CREATE OPERATOR [] (
    PROCEDURE = json_deep_lookup, LEFTARG = json, RIGHTARG = VARIADIC "any")

fourthname = myjson[4, 'name']

though I suspect that we do not support type VARIADIC "any" in operator
definitions

---------
Hannu

> I guess that '~>' and '~>>' would work as well as '->' and '->>'.
>
>
> cheers
>
> andrew
>
>
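Hannu's proposal maps naturally onto languages where [] is already overloadable. As an illustration only (Python, not anything PostgreSQL supports today), a small wrapper class can mimic both the step-by-step and the variadic forms of the proposed CREATE OPERATOR []:

```python
import json

class Json:
    """Illustrative wrapper: overloads [] the way the proposed
    CREATE OPERATOR [] would, for both int and text subscripts."""
    def __init__(self, value):
        self.value = value

    def __getitem__(self, key):
        if isinstance(key, tuple):      # variadic form: myjson[4, 'name']
            node = self.value
            for k in key:
                node = node[k]
            return Json(node)
        return Json(self.value[key])    # step-by-step form: myjson[4]['name']

myjson = Json(json.loads('{"authors": [{"name": "Hannu"}]}'))
print(myjson["authors"][0]["name"].value)   # step-by-step lookup
print(myjson["authors", 0, "name"].value)   # variadic lookup
```

Python dispatches `a[x, y, z]` as a single `__getitem__` call with a tuple, which is essentially what the VARIADIC "any" variant would need from the operator machinery.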
2013/2/5 Hannu Krosing <hannu@krosing.net>:
> On 01/31/2013 11:20 PM, Andrew Dunstan wrote:
>>
>>
>> I'm happy to take opinions about this, and I expected
>> some bikeshedding, but your reaction is contrary to
>> everything others have told me. Mostly they love the operators.
>
> What I would really like is if we extended the PostgreSQL core and made
> a few more constructs definable as overloadable operators:
>
> 1) array / dictionary element lookup
>    a[b]
>    CREATE OPERATOR [] (...)
>
> 2) attribute lookup
>    a.b
>    CREATE OPERATOR . (...)
>
> then you could make json lookups either step-by-step using
>
> CREATE OPERATOR [] (
>     PROCEDURE = json_array_lookup, LEFTARG = json, RIGHTARG = int)
>
> and
>
> CREATE OPERATOR [] (
>     PROCEDURE = json_dict_lookup, LEFTARG = json, RIGHTARG = text)
>
> fourthname = myjson[4]['name']
>
> or perhaps a single
>
> CREATE OPERATOR [] (
>     PROCEDURE = json_deep_lookup, LEFTARG = json, RIGHTARG = VARIADIC "any")
>
> fourthname = myjson[4, 'name']
>

It is close to a full collection implementation - and it would be nice
to have. At some point we should return to this topic.

My preference is to use well-named functions (I prefer them over
operators) and operators that are not in collision with current ANSI SQL.

I don't see anything nice in the design of

select myjson->>'{"authors",0,"name"}'::text[];

- moreover, it is as ugly as dinosaurs. Better and more usual:

myjson['authors']['0']['name']

or

myjson['authors/0/name']

Regards

Pavel

> though I suspect that we do not support type VARIADIC "any" in operator
> definitions
>
> ---------
> Hannu
>
>> I guess that '~>' and '~>>' would work as well as '->' and '->>'.
>>
>> cheers
>>
>> andrew
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
On 02/05/2013 02:09 AM, Pavel Stehule wrote:
>
> I don't see anything nice in the design of
>
> select myjson->>'{"authors",0,"name"}'::text[];
>
> - moreover, it is as ugly as dinosaurs.

I rather like dinosaurs. Beauty is, as they say, in the eye of the
beholder.

Let me also point out that you can say (somewhat less efficiently):

myjson->'authors'->0->>'name'

which is not terribly inelegant.

> Better and more usual:
>
> myjson['authors']['0']['name']
>
> or
>
> myjson['authors/0/name']

Well, nothing like that is going to happen in this release. If you or
someone wants to work on a general subscripting facility for arbitrary
data types then I look forward to seeing it.

Let me also point out that the most important part of this patch is the
part that almost nobody has commented on, namely the parser changes and
the API that the actual visible functions are built on. Writing JSON
accessor / transformation functions without this framework is hard, and
often redundant. I'm much more concerned about getting this framework
and some basic accessor functions (and preferably operators) added than
about how the latter are precisely spelled.

cheers

andrew
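The difference between the two operators in Andrew's chained example is that -> yields a json value at each step, while ->> renders the final result as text. A small Python analogue of that chain (helper names are hypothetical; this sketches the semantics, not the server code):

```python
import json

def arrow(node, key):
    """-> : fetch a field/element, keeping it as a JSON value (None for NULL)."""
    try:
        return node[key]
    except (KeyError, IndexError, TypeError):
        return None

def arrow_text(node, key):
    """->> : fetch a field/element and render it as text rather than json."""
    val = arrow(node, key)
    if val is None:
        return None
    return val if isinstance(val, str) else json.dumps(val)

doc = json.loads('{"authors": [{"name": "Pavel"}]}')
# myjson->'authors'->0->>'name'
print(arrow_text(arrow(arrow(doc, "authors"), 0), "name"))   # Pavel
```

The chained form re-enters the document once per level, which is why Andrew calls it "somewhat less efficient" than resolving the whole path in one get_path*() call.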
2013/2/5 Andrew Dunstan <andrew@dunslane.net>:
>
> On 02/05/2013 02:09 AM, Pavel Stehule wrote:
>
>> I don't see anything nice in the design of
>>
>> select myjson->>'{"authors",0,"name"}'::text[];
>>
>> - moreover, it is as ugly as dinosaurs.
>
> I rather like dinosaurs. Beauty is, as they say, in the eye of the
> beholder.
>
> Let me also point out that you can say (somewhat less efficiently):
>
> myjson->'authors'->0->>'name'
>
> which is not terribly inelegant.
>
>> Better and more usual:
>>
>> myjson['authors']['0']['name']
>>
>> or
>>
>> myjson['authors/0/name']
>
> Well, nothing like that is going to happen in this release. If you or
> someone wants to work on a general subscripting facility for arbitrary
> data types then I look forward to seeing it.
>
> Let me also point out that the most important part of this patch is the
> part that almost nobody has commented on, namely the parser changes and
> the API that the actual visible functions are built on. Writing JSON
> accessor / transformation functions without this framework is hard, and
> often redundant. I'm much more concerned about getting this framework
> and some basic accessor functions (and preferably operators) added than
> about how the latter are precisely spelled.

The C API and implementation can be changed or fixed later without much
trouble - that is usual. The SQL interface is where the hard discussion
is.

Regards

Pavel

> cheers
>
> andrew