Обсуждение: XML schemas and PG column names
I'm doing some work with the output of query_to_xml_and_xmlschema(). The output is a bit unfortunate in my opinion when the column names in the query are not legal XML names. We quite reasonably translate the names so that legal XML names result, but we don't actually put the original name anywhere in the output, meaning that the end processor has some work to do to recover the original. Here are two snippets from the output when the column names are "z" and "25 %ile": <xsd:complexType name="RowType"> <xsd:sequence> <xsd:element name="z" type="INTEGER" nillable="true"></xsd:element> <xsd:element name="_x0032_5_x0020__x0025_ile" type="INTEGER" nillable="true"></xsd:element> </xsd:sequence> </xsd:complexType> <row> <z>1</z> <_x0032_5_x0020__x0025_ile>2</_x0032_5_x0020__x0025_ile> </row> Of course, we can recover the original name by using something like perl to do operations like this: $column_name =~ s/_x([[:xdigit:]]{4})_/pack("U",hex($1))/ge; but that's ugly and not as simply available in many XSL processors (I am using XSL to transform the XML.) I propose that we annotate the schema section RowType elements with the original names, so we would have something like this in the schema section: <xsd:complexType name="RowType" xmlns:pg="http://www.postgresql.org/schemas/column-names"> <xsd:sequence> <xsd:element name="z" type="INTEGER" nillable="true"> <xsd:annotation> <xsd:appinfo> <pg:column-name>z</pg:column-name> </xsd:appinfo> </xsd:annotation>* * </xsd:element> <xsd:element name="_x0032_5_x0020__x0025_ile" type="INTEGER" nillable="true"> <xsd:annotation> <xsd:appinfo> <pg:column-name>25 %ile</pg:column-name> </xsd:appinfo> * *</xsd:annotation> </xsd:element> </xsd:sequence> </xsd:complexType> While it might be a bit longwinded, it's not going to add to the per-row output, just the schema section. Thoughts? cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > I propose that we annotate the schema section RowType elements with the > original names, so we would have something like this in the schema section: 1. Is that legal per the SQL/XML spec? 2. What happens when the column name contains characters that would have to be escaped, such as "<" --- haven't you just replaced one de-escaping problem with another? regards, tom lane
Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: > >> I propose that we annotate the schema section RowType elements with the >> original names, so we would have something like this in the schema section: >> > > 1. Is that legal per the SQL/XML spec? > It is certainly legal per XML and XSD specs, and the SQL/XML spec has annotations using appinfo elements. It would be rather surprising if the SQL/XML spec forbade annotations such as I propose. The spec is mind-bogglingly impenetrable, though. Perhaps Peter or Nicholas might know. > 2. What happens when the column name contains characters that would have > to be escaped, such as "<" --- haven't you just replaced one de-escaping > problem with another? > > > No. say the name is "foo & bar < baz". Then the annotation would be: <pg:column-name>foo & bar < baz</pg:column-name> But the difference is that the XML processor will automatically unescape this value (and re-escape it on output if necessary). The user won't have to do anything (or shouldn't if their XML processor is worth anything at all). So in a stylesheet, I'd be able to do something like: <xsl:for-each select="//[complexType[@name="RowType"]//pg:column-name"> <th><xsl:value-of select="." /></th> </xsl:for-each> and it would just Do The Right Thing. (If we didn't want the output re-escaped, say when the otuput format was not XML or HTML, we could make it do that too). cheers andrew
Andrew Dunstan <andrew@dunslane.net> writes: > Tom Lane wrote: >> 2. What happens when the column name contains characters that would have >> to be escaped, such as "<" --- haven't you just replaced one de-escaping >> problem with another? > But the difference is that the XML processor will automatically unescape > this value (and re-escape it on output if necessary). The user won't > have to do anything (or shouldn't if their XML processor is worth > anything at all). OK, so your argument is that this is a standard escaping rule and the one in the SQL standard is, um, not standard. I wonder why the SQL committee felt compelled to invent their own, then? regards, tom lane
Tom Lane wrote: > Andrew Dunstan <andrew@dunslane.net> writes: > >> Tom Lane wrote: >> >>> 2. What happens when the column name contains characters that would have >>> to be escaped, such as "<" --- haven't you just replaced one de-escaping >>> problem with another? >>> > > >> But the difference is that the XML processor will automatically unescape >> this value (and re-escape it on output if necessary). The user won't >> have to do anything (or shouldn't if their XML processor is worth >> anything at all). >> > > OK, so your argument is that this is a standard escaping rule and the > one in the SQL standard is, um, not standard. I wonder why the SQL > committee felt compelled to invent their own, then? > > > They are two different things. An XML-escaped text value is by no means necessarily a legal XML tag name (e.g. an XML name can't have spaces). Possibly what they really did wrong was to try to map SQL column names to XML tags at all. It might have been better to do something like: <column name="foo & bar">some value</column> instead of what we produce, which I assume is in the standard: <foo_x0020__x0026__x0020_bar>somevalue</foo_x0020__x0026__x0020_bar> which I think is just plain ugly. OTOH, then it would have been far harder (maybe impossible) to create an XML schema for such a mechanism, so I assume that's why they did it this way. Anyway, It would be nice to have a way of providing the non-mangled names - I think what I have suggested should meet the case. cheers andrew
On lör, 2009-12-12 at 11:51 -0500, Andrew Dunstan wrote: > It is certainly legal per XML and XSD specs, and the SQL/XML spec has > annotations using appinfo elements. It would be rather surprising if > the > SQL/XML spec forbade annotations such as I propose. The spec is > mind-bogglingly impenetrable, though. Perhaps Peter or Nicholas might > know. I think we can of course add our own annotations. It would be good to go through the SQL/XML standard document and check what style they use for their annotations so that we can structure and name ours similarly and have room for future work, in case someone also wants annotations for table names, schema names, etc. (Or was that part of your project as well?)
Peter Eisentraut wrote: > On lör, 2009-12-12 at 11:51 -0500, Andrew Dunstan wrote: > >> It is certainly legal per XML and XSD specs, and the SQL/XML spec has >> annotations using appinfo elements. It would be rather surprising if >> the >> SQL/XML spec forbade annotations such as I propose. The spec is >> mind-bogglingly impenetrable, though. Perhaps Peter or Nicholas might >> know. >> > > I think we can of course add our own annotations. It would be good to > go through the SQL/XML standard document and check what style they use > for their annotations so that we can structure and name ours similarly > and have room for future work, in case someone also wants annotations > for table names, schema names, etc. (Or was that part of your project as > well?) > > Well, the standard has an element specifically for annotations concerning certain objects: sqlxml:sqlname. However, I am not sure if it can be used in this context. Can you try reading the standard (<http://www.sqlx.org/SQL-XML-documents/5FCD-14-XML-2004-07.pdf>) and tell me? :-) If it's not comprehended by the standard then we should at least use a different namespace, and probably a different element name. The style can be made to match, though. I certainly think we could do more of this, although the column names are what matter to me right now. We can also do it bit by bit. cheers andrew