Thomas Lockhart <lockhart@alumni.caltech.edu> writes:
>> I think it's time to bite the bullet and put in a lookahead filter.
>> What say you?
> *sigh* Probably right. The UNION vs UNION JOIN stuff illustrates it
> pretty well. I haven't tried assigning precedence levels to these tokens
> or to those subclauses; would that help to resolve the conflicts?
I don't see how. The real problem is that given
    SELECT * FROM foo UNION ...
                     ^ parsing here
you don't know whether to reduce what you have to select_clause
(as you must if what follows is UNION SELECT) or shift (as you must
if you want to parse "foo UNION JOIN bar" as part of the FROM-clause).
Precedence will not help: the grammar is just plain not LR(1) unless you
count UNION JOIN as a single token. It's barely possible that we could
redesign our grammar to avoid needing to make a shift-reduce decision
here, but it would be so ugly and nonintuitive that I can't see that as
being a better answer than a lookahead filter.
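For concreteness, here is a rough sketch of what such a filter could look like. Everything in it is made up for illustration: the token codes, base_yylex() (a stand-in for the flex-generated lexer that just replays a canned array), and filtered_yylex(). A real version would also have to save and restore the semantic value (yylval) along with the token code, which this sketch ignores.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical token codes; a real parser would take these from bison. */
enum token {
    T_EOF = 0, T_SELECT, T_STAR, T_FROM, T_IDENT,
    T_UNION, T_JOIN, T_UNION_JOIN
};

/* Stand-in for the flex lexer: replays a T_EOF-terminated token array. */
static const enum token *input_tokens;
static size_t input_pos;

/* One-token pushback buffer used by the filter. */
static int have_pending = 0;
static enum token pending_token;

static void set_input(const enum token *tokens)
{
    input_tokens = tokens;
    input_pos = 0;
    have_pending = 0;
}

static enum token base_yylex(void)
{
    enum token t = input_tokens[input_pos];

    if (t != T_EOF)
        input_pos++;            /* keep returning T_EOF at end of input */
    return t;
}

/*
 * Lookahead filter: sits between the lexer and the parser.  When it sees
 * UNION it peeks one token further; UNION followed by JOIN is handed to
 * the parser as the single token T_UNION_JOIN, otherwise the peeked token
 * is saved and returned on the next call.
 */
static enum token filtered_yylex(void)
{
    enum token cur;

    if (have_pending) {
        have_pending = 0;
        cur = pending_token;
    } else {
        cur = base_yylex();
    }

    if (cur == T_UNION) {
        enum token next = base_yylex();

        if (next == T_JOIN)
            return T_UNION_JOIN;

        assert(!have_pending);  /* buffer was drained above */
        have_pending = 1;
        pending_token = next;
    }
    return cur;
}
```

With this in place the grammar only ever sees UNION when it is not followed by JOIN, so the parser no longer has to make the shift-reduce decision itself; the filter has already resolved it with its extra token of lookahead.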
We should use precedence to implement ISO's distinction in the
precedence of UNION, INTERSECT, and EXCEPT (we get that wrong
currently), but I don't see how it helps for the UNION vs UNION JOIN
issue.
Quite apropos of this: now that we are committed to assuming our lexer
is flex, does anyone object to using flex's -P option to customize the
yyfoo() names emitted by flex? That seems cleaner to me than the
sed-script kluges we currently rely on.
regards, tom lane