LR Grammars not in LALR(1) or LR(1)

LR Grammars not in LALR(1) or LR(1)

Post by Clint Olse » Tue, 15 Oct 2002 05:01:50




> [Bison is actually a descendant of Berkeley Yacc, but unless something
> has changed radically, they both use the same table generating and
> parsing algorithms. -John]

If you download the alpha version of bison (1.49a), they've actually
added GLR parsing support.  In the event of an ambiguity, bison
replicates itself down both parse branches and then either kills off
the exploration because of a parse error or merges them if they reduce
to the same thing.  This continues apparently down every ambiguous
branch respectively.

-Clint

 
 
 

LR Grammars not in LALR(1) or LR(1)

Post by Chris F Clar » Tue, 15 Oct 2002 05:02:24



> Hmmmmm.  Suppose we have a YACC input that uses disambiguation rules.
> Does the standard way of removing the precedence and associativity
> ambiguities by introducing new nonterminals and grammar rules always
> yield an LALR(1) grammar?  It would be nice if that were so.

Joachim Durcholz replied:

Quote:> The original grammar is ambiguous, hence not LR-anything. So this
> question doesn't make much sense to me.

Presuming Tom is asking the question "Can a question with precedence
and associativity directives as in most yacc derivatives be
transformed into an LALR grammar with no directives?" the question
does make sense, but does not have a known answer as far as I can
recall.

From the theoretical results that I know.  Here are some of the known
facts.  I hope I get them all right.  Some of the facts are very
subtle and non-intuitive.

1) Any deterministic context-free language (DCFL) (that is a language
   that has at least one deterministic context free grammar (DCFG))
   has an LR(1) grammar.

2) There is no algorithm that can find the LR(1) grammar given an
   arbitrary DCFL.  I presume that is also true even if you have a
   non-LR(1) DCFG for the DCFL.  (This means if you thing you have an
   alogorithm an omnisicient being can find a grammar that your
   algorithm will not work correctly upon.)

3) Not every DCFL has an LALR(1) grammar.

4) There is no algorithm then can tell if an aribtrary grammar is
   unambiguous.

5) There are algorithms that can tell if specific grammars are
   unambiguous.

6) Every LR(k) grammar is unambiguous.

7) Every DCFL has an unambiguous grammar (i.e the LR(1) grammar for
   the DCFL is an unambiguous one).

8) Precedence and associativity directives allow parsing an ambiguous
   grammar deterministically (by eliminating the ambiguous parses and
   potentially some non-ambiguous ones).  (The original paper on the
   topic was called "Deterministic Parsing of Ambiguous Grammars".  It
   is easy to read and worth finding in my opinion.)

9) The pressence of a conflict in a LR grammar does not necessarily
   mean that the grammar is ambiguous (only that the LR parser
   generator cannot tell whether the grammar is ambiguous or not).

10) Any ambiguous grammar will have LR conflicts.

Form what I understand of the theory, I believe points 2 & 3 are
potential killers.  Essentially I think they point to the existence of
pathological grammars that are ambiguous but which have a
deterministic (unambiguous) parse and for which it is not possible to
algorithmically find the corresponding LALR grammar (if there even is
one).

However, given all of this, from the practical point of view, I think
the common transformations can be expected to reliably generate an
LALR grammar from the ambiguous one for any grammar you are likely to
encounter.

One point which has been discussed in some of the related threads,
transforming a grammar often destroys its structure.  Thus, even if
one could mechanically generate an unambiguous LALR grammar from one
with precedence and associativity declarations, it is not clear that
one would want to, since doing so might change the non-terminals
sufficiently that one could not decorate the grammar the way one would
like.  (Note also that the grammar with the directives is probably
smaller, faster running, and easier to read than the unambiguous
grammar.)

Joachim Durcholz also replied:

Quote:> It's too easy to "disambiguate" a grammar into something that
> accepts a language that's different from the intended one.

This is a truth.  Careless use of precedence and associativity
directives can hide problems in the grammar where the directives apply
not only in the one conflict where they were intended to apply, but
also in other states where they eliminate a path that would have lead
to a correct parse.

Hope this helps,
-Chris

*****************************************************************************

Compiler Resources, Inc.       Web Site   :  http://world.std.com/~compres
3 Proctor Street               voice      :  (508) 435-5016
Hopkinton, MA  01748  USA      fax        :  (508) 435-4847  (24 hours)

 
 
 

LR Grammars not in LALR(1) or LR(1)

Post by M.G.J. van den Bran » Tue, 15 Oct 2002 05:03:51



> Then one would want to be able to compare those algorithms by merely
> change the parsing algorithm. And then the method of disambiguation should
> be independent of the parsing algorithm.

Yes, this would be great if this was possible. However, our group has
quite some experiences in using GLR techniques. One of the problems is
that postponing disambiguation entirely to the end of the parsing
process leads to an explosion in the parse forest. The filtering of
this forest becomes far too expensive.


    author = "P. Klint and E. Visser",
    title = "Using filters for the disambiguation of contextfree
                grammars",
    booktitle = {Proceedings of the ASMICS Workshop on Parsing Theory},
    editors = {G. Pighizzini and P. San Pietro},
    pages = {1--20},
    year = "1994",
    note = {Tech. Rep. 126-1994 Dipartimento di Scienze
        dell'Informazione, Universita di Milano}

Quote:}

a technique is described to use associativity and priority information
to "filter" the parse table in order to reduce the number of parallel
branches latter on in the GLR algorithm. In this way the parse forest
is reduced and the GLR speeds up.

The filters described in

Quote:>+    http://www.cs.uu.nl/people/visser/ftp/BSVV02.pdf

are again applied as soon as possible, however we did several
experiments in which the filters were applied after the parsing and
indeed conceptually this is possible, but you do not gain with respect
to performance.

-- Mark

---------------------------------------------------------------
M.G.J. van den Brand,
Department of Software Engineering
CWI
Kruislaan 413, NL-1098 SJ AMSTERDAM, The Netherlands.

Tel___(+31) 20 5924213  WWW____http://www.cwi.nl/~markvdb/

 
 
 

LR Grammars not in LALR(1) or LR(1)

Post by S?nke Kannapin » Sun, 20 Oct 2002 12:29:22


Quote:Chris F Clark writes:
> 1) Any deterministic context-free language (DCFL) (that is a language
>    that has at least one deterministic context free grammar (DCFG))
>    has an LR(1) grammar.

> ...

> 3) Not every DCFL has an LALR(1) grammar.

> ...

> From what I understand of the theory, I believe points 2 & 3 are
> potential killers.  Essentially I think they point to the existence of
> pathological grammars that are ambiguous but which have a
> deterministic (unambiguous) parse and for which it is not possible to
> algorithmically find the corresponding LALR grammar (if there even is
> one).

Just a minor correction concerning 3):

For k >= 0, any cfg G can be transformed into a structurally equivalent
cfg which is SLR(k) if and only if G is LR(k).
Therefore, for any fixed k >= 0, the families of LR(k) languages,
LALR(k) languages, and SLR(k) languages are all equal.

See Theorem 6.70 in chapter 6.6 of

    Seppo Sippu and Eljas Soisalon-Soininen:
    Parsing Theory. Vol.II: LR(k) and LL(k) Parsing
    EATCS Monographs on Theoretical Computer Science
    Springer-Verlag, 1990

which also contains a detailed proof (using the concept of a cfg
T_k(G) that right-to-right covers a cfg G).

Put another way, every DCFL has an LR(1), an LALR(1) and even an SLR(1)
grammar.

Regards,
Soenke.

 
 
 

LR Grammars not in LALR(1) or LR(1)

Post by Joachim Durchhol » Tue, 22 Oct 2002 11:55:05



> Put another way, every DCFL has an LR(1), an LALR(1) and even an SLR(1)
> grammar.

What, then, is the advantage of LALR(1) over SLR(1)? Especially in the
light that LALR and SLR versions of the grammar are structurally
equivalent (assuming some useful definition of "structurally equivalent").

Joachim
[I was under the impression that LALR tables are smaller. -John]

 
 
 

LR Grammars not in LALR(1) or LR(1)

Post by Clint Olse » Sat, 26 Oct 2002 12:56:36



> [I was under the impression that LALR tables are smaller. -John]

The Dragon Book states that SLR is not as powerful as LALR.  The example
grammar given was:

S -> L = R
S -> R
L -> * R
L -> id
R -> L

Apparently SLR does not remember enough left context to resolve what to do
on '='.

-Clint

 
 
 

LR Grammars not in LALR(1) or LR(1)

Post by S?nke Kannapin » Sat, 26 Oct 2002 13:15:36


Quote:> [I was under the impression that LALR tables are smaller. -John]

Yes and no. Depends on what concept of "parser size" you apply.
You can ask for the "size" of an LALR(k) parser
* as a pushdown transducer - that's a formal concept of "size" -, or
* as the cumulative size of the parser tables of a parser as generated
  by some parser generator; that's an implementation-oriented concept
  of parser size (which is problematic in theoretical size comparisons
  for several reasons).

The number of canonical LR(0) states is, of course, the main
contributing factor for both the formal and some implementation-
oriented parser size concept for LALR(k) and SLR(k) parsers. But
formally, the SLR(k) parser of a grammar has at least as many actions
than the LALR(k) parser. Consequently, a grammar's SLR(k) parser is
always greater or equal in formal size than its LALR(k) parser. But
whether the same is also true when looking at implementations is hard
to say: I cannot say whether the compression techniques applied by
your favorite parser generator is able to produce smaller output for
SLR(k) than for LALR(k). Can you?

Quote:

> > Put another way, every DCFL has an LR(1), an LALR(1) and even an
> > SLR(1) grammar.

> What, then, is the advantage of LALR(1) over SLR(1)? Especially in
> the light that LALR and SLR versions of the grammar are structurally
> equivalent (assuming some useful definition of "structurally
> equivalent").

Of course, for a *fixed* grammar, the SLR(k) parser may be ambiguous
while the LALR(k) parser isn't.

An SLR(1) grammar for a DCFL will probably have more nonterminals and
rules than an LALR(1) grammar (and even more that an LR(1) grammar)
for the same language. Still, they can both be "structurally equivalent"
to the LR(1) grammar (needs a mapping from xLR(1) grammar rules to
corresponding LR(1) grammar rules).

Regards,
S?nke