RPN as intermediate code?

RPN as intermediate code?

Post by Noel Goreli » Wed, 23 Aug 1995 04:00:00



I am struggling trying to convert my C interpreter from a tree-based
design to an intermediate-code design.   I have, up until now simply
used YACC to produce my parse tree, and then just stepped through the
tree executing the code.  This has worked fine for most things,
including if/else, compound statements, and for/while loops.

I now want to move to storing (and executing) some form of intermediate
code.  I really like the idea of encoding into an 'execution string', like
bc does, and reverse polish works great for this, as long as I stick to
encoding expressions.  But this model falls apart when I get into things
like conditionals and loops, and I am completely baffled how to do
short-circuit logic like this.  For instance, I can compile the following
code into the RPN strings shown (which execute easily):

    x = a*(b+c)             ->  "a b c + * x ="
    y = sqrt(b*b-4*a*c)     ->  "b b * 4 a * c * - 1 sqrt callfunc y ="
    z = foo(a,bar(b),c)     ->  "a b 1 bar callfunc c 3 foo callfunc z ="

    (actually, all the identifiers and constants go into a dictionary
    and are replaced by a lookup index, but you get the idea...)

However, when I get into conditionals and loops, I have absolutely no idea
how to proceed.  Do I really have to get all the way down to the level of
computing jump offsets and implement these as conditional gotos?  Or is
there a more elegant way that I just haven't found yet?

And why can't I find any literature about this sort of thing?  All the
compiler books I can find say 'Here is reverse polish.  Its a good
intermediate representation because its stack-like', but then they go off
and do something completely different.

Anyways, I'll take any help I can get here.


[I've used conditional jumps with ofsets.  They're ugly, but they're not hard
to do.  The alternative is to have tokens like WHILE  ENDWHILE, IF THEN
ELSE ENDIF and have the interpreter scan over parts you're skipping and keep
a stack of places to retur tom but that's a lot of scanning and skipping.
-John]
--


 
 
 

RPN as intermediate code?

Post by Leonard Kapl » Sat, 26 Aug 1995 04:00:00



>I am struggling trying to convert my C interpreter from a tree-based
>design to an intermediate-code design.

(snip)

Quote:>I now want to move to storing (and executing) some form of intermediate
>code.  I really like the idea of encoding into an 'execution string', like
>bc does, and reverse polish works great for this, as long as I stick to
>encoding expressions.  But this model falls apart when I get into things
>like conditionals and loops, and I am completely baffled how to do
>short-circuit logic like this.  For instance, I can compile the following
>code into the RPN strings shown (which execute easily):

(snip)

Quote:>However, when I get into conditionals and loops, I have absolutely no idea
>how to proceed.  Do I really have to get all the way down to the level of
>computing jump offsets and implement these as conditional gotos?

You could either use that approach (effectively both compile _and_ assemble
for your runtime "machine", I suppose), or use conditional branches and
store the target addresses into your dictionary (just compile). I've used
the following approach successfully:

Traditionally, Forth implements conditionals using two branch instructions,
call them BRANCH and ZBRANCH. BRANCH is unconditional, the target address
(or offset) is always "jumped to". ZBRANCH only performs the branch if the
current top-of-stack (TOS) is zero. An "if" statement like this:

    if (a > b){
      stuff;
      stuff;
    }

Would generate code something like:

    a
    b
    >
    ZBRANCH xx
    stuff
    stuff
    (this is location xx)

A more complex case,

    if (a > b){
      stuff1;
      stuff2;
    }else{
      stuff3;
      stuff4;
    }

Would generate

    a
    b
    >
    ZBRANCH xx
    stuff1
    stuff2
    BRANCH yy
    (this is location xx)
    stuff3
    stuff4
    (this is location yy)

A loop like

    while (a > b){
      stuff5;
      stuff6;
    }

looks like

    (this is location xx)
    a
    b
    >
    ZBRANCH yy
    stuff5
    stuff6
    BRANCH xx
    (this is location yy)

And so on. One disadvantage to this approach (or maybe not, you haven't
said whether your interpreter scans and runs line-by-line, or does an
entire tokenizing pass, _then_ runs) - you must always scan far enough
ahead to resolve any necessary addresses. In Forth, this is not a problem,
the system traditionally has two states, compiling and executing. While
compiling, it even uses the expression stack to hold both branch target
locations, and locations where the target addresses are to be stored (the
word following the "BRANCH" instruction, essentially). Also, Forth's
conditional words (IF, THEN, ELSE, and so on) actually _execute_ during
compilation, and generate the above structures into the dictionary. They
are called "compiling words" for that reason.

Quote:>And why can't I find any literature about this sort of thing?  All the
>compiler books I can find say 'Here is reverse polish.  Its a good
>intermediate representation because its stack-like', but then they go off
>and do something completely different.

I've often wondered that myself. Interpreters don't usually get much more
than a brief mention, anyway, in most compiler books. I learned how to set
up the above branch implementation by looking at the source code for
various Forth implementations.

Quote:>[I've used conditional jumps with ofsets.  They're ugly, but they're not
>hard
>to do.  The alternative is to have tokens like WHILE  ENDWHILE, IF THEN
>ELSE ENDIF and have the interpreter scan over parts you're skipping and
>keep
>a stack of places to retur tom but that's a lot of scanning and skipping.
>-John]

That would certainly work, though you would end up with more work at
runtime - unless you resolve things during a prepass, at which point you're
pretty-much back to jumps and offsets anyways!.

-Len
--



 
 
 

RPN as intermediate code?

Post by Nor Jai » Sun, 27 Aug 1995 04:00:00


Quote:>And why can't I find any literature about this sort of thing?  All the
>compiler books I can find say 'Here is reverse polish.  Its a good
>intermediate representation because its stack-like', but then they go off
>and do something completely different.

Try Threaded Interpreted Language (TIL) published by Byte. It is a very
old book (1970's?) discussing the implementation of a language that
looks like Forth. (May be it was Forth's forgotten predecessor?). The
compiled code (as well as the source code) is in RPN. If I remembered
correctly, almost everything, (+, -, if, while, etc) are compiled into
"procedure calls". But I gather performance is not high in your criteria
list. The good thing about TIL is that if you stripped away its editor and
translator, the run-time part that is left is very small.

Nor Jaidi


--


 
 
 

RPN as intermediate code?

Post by Martin Oders » Wed, 30 Aug 1995 04:00:00


Noel Gorelick writes (on RPN translations):

   However, when I get into conditionals and loops, I have absolutely no idea
   how to proceed.  Do I really have to get all the way down to the level of
   computing jump offsets and implement these as conditional gotos?  Or is
   there a more elegant way that I just haven't found yet?

I believe there is one.  All you need is a construct that pushes the
address of a given chunk of code on the execution stack. That address
can then be used by instructions for conditionals, loops etc. E.g.
you would translate

        while x <= y do x := x * x

to:

        pushc  x y <=
        pushc  x x * x:=
        while

Cheers

-- Martin Odersky
--


 
 
 

RPN as intermediate code?

Post by David Furminie » Fri, 01 Sep 1995 04:00:00


:     x = a*(b+c)             ->  "a b c + * x ="
:     y = sqrt(b*b-4*a*c)     ->  "b b * 4 a * c * - 1 sqrt callfunc y ="
:     z = foo(a,bar(b),c)     ->  "a b 1 bar callfunc c 3 foo callfunc z ="

:     (actually, all the identifiers and constants go into a dictionary
:     and are replaced by a lookup index, but you get the idea...)

: However, when I get into conditionals and loops, I have absolutely no idea
: how to proceed.  Do I really have to get all the way down to the level of
: computing jump offsets and implement these as conditional gotos?  Or is
: there a more elegant way that I just haven't found yet?

 Ok I am not a compiler specialist, but I am quiet used to use
 the HP48 wich is fully RPN

  so I thought liske that

  first replace '=' with STO wich would be much more general
  like << 1 + >> 'INC' STO to create the INC function
  now you could do

     z = if b then a*(b+c) else sqrt(b*b-4*a*c)

  in

     << b c + a * >> X STO << b b * a c 4 * * - >> Y STO
     X Y b IFTE

   it is important to understand that the form << >> will not execute
   the stuff inside but only define a function
   and the IFTE function would be

   taking b from the stack if b is true then drop the first element from the
   stack and execute the second, if not swap the elements drop execute

  Hope this will help

--
  Dave
& have a good night


--


 
 
 

RPN as intermediate code?

Post by Henry Bak » Sun, 03 Sep 1995 04:00:00




> >I am struggling trying to convert my C interpreter from a tree-based
> >design to an intermediate-code design.

> >I now want to move to storing (and executing) some form of intermediate
> >code.  I really like the idea of encoding into an 'execution string', like
> >bc does, and reverse polish works great for this, as long as I stick to
> >encoding expressions.  But this model falls apart when I get into things
> >like conditionals and loops, and I am completely baffled how to do
> >short-circuit logic like this.  For instance, I can compile the following
> >code into the RPN strings shown (which execute easily):

Check out the following web stuff:

ftp://ftp.netcom.com/pub/hb/hbaker/ForthStack.html (also .ps.Z) and
ftp://ftp.netcom.com/pub/hb/hbaker/Use1Var.html  (also .ps.Z)

--
www/ftp directory:
ftp://ftp.netcom.com/pub/hb/hbaker/home.html

--


 
 
 

RPN as intermediate code?

Post by Stephen J Bev » Thu, 07 Sep 1995 04:00:00


   Try Threaded Interpreted Language (TIL) published by Byte. It is a very
   old book (1970's?) discussing the implementation of a language that
   looks like Forth. (May be it was Forth's forgotten predecessor?).

It is Forth.  Here's the relevant section of a comp.lang.forth FAQ on books :-

"Threaded Interpretive Languages"  R. G. Loeliger
Byte Books, 1981, ISBN: 0-07-038360-X

  This book is out of print but sometimes it is available from
  booksellers or used book stores.


  Before I read the book I thought that, based on the title, it was
  going to examine languages which are associated with threaded
  interpretive implementations and also cover different threading
  techniques.  I was disappointed to find that it only covers (FIG)
  Forth and indirect threading, in fact a more accurate title for the
  book would be "How to write an indirect threaded FIG Forth for the
  Z80".  That said, the book is easy to read and detailed enough for
  someone to implement a Forth from scratch on a Z80.  The bulk of the
  book is taken up with Z80 code for the various words which make up
  the Forth.  It is not a just listing though, each section is
  carefully explained with all inputs/outputs, registers,
  ... etc. documented.  There are also introductory sections which
  explain the concepts being used along with flowcharts (!) for some
  routines.  Towards the end of the book around 30 pages are devoted
  to an implementation of a Z80 Forth assembler.  Block-based VM,
  editors and cross compilation are briefly discussed.

--


 
 
 

RPN as intermediate code?

Post by Matthew Johns » Tue, 12 Sep 1995 04:00:00


The book "TIL..." was written well after FORTH was written, by the
author when he was trying to do his own version of a TIL.  He was
accused of "not understanding FORTH", but in fact he was doing his
own TIL, not FORTH.  Even so, I think his variations on the FORTH
idea did not prove very promising.  This is why you still see
versions of FORTH (although more and more rarely), but almost never
his TIL.

The latest incarnation of FORTH is IEEE 1275 standard for _hardware
independent_firmware. For more info on 1275, see my Web page on Open
Firmware at: ftp://ftp.netcom.com/pub/me/mejohnsn/OFIntro.html.  I
will be re-writing it soon to presume less FORTH knowledge on the
part of the reader.

Matthew Johnson
Sabaki Engineering

--


 
 
 

RPN as intermediate code?

Post by Robert Corbe » Wed, 13 Sep 1995 04:00:00



>The latest incarnation of FORTH is IEEE 1275 standard for _hardware
>independent_firmware.

For most programmers, the ANSI Forth standard (X3.215-1994) might
prove more useful.

                                        Sincerely,
                                        Bob Corbett
--


 
 
 

1. P-code, T-code, and Uni-Code Intermediate Languages

I am trying to find recent references to P-code, T-code and Uni-Code, all
of which are intermediate code languages. I am somewhat familiar with the
Pascal P machine and P-code although not with what has been done lately,
but I have not heard much about T-code or Uni-Code. It would be greatly
appreciated if anyone knowing about these codes would send me e-mail.
Thank you very much in advance.

Aiqin (Angelina) Pan
Mitsubishi Electric Research Laboratory
Sunnyvale, California

--


2. I swear I've got to be the most unlucky guy.....

3. intermediate code representation

4. RID/SID for NT Local Group

5. intermediate/3d code??

6. Microtek X6 El Any Good?

7. Position of an Intermediate Code Generator

8. Trying to find an example of producing intermediate Code in YACC

9. good Intermediate code for microcontroller c compiler?

10. Intermediate code representations (detailed)

11. Intermediate code representations

12. Why do intermediate codes have >, >=?