[talk] lex start conditions.

Warner Losh imp at bsdimp.com
Thu Oct 29 15:34:16 EDT 2020

On Thu, Oct 29, 2020 at 10:31 AM Pete Wright <pete at nomadlogic.org> wrote:

> On 10/28/20 6:44 PM, R. Clayton wrote:
> > It's van Wyk's thesis work under Knuth on Ideal, a constraint-based
> > little language for describing pictures, so I'm guessing it's from
> > whatever they had going at Stanford in the late '70s.  I'm hoping it's
> > a common-knowledge idiom that has subsided into obscurity over time,
> > and I'm trying to find long-timer who will look at it and say "Oh
> > yeah, that...".
> If you aren't already subscribed to TUHS
> (https://minnie.tuhs.org/mailman/listinfo/tuhs) it might be worth
> sending a message to that list.  I've found it really useful resource
> for things like this, often getting information directly from the people
> involved.

Late 70's would be either the PWB version of lex, or the V7 version.

But looking at the V7 version, we see

header.c: fprintf(fout,"# define BEGIN yybgin = yysvec + 1 +\n");

but today's flex generates:

#define BEGIN (yy_start) = 1 + 2 *

so in flex each start condition gets a pair of states (that's the 2 *), and
the +1 could get to the other half of the pair.  Lex generates sequential
#defines for each of these states.

Rummaging around in the code a bit, I see that yy_at_bol is used when we're
at the start of the line. It's 1 at start of line and 0 otherwise.

So my best guess is that the PROGRAM+1 transitions to the parser state as
if it were the start of the line (or is trying to) even if it really isn't
at the start of a line.
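
To make the arithmetic concrete, here's a minimal sketch in C. It is not
taken from either code base: the BEGIN macro has the same shape as the one
flex emits, but yy_start is just a plain int, PROGRAM is assumed to be
start condition number 1, and initial_dfa_state() is a made-up helper that
models my reading of the generated scanner (it adds the at-BOL flag to
yy_start before matching).

```c
/* Hypothetical start-condition numbers; flex would #define these itself. */
#define INITIAL 0
#define PROGRAM 1

/* Stand-in for the scanner's start state. */
static int yy_start = 1;

/* Same shape as the flex macro. There are no parentheses around the
 * argument, so "BEGIN PROGRAM+1" expands to 1 + 2 * PROGRAM+1. */
#define BEGIN yy_start = 1 + 2 *

/* Assumed model of the scanner: the state it starts matching from is
 * yy_start plus 1 when at the beginning of a line, plus 0 otherwise. */
static int initial_dfa_state(int at_bol)
{
    return yy_start + (at_bol ? 1 : 0);
}
```

With these assumptions, BEGIN PROGRAM leaves yy_start at 3, and the scanner
starts from state 4 when at the beginning of a line. BEGIN PROGRAM+1 sets
yy_start to 4 directly, so the scanner starts from that same state even in
the middle of a line.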

This matches what I found in the flex manual:

       The macro yy_set_bol(at_bol) can be used to control whether the
       buffer's scanning context for the next token match is done as though
       at the beginning of a line.  A non-zero macro argument makes rules
       anchored with '^' active, while a zero argument makes '^' rules
       inactive.

       The macro YY_AT_BOL() returns true if the next token scanned from the
       current buffer will have '^' rules active, false otherwise.

as well as scattered references to line-based parsing.

The next layer of details, though, requires more study than I have time for
right now :)

Reading your grammar that way, does it track?
