Exploratory Parsing in Frames

A brief charter: we explore Ward's technique of Exploratory Parsing in one hour, using a local wiki with the Frame Plugin and our esm.html script.

Ward's original was a web-based experiment manager that could prepare, monitor, and refine task-specific grammars, which were turned into engines with a slightly enhanced version of Ian Piumarta's pegleg.

We begin with our esm frame and Peggy, a JavaScript PEG parser generator. docs

We will parse raw text: a comma-separated list of URLs.

Trouble: I wanted newline as separator but couldn't get that to parse.
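A guess at the newline trouble, since the failing grammar isn't shown here: the grammar lives in a JavaScript template literal, so a `"\n"` written there is turned into a real line break before the parser generator sees it, and as far as I know Peggy does not accept a raw line break inside a string literal. The sketch below only shows what JavaScript does to the text. The rule name `sep` is made up for illustration.

```javascript
// In an ordinary template literal, \n becomes a real line break
// before any grammar compiler sees the text.
const cooked = `sep = "\n"`;

// Doubling the backslash, or using String.raw, keeps the two characters
// backslash + n, so the grammar compiler can interpret the escape itself.
const doubled = `sep = "\\n"`;
const raw = String.raw`sep = "\n"`;

console.log(JSON.stringify(cooked));
console.log(JSON.stringify(doubled));
console.log(doubled === raw);
```

If this is the cause, writing the separator as `"\\n"` in the grammar template (or building the grammar with `String.raw`) may be enough. The raw text itself can hold real newlines without any escaping.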

const rawtext = `https://cdn.jsdelivr.net/npm/peggy@4.0.2/esm,https://c2.com/ward/sys/find.cgi?search=explore&start=3&list=7`

Here's our grammar:

Trouble: the other rule works when called directly but doesn't take over when url fails to match.

const grammar = `
start = url ("," url)*
url = (protocol "//" domain path* query?) / other
protocol = "http" "s"? ":"
domain = word ("." word)*
path = "/" word suffix? version?
suffix = "." word
query = "?" param ("&" param)*
param = key ("=" value)?
key = word / word
value = word / num
version = "@" capture:(num "." num "." num) {return capture.flat().join("")}
word = capture:([a-z] [a-z0-9]*) {return capture.flat().join("")}
num = [0-9]+
other = [^,]+
`
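A possible explanation for the `other` trouble, offered as a sketch rather than a diagnosis: PEG choice is ordered, and the first alternative of `url` can succeed after consuming only a prefix of an item (for example by stopping at a character `path` can't handle). Once it has succeeded, `other` is never tried, and the parse then fails at the next `","`. The hand-rolled combinators below are not Peggy, and the reduced grammar (`strict`, `list`) is hypothetical; they only illustrate the commit behaviour and one way around it, a lookahead that demands a comma or end of input.

```javascript
// Tiny hand-rolled PEG combinators, for illustration only (not Peggy).
// A parser is (text, pos) => end position, or -1 on failure.
const lit = s => (text, pos) => (text.startsWith(s, pos) ? pos + s.length : -1);
const chr = re => (text, pos) => (pos < text.length && re.test(text[pos]) ? pos + 1 : -1);
const seq = (...ps) => (text, pos) => {
  for (const p of ps) {
    pos = p(text, pos);
    if (pos < 0) return -1;
  }
  return pos;
};
const choice = (...ps) => (text, pos) => {
  // Ordered choice: the first alternative that succeeds wins, for good.
  for (const p of ps) {
    const end = p(text, pos);
    if (end >= 0) return end;
  }
  return -1;
};
const many = p => (text, pos) => {
  for (;;) {
    const end = p(text, pos);
    if (end <= pos) return pos; // stop on failure or no progress
    pos = end;
  }
};
const many1 = p => seq(p, many(p));
const peek = p => (text, pos) => (p(text, pos) >= 0 ? pos : -1); // like &p
const eof = (text, pos) => (pos === text.length ? pos : -1);     // like !.

// Hypothetical, much reduced stand-in for the page's grammar.
const word = many1(chr(/[a-z]/));
const path = seq(lit("/"), word);
const strict = seq(lit("http://"), word, many(path));
const other = many1(chr(/[^,]/));
const list = url => seq(url, many(seq(lit(","), url)), eof);

const asWritten = list(choice(strict, other));
const guarded = list(choice(seq(strict, peek(choice(lit(","), eof))), other));

const parses = (p, text) => p(text, 0) === text.length;
const text = "http://a/b/+c,http://d";
console.log(parses(asWritten, text)); // false: strict stops before "/+c" and still wins
console.log(parses(guarded, text));   // true: the lookahead fails, so other takes the item
```

In Peggy syntax the same guard would be something like appending `&("," / !.)` to the first alternative of `url`. That is untested against the full grammar above.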

We pretty-print the parse result and follow that with the trace-generated dot and log.

//wiki.dbbs.co/assets/pages/js-snippet-template/esm.html HEIGHT 400

For now we copy/paste the dot diagram here.

digraph {
node[style=filled fillcolor=palegreen]
url->protocol [label=2]
domain->word [label=5]
url->domain [label=2]
path->word [label=6]
url->path [label=6]
version->num [label=3]
path->version [label=1]
start->url [label=2]
suffix->word [label=1]
path->suffix [label=1]
key->word [label=3]
param->key [label=3]
value->word [label=1]
param->value [label=3]
query->param [label=3]
value->num [label=2]
url->query [label=1]
}

First our trace handler, then the parser run and print.

const stack = [];  // rules currently being tried, outermost first
const log = [];    // one line per successful rule match
const tally = {};  // "parent->child" edge => {count}
function trace({type, rule, result}) {
  // On a match, log the rule path and count the parent->child edge.
  const show = () => {
    log.push(`${stack.join("->")}, ${JSON.stringify(result)}`);
    const edge = stack.slice(-2).join("->");
    if (edge in tally) tally[edge].count++;
    else tally[edge] = {count: 1};
  };
  switch (type) {
    case 'rule.enter': stack.push(rule); break;
    case 'rule.fail': stack.pop(); break;
    case 'rule.match': show(); stack.pop(); break;
  }
}
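To see what the handler records without running a parser, here is a minimal sketch that feeds it a few hand-written events. The events only mimic the fields the handler reads (`type`, `rule`, `result`), and `makeTracer` is a name made up for this example so that it owns its own state.

```javascript
// Same bookkeeping as the page's trace handler, wrapped in a factory.
// makeTracer is a hypothetical helper, not part of Peggy or the page.
function makeTracer() {
  const stack = [], log = [], tally = {};
  function trace({ type, rule, result }) {
    if (type === "rule.enter") { stack.push(rule); return; }
    if (type === "rule.match") {
      log.push(`${stack.join("->")}, ${JSON.stringify(result)}`);
      const edge = stack.slice(-2).join("->");
      tally[edge] = { count: (tally[edge]?.count ?? 0) + 1 };
    }
    if (type === "rule.match" || type === "rule.fail") stack.pop();
  }
  return { trace, log, tally };
}

// Hand-written events: word matches once, fails once, then url and start match.
const events = [
  { type: "rule.enter", rule: "start" },
  { type: "rule.enter", rule: "url" },
  { type: "rule.enter", rule: "word" },
  { type: "rule.match", rule: "word", result: "http" },
  { type: "rule.enter", rule: "word" },
  { type: "rule.fail", rule: "word" },
  { type: "rule.match", rule: "url", result: ["http"] },
  { type: "rule.match", rule: "start", result: [["http"]] },
];
const t = makeTracer();
events.forEach(t.trace);
console.log(t.tally);
console.log(t.log.join("\n"));
```

The failed `word` leaves no mark in the tally or the log. The top-level rule has no parent on the stack, so it is tallied under the plain key `start`, which dot would read as a node statement rather than an edge.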

import peggy from 'https://cdn.jsdelivr.net/npm/peggy@4.0.2/+esm';
export async function emit(el) {
  // trace:true compiles tracing into the parser; tracer receives the events.
  const option = {trace: true, tracer: {trace}};
  const parser = peggy.generate(grammar, option);
  const result = parser.parse(rawtext, option);
  const style = `style="background-color:white"`;
  const token = t => `<code ${style}>${t}</code>`;
  // Flatten the nested parse result, drop nulls and empty strings, wrap each token.
  const pretty = result
    .flat(9)
    .filter(t => t)
    .map(token)
    .join(" ");
  // One dot line per tallied parent->child pair.
  const dot = Object.entries(tally)
    .map(t => `${t[0]} [label=${t[1].count}]`)
    .join("\n");
  el.innerHTML = `
    <div>${pretty}</div>
    <pre>${dot}</pre>
    <pre>${log.join("\n")}</pre>`;
}
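The dot text printed by emit is only the list of edges, without the `digraph { ... }` wrapper or node style that the pasted diagram carries. A minimal sketch of a helper that produces the complete diagram, so the copy/paste step needs no hand editing. `toDot` and the `sample` tally are made up for illustration; the node style is copied from the diagram on this page.

```javascript
// Hypothetical helper: turn a tally shaped like the one trace() builds
// into a complete Graphviz digraph.
function toDot(tally) {
  const edges = Object.entries(tally)
    .filter(([edge]) => edge.includes("->")) // keep only parent->child keys
    .map(([edge, { count }]) => `${edge} [label=${count}]`);
  return [
    "digraph {",
    "node[style=filled fillcolor=palegreen]",
    ...edges,
    "}",
  ].join("\n");
}

// Made-up sample data, not taken from a real run.
const sample = {
  "url->protocol": { count: 2 },
  "start->url": { count: 2 },
  start: { count: 1 },
};
console.log(toDot(sample));
```

Inside emit this could replace the `dot` expression with `toDot(tally)`.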