Universal Parser Generator

Nexus reads any .grammar file and produces a combined lexer and LALR(1)/SLR(1) parser: one tool, any language, and zero language-specific code in the engine.


How It Works

Write two files — a grammar and a language module — and Nexus generates a complete, production-ready parser.

{lang}.grammar (tokens + rules) + {lang}.zig (lang module) → nexus (generator engine) → parser.zig (lexer + parser)

Example: Expression Grammar

A complete grammar for arithmetic expressions with operator precedence, associativity, parentheses, and multiple start symbols.

Input (basic.grammar)

@lexer

tokens
    integer, ident, plus, minus, star, slash,
    power, lparen, rparen, newline, eof, err

'+'                              → plus
'-'                              → minus
'*'                              → star
'/'                              → slash
"**"                             → power
'('                              → lparen
')'                              → rparen
'\n'                             → newline
[0-9]+                           → integer
[a-zA-Z_][a-zA-Z0-9_]*           → ident
.                                → err

@parser

@lang = "basic"
@conflicts = 0

name     = IDENT
program! = body                  → (module ...1)
expr!    = expr                  → 1

body     = stmt                  → (1)
         | body NEWLINE stmt     → (...1 3)
         | body NEWLINE          → 1

stmt     = expr
expr     = @infix

unary    = "-" unary             → (neg 2)
         | atom

atom     = name
         | INTEGER
         | "(" expr ")"          → 2

@infix unary
    "+"  left,  "-"  left
    "*"  left,  "/"  left
    "**" right

Generate (terminal)

$ ./bin/nexus basic.grammar src/parser.zig

Generated parser.zig — switch-based lexer, LALR(1) parse tables,
Tag enum, Sexp output, and all reduction actions in one module.
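
One plausible reading of the → actions in basic.grammar is that a bare number selects the value of that right-hand-side symbol, a parenthesized form builds a list, and `...N` splices the elements of value N into the list. The Python sketch below models that reading only. The `Splice` class and `apply_action` function are hypothetical names for illustration, not part of Nexus, and the generated Zig code may work differently.

```python
from dataclasses import dataclass

# Hypothetical model of the action notation; not Nexus's implementation.
# A template is one of:
#   int        -> the value of the Nth right-hand-side symbol (1-based)
#   str        -> a literal tag such as "neg" or "module"
#   Splice(n)  -> the elements of the Nth value, spliced in place ("...N")
#   list       -> a new list built from the nested templates


@dataclass(frozen=True)
class Splice:
    position: int


def apply_action(template, values):
    """Build a reduction result from the values matched by a rule's right-hand side."""
    if isinstance(template, int):
        return values[template - 1]
    if isinstance(template, str):
        return template
    result = []
    for item in template:
        if isinstance(item, Splice):
            result.extend(values[item.position - 1])
        else:
            result.append(apply_action(item, values))
    return result


# unary = "-" unary → (neg 2)
negated = apply_action(["neg", 2], ["-", "x"])
# body = body NEWLINE stmt → (...1 3)
extended = apply_action([Splice(1), 3], [["a", "b"], "\n", "c"])
# atom = "(" expr ")" → 2
unwrapped = apply_action(2, ["(", ["+", "1", "2"], ")"])
print(negated, extended, unwrapped)
```

Under this reading, `body NEWLINE stmt → (...1 3)` grows a flat statement list instead of nesting one list inside another on every reduction.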

Key Features

Self-Hosting

Nexus parses its own grammar format using a parser it generated from itself. No hand-written bootstrap parser.

Language-Agnostic

The engine contains zero language-specific code. All language knowledge lives in your two input files.

Declarative Lexer

Regex-like patterns, state guards, actions, and SIMD-accelerated scanning — all from grammar declarations.
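
To make the idea of a declarative lexer concrete, here is a minimal Python sketch that drives a tokenizer from a rule table mirroring the @lexer section of basic.grammar. It assumes longest-match semantics with earlier rules winning ties, which the grammar format may or may not use. State guards, actions, and SIMD scanning are not modeled, and `tokenize` is a hypothetical name rather than anything Nexus generates.

```python
import re

# Rule table mirroring the @lexer section of basic.grammar.
RULES = [
    (r"\+", "plus"),
    (r"-", "minus"),
    (r"\*", "star"),
    (r"/", "slash"),
    (r"\*\*", "power"),
    (r"\(", "lparen"),
    (r"\)", "rparen"),
    (r"\n", "newline"),
    (r"[0-9]+", "integer"),
    (r"[a-zA-Z_][a-zA-Z0-9_]*", "ident"),
    (r".", "err"),
]
COMPILED = [(re.compile(pattern), tag) for pattern, tag in RULES]


def tokenize(source):
    """Yield (tag, start, end) triples.

    Assumption: the longest match wins and earlier rules win ties.
    """
    pos = 0
    while pos < len(source):
        # Fallback keeps the loop moving even if no rule matches.
        best_tag, best_end = "err", pos + 1
        best_len = 0
        for pattern, tag in COMPILED:
            match = pattern.match(source, pos)
            if match and match.end() - pos > best_len:
                best_tag, best_end = tag, match.end()
                best_len = match.end() - pos
        yield best_tag, pos, best_end
        pos = best_end
    yield "eof", pos, pos


tags = [tag for tag, _, _ in tokenize("a+2**3\n")]
print(tags)
```

Because the table is plain data, supporting a new token means adding a row rather than editing the scanning loop, which is the property the grammar declarations rely on.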

Operator Precedence

@infix auto-generates a full precedence chain from a simple table of operators and associativity.
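
A precedence chain is the classic way to encode operator precedence directly in grammar rules: one rule per level, each referring to the next tighter level. The Python sketch below expands an operator table like the one in basic.grammar into such a chain. It assumes the table rows run from loosest to tightest binding. The rule names (`infix0`, `infix1`, ...) and the `expand_infix` function are hypothetical, and Nexus may produce a different internal form.

```python
def expand_infix(base, levels, prefix="infix"):
    """Expand an operator table into one grammar rule per precedence level.

    `levels` runs from loosest to tightest binding; each level is a list of
    (operator, associativity) pairs. Returns (rule_name, alternatives) pairs.
    """
    names = [f"{prefix}{i}" for i in range(len(levels))] + [base]
    rules = []
    for i, level in enumerate(levels):
        this, tighter = names[i], names[i + 1]
        alternatives = []
        for op, assoc in level:
            if assoc == "left":
                # Left recursion groups a - b - c as (a - b) - c.
                alternatives.append(f'{this} "{op}" {tighter}')
            else:
                # Right recursion groups a ** b ** c as a ** (b ** c).
                alternatives.append(f'{tighter} "{op}" {this}')
        alternatives.append(tighter)
        rules.append((this, alternatives))
    return rules


# The @infix table from basic.grammar, one inner list per table row.
TABLE = [
    [("+", "left"), ("-", "left")],
    [("*", "left"), ("/", "left")],
    [("**", "right")],
]
chain = expand_infix("unary", TABLE)
for name, alternatives in chain:
    print(f"{name} = " + " | ".join(alternatives))
```

Writing these rules by hand is repetitive and easy to get wrong when a level is added or reordered, which is the work a table-driven declaration removes.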

Context-Sensitive Keywords

@as resolves identifier-keyword ambiguity at parse time with ordered priority and state checks.
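
The general technique behind parse-time keyword resolution can be sketched as follows: the lexer emits an identifier-shaped lexeme, and the parser promotes it to a keyword only if a candidate matches, the current parser state can accept that keyword, and any extra state check passes. The Python below is a hypothetical illustration of that idea. The `classify` function, the candidate triples, and the `line_start` flag are all invented for the example and do not describe the actual @as syntax or semantics.

```python
def classify(text, candidates, acceptable, flags=frozenset()):
    """Pick a token kind for an identifier-shaped lexeme.

    candidates: ordered (keyword, kind, required_flag) triples, highest priority first.
    acceptable: token kinds the parser can shift in its current state.
    flags: hypothetical state flags consulted by the state checks.
    """
    for keyword, kind, required_flag in candidates:
        if text != keyword:
            continue
        if kind not in acceptable:
            continue  # the parser cannot use this keyword here
        if required_flag is not None and required_flag not in flags:
            continue  # the state check failed
        return kind
    return "ident"  # nothing applied, so the lexeme stays an identifier


# Hypothetical candidates: "set" is a keyword only at the start of a line.
CANDIDATES = [
    ("if", "kw_if", None),
    ("set", "kw_set", "line_start"),
]

as_keyword = classify("set", CANDIDATES, {"kw_set", "ident"}, {"line_start"})
as_name = classify("set", CANDIDATES, {"ident"}, set())
print(as_keyword, as_name)
```

The same spelling therefore yields a keyword in one position and an ordinary name in another, without the lexer needing to know which is which.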

Zero-Copy Tokens

8-byte packed tokens reference the original source. No string allocations during lexing.
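
As a rough illustration of how an 8-byte token can refer back to the source instead of owning a string, the Python sketch below packs an offset, a length, a tag, and a flags byte into 8 bytes. The field layout (32-bit offset, 16-bit length, 8-bit tag, 8-bit flags) is an assumption made for the example, as are the names `pack_token` and `token_text`. The actual Nexus layout may differ.

```python
import struct

# Assumed layout (not confirmed by the source): u32 offset, u16 length,
# u8 tag, u8 flags, little-endian, 8 bytes in total.
TOKEN = struct.Struct("<IHBB")
TAGS = ["integer", "ident", "plus"]


def pack_token(offset, length, tag, flags=0):
    """Encode a token as 8 bytes that point into the source buffer."""
    return TOKEN.pack(offset, length, TAGS.index(tag), flags)


def token_text(source, packed):
    """Return a view of the token's bytes in the original source, without copying."""
    offset, length, _tag, _flags = TOKEN.unpack(packed)
    return source[offset:offset + length]


source = memoryview(b"total+42")
tokens = [
    pack_token(0, 5, "ident"),
    pack_token(5, 1, "plus"),
    pack_token(6, 2, "integer"),
]
# bytes() copies here only so the slices can be printed and compared.
texts = [bytes(token_text(source, t)) for t in tokens]
print(TOKEN.size, texts)
```

With this layout a token stream costs a fixed 8 bytes per token regardless of lexeme length, at the price of capping source size and token length at what the offset and length fields can hold.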

Validated Languages

Nexus has been used to build parsers for three real-world languages, each with its own grammar and language module.

Language	Grammar	Description
Zag	`zag.grammar`	Systems language with indent-based blocks and token reclassification
Slash	`slash.grammar`	Scripting language with heredocs, regex literals, and indent/outdent
MUMPS	`mumps.grammar`	Legacy healthcare language with pattern mode, dot-level counting, and context-sensitive commands