Nexus

Universal Parser Generator

Nexus reads any .grammar file and produces a combined lexer and LALR(1)/SLR(1) parser. One tool, any language, zero language-specific code in the engine.


How It Works

Write two files — a grammar and a language module — and Nexus generates a complete, production-ready parser.

{lang}.grammar (tokens + rules)
+
{lang}.zig (lang module)
-> nexus (generator engine)
-> parser.zig (lexer + parser)

Example: Expression Grammar

A complete grammar for arithmetic expressions with operator precedence, associativity, parentheses, and multiple start symbols.

Input: basic.grammar
@lexer

tokens
    integer, ident, plus, minus, star, slash,
    power, lparen, rparen, newline, eof, err

'+'                               plus
'-'                               minus
'*'                               star
'/'                               slash
"**"                              power
'('                               lparen
')'                               rparen
'\n'                              newline
[0-9]+                            integer
[a-zA-Z_][a-zA-Z0-9_]*            ident
.                                 err

@parser

@lang = "basic"
@conflicts = 0

name     = IDENT
program! = body                   (module ...1)
expr!    = expr                   1

body     = stmt                   (1)
         | body NEWLINE stmt      (...1 3)
         | body NEWLINE           1

stmt     = expr
expr     = @infix

unary    = "-" unary              (neg 2)
         | atom

atom     = name
         | INTEGER
         | "(" expr ")"           2

@infix unary
    "+"  left,  "-"  left
    "*"  left,  "/"  left
    "**" right

Generate (terminal):
$ ./bin/nexus basic.grammar src/parser.zig

Generated parser.zig — switch-based lexer, LALR(1) parse tables,
Tag enum, Sexp output, and all reduction actions in one module.
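
The annotations in the right-hand column of the parser rules, such as (neg 2), (...1 3), and a bare 2, read like S-expression templates over the symbols each rule matched. The Python sketch below shows one plausible reading. It is an assumption inferred from the example grammar, not documented Nexus behavior: a number n means "child n" (1-based), a name is a literal head symbol, and ...n splices the items of child n into the enclosing list. The function name build and the list-based template encoding are hypothetical.

```python
def build(template, children):
    """Evaluate an action template against the children matched by a rule.

    Assumed template encoding (hypothetical, for illustration only):
      int n        -> child n (1-based), passed through unchanged
      str "...n"   -> inside a list only: splice the items of child n
      other str    -> literal head symbol such as "neg" or "module"
      list         -> build a new list from its elements
    """
    if isinstance(template, int):
        return children[template - 1]
    if isinstance(template, str):
        return template
    out = []
    for item in template:
        if isinstance(item, str) and item.startswith("..."):
            # Splice: child n must itself be a list.
            out.extend(children[int(item[3:]) - 1])
        else:
            out.append(build(item, children))
    return out


# unary = "-" unary (neg 2): wrap the operand, drop the "-" token.
negated = build(["neg", 2], ["-", 7])

# body = body NEWLINE stmt (...1 3): append the new statement to the list.
body = build(["...1", 3], [["a"], "\n", "b"])

# program! = body (module ...1): put a head symbol in front of the body items.
program = build(["module", "...1"], [body])
```

Under this reading, the left-recursive body rule accumulates a flat list of statements instead of a nested tree, and the bare 2 on the parenthesized atom discards the parentheses.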

Key Features

Self-Hosting

Nexus parses its own grammar format using a parser it generated from itself. No hand-written bootstrap parser.

Language-Agnostic

The engine contains zero language-specific code. All language knowledge lives in your two input files.

Declarative Lexer

Regex-like patterns, state guards, actions, and SIMD-accelerated scanning — all from grammar declarations.
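
To make the idea of a declarative lexer concrete, here is a minimal Python sketch that drives a scanner from a rule table mirroring the @lexer section of basic.grammar. It assumes longest-match semantics with ties broken by declaration order, which is the classic lex convention and may differ from what Nexus actually does. State guards, actions, and SIMD scanning are not modeled, and the names RULES and lex are hypothetical.

```python
import re

# (pattern, tag) pairs in the same order as the @lexer section.
RULES = [
    (r"\+", "plus"),
    (r"-", "minus"),
    (r"\*", "star"),
    (r"/", "slash"),
    (r"\*\*", "power"),
    (r"\(", "lparen"),
    (r"\)", "rparen"),
    (r"\n", "newline"),
    (r"[0-9]+", "integer"),
    (r"[a-zA-Z_][a-zA-Z0-9_]*", "ident"),
    (r".", "err"),
]
COMPILED = [(re.compile(pattern), tag) for pattern, tag in RULES]


def lex(source):
    """Return (tag, start, length) triples; the text itself stays in `source`."""
    tokens = []
    pos = 0
    while pos < len(source):
        best_len, best_tag = 0, None
        for pattern, tag in COMPILED:
            match = pattern.match(source, pos)
            # Strictly longer wins, so earlier rules win ties.
            if match and match.end() - pos > best_len:
                best_len, best_tag = match.end() - pos, tag
        if best_tag is None:
            # Cannot happen with the catch-all rule, but avoids an endless loop.
            best_len, best_tag = 1, "err"
        tokens.append((best_tag, pos, best_len))
        pos += best_len
    tokens.append(("eof", pos, 0))
    return tokens


tokens = lex("x+2**3\n")
```

Because the table is plain data, swapping in another language's rules changes the lexer without touching the scanning loop. Note that the example grammar has no whitespace rule, so a space falls through to the catch-all err rule in this sketch.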

Operator Precedence

@infix auto-generates a full precedence chain from a simple table of operators and associativity.
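
The effect of such an operator table can be illustrated with precedence climbing in Python. This is a stand-in technique, not the LALR(1) precedence chain that Nexus generates, and it assumes the rows of the @infix block run from loosest-binding to tightest-binding. The operand rule follows the example grammar (unary minus, names, integers, parentheses); the names INFIX, PREC, and parse are hypothetical.

```python
import re

# Operator table copied from the @infix block, one dict per row.
# Assumption: earlier rows bind more loosely than later rows.
INFIX = [
    {"+": "left", "-": "left"},
    {"*": "left", "/": "left"},
    {"**": "right"},
]
PREC = {op: (level, assoc)
        for level, row in enumerate(INFIX, start=1)
        for op, assoc in row.items()}

# Simplified tokenizer: findall silently skips anything it does not recognize.
TOKEN = re.compile(r"\*\*|[-+*/()]|[0-9]+|[A-Za-z_][A-Za-z0-9_]*")


def parse(text):
    """Parse an expression into nested tuples such as ("+", 1, ("*", 2, 3))."""
    tokens = TOKEN.findall(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def unary():
        if peek() == "-":
            advance()
            return ("neg", unary())
        return atom()

    def atom():
        if peek() is None:
            raise SyntaxError("unexpected end of input")
        tok = advance()
        if tok == "(":
            inner = expr(1)
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return inner
        if tok in PREC or tok == ")":
            raise SyntaxError(f"unexpected {tok!r}")
        return int(tok) if tok.isdigit() else tok

    def expr(min_level):
        left = unary()
        while peek() in PREC and PREC[peek()][0] >= min_level:
            op = advance()
            level, assoc = PREC[op]
            # Left-associative: the right operand must bind tighter.
            # Right-associative: the same level may recur on the right.
            right = expr(level + 1 if assoc == "left" else level)
            left = (op, left, right)
        return left

    tree = expr(1)
    if peek() is not None:
        raise SyntaxError(f"unexpected {peek()!r}")
    return tree
```

With this table, parse("1-2-3") groups to the left as ("-", ("-", 1, 2), 3), while parse("2**3**2") groups to the right as ("**", 2, ("**", 3, 2)).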

Context-Sensitive Keywords

@as resolves identifier-keyword ambiguity at parse time with ordered priority and state checks.
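
The general idea can be sketched in Python as follows. This is a hypothetical model, not the Nexus implementation or the semantics of @as: an identifier-shaped lexeme has an ordered list of candidate keyword kinds, and the first candidate that the parser could accept in its current state wins; otherwise the lexeme stays an identifier. The table CANDIDATES, the kind names, and the function resolve are all invented for illustration.

```python
# Ordered candidate kinds per lexeme (hypothetical names). Earlier entries
# have higher priority when several are acceptable at once.
CANDIDATES = {
    "set": ["SET_CMD"],
    "if": ["IF_CMD", "IF_POSTFIX"],
}


def resolve(text, acceptable):
    """Choose a token kind for an identifier-shaped lexeme.

    `acceptable` is the set of terminal kinds the parser can shift in its
    current state (the "state check" in this sketch).
    """
    for kind in CANDIDATES.get(text, []):
        if kind in acceptable:
            return kind
    return "IDENT"


# At the start of a statement a command is acceptable, so "set" is a keyword.
at_statement_start = resolve("set", {"SET_CMD", "IF_CMD", "IDENT"})

# Inside an expression only identifiers are acceptable, so "set" is a name.
inside_expression = resolve("set", {"IDENT"})
```

Deferring the decision to parse time is what lets the same spelling act as a keyword in one position and as an ordinary name in another.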

Zero-Copy Tokens

8-byte packed tokens reference the original source. No string allocations during lexing.
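
As an illustration of how a token can fit in 8 bytes and still recover its text, the Python sketch below packs a start offset, a length, a tag, and a flags byte into one fixed-size record. The field layout (32-bit offset, 16-bit length, 8-bit tag, 8-bit flags) is an assumption chosen to total 8 bytes; the actual Nexus layout is not described here. The tag list reuses the token names from basic.grammar.

```python
import struct

# Hypothetical layout: u32 offset, u16 length, u8 tag, u8 flags (little-endian).
TOKEN = struct.Struct("<IHBB")

TAGS = ["integer", "ident", "plus", "minus", "star", "slash",
        "power", "lparen", "rparen", "newline", "eof", "err"]


def pack_token(tag, start, length, flags=0):
    """Encode a token as 8 bytes; no lexeme text is stored."""
    return TOKEN.pack(start, length, TAGS.index(tag), flags)


def token_tag(packed):
    """Decode the tag name from a packed token."""
    return TAGS[TOKEN.unpack(packed)[2]]


def token_text(source, packed):
    """Return a view of the lexeme inside `source` without copying it."""
    start, length, _tag, _flags = TOKEN.unpack(packed)
    return memoryview(source)[start:start + length]


source = b"x+42"
tok = pack_token("integer", 2, 2)
lexeme = token_text(source, tok)  # a view onto source[2:4]
```

The token carries only positions, so the lexer never allocates strings; a consumer that needs the text slices the original buffer on demand. In this layout a single token is limited to 65,535 bytes and the source to 4 GiB, which is the kind of trade-off a packed format has to make.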


Validated Languages

Nexus has been used to build parsers for three real-world languages, each with its own grammar and language module.

Language  Grammar         Description
Zag       zag.grammar     Systems language with indent-based blocks and token reclassification
Slash     slash.grammar   Scripting language with heredocs, regex literals, and indent/outdent
MUMPS     mumps.grammar   Legacy healthcare language with pattern mode, dot-level counting, and context-sensitive commands