Research Papers#
Research outputs from the AI4FM group at Loyola University Chicago.
Can LLMs Write Correct TLA+ Specifications? Evaluating Natural-Language-to-TLA+ Generation
Under Submission
The first systematic evaluation of LLM-based TLA+ specification synthesis from natural language. Evaluates 30 LLMs across eight families on 205 TLA+ specifications using the SANY parser and TLC model checker. LLMs achieve only 8.6% semantic correctness. Model size does not predict quality on formal languages.
TLA+ for All: Running Model Checking in a Python Notebook
figshare, February 2025 - Laufer, Thiruvathukal
Integrates TLA+ model checking into Python notebook environments, making formal verification accessible without specialized tooling. Supports literate modeling, reproducibility, and barrier-free access for teaching and research.