IDE for validating specifications
A reminder of our overarching vision
AI promises to make high-assurance software, with strong guarantees, cheap and plentiful. But guarantees are only helpful if they guarantee what we actually care about. It will fall to human engineers to determine whether the guarantees they got are the ones they need.
This is fundamentally a human-in-the-loop design challenge. Therefore, we’re researching how humans make sense of formal specifications and what leads them to resolve issues and establish confidence in these specs*.
In AI-driven workflows, humans will describe what they want using their existing designs and documentation, or by describing it on the spot. These descriptions will be informal compared to the precision of formal specifications. AI systems will increasingly handle converting these documents into formal specifications and the subsequent verified code synthesis. Still, humans will need to review the many clarifying assumptions made while refining the informal documents into formal specifications. Human review is vital because these clarifying assumptions change the meaning of the specification, and because, ultimately, the formal specification is what humans trust when they decide to deploy.
To understand and address human review needs, Atlas is studying how professional cryptography engineers establish trust in formalizations of existing natural language specifications.
Where the tool is now
To do this, we’ve built a tool for specification understanding and validation. We’re taking the approach that this should be an open-source platform where anyone can add modular features (like counterexample generation). It’s easy to start waxing poetic about a future paradigm of specification-driven AI (and we will in the next post), but concretely, we’re starting by simply doing line-by-line mapping between natural language and a formal spec.
Our goal before the end of this year is for a software developer with no formal methods experience to find a mistake we've introduced into a mechanized formal specification of a system they're familiar with, simply because the tool steers them toward realizing that the spec says something they don't intend.
If you’re interested, we have monthly updates for one of our grantors here in Google Docs. You can also check out the code directly on Github: https://github.com/atlas-computing-org/formal-specification-ide.
Signal is a great testbed
To demonstrate a specific use case, we’ve taken the documentation of X3DH from the Signal Foundation and mapped it to Lean.
Here you can see the markdown description of the X3DH protocol on the left and the corresponding Lean formalization of sending the initialization message on the right. Gray highlights mark text that has a corresponding chunk on the other side. Yellow highlights are warnings (possible inconsistencies), and red highlights mark likely problems in the correspondence between description and spec.
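To give a flavor of what appears in the right-hand pane, a fragment of such a formalization might look like the Lean sketch below. The names and types here are purely illustrative, not Atlas's actual formalization; the fields follow the initial message described in Signal's X3DH document.

```lean
-- Illustrative sketch only; not Atlas's actual X3DH formalization.
-- Field names follow the initial message described in Signal's X3DH document.

def Bytes := List UInt8

structure PublicKey where
  raw : Bytes

/-- Alice's first message to Bob after performing X3DH. -/
structure InitialMessage where
  identityKey  : PublicKey   -- Alice's identity key IK_A
  ephemeralKey : PublicKey   -- Alice's ephemeral key EK_A
  usedPreKeys  : List Nat    -- identifiers of Bob's prekeys that Alice used
  ciphertext   : Bytes       -- initial ciphertext under the derived secret key
```

Each field of a structure like this can then be mapped back to the sentence in the natural-language document that motivated it, which is exactly the correspondence the highlights visualize.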
The current components are:
The IDE (built by Atlas)
Signal’s natural-language specification of their X3DH protocol at https://signal.org/docs/specifications/x3dh/
A hand-generated formalization of this specification in Lean (by Atlas)
Some hand-generated annotations that identify relationships and concerns between the informal and formal language (by Atlas)
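The annotations in the last component could be represented as simple records linking a span of the natural-language text to a span of the formal spec, tagged with a severity matching the highlight colors. The following TypeScript sketch is hypothetical, not the tool's actual schema:

```typescript
// Hypothetical sketch of an annotation record; not the tool's actual schema.
type Severity = "match" | "warning" | "error"; // gray, yellow, red highlights

interface TextRange {
  start: number; // character offset, inclusive
  end: number;   // character offset, exclusive
}

interface Annotation {
  naturalLanguageRange: TextRange; // span in the markdown description
  formalRange: TextRange;          // corresponding span in the Lean spec
  severity: Severity;
  note?: string;                   // optional explanation of the concern
}

// Example: flag a likely mismatch between prose and spec
const example: Annotation = {
  naturalLanguageRange: { start: 120, end: 180 },
  formalRange: { start: 45, end: 90 },
  severity: "error",
  note: "Spec omits the one-time prekey case described here",
};
```

A record like this is enough to drive the paired highlighting, and a modular counterexample generator or AI annotator could emit the same shape.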
We plan to add the following capabilities:
AI-copilot-style generation of formal specifications
AI-generated annotations mapping parts on the left side to the right side
AI can do this, but doesn’t do a great job
We’ll generate these, compare, and identify paths to improving outputs
Integration into VSCode so Lean or other formal code can leverage state-of-the-art IDE features
We’re already finding this valuable in our spec formalization. Mapping how our Lean code matched Signal’s specification and calling out our simplifying assumptions caught multiple mistakes we’d made and suggested additional design improvements. We’re excited to see how this can help others.
Call to action
If you would like to use this tool in your specification workflow, please reach out! We would love to support you if we can make something you’d actively use. hello@atlascomputing.org
* Don’t take our word for it; here’s Talia Ringer at HoTSoS discussing the spec validation problem near the end of her talk
(46:08): This is like a challenge I want to leave people with – I think the most important problem right now in this space is to figure out, what tools can actually best help users make sense of a generated specification that comes out of one of these tools.