<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:itunes="http://www.itunes.com/dtds/podcast-1.0.dtd" xmlns:googleplay="http://www.google.com/schemas/play-podcasts/1.0"><channel><title><![CDATA[Atlas Blog]]></title><description><![CDATA[Atlas Computing Blog is the main blog for https://atlascomputing.org/
We're scaling human review, starting with tools to help people precisely define how software should behave.]]></description><link>https://blog.atlascomputing.org</link><image><url>https://substackcdn.com/image/fetch/$s_!Nlv7!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff370c2bd-71e7-4808-a3c6-f0e1be165e0d_1280x1280.png</url><title>Atlas Blog</title><link>https://blog.atlascomputing.org</link></image><generator>Substack</generator><lastBuildDate>Sun, 19 Apr 2026 02:28:29 GMT</lastBuildDate><atom:link href="https://blog.atlascomputing.org/feed" rel="self" type="application/rss+xml"/><copyright><![CDATA[Atlas Computing]]></copyright><language><![CDATA[en]]></language><webMaster><![CDATA[atlascomputing@substack.com]]></webMaster><itunes:owner><itunes:email><![CDATA[atlascomputing@substack.com]]></itunes:email><itunes:name><![CDATA[Atlas Computing]]></itunes:name></itunes:owner><itunes:author><![CDATA[Atlas Computing]]></itunes:author><googleplay:owner><![CDATA[atlascomputing@substack.com]]></googleplay:owner><googleplay:email><![CDATA[atlascomputing@substack.com]]></googleplay:email><googleplay:author><![CDATA[Atlas Computing]]></googleplay:author><itunes:block><![CDATA[Yes]]></itunes:block><item><title><![CDATA[CSLib: Lean’s Formal Software Foundation]]></title><description><![CDATA[Bottom line up front: If you love Lean and care about software, you&#8217;re likely to be excited about the progress on CSLib and might be interested in contributing to one of the two tracks presented below.]]></description><link>https://blog.atlascomputing.org/p/cslib-leans-formal-software-foundation</link><guid isPermaLink="false">https://blog.atlascomputing.org/p/cslib-leans-formal-software-foundation</guid><dc:creator><![CDATA[Evan Miyazono]]></dc:creator><pubDate>Wed, 24 Dec 2025 13:25:19 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Nlv7!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff370c2bd-71e7-4808-a3c6-f0e1be165e0d_1280x1280.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Bottom line up front: If you love Lean and care about software, you&#8217;re likely to be excited about the progress on CSLib and might be interested in contributing to one of the two tracks presented below.</p><div><hr></div><p>If you&#8217;re reading this, you&#8217;re likely already familiar with Lean (the programming language and proof assistant that&#8217;s steadily gaining notability for its use by mathematicians like <a href="https://terrytao.wordpress.com/2025/05/31/a-lean-companion-to-analysis-i/">Terry Tao</a> and AI-for-math efforts like Google DeepMind&#8217;s <a href="https://www.nature.com/articles/d41586-025-03959-9">DeepThink</a>).  There&#8217;s incredible value in using a computer to rigorously and automatically check the correctness of the mathematical proof of a theorem: you don&#8217;t have to either trust that the proof is correct or understand and verify every line yourself before you can use the theorem in your own proofs. And you could imagine wanting to rigorously verify software properties as well.</p><p>To prove theorems about mathematics, Lean starts from axioms (foundational and widely agreed-upon truths) that are combined with formal logic. In software, axioms might take the form of formalizations of programming languages or functional descriptions of how compilers and operating systems behave. 
If we start from precise, logical definitions, we can prove properties about real code with the same rigor mathematicians use for theorems.</p><p>This matters because once you formalize these foundations, you can prove useful things about software.  This is already commonly done in high-assurance software (e.g., proving that a cryptography library implements a particular function correctly, or that a microkernel provides guaranteed isolation between processes).  However, many of these proof systems were designed for specific verification tasks at a time when the skills, expertise, and cost of generating specifications and proofs were very high. Now that we have a clear line of sight to a future in which the cost of generating proofs and code is very low, it is more important than ever to build a general foundation for proving software properties.  This will enable us to compose different properties of subsystems to prove properties of overarching systems. (Imagine being able to easily and confidently reason about privacy guarantees, worst-case runtime bounds, or memory safety for your entire system because every library you import came with such guaranteed assurances.)</p><p>This is the long-term promise of <a href="http://cslib.io/">CSLib</a>, a new project and library in Lean 4 that sets out to build verified foundations connecting high-level CS theory to low-level executable code.  At Atlas Computing, we&#8217;re proud to host <a href="https://www.linkedin.com/in/alexandrerademaker/">Alexandre Rademaker</a>, one of CSLib&#8217;s tech leads, as we see this work as fundamental to building robust software systems.  Feel free to check out <a href="http://cslib.io/">cslib.io</a> for documentation, or the <a href="https://www.cslib.io/roadmap/">CSLib roadmap</a> for the full technical vision and where different pieces fit together.</p><h2>More than an analog of Mathlib</h2><p>Lean has been transformative for mathematics, and Mathlib (the home of the various definitions and proven theorems in Lean) has led to an ecosystem where mathematicians can formalize proofs, build on each other&#8217;s work, and verify results with unprecedented rigor. But proving things about software is very different from proving things about math; for example, I&#8217;ve yet to meet a mathematician who cares about the performance of their proofs, or a computer scientist who doesn&#8217;t care about the performance of their code.</p><p>CSLib has two complementary pillars:</p><ol><li><p>Formalizing core CS concepts directly in Lean, like models of computation, algorithms, data structures, and their properties</p></li><li><p>Building infrastructure for Lean-based reasoning about everyday imperative code</p></li></ol><p>Together, these enable proving properties about real software using the theoretical foundations from the first pillar.</p><p>This is infrastructure work, and we need a community to help us formalize computer science.  Just as Mathlib is building a community and working toward formalizing all sufficiently important theorems in mathematics, CSLib is building a community to formalize the undergraduate CS curriculum and eventually provide strong assurances about all sufficiently important software.  
This will start with developing and building consensus around a set of design choices that the AI-for-software and AI-for-math communities are ready to build on.</p><h2>Two Ways to Contribute</h2><p>If you find yourself with some time over the holidays and have a fondness for Lean, here are two active tracks where I hear that CSLib would love some contributions:</p><h3>Track 1: Formalizing CS Foundations, Algorithms, and Data Structures</h3><p>The current effort can be seen <a href="https://github.com/leanprover/cslib/pulls">here</a>, but will eventually include everything from cryptography to complexity theory.  If you want to contribute, read the <a href="https://www.cslib.io/contributing/">contributing guidelines</a> and familiarize yourself with the repository structure.</p><h3>Track 2: Verifying Low-Level Code</h3><p>At Atlas, Alex is porting AWS&#8217;s <a href="https://github.com/awslabs/s2n-bignum">s2n-bignum</a> library to Lean. s2n-bignum provides cryptographic integer arithmetic routines in pure assembly (x86_64 and ARM), each with machine-checked formal proofs in HOL Light. Our goal is to bring these verified implementations to Lean&#8217;s ecosystem. The first ARM assembly proof has been completed and is available at <a href="http://github.com/atlas-computing-org/bignum">github.com/atlas-computing-org/bignum</a>. Near-term priorities include expanding the executable ARM model, completing Mach-O binary parsing, and implementing decision procedures for bit-vectors. The bignum project will serve as an early CSLib consumer, demonstrating how verified low-level code can leverage CSLib&#8217;s foundations. We expect to reuse general-purpose definitions and theorems from CSLib throughout the bignum proofs, creating a practical feedback loop that helps shape CSLib&#8217;s development.</p><p>If you&#8217;re looking for a place to ask questions, check out the <a href="http://cslib.io/">website</a> and the Zulip community <a href="https://leanprover.zulipchat.com/">forum</a>.</p><h2>Why It Matters</h2><p>The connection between these two tracks might not be evident at first. But consider: you&#8217;re simultaneously building the theoretical vocabulary (graphs, complexity, algorithms) and the verified compilation path (starting with proven arithmetic primitives). When both exist, you can finally do something remarkable: prove properties about real systems, from algorithm choice down to machine instructions.</p><p>This is how we get to a world where:</p><ul><li><p>Compilers come with correctness guarantees</p></li><li><p>Operating system kernels have verified security properties</p></li><li><p>Algorithm implementations carry proven complexity bounds</p></li><li><p>Software infrastructure is trustworthy by construction</p></li></ul><p>CSLib is seeding this ecosystem. 
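To give a flavor of what a contribution looks like, here is a deliberately tiny Lean 4 sketch (illustrative only; the names below are made up and are not taken from CSLib or bignum): a definition of list reversal together with a machine-checked proof that it preserves length.</p><pre><code>-- Illustrative only: a toy definition and proof in Lean 4, not code from CSLib.
def rev : List Nat -> List Nat
  | []      => []
  | x :: xs => rev xs ++ [x]

-- A machine-checked property: reversing a list preserves its length.
theorem rev_length (xs : List Nat) : (rev xs).length = xs.length := by
  induction xs with
  | nil => simp [rev]
  | cons x xs ih => simp [rev, List.length_append, ih]</code></pre><p>Real contributions target richer objects (models of computation, algorithms, data structures, machine models), but the statement-then-proof workflow is the same. 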
It&#8217;s early enough that your contributions will shape the whole trajectory.</p>]]></content:encoded></item><item><title><![CDATA[An alternative to "fund people not projects"]]></title><description><![CDATA[Our catechism for creating impact-maximizing organizations places finding a founder *last*]]></description><link>https://blog.atlascomputing.org/p/an-alternative-to-fund-people-not</link><guid isPermaLink="false">https://blog.atlascomputing.org/p/an-alternative-to-fund-people-not</guid><dc:creator><![CDATA[Evan Miyazono]]></dc:creator><pubDate>Tue, 16 Dec 2025 15:15:37 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Ek-B!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50b05477-af96-4b41-814b-c0c5bca343f0_1024x559.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!Ek-B!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50b05477-af96-4b41-814b-c0c5bca343f0_1024x559.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!Ek-B!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50b05477-af96-4b41-814b-c0c5bca343f0_1024x559.png 424w, https://substackcdn.com/image/fetch/$s_!Ek-B!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50b05477-af96-4b41-814b-c0c5bca343f0_1024x559.png 848w, https://substackcdn.com/image/fetch/$s_!Ek-B!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50b05477-af96-4b41-814b-c0c5bca343f0_1024x559.png 1272w, https://substackcdn.com/image/fetch/$s_!Ek-B!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50b05477-af96-4b41-814b-c0c5bca343f0_1024x559.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!Ek-B!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50b05477-af96-4b41-814b-c0c5bca343f0_1024x559.png" width="1024" height="559" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/50b05477-af96-4b41-814b-c0c5bca343f0_1024x559.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:559,&quot;width&quot;:1024,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:1084924,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://blog.atlascomputing.org/i/181707030?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50b05477-af96-4b41-814b-c0c5bca343f0_1024x559.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!Ek-B!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50b05477-af96-4b41-814b-c0c5bca343f0_1024x559.png 424w, 
https://substackcdn.com/image/fetch/$s_!Ek-B!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50b05477-af96-4b41-814b-c0c5bca343f0_1024x559.png 848w, https://substackcdn.com/image/fetch/$s_!Ek-B!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50b05477-af96-4b41-814b-c0c5bca343f0_1024x559.png 1272w, https://substackcdn.com/image/fetch/$s_!Ek-B!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F50b05477-af96-4b41-814b-c0c5bca343f0_1024x559.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>I&#8217;m a firm believer that it&#8217;s wise to &#8220;fund people, not projects&#8221; on the margin. I think this is especially true if you are trying to maximize upside in competitive markets with high uncertainty. This holds especially true when your founder or leader has to adapt to new learnings and a highly dynamic environment (for example, in startups or in bleeding-edge research).</p><p>However, there are many environments where, from your vantage point, you can see what&#8217;s needed, and are confident that someone should just do the thing.  Atlas Computing seeks to do this to improve societal security and AI resilience amid growing AI capabilities.</p><p>To advance this, we&#8217;ve developed the following catechism for starting organizations to unblock predictable technological development bottlenecks and maximize impact.  Metascientists and DARPA alums out there will recognize that this was clearly inspired by and aspiring to be an analog of <a href="https://www.darpa.mil/work-with-us/heilmeier-catechism">The Heilmeier Catechism</a> for designing impactful research programs.</p><p></p><div><hr></div><p>This complements the <a href="https://docs.google.com/spreadsheets/d/1QAdfr71KOM0w5ZsU_8OxLG1q1mDUlJQJBK9UB0WTeT8/">AI Resilience Gap Map</a> by outlining a 6-step process to follow for each listed gap (row).</p><ol><li><p>What are you trying to solve?  
This should be a bottleneck to unblock or a gap to fill.</p><ol><li><p>What&#8217;s a good outcome that&#8217;s bottlenecked on a breakthrough or effort that no one is working on that would benefit from a new organization?</p></li><li><p>Or, what&#8217;s a risk that could be mitigated with a new organization, but there&#8217;s no one working on that at the moment?</p></li><li><p><strong>Artifact:</strong> Write a brief (&lt;1 page) description about what&#8217;s happening today that seems clearly broken and how it should work instead.  Get one relevant expert to attest to the real need.</p></li></ol></li><li><p>What would you believe (that other reasonable people might disagree with) that would greatly inform how you would approach closing the resilience gap from 1?</p><ol><li><p><strong>Artifact:</strong> A written (short) story about how a new org would be sufficient to address the bottleneck (or close the gap).</p><ol><li><p>At least two field strategists should attest that this approach is the most likely to succeed, despite engaged and constructive criticism from the whole cohort of field strategists.</p></li></ol></li></ol></li><li><p>Premised on that belief, what should this new organization do?</p><ol><li><p><strong>Artifact: </strong>Write a ~2-page document that could be sent to a funder that describes:</p><ol><li><p>What is their north star mission statement?</p></li><li><p>Who needs to work with this org, and how does this org solve a pain point for them?</p></li><li><p>What does their 6-month success milestone to demonstrate competence look like?</p><ol><li><p>How many people are needed to achieve that?  How much funding is needed?</p></li></ol></li><li><p>What is the longer-term (2-5 year) goal?</p></li><li><p>What is the legal structure and business model for the org?  Who benefits? Who pays?</p></li></ol></li></ol></li><li><p>Who are the most relevant 5-10 experts in the world who can validate (or iterate on) your beliefs from 1-3? These should be advisors, potential users or customers, or other organizations that cover this cause area.  Actually ask them for feedback.</p><ol><li><p><strong>Artifact: </strong>You can move on when they all point to the 2-pager from step 3 and say, &#8220;I want this to exist; it would solve a problem for me.&#8221;</p></li></ol></li><li><p>Who would be interested in funding this, if presented with the right founding team?</p><ol><li><p><strong>Artifact:</strong> a list of funders with a realistic expected value calculation that accounts for the roadmap needed to reach the first milestone</p></li><li><p><strong>Artifact:</strong> at least one of the two biggest funders in the above  list expressing interest in the organization and committing to diligence a team we source for the organization.</p></li></ol></li><li><p>What skills are needed to run this org? Who is likely to have those skills?  What experience do they need?  Who would be your dream candidate(s)?  Who can you think of who&#8217;s a plausible candidate, and what gives you pause?  
You should be confident that the founding team can make all future hiring decisions themselves.</p><ol><li><p><strong>Artifact: </strong>Generate a job description with enough specificity that we can give it to a recruiter and find candidates</p></li></ol></li></ol><p></p><div><hr></div><p>Feel free to ask questions, comment on specific lines, or download the 1-page PDF of the above doc <a href="https://docs.google.com/document/d/13Wb9YjLOMCl9JGQkYn0J-5qRqS4FZsr86KtOgD5TFRw/edit?usp=sharing">here</a>.</p>]]></content:encoded></item><item><title><![CDATA[Post-FMxAI 2025 newsletter ]]></title><description><![CDATA[Takeaways from Formal Methods x AI conference 2025 @ SRI, Menlo Park: atlascomputing.org/fmai25]]></description><link>https://blog.atlascomputing.org/p/post-fmxai-2025-newsletter</link><guid isPermaLink="false">https://blog.atlascomputing.org/p/post-fmxai-2025-newsletter</guid><dc:creator><![CDATA[Atlas Computing]]></dc:creator><pubDate>Fri, 31 Oct 2025 12:05:13 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/4b22cd9d-9076-4346-b391-7170c9c97c98_420x300.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>This is a summary we sent to our <a href="https://atlascomputing.org/fmai25">FMxAI</a> attendees. We wanted to share the takeaways here as well.</em></p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!CHAn!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1f27554-2b5f-4426-8017-8216f6dcbe4f_2048x1536.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!CHAn!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1f27554-2b5f-4426-8017-8216f6dcbe4f_2048x1536.png 424w, https://substackcdn.com/image/fetch/$s_!CHAn!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1f27554-2b5f-4426-8017-8216f6dcbe4f_2048x1536.png 848w, https://substackcdn.com/image/fetch/$s_!CHAn!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1f27554-2b5f-4426-8017-8216f6dcbe4f_2048x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!CHAn!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1f27554-2b5f-4426-8017-8216f6dcbe4f_2048x1536.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!CHAn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1f27554-2b5f-4426-8017-8216f6dcbe4f_2048x1536.png" width="1456" height="1092" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c1f27554-2b5f-4426-8017-8216f6dcbe4f_2048x1536.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1092,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:4883086,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://blog.atlascomputing.org/i/177645966?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1f27554-2b5f-4426-8017-8216f6dcbe4f_2048x1536.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!CHAn!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1f27554-2b5f-4426-8017-8216f6dcbe4f_2048x1536.png 424w, https://substackcdn.com/image/fetch/$s_!CHAn!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1f27554-2b5f-4426-8017-8216f6dcbe4f_2048x1536.png 848w, https://substackcdn.com/image/fetch/$s_!CHAn!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1f27554-2b5f-4426-8017-8216f6dcbe4f_2048x1536.png 1272w, https://substackcdn.com/image/fetch/$s_!CHAn!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc1f27554-2b5f-4426-8017-8216f6dcbe4f_2048x1536.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>Thank you for taking part in our meeting. It was great to see so much work going on at the intersection of Formal Methods and AI. 
Here are a few high-level themes we noticed, and we&#8217;d love to hear from you if there&#8217;s anything you think we missed.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://blog.atlascomputing.org/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading! Subscribe for more updates:</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div><ol><li><p><strong>We need more benchmarks / evals / RL environments to make AI models better at formal methods. </strong>We heard a lot of discussion about the lack of good evaluation benchmarks for formal methods. The current generation of models is extensively trained by reinforcement learning against problem-specific benchmarks. If you have a dataset of problems that AI can&#8217;t currently solve, even if the number of problems is modest, it seems impactful to turn it into an AI eval and get AI labs to include it in training. If you need help figuring out how to do this (or are generally interested in contributing to this effort), talk to Evan Miyazono (evan@atlascomputing.org).</p></li></ol><ol start="2"><li><p><strong>We need more FM infrastructure. </strong>A lot of discussions focused on formal methods infrastructure: if AI gets stronger, will we have tools, conventions, and languages available for the AI systems and workflows to use? One new project that seems like it might help here is the <a href="https://github.com/leanprover/cslib">CSlib project</a>, which is building an intermediate representation for computer science concepts in Lean. However, it seemed like the FM infrastructure gap is very large, and would benefit from both more funding and more senior talent.</p></li></ol><ol start="3"><li><p><strong>New orgs are starting. </strong>Several organizations represented at the meeting are brand new, having started in the last year. These include (in no particular order) <a href="https://www.math.inc">Math Inc</a>., <a href="https://theoremlabs.com">Theorem Labs</a>, <a href="https://sigillogic.com">Sigil Logic</a>, <a href="https://www.principialabs.org">Principia Labs</a>, <a href="https://axiommath.ai">Axiom Math</a>, <a href="https://www.safer-ai.org">Safer-AI</a>, <a href="https://ulyssean.com">Ulyssean</a>, and <a href="http://genproof.ai">genproof.ai</a>. We take this as a strong signal that people are starting to see the potential in formal methods combined with AI.</p></li></ol><ol start="4"><li><p><strong>Engineering tools matter, specifications matter. </strong>Many people discussed possible AI-driven tools that could be used for engineering. We heard several people raise the notion of &#8220;verified vibe-coding&#8221; or &#8220;vibe-speccing&#8221;. A particularly important problem seems to be how to specify formally what AIs should do, and how to use these specifications to guide the AI to a correct response.</p></li></ol><ol start="5"><li><p><strong>Lean is a big thing, but not the only thing. </strong>The Lean theorem prover was a topic of discussion in many conversations. 
On the one hand, Lean has become a common denominator &#8212;a tool known outside FM expert circles. On the other hand, we talked to many FM experts who were keen to emphasize the broad range of tools in formal methods, including CHERI. It seems to be a live debate in FMxAI whether and how to standardize on Lean, or to try to maintain the diversity of the field, or a defense-in-depth approach.</p></li></ol><ol start="6"><li><p><strong>AI forecasts vary enormously. </strong>We heard many conversations about the future of AI, and here, forecasts varied enormously. Broadly speaking, attendees working more closely on AI predicted faster gains, while FM experts were more skeptical. At the most skeptical end of the spectrum, some attendees felt that AI capabilities were unlikely to increase, while at the other end, others predicted that fully automated AI software engineers would be in place by 2030.</p></li></ol><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!korH!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9bec074-8534-4fe3-be8a-bf07ae935e78_1707x1280.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!korH!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9bec074-8534-4fe3-be8a-bf07ae935e78_1707x1280.jpeg 424w, https://substackcdn.com/image/fetch/$s_!korH!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9bec074-8534-4fe3-be8a-bf07ae935e78_1707x1280.jpeg 848w, https://substackcdn.com/image/fetch/$s_!korH!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9bec074-8534-4fe3-be8a-bf07ae935e78_1707x1280.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!korH!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9bec074-8534-4fe3-be8a-bf07ae935e78_1707x1280.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!korH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9bec074-8534-4fe3-be8a-bf07ae935e78_1707x1280.jpeg" width="1456" height="1092" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c9bec074-8534-4fe3-be8a-bf07ae935e78_1707x1280.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1092,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:733849,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.atlascomputing.org/i/177645966?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9bec074-8534-4fe3-be8a-bf07ae935e78_1707x1280.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" 
srcset="https://substackcdn.com/image/fetch/$s_!korH!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9bec074-8534-4fe3-be8a-bf07ae935e78_1707x1280.jpeg 424w, https://substackcdn.com/image/fetch/$s_!korH!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9bec074-8534-4fe3-be8a-bf07ae935e78_1707x1280.jpeg 848w, https://substackcdn.com/image/fetch/$s_!korH!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9bec074-8534-4fe3-be8a-bf07ae935e78_1707x1280.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!korH!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc9bec074-8534-4fe3-be8a-bf07ae935e78_1707x1280.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Forward pointers</strong></p><p>Here are some additional things you might want to sign up for updates on</p><ol><li><p>Newsletter on formal approaches to AI security: </p><div class="embedded-publication-wrap" data-attrs="{&quot;id&quot;:2800667,&quot;name&quot;:&quot;Can We Secure AI With Formal Methods?&quot;,&quot;logo_url&quot;:&quot;https://substackcdn.com/image/fetch/$s_!ykg_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F083283b3-e660-4c7f-81e3-c40b1d1ebafd_1024x1024.png&quot;,&quot;base_url&quot;:&quot;https://gsai.substack.com&quot;,&quot;hero_text&quot;:&quot;Formal methods needs to know that AI security folks are a critical fountain of users. AI security folks need to know how to ask formal methodsititians for widgets. 
FKA Progress in Guaranteed Safe AI.\n&quot;,&quot;author_name&quot;:&quot;Quinn Dougherty&quot;,&quot;show_subscribe&quot;:true,&quot;logo_bg_color&quot;:&quot;#ffffff&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="EmbeddedPublicationToDOMWithSubscribe"><div class="embedded-publication show-subscribe"><a class="embedded-publication-link-part" native="true" href="https://gsai.substack.com?utm_source=substack&amp;utm_campaign=publication_embed&amp;utm_medium=web"><img class="embedded-publication-logo" src="https://substackcdn.com/image/fetch/$s_!ykg_!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F083283b3-e660-4c7f-81e3-c40b1d1ebafd_1024x1024.png" width="56" height="56" style="background-color: rgb(255, 255, 255);"><span class="embedded-publication-name">Can We Secure AI With Formal Methods?</span><div class="embedded-publication-hero-text">Formal methods needs to know that AI security folks are a critical fountain of users. AI security folks need to know how to ask formal methodsititians for widgets. FKA Progress in Guaranteed Safe AI.
</div><div class="embedded-publication-author-name">By Quinn Dougherty</div></a><form class="embedded-publication-subscribe" method="GET" action="https://gsai.substack.com/subscribe?"><input type="hidden" name="source" value="publication-embed"><input type="hidden" name="autoSubmit" value="true"><input type="email" class="email-input" name="email" placeholder="Type your email..."><input type="submit" class="button primary" value="Subscribe"></form></div></div></li><li><p>ARIA Safeguarded AI program: <a href="https://www.aria.org.uk/programme-safeguarded-ai/">https://www.aria.org.uk/programme-safeguarded-ai/</a></p></li><li><p><a href="https://verilib.org/">Verilib</a></p></li><li><p>The Atlas Computing <a href="https://blog.atlascomputing.org/">blog</a> will have updates on some related projects and spin-outs when there are public announcements (like a potential FRO to build tools to generate and validate formal specs)</p></li></ol><p>Lastly, &gt;70% of attendees who filled out the post-event survey said they&#8217;d highly recommend the event to colleagues, and almost 95% said they&#8217;d try their best to attend a subsequent event, so it seems like we found a good recipe: great people + lots of space to chat. We&#8217;ll keep you updated and hope to see you again soon!</p><p>Mike, Evan, and the team.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!POAm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc89d7cb8-6918-43e3-965d-2dc37c085389_1707x1280.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!POAm!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc89d7cb8-6918-43e3-965d-2dc37c085389_1707x1280.jpeg 424w, https://substackcdn.com/image/fetch/$s_!POAm!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc89d7cb8-6918-43e3-965d-2dc37c085389_1707x1280.jpeg 848w, https://substackcdn.com/image/fetch/$s_!POAm!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc89d7cb8-6918-43e3-965d-2dc37c085389_1707x1280.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!POAm!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc89d7cb8-6918-43e3-965d-2dc37c085389_1707x1280.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!POAm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc89d7cb8-6918-43e3-965d-2dc37c085389_1707x1280.jpeg" width="1456" height="1092" 
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/c89d7cb8-6918-43e3-965d-2dc37c085389_1707x1280.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1092,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:430215,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.atlascomputing.org/i/177645966?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc89d7cb8-6918-43e3-965d-2dc37c085389_1707x1280.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!POAm!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc89d7cb8-6918-43e3-965d-2dc37c085389_1707x1280.jpeg 424w, https://substackcdn.com/image/fetch/$s_!POAm!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc89d7cb8-6918-43e3-965d-2dc37c085389_1707x1280.jpeg 848w, https://substackcdn.com/image/fetch/$s_!POAm!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc89d7cb8-6918-43e3-965d-2dc37c085389_1707x1280.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!POAm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fc89d7cb8-6918-43e3-965d-2dc37c085389_1707x1280.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!wglM!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68ec2146-0b70-40df-83c3-4e69e2777a11_1707x1280.jpeg" data-component-name="Image2ToDOM"><div 
class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!wglM!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68ec2146-0b70-40df-83c3-4e69e2777a11_1707x1280.jpeg 424w, https://substackcdn.com/image/fetch/$s_!wglM!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68ec2146-0b70-40df-83c3-4e69e2777a11_1707x1280.jpeg 848w, https://substackcdn.com/image/fetch/$s_!wglM!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68ec2146-0b70-40df-83c3-4e69e2777a11_1707x1280.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!wglM!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68ec2146-0b70-40df-83c3-4e69e2777a11_1707x1280.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!wglM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68ec2146-0b70-40df-83c3-4e69e2777a11_1707x1280.jpeg" width="1456" height="1092" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/68ec2146-0b70-40df-83c3-4e69e2777a11_1707x1280.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1092,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:303574,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.atlascomputing.org/i/177645966?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68ec2146-0b70-40df-83c3-4e69e2777a11_1707x1280.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!wglM!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68ec2146-0b70-40df-83c3-4e69e2777a11_1707x1280.jpeg 424w, https://substackcdn.com/image/fetch/$s_!wglM!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68ec2146-0b70-40df-83c3-4e69e2777a11_1707x1280.jpeg 848w, https://substackcdn.com/image/fetch/$s_!wglM!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68ec2146-0b70-40df-83c3-4e69e2777a11_1707x1280.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!wglM!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F68ec2146-0b70-40df-83c3-4e69e2777a11_1707x1280.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 
17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!qz5o!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1d4c6fe-433c-4b6a-8f94-fe6a4ff59ed9_1707x821.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!qz5o!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1d4c6fe-433c-4b6a-8f94-fe6a4ff59ed9_1707x821.jpeg 424w, https://substackcdn.com/image/fetch/$s_!qz5o!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1d4c6fe-433c-4b6a-8f94-fe6a4ff59ed9_1707x821.jpeg 848w, https://substackcdn.com/image/fetch/$s_!qz5o!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1d4c6fe-433c-4b6a-8f94-fe6a4ff59ed9_1707x821.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!qz5o!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1d4c6fe-433c-4b6a-8f94-fe6a4ff59ed9_1707x821.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!qz5o!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1d4c6fe-433c-4b6a-8f94-fe6a4ff59ed9_1707x821.jpeg" width="1707" height="821" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/a1d4c6fe-433c-4b6a-8f94-fe6a4ff59ed9_1707x821.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:821,&quot;width&quot;:1707,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:409615,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.atlascomputing.org/i/177645966?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff03e4111-8caf-40e4-9524-223ba0d642fc_1707x1280.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!qz5o!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1d4c6fe-433c-4b6a-8f94-fe6a4ff59ed9_1707x821.jpeg 424w, 
https://substackcdn.com/image/fetch/$s_!qz5o!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1d4c6fe-433c-4b6a-8f94-fe6a4ff59ed9_1707x821.jpeg 848w, https://substackcdn.com/image/fetch/$s_!qz5o!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1d4c6fe-433c-4b6a-8f94-fe6a4ff59ed9_1707x821.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!qz5o!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fa1d4c6fe-433c-4b6a-8f94-fe6a4ff59ed9_1707x821.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!nCQm!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9c9e87d-ccf8-46fc-ac34-986d1e4caeda_1702x1216.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!nCQm!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9c9e87d-ccf8-46fc-ac34-986d1e4caeda_1702x1216.jpeg 424w, https://substackcdn.com/image/fetch/$s_!nCQm!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9c9e87d-ccf8-46fc-ac34-986d1e4caeda_1702x1216.jpeg 848w, https://substackcdn.com/image/fetch/$s_!nCQm!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9c9e87d-ccf8-46fc-ac34-986d1e4caeda_1702x1216.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!nCQm!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9c9e87d-ccf8-46fc-ac34-986d1e4caeda_1702x1216.jpeg 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!nCQm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9c9e87d-ccf8-46fc-ac34-986d1e4caeda_1702x1216.jpeg" width="1702" height="1216" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/e9c9e87d-ccf8-46fc-ac34-986d1e4caeda_1702x1216.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:1216,&quot;width&quot;:1702,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:730041,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.atlascomputing.org/i/177645966?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd592fa9b-e1e5-4d36-8448-d3db2cd59274_1707x1280.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!nCQm!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9c9e87d-ccf8-46fc-ac34-986d1e4caeda_1702x1216.jpeg 424w, https://substackcdn.com/image/fetch/$s_!nCQm!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9c9e87d-ccf8-46fc-ac34-986d1e4caeda_1702x1216.jpeg 848w, https://substackcdn.com/image/fetch/$s_!nCQm!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9c9e87d-ccf8-46fc-ac34-986d1e4caeda_1702x1216.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!nCQm!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fe9c9e87d-ccf8-46fc-ac34-986d1e4caeda_1702x1216.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" 
href="https://substackcdn.com/image/fetch/$s_!94W0!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbde0e6e9-d9be-4d3c-95ca-630b9944c695_1707x840.jpeg" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!94W0!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbde0e6e9-d9be-4d3c-95ca-630b9944c695_1707x840.jpeg 424w, https://substackcdn.com/image/fetch/$s_!94W0!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbde0e6e9-d9be-4d3c-95ca-630b9944c695_1707x840.jpeg 848w, https://substackcdn.com/image/fetch/$s_!94W0!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbde0e6e9-d9be-4d3c-95ca-630b9944c695_1707x840.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!94W0!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbde0e6e9-d9be-4d3c-95ca-630b9944c695_1707x840.jpeg 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!94W0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbde0e6e9-d9be-4d3c-95ca-630b9944c695_1707x840.jpeg" width="1707" height="840" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/bde0e6e9-d9be-4d3c-95ca-630b9944c695_1707x840.jpeg&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:840,&quot;width&quot;:1707,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:482865,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/jpeg&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.atlascomputing.org/i/177645966?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fd52a5856-8d96-4ce0-9939-309894f387c4_1707x1280.jpeg&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!94W0!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbde0e6e9-d9be-4d3c-95ca-630b9944c695_1707x840.jpeg 424w, https://substackcdn.com/image/fetch/$s_!94W0!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbde0e6e9-d9be-4d3c-95ca-630b9944c695_1707x840.jpeg 848w, https://substackcdn.com/image/fetch/$s_!94W0!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbde0e6e9-d9be-4d3c-95ca-630b9944c695_1707x840.jpeg 1272w, https://substackcdn.com/image/fetch/$s_!94W0!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fbde0e6e9-d9be-4d3c-95ca-630b9944c695_1707x840.jpeg 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" 
stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p></p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://blog.atlascomputing.org/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading! Subscribe for more updates:</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Civilization's maintenance backlog]]></title><description><![CDATA[A few dozen organization-shaped holes to be filled before powerful AI arrives]]></description><link>https://blog.atlascomputing.org/p/civilizations-maintenance-backlog</link><guid isPermaLink="false">https://blog.atlascomputing.org/p/civilizations-maintenance-backlog</guid><dc:creator><![CDATA[Evan Miyazono]]></dc:creator><pubDate>Fri, 17 Oct 2025 13:49:31 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!gD1-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc815ca8-c136-417e-a31a-636deb2bffca_2046x706.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h4><a href="https://docs.google.com/spreadsheets/d/1QAdfr71KOM0w5ZsU_8OxLG1q1mDUlJQJBK9UB0WTeT8/edit?usp=sharing">If you want to jump straight to our list of org-shaped holes, here it is</a>.  Otherwise&#8230; some context:</h4><p>Atlas Computing has pivoted to forming new organizations to address critical gaps in AI deployment readiness and security infrastructure (or if you live in Berkeley,  neglected catastrophic risks from AI).</p><p>In <a href="https://blog.atlascomputing.org/p/website-updated">our previous post</a>, we talked about how we updated our website to match this, and teased at sharing the list we&#8217;ve started.  
We&#8217;re excited to share our in-progress list.</p><p>As a teaser, here&#8217;s the ontology we&#8217;re using for identifying categories of gaps:</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!gD1-!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc815ca8-c136-417e-a31a-636deb2bffca_2046x706.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!gD1-!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc815ca8-c136-417e-a31a-636deb2bffca_2046x706.png 424w, https://substackcdn.com/image/fetch/$s_!gD1-!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc815ca8-c136-417e-a31a-636deb2bffca_2046x706.png 848w, https://substackcdn.com/image/fetch/$s_!gD1-!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc815ca8-c136-417e-a31a-636deb2bffca_2046x706.png 1272w, https://substackcdn.com/image/fetch/$s_!gD1-!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc815ca8-c136-417e-a31a-636deb2bffca_2046x706.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!gD1-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc815ca8-c136-417e-a31a-636deb2bffca_2046x706.png" width="1456" height="502" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/cc815ca8-c136-417e-a31a-636deb2bffca_2046x706.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:502,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:179405,&quot;alt&quot;:&quot;&quot;,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:false,&quot;topImage&quot;:true,&quot;internalRedirect&quot;:&quot;https://blog.atlascomputing.org/i/174668149?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc815ca8-c136-417e-a31a-636deb2bffca_2046x706.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" title="" srcset="https://substackcdn.com/image/fetch/$s_!gD1-!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc815ca8-c136-417e-a31a-636deb2bffca_2046x706.png 424w, https://substackcdn.com/image/fetch/$s_!gD1-!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc815ca8-c136-417e-a31a-636deb2bffca_2046x706.png 848w, https://substackcdn.com/image/fetch/$s_!gD1-!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc815ca8-c136-417e-a31a-636deb2bffca_2046x706.png 1272w, https://substackcdn.com/image/fetch/$s_!gD1-!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fcc815ca8-c136-417e-a31a-636deb2bffca_2046x706.png 1456w" sizes="100vw" 
fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>This list of gaps is:</p><ul><li><p><strong>not comprehensive</strong></p><ul><li><p>We&#8217;d actually love to take suggestions of things we&#8217;re missing. If you have any ideas, please reach out to evan@atlascomputing.org, Or add a comment on this post or a comment on the sheet itself.</p></li><li><p>I did try to come up with some labels that feel mutually exclusive and completely exhaustive. I can&#8217;t guarantee I did a perfect job, but this seems like a relevant way of sorting possible orgs.</p></li></ul></li><li><p><strong>not very well-vetted</strong></p><ul><li><p>I think we got rid of all of the ideas that are clearly bad but some bad ones doubtlessly snuck through</p></li></ul></li><li><p><strong>not full of sexy startup ideas</strong></p><ul><li><p>These are not brilliant research directions or clever product insights. The expectations of these have you saying, huh, yes, full. And 20% you smacking your forehead wondering why no one&#8217;s built this yet.</p></li></ul></li><li><p><strong>not strictly nonprofits</strong></p><ul><li><p>As I&#8217;ve said before, we&#8217;re neither an incubator nor a fellowship program nor a think tank.  We&#8217;ll try to get these orgs started (in a way that&#8217;s compliant with tax laws), but once someone can be the &#8220;go-to person&#8221; for that topic, we want to get out of the mix.</p></li></ul></li><li><p><strong>not a finished artifact</strong></p><ul><li><p>and it&#8217;s not trying to be.  We could take forever just polishing this list and that would lead to nothing getting done.  
My favorite part of lists is crossing things off</p></li></ul></li><li><p><strong>just a to-do list</strong> of all of the orgs that we think someone should start, and we&#8217;ll do as many as we can as fast as we can.</p><ul><li><p>The plan is roughly to fill in the columns from left to right, and the columns are pretty specifically designed so that each column provides useful constraints to the next column to the right</p></li></ul></li><li><p><strong>linked at the bottom of the page</strong></p></li></ul><p>Most of these gaps don&#8217;t just create risks - they prevent confident, trustworthy adoption of what is already a very useful, transformational technology. When you can&#8217;t verify security properties, you slow down rollout. When you lack coordination infrastructure, you get duplicated effort. When you don&#8217;t have clear standards, it&#8217;s hard to blame people for repeated reinvention.  We&#8217;re looking at reducing risks and increasing upsides.</p><h2>the important part</h2><p>However, the most important thing in this blog post by far is not the list. Rather, it&#8217;s the illustration of how to use the list.  For one of these items, the biorisk clearinghouse (currently row 26), we&#8217;ve started an initial exploration of what it would look like to set up this organization.</p><p>Why this one? Because it&#8217;s concrete, clearly scoped, and has obvious stakeholders I could talk to immediately.  Over the coming weeks, we&#8217;re interviewing relevant stakeholders to find out if this is really the problem, what hurdles make the problem challenging, and what skills are needed to jump those hurdles.  After that, we&#8217;ll source someone with those skills and support them in starting the organization.  </p><p>Maybe somewhere along the way we find this project isn&#8217;t necessary. I&#8217;d be delighted if someone beats us to it.  
But these things have gone unaddressed for long enough, I think it&#8217;s worth trying to do it ourselves.</p><p>Stay tuned to see our progress.</p><div><hr></div><h1><a href="https://docs.google.com/spreadsheets/d/1QAdfr71KOM0w5ZsU_8OxLG1q1mDUlJQJBK9UB0WTeT8/edit?usp=sharing">here&#8217;s the list</a></h1><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://docs.google.com/spreadsheets/d/1QAdfr71KOM0w5ZsU_8OxLG1q1mDUlJQJBK9UB0WTeT8/edit?gid=0#gid=0" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!lfrl!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F467bcfbd-65cf-4255-9e49-13cd00d7b7b5_3046x1195.png 424w, https://substackcdn.com/image/fetch/$s_!lfrl!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F467bcfbd-65cf-4255-9e49-13cd00d7b7b5_3046x1195.png 848w, https://substackcdn.com/image/fetch/$s_!lfrl!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F467bcfbd-65cf-4255-9e49-13cd00d7b7b5_3046x1195.png 1272w, https://substackcdn.com/image/fetch/$s_!lfrl!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F467bcfbd-65cf-4255-9e49-13cd00d7b7b5_3046x1195.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!lfrl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F467bcfbd-65cf-4255-9e49-13cd00d7b7b5_3046x1195.png" width="1456" height="571" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/467bcfbd-65cf-4255-9e49-13cd00d7b7b5_3046x1195.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:571,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:513354,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:&quot;https://docs.google.com/spreadsheets/d/1QAdfr71KOM0w5ZsU_8OxLG1q1mDUlJQJBK9UB0WTeT8/edit?gid=0#gid=0&quot;,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.atlascomputing.org/i/174668149?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F467bcfbd-65cf-4255-9e49-13cd00d7b7b5_3046x1195.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!lfrl!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F467bcfbd-65cf-4255-9e49-13cd00d7b7b5_3046x1195.png 424w, https://substackcdn.com/image/fetch/$s_!lfrl!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F467bcfbd-65cf-4255-9e49-13cd00d7b7b5_3046x1195.png 848w, https://substackcdn.com/image/fetch/$s_!lfrl!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F467bcfbd-65cf-4255-9e49-13cd00d7b7b5_3046x1195.png 1272w, 
https://substackcdn.com/image/fetch/$s_!lfrl!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F467bcfbd-65cf-4255-9e49-13cd00d7b7b5_3046x1195.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>join this effort</h3><p>Oh, and I&#8217;m trying to recruit a small team (2-10 others) to work with me on actually scoping out and starting these organizations. If you think that you (or someone you know) would be great at doing that, please reach out.  We don&#8217;t have an official job description on the website, but the preview google doc is <a href="https://docs.google.com/document/d/1lCWhOPcGUvmswF8steLTbySbuY6_rcWLSHT5HuoISZc/edit?tab=t.0">here</a>.  And if you can and want to join this cohort within your org, that&#8217;d be welcome too!</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://blog.atlascomputing.org/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Atlas Blog! 
Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Website updated!]]></title><description><![CDATA[to match our focus on mapping and addressing neglected catastrophic risks]]></description><link>https://blog.atlascomputing.org/p/website-updated</link><guid isPermaLink="false">https://blog.atlascomputing.org/p/website-updated</guid><dc:creator><![CDATA[Evan Miyazono]]></dc:creator><pubDate>Tue, 02 Sep 2025 13:01:13 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!okxj!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a919151-a360-4cd0-8f23-02cb3524cb53_700x700.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hey everyone,</p><p>We&#8217;re coming up on the 2-year anniversary of the founding of Atlas (in October), and that seems like a good time to double-down on the things we think we&#8217;ve been doing well.</p><p>If we reflect on the past successes of Atlas, some of our biggest impacts have been </p><ul><li><p>helping organize the GSAI summit</p></li><li><p>recruiting and supporting Jason Gross to do the <a href="https://blog.atlascomputing.org/p/progress-in-autoformalization-experiments">experiments</a> that became the foundation and impetus for Theorem Labs</p></li><li><p>helping build and guide the <a href="https://flexheg.com/https://blog.atlascomputing.org/p/announcing-flexible-hardware-enabled">nascent flexHEG community</a>, mentoring projects and building teams, including bringing in Mehmet Sencan to develop commercial tamper response mechanisms.</p></li></ul><p>I wrote about this in <a href="https://groups.google.com/a/atlascomputing.org/g/updates/c/iMIRnmbZh0M">my Q3 update</a>, but I&#8217;m now 50% time at Convergent Research to create two FROs* that reduce risks from AI.  One of those efforts will be focused on tools to validate formal specifications, continuing our work on <a href="https://blog.atlascomputing.org/p/ide-for-validating-specifications">an IDE for formal specifications</a>, and the other will focus on building useful hardware for AI compute governance.  That frees up Atlas Computing to look upstream of FRO creation and identify what teams or projects should exist to start addressing neglected potential catastrophic risks from AI.  </p><p>Our first step is talking to experts and making a list.  Once we have that list, we'll share it with all of you here. </p><p>After that, we'll start refining our understanding of the problems, identifying potential solutions, relevant experts, potential supporters, and  interested stakeholders before gift-wrapping these and hunting for founders.</p><p>We hold a somewhat contrarian intake that the &#8220;generalist founder archetype&#8221; isn&#8217;t a binary characteristic, and we can lower the barrier to entry for creating an organization to address these risks IF the potential directly-responsible individual is provided the right problem, relevant context, initial milestones, stakeholders, and advisors.  I&#8217;d claim that we demonstrated this with Mehmet Sencan and Jason Gross, neither of whom were founders before joining Atlas. 
</p><p>I hope we can reproduce this model and look forward to sharing our learnings with you as we try!  (And even if we&#8217;re wrong, hopefully the list we make provides some useful starting points for others.)</p><p></p><p>*for those who don&#8217;t know, a Focused Research Organization (FRO) is a tightly scoped, time-bound initiative (typically ~5 years) that pursues ambitious technical milestones (like large datasets, next-gen tools, or open protocols) through startup-style execution by a team of about 10&#8211;30 full-time employees. Its mission is to create and deploy high-impact public goods into the world, via open-sourcing, partnerships, or spinouts, rather than pursuing open-ended research. More here: <a href="https://www.convergentresearch.org/about-fros">https://www.convergentresearch.org/about-fros</a> </p>]]></content:encoded></item><item><title><![CDATA[Daniel Windham: Passing the Torch]]></title><description><![CDATA[Reflections on our work as I transition out of Atlas]]></description><link>https://blog.atlascomputing.org/p/passing-the-torch</link><guid isPermaLink="false">https://blog.atlascomputing.org/p/passing-the-torch</guid><dc:creator><![CDATA[Daniel Windham]]></dc:creator><pubDate>Sat, 12 Jul 2025 03:58:22 GMT</pubDate><enclosure url="https://substack-post-media.s3.amazonaws.com/public/images/65d49528-e2f6-4a0c-9b84-55d85b584dfa_704x704.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>After over a year and a half as co-founder and CTO of Atlas, I&#8217;m writing to share that I&#8217;ll be stepping away from my role at the organization. This is a personal transition, not a pivot for Atlas. Our mission to help humans govern increasingly powerful AI systems by setting clear rules remains as vital as ever. I&#8217;m thrilled that Evan and the rest of the Atlas team will carry forward this mission.</p><p>When Evan and I started Atlas, we had an ambitious hypothesis: that the future of trustworthy AI depends not just on better models, but on better review - tools that help humans understand, validate, and specify the rules that AI systems must follow. 
The growing pile of evidence for this claim now ranges from <a href="https://mashable.com/article/ai-generated-resumes-overwhelming-recruiters">growing popularity</a> of AI tools for screening AI-generated resumes to <a href="https://www.anthropic.com/research/agentic-misalignment">agentic misalignment</a>, where agents behave differently when they believe they&#8217;re not being monitored.</p><p>More actionably, we believed emerging AI would make it possible to bring the power of mathematical guarantees to people who aren&#8217;t specialists in formal methods, and to develop tools that scale human judgment, not replace it.</p><p>This would have been a tall order for even the most experienced experts in the world, and I&#8217;m proud of the early steps we&#8217;ve taken toward this vision. Atlas incubated the development of two critical safety technologies and companies: the flexible hardware governors that became Earandil and the Lean autoformalization that became Theorem. In our in-house R&amp;D, we&#8217;ve developed an IDE for specification validation backed by AI tools for aligning formal specs with natural-language documentation. And through our community engagement, we&#8217;ve helped shepherd tremendous growing attention and momentum in the GSAI community. Along the way, we&#8217;ve gotten to learn from and collaborate with pioneers in formal methods, AI safety, and community building. Most of all, we&#8217;ve built a team that cares deeply about doing this right.</p><p>Going forward, I&#8217;m thrilled that Alexandre Rademaker will be leading technical work at Atlas. Alex brings world-class expertise in logic, formal verification, and natural language processing, and he&#8217;s already been instrumental in driving our Spec IDE forward. He&#8217;ll lead our research work funded by Schmidt Sciences and our technical collaboration with the Beneficial AI Foundation, and I know he&#8217;ll do great things.</p><p>I&#8217;ll always be cheering for Atlas and I&#8217;m incredibly excited for what&#8217;s next.</p><p>Thank you to everyone who&#8217;s supported us, collaborated with us, or just shared ideas along the way. If you care about tools that help humans stay in the driver&#8217;s seat as AI systems become more capable, you should keep an eye on Atlas. The work is just getting started.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://blog.atlascomputing.org/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Atlas Blog! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[A refinement-based paradigm for code generation]]></title><description><![CDATA[Consider this to be an extended answer to the question &#8220;why would you build the IDE for specification&#8221; described in our previous blogpost. 
The questions in this document are a modified+truncated Heilmeier Catechism that were part of a (rejected) proposal, but I wanted to share it as a public artifact.]]></description><link>https://blog.atlascomputing.org/p/a-refinement-based-paradigm-for-code</link><guid isPermaLink="false">https://blog.atlascomputing.org/p/a-refinement-based-paradigm-for-code</guid><dc:creator><![CDATA[Evan Miyazono]]></dc:creator><pubDate>Tue, 03 Jun 2025 14:31:17 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!kIxR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F823c0a2a-69bf-429a-9f92-14ad84fcff79_2526x1490.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Consider this to be an extended answer to the question &#8220;why would you build the IDE for specification&#8221; described in <a href="https://blog.atlascomputing.org/p/ide-for-validating-specifications">our previous blogpost</a>.  The questions in this document are a modified+truncated <a href="https://www.darpa.mil/work-with-us/heilmeier-catechism">Heilmeier Catechism</a> that were part of a (rejected) proposal, but I wanted to share it as a public artifact.  </p><div><hr></div><h2>&#8220;How is it done today, and what are the limits of current practice?&#8221; What&#8217;s the default trajectory + why is that not ideal?</h2><p><strong>All strategies for reducing AI risk follow the same limited paradigm</strong>: some combination of evaluations and benchmarks (measuring how capable or risky the AI systems are), red-teaming (trying to get AI systems to do the wrong thing), and &#8220;alignment&#8221;. Alignment is a vague notion that you&#8217;re encoding your values into the AI system itself, so that you don&#8217;t have to review its behavior.</p><p><strong>Relying on this paradigm is fundamentally risky for many reasons</strong>, the first of which is that you have no reason to believe that the system is aligned with your goals, or that alignment is even possible. Additionally, you shouldn&#8217;t believe these practices can catch all possible failure modes. Lastly, alignment implies researchers are getting systems to behave well by indoctrinating them with cultural preferences/norms/values, which means that AI systems become vectors for ideologies.</p><p><strong>Atlas Computing wants to build and deploy an alternative paradigm</strong>. Instead of handing off decision-making to AI systems and hoping that the AI systems will behave as the user would like, <strong>we propose building tools that set rules for AI systems that they prove they&#8217;re following</strong>. This is very similar to the research directions proposed by Davidad, Yoshua Bengio, Stuart Russell, Max Tegmark, and others. 
But rather than a pure research effort, we want to build and deploy prototypes of systems to make human review possible at scale. The natural place to start doing this is with a technologically adept and forward-looking government. We hope this will help establish Singapore as an ambitious and pragmatic international partner on AI innovation and governance, while raising security and resilience baselines for AI and upskilling the workforce through sector-specific AI training programs.</p><p>Not all areas of AI use are equally risky - <strong>we intend to start with the systems where we believe that our paradigm of specification-based AI will show the strongest benefits, namely AI systems generating software and the structuring of natural language responses</strong>. This proposal comprises 3 parts:</p><ul><li><p>The remainder of part 1 describes the overarching vision of specification-based AI</p></li><li><p>Part 2 describes our first development direction: developing and validating formal specifications of software on the path to specification-driven AI generation of software</p></li><li><p>Part 3 describes a useful tool that uses specifications to improve the quality of responses from large language models (LLMs), reducing hallucinations and improving communication quality and clarity.</p></li></ul><p>We&#8217;re proposing pursuing the work in Part 2 and Part 3 simultaneously.</p><h2>What are you trying to do? What is the vision? Articulate your objectives using absolutely no jargon.</h2><p>From above: &#8220;<strong>we propose building tools that set rules for AI systems that they prove they&#8217;re following.</strong>&#8221; Let&#8217;s break down this description of the paradigm we&#8217;re proposing.</p><p><strong>&#8220;rules for AI systems&#8221;:</strong></p><ul><li><p>Users of AI systems should be able to describe in very precise terms what properties AI outputs should have &#8211; this holds for various types of AI outputs, like software, pictures, audio, engineering designs, news articles, and legal opinions. For a concrete example, and for simplicity, let&#8217;s consider an AI system that generates images, though part 2 of this proposal will focus on AI-generated software. At present, it&#8217;s very challenging for genAI systems to generate images that have every named feature (especially text).</p></li><li><p>We need a language to set rules for different types of AI outputs. (Note that this isn&#8217;t unprecedented - we have legal terms for various domains of law.) Constitutional AI is an imprecise form of this, where the specifications are written in plain English (natural language) and a different language model plays the role of adjudicator. But the languages need to be very precise, so that we don&#8217;t have to abdicate review to an AI system. For our image-generating example, this language would likely include terms for image styles, as well as words that denote the presence/absence/location/orientation of a feature.</p><ul><li><p>This specification language does not need to be able to describe every aspect of the AI-generated output, but should be the medium through which user preferences and governance processes can control the content. 
For example, in an image, a US Supreme Court Justice once said that the threshold for an indecent image is &#8220;I know it when I see it&#8221;, and while we don&#8217;t argue for the elimination of subjective opinion or human evaluation, objective specifications could define guidelines or conservative boundaries to empower human review.</p></li></ul></li></ul><p><strong>&#8220;That they prove they&#8217;re following&#8221;:</strong></p><ul><li><p>Once we have specified what we want from an AI output, the AI system should present the user with both the output and corresponding evidence or proof that the output follows the rules. This is analogous to compliance processes (wherein solutions are presented with evidence that the solution adheres to relevant requirements). However, in this case, the required properties and evidence should be able to be evaluated by computers so as not to increase the burden of review. One could imagine one day applying this to not just formal verification of software but simplifying other aspects of compliance, ranging from structural engineering to early-stage drug development (validated via simulation).</p></li></ul><p><strong>&#8220;tools that set rules&#8221;:</strong></p><ul><li><p>As these rules might end up looking like a programming language in their own right, it is important to build tools to make it easy for normal users to set these rules and understand the implications of these rules..</p></li></ul><p><strong>&#8220;we propose building tools&#8221;:</strong></p><ul><li><p>We believe that the best first step to convincing anyone this is a better workflow is to start by prototyping the tools.</p></li><li><p>The intuitive next question becomes &#8220;how would these tools work?&#8221;. We envision a world where AI systems empower you by helping you understand and manage the complexity of the world &#8212; not serving you so that you can abdicate control and responsibility.</p></li></ul><ul><li><p>Someone with minimal (or even no) technical background should be able to use AI to develop a new software application, design a structure, write a book, or generate a new work of art while deciding exactly how much attention to pay to any design decision.</p></li><li><p>Their tools should empower them to justify any decision to even an expert in that field.</p></li><li><p>They should be able to generate descriptions of software systems or any other AI-generated artifact at any level of specificity, and use AI tools to refine the specificity of those requirements until the system is sufficiently constrained, at which point AI tools generate the artifact and prove that it matches the user-generated specifications.</p></li></ul><p>The following figure shows different forms of possible specifications when designing a complex system. We expect a user to start with an informal, big-picture sense of the desired solution (i.e. 
the left column) and use AI tooling to identify and make well-informed design decisions until tests and an implementation are reached.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!kIxR!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F823c0a2a-69bf-429a-9f92-14ad84fcff79_2526x1490.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!kIxR!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F823c0a2a-69bf-429a-9f92-14ad84fcff79_2526x1490.png 424w, https://substackcdn.com/image/fetch/$s_!kIxR!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F823c0a2a-69bf-429a-9f92-14ad84fcff79_2526x1490.png 848w, https://substackcdn.com/image/fetch/$s_!kIxR!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F823c0a2a-69bf-429a-9f92-14ad84fcff79_2526x1490.png 1272w, https://substackcdn.com/image/fetch/$s_!kIxR!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F823c0a2a-69bf-429a-9f92-14ad84fcff79_2526x1490.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!kIxR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F823c0a2a-69bf-429a-9f92-14ad84fcff79_2526x1490.png" width="1456" height="859" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/823c0a2a-69bf-429a-9f92-14ad84fcff79_2526x1490.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:859,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:497592,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.atlascomputing.org/i/164762658?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F823c0a2a-69bf-429a-9f92-14ad84fcff79_2526x1490.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!kIxR!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F823c0a2a-69bf-429a-9f92-14ad84fcff79_2526x1490.png 424w, https://substackcdn.com/image/fetch/$s_!kIxR!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F823c0a2a-69bf-429a-9f92-14ad84fcff79_2526x1490.png 848w, https://substackcdn.com/image/fetch/$s_!kIxR!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F823c0a2a-69bf-429a-9f92-14ad84fcff79_2526x1490.png 1272w, https://substackcdn.com/image/fetch/$s_!kIxR!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F823c0a2a-69bf-429a-9f92-14ad84fcff79_2526x1490.png 1456w" sizes="100vw" loading="lazy"></picture><div 
class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p><strong>Here, every grey arrow is a tool that helps you refine your model of what you want to build with the AI system.  </strong>The tool should help you ensure consistency across the various levels of abstraction and formality, and could/will look like a our IDE with a pair of panels, as described in <a href="https://blog.atlascomputing.org/p/ide-for-validating-specifications">our previous post</a>.  (Here&#8217;s the video demo of the current status of our tool if you missed it.)</p><div id="youtube2-wfPr0aCzYXA" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;wfPr0aCzYXA&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/wfPr0aCzYXA?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>While this diagram could apply to multiple types of AI outputs, for the remainder of this proposal, examples will focus on AI systems generating software. We choose this as our first direction because formal languages for completely specifying the behavior of software already exist.</p><h2>How can this be broken down into manageable steps?</h2><p>This can be built incrementally by identifying the parts of the above diagram that are already labor-intensive actions currently done by hand, and that building better tools for those actions to enable step-by-step progress toward a comprehensive product. 
We propose the following roadmap as a rough order (where we&#8217;re building tools represented here as arrows, as they make conversions between artifacts).</p><p>Success would likely be measured by traditional usage metrics, like the number of active users, their reports on the effectiveness of the tool, and the tool&#8217;s prospects to impact a larger number of people.</p><h3>Year 1</h3><p>In the first year, we focus on small software systems that are composed of a small number of functions and develop tools to convert informal specifications to formal descriptions and property tests.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!yBG9!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85643e75-296c-4b42-b040-23bfdaaca58f_2334x902.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!yBG9!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85643e75-296c-4b42-b040-23bfdaaca58f_2334x902.png 424w, https://substackcdn.com/image/fetch/$s_!yBG9!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85643e75-296c-4b42-b040-23bfdaaca58f_2334x902.png 848w, https://substackcdn.com/image/fetch/$s_!yBG9!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85643e75-296c-4b42-b040-23bfdaaca58f_2334x902.png 1272w, https://substackcdn.com/image/fetch/$s_!yBG9!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85643e75-296c-4b42-b040-23bfdaaca58f_2334x902.png 1456w" sizes="100vw"><img src="https://substackcdn.com/image/fetch/$s_!yBG9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85643e75-296c-4b42-b040-23bfdaaca58f_2334x902.png" width="1456" height="563" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/85643e75-296c-4b42-b040-23bfdaaca58f_2334x902.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:563,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:293987,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.atlascomputing.org/i/164762658?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85643e75-296c-4b42-b040-23bfdaaca58f_2334x902.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!yBG9!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85643e75-296c-4b42-b040-23bfdaaca58f_2334x902.png 424w, https://substackcdn.com/image/fetch/$s_!yBG9!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85643e75-296c-4b42-b040-23bfdaaca58f_2334x902.png 848w, 
https://substackcdn.com/image/fetch/$s_!yBG9!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85643e75-296c-4b42-b040-23bfdaaca58f_2334x902.png 1272w, https://substackcdn.com/image/fetch/$s_!yBG9!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F85643e75-296c-4b42-b040-23bfdaaca58f_2334x902.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>Year 2</h3><p>In the second year, we start looking at larger systems and adapt our system to incorporate architectural requirements in addition to the fine-structure and high-level requirements.</p><div class="captioned-image-container"><figure><a class="image-link image2 is-viewable-img" target="_blank" href="https://substackcdn.com/image/fetch/$s_!KeTS!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdee8859e-d5b6-4daf-a89d-7dee6ce63068_2300x1316.png" data-component-name="Image2ToDOM"><div class="image2-inset"><picture><source type="image/webp" srcset="https://substackcdn.com/image/fetch/$s_!KeTS!,w_424,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdee8859e-d5b6-4daf-a89d-7dee6ce63068_2300x1316.png 424w, https://substackcdn.com/image/fetch/$s_!KeTS!,w_848,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdee8859e-d5b6-4daf-a89d-7dee6ce63068_2300x1316.png 848w, https://substackcdn.com/image/fetch/$s_!KeTS!,w_1272,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdee8859e-d5b6-4daf-a89d-7dee6ce63068_2300x1316.png 1272w, https://substackcdn.com/image/fetch/$s_!KeTS!,w_1456,c_limit,f_webp,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdee8859e-d5b6-4daf-a89d-7dee6ce63068_2300x1316.png 1456w" sizes="100vw"><img 
src="https://substackcdn.com/image/fetch/$s_!KeTS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdee8859e-d5b6-4daf-a89d-7dee6ce63068_2300x1316.png" width="1456" height="833" data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/dee8859e-d5b6-4daf-a89d-7dee6ce63068_2300x1316.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:833,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:467703,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:&quot;image/png&quot;,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:&quot;https://blog.atlascomputing.org/i/164762658?img=https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdee8859e-d5b6-4daf-a89d-7dee6ce63068_2300x1316.png&quot;,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!KeTS!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdee8859e-d5b6-4daf-a89d-7dee6ce63068_2300x1316.png 424w, https://substackcdn.com/image/fetch/$s_!KeTS!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdee8859e-d5b6-4daf-a89d-7dee6ce63068_2300x1316.png 848w, https://substackcdn.com/image/fetch/$s_!KeTS!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdee8859e-d5b6-4daf-a89d-7dee6ce63068_2300x1316.png 1272w, https://substackcdn.com/image/fetch/$s_!KeTS!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fdee8859e-d5b6-4daf-a89d-7dee6ce63068_2300x1316.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><h3>Year 3</h3><p>In the third (and last year of the proposal) we complete and polish most translation tools. 
Additionally, we prototype tools to generate verified implementations, though we expect significant progress will be made on this front by other efforts to advance AI-based code synthesis and AI for proof generation.</p><p>After the first year, we would also hope to deploy enough tooling into practice to showcase our implementations of specification-based AI.</p>
<div class="captioned-image-container"><figure><img src="https://substackcdn.com/image/fetch/$s_!cnab!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F5e088500-ac53-44c8-b378-f83f34ec9563_2284x1312.png" width="1456" height="836" alt=""></figure></div><p>By year 3, the tool should support each of the following engineering workflows:</p><ul><li><p>New workflow: Start a new Mapped Project from scratch</p></li><li><p>Initialize workflow: Convert an existing project into a Mapped Project</p></li><li><p>Update workflow: Evolve a Mapped Project as part of development or maintenance work</p></li></ul><h1>In summary</h1><p>We want to empower people who have an idea of what they want to build with an AI system to:</p><ul><li><p>continuously refine their sense of what they want built,</p></li><li><p>make informed design decisions, prompted by AI systems, and</p></li><li><p>focus on requirements of <em>what</em> should be built, and be able to ignore how it&#8217;s built.</p></li></ul><p>We think that looks like an IDE for specifying properties of software, so we&#8217;re building the platform.</p>
]]></content:encoded></item><item><title><![CDATA[IDE for validating specifications]]></title><description><![CDATA[A reminder of our overarching vision]]></description><link>https://blog.atlascomputing.org/p/ide-for-validating-specifications</link><guid isPermaLink="false">https://blog.atlascomputing.org/p/ide-for-validating-specifications</guid><dc:creator><![CDATA[Evan Miyazono]]></dc:creator><pubDate>Thu, 22 May 2025 13:07:19 GMT</pubDate><enclosure url="https://substackcdn.com/image/youtube/w_728,c_limit/wfPr0aCzYXA" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2>A reminder of our overarching vision</h2><p>AI promises to power high-assurance software that is cheap and plentiful with strong guarantees. But guarantees are only helpful if they guarantee what we care about. It will fall to human engineers to determine whether they got the guarantees they need.</p><p>This is fundamentally a human-in-the-loop design challenge. Therefore, we&#8217;re researching how humans make sense of formal specifications and what leads them to resolve issues and establish confidence in these specs*.</p><p>In AI-driven workflows, humans will describe what they want by using their existing designs and documentation, and/or by describing what they want on the spot. These descriptions will be informal compared to the level of precision used in formal specifications. AI systems will increasingly handle converting these documents into formal specifications and the subsequent verified code synthesis. Still, humans will need to review the many clarifying assumptions that refined the informal documents into formal specifications. Human review is vital because these clarifying assumptions will change the meaning of the specification, and because ultimately, the formal specification is what humans trust when they decide to deploy.</p><p>To understand and address human review needs, Atlas is studying how professional cryptography engineers establish trust in formalizations of existing natural language specifications.</p><h2>Where the tool is now</h2><p>To do this, we&#8217;ve built a tool for specification understanding and validation. We&#8217;re taking the approach that this should be an open-source platform where anyone can add modular features (like counterexample generation). It&#8217;s easy to start waxing poetic about a future paradigm of specification-driven AI (and we will in the next post), but concretely, we&#8217;re starting by simply doing line-by-line mapping between natural language and a formal spec.</p><p><strong>Our goal before the end of this year</strong> is for a software developer with no experience in formal methods to be able to find a mistake we introduced into a mechanized formal specification of a system they&#8217;re familiar with simply because the tool steers them toward understanding that the spec says something that is not what they intend.</p><p>If you&#8217;re interested, we have monthly updates for one of our grantors <a href="https://docs.google.com/document/d/1ioKk5ILjgRjVpLrc2-jWzzEosN0QgesuGYju5NJ-fEM/edit?tab=t.0">here</a> in Google Docs. 
You can also check out the code directly on GitHub: <a href="https://github.com/atlas-computing-org/formal-specification-ide">https://github.com/atlas-computing-org/formal-specification-ide</a>.</p><h2>Signal is a great testbed</h2><p>To demonstrate a specific use case, we&#8217;ve taken the documentation of X3DH from the Signal Foundation and mapped it to Lean.</p><div id="youtube2-wfPr0aCzYXA" class="youtube-wrap" data-attrs="{&quot;videoId&quot;:&quot;wfPr0aCzYXA&quot;,&quot;startTime&quot;:null,&quot;endTime&quot;:null}" data-component-name="Youtube2ToDOM"><div class="youtube-inner"><iframe src="https://www.youtube-nocookie.com/embed/wfPr0aCzYXA?rel=0&amp;autoplay=0&amp;showinfo=0&amp;enablejsapi=0" frameborder="0" loading="lazy" gesture="media" allow="autoplay; fullscreen" allowautoplay="true" allowfullscreen="true" width="728" height="409"></iframe></div></div><p>Here you can see the markdown description of the X3DH protocol on the left, and the corresponding formalization of sending the initialization message in Lean on the right. Gray highlights show text that has a corresponding chunk of text on the other side. Yellow highlights are meant to be warnings (possible inconsistencies), and red highlights are likely problems in the correspondence between description and spec.</p><p>The current components in this demo are:</p><ul><li><p>The IDE (built by Atlas)</p></li><li><p>Signal&#8217;s natural-language specification of their X3DH protocol at <a href="https://signal.org/docs/specifications/x3dh/">https://signal.org/docs/specifications/x3dh/</a></p></li><li><p>A hand-generated formalization of this specification in Lean (by Atlas)</p></li><li><p>Some hand-generated annotations that identify relationships and concerns between the informal and formal language (by Atlas)</p></li></ul><p>We plan to add the following capabilities:</p><ul><li><p>AI-copilot-style generation of formal specifications</p></li><li><p>AI-generated annotations mapping parts on the left side to the right side</p><ul><li><p>AI can do this, but doesn&#8217;t do a great job</p></li><li><p>We&#8217;ll generate these, compare, and identify paths to improving outputs</p></li></ul></li><li><p>Integration into VSCode so Lean or other formal code can leverage state-of-the-art IDE features</p></li></ul><p>We&#8217;re already finding this valuable in our spec formalization. Mapping how our Lean code matched Signal&#8217;s specification and calling out our simplifying assumptions caught multiple mistakes we&#8217;d made and suggested additional design improvements. We&#8217;re excited to see how this can help others.</p>
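<p>To give a flavor of what the Lean side of such a mapping can look like, here is a purely illustrative sketch of how the X3DH initial message might be modeled as a Lean structure. This is not the Atlas formalization; every name, field, and type below is an assumption invented for this example.</p><pre><code>-- Illustrative sketch only; not the actual Atlas formalization of X3DH.
-- Field names and types are assumptions chosen for readability.
structure InitialMessage where
  identityKeyA    : ByteArray    -- Alice's long-term identity public key
  ephemeralKeyA   : ByteArray    -- Alice's one-time ephemeral public key
  signedPrekeyId  : Nat          -- identifier of Bob's signed prekey that was used
  oneTimePrekeyId : Option Nat   -- Bob's one-time prekey id, if one was consumed
  ciphertext      : ByteArray    -- initial message body, encrypted under the derived key

-- One property a spec might state about the protocol: the initial ciphertext is never empty.
def initialMessageNonEmpty (m : InitialMessage) : Prop :=
  m.ciphertext.size ≠ 0</code></pre><p>The point of the IDE is then to show, line by line, which sentences of Signal&#8217;s prose each such declaration claims to capture, and where that correspondence looks shaky.</p>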
<a href="mailto:hello@atlascomputing.org">hello@atlascomputing.org</a></p><p></p><div><hr></div><p>* Don&#8217;t take our word for it; <a href="https://www.youtube.com/watch?v=TWMXGiyPx7A">here&#8217;s Talia Ringer at HoTSoS </a>talking about the spec validation problem at the end</p><blockquote><p><strong>(46:08): </strong>This is like a challenge I want to leave people with &#8211; I think the most important problem right now in this space is to figure out, what tools can actually best help users make sense of a generated specification that comes out of one of these tools.</p></blockquote><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://blog.atlascomputing.org/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Atlas Blog! Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Progress in autoformalization experiments]]></title><description><![CDATA[Can today's AI systems generate formally verified code?]]></description><link>https://blog.atlascomputing.org/p/progress-in-autoformalization-experiments</link><guid isPermaLink="false">https://blog.atlascomputing.org/p/progress-in-autoformalization-experiments</guid><dc:creator><![CDATA[Evan Miyazono]]></dc:creator><pubDate>Tue, 22 Apr 2025 14:22:19 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!z7_V!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa934f4a-87aa-4c4e-817f-95692a8622b6_1898x970.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>programming note: Our quarterly update (separate from this blog) went out last week &#8212; check it out <a href="https://groups.google.com/a/atlascomputing.org/g/updates/c/i9N6u7OK4hQ">here</a>, or tell me if you&#8217;d like those cross-posted here in the future. Also, this post is much shorter than what we&#8217;ve done previously; let us know if you like this format.</em></p><div><hr></div><p>First off, a team update: We&#8217;re excited to announce that Jason Gross has spun out of Atlas Computing!</p><p>In his (regrettably brief but) eventful time at Atlas Computing, Jason primarily ran experiments using LLMs to transpile from Coq to Lean (<a href="https://github.com/JasonGross/autoformalization-transpilation">repo</a>) while advising work on our specification validation tool (<a href="https://github.com/atlas-computing-org/formal-specification-ide">repo</a>; <a href="https://docs.google.com/presentation/d/1iJx_KMywm26_vN663SGOK__Q8LOg9JGXb_e9u2Mi9g0/edit#slide=id.g32d928c7668_0_139">summary slides</a>). 
These translation efforts were an important evaluation and demonstration: not simply to show that libraries or verification tools in one proof system could benefit others, but also to show that today&#8217;s AI systems are sufficient to significantly automate various processes related to proof generation and debugging.</p><p>There were generally only minor issues that required effort to get this system working; for instance, the work resulted in a handful of new bug reports in the Coq proof assistant.</p><p>The nontrivial part of automating this was validating the translations:</p><ul><li><p><a href="https://github.com/rocq-community/rocq-lean-import">There&#8217;s already a tool that converts from Lean to Coq</a> (not source-to-source, but compiled output to compiled output)</p></li><li><p>We started with Coq, and used an LLM to generate Lean.</p></li><li><p>This was then compiled and sent through the rocq-lean-import tool</p></li><li><p>Now we can compare the compilation of the original Coq against the twice-translated Coq, like so:</p></li></ul>
srcset="https://substackcdn.com/image/fetch/$s_!z7_V!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa934f4a-87aa-4c4e-817f-95692a8622b6_1898x970.png 424w, https://substackcdn.com/image/fetch/$s_!z7_V!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa934f4a-87aa-4c4e-817f-95692a8622b6_1898x970.png 848w, https://substackcdn.com/image/fetch/$s_!z7_V!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa934f4a-87aa-4c4e-817f-95692a8622b6_1898x970.png 1272w, https://substackcdn.com/image/fetch/$s_!z7_V!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ffa934f4a-87aa-4c4e-817f-95692a8622b6_1898x970.png 1456w" sizes="100vw" fetchpriority="high"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a><figcaption class="image-caption">How to validate LLM translation (main Atlas-built parts are in green)</figcaption></figure></div><p>In Coq, the goals look fiendishly complicated and the proofs look trivial. This is because Lean uses a powerful elaborator on simple primitives while Coq uses a weak elaborator on more powerful primitives, but if you know the structure of how things should reduce, you can basically make it all go away.</p><p>That said capabilities were far better than expected and, we believe, far more useful in practice than most practitioners of formal verification believe. For instance, with some hand-holding, Jason got a frontier model to compose and prove a specification of program equivalence &#8212; a couple hundred lines of working Lean code in a couple hours. We have high confidence this will replicate across important codebases, and scale to larger and more complex tasks as models improve.</p><p>As a result, we&#8217;re excited that Jason will be dramatically scaling up our expectations of this effort. 
<p>As a result, we&#8217;re excited that Jason will be dramatically scaling up our expectations of this effort. Here&#8217;s Jason&#8217;s home page if you&#8217;re interested: <a href="https://jasongross.github.io/">https://jasongross.github.io/</a></p>]]></content:encoded></item><item><title><![CDATA[Govern AI with Rules, Not Values]]></title><description><![CDATA[A Vision of Specification-Driven AI]]></description><link>https://blog.atlascomputing.org/p/govern-ai-with-rules-not-values</link><guid isPermaLink="false">https://blog.atlascomputing.org/p/govern-ai-with-rules-not-values</guid><dc:creator><![CDATA[Evan Miyazono]]></dc:creator><pubDate>Tue, 01 Apr 2025 14:16:08 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a919151-a360-4cd0-8f23-02cb3524cb53_700x700.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p><em>&#8220;it is indispensable that they should be bound down by strict rules and precedents, which serve to define and point out their duty in every particular case that comes before them&#8221;</em></p><p><em>Alexander Hamilton, Federalist 78, describing the judiciary, who were to become arbiters of the law</em></p><h1>1. The case for specification-driven AI</h1><p>If we continue treating AIs as human, we will yield our humanity to them. I&#8217;m claiming that specification-driven AI is a paradigm in which humans can translate our notions of norms and morality so that human-level AI systems can be required to respect human autonomy and negotiate concepts like morality as peers.</p><p>Consider this example:</p><blockquote><p>If you hired a contractor for a kitchen renovation, you wouldn't share your life philosophy and aesthetic values in hopes that the contractor intuits what kind of cabinets you want. Instead, you&#8217;d provide detailed specifications: measurements, materials, deadlines, and acceptance criteria. Perhaps you'd work with a designer first, who specializes in developing these specifications. Importantly, the contractor must also adhere to building codes and safety regulations &#8212; external specifications that constrain what can be built regardless of client preferences. The contractor then delivers precisely what was asked for and provides evidence they've met the requirements.</p></blockquote><p>This contractor relationship is fundamentally different from how we form collaborative relationships with other people. With employees, we train them, imbue them with company values, surround them with company culture, and guide them toward a company mission and goals. Then we grant them increasing amounts of autonomy in pursuit of our shared objectives. 
With children, we guide them and teach them through examples; we pass on values through principles like &#8220;treat others as you would like to be treated,&#8221; as well as by providing copious feedback; and we give them increasing amounts of independence and autonomy.</p><p>I claim that effectively <strong>all of today&#8217;s concerns about AI come from the fact that we are (perhaps unintentionally) trying to slot AI systems into the &#8220;human&#8221; category of the social fabric when we should be treating them as non-human entities, similar to corporations.</strong> Trust leads to anthropomorphizing AI systems, which creates issues because they&#8217;re fundamentally not human. Alternatively, adopting a contractor-like framework could both dramatically reduce the likelihood of failure modes from unsafe AI and facilitate faster, more effective adoption of AI capabilities. This distinction is not merely academic. As AI capabilities expand, the way we instruct and govern these systems will fundamentally shape their impact on society.</p><p>In the remainder of this essay, I&#8217;ll explain&#8230;</p><ul><li><p>Why you should be skeptical of the current strategies to &#8220;align&#8221; and improve AI,</p></li><li><p>What we can do instead: scale human review through formalization of policies,</p></li><li><p>What challenges must be overcome to adopt this paradigm, and</p></li><li><p>What the future of this approach looks like.</p></li></ul><h1>2. Values-based alignment is not democratic</h1><p>Companies and research labs developing frontier large language models claim that the solution to misbehaving AI is &#8220;alignment,&#8221; in which the researchers more accurately and effectively embed norms and values into the AI systems themselves. However, surveyed AI researchers widely consider the question &#8220;How can one align an AI?&#8221; to be among the <a href="https://aiimpacts.org/wp-content/uploads/2023/04/Thousands_of_AI_authors_on_the_future_of_AI.pdf">most important</a> unsolved research questions in the field of AI.</p><p>An aligned AI would have internalized human values and preferences, allowing it to &#8220;do what humans want,&#8221; or perhaps something as specific as &#8220;do what the user wants,&#8221; across diverse contexts. This goal may seem intuitive, but even if alignment were solved tomorrow, there would still be challenges about what to align it to:</p><p>Your preferences vary over time: Your values today might not match your values tomorrow, next year, or five years from now. Which version of yourself should an AI align with? An AI system that defers to your future values would be deemed paternalistic, yet we (rightfully) criticize recommendation engines for exploiting our fleeting desires as they maximize engagement. Furthermore, I expect that confidence in an ostensibly aligned AI system would be further eroded if I knew it wasn&#8217;t actually aligned to me, but rather to a frontier lab&#8217;s approximation of me.</p><p>You (provably) can&#8217;t please all the people all the time: When multiple people use the same AI system, whose values take precedence? <a href="https://en.wikipedia.org/w/index.php?title=Arrow's_impossibility_theorem">Arrow's impossibility theorem</a> shows that it is provably mathematically impossible to combine preferences (e.g. in a vote) and preserve all of some basic, intuitive fairness properties. 
Even going from a group of <em>n</em> people with opinions on some options to saying &#8220;the group has opinions on the options&#8221; requires that you discard all but <em>1/n</em> of the information! (What mechanism you use, whether it be ranked-choice voting, quadratic voting, or markets, should be thought of as simply a choice of which information you&#8217;re disregarding.) By comparison, alignment seems to assume we can convince an AI system to act morally; how long after a company claims to have an AI that acts morally will they (or others) begin to point to the AI&#8217;s actions as not just an example of morality, but the paragon of it?</p><p>Alignment amplifies cultural conflicts: If the alignment problem were solved today, AI systems would become vectors for ideologies. America, China, and other countries are all struggling to ensure their values are embedded in AI systems. This concentrates extraordinary power in the hands of those who define these values, and increases geopolitical instability as countries attempt to ensure that any possible superintelligence would be their culture&#8217;s ideological successor. Furthermore, preferences evolve over time, creating a potential dilemma: either we allow future generations to update the values of long-lived AI systems (undermining the strength of alignment today), or we risk a future where humans are governed by superintelligent entities enforcing centuries-old value systems that no longer reflect contemporary moral understanding.</p><p>The interpretation problem: Values are inherently ambiguous and context-dependent. Consider a simple instruction to an AI image generator to create "historically accurate" images. Should it:</p><ul><li><p>Replicate biased representation from historical training data?</p></li><li><p>Correct for historical bias while maintaining period authenticity?</p></li><li><p>Optimize for some middle ground between accuracy and contemporary values?</p></li></ul><p>Moral philosophy points to the fundamental nature of the &#8220;is/ought chasm&#8221; (a.k.a. Hume&#8217;s Guillotine): you cannot, through logic alone, conclude a statement about what the world <em>should</em> be solely from statements about how the world <em>is</em>. Therefore, reaching an objectively correct answer about what an AI system <em>should</em> do requires that we start from some implicit or explicit premise about what &#8220;<em>good</em>&#8221; is.</p><p>These fundamental limitations (temporal inconsistency of individual preferences, impossibility of lossless preference aggregation, ideological amplification, and inherent ambiguity of values) reveal why the alignment paradigm faces both technical and philosophical obstacles. Rather than planning for AI systems to internalize and interpret human values, we need an approach that establishes clear boundaries, enables objective verification, and preserves human agency in determining acceptable AI behaviors. Specification-driven AI offers precisely this alternative path.</p><h1>3. 
What is specification-driven AI</h1><p>Rather than embedding values into an AI to improve alignment, specification-driven (or spec-driven) AI is a family of approaches in which a user generates a formal specification &#8212; criteria expressed precisely and unambiguously so they can be automatically verified &#8212; and the AI&#8217;s output is verified against that specification, ideally with formal logic or mathematical proofs.</p><p>The figure below illustrates how people currently use AI systems and compares it to a spec-driven AI workflow, which has the following steps (a minimal Lean sketch of the three steps follows the list):</p><ol><li><p>First, the user generates a human-reviewable, formal specification (the &#8220;Solution Spec&#8221; in the figure below). While the definition of a formal spec will be provided in the next subsection, consider it to be a list of all the objective properties that a solution should have. This is likely done with the help of an AI tool, but importantly, the spec can be reviewed by and explained to the human.</p></li><li><p>Once the spec is approved, a different AI system (the &#8220;Solution &amp; Proof Generator&#8221; in the figure) automatically generates a solution, along with a mathematical proof that the solution satisfies the spec.</p></li><li><p>The proof-verification process is then a fully automated, trustless step that can be performed by a small, generalized, non-AI program.</p></li></ol>
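<p>Here is a minimal sketch of those three steps in Lean, with &#8220;return the larger of two numbers&#8221; standing in for a real task. Every name below is illustrative; this is not a real Atlas tool or workflow, just the shape of the artifacts involved.</p><pre><code>-- 1. The human-reviewed solution spec: objective properties any answer must satisfy.
def MaxSpec (f : Nat → Nat → Nat) : Prop :=
  ∀ a b : Nat, a ≤ f a b ∧ b ≤ f a b ∧ (f a b = a ∨ f a b = b)

-- 2. A generated solution, plus a machine-checkable proof that it meets the spec.
def candidate (a b : Nat) : Nat := if a ≤ b then b else a

theorem candidate_meets_spec : MaxSpec candidate := by
  unfold MaxSpec
  intro a b
  simp only [candidate]
  by_cases h : a ≤ b
  · simp only [if_pos h]; omega
  · simp only [if_neg h]; omega

-- 3. Proof checking is the trustless step: Lean's small kernel either accepts or
--    rejects the proof, so no AI system has to be trusted for this final check.</code></pre>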
data-attrs="{&quot;src&quot;:&quot;https://substack-post-media.s3.amazonaws.com/public/images/28dedbfb-4339-4956-ade5-af401099a9e7_1600x666.png&quot;,&quot;srcNoWatermark&quot;:null,&quot;fullscreen&quot;:null,&quot;imageSize&quot;:null,&quot;height&quot;:606,&quot;width&quot;:1456,&quot;resizeWidth&quot;:null,&quot;bytes&quot;:null,&quot;alt&quot;:null,&quot;title&quot;:null,&quot;type&quot;:null,&quot;href&quot;:null,&quot;belowTheFold&quot;:true,&quot;topImage&quot;:false,&quot;internalRedirect&quot;:null,&quot;isProcessing&quot;:false,&quot;align&quot;:null,&quot;offset&quot;:false}" class="sizing-normal" alt="" srcset="https://substackcdn.com/image/fetch/$s_!j3PQ!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28dedbfb-4339-4956-ade5-af401099a9e7_1600x666.png 424w, https://substackcdn.com/image/fetch/$s_!j3PQ!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28dedbfb-4339-4956-ade5-af401099a9e7_1600x666.png 848w, https://substackcdn.com/image/fetch/$s_!j3PQ!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28dedbfb-4339-4956-ade5-af401099a9e7_1600x666.png 1272w, https://substackcdn.com/image/fetch/$s_!j3PQ!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F28dedbfb-4339-4956-ade5-af401099a9e7_1600x666.png 1456w" sizes="100vw" loading="lazy"></picture><div class="image-link-expand"><div class="pencraft pc-display-flex pc-gap-8 pc-reset"><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container restack-image"><svg role="img" width="20" height="20" viewBox="0 0 20 20" fill="none" stroke-width="1.5" stroke="var(--color-fg-primary)" stroke-linecap="round" stroke-linejoin="round" xmlns="http://www.w3.org/2000/svg"><g><title></title><path d="M2.53001 7.81595C3.49179 4.73911 6.43281 2.5 9.91173 2.5C13.1684 2.5 15.9537 4.46214 17.0852 7.23684L17.6179 8.67647M17.6179 8.67647L18.5002 4.26471M17.6179 8.67647L13.6473 6.91176M17.4995 12.1841C16.5378 15.2609 13.5967 17.5 10.1178 17.5C6.86118 17.5 4.07589 15.5379 2.94432 12.7632L2.41165 11.3235M2.41165 11.3235L1.5293 15.7353M2.41165 11.3235L6.38224 13.0882"></path></g></svg></button><button tabindex="0" type="button" class="pencraft pc-reset pencraft icon-container view-image"><svg xmlns="http://www.w3.org/2000/svg" width="20" height="20" viewBox="0 0 24 24" fill="none" stroke="currentColor" stroke-width="2" stroke-linecap="round" stroke-linejoin="round" class="lucide lucide-maximize2 lucide-maximize-2"><polyline points="15 3 21 3 21 9"></polyline><polyline points="9 21 3 21 3 15"></polyline><line x1="21" x2="14" y1="3" y2="10"></line><line x1="3" x2="10" y1="21" y2="14"></line></svg></button></div></div></div></a></figure></div><p>It's worth noting that although there are multiple approaches that fit under this umbrella, the term spec-driven AI is not a common term (yet). The proposed <a href="https://arxiv.org/abs/2405.06624">Guaranteed-Safe AI</a> framework and <a href="https://www.aria.org.uk/programme-safeguarded-ai/">Safeguarded AI</a> architecture would qualify, as would plans to use AI to generate formally verified software. 
However, strategies like <a href="https://dl.acm.org/doi/abs/10.1145/3630106.3658979">Constitutional AI</a> do not fall into the category of spec-driven AI because, in those systems, the constitution is subjective and must be interpreted by AI systems which must themselves be trusted, whereas the goal of spec-driven AI is objective requirements that do not require trust. It also seems worth noting that spec-driven AI is a type of AI control, wherein the model is assumed to be either fallible or untrustworthy.</p><h2>What is a formal specification</h2><p>Formalization is the process of converting ambiguous natural language policies into precise, machine-checkable specifications. A formal spec must take inputs that are anchored in observations or measurements of the world, but can then reason, with mathematics and formal logic, about properties those inputs must have. (Going slightly deeper, formal logic is the process of generating conclusions from premises using axioms like &#8220;A implies B &amp; B implies C &#8658; A implies C&#8221;; a machine-checked version of this rule appears right after the list below.)</p><p>While it is challenging to formally specify all the relevant properties of a solution, there is already adoption of, or progress on, formalization techniques across various domains:</p><ul><li><p>Software Verification: Formal verification is a branch of computer science in which desired properties of programs are formally specified and the software is proven against those properties. The notion of proving software properties dates back to early proposals by Alan Turing, and techniques already provide mathematical guarantees for critical software, including flight systems software on aircraft, train scheduling software, <a href="http://adam.chlipala.net/papers/FiatCryptoSP19/FiatCryptoSP19.pdf">cryptography</a> and <a href="https://project-everest.github.io/">computer networking</a> libraries, and even an operating system <a href="https://sel4.systems/">microkernel</a>. In most instances, adoption is limited by expertise in formal verification. However, AI could democratize these approaches, allowing non-specialists to express desired software behaviors in natural language and receive formally verified implementations.<br>As a separate example, the Rust programming language is growing in popularity because it enforces a memory-safety property on all programs, which could be considered a form of formal verification.</p></li><li><p>Tax codes are written in natural language, but the tax software run by governments is a mechanized formalization (i.e. runnable as software). However, this formalization is not typically generated with input from the original lawmakers. There are efforts to <a href="https://arxiv.org/pdf/2103.03198.pdf">formalize the tax code</a> into a public, formal source of truth around tax liability, enabling anyone to understand regulatory implications and deploy autonomous AI agents that operate with confidence.</p></li><li><p>HR policies can describe objective criteria for hiring, promotion, layoffs or other workplace changes. However, these policies may be inconsistent (i.e. contain conflicting statements) or have undefined behavior (i.e. not fully explain what to do in all situations). The head of the largest formal verification team in the world has spoken about the work they&#8217;ve done <a href="https://www.youtube.com/watch?feature=shared&amp;v=ZXOBAdIxFBE">formalizing HR policies</a>, and who can benefit from this formalization seems limited largely by the cost of the expertise needed to formalize policies.</p></li></ul>
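<p>To make the formal logic mentioned above concrete, the implication-chaining rule quoted there can be stated and machine-checked in a couple of lines of Lean (a minimal sketch):</p><pre><code>-- The rule "A implies B &amp; B implies C ⇒ A implies C", stated and proved in Lean.
theorem implies_trans {A B C : Prop} (hAB : A → B) (hBC : B → C) : A → C :=
  fun a => hBC (hAB a)

-- Once the kernel accepts this, the rule can be reused without re-reading its proof.</code></pre>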
<p>Hopefully it&#8217;s intuitive how formalization could apply to other domains as well:</p><ul><li><p>Building codes specify measurable properties like minimum number of exits, number of electrical outlets, sizes of windows, and structural requirements. Imagine an architect submitting building plans online with all metadata needed to automatically verify compliance with building codes. Instead of waiting weeks for manual review, they could submit their plan with evidence and proof that the plans meet all requirements. This could enable near-instant feedback and approval of compliant plans, in exchange for extra work on the architect&#8217;s side generating computational arguments for compliance. This should also enable faster iteration because the formalized policies are transparent and as complete as possible without requiring human interpretation.</p></li><li><p>Information trustworthiness: as consumers of news and other types of information, we could state what criteria we consider necessary or sufficient to accept information, and parse information into an easily-interpretable set of assumptions, arguments, and conclusions. For instance, users could specify that they do not trust specific sources unless claims are based on primary sources. Especially if combined with tools to track provenance of information, this could enable dramatically better epistemics for the general population.</p></li><li><p>Biological Specifications: A computational <a href="https://atlascomputing.org/tox-forecasting-froposal.pdf">toxicity forecasting competition</a> could create standardized ways to quantify chemical hazards, enabling AI systems to screen potential compounds for safety concerns before synthesis. This specification language would then allow regulatory bodies to formally specify safety requirements that must be satisfied before proceeding with novel chemical development.</p></li></ul><h2>Scaling Human Review: avoiding a review crisis</h2><p>It might seem that I&#8217;m simply advocating for more automation of regulatory review that would benefit all humans. While I think this automation would be a dramatic improvement in today&#8217;s world of humans regulating human actions, I believe it is critical for humans reviewing AI actions, because human review doesn&#8217;t scale once we&#8217;ve deployed human-level AI agents.</p><p>Consider it in these terms: Today people (a) decide what to do, (b) decide how to do the thing, (c) do the thing, and then (d) review the outcome. As AI systems become increasingly capable, they are currently on track to take over (c), then (b), then (a), while humans are left reviewing. Humans generate fewer than 200 tokens per minute. One million tokens from GPT-4.5 costs roughly $150, which is the highest cost for any available model. If costs don&#8217;t change significantly, that means a human-level AI system will be able to generate outputs at the equivalent of a human for $3,600 per working year.</p>
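<p>As a quick sanity check on that estimate (assuming 200 tokens per minute, a 40-hour week, a 50-week working year, and $150 per million tokens), the arithmetic can even be run in Lean:</p><pre><code>-- Back-of-the-envelope check of the figures above; every input is an assumption.
def tokensPerYear : Nat := 200 * 60 * 40 * 50          -- tokens/min · min/hr · hr/wk · wk/yr
def dollarsPerYear : Nat := tokensPerYear * 150 / 1000000

#eval tokensPerYear    -- 24000000
#eval dollarsPerYear   -- 3600</code></pre>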
<p>Since humans cannot hand off responsibility to an AI system, once AI systems become roughly as capable as humans, human reviewers must either become a bottleneck on progress, abdicate review, or find a way to automate the review process. The only other alternative would be expecting all humans to be employed as reviewers of AI outputs, evaluating for compliance.</p><p>Formalization can automate this process safely and efficiently. I believe we should build toward a world where humans decide what should (and shouldn&#8217;t) be done, and AI systems have to prove their actions against these specifications.</p><p>One relevant benefit of this architecture is that specifications compose better than values. If two groups of people reach a fundamental conflict about whether actions are morally good or wrong (e.g. debates around equal treatment vs religious expression), there&#8217;s no clear way to rectify this conflict from the perspective of AI alignment. However, governments have processes to determine what is illegal, and we could imagine each of the above groups using the legal apparatus to set policies defining a formal boundary between legal and illegal that encodes a compromise between their views of good and wrong.</p><p>In control theory, this could be mapped onto a constrained optimization problem. Alignment-style training of an autoregressive transformer treats some notion of morality as part of the objective to be optimized, via the reward function; specs and rules instead set constraints. It's hard to robustly combine preferences in a way that my preference can't be cancelled out by your anti-preferences. But constraints stack nicely, similar to international treaties, national laws, and local laws: if something is prohibited at even one level, it is illegal, regardless of what the other levels say.</p><h2>Additional benefits of spec-driven AI:</h2><p>The paradigm can be adopted incrementally: Formalization doesn't require an all-or-nothing approach. Organizations or governments can first formalize a subset of their policies, starting with domains most amenable to formalization, then provide a "fast lane" of review for proposals that demonstrably comply with the formalized policies. As these interfaces become increasingly common and adopted amongst companies using AI agents, organizations can gradually expand the scope of formalized policies as they gain confidence in the approach.</p><p>Specifications enable black-box trust: One intrinsic risk of values-based alignment is that embedding values and ideals into a machine makes that machine a vector for your ideologies. This means that anyone with a competing ideology will see yours as an existential threat. However, if your machine is bound by formalized requirements, you should be comfortable using anyone else's system because its output will also have to meet the requirements; any dissatisfaction you have with the outcome could be mapped purely to an error or omission in the specification.</p><p>We have the social institutions and mechanisms to set rules: No society on earth is well-suited to curate a set of examples sufficient to convey its values. And even if it were, alignment is not transparent, and there&#8217;s an entire field of research based on the open question of interpreting the actions of AI systems. By comparison, specifications are transparent; they can be debated, refined, and revised through democratic processes. 
All countries have mechanisms to set rules for citizens to follow, and democratic countries have mechanisms to ensure this is participatory. Rather than companies leaving liability to the users of their increasingly autonomous AI systems, or letting those systems be limited by alignment, we should have tools that first formalize the laws to ensure verifiable compliance, and eventually provide these tools to legislators and onboard governance processes so that we don&#8217;t risk AI systems that don&#8217;t follow laws.</p><h1>4. Implementation Challenges and Solutions</h1><p>I&#8217;m a staunch believer that the only way to truly have confidence in the existence of a solution is to solve the problem. As this problem is not yet solved, there are necessarily details omitted and questions unanswered here. This section is intended to address some of the bigger open questions that must be answered in addition to the significant quantity of engineering that will be needed to make this approach successful.</p><p>Q: What is the likelihood spec-driven AI succeeds as a paradigm?</p><p>A: Enough that I think it's worth doing, but success is far from guaranteed. I'm not pursuing this because I think it's easily tractable, but because it feels so needed and potentially valuable if successful.</p><p>Q: Previous attempts to create formal specifications have been prohibitively expensive. How is this approach different?</p><p>A: Prior formal verification efforts like the seL4 microkernel required approximately <a href="https://read.seas.harvard.edu/~kohler/class/cs260r-17/klein10sel4.pdf">20 person-years</a> to generate 10,000 lines of verified code&#8212;roughly four person-hours per line of code. At this rate, formalization is only practical for the most security-critical applications. However, today's language models have shown promising capabilities on formal reasoning benchmarks, with multiple models scoring above 80% on the <a href="https://paperswithcode.com/sota/math-word-problem-solving-on-math">MATH benchmark</a> of word problems, and they can <a href="https://github.com/atlas-computing-org/CoqLeanTranslation">translate between specification languages</a>. While previously only specialists could write formal specifications, AI assistants are increasingly enabling non-specialists to participate in the formalization process, opening this approach to widespread adoption.</p><p>Q: Haven&#8217;t there been attempts to create robust ontologies for formalizing?</p><p>A: The Internet's early days saw numerous attempts at creating knowledge databases and ontologies, most of which failed because people needed to learn complex schemas in order to use them. The critical difference today is that AI can learn the rules, syntax, and terms for a formal specification language, and handle the translation between natural language and formal specifications, removing this adoption barrier. 
There are even systems like <a href="https://www.doc.ic.ac.uk/~rak/papers/LPOP.pdf">Logical English</a>, which was constructed to be easy for an English speaker with no formal logic experience to read and understand. Writing in such a system is challenging, but having a formal spec in Logical English that could be compiled to a more succinct mathematical logic (by a formally verified compiler) would give a user incredibly high leverage without having to learn to write in a new ontology.</p><p>Q: Proving things about software is challenging enough, but how can you claim to prove things about the real world?</p><p>A: Formal verification provides mathematical guarantees about certain properties, but those guarantees are only as good as the specifications and world models they're based on. To understand the power and limitations of specification-driven AI, I think about four components in the verification workflow:</p>
srcset="https://substackcdn.com/image/fetch/$s_!f50l!,w_424,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef22b019-9ee2-45ff-8314-420b6f030271_1310x143.png 424w, https://substackcdn.com/image/fetch/$s_!f50l!,w_848,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef22b019-9ee2-45ff-8314-420b6f030271_1310x143.png 848w, https://substackcdn.com/image/fetch/$s_!f50l!,w_1272,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef22b019-9ee2-45ff-8314-420b6f030271_1310x143.png 1272w, https://substackcdn.com/image/fetch/$s_!f50l!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fef22b019-9ee2-45ff-8314-420b6f030271_1310x143.png 1456w" sizes="100vw" loading="lazy"></picture><div></div></div></a></figure></div><p>Only the spec and the model can be written down and examined. A model &#8660; reality gap will grow because our knowledge of the universe is incomplete and our ability to express all relevant aspects of the world is limited. The best mitigation is to build better tools for people to contribute collaborative improvements to the world model, which is an active area of <a href="https://www.aria.org.uk/opportunity-spaces/mathematics-for-safe-ai/safeguarded-ai/">research and investment</a>.</p><p>Specification-based approaches can mathematically prove there's no spec &#8660; model gap, ensuring that what gets built actually matches what was specified. This is powerful because it eliminates an entire class of implementation errors.</p><p>The intent &#8660; spec gap represents the challenge of translating what you want into a formal specification.For example, a self-driving car might have a specification to "maintain safe distance from obstacles" that fails to account for momentum or slippery roads. One mitigation is to find simpler properties that are easier to formalize and/or prove (e.g. the car shouldn&#8217;t drive over a certain speed if ice appears to be present). Another is to provide tools to help users reason about the spec and its implications in the world model, asking questions and understanding design implications.Our approach provides tools to explore edge cases and test specifications against diverse scenarios before implementation.</p><p>Rather than claiming to eliminate all gaps between intent and reality, specification-driven AI gives us concrete ways to narrow these gaps systematically while providing mathematical guarantees for the parts that can be formalized.</p><h1>5. Progress and next-steps for spec-driven AI</h1><p>Several tools and initiatives are emerging to enable a transition from an alignment-based paradigm to a specification-driven paradigm. This section outlines the current landscape and future directions.</p><p>The main components for formal verification are specifications, solutions, and proofs that the solutions satisfy the specs. To deploy these systems widely we need progress making sure we have good specs and good proofs, as well as putting the whole thing together:</p><h2>Tools for specs</h2><h3>Specification validation:</h3><p>To ensure specifications match intent, we need spec validation tools can generate test cases, counterexamples, and natural language explanations of formal specifications. 
<p>These validation tools help humans understand the consequences of their specifications before implementation. My nonprofit, <a href="http://atlascomputing.org/">Atlas Computing</a>, is building a specification IDE that uses an LLM to help users understand and improve the mapping between a natural language specification and a formal spec, closing the aforementioned intent &#8660; spec gap.</p><p>The goal is essentially to load the relevant components of the specification into a person's brain so they can understand if that's actually what they want. This approach is more robust than externalizing only a subset of requirements and hoping the implementation system correctly interprets them.</p><h3>AI-Powered Translation:</h3><p>Large language models can translate between natural language statements and formal specifications, making formalization accessible to non-specialists. This removes the primary barrier that prevented earlier ontology projects from succeeding. You no longer need to learn complex schemas&#8212;AI can handle that translation for you.</p><h2>Tools for proof generation</h2><h3>Verification systems</h3><p>Proof verification systems like Coq and Isabelle have been under development for decades (recently joined by Lean as a rising star) and will serve as infrastructure for proof generation. These systems provide the mathematical foundation for verifying that implementations meet specifications. While these tools have historically required specialized expertise, improved interfaces and language model integrations are making them more accessible.</p><p>The growing interest in formal verification has also led to renewed investment in formal verification tools and projects, including new proof languages, proof libraries, and proof verification systems. The world's largest formal logic team is the Automated Reasoning Group (ARG) at Amazon Web Services; this team includes Leonardo de Moura, who is also leading a Focused Research Organization with significant funding for 5 years to <a href="https://lean-fro.org/about/roadmap-y2/">improve the Lean proof assistant</a> as a foundational tool in formal verification. The Lean FRO (Lean Focused Research Organization) has received approximately $50 million in funding to advance the Lean theorem prover and make formal mathematics more accessible.</p><h3>AI-based Solution and Proof Generation</h3><p>Several organizations are developing systems to automate the generation of mathematical proofs:</p><p><strong>OpenAI and Epoch AI:</strong> OpenAI has funded Epoch AI to collaborate on the FrontierMath benchmark, hoping to improve the mathematical capabilities of language models. Their research has shown promising results in applying language models to formal mathematics.</p><p><strong>AlphaGeometry2:</strong> In February of this year, DeepMind's AlphaGeometry2 demonstrated the ability to perform at <a href="https://arxiv.org/abs/2502.03544">gold-medal level</a> on International Mathematical Olympiad questions. AlphaProof, another DeepMind project, is <a href="https://lean-lang.org/spotlight/">built using Lean</a>.</p><p><strong>Startups:</strong> Harmonic has raised $18 million to build tools for formal verification and mathematical reasoning, with the goal of enhancing AI safety through provable guarantees.
Similarly, Morpheus (founded by OpenAI alumni) raised $20 million to develop systems that can generate and verify mathematical proofs.</p><p><strong>DARPA:</strong> DARPA has run multiple programs over the past decade to improve tools, methods, and practices around formal verification and <a href="https://www.darpa.mil/news/2023/formal-methods-large-scale">continues its support</a> of the field, due to the promise it shows for enhancing national security. Its Automated Rapid Certification Of Software (<a href="https://www.darpa.mil/research/programs/automated-rapid-certification-of-software">ARCOS</a>) program aims to develop tools for automated formal verification of mission-critical software, while its Pipelined Reasoning of Verifiers Enabling Robust Systems (<a href="https://www.darpa.mil/research/programs/pipelined-reasoning-of-verifiers-enabling-robust-systems">PROVERS</a>) program aims to facilitate proof repair.</p><h2>Spec-driven AI coordination</h2><p>I&#8217;m also co-organizing the most recent in a series of workshops with the authors of the Guaranteed-Safe AI paper; those authors include Yoshua Bengio (Turing Award winner), David &#8220;davidad&#8221; Dalrymple (who leads the &#163;59M Safeguarded AI research program at ARIA), and multiple other highly regarded professors in the fields of AI and formal verification. The goal of these workshops is to coordinate with various researchers, funders, and organization builders in order to identify gaps and help drive adoption of specification-based AI as a widely useful and safer approach. These efforts aim to identify high-priority domains for initial application of specification-driven approaches and to develop roadmaps for necessary technical advancements.</p><h2>A Path Forward</h2><p>As AI systems become more powerful, the question of how to govern them becomes increasingly urgent. The specification-driven paradigm offers a practical, democratic alternative to values-based alignment approaches. By treating AI systems more like contractors than employees&#8212;providing clear specifications rather than hoping they internalize our values&#8212;we can build systems that reliably do what we want because we can verify it.</p><p>This approach requires advances in formal methods, natural language understanding, and human-computer interaction. It also demands a shift in mindset from both AI developers and policymakers. Rather than building increasingly autonomous systems and hoping they share our values, we should build increasingly verifiable systems that demonstrably follow our rules.</p><h2>We&#8217;re seeking collaborators:</h2><p>Please reach out if:</p><ul><li><p><strong>You work in research</strong> and are interested in developing any of the aforementioned tools</p></li><li><p><strong>You work in policy</strong> and want to explore expressing regulations as formal specifications</p></li><li><p><strong>You work in industry</strong> and want to try out verification technologies to more safely leverage AI advancements</p></li><li><p><strong>You want to know more</strong> about how to demand transparency and verifiability from AI systems that impact your life</p></li></ul><p>The future of AI doesn't have to be a choice between stagnation and uncontrolled autonomous AI systems. With specification-driven approaches, we can harness AI's transformative potential while maintaining meaningful human oversight.
The time to build this future is now, before powerful AI systems become too entrenched in the alignment-based paradigm to change course.</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://blog.atlascomputing.org/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Want to receive future posts to your inbox?</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Our Report On AI-Enabled Tools For Scaling Formal Verification]]></title><description><![CDATA[Scaling Human Understanding and Review Capacity]]></description><link>https://blog.atlascomputing.org/p/our-report-on-ai-enabled-tools-for</link><guid isPermaLink="false">https://blog.atlascomputing.org/p/our-report-on-ai-enabled-tools-for</guid><dc:creator><![CDATA[Evan Miyazono]]></dc:creator><pubDate>Tue, 06 Aug 2024 19:06:41 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!pPM1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99f9a2ec-861f-4146-ac7b-481ad38ada91_1205x468.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h2>Scaling Human Understanding and Review Capacity</h2><p>At Atlas Computing, we focus on advancing humanity&#8217;s understanding and enhancing review capacity through innovative technologies. We&#8217;re excited to share our latest report, which outlines the tools that we think are needed to understand the software that runs our world, as well as all the software that AI systems will generate in the coming years.&nbsp; The report, titled "<a href="https://atlascomputing.org/ai-assisted-fv-toolchain.pdf">AI-Assisted Code Specification, Synthesis, and Verification</a>," outlines a widely-applicable strategy and lists the modular tools that we believe will dramatically facilitate the use of formal verification.</p><p>Formal verification (FV) is the gold standard for security and stability to ensure that software is behaving as intended. 
However, traditional FV processes are labor-intensive and costly, which has often limited their application to only the most critical subsystems; we believe advances in AI will make formal verification the dominant form of software development in the near future.&nbsp; In the report, we elaborate on the potential and current limitations of formal verification, outline existing formalization workflows, and describe the 12 modular projects that can automate steps in those workflows.&nbsp; We recommend the 2-page executive summary at the start of <a href="https://atlascomputing.org/ai-assisted-fv-toolchain.pdf">the document</a> for more information, but you can see the main diagram below.&nbsp; The report was created in collaboration with the Topos Institute and was funded by Protocol Labs and the Survival and Flourishing Fund.</p><div class="captioned-image-container"><figure><a class="image-link" target="_blank" href="https://substackcdn.com/image/fetch/$s_!pPM1!,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99f9a2ec-861f-4146-ac7b-481ad38ada91_1205x468.png"><img src="https://substackcdn.com/image/fetch/$s_!pPM1!,w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F99f9a2ec-861f-4146-ac7b-481ad38ada91_1205x468.png" width="1205" height="468" alt="Main diagram from the report on AI-assisted code specification, synthesis, and verification"></a></figure></div><h2>Next Step: Prototyping</h2><p>We are actively looking for funders, potential users, and talented engineers/researchers to join us in this exciting endeavor.
If you are interested in funding our work, becoming a user of our tools, or contributing to the research and development efforts, please reach out to us at <a href="mailto:hello@atlascomputing.org">hello@atlascomputing.org</a></p><p>Your support and participation can help us accelerate the development and adoption of these innovative tools, ultimately contributing to a safer and more secure digital world.&nbsp; Together, we can scale human understanding and review capacity, making formal verification an integral part of software development.</p>]]></content:encoded></item><item><title><![CDATA[Announcing Flexible Hardware-Enabled Governors]]></title><description><![CDATA[Imagine if every uranium atom pulled out of the ground had its own international atomic energy inspector tasked to follow it, report on its usage, perhaps track its location.]]></description><link>https://blog.atlascomputing.org/p/announcing-flexible-hardware-enabled</link><guid isPermaLink="false">https://blog.atlascomputing.org/p/announcing-flexible-hardware-enabled</guid><dc:creator><![CDATA[Atlas Computing]]></dc:creator><pubDate>Wed, 31 Jul 2024 15:22:57 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Nlv7!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff370c2bd-71e7-4808-a3c6-f0e1be165e0d_1280x1280.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Imagine if every uranium atom pulled out of the ground had its own international atomic energy inspector tasked to follow it, report on its usage, perhaps track its location. Ideally this system would even stop the atom from being radioactive if it were being used unsafely, or convert it to lead if the inspector were removed. If this were possible, it would be an incredible boon to nuclear deterrence.</p><p>Flexible Hardware-Enabled Governors (flex-HEGs) provide this level of oversight and reporting for datacenter GPUs. If you believe AI can pose a risk similar to nuclear war (which many leading experts do, in <a href="https://www.safe.ai/work/statement-on-ai-risk">this statement on AI Risk</a>), then you may have joined the call for an international governance body for AI. We believe this type of international governance requires technological support for transparency, reporting, and collaboration.</p><p>Atlas Computing was created to help humans better understand and review increasingly automated systems. Because using AI safely may require building AI safely, we're excited to announce that Atlas Computing will help prototype Hardware-Enabled Governors (HEGs).</p><p>HEGs are specialized hardware components integrated into GPUs and other high-performance computing devices that allow compliance with AI safety best practices by enabling transparent, privacy preserving monitoring of AI training and deployment processes.&nbsp;</p><p>Humanity needs a system that prevents any actor from bypassing established safety rules. This, in turn, is critical for preventing catastrophic misuse or accidental deployment of dangerous AI capabilities. 
By embedding governance mechanisms directly into the hardware, HEGs enable coordination between mistrustful AI developers or regulators.<br></p><h3>What are Hardware-Enabled Governors?</h3><p>By embedding compliance mechanisms directly into AI hardware, HEGs ensure that agreed-upon rules are enforced at the most fundamental level.&nbsp; While the concept inherits a long tradition of hardware compute governance*, we use the term coined in <a href="https://yoshuabengio.org/wp-content/uploads/2024/07/flex-HEGs-memo.pdf">this post</a> on Yoshua Bengio&#8217;s blog and the requirements therein.</p><p>HEGs consist of three key components:</p><ol><li><p>A compliance processor that determines if AI generation meets negotiated reporting thresholds</p></li><li><p>A tamper-responsive mechanisms to maintain system integrity</p></li><li><p>An offline power source to ensure continuous operation of tamper-response</p></li></ol><p>These layers enable a wide range of regulatory capabilities, such as ensuring that safety best practices are used for any training run over a certain size. HEGs leverage secure enclaves and sophisticated methods to distinguish between training and inference. This allows for targeted governance without impeding benign AI applications.<br></p><h3>Why is this important?</h3><p>As nations recognize the potential of advanced AI systems, they may rush to develop these technologies for military or strategic purposes, similar to the nuclear arms race, incentivizing fewer safety precautions and increasing the risk of catastrophes or loss of control.&nbsp;</p><p>HEGs could serve as concrete demonstrations of responsible behavior, providing reliable evidence that safety protocols are followed and allowing all players to transparently show adherence to best practices. While they don't define or guarantee safe AI, they ensure consistent application of best practices, reducing safety lapses. As a result, they shift incentives toward cooperation and responsible development, stabilizing the AI landscape and preventing the unchecked escalation of AI capabilities, which could have catastrophic global impacts.<br></p><h3>Who decides what is safe?</h3><p>Let&#8217;s start by noting that the creators of HEGs should <strong>not</strong> be the people responsible for defining safety best-practices or compute thresholds.</p><p>Establishing well-defined beliefs and standards about AI risk is crucial for the effectiveness and integrity of HEGs, but also fairly separate from creating the HEGs themselves. Ideally, policymakers or subject matter experts should make these decisions, potentially starting from something like <a href="https://www.alignmentforum.org/posts/Zfk6faYvcf5Ht7xDx/compute-thresholds-proposed-rules-to-mitigate-risk-of-a-lab">this post on compute thresholds</a>. This approach ensures that AI governance standards are set via representative decisions that incorporate expert opinions.&nbsp;</p><p>A common misconception is that having a tool for understanding and governance necessarily provides a government or manufacturer centralized regulatory control. This isn't the case. We would be supportive of an international consortium of frontier labs, governments, and/or device manufacturers interested in setting plans to use and implement HEGs. This could even be created and mandated through something like a professional society requiring all frontier labs that hope to employ top talent to adopt these measures. 
</p><p>The Atlas perspective is that humans should understand and oversee these technologies, ideally accountable humans. However, accountability to a nation state is not necessary (nor desirable).<br></p><h3>What is Atlas doing?</h3><p>Atlas Computing's mission is to improve humanity's capacity for review. This goal clearly includes the objective of scaling the capacity to assess the creation of AI systems from development to deployment. HEGs support this mission by providing tools that enhance transparency and accountability in AI development, allowing us to better understand what AI systems are being created.</p><p>Currently, Atlas is focused on advancing the hardware aspects of HEGs. Leveraging Evan's experience in hardware systems and a strong professional network, we are well-positioned to help drive this initiative. Atlas will also coordinate efforts among various stakeholders, including funders, potential grantees, startups, nonprofits, researchers, and developers. This top-level coordination is essential for translating our vision into clear, concrete progress.&nbsp; Additionally, we&#8217;ve brought on Mehmet Sencan to move technological readiness levels on components for at least the next 3 months, thanks to support from the AISTOF.</p><p>Our goal is to demonstrate a proof of concept for the tamper response mechanism (Technological Readiness Level 3) by this fall, with plans to achieve scalable technology development (TRL 4) by February and prototype demonstration (TRL 5 or 6) on all subcomponents by the end of 2025.<br><br>The implementation of HEGs should be open-hardware to foster collaboration and improve security.&nbsp; This technology is not intended to be controlled by one party by another, but rather a tool for the broadest possible coordination.&nbsp;</p><h3><br>A Broader Theory of Change</h3><p>Ensuring that AI systems are created safely is a sociotechnical problem that requires both technical and policy solutions.&nbsp;</p><p>Proving the feasibility of the technical solution is essential. This involves identifying abstraction boundaries, de-risking the components,&nbsp; and demonstrating integration. Technical de-risking must ensure the device can be built quickly and cheaply without a meaningful impact on device performance. <br><br>There may be concerns from manufacturers if they don't see the value or need for these interventions, which can be addressed with policy incentives. The solutions need to be affordable and scalable, deployed on an incredibly large scale. The machines must be robust against adversaries, both in hardware and in adversarial environments.</p><p>In addition to the technical solution, engaging with policymakers and stakeholders throughout is necessary to ensure that solutions are practical and can be widely accepted. This integrated approach is vital for stabilizing the AI development landscape and promoting responsible AI advancements globally.</p><h3><br>A Call to Action</h3><p>Implementing policies to enforce these measures is key and requires approval from policymakers and subject matter experts. Convincing policymakers involves demonstrating why these measures are necessary and integrating them into existing IP and licensing frameworks. 
This socio-technical problem demands both technical de-risking and substantial dialogue with policymakers to establish robust standards.<br></p><p>A group of funders and researchers are advocating for and actively pursuing development of HEGs.&nbsp; Atlas Computing is helping to develop this community to drive this initiative forward. If you are interested in funding this project or contributing as an engineer, researcher, or developer, please reach out to us.&nbsp; We&#8217;re keeping the community somewhat small to move quickly, and cannot guarantee we&#8217;ll include you, but would love to hear from you and will include you as soon as it feels like it will accelerate the cause.</p><p>*also see: <a href="https://www.cnas.org/publications/reports/secure-governable-chips">https://www.cnas.org/publications/reports/secure-governable-chips</a>&nbsp;</p><p><a href="https://www.governance.ai/post/computing-power-and-the-governance-of-ai">https://www.governance.ai/post/computing-power-and-the-governance-of-ai</a></p><p><a href="https://futureoflife.org/ai-policy/hardware-backed-compute-governance/">https://futureoflife.org/ai-policy/hardware-backed-compute-governance/</a>&nbsp;</p><p><a href="https://arxiv.org/abs/2402.08797">https://arxiv.org/abs/2402.08797</a></p>]]></content:encoded></item><item><title><![CDATA[Retrospective on Mathematical Boundaries Workshop]]></title><description><![CDATA[Primarily written by Evan Miyazono (with help from Manuel Baltieri and others) - mistakes my own]]></description><link>https://blog.atlascomputing.org/p/mathematical-boundaries-retro</link><guid isPermaLink="false">https://blog.atlascomputing.org/p/mathematical-boundaries-retro</guid><dc:creator><![CDATA[Evan Miyazono]]></dc:creator><pubDate>Tue, 07 May 2024 18:37:51 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Nlv7!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff370c2bd-71e7-4808-a3c6-f0e1be165e0d_1280x1280.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<blockquote><p>Primarily written by Evan Miyazono (with help from Manuel Baltieri and others) - mistakes my own</p></blockquote><h2>Minimum viable introduction</h2><ul><li><p>We ran a workshop on Mathematical Boundaries from April 10-14. 
This was the successor of the <a href="https://formalizingboundaries.ai/">Conceptual Boundaries workshop</a> Feb 10-12</p><ul><li><p>The overlap in participants and approach was fairly low (notably lower than intended, due to availability restricting participation, which in turn led to a natural difference in approach)</p></li><li><p>Intent:</p><ul><li><p>The first workshop was intended to develop a sense of what one might want to do with boundaries, and explore possible avenues</p></li><li><p>This event was more focused on making mathematical design decisions that would lead to a more concrete model that was opinionated enough to be useful (the natural question becomes &#8220;useful for what&#8221;)</p></li></ul></li></ul></li><li><p>You&#8217;re probably here because you want to see the outputs, so let&#8217;s get to them:</p></li></ul><h2>Outputs from the workshop</h2><p>Here are write-ups started during writing sessions during the workshop</p><ul><li><p>Manuel Baltieri 1: <a href="https://www.dropbox.com/scl/fi/5wryuiyzejn4olxcib3iv/CrossingBoundaries.pdf?rlkey=l7ibmyuy4rs5l1nbbr0qd4tfx&amp;dl=0">Crossing boundaries</a>&nbsp;</p></li><li><p>Manuel Baltieri 2: <a href="https://www.dropbox.com/scl/fi/m8edmqwcj9uot8yp52azi/FightingBoundaries.pdf?rlkey=u213vbxtpu2wwmlla9o2eogvo&amp;dl=0">Fighting for boundaries</a></p></li><li><p>Kevin Carlson: <a href="https://forest.localcharts.org/kdc-0006.xml">Nondeterministic dynamical systems and crossing boundaries </a>&nbsp;</p></li><li><p>Martin Biehl 1: <a href="https://docs.localcharts.org/s/gxTknAr8W#">Gliders and similar phenomena in (categorical) systems theory</a>&nbsp;</p></li><li><p>Martin Biehl 2: <a href="https://docs.localcharts.org/s/OOBkquFcY#">Towards a more general law of requisite variety</a>&nbsp;</p></li><li><p>Owen Lynch: <a href="https://forest.localcharts.org/ocl-001V.xml">Grothendieck lenses for functors into 2Cat</a>&nbsp;</p></li><li><p>Sophie Libkind: <a href="https://topos.site/blog/2024-04-25-ontological-commitments-for-boundaries/">Ontological commitments for boundaries</a></p></li><li><p>Nathaniel Virgo: <a href="https://docs.localcharts.org/s/QKcwqm5ai">Boundaries and Good Regulators</a></p></li></ul><p>Noting that I&#8217;m getting these to you before I&#8217;ve read them, so don&#8217;t expect me to be able to answer questions about them.</p><p>Also worth noting, Nathaniel Virgo and Martin Biehl participated in <a href="https://www.youtube.com/watch?v=lEzJfJsXbyE">this panel discussion at a later workshop</a> in Kyoto, where we discussed some of the issues that came up at the boundaries workshop</p><h2>General structure from the workshop</h2><p>The general daily structure was scheduled to be &#8220;a talk and a breakout session before lunch, then a breakout session and a longer-form discussion after lunch,&#8221; though we weren&#8217;t particularly strict adherents.&nbsp;</p><p>We found on Thursday (day 1 of 4) that the group wanted to continue discussing after Martin&#8217;s interesting talk and ending up doing more like &#8220;A talk and a discussion, followed by breakouts after lunch.&#8221;&nbsp; Thursday breakouts were (1) a session on trying to work out a cocategorical formalism for specifying things via wholes in which they participate, rather than by composing together their parts and a session on, and (2) an idea to formalize / keep track of gliders as non-deterministic or possibilistic closed dynamical systems.</p><p>Friday morning Nathaniel gave a talk on control theory that was so engaging we reached a 
consensus on pointing the rest of the workshop towards fleshing out adjacent ideas.</p><ul><li><p>One breakout for the rest of the day Friday was focused on choosing formalisms for various words in Nathaniel&#8217;s talk, and resulted essentially in Sophie&#8217;s blog post.</p></li><li><p>The other ended up focusing on an idea of generalizing the law of requisite variety, resulting in Martin&#8217;s second write-up.</p></li></ul><p>Saturday was primarily time for writing down outputs (learnings from last time: have a big block of time to support people in generating written artifacts), and also included a small breakout group on nondeterminism (that one led to Kevin&#8217;s blog post).</p><p>Sunday morning some individuals started departing and we had some visitors; most activities involved chatting about a wide range of topics after an intense few days.</p><h2>Next steps</h2><ul><li><p>We're still genuinely interested in boundaries and would like to see additional work happen. We're exploring funding options for work on these open problems, so email me (evan@atlascomputing.org) if you would like to work on them.</p></li><li><p>One possible next step is setting up a workshop adjacent to a conference that most of the Mathematical Boundaries Workshop participants are likely to attend.</p><ul><li><p>Interestingly, it seems like the attendees were split somewhat [40]/[40]/[20]% between researchers who seem most likely to attend conferences exclusively in [<a href="https://en.wikipedia.org/wiki/Applied_category_theory">applied category theory</a>], [<a href="https://en.wikipedia.org/wiki/Artificial_life">artificial life</a>], and [cross-domain and para-academic conferences like this one], which I think makes this goal hard, but also makes the conversations at such an event particularly interesting.</p></li></ul></li></ul><h2>Evan&#8217;s personal takes</h2><p>Here are some notes that are very specific to me.</p><ul><li><p>How it differed from the first one:</p><ul><li><p>Chris, Manuel, and I set out with the intent of bringing people together to build mathematical models of boundaries.&nbsp; As a result, we ended up inviting more people with stronger math backgrounds, and people who we expected, based on prior interactions and training, to be inclined toward formalization and to reach for math as a tool.</p></li></ul></li><li><p>Where I could have done better:</p><ul><li><p>It wasn&#8217;t ex ante clear that much moderation would fall to me; there was some hope that davidad would be able to attend, but through no fault of his, he was unable to.</p></li><li><p>Believing that I knew enough math to even moderate this workshop was probably my greatest act of hubris since at least founding Atlas Computing.&nbsp; I knew I didn&#8217;t have enough background knowledge to contribute, but I thought at least I would be able to make proposals that could be iterated on to reach a local equilibrium; others were far better than I was at identifying what the participants agreed was a better starting point.</p><ul><li><p>Huge thanks to Manuel Baltieri and Brendan Fong for taking the reins.</p></li></ul></li></ul></li><li><p>What&#8217;s next from here:</p><ul><li><p>I&#8217;m not sure how involved in logistics, curation, or moderation of future boundaries workshops I&#8217;ll be.&nbsp; I&#8217;ll likely advocate for their utility, and potentially support aspects like fundraising and translation, but I think I&#8217;d be happy if others took up the mantle.
(To be fair, that&#8217;s what I said before the first and second workshops as well, though &#128517;)</p><ul><li><p>This could be particularly compelling if it invited participation from a broader conference &#8211; if someone would like to&nbsp;propose this event as a side event to a relevant existing conference, please reach out! </p></li></ul></li><li><p>To the extent that davidad&#8217;s ARIA program is focused on building a github for science, but not a monorepo of science, I think it could be really valuable to have the following:</p><ul><li><p>If you have two &#8220;repositories&#8221; of interoperable / composable scientific theories, we should be able to identify boundaries and define boundary violations in each &#8220;repository&#8221; in a way that we&#8217;re confident that specifying a boundary violation in one scientific model (combination of scientific theories) is sufficient to confidently identify the same boundary violation in another scientific model.</p></li><li><p>At this point, Manuel, Brendan, and I are discussing what it would look like to organize a continuation on this theme.&nbsp; On the bright side, this starts highlighting and framing concrete problems that could be solved.&nbsp; On the other hand, pursuing solutions to this specific problem could also significantly diverge from the original VAPE formulation from Critch&#8217;s &#171;boundaries&#187; formulation.&nbsp;</p></li></ul></li></ul></li></ul><p>Lastly, here are some random assorted brief insights that I liked:</p><ul><li><p>Some boundaries are (sets of) physical boundaries. Others are parameter regimes, and might be better called &#8220;margins&#8221; or &#8220;viability regimes&#8221;. These seem sufficiently distinct that they&#8217;re worth calling by different names. &#8220;Membranes&#8221; may work well for singling out the &#8220;physical&#8221; boundaries, which don&#8217;t have to actually be literally made out of matter but should demarcate an agent&#8217;s &#8220;body&#8221; from its environment, rather than the space of happy states for an agent from its space of sad states.</p></li><li><p>Models could be defined as low-loss compressions of the environment and agents could be defined as models that scale in complexity with the scope of the universe unless you ascribe them some telos or desires.</p></li></ul><p>Feel free to comment here, or reach out to via email (first name at domain.org).</p>]]></content:encoded></item><item><title><![CDATA[Evan's thoughts on boundaries (Apr 2024)]]></title><description><![CDATA[AI Guardrails may benefit from formalizing the concept of boundaries]]></description><link>https://blog.atlascomputing.org/p/evans-current-thoughts-on-boundaries</link><guid isPermaLink="false">https://blog.atlascomputing.org/p/evans-current-thoughts-on-boundaries</guid><dc:creator><![CDATA[Evan Miyazono]]></dc:creator><pubDate>Tue, 09 Apr 2024 00:47:55 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/$s_!Nlv7!,w_256,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Ff370c2bd-71e7-4808-a3c6-f0e1be165e0d_1280x1280.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<blockquote><p>This blog post is an informal summary of progress on a technical topic. 
Feel free to let us know in the comments if you want to see more or less content like this.</p></blockquote><p>Evan helped organize and run the <a href="https://formalizingboundaries.ai/">&#8220;Conceptual Boundaries&#8221; workshop</a>, which was initiated and primarily organized by Chris Lakin.  This workshop was intended to extend Andrew Critch&#8217;s initial work on &#171;Boundaries&#187; <a href="https://www.lesswrong.com/s/LWJsgNYE8wzv49yEc">here</a>, though I&#8217;ll try not to assume you&#8217;ve read any of that.</p><h3>Notes on these notes (i.e. meta-notes):</h3><ol><li><p><strong>This is not an official summary or debrief from the event, and has been published without input from the other workshop participants.&nbsp; This is simply a summary of things I wanted to either remember or share with others.</strong></p><ul><li><p>Assume insights are primarily from other participants, while controversial takes are my own.</p></li></ul></li><li><p>I&#8217;ll try to answer questions in various venues, either based on my opinions or participant notes from the workshop, but the participant notes themselves are not directly sharable.</p><ol><li><p>That said, if you&#8217;re curious what a particular participant thought of the workshop, they likely have an early draft of something they started writing on the last day of the workshop, and your question might be enough to prompt them to finish writing and publish the result.</p></li></ol></li><li><p>I like outline formats, as they should let you skim at your desired level of detail.</p><ol><li><p>You&#8217;re welcome to leave feedback on this format.</p></li></ol></li></ol><h3>Context on the workshop</h3><ol><li><p>A conversation at a Foresight Institute event initially prompted the workshop. Chris, Allison Duettmann, and I agreed on the need for more work in this area, leading to Chris committing to lead a workshop with support from others.</p><ol><li><p>I&#8217;ve personally been surprised at the amount of interest in the workshop generally, and the variety of paths that have led interested people to the topic.</p><ol><li><p>One reason might be that there&#8217;s a modern movement in the mental health space&nbsp;(e.g. described <a href="https://www.parapraxismagazine.com/articles/boundary-issues">here</a>) to recast many interpersonal issues in terms of boundaries, which has the interesting perspective of mapping customs onto the concept of property.&nbsp; This didn&#8217;t come up in the workshop; I just thought it was interesting.</p></li></ol></li></ol></li><li><p>The workshop itself included participants who had fairly different backgrounds, context, and goals for boundaries.&nbsp;&nbsp;</p><ol><li><p>As a result, much of the time was spent trading context, figuring out what assumptions or goals were shared</p></li></ol></li><li><p>At the end of the workshop, many participants (as well as the organizers, i.e. Chris and I) felt we&#8217;d done an interesting breadth-first exploration and wanted a second workshop.</p><ol><li><p>The goal of the next workshop is to lay the foundation for boundaries as a new research subfield by developing clear and useful definitions, identifying interesting open problems, and setting goals that we think boundaries research agendas could achieve.</p><ol><li><p>This next workshop starts April 10th (i.e. 
two days from writing this)</p></li><li><p>Most of this document is my attempt to provide a download of my thoughts leading into that workshop.</p></li></ol></li></ol></li></ol><h3>My hope for boundaries:&nbsp;</h3><ol start="3"><li><p>If you&#8217;ve come to this page via the Atlas Computing <a href="https://atlascomputing.org/">website</a>, you probably know that we&#8217;re working to build safeguards for AI, and one way to achieve that might be to provide some baseline constraints.</p><ol><li><p>in other words, can we define boundaries in a way that is both </p><ol><li><p>sufficiently grounded in quantifiable, objective (i.e. not subjective) information so that an AI could be trusted to understand what constitutes a boundary and a boundary violation <br>AND</p></li><li><p>is sufficiently useful as a framework that it can easily be made consistent with most people&#8217;s intuition for what a boundary is</p><ol><li><p>There&#8217;d necessarily be some parameters to set/tune, but the goal would be to have most of the heavy lifting done by the framework rather than, for instance needing to use something like a boundaries language to generate descriptions on a case-by-case basis.</p></li></ol></li><li><p>Unsurprisingly, this has a lot of overlap with <a href="https://www.lesswrong.com/posts/HnWiSwyxYuyYDctJm/what-does-davidad-want-from-boundaries">What does davidad want from &#171;boundaries&#187;?</a>, as davidad is an advisor of Atlas Computing.</p></li></ol></li><li><p>This could look like some abstracted version of object identification that also encodes some notion of separability or independence of objects. </p><ol><li><p>Current object identification mostly requires existing data or explanations of what a thing is before it can start identifying instances of that thing, or identifies a thing because its pieces move together; boundaries should identify a thing because of some aspects of its &#8220;thing-ness&#8221;.</p><ol><li><p>I&#8217;ll give a very lossy summary of Critch&#8217;s <a href="https://www.lesswrong.com/s/LWJsgNYE8wzv49yEc/p/HrtqLy46Fx7xqRrMo#Definition_part__a____part_of_the_world_">VAPE formalization of boundaries</a> here:</p><ol><li><p>You can define a set of Viscera, Active boundary (or Actions), Passive boundary (or Perception), and Environment states that interact with each other, modeled as a <a href="https://en.wikipedia.org/wiki/Bayesian_network">Bayesian network</a></p></li><li><p>These states are limited in what they can act on (e.g. 
environment and viscera act on each other only indirectly, via the active and passive boundaries)</p></li><li><p>This model assumes discrete time, but empowers you to potentially label different parts of the world (or an image, simulation, or video) as different parts V, A, P, or E.</p></li><li><p>If this is interesting, you should at least read <a href="https://www.lesswrong.com/s/LWJsgNYE8wzv49yEc/p/HrtqLy46Fx7xqRrMo">that whole post</a>, if not <a href="https://www.lesswrong.com/s/LWJsgNYE8wzv49yEc">the whole sequence</a>.</p></li></ol></li></ol></li></ol></li><li><p>If we had this way to identify objects, maybe we could identify a minimum viable set of boundaries, where, if they were not violated by an action, then we could be confident that the action did not result in a catastrophic unforeseen (and therefore unspecified) outcome.</p><ol><li><p>A simple example: if you can assure that an agent&#8217;s strategy for making a cup of tea doesn&#8217;t end respiration for any humans, perhaps you could claim that it&#8217;s more likely that the strategy [makes a cup of tea and doesn&#8217;t kill anyone] than the strategy [makes a cup of tea AND creates a hellscape that maintains respiration]. (My language is a little facetious/hyperbolic, but hopefully you get the idea.)</p></li><li><p>If a system can identify boundaries objectively and understand what it means to violate them, we can validate if an action violates a boundary via something like formal methods.</p><ol><li><p>This could be important because you can use a Safeguarded AI architecture in conjunction with an objective definition of boundaries without worrying about if the AI is trying to subvert your goals*.</p></li></ol></li></ol></li></ol></li><li><p>I really like the perspective that &#8220;boundaries might provide a way to identify the nouns in a normative language&#8221;.&nbsp; </p><ol><li><p>If you want to make statements about what <em>things</em> <strong>should</strong> do (with or to other things), you probably want an objective way to start identifying <em>things</em>.&nbsp;</p><ol><li><p>As an example: operationalizing the statement &#8220;people shouldn't hurt others&#8221; requires definitions of people, others, and hurt that should minimally rely on interpretation so that observers can agree if a proposed or past action violates the statement.</p></li></ol></li><li><p>Part of what I like about this framing is that I&#8217;ve found it fairly compelling to map ethical and political questions into the framework of &#8220;which boundary takes precedence in this case&#8221;, which is nontrivial because people on both sides of an argument seem willing to accept that both sets of boundaries DO exist.</p><ol><li><p>E.g. pro-choice vs pro-life could be mapped onto the questions &#8220;when does a fetus&#8217;s boundary exist independently from the boundary of the person in whose uterus the fetus exists?&#8221; and &#8220;when do governments have the right to violate the will/boundaries of constituents&#8221;</p></li><li><p>E.g. 
immigration could become a question about &#8220;how do we distinguish the benefits of being inside the intersecting boundaries of &#8216;physically in the country&#8217; vs &#8216;citizen of the country&#8217;?&#8221;</p></li></ol></li></ol></li></ol><h3>Some topics that were discussed:</h3><ol start="6"><li><p>Boundary protocols</p><ol><li><p>In practice, you <strong>do</strong> want boundaries to be crossed or modified under the right conditions, because the alternative is stagnation.&nbsp; An organism with perfectly preserved boundaries will starve; preserved national boundaries prevent trade; etc.</p><ol><li><p>Realistically, you want to be able to describe (and perhaps even infer) when it&#8217;s acceptable to the object for something to cross its boundary.</p><ol><li><p>One challenge is that cells seem to love letting viral DNA in, but that feels like a boundary violation. </p></li><li><p>Meanwhile, only some people want surgeons to operate on their cancer, so language and the study of informed consent clearly also play a role at some level.</p></li></ol></li><li><p>Boundary protocols are embedded in physical reality (e.g. cell receptors on the boundary of a cell encode what is allowed in).</p><ol><li><p>How would one infer boundary protocols? And how would a protocol be updated or renegotiated?</p></li></ol></li><li><p>My Q: How much of a boundary protocol can you infer from observation?&nbsp; </p><ol><li><p>For example, by only observing people within a culture, is it possible to learn the social norms sufficiently to participate without causing disruptions? Could you learn them well enough to not change the culture if you now made up &gt;90% of the participants? I&#8217;m not sure you could, which prevents this approach from enabling AI to act ethically. It still might not limit its ability to act safely, though; there are interventions (like destroying a food supply) that clearly disrupt a culture in a predictable way.</p></li></ol></li></ol></li></ol></li><li><p>Models of Boundaries</p><ol><li><p>It seemed like Yann LeCun&#8217;s H-JEPA (<a href="https://openreview.net/pdf?id=BZ5a1r-kVsf">section 4.6 here</a>) is quite relevant, and we explored that.</p></li><li><p>We also discussed whether <a href="https://en.wikipedia.org/wiki/Petri_net">Petri Nets</a> could be used to model the state of a system, its boundary, and its boundary protocol. </p></li><li><p>Another potential model that&#8217;s come up since is Port-Hamiltonian systems.</p></li><li><p>Generally, it felt like progress was needed (especially on answering questions like &#8220;how could we model boundaries in a way that allows for continuous time?&#8221;).</p>
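<p>To make this slightly more concrete, here is a deliberately oversimplified, hypothetical sketch in Lean of a discrete-time, deterministic system in the spirit of Critch&#8217;s VAPE formalization summarized above; it is not Critch&#8217;s actual (probabilistic, Bayesian-network) definition, and every name in it is invented for this illustration:</p><pre><code>-- Hypothetical toy sketch, not Critch's definition: a discrete-time,
-- deterministic "VAPE"-style system in which the viscera and the
-- environment can only influence each other via the boundary states.
structure VAPEState (V A P E : Type) where
  viscera : V
  active  : A   -- active boundary (actions)
  passive : P   -- passive boundary (perception)
  env     : E

structure VAPESystem (V A P E : Type) where
  perceive      : E -> P -> P   -- environment writes only the passive boundary
  updateViscera : P -> V -> V   -- viscera read only the passive boundary
  act           : V -> A -> A   -- viscera write only the active boundary
  updateEnv     : A -> E -> E   -- environment reads only the active boundary

-- One synchronous time step; note that no function takes both V and E
-- directly, which is the separation this sketch is trying to capture.
def step {V A P E : Type} (sys : VAPESystem V A P E)
    (s : VAPEState V A P E) : VAPEState V A P E :=
  let p' := sys.perceive s.env s.passive
  let v' := sys.updateViscera p' s.viscera
  let a' := sys.act v' s.active
  let e' := sys.updateEnv a' s.env
  { viscera := v', active := a', passive := p', env := e' }</code></pre><p>A faithful version would replace these functions with conditional distributions on a Bayesian network (and would need to address the continuous-time question above); the sketch only shows where the restriction against direct viscera &#8660; environment interaction lives.</p>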
<ol><li><p>There were also a bunch of explorations around things like &#8220;do you need to be able to label things as &#8216;boundary&#8217; or is labeling inside and outside of objects sufficient&#8221; or &#8220;how to deal with non-contiguous physical boundaries&#8221; that didn&#8217;t feel to me like they reached clear endpoints.</p></li></ol></li></ol></li><li><p>Types of boundaries</p><ol><li><p>I created this list of <a href="https://docs.google.com/spreadsheets/d/1zfDoAAx3Xv7qctOedYrA1aOYvxI4rzSLTOLnQEmE6UY/edit#gid=0">Examples of Boundaries</a>.&nbsp; It&#8217;s definitely got issues, but it was helpful to make sure a statement made about one type of boundary held for other types one might want to consider.</p></li><li><p>I also thought this formulation of boundaries was interesting:</p><ol><li><p>If we identify types of things that are interesting to preserve, it&#8217;d be nice to have a way of relating things to other things.&nbsp; Here are 4 categories of things:</p><ol><li><p>Objects (physical arrangements that persevere in time)</p><ol><li><p>E.g. an atom or a rock; it makes sense to say there&#8217;s a &#8220;boundary&#8221; around it because it&#8217;s intuitively recognizable as a thing. </p></li></ol></li><li><p>Cycles of objects (physical objects that indirectly beget themselves)</p><ol><li><p>E.g. metabolic cycles; carbon cycle; chicken + egg</p></li></ol></li><li><p>Patterns&nbsp;(arrangements of information encoded in physical objects where the objects are transient but the information persists)</p><ol><li><p>E.g. forests or civilization: the trees or people change but the pattern remains; Dawkinsian memes, The Ship of Theseus, and living things (probably) fit into this category as well. </p></li></ol></li><li><p>Cycles of patterns</p><ol><li><p>E.g. centralization vs decentralization of power within society; the model of punctuated equilibrium in evolutionary biology</p></li></ol></li></ol></li><li><p>&#8220;Things&#8221; on this scale are clearly composed of other &#8220;things&#8221;. </p><ol><li><p>While it might be possible to list all types of boundaries from the bottom up or create some sort of directed graph, I don&#8217;t think that&#8217;s necessary, since the most relevant piece is likely the ability to relate different boundaries to each other, which can be done more succinctly on a case-by-case basis than by falling back to a taxonomy of boundaries.</p></li><li><p>Very hot take: a lot of my intuition says that preserving cycles of patterns (the fourth category), with deference going to the patterns recurring on the longest timescale, is an interesting extrapolation of moral trends. (I don&#8217;t think this is particularly defensible, but it&#8217;s an interesting thought.)</p></li></ol></li></ol></li></ol></li></ol><p>Again, this is very incomplete, and I&#8217;m mostly trying to get something out the door in time for the next workshop. We&#8217;ll try to have a more comprehensive (and more timely) summary out of the next workshop!</p><div class="subscription-widget-wrap-editor" data-attrs="{&quot;url&quot;:&quot;https://blog.atlascomputing.org/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe&quot;,&quot;language&quot;:&quot;en&quot;}" data-component-name="SubscribeWidgetToDOM"><div class="subscription-widget show-subscribe"><div class="preamble"><p class="cta-caption">Thanks for reading Atlas Blog!
Subscribe for free to receive new posts and support my work.</p></div><form class="subscription-widget-subscribe"><input type="email" class="email-input" name="email" placeholder="Type your email&#8230;" tabindex="-1"><input type="submit" class="button primary" value="Subscribe"><div class="fake-input-wrapper"><div class="fake-input"></div><div class="fake-button"></div></div></form></div></div>]]></content:encoded></item><item><title><![CDATA[Welcome to Atlas Computing's blog!]]></title><description><![CDATA[A brief summary of where we write about our work]]></description><link>https://blog.atlascomputing.org/p/blog-says-hello-world</link><guid isPermaLink="false">https://blog.atlascomputing.org/p/blog-says-hello-world</guid><dc:creator><![CDATA[Atlas Computing]]></dc:creator><pubDate>Thu, 04 Apr 2024 14:11:36 GMT</pubDate><enclosure url="https://substackcdn.com/image/fetch/f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2F4a919151-a360-4cd0-8f23-02cb3524cb53_700x700.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hi Atlas community!</p><p>Welcome to our blog.  As a first post, we wanted to share the main ways to share information:</p><ol><li><p>Our website, <a href="https://atlascomputing.org/">atlascomputing.org</a>, covers slow-changing descriptions of what we want to achieve and how we&#8217;re going about doing it</p></li><li><p>We&#8217;ve got an email list <a href="https://groups.google.com/a/atlascomputing.org/g/updates">here</a> for regular major updates, roughly quarterly.</p></li><li><p>We'll also be using the safe-by-design email list <a href="https://groups.google.com/g/safe-by-design">here</a> for very informal conversations. A small group also spun out <a href="https://provablysafeai.zulipchat.com/">a Zulip chat</a> from that list.</p></li><li><p>On this blog, you can expect brief, informal updates on a topic probably about one every two weeks for now.  </p></li><li><p>We&#8217;ve got profiles on <a href="http://linkedin.com/company/atlas-computing-org">LinkedIn</a> and <a href="https://twitter.com/SafeWithAtlas/">Twitter</a>; currently rarely used, but we have ambitious to change that.</p></li></ol><p>Each of those venues has their own sign-up; if #4 sounds good, subscribe here &#128071;</p><p class="button-wrapper" data-attrs="{&quot;url&quot;:&quot;https://blog.atlascomputing.org/subscribe?&quot;,&quot;text&quot;:&quot;Subscribe now&quot;,&quot;action&quot;:null,&quot;class&quot;:null}" data-component-name="ButtonCreateButton"><a class="button primary" href="https://blog.atlascomputing.org/subscribe?"><span>Subscribe now</span></a></p>]]></content:encoded></item></channel></rss>