Our Standards
Guidelight's standards describe what safe frontier AI development looks like in concrete terms, balanced by what's achievable. Read more about our standards development process.
Control (v1.0)
Control refers to the technical and operational measures that constrain what an AI system can do, regardless of whether it is aligned. These measures both reduce catastrophic risk from a misaligned AI and can surface evidence of an AI's misalignment.
Principle 1
Be able to see what your AI is doing during internal deployment.
Principle 2
Scan for signs of concerning behavior.
Principle 3
Stress-test the sufficiency of your scanning.
Principle 4
Stop the AI from taking harmful actions even if it tried.
Principle 5
Have independent third parties verify the adequacy of your control regime.
Principle 6
Prepare for a possible breach of control.
Transparency (v1.0)
Transparency makes safety-relevant information legible to stakeholders outside of AI labs, including scientists, civil society, and governments. This both incentivizes safer practices and gives external stakeholders more-informed views of risk.