Our Standards

Guidelight's standards describe, in concrete terms, what safe frontier AI development looks like, balanced against what is achievable. Read more about our standards development process.

Control (v1.0)

Control refers to the technical and operational measures that constrain what an AI system can do, regardless of whether it is aligned. These measures reduce catastrophic risk from a misaligned AI and can also surface evidence of an AI's misalignment.

Transparency (v1.0)

Transparency makes safety-relevant information legible to stakeholders outside AI labs, including scientists, civil society, and governments. This both incentivizes safer practices and gives external stakeholders a better-informed view of risk.

Principle 1

Expose your risk assessment to public scrutiny.

Principle 2

Inform the public about incidents in a complete and timely fashion.

Control (v1.0)

Be able to see what your AI is doing during internal deployment.

Scan for signs of concerning behavior.

Stress-test the sufficiency of your scanning.

Stop the AI from taking harmful actions even if it tries.

Have independent third parties verify the adequacy of your control regime.

Prepare for a possible breach of control.
