AI Compliance Frameworks, Standards, and Governance for Energy & Utilities

Current as of June 2026. Energy-sector rules are moving quickly: a new grid cybersecurity standard took effect in 2025, a pipeline cyber rule is in proposed form, and a critical-infrastructure AI profile is in early development. Every date below reflects the most recent confirmed status, and uncertainty is flagged where it exists.

For energy and utilities, AI compliance is critical-infrastructure compliance. The stack is layered: a horizontal governance baseline, the mandatory reliability and cybersecurity standards that protect the grid, the operational-technology security standard that governs control systems, the federal directives for pipelines, and the sector-specific AI guidance now emerging from the Department of Energy and NIST. They are layers, not options you choose between, and an operator who builds to all of them has covered the human-operated surface well.

The harder problem sits underneath all of them, and it is sharper here than in any office-bound industry. When an AI agent acts through a Model Context Protocol tool call in an energy environment, the action surface is not a screen a person reads. It can be a breaker, a valve, or a control setpoint, where a mistake has physical consequences and no operator reviewed the action in the moment. None of the frameworks in the stack governs that. This article maps what each layer reaches, where it stops, and the architectural control that closes the agent gap.

NIST AI RMF and ISO/IEC 42001 Are the Horizontal Backbone for Energy AI Governance

Every energy operator’s AI program should align to two horizontal instruments first: the NIST AI Risk Management Framework supplies the methodology, and ISO/IEC 42001 supplies the certifiable management system. NIST AI RMF organizes risk work into four functions, Govern, Map, Measure, and Manage (NIST, 2023). ISO/IEC 42001, published December 2023, is the first AI management system standard an external auditor can certify against (ISO, 2023).

These give a defensible structure that maps onto how utilities already manage risk under reliability and cybersecurity standards. They are the connective tissue across the sector-specific regimes below. None of those regimes provides a single, certifiable AI management system, which is why the horizontal baseline matters even for operators whose main exposure is grid reliability. A 42001 certificate strengthens AI governance evidence; it does not substitute for the mandatory standards in the next sections.

NERC CIP and IEC 62443 Reach the Grid and OT Layer, but Stop at Human-Operated Systems

NERC CIP makes cybersecurity mandatory and legally enforceable for the Bulk Electric System, and IEC 62443 secures the OT and ICS layer beneath the corporate network, yet both write their controls for systems a person operates. The NERC Critical Infrastructure Protection standards are FERC-approved with financial penalties for noncompliance, and AI used in grid operations, energy management systems, or control environments has to fit CIP requirements for access control, logging, change management, and incident response (FERC, 2025).

In June 2025, FERC approved CIP-015-1, adding mandatory internal network security monitoring inside the electronic security perimeter, a sign the standards keep tightening. CIP categorizes BES Cyber Systems as high, medium, or low impact and scales controls to the reliability risk each poses. An AI tool that touches a categorized system inherits that system’s obligations.
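The inheritance rule can be sketched in a few lines. This is a minimal illustration, not NERC's categorization procedure: the system names and ratings are hypothetical, and the choice to apply the strictest rating when a tool touches several systems is an assumption made for the example.

```python
from enum import IntEnum


class Impact(IntEnum):
    """CIP impact ratings, ordered so a higher value means stricter controls."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def inherited_impact(touched_systems):
    """Return the strictest rating among the systems an AI tool can reach.

    Assumption for this sketch: when a tool touches several categorized
    systems, the highest rating governs. Returns None when the tool
    touches no categorized system.
    """
    ratings = list(touched_systems.values())
    return max(ratings) if ratings else None


# Hypothetical inventory: names and ratings are illustrative only.
ems_assistant = {
    "control-center-ems": Impact.HIGH,
    "substation-rtu-12": Impact.MEDIUM,
}
print(inherited_impact(ems_assistant).name)  # HIGH
```

Keeping the rating as an ordered type means the same comparison can drive later decisions, such as which logging or change-management tier applies to the tool.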

IEC 62443 governs industrial automation and control systems through zones and conduits, segmentation, and a secure development lifecycle (IEC, 2024). Its zone-and-conduit model keeps a compromise in one part of a plant from cascading across it. AI introduced into that environment, whether a predictive-maintenance model or an operations assistant, has to respect the same segmentation rather than punching a new path across zones.

Neither standard addresses autonomous AI agents. An operator deploying an agent in or near a control environment is applying controls written for human-operated systems to something that is not one. The principle has to be carried forward deliberately: an AI capability is one more thing that must live inside the zone model, not outside it.
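A rough way to picture "inside the zone model" is a check that every connection an AI component requests either stays within its zone or follows a declared conduit. The sketch below assumes a hypothetical plant layout and treats conduits as directed zone pairs, which is a simplification for illustration rather than anything IEC 62443 prescribes.

```python
# Hypothetical plant layout: asset names, zones, and conduits are illustrative only.
ZONE_OF = {
    "historian": "dmz",
    "pm-model": "dmz",      # predictive-maintenance model
    "plc-7": "control",
    "hmi-2": "control",
}

# Approved zone-to-zone paths, stored as directed (source, destination) pairs.
# Here the control zone may push data to the DMZ, but not the reverse.
CONDUITS = {("control", "dmz")}


def connection_allowed(src, dst):
    """Allow traffic inside one zone or along a declared conduit."""
    src_zone, dst_zone = ZONE_OF[src], ZONE_OF[dst]
    return src_zone == dst_zone or (src_zone, dst_zone) in CONDUITS


def violations(connections):
    """Return the requested connections that would open a new path across zones."""
    return [(s, d) for s, d in connections if not connection_allowed(s, d)]


requested = [
    ("plc-7", "historian"),     # control -> dmz, declared conduit
    ("pm-model", "historian"),  # same zone
    ("pm-model", "plc-7"),      # dmz -> control, no conduit
]
print(violations(requested))  # [('pm-model', 'plc-7')]
```

The model reading telemetry from the historian passes; the same model reaching back into the control zone is the new path the zone model exists to prevent.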

TSA Directives, NIS2, and the EU AI Act Extend the Stack to Pipelines and Cross-Border Operations

Pipeline operators answer to mandatory TSA cybersecurity Security Directives, EU utilities answer to NIS2 and the EU AI Act, and AI deployed in any of those environments inherits the existing requirement rather than creating a new exemption. TSA issued the directives after the 2021 Colonial Pipeline attack, requiring a TSA-approved cybersecurity implementation plan, an incident response plan, and cybersecurity incident reporting to CISA (TSA, 2024).

The directives are being formalized. In November 2024, TSA published a proposed rule, Enhancing Surface Cyber Risk Management, that would convert the annually renewed emergency directives into a permanent cyber-risk-management rule for pipelines and rail. As of June 2026 it remains a proposed rule, not final, so treat the current Security Directives as the live requirement and confirm the rule’s status before planning against it.

For European operations, the NIS2 Directive sets binding cybersecurity and incident-reporting obligations for essential entities, with energy named explicitly among them (NIS2 Directive, 2022). The EU AI Act adds a second layer: AI used as a safety component in the management and operation of critical infrastructure is high-risk under Annex III, carrying obligations on risk management, data quality, logging, transparency, and human oversight (EU AI Act, 2024). Annex III high-risk obligations take effect August 2, 2026, though the Digital Omnibus provisional agreement of May 7, 2026 may defer that date to December 2, 2027, pending formal adoption.

The Act is extraterritorial, so a non-EU operator whose AI affects EU energy infrastructure can be in scope. NIS2 governs the cybersecurity of the system; the AI Act governs the AI inside it.

The Department of Energy’s CESER office adds federal sector guidance, naming four AI risk categories operators must manage: unintentional failure modes, adversarial attacks against AI, hostile applications of AI, and compromise of the AI software supply chain (DOE CESER, 2024). It is guidance, not a binding rule, but it is the clearest federal framing of AI risk in this sector, and its supply-chain category maps onto the third-party diligence NERC CIP and TSA already expect.

Centralized Coordination With Distributed Execution Is the Organizational Model Energy Utilities Need

AI oversight in a utility works best as centralized coordination with distributed execution: a defined decision-making authority at the center, accountable roles named across IT, OT, and compliance, and enforcement that reaches the teams operating the systems. Gartner predicts that by 2028, 25% of large organizations will have dedicated AI governance teams, up from less than 1% in 2023 (Gartner, 2025), and energy operators are part of that shift.

The roles matter because the frameworks assume someone owns them. A Chief AI Officer or equivalent sets policy and risk appetite. A security architecture lead maps AI into the existing NERC CIP and IEC 62443 control environment. A compliance owner holds the evidence obligation for audits and regulators. OT and plant security teams keep AI inside the zone model. Without named owners, a NERC CIP program and an ISO/IEC 42001 scope statement become documents nobody operates.

Through 2026, at least 80% of unauthorized AI transactions will be caused by internal policy violations rather than malicious attacks (Gartner, 2025). That is an organizational problem before it is a technical one: information oversharing, unacceptable use, and misguided AI behavior happen when no role owns the policy and no control enforces it. The governance structure decides who is accountable; the architecture below decides whether the policy actually holds.

Vendor Evaluation Has to Test Discovery Depth, Inline Enforcement, and Agent Coverage Before Any Tool Goes Live

Before an AI tool enters an energy environment, evaluate it against the controls the frameworks assume an operator already has: complete discovery of AI in use, inline data enforcement at the prompt, agent tool-call governance, and audit-ready records. The DOE’s supply-chain risk category makes third-party AI diligence a sector expectation, not a nicety (DOE CESER, 2024).

The evaluation criteria that separate real coverage from posture management are concrete. Does the tool discover shadow AI, embedded AI inside approved SaaS, and personal accounts, or only sanctioned apps? Does it inspect prompts and responses inline, or only log them after the fact? Does it govern agent tool calls before they execute, or stop at the network destination? Does it produce conversation-level records a NERC auditor or TSA reviewer would accept as evidence? A vendor that cannot answer the agent question leaves the exact gap this article is about.
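The four questions above can be recorded as a simple pass/fail checklist per vendor. The sketch below is one hypothetical way to structure that record; the field names are invented for the example, and a real evaluation would attach evidence to each answer rather than a bare boolean.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class VendorAnswers:
    """One vendor's answers to the four evaluation questions (hypothetical fields)."""
    discovers_unsanctioned_ai: bool          # shadow AI, embedded AI, personal accounts
    inspects_inline: bool                    # prompts and responses, not after-the-fact logs
    governs_tool_calls_pre_execution: bool   # before the call fires, not at the network destination
    conversation_level_records: bool         # evidence an auditor or reviewer could accept


def gaps(answers):
    """List the criteria a vendor fails, in the order the questions are asked."""
    return [name for name, passed in asdict(answers).items() if not passed]


# Illustrative candidate that covers everything except the agent question.
candidate = VendorAnswers(
    discovers_unsanctioned_ai=True,
    inspects_inline=True,
    governs_tool_calls_pre_execution=False,
    conversation_level_records=True,
)
print(gaps(candidate))  # ['governs_tool_calls_pre_execution']
```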

A first-party reference point: in one Aurascape deployment at a Fortune 100 insurance and financial enterprise, treating security as an adoption accelerant cut the time to adopt new AI tools by 60% and tripled AI agent integrations, with no unauthorized data access across more than 20,000 users (Aurascape, 2026). The evaluation criterion behind that result was agent governance at the tool call, not posture scanning after deployment.

Every Framework in the Stack Assumes a Human at the Console, and Agents Broke That Assumption

The energy frameworks all assume a person operates and oversees the system, and AI agents acting through Model Context Protocol tool calls remove that person from the chain. MCP is the open standard that lets an agent connect to external tools, systems, and data sources and act through them. In an energy environment, an agent that reads a sensor, queries a system, or issues a command is taking an action no operator reviewed in the moment, and the consequence can be physical.

This is the hinge, and it is more acute here than in office-bound sectors. NERC CIP assumes a human-operated control system. IEC 62443 assumes a person inside the zone model. TSA directives assume an operator running a response plan. An agent chaining tool calls across an energy environment satisfies none of those assumptions cleanly, because the action surface moved from a screen a person reads to a tool call that fires in milliseconds, potentially against a control system.
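To make the shift concrete: an MCP tool invocation travels as a JSON-RPC message whose method is tools/call, with the tool name and arguments in params. The sketch below shows one way a gate could put a person back in the chain for control-affecting tools. The tool names, the three decision labels, and the deny-by-default choice are all assumptions for illustration, not part of the protocol or of any framework named here.

```python
import json

# Hypothetical tool names; a real deployment would derive these from its own inventory.
READ_ONLY_TOOLS = {"read_sensor", "query_historian"}
CONTROL_TOOLS = {"open_breaker", "set_valve_position"}


def decide(raw_message):
    """Return 'allow', 'hold_for_operator', or 'deny' for one JSON-RPC message."""
    msg = json.loads(raw_message)
    if msg.get("method") != "tools/call":
        return "allow"  # not a tool invocation; out of scope for this gate
    tool = msg.get("params", {}).get("name")
    if tool in READ_ONLY_TOOLS:
        return "allow"
    if tool in CONTROL_TOOLS:
        return "hold_for_operator"  # require a human decision before the call fires
    return "deny"  # unknown tools are refused by default


request = json.dumps({
    "jsonrpc": "2.0",
    "id": 7,
    "method": "tools/call",
    "params": {"name": "open_breaker", "arguments": {"breaker_id": "BKR-14"}},
})
print(decide(request))  # hold_for_operator
```

The point of the sketch is where the decision happens: on the tool call itself, before execution, rather than on a screen an operator may never see.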

The exposure is not theoretical. The Cloud Security Alliance found that 82% of organizations have unknown AI agents operating in their environment and 61% reported agent-related data exposure (Cloud Security Alliance, 2026). More than 12,520 internet-accessible MCP services were observed as of April 2026, and the protocol does not require authentication by default, leaving most exposed services unauthenticated (Censys, 2026). Singapore’s Infocomm Media Development Authority launched the first governance framework written specifically for autonomous AI agents in January 2026, an official signal that the existing stack does not reach agents. The control for that gap has to come from architecture, not the frameworks.

Risk Assessment for Energy AI Maps Each Deployment Across IT, OT, and ICS Before Controls Are Selected

A defensible energy AI risk assessment documents three things per deployment: where the AI sits across IT, OT, and ICS, what data and control surfaces it can reach, and how it fails on its own or under attack. NIST AI RMF supplies the structure through its Map, Measure, Manage sequence, and the DOE’s four risk categories supply the questions each deployment must answer (DOE CESER, 2024).

The mapping is the starting point most operators skip. A predictive-maintenance model reading plant telemetry sits in OT and touches sensor data. A corporate copilot summarizing engineering documents sits in IT but can reach Critical Energy Infrastructure Information. An autonomous agent issuing control commands crosses into ICS, where the action surface is physical. Each placement carries a different blast radius, and the control intensity should scale to it the way NERC CIP already scales controls to BES Cyber System impact level.
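The three placements above can be captured as assessment records and ranked by blast radius. This is a minimal sketch under stated assumptions: the numeric ordering of IT, OT, and ICS and the listed failure modes are illustrative choices, not values taken from NIST, DOE, or NERC.

```python
from dataclasses import dataclass, field

# Assumed ordering for this sketch: deeper placement means a larger blast radius.
BLAST_RADIUS = {"IT": 1, "OT": 2, "ICS": 3}


@dataclass
class Deployment:
    """One AI deployment and the three things the assessment documents."""
    name: str
    layer: str                                          # "IT", "OT", or "ICS"
    reachable: list = field(default_factory=list)       # data and control surfaces
    failure_modes: list = field(default_factory=list)   # on its own or under attack

    def is_documented(self):
        """True only when placement, reach, and failure modes are all recorded."""
        return (
            self.layer in BLAST_RADIUS
            and bool(self.reachable)
            and bool(self.failure_modes)
        )


def review_order(deployments):
    """Sort so the deployments with the largest blast radius are reviewed first."""
    return sorted(deployments, key=lambda d: BLAST_RADIUS[d.layer], reverse=True)


# Failure modes below are hypothetical examples.
inventory = [
    Deployment("corporate copilot", "IT", ["CEII documents"], ["oversharing"]),
    Deployment("predictive-maintenance model", "OT", ["sensor telemetry"], ["drift"]),
    Deployment("control agent", "ICS", ["control commands"], ["prompt injection"]),
]
print([d.name for d in review_order(inventory)])
# ['control agent', 'predictive-maintenance model', 'corporate copilot']
```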

The table below shows how the assessment categories map to the frameworks and the control that answers each one.

| Risk category (DOE) | Where it lands | Framework reach | Control that closes it |
|---|---|---|---|
| Unintentional failure | OT and ICS models | IEC 62443 zone model | Inline inspection of model output |
| Adversarial attack on AI | Any AI-reachable surface | OWASP LLM01 prompt injection | Tool-call inspection before execution |
| Hostile use of AI | Control and command paths | NERC CIP access control | Agent treated as a privileged user |
| Supply-chain compromise | Third-party AI and MCP servers | DOE diligence, TSA reporting | Signed, verified tool calls only |

Read the last column. The frameworks scope the risk; the control that actually fires at the moment of consequence sits in architecture, not in the standard.

Closing the Agent Gap Takes Architecture That Inspects the Tool Call Before It Reaches a Breaker or Valve

Only 31% of organizations say they are fully equipped to control and secure agentic AI systems, even as 83% plan to deploy them (Cisco, 2025). The gap is architectural: the energy frameworks assume a human oversees the action, and an agent removes that human. Aurascape’s Zero-Bypass MCP Gateway inspects, verifies, signs, and controls every Model Context Protocol tool call, API invocation, and data retrieval before an agent reaches any external system.

Where the frameworks assume an operator in the loop, the Gateway treats the agent as a privileged user and inspects both legs of its behavior: the agent-to-model leg and the agent-to-tool leg. Secure Agentic AI wraps the rest of the lifecycle with pre-build adversarial testing, Code Path and CVE Detection, and Safe Output Governance at runtime. The control fires at the tool call itself, where an agent reaches a system or service, not at a network destination it already moved past.
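The general idea of signing and verifying a tool call can be illustrated with a keyed hash. The sketch below is not Aurascape's implementation, which this article does not describe; it is a generic HMAC example using Python's standard library, with a hypothetical key, tool name, and arguments, meant only to show why a signed call cannot be altered between approval and execution without detection.

```python
import hashlib
import hmac
import json


def canonical(call):
    """Serialize a tool call deterministically so signatures are reproducible."""
    return json.dumps(call, sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign(call, key):
    """Return a hex HMAC-SHA256 signature over the canonical form of the call."""
    return hmac.new(key, canonical(call), hashlib.sha256).hexdigest()


def verify(call, signature, key):
    """Check the signature using a constant-time comparison."""
    return hmac.compare_digest(sign(call, key), signature)


# Hypothetical key and call; a real key would come from a managed secret store.
key = b"demo-key-not-for-production"
call = {"name": "set_valve_position", "arguments": {"valve": "V-101", "percent": 40}}
sig = sign(call, key)

# Same tool, but the setpoint was changed after the call was signed.
tampered = {"name": "set_valve_position", "arguments": {"valve": "V-101", "percent": 100}}
print(verify(call, sig, key), verify(tampered, sig, key))  # True False
```

A shared-key HMAC is the simplest choice for a sketch; an asymmetric signature would let the receiving system verify calls without holding a key that can also sign them.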

The discovery layer underneath it is just as concrete. Aurascape catalogues more than 20,000 AI applications and ships production-ready connectors within 48 hours of a new tool appearing, the inventory layer a NERC CIP program and an ISO/IEC 42001 scope statement both assume an operator already has (Aurascape, 2026). It discovers shadow AI, embedded AI inside approved SaaS, personal accounts, and agents running locally on devices, the AI most operators are blind to. In an environment where an action can have physical consequences, inspecting the tool call before it executes is the difference that matters.

How the Energy and Utilities AI Compliance Stack Compares Across Vendors

The energy AI control problem clusters into a small number of approaches: full-platform AI-native vendors, AI security startups focused on build-time or copilot risk, and legacy data-security or browser vendors retrofitting AI coverage. The dimensions that decide coverage in an energy environment are discovery depth, whether enforcement fires inline at the prompt, and whether the tool secures agent tool calls before they execute.

| Vendor | Discovery scope | Agent tool-call control | Architecture origin |
|---|---|---|---|
| Aurascape | 20,000+ AI apps, shadow AI, embedded AI, local agents, personal accounts | Zero-Bypass MCP Gateway signs and verifies every tool call before execution | AI-native, built for prompts, responses, and agents |
| Knostic | Enterprise LLM access, Copilot and Glean surfaces | MCP servers, IDE extensions covered | Need-to-know access control for LLMs |
| Lasso Security | AI-BOM inventory of agents and apps | Open-source MCP gateway plus runtime enforcement | Build-and-runtime LLM security |
| Prompt Security | Employee AI, homegrown apps, code assistants | Agentic AI and MCP-server coverage | GenAI security, SaaS or self-hosted |
| WitnessAI | Shadow AI inventory across apps and agents | MCP servers and tool calls via Control layer | Network-level intent ML, single-tenant |
| Varonis Atlas | AI inventory and shadow AI discovery | AI runtime guardrails | Data-security platform, AI line added March 2026 |

Read across Aurascape’s row. The discovery scope reaches local agents and embedded AI most rows do not list, and the tool-call control signs and verifies every MCP call before it executes rather than monitoring after the fact, which is the control the energy frameworks leave open.

Frequently Asked Questions

Does NERC CIP apply to AI used in grid operations?

NERC CIP is mandatory and enforceable for the Bulk Electric System, so AI used in grid operations, energy management systems, or control environments has to meet CIP requirements for access control, logging, change management, and incident response. An AI tool that touches a categorized BES Cyber System inherits that system’s obligations, and CIP does not yet address autonomous agents specifically.

How should a utility structure organizational governance for AI?

The model that works best is centralized coordination with distributed execution: a central authority sets policy and risk appetite while named roles own execution across IT, OT, and compliance. Gartner predicts 25% of large organizations will have dedicated AI governance teams by 2028, up from less than 1% in 2023, so the role definitions are arriving alongside the technical controls.

What should energy operators evaluate when selecting a third-party AI tool?

Test discovery depth, inline enforcement at the prompt, agent tool-call governance, and audit-ready records against the frameworks the tool has to satisfy. The DOE’s supply-chain risk category makes third-party AI diligence a sector expectation, so a vendor that cannot govern agent tool calls before they execute leaves the exact gap the energy frameworks already miss.

How does IEC 62443 relate to AI in control systems?

IEC 62443 secures industrial automation and control systems through zones, conduits, and a secure lifecycle, so AI introduced into an OT or ICS environment has to respect that segmentation rather than creating a new path across zones. The standard predates AI agents, so operators carry its principles forward deliberately when they add AI to a plant or control room.

Is there an AI standard specifically for critical infrastructure?

Not a finished one yet. NIST released a concept note in April 2026 for an AI RMF Profile on Trustworthy AI in Critical Infrastructure, covering IT, OT, and ICS and explicitly contemplating AI agents and tools, currently in early development through a Community of Interest.

Does the EU AI Act apply to energy operators outside Europe?

Yes, where AI is a safety component in managing or operating critical infrastructure, which is high-risk under Annex III, and the Act is extraterritorial. A non-EU operator whose AI affects EU energy infrastructure can be in scope, and EU utilities also face NIS2 cybersecurity obligations on top of it.

Does an ISO/IEC 42001 certificate satisfy NERC CIP?

No. ISO/IEC 42001 is a voluntary AI management system certification, while NERC CIP is a mandatory enforceable reliability standard with its own requirements, as are TSA directives and NIS2. A 42001 certificate strengthens AI governance evidence and can streamline parts of an audit, but it does not substitute for CIP compliance or OT security controls.

Why can’t framework compliance alone govern autonomous agents?

Every framework in the energy stack assumes a human operates and oversees the system, and an agent acting through MCP tool calls removes that human from the chain. The Cloud Security Alliance found 82% of organizations have unknown agents in their environment, so compliance on paper coexists with agents acting ungoverned where the physical consequence lives.

How Aurascape Inspects Every Agent Tool Call Before It Reaches a Control System

Aurascape’s Zero-Bypass MCP Gateway inspects, verifies, and signs every agent tool call before it executes, closing the gap the energy frameworks leave open: autonomous agents acting through Model Context Protocol connections that existing SSE, SASE, and DLP controls never see. The platform discovers every AI app and agent including shadow AI, embedded AI, and AI Copilots, classifies and controls sensitive operational information inline before it reaches an external tool, and produces the conversation-level audit records that NERC CIP, TSA directives, and NIS2 expect for the AI layer.

For the agentic surface specifically, Secure Agentic AI adds adversarial testing and runtime guardrails across the full agent lifecycle, from pre-build Code Path and CVE Detection through Safe Output Governance at runtime. The platform sits alongside an existing SSE, SASE, or DLP stack and the operator’s OT and reliability controls rather than replacing them, and Auri gives compliance teams self-service, natural-language access to the evidence. In one Aurascape deployment at a global Fortune 200 healthcare technology enterprise, unsanctioned AI access was driven to near zero across more than 60,000 users worldwide while sensitive-data exposure risk stayed minimized as AI use grew (Aurascape, 2026).

Aurascape does not make an operator compliant or replace legal counsel or OT security. It operationalizes the AI controls and produces the proof that compliance and security teams use to demonstrate the program is real. This page is one of a set; for the cross-industry version, see AI Compliance Frameworks, Standards, and Governance for Enterprise AI.

Aurascape is the AI-native control layer for the one place the energy compliance stack still goes blind: autonomous agents acting through tool calls your existing controls never see. Every deployment runs through a tailored demo with your security team.
