US Treasury's New AI Framework for Financial Services: What It Addresses—and What It Doesn't
- ggstoev
- Feb 20
Updated: Feb 25
On February 19, 2026, the U.S. Department of the Treasury released two foundational resources for AI governance in financial services: an AI Lexicon and the Financial Services AI Risk Management Framework (FS AI RMF). These resources establish common terminology and adapt NIST's AI Risk Management Framework to the specific operational and regulatory context of financial services.
For institutions deploying traditional predictive AI models, these frameworks provide valuable guidance. However, they reveal an emerging challenge: the governance assumptions embedded in these frameworks were designed for AI systems fundamentally different from the agentic systems many institutions are now implementing.
Understanding the Governance Challenge
Traditional AI governance frameworks like the NIST AI RMF and ISO 42001 were developed primarily for AI systems with certain operational characteristics: models with static parameters, periodic retraining cycles, and human decision points at critical junctures such as data provenance review, model selection, and pre-deployment approval. The Treasury's adaptation maintains these foundational assumptions while adding financial services-specific considerations around consumer protection and regulatory compliance.
Agentic AI systems operate differently. Unlike traditional predictive models or generative AI that outputs recommendations, agentic systems can independently plan, reason, and take autonomous actions to achieve objectives on behalf of users.
As Singapore's recently published Model Governance Framework for Agentic AI (released January 22, 2026) describes, these systems can:
Break tasks into subtasks and select appropriate tools
Execute actions and adapt to real-time feedback
Access sensitive data across operational systems
Modify operational environments (updating databases, processing payments) without direct human intervention
This creates fundamentally different governance challenges. When AI systems can act autonomously across multiple steps toward an objective, traditional governance assumptions about human checkpoints, periodic reviews, and manual oversight become increasingly misaligned with operational reality.
Three Critical Gaps in Current Approaches
1. Accountability Attribution
When multiple AI agents collaborate on a financial decision—for instance, one agent screening for fraud, another assessing credit risk, and a third executing the transaction—traditional accountability frameworks struggle. Standard RACI matrices and role-based controls assume discrete human decision points that may not exist in agentic workflows.
The Treasury's FS AI RMF addresses accountability through governance structures and oversight mechanisms, but these recommendations were developed for systems where decisions can be traced to specific model outputs and human approvals. Agentic systems require what Singapore's framework describes as making "humans meaningfully accountable"—the ability to trace collaborative decisions back through multiple agent interactions to specific approval authorities and boundary conditions.
2. Continuous Monitoring Requirements
Financial regulators have increasingly moved from guidance to mandates around continuous monitoring for high-frequency AI systems. This shift reflects recognition that traditional point-in-time assessments cannot adequately oversee systems operating at machine speed.
However, most institutions still operate governance on point-in-time assessment cycles. A risk assessment conducted in Q2 captures model behavior at that moment, but agentic systems may drift, adapt, or encounter edge cases continuously throughout Q3. The gap between assessment cadence and operational reality creates unquantified risk exposure.
Recent analysis from financial services researchers, including work by Vikram Singh of the London School of Economics on what he terms "living compliance," suggests this isn't merely a matter of conducting reviews more frequently. It requires architectural changes in which governance operates as a continuous data stream integrated into system operations, rather than as a periodic external review.
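The continuous-data-stream idea can be sketched as governance rules evaluated against every agent action as it occurs, rather than at review time. The event schema, rule name, and payment threshold below are illustrative assumptions for the sketch, not drawn from any published framework:

```python
from dataclasses import dataclass

@dataclass
class AgentEvent:
    """A single autonomous action emitted by an agent (illustrative schema)."""
    agent_id: str
    action: str
    amount: float  # e.g., transaction value in USD

# A hypothetical governance rule, checked on every event as it streams
# through the system rather than during a quarterly assessment.
def within_payment_limit(event: AgentEvent) -> bool:
    return event.amount <= 10_000

RULES = [within_payment_limit]

def evaluate(event: AgentEvent) -> list:
    """Return the names of any rules the event violates."""
    return [rule.__name__ for rule in RULES if not rule(event)]

# Each action is evaluated inline; a non-empty result can trigger
# escalation or a block before the action completes.
violations = evaluate(AgentEvent("credit-agent-7", "disburse_funds", 25_000.0))
```

The design point is that the check runs in the execution path, so a Q3 edge case is caught in Q3, not at the next assessment cycle.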
3. Decision Traceability at Scale
Traditional audit trails assume that critical decisions can be sampled and reviewed. When an agentic system processes thousands of micro-decisions per hour across multiple operational domains, sampling becomes statistically problematic: an agent making 5,000 decisions per hour generates over 100,000 decisions a day, so even a 1% sampling rate would demand more than a thousand manual reviews daily. What percentage of decisions requires human review? How are edge cases identified in real time versus discovered retrospectively?
The Treasury framework provides guidance on documentation and explainability, but these recommendations assume a volume and velocity of decisions that human teams can meaningfully review. This assumption requires re-examination for agentic deployments.
Operational Approaches for Agentic Governance
Based on analysis of emerging frameworks and early institutional implementations, several operational patterns are developing. Singapore's framework, for example, outlines four key dimensions for agentic AI governance:
Assess and bound the risks upfront
Make humans meaningfully accountable
Implement technical controls and processes throughout the agent lifecycle
Enable end-user responsibility through transparency and training
However, the framework stops short of prescribing specific operational mechanisms. Leading institutions are translating these principles into practice through approaches such as:
Automated Decision Registries
Rather than sampling decisions for review, some institutions are implementing comprehensive decision registries that automatically capture:
Agent identity and role in multi-agent workflows
Decision parameters and boundary conditions at execution time
Escalation triggers and human override points
Stakeholder notification requirements and execution
This approach addresses the accountability dimension by shifting from periodic sampling to comprehensive attribution of agent decisions and actions.
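A minimal sketch of such a registry follows, covering the four captured elements listed above. The field names and class structure are illustrative assumptions, not taken from the Treasury or Singapore documents:

```python
import datetime
from dataclasses import dataclass, field, asdict

@dataclass
class DecisionRecord:
    """One entry in an automated decision registry (illustrative fields)."""
    agent_id: str               # agent identity
    workflow_role: str          # role in the multi-agent workflow
    decision: str               # action taken
    parameters: dict            # decision parameters at execution time
    boundary_conditions: dict   # limits in force when the action ran
    escalation_triggered: bool  # whether a human override point fired
    notified: list              # stakeholders notified, per policy
    timestamp: str = field(
        default_factory=lambda: datetime.datetime.now(
            datetime.timezone.utc
        ).isoformat()
    )

class DecisionRegistry:
    """Append-only store capturing every decision, not a sample."""

    def __init__(self):
        self._records = []

    def record(self, rec: DecisionRecord) -> None:
        self._records.append(asdict(rec))

    def by_agent(self, agent_id: str) -> list:
        """Trace a collaborative outcome back to one agent's actions."""
        return [r for r in self._records if r["agent_id"] == agent_id]

# Example: a fraud-screening agent flags a transaction and escalates.
registry = DecisionRegistry()
registry.record(DecisionRecord(
    agent_id="fraud-agent-1",
    workflow_role="fraud screening",
    decision="flag_transaction",
    parameters={"score": 0.92},
    boundary_conditions={"threshold": 0.90},
    escalation_triggered=True,
    notified=["compliance-team"],
))
```

Because every decision is recorded with its boundary conditions, a multi-agent outcome can later be decomposed agent by agent, which is what "meaningful human accountability" requires in practice.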
Measurable Governance Metrics
Moving beyond compliance checklists, institutions are defining specific, trackable accountability metrics:
Decision attribution completeness: percentage of autonomous decisions with complete audit trails
Redress responsiveness: response time for stakeholder redress and inquiry
Human oversight effectiveness: correlation between review triggers and actual risk events
Transparency satisfaction: stakeholder confidence in governance processes
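The first metric above, decision attribution completeness, is straightforward to instrument once a decision registry exists. A minimal sketch, assuming decisions are exported as dictionaries and treating the required-field list as an institutional choice rather than a prescribed standard:

```python
def attribution_completeness(records):
    """Share of autonomous decisions whose audit trail is complete.

    A record counts as complete when every required field is present
    and non-empty. The field names here are illustrative.
    """
    required = ("agent_id", "parameters", "boundary_conditions", "timestamp")
    if not records:
        return 0.0
    complete = sum(
        1 for r in records
        if all(r.get(f) not in (None, "", {}) for f in required)
    )
    return complete / len(records)

# Two of these three decisions carry a full audit trail; the third is
# missing its execution-time parameters.
sample = [
    {"agent_id": "a1", "parameters": {"limit": 500},
     "boundary_conditions": {"max": 1000}, "timestamp": "2026-02-20T10:00:00Z"},
    {"agent_id": "a2", "parameters": {"limit": 200},
     "boundary_conditions": {"max": 1000}, "timestamp": "2026-02-20T10:01:00Z"},
    {"agent_id": "a3", "parameters": {},
     "boundary_conditions": {"max": 1000}, "timestamp": "2026-02-20T10:02:00Z"},
]
score = attribution_completeness(sample)
```

Tracked over time, a declining score surfaces attribution gaps before an examiner or auditor does.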
Implementation Considerations
For financial institutions evaluating their governance readiness for agentic AI:
Assessment Phase (30-60 days):
Complete inventory of AI systems, distinguishing between traditional predictive models and agentic deployments
Map current governance cadence against actual AI decision velocity for each system
Identify gaps where manual review cannot keep pace with operational requirements
Document current accountability structures and decision traceability mechanisms
Implementation Phase (90-180 days):
Deploy automated decision registries for highest-velocity systems
Establish continuous monitoring for regulatory boundary violations
Define and instrument measurable governance metrics
Develop escalation protocols for real-time human intervention
Maturity Phase (6-12 months):
Integrate governance controls into system architecture ("governance by design")
Develop internal expertise in accountability and transparency governance practices
Conduct maturity assessment to benchmark effectiveness and identify improvement areas
Establish feedback loops for continuous governance refinement
Looking Forward
The Treasury's release of the AI Lexicon and FS AI RMF represents important progress in establishing common understanding around AI governance in financial services. Deputy Secretary Derek Theurer's emphasis on "practical resources that institutions can use" reflects appropriate recognition that effective governance requires implementable frameworks, not merely aspirational principles.
However, as institutions move from predictive AI to agentic systems, governance frameworks must evolve to match operational reality. This doesn't mean abandoning established principles around transparency, accountability, and risk management. Rather, it requires rethinking how these principles are operationalized when AI systems operate at machine speed with autonomous decision-making authority.
The institutions that successfully navigate this transition will be those that view governance not as a constraint on AI deployment, but as the architectural foundation that enables safe scaling of autonomous systems. In an environment where regulatory scrutiny of AI is intensifying, from the EU AI Act to U.S. state-level regulations, and where stakeholder expectations around transparency are rising, robust governance becomes a competitive advantage rather than a compliance burden.
References
Recent Developments:
U.S. Department of the Treasury: AI Lexicon and Financial Services AI Risk Management Framework (FS AI RMF), February 19, 2026
Singapore's Model Governance Framework for Agentic AI, January 22, 2026
Foundational Frameworks:
NIST AI Risk Management Framework (AI RMF)
ISO/IEC 42001: AI Management System requirements
