Three years ago, every finance leader had an RPA program. Today, most of those programs are quietly being decommissioned. The bots break. The maintenance burden grew faster than the savings. The use cases that worked — copy-paste from one screen to another — are already invisible inside modern SaaS.
Meanwhile, "AI" has been on every vendor pitch since November 2022. Most of it is a chatbot pasted on top of a workflow.
The actual shift — the one that changes how finance and procurement work — is agents. Not generative AI suggestions. Not RPA bots. Agents: software that perceives a situation, decides what to do, takes the action, and learns from the outcome.
This piece is the practitioner's guide to what that means. What an agent actually is (and is not). Why finance and procurement are the killer use case. What to look for in an agent platform. And why "open" matters more than "how many".
The category confusion
The word "agent" is overloaded right now. Five different things get marketed under the same label. Let us be precise.
| Pattern | What it does | What it does NOT |
|---|---|---|
| RPA bot | Replays clicks and keystrokes on a UI | Reason about exceptions; adapt when the UI changes |
| Chatbot | Answers questions from a knowledge base | Take action in source systems |
| Copilot | Drafts content the human reviews | Decide; act autonomously |
| Workflow automation | Runs a deterministic graph of pre-defined steps | Plan its own steps |
| AI Agent | Perceives state, plans steps, calls tools, acts, learns | None — but only if all five capabilities are present |
If a vendor calls something an "agent" and it is missing any of autonomous goal pursuit, dynamic planning, tool use, action authority, or a learning loop, it is something else. Usually it is a copilot or a workflow with an LLM glued to it.
The distinction matters because the capability you can promise the buyer is fundamentally different. A copilot saves the user a few minutes per task. An agent removes the task from the user's day.
Why finance and procurement is the killer use case
A lot of agent demos look the same: a Slack-like UI, a question, a tool call, an answer. Cute, but rarely the point.
The reason finance and procurement is where agents will land first — and hardest — is structural:
1. The data is structured and digitized. Invoices, POs, contracts, payments, GL postings — these are not free-form prose. They are typed entities with known schemas. An agent does not have to guess what an invoice is. It can reason about it.
2. The KPIs are crisp and shared. Days sales outstanding (DSO), days payable outstanding (DPO), working capital, processing cost per invoice, error rate, on-time payment rate. There is no debate about whether the agent did better. The number is the number.
3. The work is high-volume, repetitive, and surprisingly judgement-light. 80% of invoices follow standard patterns. 80% of POs against framework contracts are routine. 80% of supplier-onboarding flows are KYS / KYB / KYC checks against the same data sources. The 80% is what an agent eats. The 20% is what the human still does — but with two hours instead of forty.
4. The downside of an agent error is bounded and reversible. A wrong payment can be reversed. A wrong PO can be cancelled. A wrong invoice approval can be re-routed. Compare this with healthcare, legal, or aerospace, where mistakes have step-function consequences. Finance is forgiving enough to learn in production.
5. The economic value is enormous and trackable. A 70% reduction in invoice processing time on a 100K invoices/year operation is roughly 2 to 3 FTEs of savings, depending on how many minutes each invoice takes today. The CFO sees it in the budget. The procurement leader sees it in supplier satisfaction. The board sees it in the EBITDA contribution.
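The FTE figure in point 5 can be reproduced with back-of-envelope arithmetic. The sketch below is illustrative only: the handling time per invoice (4 minutes) and the productive hours per FTE per year (1,800) are assumptions that do not come from the text, and the result moves with both.

```python
# Back-of-envelope FTE savings. Inputs marked "assumed" are illustrative, not measured.
INVOICES_PER_YEAR = 100_000      # from the example in the text
TIME_REDUCTION = 0.70            # from the example in the text
MINUTES_PER_INVOICE = 4.0        # assumed manual handling time per invoice
HOURS_PER_FTE_YEAR = 1_800       # assumed productive hours per FTE per year


def fte_saved(invoices: int, minutes_each: float, reduction: float, hours_per_fte: float) -> float:
    """Return the full-time equivalents freed by a given cut in handling time."""
    baseline_hours = invoices * minutes_each / 60
    return baseline_hours * reduction / hours_per_fte


saved = fte_saved(INVOICES_PER_YEAR, MINUTES_PER_INVOICE, TIME_REDUCTION, HOURS_PER_FTE_YEAR)
print(f"Estimated FTEs saved: {saved:.1f}")  # Estimated FTEs saved: 2.6
```

Under these assumptions the estimate lands inside the 2 to 3 FTE range. Plug in your own handling time before quoting a number to the CFO.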
This is the rare combination — structured data, clear KPIs, high-volume routine work, bounded risk, measurable economic value — that makes finance and procurement the most fertile ground for agents in the enterprise.
The six things an agent must do
If you are evaluating an agent platform — for finance, procurement, or anything else — these are the capabilities to inspect.
Perceive. The agent must observe the state of the world it operates in. For an invoice processing agent: read the invoice, retrieve the matching PO, fetch the supplier record, check the contract, look up the buyer's authority. Without grounded perception, the agent hallucinates.
Plan. Given a goal ("process this invoice") and the perceived state, the agent must decide the sequence of steps. For a non-PO invoice, the plan branches: classify the spend, infer the cost center, route for approval based on amount, validate against budget. The plan is not pre-written; it is generated.
Use tools. The agent must call concrete tools — APIs, queries, action endpoints — to gather information and take action. No tool use, no agent. This is what the Model Context Protocol (MCP) standardized for the industry: a typed contract for what an agent can do.
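To make "typed contract" concrete, here is a minimal sketch of a tool whose parameters are declared up front and checked on every call. It is not the MCP SDK or its wire format. The `Tool` class, the `get_purchase_order` name, and the in-memory `_PURCHASE_ORDERS` data are all hypothetical stand-ins.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass(frozen=True)
class Tool:
    """A hypothetical typed tool descriptor (illustrative, not the MCP SDK)."""
    name: str
    params: dict[str, type]          # parameter name -> expected Python type
    handler: Callable[..., Any]

    def call(self, **kwargs: Any) -> Any:
        # Reject calls that do not match the declared contract.
        missing = self.params.keys() - kwargs.keys()
        extra = kwargs.keys() - self.params.keys()
        if missing or extra:
            raise TypeError(f"{self.name}: missing={sorted(missing)} extra={sorted(extra)}")
        for key, expected in self.params.items():
            if not isinstance(kwargs[key], expected):
                raise TypeError(f"{self.name}: {key} must be {expected.__name__}")
        return self.handler(**kwargs)


# Illustrative in-memory data standing in for an ERP lookup.
_PURCHASE_ORDERS = {"PO-4471": {"supplier": "ACME", "amount": 1200.0}}

get_po = Tool(
    name="get_purchase_order",
    params={"po_number": str},
    handler=lambda po_number: _PURCHASE_ORDERS.get(po_number),
)

print(get_po.call(po_number="PO-4471"))  # {'supplier': 'ACME', 'amount': 1200.0}
```

The point of the contract is that a malformed call fails loudly before it reaches a source system, instead of producing a plausible-looking wrong answer.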
Act. The agent must have action authority — the ability to write to source systems with proper credentials, audit trails, and reversibility. A "read-only" agent is a chatbot. A real agent posts the journal entry, schedules the payment, sends the email, updates the supplier record.
Reason about uncertainty. Every action carries confidence. The agent must know when it knows, and know when to escalate to a human. Without confidence-aware action, you ship hallucinations into production.
Learn. Outcomes feed back. The agent that misclassified a marketing invoice as IT spend last month should not make the same mistake this month. The agent that flagged a supplier as risky and was overridden by the buyer should incorporate that signal.
If any of these six is missing, you are not buying an agent. You are buying a copilot or a workflow.
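The six capabilities can be sketched as one small loop. This is a toy under stated assumptions, not a production design: the `InvoiceAgent` class, its step names, the 0.8 confidence floor, and the fixed confidence scores are all invented for illustration. Dictionary lookups stand in for tool calls, and appending to a list stands in for writing to the ERP.

```python
from dataclasses import dataclass, field


@dataclass
class InvoiceAgent:
    """Toy agent illustrating the six capabilities; all logic is illustrative."""
    confidence_floor: float = 0.8
    corrections: dict[str, str] = field(default_factory=dict)  # supplier -> category
    posted: list[dict] = field(default_factory=list)

    def perceive(self, invoice: dict, purchase_orders: dict) -> dict:
        # Perceive (via a lookup standing in for a tool call): fetch the matching PO.
        return {"invoice": invoice, "po": purchase_orders.get(invoice.get("po_number"))}

    def plan(self, state: dict) -> list[str]:
        # Plan: the steps depend on what was perceived.
        if state["po"] is not None:
            return ["match_po", "post"]
        return ["classify_spend", "post"]

    def classify(self, invoice: dict) -> tuple[str, float]:
        # Learned corrections win; otherwise return a low-confidence default.
        supplier = invoice["supplier"]
        if supplier in self.corrections:
            return self.corrections[supplier], 0.95
        return "uncategorized", 0.40

    def run(self, invoice: dict, purchase_orders: dict) -> str:
        state = self.perceive(invoice, purchase_orders)
        for step in self.plan(state):
            if step == "match_po" and state["po"]["amount"] != invoice["amount"]:
                return "escalated: amount mismatch"
            if step == "classify_spend":
                category, confidence = self.classify(invoice)
                # Reason about uncertainty: hand off to a human below the floor.
                if confidence < self.confidence_floor:
                    return "escalated: low confidence"
                invoice = {**invoice, "category": category}
            if step == "post":
                self.posted.append(invoice)  # Act: stands in for a write to the ERP
        return "done"

    def learn(self, supplier: str, category: str) -> None:
        # Learn: a human override becomes a signal for the next run.
        self.corrections[supplier] = category


agent = InvoiceAgent()
pos = {"PO-1": {"amount": 100.0}}
print(agent.run({"supplier": "ACME", "po_number": "PO-1", "amount": 100.0}, pos))  # done
print(agent.run({"supplier": "AdCo", "amount": 50.0}, pos))  # escalated: low confidence
agent.learn("AdCo", "marketing")
print(agent.run({"supplier": "AdCo", "amount": 50.0}, pos))  # done
```

The third call succeeds only because the human correction from the second was fed back, which is the learning loop in miniature.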
The "open platform" argument (why agent count is the wrong number)
Most agent platforms market a count: "5 finance agents", "8 procurement agents". This framing tells you the wrong story. It positions the platform as a closed catalog, a SKU list of pre-baked bots. Sales teams love it because it is concrete.
The reality is different. Finance and procurement are infinite in shape because every business is configured slightly differently. The supplier-onboarding flow at a French manufacturer is not the same as at an Indian SaaS company. The treasury optimization at a multi-entity industrial group is not the same as at a single-entity retailer. The intake-to-pay flow at a healthcare distributor has compliance gates that a media holding company will never have.
A closed catalog of 5 or 8 agents will almost certainly miss dozens of use cases per customer.
The right question is: does the platform let me build agents I have not seen pre-built? This is the difference between a closed product and an open platform.
The architectural test: an open agent platform exposes a substrate — typed data, typed actions, governance — that any agent (vendor-built, customer-built, partner-built) can run on. The same orchestration engine, the same audit trail, the same policy enforcement, the same reversibility, regardless of which agent is acting.
The closed catalog test: a closed product hardcodes 8 use cases and lets you configure them at the edges.
We optimize Flowie for the open architecture. Dozens of agents are pre-built and ready to deploy (invoice processing, supplier onboarding, approval copilot, collection, compliance, anomaly detection, sourcing, treasury, and more), but the platform is what turns dozens into an open-ended set. Build your own on Astral via MCP. Same substrate, same governance, same audit trail.
This matters because the agent SKU you need next year is not on this year's roadmap. Yours or ours. The platform that lets you ship that agent in days is the one that lasts.
The part most agent demos skip: governance
Every agent demo shows the happy path. The invoice arrives, the agent matches it, the agent approves it, everyone goes home. Wonderful.
The CFO's question is not about the happy path. It is about the unhappy one.
What happens when the agent is wrong?
This is where serious agent platforms diverge from demos.
Six governance capabilities to require:
- Policy engine. Hard rules the agent cannot break. "No payment over €50K without controller approval." "No supplier onboarding without KYS clearance." The agent's plan is constrained against policy at planning time, not at execution time.
- Human-in-the-loop by default. Every consequential action defaults to human review. The customer dials autonomy up, not down. A finance team that has been burned by RPA does not want full autonomy on day one.
- Confidence thresholds. The agent escalates anything below a configurable confidence level. Configurable per use case, per amount, per supplier tier.
- Provenance trail. Every action has a lineage — this was decided based on PO 4471 retrieved at 14:02, supplier record version 7, contract clause 4.3, controller policy 2.1. When the auditor asks why, the answer is in the graph.
- Audit log. Immutable, queryable, exportable. ISO 27001 / CyberVadis / regulatory audit-ready.
- Reversibility. Every action is reversible. The agent that approved an invoice in error can un-approve it, and the system records both the original and the reversal.
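The first two items in the list above, policy checks at planning time and human review by default, might look like the following sketch. The `PlannedAction` type, the function names, and the shape of a policy (a function returning a violation message or `None`) are assumptions made for illustration. Only the €50K controller rule comes from the text.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class PlannedAction:
    kind: str                                  # e.g. "payment"
    amount_eur: float
    approvals: frozenset[str] = frozenset()    # roles that have already approved


# A policy returns a violation message, or None when the action is allowed.
Policy = Callable[[PlannedAction], Optional[str]]


def controller_approval_over_50k(action: PlannedAction) -> Optional[str]:
    if (
        action.kind == "payment"
        and action.amount_eur > 50_000
        and "controller" not in action.approvals
    ):
        return "payment over EUR 50K requires controller approval"
    return None


def check_plan(plan: list[PlannedAction], policies: list[Policy]) -> list[str]:
    """Validate the whole plan before anything executes; return every violation."""
    violations = []
    for action in plan:
        for policy in policies:
            message = policy(action)
            if message is not None:
                violations.append(message)
    return violations


def needs_human_review(action: PlannedAction, autonomous_kinds: frozenset[str] = frozenset()) -> bool:
    # Human review is the default; autonomy is opted into per action kind.
    return action.kind not in autonomous_kinds
```

Because `check_plan` runs over the full plan before execution, a violating step blocks the plan up front instead of failing halfway through a partially applied sequence.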
Without these six, you have built a faster way to make mistakes at scale. With them, you have built a system that runs in production.
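Provenance, audit log, and reversibility can share one data structure. The sketch below assumes an append-only list in which each entry's hash covers the previous entry's hash. That makes tampering detectable; it does not make storage immutable, which would need write-once infrastructure underneath. The `AuditLog` class and the `INV-1` identifier are hypothetical.

```python
import hashlib
import json
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class AuditLog:
    """Append-only log; each entry's hash covers the previous one, so edits are detectable."""
    entries: list[dict] = field(default_factory=list)

    def append(self, action: str, provenance: list[str], reverses: Optional[int] = None) -> int:
        prev_hash = self.entries[-1]["hash"] if self.entries else ""
        body = {"action": action, "provenance": provenance, "reverses": reverses, "prev": prev_hash}
        digest = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
        self.entries.append({**body, "hash": digest})
        return len(self.entries) - 1  # the entry index doubles as its id

    def reverse(self, entry_id: int, reason: str) -> int:
        # A reversal is a new entry; the original is never edited or deleted.
        original = self.entries[entry_id]
        return self.append(f"reverse:{original['action']}", [reason], reverses=entry_id)

    def verify(self) -> bool:
        """Recompute the chain and report whether any entry was altered."""
        prev_hash = ""
        for entry in self.entries:
            body = {k: entry[k] for k in ("action", "provenance", "reverses", "prev")}
            expected = hashlib.sha256(json.dumps(body, sort_keys=True).encode()).hexdigest()
            if entry["prev"] != prev_hash or entry["hash"] != expected:
                return False
            prev_hash = entry["hash"]
        return True


log = AuditLog()
approval = log.append(
    "approve_invoice:INV-1",
    ["PO 4471", "supplier record version 7", "contract clause 4.3", "controller policy 2.1"],
)
log.reverse(approval, "approved in error")
print(len(log.entries), log.verify())  # 2 True
```

Both the approval and its reversal stay in the log, so an auditor can see what was decided, on what evidence, and how it was undone.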
What changes when humans become the orchestrator (not the executor)
The interesting shift is not that agents replace humans. It is that agents invert the human role.
Before: humans execute the workflow. They open the invoice, copy data, paste into the ERP, click approve, draft the email, click send. The agent (or RPA bot) is a tool the human uses.
After: agents execute the workflow. The human reviews exceptions, approves edge cases, sets policy, looks at exception trends, redesigns the workflow when patterns shift. The human is the orchestrator the agent reports to.
This is a different job. It is a more interesting one. It also requires fewer people for the same volume — which is the entire economic argument.
For the AP team that processes 100K invoices a year today with 8 people, the post-agent operation processes 200K invoices a year with 3 people. The 5 redeployed people work on supplier strategy, exception analysis, and process redesign — the work that was previously deprioritized because routine work consumed the calendar.
This is the strategic conversation finance leaders should be having now. The question is no longer "should we deploy AI?"; that is settled. The question is "what does my org look like in 18 months when 60% of routine finance and procurement work is agent-executed?" The teams that answer that early get to redesign. The teams that wait get reorganized.
A 90-day roadmap to your first production agent
Concretely, what does the first 90 days look like?
Days 0–30: Pick the highest-volume, lowest-risk use case. For most teams, this is invoice processing. Structured data, clear KPI (cycle time + cost per invoice), bounded downside (errors are reversible), measurable ROI in 60 days.
Days 30–60: Deploy with high-touch human-in-the-loop. Set the confidence threshold high. Have the agent process every invoice, but have a human review every action for the first 30 days. Build trust with the controller before turning autonomy up.
Days 60–90: Tune autonomy by class. Routine invoices against framework contracts under €5K? Full autonomy. Anything above the threshold? Human review. Anything novel? Escalate. The agent will earn autonomy class by class.
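The class-by-class rule for days 60 to 90 fits in a small routing function. This is a sketch under assumptions: the `Invoice` fields and the `seen_pattern_before` flag (standing in for however novelty is detected) are invented for illustration. The €5K limit is the one from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Invoice:
    amount_eur: float
    against_framework_contract: bool
    seen_pattern_before: bool   # False marks a novel case (hypothetical signal)


AUTONOMY_LIMIT_EUR = 5_000  # threshold from the text; tune per class


def route(invoice: Invoice) -> str:
    """Return 'escalate', 'human_review', or 'auto' for one invoice."""
    if not invoice.seen_pattern_before:
        return "escalate"
    if invoice.against_framework_contract and invoice.amount_eur < AUTONOMY_LIMIT_EUR:
        return "auto"
    return "human_review"


print(route(Invoice(1_200.0, True, True)))    # auto
print(route(Invoice(12_000.0, True, True)))   # human_review
print(route(Invoice(300.0, True, False)))     # escalate
```

Novelty is checked first on purpose: an unfamiliar invoice is escalated even when it is small and sits under a framework contract.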
After 90 days, you have one agent in production with measured ROI, an audit trail, a stakeholder (the controller) who trusts the system, and an operating model. From there, the second agent (supplier onboarding, or approval routing, or collection) takes 30 days, not 90. Each subsequent agent inherits the substrate.
The bottom line
The vendors selling you "an AI agent for finance" are mostly selling you copilots with confidence. The vendors selling you "8 agents" are selling you a closed catalog that will not match your shape.
The platform that matters is the one that gives you:
- A substrate any agent can run on (typed data, typed actions, governance)
- Dozens of pre-built agents to start with
- The ability to build agents you have not yet imagined
That is the architecture finance and procurement need. That is what we have built with Flowie's open agentic platform — running on Astral, our knowledge graph substrate.
If you are evaluating where agents land in your stack, explore the AI Agents page or book a call. We will walk you through what is in production today, what governance looks like, and how the first 90 days actually go.
The shift is real. The work is structural. The platforms that lock you in will not be the ones that win. The open ones will.

