Given the evolving vendor landscape and the growing number of ways AI is being integrated throughout platforms, organizations looking to adopt procurement software are understandably keen to cut through the noise and understand how vendors truly compare against each other. This is where we are often brought in as analysts.
Comparing procurement technologies, including their AI capabilities, and explaining what they do and why it matters is the core competency of Spend Matters analysts.
What has changed is not just the volume of AI claims, but how intelligence is being applied inside procurement tech platforms. Instead of appearing mainly as standalone features, AI is increasingly influencing how requests move through the system, how work is routed and how exceptions are handled. As a result, what an agent does is closely tied to how the platform is designed, which makes simple, feature-by-feature comparisons misleading.
However, the way in which organizations approach these decisions may be fundamentally flawed. Tech selection should always be about finding the software provider that best fits the particular organization, usually based on a combination of technical capabilities, services, integrations/industry-specific support and, of course, price. But because vendors are so focused on AI, the questions we get from these organizations increasingly stray from finding the best fit. Instead, we are typically asked, “Whose AI agents are better?” as though that has already been settled as the main driver behind the decision. This framing implicitly assumes that agents are standalone assets, comparable in isolation, rather than behaviors that emerge from underlying data models, orchestration logic and control structures within the platform.
There are several reasons why this is not the best way to approach the decision, the first being that the question cannot be answered objectively. We regularly receive product demos from the relevant vendors, and these largely form the backbone of our understanding of what each vendor can (and cannot) do. But due to the probabilistic nature of agents, what we see in a demo is not necessarily indicative of how that same feature will work in a customer’s environment. The agent may behave differently once it is using real customer data, policies and integrations. We have methods for probing this in our demos, but simply seeing an agent execute a task in a demo is not sufficient evidence of its usefulness. The outcome is what matters most, and there are several organization-specific factors that contribute to that outcome. In practice, agent behavior is constrained less by model capability than by platform architecture: document-centric data models, static workflows, brittle integrations and weak state management often determine how much an agent can do before human intervention is required.
Another reason is ‘agentwashing,’ an extension of ‘AI-washing,’ which itself descends from ‘greenwashing.’ Essentially, software providers are claiming that features are agentic when that is not always the case. For example, a rules-based workflow that leverages an LLM at some point in the process is not agentic; for the process to truly be agentic, there must be genuine autonomy and orchestration. The agents need the autonomy to decide, at least partially, how to achieve their micro-goals, and there must be an orchestration layer or agent that decides which specialized agents to trigger and when. Supervision or quality-control agents are also useful as an additional governance layer. While these capabilities do exist, much of what is being promoted as ‘agentic’ does not meet these criteria. In many procurement tech platforms, intelligence is still layered onto deterministic workflows, meaning AI can optimize steps but cannot reframe decisions or resolve trade-offs. This creates the illusion of agency without materially changing execution behavior.
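To make that distinction more concrete, below is a minimal sketch, in Python, of how an orchestrated, supervised set of agents differs from a fixed rules-based pipeline. Every class, agent name and routing rule here is a hypothetical illustration, not a description of any vendor’s actual architecture.

```python
# Hypothetical sketch: a fixed, rules-based pipeline versus an orchestrated
# set of agents with a supervision step. Names and thresholds are illustrative
# only and do not correspond to any real vendor's product.

from dataclasses import dataclass, field


@dataclass
class Request:
    category: str
    amount: float
    notes: str = ""
    trail: list = field(default_factory=list)  # audit trail of decisions


def rules_based_pipeline(req: Request) -> Request:
    """Deterministic workflow: the path is fixed in advance. Even if an LLM
    summarizes the notes somewhere along the way, the steps never change."""
    req.trail.append("validate")
    req.trail.append("three_bids" if req.amount > 50_000 else "catalog_buy")
    req.trail.append("approve")
    return req


class IntakeAgent:
    def run(self, req: Request) -> str:
        # An agent decides *how* to pursue its micro-goal, e.g. asking for
        # clarification instead of blindly passing the request along.
        if not req.notes:
            req.trail.append("intake: requested clarification from requester")
            return "needs_clarification"
        req.trail.append("intake: classified and enriched request")
        return "ready"


class SourcingAgent:
    def run(self, req: Request) -> str:
        req.trail.append("sourcing: chose execution path based on risk and spend")
        return "sourced"


class SupervisorAgent:
    def review(self, req: Request) -> bool:
        # Governance layer: checks the other agents' work before execution.
        ok = len(req.trail) > 0 and req.amount < 250_000
        req.trail.append(f"supervisor: {'approved' if ok else 'escalated to human'}")
        return ok


class Orchestrator:
    """Decides which specialized agent to trigger next, rather than
    following a predefined sequence."""

    def __init__(self):
        self.intake = IntakeAgent()
        self.sourcing = SourcingAgent()
        self.supervisor = SupervisorAgent()

    def handle(self, req: Request) -> Request:
        status = self.intake.run(req)
        if status == "needs_clarification":
            return req  # pauses here instead of forcing the request through
        self.sourcing.run(req)
        self.supervisor.review(req)
        return req


if __name__ == "__main__":
    fixed = rules_based_pipeline(Request("laptops", 80_000, "standard refresh"))
    agentic = Orchestrator().handle(Request("laptops", 80_000, "standard refresh"))
    print("rules-based:", fixed.trail)
    print("agentic:    ", agentic.trail)
```

The contrast is in the trails: the rules-based pipeline always walks the same steps, while the orchestrator can pause for clarification, choose paths and route its output through a supervising agent before anything executes.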
Another reason is that, as mentioned before, this is not the ideal way to approach a software selection. This is both because of the multitude of factors that impact how the ideal vendor would serve each specific customer and because the ‘best’ agent is relative. The answer would depend on what the organization wants the agent to do, why it needs it and what outcome it expects. An agent designed to reduce manual effort through confidence-based automation will look very different from one intended to manage ambiguity, prioritize exceptions or dynamically select execution paths across modules. Comparing them without anchoring on outcomes is not as revealing as one may assume.
Overall, we would advise against making the ‘best agents’ question the primary factor during software selection. What is ‘best’ will differ based on each organization and its desired outcomes, and it is also important that the software providers have platforms that are extensible enough to allow other agents (i.e., ones that were not developed by that same vendor) to interact with the data and other agents on those platforms. This extensibility is not just about APIs, but about whether the platform can support shared context, durable state, explainability and orchestration across agents rather than treating them as isolated automations. Keeping this in mind will also allow organizations to better understand whose agents actually match the definition of the term, which will better enable these companies to achieve the outcomes they expect. In practice, the more durable differentiator will not be whose agents appear more capable in a demo, but whose platforms allow agent behavior to evolve without constantly adding rules, workflows or manual oversight.
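As an illustration of what that extensibility could mean beyond APIs, the sketch below assumes a hypothetical plugin-style platform in which vendor-built and third-party agents register against a shared, durable context with an explainability trail. The interfaces and names are ours, not any provider’s actual design.

```python
# Hypothetical sketch of an extensible agent platform: third-party agents
# register against a shared context with durable state and an audit trail,
# instead of running as isolated automations. All names are illustrative.

from abc import ABC, abstractmethod
from typing import Any


class SharedContext:
    """Durable, shared state that every agent reads and writes, plus an
    explainability trail of who changed what and why."""

    def __init__(self):
        self.state: dict[str, Any] = {}
        self.trail: list[str] = []

    def write(self, agent: str, key: str, value: Any, reason: str) -> None:
        self.state[key] = value
        self.trail.append(f"{agent} set {key}={value!r} because {reason}")


class Agent(ABC):
    name: str = "agent"

    @abstractmethod
    def run(self, ctx: SharedContext) -> None: ...


class Platform:
    """Registers agents, including ones not built by the platform vendor,
    and runs them against the same shared context."""

    def __init__(self):
        self.ctx = SharedContext()
        self.agents: list[Agent] = []

    def register(self, agent: Agent) -> None:
        self.agents.append(agent)

    def run_all(self) -> SharedContext:
        for agent in self.agents:
            agent.run(self.ctx)
        return self.ctx


class VendorIntakeAgent(Agent):
    name = "vendor_intake"

    def run(self, ctx: SharedContext) -> None:
        ctx.write(self.name, "category", "IT hardware", "classified from request text")


class ThirdPartyRiskAgent(Agent):
    name = "third_party_risk"

    def run(self, ctx: SharedContext) -> None:
        # A non-vendor agent can still see prior decisions and build on them.
        category = ctx.state.get("category", "unknown")
        ctx.write(self.name, "risk_score", 0.3, f"baseline score for {category}")


if __name__ == "__main__":
    platform = Platform()
    platform.register(VendorIntakeAgent())
    platform.register(ThirdPartyRiskAgent())
    result = platform.run_all()
    print("\n".join(result.trail))
```

The point of the sketch is the contract, not the code: any agent that can read prior decisions, write its own with a stated reason and leave an auditable trail can participate in orchestration, regardless of who built it.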
Please reach out if you’d like help or to discuss further.

