Ontology is the real guardrail: How to stop AI agents from misunderstanding your business

Companies are investing billions of dollars in AI agents and infrastructure to transform business processes. However, we see limited success in real-world applications, often due to the inability of agents to truly understand corporate data, policies, and processes.

While we manage integrations well with technologies like API management, Model Context Protocol (MCP) and others, having agents truly understand the “meaning” of data in the context of a given business is a different story. Business data typically resides in different systems in structured and unstructured forms and must be analyzed with a domain-specific business lens.

For example, the term “customer” might refer to one group of people in a sales CRM system and to a different group in a finance system, which might use the tag only for paying customers. One department might define “product” as a SKU; another might define it as a product family; a third as a marketing bundle.

Thus, “product sales” data varies in meaning without agreed-upon relationships and definitions. For agents to combine data from multiple systems, they must understand these different representations. Agents need to know what the data means in context and how to find the right data for the right process. Additionally, schema changes in systems and data quality issues during collection can increase ambiguity and leave agents unsure how to act when such situations arise.

Furthermore, the classification of data into categories such as PII (personally identifiable information) must be strictly followed to comply with standards such as GDPR and CCPA. This requires that the data is properly labeled and that agents can understand and respect this classification. So we see that building a cool demo using agents is very doable – but putting it into production working with real business data is a whole different story.

The ontology-based source of truth

Building effective agentic solutions requires an ontology-based single source of truth. An ontology is a business definition of concepts, their hierarchy and their relationships. It defines terms with respect to business domains and can help establish a single source of truth for data, establish uniform field names, and apply classifications to fields.
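To make the idea concrete, here is a minimal sketch of an ontology fragment for the “customer” concept. It maps system-specific field names to uniform ones and applies a classification to each field. The system names (`crm`, `finance`), field names and classification labels are hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass

@dataclass
class FieldDef:
    """One canonical field, its classification, and its name in each system."""
    canonical_name: str
    classification: str            # hypothetical labels: "PII" or "PUBLIC"
    source_fields: dict[str, str]  # system name -> local field name

# Hypothetical ontology fragment for the "Customer" concept.
CUSTOMER_FIELDS = [
    FieldDef("customer_id", "PUBLIC", {"crm": "lead_id", "finance": "acct_no"}),
    FieldDef("email", "PII", {"crm": "contact_email", "finance": "billing_email"}),
]

def to_canonical(system: str, record: dict, fields: list[FieldDef]) -> dict:
    """Rename a system-specific record to the ontology's uniform field names."""
    out = {}
    for f in fields:
        local = f.source_fields.get(system)
        if local in record:
            out[f.canonical_name] = record[local]
    return out

def redact_pii(record: dict, fields: list[FieldDef]) -> dict:
    """Mask every canonical field that the ontology classifies as PII."""
    pii = {f.canonical_name for f in fields if f.classification == "PII"}
    return {k: ("<redacted>" if k in pii else v) for k, v in record.items()}

crm_row = {"lead_id": "C-17", "contact_email": "a@example.com"}
canonical = to_canonical("crm", crm_row, CUSTOMER_FIELDS)
safe = redact_pii(canonical, CUSTOMER_FIELDS)
```

An agent that only ever sees `safe` records works with one vocabulary across systems and cannot leak a field the ontology has labeled as PII.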

An ontology can be domain specific (healthcare or finance), or organization specific based on internal structures. Defining an ontology in advance is time-consuming, but can help standardize business processes and provide a strong foundation for agentic AI.

An ontology can be realized using common queryable formats such as a triplestore. More complex business rules with multi-hop relationships could use a labeled property graph database such as Neo4j. These graphs can also help companies discover new relationships and answer complex questions. Ontologies such as FIBO (Financial Industry Business Ontology) and UMLS (Unified Medical Language System) are available in the public domain and can be a very good starting point. However, these usually need to be customized to capture the specific details of a business.
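As a rough illustration of what a triple-based representation and a multi-hop question look like, the sketch below stores facts as (subject, predicate, object) tuples in plain Python. It is a toy stand-in, not a triplestore or Neo4j client, and every identifier and predicate in it is hypothetical.

```python
# Hypothetical facts, each stored as a (subject, predicate, object) triple.
TRIPLES = [
    ("SKU-1001", "is_a", "Product"),
    ("SKU-1001", "belongs_to", "HomeAudioFamily"),
    ("SKU-2002", "belongs_to", "HomeAudioFamily"),
    ("HomeAudioFamily", "included_in", "HolidayBundle"),
]

def objects(subject: str, predicate: str) -> list[str]:
    """Return every object linked to `subject` by `predicate`."""
    return [o for s, p, o in TRIPLES if s == subject and p == predicate]

def follow(start: str, path: list[str]) -> list[str]:
    """Follow a sequence of predicates from `start` (a multi-hop query)."""
    frontier = [start]
    for predicate in path:
        frontier = [o for node in frontier for o in objects(node, predicate)]
    return frontier

# Two hops: which marketing bundles contain this SKU's product family?
bundles = follow("SKU-1001", ["belongs_to", "included_in"])
```

Here the SKU, the product family and the marketing bundle are distinct nodes, so the three departmental meanings of “product” stay separate but connected.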

Getting started with ontology

Once implemented, an ontology can be the driving force for business agents. We can now direct AI to follow the ontology and use it to discover data and relationships. If necessary, an agentic layer can serve key details of the ontology itself and discover data. The ontology can also encode business rules and policies that agents must adhere to. This is a great way to ground your agents and create guardrails based on real business context.

Agents designed in this way and tuned to follow an ontology can adhere to the guardrails and avoid hallucinations that can be caused by the large language models (LLMs) that power them. For example, a company policy may specify that unless all documents associated with a loan have their verified flags set to true, the loan should be kept in pending status. Agents can work within this policy, determine which documents are still needed and consult the knowledge base.
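The loan policy in this example can be sketched as a simple rule check. This is an assumption-laden illustration: the `Document` type, the status names and the choice to keep a loan with no documents pending are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class Document:
    name: str
    verified: bool

def loan_status(documents: list[Document]) -> str:
    """Keep the loan pending unless every associated document is verified."""
    # Assumption: a loan with no documents at all also stays pending.
    if documents and all(d.verified for d in documents):
        return "READY_FOR_REVIEW"  # hypothetical next status
    return "PENDING"

def unverified(documents: list[Document]) -> list[str]:
    """List the documents an agent still needs to chase."""
    return [d.name for d in documents if not d.verified]

docs = [Document("income_proof", True), Document("id_scan", False)]
status = loan_status(docs)
todo = unverified(docs)
```

Because the rule is enforced outside the model, an agent cannot move the loan forward by simply asserting that the documents are complete.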

Here is an example implementation:

(Original figure by author)

As illustrated, structured and unstructured data is processed by a document intelligence (DocIntel) agent, which populates a Neo4j database based on a business domain ontology. A data discovery agent finds and queries the right data in Neo4j and passes it on to other agents that handle business process execution. The communication between agents takes place using a popular protocol such as A2A (agent to agent). A new protocol called AG-UI (Agent User Interaction) can help build more generic UI screens to capture the operations and responses of these agents.

This method allows us to avoid hallucinations by forcing agents to follow ontology-driven paths and maintain data classifications and relationships. Furthermore, we can easily scale by adding new resources, relationships, and policies that agents can automatically comply with, and we can control hallucinations by defining rules for the entire system rather than for individual entities. For example, if an agent hallucinates an individual “customer”, the connected data for that customer cannot be verified during data discovery, so we can easily detect the anomaly and plan to eliminate it. This helps the agent system scale with the business and manage its dynamic nature.
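One way to picture the hallucinated-customer check is the sketch below, which treats the graph as an in-memory dictionary rather than a real Neo4j database. The node identifiers, the relation names and the rule that a customer must have at least one account are hypothetical.

```python
# Hypothetical graph: node id -> set of (relation, target node id) edges.
GRAPH = {
    "customer:C-17": {("has_account", "account:A-9"), ("placed", "order:O-55")},
    "account:A-9": set(),
    "order:O-55": set(),
}

# Assumed system-wide rule: every real customer has at least one account.
REQUIRED_RELATIONS = {"has_account"}

def is_verifiable(node_id: str, graph: dict = GRAPH) -> bool:
    """A node is verifiable if it exists and its required links resolve."""
    edges = graph.get(node_id)
    if edges is None:
        return False
    # Only count relations whose target node is actually in the graph.
    relations = {rel for rel, target in edges if target in graph}
    return REQUIRED_RELATIONS <= relations

def flag_anomalies(proposed_ids: list[str]) -> list[str]:
    """Return the entities an agent proposed that the graph cannot back up."""
    return [n for n in proposed_ids if not is_verifiable(n)]

anomalies = flag_anomalies(["customer:C-17", "customer:C-999"])
```

The rule lives at the system level, so it applies to any customer an agent mentions without a per-entity check being written by hand.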

A reference architecture like this does add some overhead in data discovery and graph databases. But for a large enterprise, it adds the right guardrails and gives agents the direction they need to orchestrate complex business processes.

Dattaraj Rao is an innovation and R&D architect at Persistent Systems.
