
In Part 1, we looked at how AI is transforming four core data workloads: analytics, data operations, data-powered applications, and security. Each one is embedding AI to work faster and at greater scale. The potential is real, but so is the gap between potential and results.
That gap exists because enterprise data environments are not clean or static. They are highly distributed across platforms, teams, and tools. Data is constantly changing, duplicated, redefined, and repurposed. Definitions drift. Ownership is unclear. Assets accumulate faster than they are curated. What was trusted last quarter may be stale, broken, or risky today.
AI agents are dropped into this environment and expected to perform.
But AI agents need context to function. Without it, they return inconsistent results, miss critical nuances, or hallucinate entirely. An AI agent asked to surface revenue metrics is useless if it can't distinguish between a tested, governed dataset and a sandbox experiment. A security classifier is ineffective if it can't trace where sensitive data flows after it's tagged.
In fast-changing, cluttered, and siloed data environments, this context does not exist in one place. It is scattered across tools and teams: held in metadata, embedded in pipelines, or locked in dashboards.
To make AI work reliably across your data workloads, you need a context platform: a foundation that collects this metadata, connects it across systems, and turns it into something AI can actually use at query time. It provides a consistent understanding of what your data means, how it's used, where it flows, and whether it can be trusted. Only with this foundation can AI operate safely and accurately at enterprise scale.
A context platform is the combination of three capabilities that work together: metadata preprocessing, a classification system, and a strong lineage backbone.
Every data asset in your organization carries information: what it is, how it was created, who owns it, how frequently it's used, when it was last updated. This is metadata, and it extends to lineage, usage patterns, business logic, ownership, quality, and trust signals.
Without centralized metadata, AI operates on incomplete information. It might generate queries against deprecated sources, or rely on stale dashboards or reports. A context platform preprocesses this metadata and makes it accessible at query time. Metadata collection and preprocessing is the foundation that allows AI to generate the right queries and produce reliable results in complex, fast-changing enterprise data environments.
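As a minimal sketch of what "accessible at query time" can mean in practice, consider a per-asset metadata record with a simple freshness check an agent could run before trusting an asset. All field names and the 30-day threshold here are illustrative assumptions, not drawn from any specific product:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class AssetMetadata:
    # Illustrative fields only; a real platform collects many more signals.
    name: str
    owner: str
    created_by: str
    last_updated: datetime
    queries_last_90_days: int
    upstream: list = field(default_factory=list)  # lineage: assets this depends on
    tags: set = field(default_factory=set)        # classification output, e.g. {"trusted"}

def is_stale(asset: AssetMetadata, max_age_days: int = 30) -> bool:
    """A simple freshness check an AI agent could run at query time."""
    age = datetime.now(timezone.utc) - asset.last_updated
    return age.days > max_age_days
```

The point is not the specific fields but that the signals live in one structure the agent can consult per asset, rather than being scattered across tools.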
Basic metadata isn't always enough. You also need a way to interpret it, and create new, higher-order metadata based on rules and logic.
This is what a classification system provides. It takes existing metadata and applies rules to generate meaningful tags automatically. For example: if a dashboard was created in a personal workspace, classify it as non-governed. If a column hasn't been queried in 90 days, flag it as inactive. If a metric is derived from a certified source and passes freshness checks, mark it as trusted.
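The three rules above can be sketched as a small rule function that maps raw metadata to higher-order tags. The dictionary keys and thresholds are illustrative assumptions, but the logic mirrors the examples in the text:

```python
def classify(asset: dict) -> set:
    """Apply simple rules to raw metadata and emit higher-order tags.
    Field names and thresholds are illustrative, not from a real product."""
    tags = set()
    # Rule 1: dashboards created in personal workspaces are non-governed.
    if asset.get("workspace") == "personal":
        tags.add("non-governed")
    # Rule 2: columns unqueried for 90+ days are inactive.
    if asset.get("days_since_last_query", 0) > 90:
        tags.add("inactive")
    # Rule 3: certified sources that pass freshness checks are trusted.
    if asset.get("source_certified") and asset.get("freshness_check_passed"):
        tags.add("trusted")
    return tags
```

In a real system these rules would be declarative and maintained by governance teams, but the shape is the same: deterministic logic over metadata, producing tags an AI agent can filter on.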
Classification turns raw signals into actionable intelligence. It's what allows AI to answer not just "what is this data?" but "should I use this data?"
Metadata tells you what exists. Classification tells you what it means. Lineage tells you how everything connects.
Lineage tracks the relationships between data assets: where data comes from, what transformations have been applied, and where it flows downstream. This serves three critical purposes.
Assessing reliability. Understanding where data comes from and how it was produced reveals whether it is derived from certified, production-grade sources or from experimental and loosely governed workflows. The origin and dependencies of an asset provide powerful signals about its reliability and its readiness for business and AI use.
Investigation. When something breaks, lineage lets you trace the problem. Which upstream source changed? What downstream reports are affected? Without lineage, investigation is slow and manual: teams waste hours hunting through pipelines to find the root cause. With strong lineage, the path is visible immediately.
Propagation. Lineage also allows signals to flow through the data stack. If a source table is flagged as containing PII, that signal can propagate downstream to every dashboard and report that depends on it. If a pipeline's health status changes, dependent assets inherit that property automatically. Without lineage, signals stay stuck where they're created. A security team might classify a table as sensitive, but that classification never reaches the BI layer where analysts are building reports. Lineage is the connective tissue that makes the context platform coherent.
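Propagation can be sketched as a breadth-first traversal over the lineage graph: a tag set at a source flows to every downstream dependent. The asset names and the "pii" tag below are hypothetical examples:

```python
from collections import defaultdict, deque

def propagate_tag(edges, source, tag):
    """Flow a classification (e.g. 'pii') downstream through the lineage graph.
    edges: (upstream, downstream) pairs. Returns asset -> set of tags."""
    downstream = defaultdict(list)
    for up, down in edges:
        downstream[up].append(down)
    tags = defaultdict(set)
    queue, seen = deque([source]), {source}
    while queue:
        node = queue.popleft()
        tags[node].add(tag)  # the signal reaches this asset
        for nxt in downstream[node]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return tags

# Example: a PII tag on a source table reaches a BI dashboard two hops away.
edges = [("raw.users", "dim_customers"), ("dim_customers", "sales_dashboard")]
tags = propagate_tag(edges, "raw.users", "pii")
```

This is why the security team's classification can reach the BI layer without anyone re-tagging assets by hand: the graph, not a person, carries the signal.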
The four data workloads we covered in Part 1 (analytics, security, data-powered applications, and data operations) each depend on context in different ways.
When an AI agent is asked a business question, it needs to understand which metrics are official, how they are defined, which models they rely on, and whether the underlying data is fresh, complete, and actively used.
By drawing on lineage, business logic, usage signals, and trust indicators, a context platform allows AI to select the right assets for a given question and generate queries that reflect how the business actually measures performance. This prevents AI from relying on experimental models, deprecated tables, or inconsistent definitions, and enables analytics agents to deliver accurate, consistent answers at scale.
For example, if a business user asks, "What is our current revenue growth?", an AI agent without context may rely on a model used only for ad-hoc analysis. With a context platform, the AI can identify the governed revenue metric, trace it to its source systems, verify usage, and generate a query that reflects the approved definition.
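The selection step in this example can be sketched as filtering candidate assets by their classification tags and breaking ties by usage. The candidate names, tag vocabulary, and tie-breaking rule are illustrative assumptions:

```python
def pick_metric(candidates):
    """Prefer assets tagged 'trusted'; never select non-governed ones.
    Candidate shape and tags are illustrative, not a real API."""
    governed = [
        c for c in candidates
        if "trusted" in c["tags"] and "non-governed" not in c["tags"]
    ]
    if not governed:
        raise LookupError("no governed metric found; escalate to a human")
    # Break ties by usage: the most-queried governed definition wins.
    return max(governed, key=lambda c: c["queries_last_90_days"])

candidates = [
    {"name": "sandbox.rev_experiment", "tags": {"non-governed"}, "queries_last_90_days": 3},
    {"name": "finance.revenue_growth", "tags": {"trusted"}, "queries_last_90_days": 412},
]
```

Note the explicit failure mode: when no governed asset exists, refusing and escalating is safer than guessing, which is exactly the behavior you want from an analytics agent.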
In data operations, context determines whether changes can be made safely. Pipelines, models, and tests are constantly evolving, and even small changes can have wide-reaching downstream impact. For AI to assist or act in this environment, it needs to understand how data assets are connected, who depends on them, and which changes carry real business risk.
End-to-end lineage is critical for operating data systems safely at scale. It allows AI to understand how upstream sources and transformations influence downstream outputs, so issues can be detected, localized, and explained. Most failures originate in broken or drifting data rather than in models, making upstream monitoring and quality checks essential.
With this context, AI can continuously monitor data behavior, assess the downstream impact of changes, and determine whether an issue affects experimental assets or business-critical workflows. This enables safer changes at scale, allowing teams to move quickly while minimizing the risk of breaking production systems.
In data-powered applications, small data issues can quickly compound into large downstream errors. Predictive models, recommendation systems, and optimization engines continuously consume data and feed their outputs into business decisions. When inputs drift, definitions change, or upstream data degrades, these systems can reinforce incorrect behavior at scale.
A context platform helps prevent this by giving AI visibility into lineage, data changes, and usage patterns across the full data flow. With this context, AI can detect when inputs to a model have changed, assess whether those changes affect model assumptions, and flag or recommend adjustments before errors propagate into production decisions.
This allows data-powered applications to adapt safely over time, maintaining accuracy and reliability even as data and business conditions evolve.
Security teams invest heavily in classifying sensitive data: identifying PII, financial information, and other regulated content in the data warehouse. But BI tools such as Power BI and Tableau often connect through shared service accounts, leaving security teams with little visibility into which business users are accessing sensitive data through dashboards, reports, or AI agents.
Lineage ensures that governance signals propagate across the entire data stack. When a column is tagged as containing PII at the source, that tag flows through transformations and into the BI layer. Reports built on that data inherit the classification automatically.
This solves one of the hardest problems in data security: maintaining consistent governance as data moves and transforms. Instead of re-classifying at every layer, lineage carries signals through, keeping security in sync with how data actually flows.
For Data and Analytics leaders, the context platform is what makes AI safe to deploy at enterprise scale, and that completely changes what's possible.
A sales agent can connect to your warehouse and BI layer to surface business insights and recommend next actions, without surfacing stale data or hallucinating metrics. An HR agent can answer workforce questions while respecting access controls, because sensitivity classifications propagate through lineage automatically. A finance agent can pull numbers for reporting, knowing the underlying sources are fresh and reconciled.
The context platform is what lets you connect agents to enterprise data with confidence. It's the difference between AI that's a liability and AI that's a trusted operator.
The organizations that get this right will be the ones that move beyond experimentation and into real, measurable ROI.