Listen now on YouTube | Spotify | Apple Podcasts | Amazon Music
The spreadsheet might still be the most widely used data catalog on the market. That’s not a joke. It’s a finding from Stewart Bond, Research VP at IDC, who has spent the past decade studying how companies manage intelligence about their data. When I sat down with Stewart on the Data Faces Podcast, he pointed out that every survey he runs surfaces the same contradiction. Organizations rank data quality as their top AI concern, yet they fail to invest in the one technology category designed to address it.
The frustrating part is that the data teams usually know exactly what the problem is. They flag quality and governance issues, but the budget continues to flow towards AI model development and agents instead. Stewart has been tracking this gap longer than most, and his perspective on how data intelligence evolved from an analyst’s shorthand into a global market category offers a useful lens for understanding why the gap persists.
“One of the biggest challenges organizations have is managing the intelligence about their data. Data catalogs, business glossaries, data lineage, all that stuff is so important now as we get into AI. And yet, their top investment categories are not on data catalogs.” — Stewart Bond, Research VP, IDC
About Stewart Bond
Stewart Bond is a Research VP at IDC, where he leads the data intelligence and data integration software research practice. His career spans over 30 years in IT, including a decade as a certified IT architect at IBM before moving into industry analysis in 2011. Outside of work, Stewart is a competitive curler who came within one match of representing Ontario at a Canadian national championship. In our conversation on the Data Faces Podcast, we discuss:
How Stewart coined the term “data intelligence” and watched it become a global market category
The difference between intelligence about data and intelligence from data
Why agentic AI demands a shift-left approach to data quality
What CDOs are most concerned about and where they’re under-investing
How one research note became a market category
Stewart joined IDC in 2015 and inherited a research area covering data integration and data access, which at the time included eight sub-markets like metadata management, data quality, and master data. A conversation with ASG Technologies introduced him to their term “enterprise data intelligence.” Stewart saw something useful in the phrase but dropped the “enterprise” qualifier. Data intelligence, as a simpler label, stuck.
The real momentum came in 2018, when GDPR was about to take effect. Enterprise data leaders started calling Stewart with the same question. “Where can I buy a data governance solution?” His response surprised them. You can’t buy governance. Data governance is an organizational discipline that requires people, processes, and accountability. What you can buy is data intelligence technology, the tools that tell you everything you need to know about your data so you can govern it.
“I had a lot of end-user clients calling me and saying, ‘Where can I buy a data governance solution?’ And I just kind of laughed, because data governance isn’t a technology solution.” — Stewart Bond, Research VP, IDC
Stewart framed this through his 5 W’s of data. Who is using it? How is it being used? Where does it live? What does it mean? Why do you even have it? How long do you have to keep it? These questions form the foundation of effective data governance, and answering them requires technology that most organizations still haven’t fully invested in.[1]
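To make the 5 W’s concrete, here is a minimal sketch of what a catalog entry might look like if those answers were captured as structured metadata. The `DatasetCatalogEntry` class, its field names, and the sample values are all hypothetical, not taken from any vendor’s schema or from Stewart’s framework itself.

```python
from dataclasses import dataclass
from datetime import date

@dataclass
class DatasetCatalogEntry:
    """Hypothetical catalog record capturing the 5 W's for a single dataset."""
    name: str                 # what the dataset is called
    definition: str           # What does it mean?
    owner: str                # Who is accountable for it?
    consumers: list[str]      # Who is using it?
    usage: str                # How is it being used?
    location: str             # Where does it live?
    purpose: str              # Why do you even have it?
    retention_until: date     # How long do you have to keep it?

entry = DatasetCatalogEntry(
    name="orders",
    definition="Confirmed customer orders from the e-commerce platform",
    owner="sales-data-team",
    consumers=["finance-reporting", "fulfillment-agent"],
    usage="Revenue reporting and order fulfillment automation",
    location="warehouse.sales.orders",
    purpose="Contractual obligation to fulfill and invoice orders",
    retention_until=date(2032, 12, 31),
)
```

The point of the sketch is only that the answers become queryable records rather than tribal knowledge in a spreadsheet; real data intelligence platforms manage this metadata at far greater depth.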
Intelligence about data vs. intelligence from data
The term spread faster than Stewart expected. Collibra became “the data intelligence company.” Erwin (now Quest) adopted it for their data catalog. Alation started using it in 2020, after learning the phrase wasn’t a Collibra trademark but an industry-level concept. Informatica wove it into their intelligent data platform messaging. Then, in late 2023, Databricks made a major push with its own version of data intelligence.
The Databricks definition, however, expanded the original meaning. Stewart had always treated data intelligence as intelligence about data. What is this data, where did it come from, who uses it, and how good is it? Databricks extended the concept to include intelligence from data, using the metadata and context layer to generate smarter analytics and AI outcomes from the data itself. The distinction matters because it changes what organizations expect from the category and how they evaluate platforms.
Dave Kellogg was serving as acting CMO at Alation when he first explored the term’s origins with Stewart. After the Databricks announcement, Kellogg reached out with a direct assessment. “I think you did it. I think you created a new market category.” Last year, IBM confirmed the trend by rolling its entire portfolio of data cataloging, quality, lineage, and observability products into IBM watsonx Data Intelligence. IBM’s product leadership told Stewart the renaming was a direct result of his work and the broader market momentum he helped create.
“I’d always treated data intelligence as intelligence about the data. I’d say Databricks has extended it to intelligence from the data, getting more into the case of leveraging that intelligence about the data to make sure you’re using the data intelligently.” — Stewart Bond, Research VP, IDC
Agents can’t wait for clean data
The shift to agentic AI fundamentally changes how organizations need to approach data quality. Traditional analytics workflows gave organizations a buffer. Data moved through batch processes, giving teams time to spot anomalies and intervene before a bad number reached a dashboard. Autonomous agents don’t offer that luxury. An agent monitoring a change data capture stream sees a new order event and starts fulfilling it on the spot. If the data in that event is wrong, the agent acts on it before anyone has a chance to review it.
Stewart describes this as the “shift left” imperative. Data quality, privacy, and integrity all need to move as close to the source as possible, because once data enters the agentic pipeline, there is no batch window to clean it up. Deloitte flagged this as one of four critical data quality challenges for AI, finding that companies building agentic systems need quality controls embedded at the point of data creation, not applied after the fact.[2]
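As a rough illustration of what shifting quality checks left can look like, here is a minimal Python sketch that validates an incoming order event before any agent is allowed to act on it. The event fields, the `validate_order_event` function, and the quarantine handling are illustrative assumptions, not a reference to any specific CDC tool or agent framework.

```python
# Minimal sketch: check a change-data-capture order event at the source,
# before an autonomous agent is allowed to act on it. All names are illustrative.

REQUIRED_FIELDS = {"order_id", "customer_id", "sku", "quantity", "unit_price"}

def validate_order_event(event: dict) -> list[str]:
    """Return a list of quality problems; an empty list means the event passes."""
    problems = []
    missing = REQUIRED_FIELDS - event.keys()
    if missing:
        problems.append(f"missing fields: {sorted(missing)}")
    if event.get("quantity", 0) <= 0:
        problems.append("quantity must be positive")
    if event.get("unit_price", 0) < 0:
        problems.append("unit_price must be non-negative")
    return problems

def handle_event(event: dict) -> None:
    problems = validate_order_event(event)
    if problems:
        # Shift left: quarantine at the source instead of letting a downstream
        # agent fulfill a bad order with no batch window to catch it.
        print(f"quarantined event {event.get('order_id')}: {problems}")
        return
    print(f"event {event['order_id']} passed checks; safe to hand to the agent")

handle_event({"order_id": "A-100", "customer_id": "C-7", "sku": "SKU-1",
              "quantity": 2, "unit_price": 19.99})
handle_event({"order_id": "A-101", "customer_id": "C-8", "quantity": -1})
```

The design choice mirrors Stewart’s point: the check runs where the event is produced, so a bad record never reaches the agent in the first place.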
“You’d better make sure the data in that order event is good and that it’s a real and reliable order event. You may have heard the term shift left. Your data quality, your data privacy, your data integrity all need to be as close to the source as possible.” — Stewart Bond, Research VP, IDC
The challenge extends beyond structured data. Stewart raised a question that most organizations still haven’t answered well. What do you do about the unstructured data that makes up the bulk of enterprise information? Every organization has countless versions of the same PowerPoint file, thousands of PDFs, and documents that LLMs are eager to ingest. Some vendors are starting to crack this problem. Shelf.io, for example, has developed methods to assess the quality of unstructured documents, a capability that seemed impossible just a few years ago.
The broader issue remains, though. Most organizations lack the data context needed to determine whether their unstructured data is safe to use, let alone high-quality.[3] Stewart sees agentic AI as part of the eventual solution. Agents that pre-populate data catalogs and reduce the manual burden on data stewards could finally solve the adoption problem that has held these tools back for years. But that future depends on investing in the foundation today.
The investment gap CDOs can’t ignore
Stewart runs an annual survey of the Office of the Chief Data Officer, and the results tell a consistent story. When you ask CDOs what their biggest organizational concern is, skills top the list. They struggle to find people who can do the work. The second concern is managing expectations around what AI can deliver, not just within their own teams, but across the C-suite, where leadership is under pressure to show results quickly and often treats AI as a magic bullet.
“Their top investment categories are not on data catalogs. Back to the spreadsheet might still be the most widely used data catalog on the market. I don’t have data to prove that, but anecdotally, that could be the case.” — Stewart Bond, Research VP, IDC
What makes this frustrating is that CDOs now have more influence over IT spending than ever before. IDC predicted in 2024 that chief data officers would gain significantly more budget authority by 2025, driven by the fact that every major AI concern in enterprise surveys points to data: quality, correctness, privacy, and security. CDOs are accountable for all of it. Deloitte’s 2025 CDO Survey tells a similar story. These leaders are increasingly expected to demonstrate direct business impact from their data programs, even as their organizations resist the investments required to achieve it.[4]
And yet, when Stewart looks at where enterprises are actually putting their money, the top investment categories are not data catalogs or data quality tools. The data lineage, metadata management, and business glossary capabilities that form the backbone of data intelligence remain underfunded, even as AI programs depend on them.[5] That spreadsheet Stewart mentioned at the top of our conversation? For many organizations, it is still doing the job that a proper data catalog should be doing.
You’ll never score 100%
Stewart closed our conversation with an insight he picked up years before he even joined IDC. A life insurance company told him they had finally accepted that their data would never be 100% clean. Instead of chasing perfection, they started measuring how clean or dirty their data was and feeding that score into their calculations. Their actuaries knew how to work with uncertainty. They just needed the number.
Data intelligence doesn’t promise perfect data. It gives you a clear picture of how much you can trust what you have. Organizations that know the quality of their data before it enters an AI pipeline avoid the costly cycle of debugging outputs that were doomed from the start. A data quality score of 75 means something different from a score of 95, and both are more useful than no score at all. When that score travels alongside the data into an AI model or an autonomous agent, the organization can make informed decisions about how much confidence to place in the output.
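To show what it can mean for a quality score to travel with the data, here is a small sketch in which each record carries its own score and a downstream consumer decides how much confidence to place in it. The thresholds and the `route_by_quality` function are assumptions for illustration, not anything Stewart or IDC prescribes.

```python
# Sketch: let a data quality score travel with each record so downstream
# consumers can decide how much to trust it. Thresholds are illustrative.

def route_by_quality(record: dict, auto_threshold: float = 0.95,
                     review_threshold: float = 0.75) -> str:
    """Decide how a record may be used based on the quality score it carries."""
    score = record.get("quality_score", 0.0)
    if score >= auto_threshold:
        return "use automatically"       # e.g. feed an autonomous agent
    if score >= review_threshold:
        return "use with human review"   # e.g. analytics with a caveat attached
    return "quarantine"                  # too uncertain to act on

for rec in [{"id": 1, "quality_score": 0.97},
            {"id": 2, "quality_score": 0.80},
            {"id": 3, "quality_score": 0.40}]:
    print(rec["id"], route_by_quality(rec))
```

This is the insurance company’s idea in miniature: the score doesn’t make the data perfect, it makes the uncertainty explicit enough to act on.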
Start with Stewart’s 5 W’s. Audit how your organization currently tracks who uses its data, where it lives, and how trustworthy it is. If the answer is a spreadsheet, you have your business case.
The spreadsheet is still winning. It doesn’t have to be.
Listen to the full conversation with Stewart Bond on the Data Faces Podcast.
Based on insights from Stewart Bond, Research VP at IDC, featured on the Data Faces Podcast.
Frequently asked questions
What is data intelligence?
Data intelligence is the category of technology that provides intelligence about your data. It encompasses data catalogs, business glossaries, data lineage, data quality, and metadata management. Stewart Bond, Research VP at IDC, coined the term to describe the tools that answer foundational questions about data, including who uses it, where it lives, what it means, and how trustworthy it is. More recently, vendors like Databricks have expanded the definition to also include intelligence from data, using that context layer to improve analytics and AI outcomes.
How does data intelligence differ from data governance?
Data governance is an organizational discipline that requires people, processes, and accountability. Data intelligence is the technology that supports it. You cannot buy a data governance solution, but you can invest in data intelligence tools that tell you everything you need to know about your data so you can govern it. Organizations that try to solve governance with technology alone tend to fail, according to IDC’s Stewart Bond.
Why does agentic AI require a shift-left approach to data quality?
Traditional analytics workflows gave teams time to spot and fix data issues in batch processes before results appeared on the dashboard. Autonomous AI agents operate in real time and act on data the moment they receive it, with no batch window to clean things up. This means data quality, privacy, and integrity controls need to move as close to the data source as possible. Deloitte identified this as one of four critical data quality challenges for organizations building agentic AI systems.
What are CDOs most concerned about in 2025?
According to IDC’s annual survey of the Office of the Chief Data Officer, skills gaps rank as the top concern. CDOs struggle to find qualified people to do the work. The second biggest concern is managing leadership expectations around what AI can realistically deliver. Despite growing influence over IT budgets, CDOs face a persistent disconnect between the data foundation AI requires and where their organizations actually invest.
Where are organizations under-investing in data intelligence?
IDC survey data show that the top enterprise investment categories are not data catalogs, data quality tools, or data lineage capabilities, even though managing data intelligence is one of the biggest challenges organizations report. Stewart Bond notes that the spreadsheet may still be the most widely used data catalog on the market, a sign that foundational data intelligence technology remains significantly underfunded relative to AI program spending.
Podcast highlights
[0:05] Introduction and Stewart’s background at IDC
[2:31] Stewart’s life outside work, competitive curling, and fishing
[5:00] The origin of the term “data intelligence” and the ASG Technologies connection
[6:44] GDPR drives demand for governance solutions, the 5 W’s of data
[8:15] Collibra, Erwin, Alation, and Informatica adopt the term
[10:00] Databricks expands the definition, Dave Kellogg’s “you created a category” moment
[14:00] IBM rebrands to watsonx Data Intelligence
[18:00] Intelligence about data vs. intelligence from data
[26:00] Agentic AI and the shift-left imperative for data quality
[29:00] Unstructured data quality and Shelf.io
[31:00] What CDOs are most concerned about in 2025
[35:00] Where organizations are under-investing in data intelligence
[36:40] Data quality will never be 100%, the life insurance anecdote
[38:00] Agentic AI and the future of data catalog adoption
About David Sweenor
David Sweenor is a Top 25 AI thought leader, six-time author, and founder of TinyTechGuides. He spent the first half of his career as a practitioner at IBM, building data warehouses and running predictive models, and the second half in product marketing leadership at SAS, Dell, Quest, TIBCO, Alteryx, and Alation. He hosts the Data Faces Podcast, where he talks with the people who are making data, analytics, and AI work in the real world.
Books
- Generative AI Business Applications
- The Generative AI Practitioner’s Guide
- The CIO’s Guide to Adopting Generative AI
Follow David on Twitter @DavidSweenor and connect with him on LinkedIn.
[1]Sweenor, David. “Why the Biggest AI Enthusiasts Care Most About Governance.” TinyTechGuides, January 27, 2026. https://tinytechguides.com/blog/why-the-biggest-ai-enthusiasts-care-most-about-governance/
[2]Deloitte. “Four Data and Model Quality Challenges for AI.” Deloitte AI Institute, 2025. https://www.deloitte.com/global/en/our-thinking/insights/topics/artificial-intelligence/ai-data-quality-challenges.html
[3]Sweenor, David. “Your AI Doesn’t Have a Model Problem. It Has a Data Context Problem.” TinyTechGuides, February 24, 2026. https://tinytechguides.com/blog/your-ai-doesnt-have-a-model-problem-it-has-a-data-context-problem/
[4]Deloitte UK. “CDO Survey 2025.” Deloitte United Kingdom, 2025. https://www.deloitte.com/uk/en/services/consulting/analysis/chief-data-officer-survey.html
[5]Sweenor, David. “Data Lineage for AI: Why Truth Beats Hope in Banking.” TinyTechGuides, December 2, 2025. https://tinytechguides.com/blog/data-lineage-for-ai-why-dotrth-beats-hope-in-banking/