Mastercard’s massive structured data stores drive its success with today’s AI applications

Mastercard is no stranger to massive volumes of structured data, a foundation that has helped make it a pioneer in applying artificial intelligence across its business. Long before generative AI captured corporate attention, Mastercard was leveraging this structured data to apply machine learning to tackle challenges like fraud detection and real-time transaction analysis.
CTO George Maddaloni has been with the company since 2020, right around the time the pandemic hit. The world-altering event drove most commerce online, which had a direct impact on the amount of data Mastercard had to process. That shift made AI and machine learning more important than ever to the credit card industry giant.
Yet it’s not just a simple matter of setting technology loose on a problem. Maddaloni also has to take into account the challenges that come with being a heavily regulated company and what that means in terms of safeguarding the company’s substantial data store both from misuse and hacking.
In spite of that, as AI has become an essential part of nearly every company’s tech strategy, Mastercard is well positioned to be ahead of the pack, mostly because it’s been working on AI for years, long before it was the trendy thing to do.
It’s all about the data
Since data is central to large language models, Mastercard is well prepared. “We've had a long history of having systems that were AI-enabled,” Maddaloni told FastForward. This has required developing a technology strategy and the necessary connectivity to make everything function effectively, while continually modernizing their network infrastructure, whether in the cloud or on-premises. Regardless of location, the goal is ensuring that they have access to large-scale, low-latency capabilities and diverse computing platforms in order to shift workloads around as needed. That could involve the core business functions or newer generative AI use cases.
With a base set of flexible technologies in place, Maddaloni and his team can process vast amounts of data and put it to work. And that’s a good thing because Mastercard deals with a ton of data. We’re talking 159 billion transactions per year. The good news from Mastercard’s perspective is that the data is highly structured around transactions, and that puts the company in a unique position to process it in different ways.
"We have a tremendous amount of data from transactions, but it's pretty structured data, and it's a relatively concrete set of information in terms of payments, so much so that you could think about it as a language in its own right,” he said. Beyond that, having structured data "helps put boundaries around the bigger challenges people have when dealing with massive data sets and the like."
Yet even with that staggering volume of data, the company has years of experience managing those kinds of numbers, which helps ease the burden. “We have been moving data around our organization for a long time and I think we have had a rich history in making sure that we treat that data appropriately,” he said.
Protecting the data
To that end, Mastercard has to work hard to ensure the data they collect is safe as they put it to work across various applications. They have established a framework to help. “We've had a set of data principles that we've been public about, and our position in terms of those principles is really around the protection of personal information first and foremost. So most of the data that we move around is completely anonymized,” he said.
One way to ensure the security and safety of the information is by a process known as tokenization, which IBM defines as “the process of converting sensitive data into a nonsensitive digital replacement, called a token, that maps back to the original.” That means if hackers access these tokens, they tend to have little value.
“Tokenization is a key for our service, so that the actual card information is masked, as well as the information behind the card. That gives us the ability to have control over the data as we move it around,” he said.
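To make the idea concrete, here is a minimal sketch of tokenization as IBM's definition describes it: a sensitive value is swapped for a random token, and a protected lookup maps the token back to the original. This is an illustration only, not Mastercard's implementation. The `TokenVault` class, the `tok_` prefix, and the in-memory dictionaries are all hypothetical; a production system would keep the mapping in a hardened, access-controlled store.

```python
import secrets


class TokenVault:
    """Hypothetical in-memory vault mapping random tokens to card numbers."""

    def __init__(self) -> None:
        self._token_to_card: dict[str, str] = {}
        self._card_to_token: dict[str, str] = {}

    def tokenize(self, card_number: str) -> str:
        # Reuse the existing token so one card always maps to one token.
        if card_number in self._card_to_token:
            return self._card_to_token[card_number]
        # The token is random, so it reveals nothing about the card number.
        token = "tok_" + secrets.token_hex(8)
        self._token_to_card[token] = card_number
        self._card_to_token[card_number] = token
        return token

    def detokenize(self, token: str) -> str:
        # Only a party with access to the vault can recover the original.
        return self._token_to_card[token]


vault = TokenVault()
card = "0000111122223333"  # placeholder value, not a real card number
token = vault.tokenize(card)
original = vault.detokenize(token)
```

Downstream systems would pass `token` around instead of `card`. Because the token is generated randomly rather than derived from the card number, someone who steals it without access to the vault learns nothing about the underlying card.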
For a company with a target on its back, Mastercard has been remarkably successful at protecting customer data, with no evidence of a first-party data breach. While it has been affected by some partner breaches over the years, those were not in the company’s direct control. In fact, beyond a DNS naming issue discovered by a researcher earlier this year, which didn’t appear to impact core systems, there is little evidence of major problems. It’s worth noting that, according to reports, the company corrected the naming error once it was notified.
Making improvements with LLMs
With the kind of data volume that Mastercard deals with, it’s impossible for humans to keep up. That makes it a great use case for AI. While the company has been using AI for years, generative AI required a change in how it trained the newer models. Maddaloni says that combining generative techniques with neural networks has resulted in better, faster models. And the company has seen great results.
“So already our AI engine runs at [around] 1.5 trillion parameters. At an inferencing level, it's an extremely real-time system, but at a training level we run over 100 models,” he said.

That has resulted in real-world benefits. Using fraud detection as an example, Maddaloni says the new models have improved results by up to 30% and reduced false positives by 80%, which he points out is huge. “You don't want to be telling people that this looks like fraud and decline transactions when it's really not. So on both ends of the spectrum, we've really seen benefit from those generative techniques against our specific data.”
If you’re wondering if Mastercard is looking at agents, the answer is yes. At the end of April, the company released Agent Pay. It combines a payment system with a tokenization agent for increased security. The agent can act as a helper for either a customer or a business, depending on the requirement, while the token piece builds on the company’s tokenization framework to help ensure that agentic transactions meet all the security and privacy criteria.
The startup angle
The company has an incubation period for all new technologies including AI, and applies a similar approach to startups as well. In fact, there is a formal evaluation process for startups to interact with Mastercard called Start Path.
“Every year we look at many startups and then adopt somewhere between six to 10 each year into a class. They get coached from our leadership in terms of products they're potentially developing in the payment space that we might want to look at and use,” he said. Mastercard works with the startups to develop their solutions and takes them through a maturation process, supporting them for over a year.
Much like Larry Feinsmith, managing director and head of global tech strategy at JPMorgan Chase, another large organization that partners with startups, Maddaloni recognizes that you can’t ask a startup to build a product strictly for the needs of your company. “It's not that we're coaching them just for Mastercard. We literally are coaching them for the benefit of the entire payments ecosystem,” he said.
Ultimately, whether it’s new AI solutions or working with an early-stage startup, the company keeps the fundamentals in mind at all times. “We keep those foundational thoughts around cyber security and data privacy in everything we do, and we're constantly bringing these new technologies in and testing and making sure that we're providing solutions that make the overall payments ecosystem stronger.”
Feature photo courtesy of Mastercard.