How SuperAnnotate Is Solving AI’s Data-at-Scale Challenge

DTC invested in SuperAnnotate’s Series B in 2025.
The phrase “data is the new oil” was coined almost two decades ago, but it has never been more relevant than it is right now. At this point, many enterprises have digital wells pumping out petabytes of this value-rich material every day. Compared to the early 2000s, the challenge now is less about the scarcity of data and more about how it’s refined and managed for building tomorrow’s AI tech stack.
So how do the likes of Hinge Health and customers of NVIDIA and Databricks build confidence in their AI training data? They turn to SuperAnnotate, a company that blends human domain expertise with machine-level speed and accuracy in data classification, letting companies confidently scale new AI use cases.
“In the first wave, with the ChatGPT launch, it was very, very clear AI has an insane amount of power. It’s also very, very clear that data is what’s fueling these innovations,” co-founder and CEO Vahan Petrosyan said in a recent conversation with us. But for enterprise applications, he argues, we’re still in the early days, and a huge wave of automation is coming as every enterprise asks, “What can we automate?”
Ambition to impact
SuperAnnotate’s origin story is rooted in research and a forward-thinking vision. Vahan and his brother, Tigran, founded the company in 2018, initially setting out to build a business based on Vahan’s PhD research.
“My PhD research resulted in an initial image labeling tool powered by my algorithm. So this was like a huge success at the time. We were thinking, we can build a business based on one algorithm. It was... a wrong assumption. But it was perfect because it enabled us to start from somewhere,” Vahan shared.
Once they moved the company to Silicon Valley, they realized the challenges were bigger than one algorithm. “This is where the huge change happened in our mindset. We were research people who had to turn into business people. We needed to grow a company and not just continue to do research.”
But first, it starts at the annotation layer
In most organizations, data flows in from a multitude of sources and in myriad structures and formats. Before it can be useful in AI applications where it’ll interface with models, bots, and humans, that data needs to be unified and it needs context such as:
- This collection of pixels is a horse.
- These polygons represent the horse’s natural movement.
- This text is the highest-integrity information available on the horse.
- This answer has led to the highest satisfaction when answering a user’s question about the horse.
“What this means is that for enterprises, AI data is only becoming more complex to manage,” Petrosyan said. “So basically, with SuperAnnotate, we’re helping them centralize all their AI data operations with our ‘operating system’ that labels the data and connects it with models, and creates a more continuous workflow to do evaluation and observability, and do monitoring of those models.”
Petrosyan says that what will matter for most enterprise applications is evaluation, monitoring, and observability across unstructured data, along with continual iteration in each of these areas. When you’re building AI systems, the concept of right and wrong is nuanced. Human preferences mean “there could be 10,000 different right answers to a question in an interaction. But, which one is more right than the other when you’re trying to create a best-in-class customer support agent?”
Nuance and integrity in answers are especially important for companies building AI applications in highly error-sensitive areas such as finance, transportation, and healthcare. Consider Flo Health’s “Ask Flo” app, which helps millions of women access clinical-grade answers about reproductive and women’s health topics. “SuperAnnotate enabled us to transform deep clinical expertise into scalable, structured ground truth data,” says Flo Health CTO Roman Bugaev. He also noted that with SuperAnnotate, the company boosted evaluation throughput 10x, strengthening alignment between its medical and engineering teams.
Building the continuous loop
Soon, use cases like Flo Health’s will be much more common. But that creates a new set of challenges for businesses.
This process of funneling data to agentic AI systems doesn’t happen once; it’s a continuous cycle that keeps changing as models evolve, new data comes into play, and business needs change. For example, one of SuperAnnotate’s customers uses 30 different AI-enabled judges to continually assess empathy, accuracy, and factuality in its AI customer support agent. This type of learning and evaluation loop is going to be essential for solving AI problems.
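To make the idea of a judge panel concrete, here is a minimal Python sketch of one pass through such a loop. It is purely illustrative and is not SuperAnnotate’s API or the customer’s actual setup: the judge functions, the `Evaluation` record, the 0.7 threshold, and the sample support answers are all hypothetical stand-ins. In a real system each judge would more likely wrap a model call than a keyword check.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Dict, List, Tuple

# A judge maps (question, answer) to a score in [0, 1].
# These keyword heuristics are hypothetical placeholders for model-backed judges.
Judge = Callable[[str, str], float]


def empathy_judge(question: str, answer: str) -> float:
    # Toy rule: an apology counts as empathetic.
    return 1.0 if "sorry" in answer.lower() else 0.3


def factuality_judge(question: str, answer: str) -> float:
    # Toy rule: the answer must state the (assumed) reference policy.
    return 1.0 if "5 business days" in answer else 0.2


@dataclass
class Evaluation:
    answer: str
    scores: Dict[str, float]  # one score per judge
    overall: float            # mean across all judges
    needs_review: bool        # True if any single judge scored below threshold


def evaluate(question: str, answer: str, judges: Dict[str, Judge],
             threshold: float = 0.7) -> Evaluation:
    scores = {name: judge(question, answer) for name, judge in judges.items()}
    return Evaluation(
        answer=answer,
        scores=scores,
        overall=mean(scores.values()),
        needs_review=min(scores.values()) < threshold,
    )


def review_queue(interactions: List[Tuple[str, str]], judges: Dict[str, Judge],
                 threshold: float = 0.7) -> List[Evaluation]:
    """Return flagged interactions, worst overall score first."""
    results = [evaluate(q, a, judges, threshold) for q, a in interactions]
    flagged = [r for r in results if r.needs_review]
    return sorted(flagged, key=lambda r: r.overall)


JUDGES: Dict[str, Judge] = {
    "empathy": empathy_judge,
    "factuality": factuality_judge,
}

INTERACTIONS = [
    ("Where is my refund?", "Sorry for the delay. Refunds arrive within 5 business days."),
    ("Where is my refund?", "Refunds take a while."),
    ("Where is my refund?", "Refunds arrive within 5 business days."),
]

if __name__ == "__main__":
    for item in review_queue(INTERACTIONS, JUDGES):
        print(f"{item.overall:.2f}  {item.scores}  {item.answer}")
```

The design choice worth noting is that an answer is flagged when any one judge scores it low, even if its average looks acceptable. The flagged answers would then go to human reviewers, and their corrections would feed the next round of training and evaluation, which is what makes the cycle continuous.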
“The bigger pains are monitoring the model, putting guardrails around it and continually improving the system’s performance. And the company preference might change over time, so you also might need to change the behavior of the model to accommodate that,” said Vahan.
Vahan remains optimistic that any speed bumps created by AI adoption will be surmountable. He believes that while some jobs will be automated away, others will be created. He also believes that next-gen challenges, such as evaluating accuracy as AI tool interactions scale exponentially, will create “at least a billion dollar opportunity in just that space only” for the team that can standardize a solution.
Looking ahead
As AI becomes integral to every industry, SuperAnnotate is proving to be indispensable. They’ve got a whole roster of notable customers including Canva, Motorola Solutions, and Databricks. They’ve consistently earned high marks from customers on the highly influential G2 platform and were recently named a Customer Impact Partner of the Year by Databricks. We believe this is just the beginning for Vahan, Tigran, and the entire SuperAnnotate team. We’re honored to be along for the ride 🚀🚀🚀