We are excited to announce our Series B investment in Galileo, an end-to-end GenAI stack.
The AI bottleneck
Companies are increasingly looking to AI to automate tasks and interact with end users. But as with any software product, moving solutions from hacked-together prototypes to production-grade systems requires rigorous testing and production monitoring. When it comes to generative AI, that process is not so simple.
Unlike “traditional machine learning,” where models typically classify data into predefined labels and categories, generative models produce novel outputs such as long-form text, images, or even code. This shift from discriminative to generative outputs introduces new complexities in determining accuracy and reliability. Simply put, there is usually no binary right or wrong answer for the outputs of generative AI. To make things even more complicated, there are effectively infinite ways to prompt and interact with these models, and it is hard to predict how real users will behave. As a result, businesses struggle to measure the quality and trustworthiness of these new systems at scale, and the absence of effective evaluation and observability methods poses a significant barrier to deployment.
This challenge is not new. Galileo’s founder, Vikram Chatterji, recognized this gap during his time at Google, where he worked on NLP applications powered by BERT-based transformers long before today’s LLMs surged in popularity. The inability to effectively evaluate unsupervised and non-discriminative models became a bottleneck for innovation. If you cannot evaluate the output of a language model, how will you test an application powered by one? And if you cannot test an application, how do you expect to deploy it in the first place?
The evaluation unlock
Galileo was founded to address this very problem. The platform offers a comprehensive suite of tools for AI debugging and observability, enabling AI engineers and stakeholders to gain visibility into the performance of their AI models. By providing not only customizable evaluation functionality but also error analysis and tracing, Galileo goes beyond bare-bones evaluation and enables customers to understand not just whether their models are wrong, but why they are wrong and how to fix them.
In these early days of AI adoption and product build-out, customers are already looking to Galileo as a non-negotiable for reliably getting their AI-powered products off the ground. In less than a year since its product launch, the company has signed impressive contracts with major Fortune 500 companies like HP. With the complexity and prevalence of AI only continuing to explode, the best is yet to come.
Looking ahead
Galileo is uniquely positioned to become the standard for AI evaluation and observability and the central hub for AI product development. Vikram and his team bring years of deep expertise, a clear vision for how the AI SDLC will evolve, and conviction about the shape of product that will support it. We are beyond excited to partner with them and to see the AI products they will continue to enable.