Work-Bench Snapshot: Augmenting Business Process Automation Workflows
The Work-Bench Snapshot Series explores the top people, blogs, videos, and more shaping the enterprise on a particular topic we’re looking at from an investment standpoint.
Over the past few years at Work-Bench, I’ve spent a lot of time working with early-stage founders building across AI and Machine Learning Infrastructure, many of whom are exploring RPA, Applied AI, and adjacent industries. Across these conversations, I’m hearing founders index towards a similar product path:
Classify, extract, and tabularize unstructured data (text, images, documents, speech, video) and add some layer of downstream business process automation.
In practice, use cases could include processing invoices for payments, transcribing medical notes, aggregating and analyzing clinical trial data, automating the bills of lading process for shippers, and much more. However, as LLMs have come into the limelight, there’s been a mad dash of founders racing to apply AI technology to these existing use cases.
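To make that product path concrete, here’s a minimal Python sketch of the extract-and-tabularize step for the invoice use case. The `call_llm` function is a hypothetical stand-in for whatever model provider you use, and the field names are illustrative, not a specific product’s schema:

```python
# Minimal sketch of the classify -> extract -> tabularize pattern.
import json
from dataclasses import dataclass

@dataclass
class InvoiceRow:
    vendor: str
    invoice_number: str
    amount_due: float
    due_date: str

EXTRACTION_PROMPT = """Extract the following fields from the invoice text
and return JSON with keys: vendor, invoice_number, amount_due, due_date.

Invoice:
{text}
"""

def call_llm(prompt: str) -> str:
    # Placeholder: swap in your provider's client (OpenAI, Anthropic, a
    # self-hosted model, etc.). Returns the model's raw text response.
    raise NotImplementedError

def extract_invoice(text: str) -> InvoiceRow:
    raw = call_llm(EXTRACTION_PROMPT.format(text=text))
    fields = json.loads(raw)  # in production, validate against a schema
    return InvoiceRow(
        vendor=fields["vendor"],
        invoice_number=fields["invoice_number"],
        amount_due=float(fields["amount_due"]),
        due_date=fields["due_date"],
    )
```

In a real pipeline you’d wrap the JSON parse with schema validation and retries, since model output isn’t guaranteed to be well-formed.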
Having previously worked in Product and Growth roles at Hyperscience, a now Series E scale-up that has raised ~$300M from top investors and a first mover in leveraging proprietary data for model training at significant scale, I’d like to share some learnings from building and investing in business process automation:
Accurately extracting and structuring data is hard. Across every industry there’s ample opportunity for automation, but founders need to choose the one use case they can solve before moving to the next. It’s tempting to build a flexible product that can cater to many use cases and industries out-of-the-box, but this is a trap. You might win a few deals across use cases and industries, but this will do more harm than good: suddenly you’ll have 10 customers with 10 vastly different use cases and no repeatable sales motion or earned insight into how your customers’ processes actually work or evolve. This harm only compounds when you think you’ve reached PMF and start hiring more AEs. The more time founders spend building a flexible tool that can solve many use cases, the less likely they are to build a product deep enough to solve a repeatable enterprise problem.
Lesson: Start small to go big. Ensure your wedge use case is a ‘need to have’. The more value you provide, the more the buyer will organically surface new use cases. Building for and selling multiple use cases out of the gate is a recipe for disaster.
It’s rare to have a totally greenfield use case in enterprise software. This can be tricky because the incumbent solution isn’t always a SaaS product: for many organizations, the cost of spinning up an offshore team to manually review invoices or attach PDFs to emails is still cheaper than your startup, and it works just fine. Beyond the price tag, there’s inherent risk, and social capital at stake, when a buyer decides to shake up a tried-and-true practice and purchase a startup solution. How will you de-risk the sales process to help customers onboard your solution?
Lesson: Understand the buyer psychology and cultural dynamics for your use case. Know why processes are the way they are and why past solutions have failed.
We know that time kills all deals, but a lack of usage kills all renewals. Many founders building in vertical AI/business process automation are going after mission-critical workflows. Whether it be your actual product implementation, confusing workflows, or downtime, any inaccuracy or pause in operations can hurt a customer's business, and in turn, your chance of getting a renewal.
Lesson: Elongated implementations and complex integrations end up hurting the customer and your chances of a renewal. Make customer success your priority.
Even though humans are expensive and slow, when it comes to manual business processes they’re generally quite reliable. Whether you’re selling one use case or multiple, it’s unlikely that your third-party LLMs will beat a human’s accuracy on data extraction or mind-numbingly boring tasks. Products aiming to automate human tasks need some combination of automation and human-in-the-loop review to hit strong accuracy benchmarks (a common routing pattern is sketched below); if accuracy lands below that of the existing process, enterprise customers will be skeptical of your product.
Lesson: It doesn’t have to be full agentic automation — enterprise organizations run on workflow tools.
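One common pattern here is confidence-based routing: auto-accept high-confidence extractions and queue the rest for human review. This is a sketch under assumptions, not any particular vendor’s implementation; the threshold and field shape are made up for illustration:

```python
# Sketch of confidence-based human-in-the-loop routing. The 0.95 threshold
# and the shape of FieldExtraction are illustrative assumptions; in practice
# you would calibrate thresholds per field and per document type.
from typing import TypedDict

class FieldExtraction(TypedDict):
    name: str          # e.g., "amount_due"
    value: str         # extracted value as text
    confidence: float  # model-reported or calibrated score in [0, 1]

AUTO_ACCEPT_THRESHOLD = 0.95

def route(
    extractions: list[FieldExtraction],
) -> tuple[list[FieldExtraction], list[FieldExtraction]]:
    """Split extractions into auto-accepted fields and a human review queue."""
    auto, review = [], []
    for field in extractions:
        if field["confidence"] >= AUTO_ACCEPT_THRESHOLD:
            auto.append(field)
        else:
            review.append(field)  # routed to a reviewer, not straight downstream
    return auto, review
```

The human queue is the product here as much as the model: reviewers correct low-confidence fields, and those corrections feed back into accuracy over time.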
I’ve previously written about Applied AI and the LLM Adoption Curve. The TLDR is that as a use case becomes more complex and higher value, your models need to improve in tandem. At Hyperscience, we spent years acquiring high-quality data and training our ML models on it to insulate ourselves from competitors leveraging third-party models. Today, off-the-shelf LLMs can get you halfway, but they aren’t necessarily the best fit for every extraction use case. Founders building in Applied AI need to understand the edge cases in their users’ workflows and account for the fact that LLMs may not be the best tool for them.
Lesson: Small models trained on high-quality data samples plus synthetic data could outperform generic models and help differentiate you from the competition.
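As a rough illustration of that lesson (the prompt and function names are hypothetical, not a specific training recipe), you can bootstrap a fine-tuning set by pairing a small pool of gold-labeled documents with LLM-generated synthetic variants:

```python
# Sketch: build a fine-tuning dataset from gold labels plus synthetic variants.
PARAPHRASE_PROMPT = """Rewrite this document with different formatting and
phrasing, but keep every field value exactly the same:

{text}
"""

def call_llm(prompt: str) -> str:
    # Placeholder for your model provider's client (see the earlier sketch).
    raise NotImplementedError

def synthesize_examples(gold: list[dict], variants_per_doc: int = 3) -> list[dict]:
    """Each gold item: {"text": <document text>, "labels": {<field>: <value>}}."""
    dataset = list(gold)
    for doc in gold:
        for _ in range(variants_per_doc):
            variant = call_llm(PARAPHRASE_PROMPT.format(text=doc["text"]))
            # Labels carry over unchanged because field values were preserved.
            dataset.append({"text": variant, "labels": doc["labels"]})
    return dataset
```

The resulting dataset can then fine-tune a small, task-specific extraction model that runs cheaply at inference time, rather than calling a generic LLM per document.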
As I continue to dig into these topics, I’d love to speak with more founders, practitioners, and builders in the category. Reach out here.