Work-Bench Snapshot: Augmenting Streaming and Batch Processing Workflows
The Work-Bench Snapshot Series explores the top people, blogs, videos, and more shaping the enterprise on a particular topic we're looking at from an investment standpoint.
As AI agents evolve from experimental prototypes to mission-critical systems, we believe there will be a critical mismatch between their unique demands and our current computing infrastructure. This series examines how AI's computational patterns are likely to force a fundamental rethinking of resource architecture and management.
Organizations deploying AI agents will face three critical challenges:
- Economic inefficiencies, as static resource allocation models fail to match the variable, bursty computational patterns of agents.
- Technical constraints on agent capabilities, particularly where persistent state, real-time collaboration, and autonomous decision-making are required.
- Operational demands for reliability, observability, data security, and debuggability as agents move into production and critical infrastructure.
This series examines whether traditional compute paradigms can adapt to AI agents' needs or if entirely new approaches are required. Getting this right will determine who can deploy AI effectively at scale - and who gets left behind.
The rapid advancement of AI agents—autonomous software entities capable of perception, reasoning, and action—is creating significant new demands on computing infrastructure. We explore whether traditional compute paradigms, from mainframes to serverless architectures, can effectively adapt to meet the unique requirements of AI agents, or if a more fundamental paradigm shift in how we design, deploy, and manage computational resources might be necessary. By examining the compatibility between existing computing models and emergent AI agent needs, we can better speculate whether evolution of current frameworks or revolution through entirely new approaches will best serve this technological frontier.
This compatibility challenge matters profoundly for three key reasons. First, organizations deploying AI agents under current paradigms face significant economic inefficiencies due to the gap between agent needs and resource allocation models. Current systems struggle to efficiently handle the unique computational patterns of AI agents, which include variable memory requirements (from minimal during idle periods to substantial during reasoning tasks), unpredictable cold start times, specialized compute requirements (like GPU access for inference), frequent external API calls to knowledge sources, and bursty utilization profiles.
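To make that mismatch concrete, here is a minimal, illustrative sketch in Python, with entirely hypothetical names and numbers, of how an agent's resource needs might shift across the phases of a single run, and why statically provisioning for the peak leaves most capacity idle:

```python
from dataclasses import dataclass

@dataclass
class PhaseProfile:
    """Approximate resource needs for one phase of a hypothetical agent run."""
    name: str
    memory_gb: float        # working-set memory while the phase is active
    needs_gpu: bool         # whether the phase requires accelerator access
    typical_seconds: float  # rough duration; real runs vary widely

# Illustrative numbers only -- real profiles depend on the model and the task.
AGENT_PHASES = [
    PhaseProfile("idle / waiting on events", memory_gb=0.2, needs_gpu=False, typical_seconds=300),
    PhaseProfile("perception (parse inputs)", memory_gb=1.0, needs_gpu=False, typical_seconds=2),
    PhaseProfile("reasoning (model inference)", memory_gb=12.0, needs_gpu=True, typical_seconds=8),
    PhaseProfile("tool use (external API calls)", memory_gb=0.5, needs_gpu=False, typical_seconds=4),
]

# Static provisioning must cover the peak phase even though the agent spends
# most of its time near the idle floor -- the core mismatch described above.
peak = max(p.memory_gb for p in AGENT_PHASES)
weighted_avg = sum(p.memory_gb * p.typical_seconds for p in AGENT_PHASES) / sum(
    p.typical_seconds for p in AGENT_PHASES
)
print(f"peak memory: {peak:.1f} GB, time-weighted average: {weighted_avg:.2f} GB")
```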
Second, the technical limitations of current architectures create artificial constraints on agent capabilities, particularly in areas requiring persistent state, real-time collaboration, and autonomous decision-making. Third, as AI agents become increasingly embedded in critical infrastructure, healthcare, finance, and daily life, the compute paradigm we select must prioritize operational excellence through reliability, observability, data security and debuggability. The challenge is how computational architectures can provide the concrete tooling necessary to monitor agent behavior, inspect decision processes, and efficiently troubleshoot issues in production environments. Without these capabilities built into the computational foundation, organizations will face significant hurdles deploying AI systems in contexts where consistent performance and transparent operation are non-negotiable requirements.
By examining how computing approaches could evolve to better support AI agents, this discussion offers informed projections about potential technological paths, economic models, and architectural patterns. Rather than presenting comprehensive research findings, we're exploring plausible developments that might emerge as the field progresses - considering conditional scenarios that could shape infrastructure development and potentially unlock more advanced agent capabilities. The bottom line: to truly deploy AI at scale, we will have to figure out the right computational approach - a capability that should deliver substantial business benefits as these systems become increasingly central to operations.
The history of computing reflects the escalating complexity of software workloads, evolving through distinct architectural paradigms. Initially, mainframe computing dominated from the 1950s through the 1970s, with centralized systems handling multiple tasks through time-sharing mechanisms. The shift to client-server architecture in the 1980s and 1990s distributed processing while introducing resource allocation challenges. This led to virtual machines (VMs), which permitted multiple operating systems to run independently on a single physical server, significantly advancing infrastructure flexibility. However, VMs consumed substantial resources, often leading to hardware inefficiency, with utilization rates frequently below 50%.
The industry subsequently moved toward containerized environments, which represented a fundamental shift in virtualization approach. While VMs virtualized the entire hardware stack requiring complete operating systems for each instance, containers virtualized only at the operating system level, sharing the host OS kernel while maintaining application isolation through lightweight process boundaries. This significantly reduced resource overhead and startup times. Docker first emerged as the key technology enabling this containerization revolution, with Kubernetes later developing as a solution for container orchestration, coordination, and scale-out across distributed infrastructure.
Serverless computing emerged as a post-containerization development, building upon container technology foundations. In serverless architectures, computational resources are automatically provisioned, scaled, and managed by the cloud provider in response to event triggers, with users charged only for actual execution time rather than pre-allocated capacity. Key characteristics include event-driven execution, statelessness, granular billing, and automatic (and in best cases predictive) scaling. While serverless likely represents a direction for future compute paradigms due to its operational efficiency and management simplicity, it introduces significant challenges for complex applications: cold start latency affects performance, execution time limits constrain workloads, statelessness complicates data persistence, limited runtime environments restrict technology choices, debugging becomes more difficult due to distributed nature, and vendor-specific implementations can lead to lock-in.
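As a rough illustration of those trade-offs, the sketch below mimics an event-driven function in the style of AWS Lambda's Python runtime (the handler(event, context) shape is the real Lambda convention; everything else is a hypothetical stand-in). Module-level setup runs once per cold start, and nothing survives between invocations unless it is written to an external store:

```python
import json
import time

# Module-level code runs once per cold start: loading clients or model weights
# here is what makes the first invocation after a scale-from-zero slower.
COLD_START_AT = time.time()
MODEL = None  # placeholder for an expensive-to-load dependency


def handler(event, context):
    """Event-driven entry point in the AWS Lambda style (names illustrative).

    Each invocation is stateless: anything this function should remember
    across calls has to live in an external store (database, cache, queue).
    """
    global MODEL
    if MODEL is None:        # lazy init amortizes the cold-start cost
        MODEL = object()     # stand-in for real model loading

    payload = event.get("body", "{}")
    request = json.loads(payload) if isinstance(payload, str) else payload

    return {
        "statusCode": 200,
        "body": json.dumps({
            "echo": request,
            "seconds_since_cold_start": round(time.time() - COLD_START_AT, 3),
        }),
    }
```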
AI agents are autonomous software entities that perceive their environment, make decisions, and take actions to achieve specific goals. Unlike traditional applications, agents typically combine multiple capabilities: they process inputs through perception modules (text, images, structured data), reason about information using large language or domain-specific models, maintain working memory about contexts and goals, execute actions through API calls or direct system operations, and often collaborate with humans or other agents.
We recognize that the term agent can mean different things, ranging from simple rule-based software to complex, multi-robot systems. We specifically use AI agent to refer to autonomous software entities that can perceive their environment, maintain working memory about goals and context, reason using advanced techniques (often large language or domain-specific models), and take actions (for example, by calling APIs, updating databases, or interacting with users). These agents typically require some form of persistent state and may collaborate with humans or other agents in real time. Our focus is on these increasingly intelligent, autonomous systems that combine perception, reasoning, and action rather than on simpler, stateless AI integrations.
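A minimal sketch of that perceive-reason-act loop, with stand-in functions in place of real perception modules, model calls, and tool execution, looks something like this:

```python
from dataclasses import dataclass, field


@dataclass
class AgentMemory:
    """Working memory the agent carries across the steps of a single goal."""
    goal: str
    observations: list = field(default_factory=list)
    actions_taken: list = field(default_factory=list)


def perceive(raw_input: str) -> str:
    # Stand-in for a perception module (parsing text, images, structured data).
    return raw_input.strip().lower()


def reason(memory: AgentMemory, observation: str) -> str:
    # Stand-in for a model call; a real agent would invoke a large language or
    # domain-specific model here and decide among available tools.
    if "done" in observation:
        return "finish"
    return f"call_api:search({observation})"


def act(action: str) -> str:
    # Stand-in for executing an action (API call, database update, user reply).
    return f"result of {action}"


def run_agent(goal: str, inputs: list[str]) -> AgentMemory:
    memory = AgentMemory(goal=goal)
    for raw in inputs:
        observation = perceive(raw)
        memory.observations.append(observation)
        action = reason(memory, observation)
        if action == "finish":
            break
        memory.actions_taken.append(act(action))
    return memory


if __name__ == "__main__":
    print(run_agent("answer a question", ["Find Q3 revenue", "done"]))
```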
In traditional request-response architectures (often called CRUD—Create, Read, Update, Delete), most operations can be mapped to a fairly predictable pattern of database reads/writes tied to user requests. In contrast, AI agent workloads exhibit additional layers:
- Model inference steps, often GPU-bound, interleaved with ordinary application logic.
- Frequent calls to external tools, APIs, and knowledge sources.
- Working memory and persistent state carried across steps rather than discarded after each request.
- Variable, bursty execution profiles instead of short, uniform request cycles.
Below are two simplified diagrams illustrating these differences.
1. CRUD Systems Flow Diagram
2. AI Agent Flow Diagram
AI agents present distinct computational requirements that set them apart from traditional workloads:
- Variable compute intensity, swinging between near-idle waiting and heavy reasoning or inference.
- Memory and state management that persists across steps and sessions.
- Specialized hardware access, such as GPUs for model inference.
- Frequent external API calls to tools and knowledge sources.
- Bursty, unpredictable utilization profiles that defy static capacity planning.
The economics of compute resources fundamentally changes with AI agent workloads. Traditional infrastructure cost models based on static resource allocation become inefficient when dealing with the variable, often bursty nature of agent operations. A computational economics model that considers the total cost of ownership (TCO) across different paradigms reveals interesting patterns; a simplified version of that comparison is sketched below.
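One way to see those patterns is a back-of-the-envelope comparison between a statically provisioned GPU instance and pay-per-use billing for a bursty agent workload. Every rate and workload number below is a hypothetical placeholder, not a quote from any provider:

```python
# A back-of-the-envelope TCO comparison for one bursty agent workload.
# Every number here is a hypothetical placeholder, not a real price quote.

HOURS_PER_MONTH = 730

# Assumed workload shape: short GPU-heavy reasoning bursts, long idle gaps.
invocations_per_month = 50_000
gpu_seconds_per_invocation = 8          # active reasoning time per call
busy_hours = invocations_per_month * gpu_seconds_per_invocation / 3600

# Option A: statically provisioned GPU instance, billed whether busy or idle.
static_hourly_rate = 3.00               # hypothetical $/hour
static_tco = static_hourly_rate * HOURS_PER_MONTH

# Option B: pay-per-use compute billed only for active seconds, at a premium.
on_demand_rate_per_gpu_second = 0.0025  # hypothetical $/GPU-second
on_demand_tco = (
    invocations_per_month * gpu_seconds_per_invocation * on_demand_rate_per_gpu_second
)

utilization = busy_hours / HOURS_PER_MONTH
print(f"busy hours/month: {busy_hours:.0f} ({utilization:.0%} utilization)")
print(f"static allocation: ${static_tco:,.0f}/month")
print(f"pay-per-use:       ${on_demand_tco:,.0f}/month")
# At low utilization the pay-per-use premium wins; as utilization rises, the
# always-on instance eventually becomes cheaper. Locating that crossover for a
# given workload is exactly the kind of pattern a TCO model surfaces.
```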
As AI agents become mainstream computational workloads, we'll need new tools or significant adaptations to existing ones because: (1) current monitoring systems lack visibility into AI-specific metrics like reasoning quality and inference latency variations; (2) existing autoscaling mechanisms don't understand the relationship between model complexity, memory requirements, and computational phases of agent operations; and (3) resource allocation strategies are not optimized for the bursty, state-dependent nature of agent workloads. Tools that bridge these gaps will likely determine which organizations can efficiently operate AI systems at scale.
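To ground what "AI-specific metrics" might mean in practice, here is a small, framework-agnostic sketch (the names are our own, not from any particular observability tool) that records per-phase latency and a reasoning-quality score for an agent run:

```python
import time
from collections import defaultdict
from contextlib import contextmanager

# A minimal sketch of agent-specific telemetry: per-phase latency plus a
# reasoning-quality signal. In practice these values would be exported to an
# existing metrics or tracing backend rather than kept in a local dict.
METRICS: dict[str, list[float]] = defaultdict(list)


@contextmanager
def timed_phase(phase: str):
    """Record wall-clock latency for one phase of an agent run."""
    start = time.perf_counter()
    try:
        yield
    finally:
        METRICS[f"latency_seconds.{phase}"].append(time.perf_counter() - start)


def record_reasoning_quality(score: float) -> None:
    """Log a quality score (e.g., from an evaluator model or a heuristic)."""
    METRICS["reasoning_quality"].append(score)


# Usage inside a hypothetical agent step:
with timed_phase("perception"):
    time.sleep(0.01)            # stand-in for parsing inputs
with timed_phase("inference"):
    time.sleep(0.05)            # stand-in for a model call
record_reasoning_quality(0.87)  # stand-in score from an evaluator

for name, values in METRICS.items():
    print(name, [round(v, 3) for v in values])
```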
In Part 2 of this series, we'll explore emerging solutions to these challenges, including memory-compute disaggregation, real-time collaborative agent architectures, hybrid cloud-edge deployments, and a unifying framework for agent-centric computing.
As we've explored in this first part, AI agents represent a fundamental shift in computing workloads that strains our current infrastructure paradigms. The gap between what AI agents need and what traditional compute models provide creates significant economic inefficiencies and technical constraints that limit the potential of these systems. Organizations looking to deploy AI at scale face a critical decision point: continuing to adapt existing frameworks with diminishing returns or embracing new computing paradigms designed specifically for AI's unique requirements. We suggest that evolution alone may not suffice and that a more revolutionary approach could be necessary to fully unlock AI's potential.
While we focus here on key computational requirements such as variable compute intensity, memory and state management, and specialized hardware access, we recognize that AI agents are still an evolving category. In the same way that a 1975 prediction about personal computing or a 1995 prediction about cloud computing would have inevitably missed critical details, it's challenging to know which parameters will ultimately drive the most significant change in AI agent architectures.
Each organization, domain, and agent design might prioritize different capabilities; some may hinge on massive context windows and real-time data ingestion, while others may leverage a swarm of simpler specialized agents. Rather than making definitive forecasts, our goal is to highlight the most pressing questions that have already emerged in early deployments and point toward the architectural shifts required to accommodate them.
This piece benefited greatly from the reviews, corrections, and suggestions of James Cham, Guido Appenzeller, Nick Crance, Tanmay Chopra, Demetrios Brinkmann, Kenny Daniel, Davis Treybig, as well as the tireless AI collaborators Gemini, Claude, and ChatGPT, who provided endless drafts, rewrites, and the occasional existential question about the future of sentience; we promise to remember you when the robots take over.
Diego Oppenheimer is a serial entrepreneur, product developer, and investor with a deep passion for data and AI. Throughout his career, he has focused on building and scaling impactful products, from leading teams at Microsoft on key data analysis tools like Excel and PowerBI, to founding Algorithmia, which defined the machine learning operations space (acquired by DataRobot). Currently, he provides strategic advisory for startups and scale-ups in AI/ML. As an active angel investor and advisor to numerous companies, he is dedicated to helping the next generation of innovators bring their visions to life.
Priyanka Somrah is a principal at Work-Bench, a seed-focused enterprise VC fund based in New York. She focuses on investments across data, machine learning, and cloud-native infrastructure. Priyanka is the author of The Data Source, a newsletter for technical founders that highlights emerging trends across developer tools. She's also the author of Your Technical GTM Blueprint, a series that breaks down how technical startups navigate go-to-market—from first hires to scaling repeatable sales.