Two years ago I left a job I loved with no plan other than to explore a question I couldn’t stop thinking about. Why is it so hard to ship software that uses AI?

I’d spent the past ten years doing NLP research at MIT, taking a dev tool startup through Y Combinator, and growing a language processing unicorn at Instabase. The common theme seemed to be that the AI kept improving, but the difficulty of shipping it stayed the same.

Here’s an example. Imagine you work at Spotify. You’ve just won a hackathon with a prototype that transcribes podcast episodes and predicts topic hashtags. Now imagine what it would take to actually ship that: months of custom code, infrastructure, and process. Fine-tuning different models, queueing distributed work, monitoring uptime, managing cost…

When it comes to production AI, there’s a huge gap between what’s possible and what’s practical. And if we’re going to build the Star Trek universe, that’s a problem. It’s not enough for AI to be smart. It also has to be easy to build with. Because it’s the combination of AI and software together, not AI alone, that will enable that universe.

Today I’m thrilled to announce Steamship — a company dedicated to putting language understanding in the toolbox of every software developer.

Steamship lets you add bundles of targeted, full-lifecycle language understanding to your software in minutes. We call these bundles “packages”. They import into your code like regular software packages, but they run on our auto-managed stack in the cloud.

Our launch today is a beta of three of these packages: a classifier that transitions smoothly from zero-shot to trained, a topic clustering tool for community chat rooms, and a topic indexer for podcasts and YouTube channels. Each of these can be used from any development environment with just a few lines of code. Streamlit demos are on our homepage.

This beta is just the start of our broader platform: a Heroku for language AI that lets anyone build, bundle, ship, and share full-lifecycle language AI packages without having to care about the infrastructure that makes them possible.

If you’re a developer: read more about our first three packages here or fill out our onboarding form here. (We’re early on, so onboarding happens over Zoom, but it’s quick!)

Or, continue reading for a deeper dive into one of the core usability problems we’re solving.

Production AI is 100x harder than Prototype AI because of spaghetti infrastructure.

You know what spaghetti code is. Spaghetti infrastructure is the same thing, but with machines in the cloud: EC2 instances, SageMaker endpoints, S3 buckets, and more, held together with duct tape and Terraform.

We think spaghetti infrastructure is one of the most important problems holding back AI.

On the surface, it’s not immediately obvious this problem is inevitable. After all, AI is just software. Why not just run your file on a larger machine?

The problem is that each line of code in your prototype needs to scale in a different way. So when you adapt it for production, each line of code becomes a different piece of distributed architecture. Add in the coordination between those pieces, multiply by the number of model versions and user-tunings, and it all adds up FAST to a tangled stack of machines covered in marinara sauce.
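To make that concrete, here is a rough sketch of what the podcast hashtag prototype from earlier might look like. Every function name here is a hypothetical stand-in stub (none of this is Steamship's API or a real library), and the comments only guess at the kind of infrastructure each line tends to turn into.

```python
from typing import List

# Hypothetical prototype. Every function is a stub for illustration only.

def load_audio(path: str) -> bytes:
    # In production: object storage plus a download and caching layer.
    return b"fake-audio-bytes"

def transcribe(audio: bytes) -> str:
    # In production: a GPU-backed speech model sitting behind a work queue.
    return "today we talk about machine learning and startups"

def predict_hashtags(transcript: str, known_topics: List[str]) -> List[str]:
    # In production: a versioned, possibly per-customer fine-tuned classifier.
    return ["#" + t.replace(" ", "") for t in known_topics if t in transcript]

def tag_episode(path: str) -> List[str]:
    # Three synchronous lines in the prototype; three separate systems at scale.
    audio = load_audio(path)
    transcript = transcribe(audio)
    return predict_hashtags(transcript, ["machine learning", "startups", "cooking"])
```

In a notebook this is one file and one function call. The point of the sketch is that each of the three lines inside `tag_episode` has a different cost profile, failure mode, and scaling story.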

Here’s a sampling of some of that spaghetti that you’ve probably seen yourself:

  • Data. At scale, your data won’t all be in memory. Some will live in a relational database. Model parameters will live in S3. Features and embeddings might live in a vector store. You’ve still got all of your business data to integrate with, plus logging, metrics, and prediction feedback.
  • Versioning. Your model will have multiple versions over time and different fine-tunings for each customer scope. Managing the training, parameter storage, and inference availability of these is its own project that may require internal dev tools.
  • Compute. To stay under budget, different pieces of your prototype will transition to different types of machines. Training might be on spot GPUs. Inference might be high-memory reserved CPUs. Logic and glue code may end up in Lambdas or smaller EC2 instances.
  • Control Flow. Different components take different times to run, fail in different ways, and accept different input sizes. Synchronous lines of code in your prototype are now asynchronous task workers, with wrappers to marshal data formats.
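As a small illustration of the Control Flow item above, here is a minimal sketch of how a single synchronous call might turn into a queue, a worker, and retry logic. It uses only the Python standard library; the `transcribe` stub, the retry count, and the sentinel-based shutdown are all assumptions made for the example, and a real deployment would likely use a distributed queue rather than an in-process one.

```python
import queue
import threading

def transcribe(audio: bytes) -> str:
    # Hypothetical stand-in for a slow, failure-prone model call.
    return audio.decode("utf-8").upper()

def worker(tasks: queue.Queue, results: dict, max_retries: int = 3) -> None:
    # Pull tasks until a None sentinel arrives, retrying each one on failure.
    while True:
        item = tasks.get()
        if item is None:
            tasks.task_done()
            break
        episode_id, audio = item
        for attempt in range(max_retries):
            try:
                results[episode_id] = transcribe(audio)
                break
            except Exception:
                if attempt == max_retries - 1:
                    results[episode_id] = None  # give up and record the failure
        tasks.task_done()

def run_batch(episodes: dict) -> dict:
    # What used to be `transcribe(audio)` is now enqueue, wait, and collect.
    tasks = queue.Queue()
    results = {}
    thread = threading.Thread(target=worker, args=(tasks, results))
    thread.start()
    for episode_id, audio in episodes.items():
        tasks.put((episode_id, audio))
    tasks.put(None)  # sentinel: tell the worker to stop
    tasks.join()
    thread.join()
    return results
```

Calling `run_batch({"ep1": b"hello"})` does the same work as one line of prototype code, but with roughly thirty lines of coordination around it.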

And so on, and so on, until your prototype’s AWS footprint may rival the rest of your product in complexity. All that for a single new feature!

Sketch of the AWS footprint of an AI prototype after it’s been adapted for production.

To businesses, spaghetti infrastructure makes AI risky and expensive. And to developers, it puts hard limits on what’s practically achievable beyond demo-ware.

We think this is a critically important problem because it’s the final roadblock to making prolific use of AI once the models become “good enough”.

Good news: the web had a spaghetti infrastructure problem, and we solved it.

We think AI’s spaghetti infrastructure rhymes with the growing pains of the web application stack.

It wasn’t so many years ago that web developers had to put tremendous effort into configuring servers, task queues, static caches, auth systems, database schemas, CDNs — it was endless! — all just to transition a prototype web app into a production product.

The root cause was the same: each line of yesterday’s web code needed to scale in a different way. And so each line required a separate piece of distributed infrastructure.

Companies like Heroku, Netlify, and Vercel solved this problem by building world-class reference stacks, automating their management, and providing insanely simple software frameworks on top.

As a result, today you can run a single command to deploy your prototype to global scale, kitchen sink included. It’s incredible.

To businesses, these stacks manage cost and complexity. And to developers, they unlock everything from creativity, to education, to profit.

We’re applying this style of thought to the language AI world.

Steamship is a managed stack for language AI packages.

Steamship is the managed stack we would have wanted on our past language AI projects.

We’ve built a world-class, auto-managed reference stack capable of ingesting natural language data, training and running models on it, and querying across their results. On top of that stack, we’ve added an insanely simple SDK for building packages that you can share and use from any software environment.

We think the result is a game changer.

It gives developers that same “build, bundle, ship, share” mechanic that regular software has, with zero spaghetti infrastructure. And we’ve built it to work with all the great APIs out there today, whether it’s OpenAI, HuggingFace, Big Tech, or Hot Startup.


Steamship makes it possible to publish language AI packages that anyone can use in minutes.

We’re rolling Steamship out over the coming months, starting today with three beta packages we’ve built ourselves. You can read more about them here or fill out an onboarding form here.

Over the next few months, we’ll follow that up with self-service signups, more packages, and the broader SDK for building and sharing packages of your own.

More than anything, we’re excited to be working on something that we know will help others build great things.

As fellow builders, we can’t wait to see what you ship!