ML for SWEs 69: This is how top AI research labs will make money

My thoughts on OpenAI's Dev Day killing startups and monetizing top AI research labs

Oct 08, 2025

I want to say a quick thank you to you all! We hit bestseller status on Substack this past week. Thank you for your support!

I apologize that this won’t be one of my usual roundups full of resources. I’m revamping how I track things to make it more efficient and happened to lose what I saved this week. On an unrelated note, if you can get the Gmail connector to work reliably in ChatGPT, Claude, or Gemini, shoot me a DM.

By next week it’ll be figured out and more helpful for both me and you.

That said, there’s something important I’ve spent a good amount of time thinking about this week that you should know about. First, here’s a quick roundup on OpenAI’s Dev Day:

Apps are in ChatGPT now. This means you can ask ChatGPT to do something you’d normally do in an app via its chat interface, and it’ll take care of it for you. This changes the app interface from visual and touch-focused to chat-first, which is much more suitable for digital assistants. This needs to happen if we’re ever to have truly capable digital assistants, but I’m curious to see how well it works. I’ve yet to get either Projects or Tasks working reliably, so I’m not confident about when this will get there.
AgentKit. This is a framework for building and deploying agents. It’s similar to Google’s Opal or something like n8n. It seems locked to OpenAI’s models. I don’t see a platform like this thriving without flexibility in model choice. It will surely be useful to some people, but I know a lot of people exploring agents who are constantly fitting new models to problems to optimize the agent’s workflow. I don’t see the advantage a platform like this, locked to a specific vendor, provides over an open platform with the same features. It’s possible I’m missing some key financial info, but that’s my take.
Chat integration SDK. This makes it easy for anyone to make their own ChatGPT-style chatbot directly on their website. It seems this is also OpenAI models only, but for this application, I actually think it works. Most people using this are looking for a low-effort solution to have a chatbot. My guess is anyone using this will care more about it “just working” than what the underlying model is. They won’t be tweaking nearly as much as the agent builders in the previous bullet.
Agentic evals. Model evals and benchmarks are difficult. Agent evals are even harder. OpenAI has released a toolkit for creating and running evals in AgentKit. This toolkit can also be used for agents built with third-party models. The community will take as many agent eval tools as it can get at this point because agent evals are so difficult that most people don’t even run them. That’s like shipping code without proper testing, but possibly even scarier. Making agent evals easier and reminding people to create evals is a net positive for the developer community.
Codex will be getting updates. There wasn’t much of an update to Codex, but OpenAI realizes how the developer community has embraced it and will continue improving it.
Sora 2 is available via API. High-quality video generation can now be built directly into products with control over multiple generation properties (think length, resolution, pacing). I think this alone opens up more agent use cases.
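To make the agent evals point concrete, here’s a minimal sketch of what “testing” an agent can look like. This is not AgentKit’s API or OpenAI’s eval toolkit. The names `EvalCase`, `run_evals`, and `toy_agent` are hypothetical, and the substring check is an assumption standing in for whatever grading a real eval would use.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EvalCase:
    prompt: str
    expected_substring: str  # text a correct answer must contain (assumed grading rule)


def run_evals(agent: Callable[[str], str], cases: list[EvalCase]) -> float:
    """Return the fraction of cases whose output contains the expected text."""
    if not cases:
        return 0.0
    passed = 0
    for case in cases:
        try:
            output = agent(case.prompt)
        except Exception:
            # A crashing agent counts as a failed case, not a crashed eval run.
            continue
        if case.expected_substring.lower() in output.lower():
            passed += 1
    return passed / len(cases)


def toy_agent(prompt: str) -> str:
    """Hypothetical stand-in for a real agent; swap in any model call here."""
    if "capital of France" in prompt:
        return "The capital of France is Paris."
    return "I don't know."


cases = [
    EvalCase("What is the capital of France?", "Paris"),
    EvalCase("What is the capital of Japan?", "Tokyo"),
]
score = run_evals(toy_agent, cases)
print(f"pass rate: {score:.0%}")
```

Because the agent is just a function from prompt to text, the same cases can be rerun whenever you swap the underlying model, which is the kind of regression check most agent projects skip.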

Dev Day and a few interactions I’ve had this week have had me thinking a lot about how money has shaped the AI race.

Leading up to Dev Day, many people were talking about how OpenAI was going to kill startups. This has been a theme for the past few years, but this year was a bit different. I didn’t feel like any of the announcements immediately killed off many of the startups that rely on OpenAI’s APIs to build similar products.

This is because OpenAI understands where their income potential lies—and it isn’t at the application layer but in enabling it. I felt that was the main takeaway from Dev Day.

This is something Google realized a while ago, but Google was already building for developers, so it made more sense. OpenAI started by building a productivity app and gaining recognition that way. But that app is far less profitable than enabling others to build apps using OpenAI’s technology.

As I was reading “What 55 Billion Chatbot Visits Actually Tell Us About the AI Race and the Best AI Chatbots Right Now” by Devansh this week, I realized this is actually why the Gemini app lacks so many features. I believe Google realizes the Gemini app brings in a lot more token consumption without bringing in significant revenue. It makes sense to focus development efforts elsewhere and provide just enough to stay relevant. As a Google employee, you’d think I would have realized this sooner.

You might think the Gemini app has potential to bring people into the Google ecosystem to spend more—and you’d be right. But the Gemini built into Google Workspace, Colab, and other popular user tools has far more potential to do this than the standalone Gemini app.

If you haven’t read Devansh’s article I linked above, you should. It’s an excellent read.

Interestingly, the company I think identifies and targets revenue sources best is Anthropic. For years, it seemed like Anthropic would struggle to compete with the likes of OpenAI and Google. Anthropic doesn’t have any video or audio generation, they’re not nearly as generous with user limits, and they don’t provide as many tools to users—but they’re still around and thriving.

Everyone knows them for vibes and safety, but in reality they do an incredible job of understanding where they can provide value and focus on that (I’m looking at you, Claude Code). To me, it seems like they do a great job of going heads down on the problems they know matter and running their own race.

There’s always such a focus on how large AI research labs burning through money will monetize. I find these conversations always come back to the applications they’ll provide to make money. This perception might be skewed by the fact that I generally chat with engineers and we live at the application layer, but the point still stands—I don’t think the application layer is the plan for these labs, at least not entirely.

These labs are focused on providing the tools needed for others to build applications with their technology. I see fewer companies being killed by OpenAI, DeepMind, Anthropic, and others, and more thriving because of them.

I don’t think this Dev Day killed too many startups. Most people I know building agents went, “That’s cool,” and went back to building.

Let me know if I’m wrong in the comments and what you think. Do you think top AI research labs will dominate at the application layer or focus on enabling it?

Thanks for reading!

Always be (machine) learning,

Logan

Machine Learning for Software Engineers
