ML for SWEs 69: This is how top AI research labs will make money
My thoughts on OpenAI's Dev Day killing startups and monetizing top AI research labs
I want to say a quick thank you to you all! We hit bestseller status on Substack this past week. Thank you for your support!
I apologize that this won't be one of my usual roundups full of resources. I'm revamping how I track things to make it more efficient and happened to lose what I saved this week. On an unrelated note, if you can get the Gmail connector to work reliably in ChatGPT, Claude, or Gemini, shoot me a DM.
By next week it'll be figured out and more helpful for both me and you.
That being said, there's still something really important you should know about that I've spent a good amount of time thinking about this week. First, here's a quick roundup on OpenAI's Dev Day:
Apps are in ChatGPT now. This means you can ask ChatGPT to do something you'd normally do in an app via its chat interface, and it'll take care of it for you. This changes the app interface from a visual, touch-focused interface to a chat-first interface, which is much more suitable for digital assistants. This is something that needs to be done if we're ever to have truly capable digital assistants, but I'm curious to see how well it works. I've yet to get either Projects or Tasks working reliably, so I'm not sure when this will get there.
AgentKit. This is a framework for building and deploying agents. It's similar to Google's Opal or something like n8n. It seems locked to OpenAI's models. I don't see a platform like this thriving without flexibility in model choice. It will surely be useful to some people, but I know a lot of people exploring agents who are constantly fitting new models to problems to optimize the agent's workflow. I don't see the advantage a platform like this, locked to a specific vendor, provides over an open platform with the same features. It's possible I'm missing some key financial info, but that's my take.
Chat integration SDK. This makes it easy for anyone to make their own ChatGPT-style chatbot directly on their website. It seems this is also OpenAI models only, but for this application, I actually think it works. Most people using this are looking for a low-effort solution to have a chatbot. My guess is anyone using this will care more about it "just working" than what the underlying model is. They won't be tweaking nearly as much as the agent builders in the previous bullet.
Agentic evals. Model evals and benchmarks are difficult. Agent evals are even harder. OpenAI has released a toolkit for creating and running evals in AgentKit. This toolkit can also be used for agents built with third-party models. The community will take as many agent eval tools as it can get at this point, because agent evals are so difficult that most people don't even write them. That's like shipping code without proper testing, but possibly even scarier. Making agent evals easier and reminding people to create evals is a net positive for the developer community.
Codex will be getting updates. There wasn't much of an update to Codex, but OpenAI realizes how the developer community has embraced it and will continue improving it.
Sora 2 is available via API. High-quality video generation can now be built directly into products with control over multiple generation properties (think length, resolution, pacing). I think this alone opens up more agent use cases.
Dev Day and a few interactions I've had this week have had me thinking a lot about how money has shaped the AI race.
Leading up to Dev Day, many people were talking about how OpenAI was going to kill startups. This has been a theme for the past few years, but this year was a bit different. I didn't feel like any of the announcements immediately killed off many of the startups that rely on OpenAI's APIs to offer similar products.
This is because OpenAI understands where their income potential lies, and it isn't at the application layer but in enabling it. I felt that was the main takeaway from Dev Day.
This is something Google realized a while ago, but Google was already building for developers, so it made more sense. OpenAI started by building a productivity app and gaining recognition that way. But that app is far less profitable than enabling others to build apps using OpenAI's technology.
As I was reading "What 55 Billion Chatbot Visits Actually Tell Us About the AI Race and the Best AI Chatbots Right Now" by Devansh this week, I realized this is actually why the Gemini app lacks so many features. I believe Google realizes the Gemini app brings in a lot more token consumption without bringing in significant revenue. It makes sense to focus development efforts elsewhere and provide just enough to stay relevant. As a Google employee, you'd think I would have realized this sooner.
You might think the Gemini app has potential to bring people into the Google ecosystem to spend more, and you'd be right. But the Gemini built into Google Workspace, Colab, and other popular user tools has far more potential to do this than the standalone Gemini app.
If you haven't read Devansh's article I linked above, you should. It's an excellent read.
Interestingly, the company I think identifies and targets revenue sources best is Anthropic. For years, it seemed like Anthropic would struggle to compete with the likes of OpenAI and Google. Anthropic doesn't have any video or audio generation, they're not nearly as generous with user limits, and they don't provide as many tools to users, but they're still around and thriving.
Everyone knows them for vibes and safety, but in reality they do an incredible job of understanding where they can provide value and focus on that (I'm looking at you, Claude Code). To me, it seems like they do a great job of going heads down on the problems they know matter and running their own race.
There's always such a focus on how large AI research labs burning through money will monetize. I find these conversations always come back to the applications they'll provide to make money. This perception might be skewed by the fact that I generally chat with engineers and we live at the application layer, but the point still stands. I don't think the application layer is the plan for these labs, at least not entirely.
These labs are focused on providing the tools needed for others to build applications with their technology. I see fewer companies being killed by OpenAI, DeepMind, Anthropic, and others, and more thriving because of them.
I don't think this Dev Day killed too many startups. Most people I know building agents went, "That's cool," and went back to building.
Let me know in the comments if you think I'm wrong. Do you think top AI research labs will dominate at the application layer or focus on enabling it?
Thanks for reading!
Always be (machine) learning,
Logan