ML for SWEs #22: 1000+ AI Agents Make a Civilization, xAI Creates the Largest Supercomputer in 4 Months, Possible AI Research Fraud, and More
Machine learning resources and updates 2024-09-09
You may notice an update to the format this week. I'm going to take more time to write about why something is important and highlight the most important developments. I'm also going to include important discussions that occurred throughout the week. I've also gotten rid of the table of contents now that Substack adds it automatically. Of course, the full reading list will still be available to supporters.
Finally, I'm going to add some information about the job market and the skills to succeed in ML. I'm still figuring out the best way to do that, but for now I've added a job skills section to my ML road map, with resources for learning those skills. Let me know what you think of the new format! I also apologize for any grammatical or spelling mistakes this week. I'm excited about this format and wanted to get it together ASAP even though my sources weren't as organized as I'd like them to be.
Here are the most important machine learning resources and updates from the past week. Follow me on X and/or LinkedIn for more frequent posts and updates. You can find last week's updates here.
Support the Society's Backend community for just $1/mo to get my full reading list each week. Society's Backend is reader-supported. Thanks to all paying subscribers!
Highlights
Anthropic's Customer Support RAG
Anthropic partnered with Pietro Schirano to create a customer support agent using RAG. The entire repo is available on GitHub for anyone to use. This is an excellent starting point for any RAG system and has been added to the Claude Quickstart repo. I'd encourage anyone looking to build something like this to give it a try. It's reported to be easily scalable and developer friendly. I know many of you build AI systems so I figured this would be valuable. I'm debating using it to organize all these sources I come up with and make them easily searchable. Let me know if you think that would be useful.
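If you haven't built a RAG system before, here's a rough sketch of the core idea: retrieve the most relevant document for a question, then put it in the prompt. This is not Anthropic's implementation. The documents, function names, and prompt wording are all made up for illustration, and the retrieval is a toy word-count cosine similarity where a real system would use embeddings and a vector store.

```python
import math
import re
from collections import Counter

# Hypothetical knowledge base; a real system would load actual support articles.
DOCS = [
    "To reset your password, open Settings and choose Reset Password.",
    "Refunds are processed within five business days of the request.",
    "You can export your data as CSV from the Account page.",
]

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two word-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in set(a) & set(b))
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, docs, k=1):
    """Return the k documents most similar to the query."""
    q = Counter(tokenize(query))
    ranked = sorted(docs, key=lambda d: cosine(q, Counter(tokenize(d))), reverse=True)
    return ranked[:k]

def build_prompt(query, docs, k=1):
    """Assemble the prompt that would be sent to the language model."""
    context = "\n".join(f"- {d}" for d in retrieve(query, docs, k))
    return (
        "Answer using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )

if __name__ == "__main__":
    print(build_prompt("How do I reset my password?", DOCS))
```

The string returned by `build_prompt` is what you would pass to whichever model API you use; that call is left out here on purpose so the sketch runs without any dependencies.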
Project Sid
A Minecraft server was populated with 1000+ autonomous AI agents to create a virtual world, and the results were astounding. I'm going to link to the post on X with a video detailing the process of creating this world and its outcome. It shows the capabilities of AI agents working together and how they can interact at an unprecedented scale. The agents created religion, government, culture, an economy, and more.
Understanding Microsoft's "Aurora: A Foundation Model of the Atmosphere" [Breakdowns]
Microsoft introduced Aurora, a 1.3 billion parameter foundation model designed for high-resolution weather and atmospheric forecasting. Aurora leverages a 3D Perceiver encoder and Swin Transformer U-Net backbone to handle heterogeneous data and predict weather patterns with high accuracy, even with limited training data. It outperforms traditional simulation tools and specialized deep learning models by producing 5-day global air pollution predictions and 10-day high-resolution weather forecasts quickly. Devansh wrote an excellent breakdown on the Aurora paper here.
xAI's massive training cluster
xAI has launched the Colossus supercomputer, currently the world's most powerful AI training system, featuring 100,000 Nvidia H100 GPUs. Plans are in place to double its capacity to 200,000 GPUs, including 50,000 Nvidia H200s, in the coming months. This is an incredible feat because it took only about 4 months to get it set up. This is a logistical masterpiece and a great collaboration between Dell, Nvidia, and xAI.
🥇Top ML Papers of the Week
I always highlight paper overviews and this week has had many interesting papers released. Key papers include AlphaProteo, which significantly improves protein binding affinities, and MemLong, which extends LLM context length up to 80k tokens. Additionally, research shows generative AI tools boost software developers' productivity by over 26%. Check out the summaries here.
Discussions
AlphaProteo
Google released AlphaProteo, an AI model that designs novel proteins for drug development. There was some interesting discussion on X that I had to get involved in, where the sentiment toward the AI Google puts out to help with drug development was negative. The general sentiment seems to be that Google is putting out these models and they aren't helping with anything. There was even a large account that stated Google should be focusing on curing cancer and diabetes instead of these models, as if those two tasks were simple.
I had to correct that misinformation and explain how beneficial these models have been. It resulted in me getting blocked, but one of the reasons I'm on X is to push back on misinformation. For a simple explanation, Google's AI models help design novel proteins that can be used to treat illnesses and symptoms. The AI has taken a process from hours to minutes. This is only a small portion of drug development. After these proteins are determined, there are years of other processes before a drug hits the market. Drug development is a complex and time-consuming process and most people don't seem to understand this. If you want a more comprehensive timeline for drug development, here's a good explanation.
OpenAI's Large Price Tag
There's been a lot of discussion surrounding an estimate for OpenAI's pricing going forward. The estimate (unconfirmed as far as I can tell) puts the cost of an OpenAI monthly subscription at ~$2000. This is enterprise pricing. Many people on X are making posts as if this will be charged to consumers. This is a pretty fair enterprise price in my opinion, as the product can bring in this much value and enterprise-grade support must also be provided.
TIME's Top 100 Influential People in AI
TIME posted their top 100 most influential people in AI. They included AI influencers and celebrities while leaving out many people who have made significant technical contributions. A glaring omission is Elon Musk who has, love him or hate him, contributed significantly by facilitating a lot of cutting-edge AI research, development, and products. This has sparked discussion about how truthful journalism really is.
Reflection 70B's Misfire
I'm not a huge fan of highlighting when people/companies make mistakes unless it's important, and I feel this is. A company released a 70B parameter model claiming it outperformed much larger models. Upon independent inspection, the claims couldn't be verified and there was evidence that the release was not entirely truthful. For a more detailed report, check out this post on X. It's a bit more negative than I would like, but it does show all the claims found across X over the past few days. Here's another post about the current state of analyzing it. If something comes up where Reflection 70B is actually really good, I'll let you know.
Cosm Dallas is a Massive Dome to Create an Immersive VR-like Experience
I'm including this one just because I think it's cool and it also solves a lot of the problems with current VR. There is a dome in Dallas that displays football games from a camera inside the actual stadium. The dome allows many fans to fit inside and experience the game via the camera. Watching sports is very much a group experience, and this method of doing so captures the benefits of VR while creating a more immersive experience.