- AgentsX
- Posts
- Meet MLE-STAR: Google's New AI That Automates Machine Learning Tasks
Meet MLE-STAR: Google's New AI That Automates Machine Learning Tasks
Google's New Weapon for Developers.
What’s trending?
Google's AI Just Built a Better AI
$30M Says AI’s Future Is Vertical
How to Keep AI From Being Clueless
From Code to Deployment: How MLE-STAR Automates the AI Pipeline
Google Research has unveiled MLE-STAR, a breakthrough AI agent that revolutionizes machine learning engineering by combining intelligent web search, targeted code refinement, and adaptive ensemble strategies.
Unlike traditional MLE tools that rigidly rely on standard libraries like scikit-learn, this system dynamically evolves ML pipelines with minimal human intervention, achieving 63.6% Kaggle medal rates (including 36% gold) in benchmark tests.

How MLE-STAR Works Differently
Smart Web Research - Scours for cutting-edge model architectures instead of defaulting to outdated options.
Precision Refinement - Identifies weak pipeline components (feature engineering/model selection) for surgical improvements.
Adaptive Ensembling - Generates & optimizes multiple solution variants automatically.
Built-In Safeguards -
A debugging agent fixes runtime errors.
Data leak prevention blocks test set contamination.
The usage checker ensures full dataset utilization.
Presenting MLE-STAR, a novel research focused ML engineering agent that integrates web search and targeted code block refinement that could help foster innovation and streamline ML model development. Learn more at goo.gle/4fmXvmK
— Google Research (@GoogleResearch)
5:01 PM • Aug 1, 2025
Key Advantages Over Competing Systems
2.5x performance boost over previous state-of-the-art (63.6% vs 25.8% success rate).
Adopts modern architectures (EfficientNet, ViT) instead of legacy models like ResNet.
Supports manual overrides (e.g., integrating custom models like RealMLP).
Mitigates LLM hallucinations via automated validation checks.
Currently available as open-source research software via Google's Agent Development Kit, MLE-STAR demonstrates how AI can not just assist but autonomously advance ML engineering, while maintaining rigorous reproducibility standards.
The Next AI Wave? $30M+ for Agents That Speak Industry Lingo
The AI research startup Fundamental Research Labs (formerly Altera) has raised $33 million in Series A funding led by Prosus, with participation from Stripe CEO Patrick Collison. This brings their total funding to over $40 million following a $9 million seed round last year.
Founded by ex-MIT professor Dr. Robert Yang, the company operates unconventionally with four parallel teams:
Games (evolving from Minecraft AI bots).
Prosumer Apps.
Core Research.
Platform Development.
Yang describes the vision as building a "historical" company rather than following traditional startup trajectories.
Shortcut – the first superhuman excel agent – is live.
While not perfect, Shortcut beats first year analysts from McKinsey/Goldman head-to-head 89.1% (220:27) when blindly judged by their managers.
We even gave humans 10x more time.
Try Shortcut now (before your boss does).
— nico (@nicochristie)
4:00 PM • Jul 28, 2025
Products Generating Real Revenue
Two flagship AI agents are already monetized post-free trials:
Fairies
General-purpose assistant that connects apps, answers across platforms. queries, and automates workflows.
Serves as a testbed for the company's core AI advancements.
Shortcut
Spreadsheet-based autonomous analyst for financial modeling.
Designed with Excel-like familiarity for power users.
Investor Confidence
Prosus' Sandeep Bakshi highlights what sets the company apart,
"Digital humans with actual use cases" beyond demos
Unique ability to attract top AI talent
Rapid translation of research into commercial products
While currently focused on productivity apps ("where the most value is created"), Yang reveals long-term ambitions to solve physical problems through embodied AI and robotics.
As AI assistants evolve from chatbots to autonomous agents that can email, edit documents, and manage databases, the tech industry faces a critical challenge: creating the infrastructure to let these agents operate safely and efficiently across our digital lives.
Two emerging protocols, Anthropic’s Model Context Protocol (MCP) and Google’s Agent2Agent (A2A), aim to standardize how AI interacts with software and other agents, but significant hurdles remain.
What is MCP?
Why is everyone talking about it?
Let’s take a closer look.
Model Context Protocol (MCP) is a new system introduced by Anthropic to make AI models more powerful.
— Alex Xu (@alexxubyte)
4:35 PM • Mar 10, 2025
The Protocol Landscape
MCP (Anthropic): Translates between natural language and APIs, with over 15,000 servers already registered. Focuses on agent-tool communication.
A2A (Google): Governs agent-to-agent coordination, adopted by 150+ companies (Adobe, Salesforce). Designed for multi-agent workflows.
Other Players: Cisco, IBM, and academic projects like Oxford’s Agora, are developing competing standards.
Key Challenges
Security Risks
Agents are vulnerable to prompt injection attacks (e.g., hijacking via malicious emails).
Proponents argue that standardization will make vulnerabilities easier to detect and patch.
Open vs. Controlled Development
A2A is open-source under Linux Foundation governance.
MCP remains Anthropic-owned, though forkable. Critics want broader oversight to prevent monopolization.
Efficiency Trade-offs
Natural-language communication (MCP/A2A’s approach) is human-friendly but token-heavy, inflating costs.
"You waste tokens summarizing documents no human will see," notes researcher Zhaorun Chen.
Alternatives like Agora use structured data for machine-to-machine efficiency.
The Path Forward
While these protocols are gaining traction, experts agree they’re in early stages:
Security frameworks must evolve to prevent real-world harm.
Governance models need to balance between corporate control and community input.
Optimized communication methods could reduce computational overhead.
"This is the plumbing for the AI age," says AWS’s David Nalley. "Getting it wrong means leaks, clogs, or worse."
Stay with us. We drop insights, hacks, and tips to keep you ahead. No fluff. Just real ways to sharpen your edge.
What’s next? Break limits. Experiment. See how AI changes the game.
Till next time - keep chasing big ideas.
What's your take on our newsletter? |
Thank you for reading