📰 The news — say this first
Google DeepMind's AI solved 9 unsolved math problems (two of them stuck for 56 years) and verified its own proofs with no human checking. It came a day after OpenAI claimed a similar win that still needed human experts to confirm it.
Why it matters
More compute doesn't help if the AI just hallucinates faster. The only way to put all that extra power to use is for the machine to verify its own reasoning. That's exactly what happened in math this week.
"First, OpenAI claimed a breakthrough, disproving an 80-year-old math conjecture. But they had to lean on human experts to verify it. Then, a day later, Google DeepMind announces AlphaProof Nexus solved 9 open Erdős problems. Two of them unsolved by humans for 56 years."
"And the real story is how. DeepMind's proofs were machine-verified, automatically, with no human in the loop."
"Walk people through that, because it's the key."
"They paired a creative language model with a formal logic solver. The language model is the creative student tossing out ideas. The logic solver is the ruthless professor. It grades every step, and if there's one flaw, it rejects it and makes the model try again. Thousands of loops, until the proof is flawless."
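For anyone who wants to see the student-and-professor loop in miniature, here is a toy sketch. It is not DeepMind's system: the task (finding a factor pair of a number) and the names `propose`, `verify`, and `propose_and_verify` are assumptions made up for illustration. The only point is the shape of the loop: an unreliable guesser, an exact checker, and retries until the checker accepts or the budget runs out.

```python
import random
from typing import Optional, Tuple

Candidate = Tuple[int, int]


def propose(n: int, rng: random.Random) -> Candidate:
    """Hypothetical 'creative student': guesses a factor pair, usually wrongly."""
    a = rng.randint(2, n - 1)
    return a, n // a  # integer division, so a * b often misses n


def verify(n: int, candidate: Candidate) -> bool:
    """Hypothetical 'ruthless professor': an exact check with no partial credit."""
    a, b = candidate
    return a > 1 and b > 1 and a * b == n


def propose_and_verify(
    n: int, max_attempts: int = 10_000, seed: int = 0
) -> Optional[Tuple[Candidate, int]]:
    """Loop until the verifier accepts a candidate or the budget is spent.

    Returns (accepted candidate, attempts used), or None if nothing passed.
    """
    rng = random.Random(seed)
    for attempt in range(1, max_attempts + 1):
        candidate = propose(n, rng)
        if verify(n, candidate):
            return candidate, attempt
    return None


if __name__ == "__main__":
    print(propose_and_verify(91))  # 91 = 7 * 13, so some guess eventually passes
    print(propose_and_verify(97, max_attempts=500))  # 97 is prime: prints None
```

Nothing the guesser says is trusted on its own; an answer only leaves the loop once the checker has accepted it. When no valid answer exists, as with the prime 97, the loop returns nothing instead of a confident wrong answer.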
"And here's what I love. Demis Hassabis, the CEO who just oversaw this, comes out and tempers the hype. He says this is still not AGI. He calls it the foothills of the singularity, and puts true AGI about 4 years out."
"4 years is still fast. But once a machine can formally verify its own logic, it graduates from a text predictor to an autonomous reasoning engine. That's the same mechanism powering the coding agents."