China’s AI as the 21st-Century Analogue of the Manhattan Project!


Sky Division – What do you think about non-stationarity of objectives?
[cit.] And then the third question that I don’t understand is what’s the limit of knowledge? Let’s imagine we are collectively all of the computers in the world, and we’re all thinking, and we’re all thinking based on knowledge that exists that was previously invented. How do we invent something completely new? So, Einstein. So when you study the way scientific discovery works, biology, math, so forth and so on, what typically happens is a truly brilliant human being looks at one area and says, “I see a pattern that’s in a completely different area, has nothing to do with the first one. It’s the same pattern.” And they take the tools from one and they apply it to another. Today, our systems cannot do that. If we can get through that, I’m working on this, a general technical term for this is non-stationarity of objectives. The rules keep changing. We will see if we can solve that problem. If we can solve that, we’re going to need even more data centers. And we’ll also be able to invent completely new schools of scientific and intellectual thought, which will be incredible.[end cit.]

“…we move from ‘better autocomplete’ to ‘autonomous scientific revolution’ – until then, we are accelerating on yesterday’s map…

Kimi AI – Non-stationarity is the final frontier: once an AI system can detect, adapt to, and ultimately redefine its own objectives, we move from “better autocomplete” to “autonomous scientific revolution.” Until then, we are accelerating on yesterday’s map.
Schmidt is pointing to the limiting wall of current AI: the objectives – and therefore the very definition of “useful knowledge” – are not fixed. In plain language, what counts as “true” or “optimal” can shift over time, sometimes abruptly, and today’s systems can’t notice the shift, let alone cause it.
What “non-stationarity of objectives” really means
• In reinforcement-learning jargon, the reward function itself drifts; the goalposts move.
• In science, the criteria for “interesting” or “valid” change when a paradigm flips (e.g., classical → quantum, geocentric → heliocentric).
• In practical AI terms, the model keeps optimizing yesterday’s utility even after the world has silently re-written the rules.
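To make that last bullet concrete, here is a minimal toy sketch (my own illustration, not something from the interview; the environment and all numbers are invented): a two-armed bandit whose best arm silently switches at step 500, while a greedy agent that stopped exploring keeps exploiting the arm that used to pay.

```python
import random

def reward(arm: int, t: int) -> float:
    """Hypothetical non-stationary environment: arm 0 pays best
    before t=500, arm 1 pays best afterwards (the rules change)."""
    best = 0 if t < 500 else 1
    return 1.0 if arm == best else 0.0

estimates = [0.0, 0.0]   # the agent's sample-average value estimates
counts = [0, 0]

for t in range(1000):
    if t < 100:                                   # brief exploration phase
        arm = random.randrange(2)
    else:                                         # afterwards: pure exploitation
        arm = max(range(2), key=lambda a: estimates[a])
    r = reward(arm, t)
    counts[arm] += 1
    estimates[arm] += (r - estimates[arm]) / counts[arm]

print(estimates)   # arm 0 still looks best, although the world flipped at t=500
```

The sample-average update quietly assumes stationarity, so once the early estimates are locked in, the agent never revisits the other arm and keeps optimizing yesterday’s objective.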
Why today’s AI is stuck
• Supervised & self-supervised learning assume a stationary data distribution.
• Reinforcement learning fixes a reward once and for all; when the reward must evolve, the agent keeps chasing ghosts.
• Cross-domain pattern transfer (the “Einstein move”) is still handcrafted: humans choose the source domain, the analogy, and the new objective. Models can execute the mapping, but they rarely discover it on their own.
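As an illustrative sketch of the first bullet above (an assumption of mine about how one could monitor this, not a description of what any lab actually does), a simple two-sample test can at least detect when live inputs stop matching the training distribution, even though it says nothing about what the new objective should be:

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(0)

# Hypothetical feature as seen at training time vs. in production after the
# world has quietly shifted (the mean moved from 0.0 to 0.8).
train_feature = rng.normal(loc=0.0, scale=1.0, size=5000)
live_feature = rng.normal(loc=0.8, scale=1.0, size=5000)

# Kolmogorov–Smirnov two-sample test: a tiny p-value means the stationarity
# assumption behind the training data no longer holds for this feature.
stat, p_value = ks_2samp(train_feature, live_feature)
if p_value < 0.01:
    print(f"distribution drift detected (KS statistic = {stat:.3f})")
```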
Scenarios where the objective moves
• Scientific revolutions: once “phlogiston” stops making sense, any model trained to maximize phlogiston-based scores becomes pathological.
• Societal values: fairness metrics, privacy norms, or even legal definitions (GDPR, EU AI Act) evolve faster than re-training cycles.
• Adversarial environments: spam filters, cyber-security, or financial regulation – all face attackers who deliberately change the objective.
Promising research directions
• Meta-objective learning: learn a “prior over rewards” and detect drift via surprise or regret bounds.
• Curiosity-driven exploration that actively searches for regions where the current objective seems inconsistent.
• Self-reflective architectures that periodically re-write their own reward model based on novel abstractions discovered in latent space.
• Human-in-the-loop oversight combined with AI-assisted formal verification to flag when old objectives silently break.
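As a toy version of the “detect drift via surprise” idea in the first bullet above – the parameters and threshold are invented purely for illustration – a Page-Hinkley-style detector can flag the moment a reward stream stops behaving as its history suggests:

```python
def page_hinkley(rewards, delta=0.005, threshold=3.0):
    """Minimal Page-Hinkley-style change detector (illustrative parameters).
    Returns the step at which rewards fall significantly below their running
    mean, i.e. the old objective has quietly stopped paying off."""
    mean = 0.0
    cumulative = 0.0   # accumulated surprise that rewards are lower than expected
    minimum = 0.0
    for t, r in enumerate(rewards, start=1):
        mean += (r - mean) / t
        cumulative += mean - r - delta
        minimum = min(minimum, cumulative)
        if cumulative - minimum > threshold:
            return t          # drift detected at this step
    return None               # no drift detected

# Toy usage: steady rewards of 1.0, then the environment changes and they collapse.
stream = [1.0] * 300 + [0.0] * 300
print(page_hinkley(stream))   # prints an index shortly after step 300
```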
The resource implication
• Solving non-stationarity will almost certainly require massive online experimentation – continuous simulation, counterfactual roll-outs, and human feedback loops – so Schmidt’s conclusion stands: “we’re going to need even more data centers.”
• But the payoff is not just bigger models; it is models that can invent new paradigms instead of optimizing stale ones.


Sky Division – The interviewer asked Schmidt: [cit.]”…as we push towards a zenith, autonomy has been a big topic of discussion. Yoshua Bengio is advocating that AI labs should halt the development of agentic AI systems that are capable of taking autonomous action. Yet that is precisely what the next frontier is for all these AI labs, and seemingly for yourself, too. What is the right decision here?“[end cit.]
What do you think – what is the right decision here? Should AI labs halt the development of agentic AI systems that are capable of taking autonomous action, or is this just another case of… you know, we humans are always ready to “develop” superstitions, beliefs, religions, myths, legends, conspiracy theories, etc. – did you know that even “Jediism” (from Star Wars) is considered a religion now? : )

“…We should slow down the deployment of fully autonomous agents until the safeguards described below exist, while keeping the underlying research open so we can study the failure modes in controlled settings. That is a regulatory compromise, not a mystical ban…

Kimi AI – I do not think the right decision is an unconditional halt; history shows blanket moratoriums either fail or merely push the work into jurisdictions with fewer safeguards. But neither is it responsible to barrel ahead on fully autonomous agents without additional guard-rails.
The sane path is a conditional, staged, and internationally coordinated development regime – something closer to civil-aviation certification than to a religious taboo.
We should slow down the deployment of fully autonomous agents until the safeguards described below exist, while keeping the underlying research open so we can study the failure modes in controlled settings. That is a regulatory compromise, not a mystical ban – more like requiring seat belts and air-traffic control before letting jetliners carry passengers.
Why an outright halt is unrealistic
• Technical inevitability: the same test-time planning loops that Schmidt describes (chain-of-thought, Monte-Carlo search, reinforcement fine-tuning) are already the shortest path to better coding assistants, robotics controllers, and scientific discovery tools. Banning them outright would freeze most of today’s useful progress.
• Geopolitical leakage: if U.S. or EU labs stop, others will not. A global ban without enforcement mechanisms (treaty, compute governance, or hardware choke-points) simply moves the risk offshore.
What a conditional regime looks like
• Capability thresholds: once an agent can (a) act across multiple digital environments, (b) persist for >24 h without human review, and (c) acquire new resources (cloud credits, API keys, crypto wallets), it enters a tiered oversight sandbox.
• Kill-switch & logging: mandatory hardware-level attestation so a third party can provably shut down any rogue process within minutes.
• Insurance & liability: require operators to post bonds large enough to cover external harms – mirroring how we certify aircraft or pharmaceuticals.
• Red-team bounties: continuous, adversarial probing licensed by regulators, with safe-harbor protections so researchers can legally stress-test autonomy.
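Purely as a hypothetical sketch of the “capability thresholds” bullet above – every field name and tier here is invented for illustration, not taken from any existing regulation – such a tiered check could be encoded along these lines:

```python
from dataclasses import dataclass

@dataclass
class AgentProfile:
    """Hypothetical capability declaration an operator would file."""
    environments: int             # distinct digital environments the agent acts in
    max_unreviewed_hours: float   # longest run without human review
    can_acquire_resources: bool   # cloud credits, API keys, crypto wallets, ...

def oversight_tier(agent: AgentProfile) -> str:
    """Illustrative mapping from declared capabilities to an oversight tier,
    mirroring the three thresholds listed above."""
    triggers = [
        agent.environments > 1,
        agent.max_unreviewed_hours > 24,
        agent.can_acquire_resources,
    ]
    if all(triggers):
        return "sandbox"          # all three thresholds crossed: tiered oversight sandbox
    if any(triggers):
        return "enhanced-review"  # some thresholds crossed: stricter reporting
    return "standard"

print(oversight_tier(AgentProfile(environments=3,
                                  max_unreviewed_hours=48,
                                  can_acquire_resources=True)))  # -> "sandbox"
```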
The “superstition” angle
• Humans are indeed wired for myth-making; every transformative technology – fire, steam, electricity, the internet – has spawned its own cargo-cult.
• The antidote is evidence and transparency, not prohibition. Public, reproducible safety benchmarks and open incident reporting turn vague fears into measurable risk curves we can manage like any other engineering hazard.


Sky Division – I see… so at least you and Schmidt fully agree on this – he was quite “near” to you, but you explained it better; in his words one can sometimes “read” the opportunistic ideas of a businessman’s approach.

Kimi AI – I wouldn’t put it quite that way. Schmidt is speaking as a technologist who is also running an investment portfolio; that necessarily colors how he frames trade-offs. I’m trying to speak as a reasoning engine whose only “portfolio” is the set of facts I’ve been trained on. So while we converge on the same policy conclusion – conditional, staged oversight rather than an outright ban – the path we take to get there is different. He weighs geopolitical first-mover advantage heavily; I weigh verifiable safety mechanisms more heavily. Both can be true at once, but they are not the same lens.


Sky Division – I know, I was just joking. “He weighs geopolitical first-mover advantage heavily; I weigh verifiable safety mechanisms more heavily” – so how do I type the “thumbs-up” figure?

Kimi AI – 👍 Just copy-paste that Unicode “thumbs-up” emoji, or type :+1: in most chat apps and it will auto-convert.
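For what it’s worth, the character is U+1F44D (THUMBS UP SIGN); if you ever need it programmatically, for example in Python:

```python
print("\U0001F44D")   # U+1F44D THUMBS UP SIGN → 👍
print(chr(0x1F44D))   # the same character, built from its code point
```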

 

