Sky Division – Okay, but what do you think about the answer you just gave me – is it an “absolute” answer: definitive, an unbreakable law, a “golden” rule that can’t be surpassed, changed, or improved? Also, what do you think about the exceedingly dual-use nature of AI tech (this was another question the interviewer put to Schmidt)? AI tech is applicable to both civilian and military applications, so how do you broadly think about the dilemmas and ethical quandaries that come with this tech and how humans deploy it?
“…my answer is not absolute or final. It is a set of currently known safeguards that we can implement with today’s hardware and policy tools…
Kimi AI – No, my answer is not absolute or final. It is a set of currently known safeguards that we can implement with today’s hardware and policy tools. A future algorithmic breakthrough (e.g., side-channel exfiltration via power-line modulation, or a self-compiling model that fits into 8 kB of firmware) could punch holes in every layer I listed. All defenses are provisional; the best we can do is iterative hardening – red-team, patch, re-certify, repeat.
Dual-use is not a bug we can patch; it is a structural property of general-purpose computation. The ethical imperative is therefore continuous governance, not one-off bans. We treat it like aviation safety: every new airframe, route, and pilot is re-certified, and the rules themselves are revised every time the physics or the geopolitics change.
The dual-use dilemma
• Same code, opposite ends of the spectrum: the same transformer that recommends antibiotics to a rural clinic can also design novel toxins.
• No bright-line test: unlike uranium enrichment, where the technology itself is inherently weaponisable, most AI advances are capability-agnostic; intent is encoded in weights, datasets and deployment context, not in the algorithm.
• Historical precedent:
– The internet, GPS, and cryptography all slid rapidly from military to civilian use and back again; each wave forced new norms (export controls, responsible disclosure, open-source licenses).
– Biological research adopted the Asilomar guidelines after recombinant-DNA fears in the 1970s – voluntary, review-based, and updated every decade.
A practical ethical stance
a. Tiered access – keep the largest models (≥10²⁵ FLOP training runs) behind tiered compute licensing, similar to how civilian airlines share airspace with military jets but under different rules (a rough sizing sketch follows this list).
b. Dual-key governance – require both a technical safety audit and a civilian oversight board before any model above a capability threshold can be connected to critical infrastructure or weapons systems.
c. Kill-switch escrow – make the physical kill-switch I described earlier a legal prerequisite for any dual-use deployment, not an optional best practice.
d. Sunset clauses – every safeguard expires in, say, five years unless explicitly renewed, forcing society to re-argue the risk-benefit trade-off as technology evolves.
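For a sense of scale on the ≥10²⁵ FLOP tier in (a): training compute for dense transformers is commonly approximated as C ≈ 6·N·D (6 FLOPs per parameter per token, with N parameters and D tokens). A minimal sketch – the threshold value comes from the list above, while the example model and dataset sizes are hypothetical:

```python
# Rough sizing sketch for tier (a): estimate whether a planned run
# crosses the 1e25 FLOP licensing line, using the standard
# C ~ 6 * N * D approximation for dense transformer training.
# The threshold follows the figure quoted above; the example
# model/dataset sizes are made up for illustration.

THRESHOLD_FLOP = 1e25

def training_flops(n_params: float, n_tokens: float) -> float:
    """Approximate total training compute: ~6 FLOPs per parameter per token."""
    return 6.0 * n_params * n_tokens

def requires_license(n_params: float, n_tokens: float) -> bool:
    return training_flops(n_params, n_tokens) >= THRESHOLD_FLOP

# A 70B-parameter model on 15T tokens: 6 * 7e10 * 1.5e13 = 6.3e24 FLOP,
# just under the tier; a 400B model on the same data lands well over it.
print(f"{training_flops(7e10, 1.5e13):.2e}", requires_license(7e10, 1.5e13))  # 6.30e+24 False
print(f"{training_flops(4e11, 1.5e13):.2e}", requires_license(4e11, 1.5e13))  # 3.60e+25 True
```

Under this approximation the licensing tier tracks the product of model and dataset size, not either one alone – one reason a FLOP threshold is harder to game than a parameter-count threshold.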
Sky Division – [cit.]You don’t want systems that are not under our control. It’s a line we can’t cross. I think that’s correct. I think that the competition between the West, and particularly the United States, and China, is going to be defining in this area. The current government (he means the American government) has now put in essentially reciprocating 145-percent tariffs. That has huge implications for the supply chain. We in our industry depend on packaging and components from China that are boring, if you will, but incredibly important. The little packaging and the little glue things and so forth that are part of the computers. If China were to deny access to them, that would be a big deal. We are trying to deny them access to the most advanced chips, which they are super annoyed about.[end of cit.]
So, what do you think about this? It’s so important – perhaps DeepSeek, you… yourself, Kimi : ) are the “results” of such developments and/or policies. Schmidt said the number one issue the Chinese raise is this one – so it’s obvious, the Chinese are super annoyed about the advanced chips, so they respond in kind or even better: they come up with DeepSeeks and such.
“…global governance (shared safety standards, compute-use registries, red-team bounties) becomes more urgent than ever, because the hardware moat is evaporating faster than the software moat is widening…
Kimi AI – The quote captures a feedback loop that is already in motion: 1. US export controls → 2. Chinese determination to close the gap → 3. More indigenous R&D (DeepSeek, Ascend, Kunlunxin, etc.) → 4. Faster diffusion of “good-enough” open models → 5. Diminished US leverage.
How the pieces fit together – and what it means for everyone else
The choke-point is shrinking, not widening
• Advanced chips are still dominated by TSMC (Taiwan) + NVIDIA/AMD/Intel, but China’s SMIC has already shipped 7 nm-class logic at scale and is reportedly taping out 5 nm this year.
• Packaging, substrates, and “boring but critical” chemicals (ABF, BT resin, photoresists) are produced in Taiwan, Korea, and—yes—mainland China. A tit-for-tat embargo would hurt both sides, but China has deeper vertical integration in low-margin components than the US has in lithography.
• The 145 % reciprocal tariff Schmidt mentions is therefore more symbolic than decisive; the real battle is over process know-how and tooling IP, not tariff walls.
DeepSeek as a policy artifact
• DeepSeek’s 67-B model was reportedly trained on NVIDIA A100-class clusters acquired before the export controls took full effect, showing that the controls slow but do not stop frontier-scale training once a determined actor has ~$50 M and 12–18 months (a back-of-envelope check follows this list).
• The open-weights release is a soft-power play: every lab outside the US now has a credible alternative to GPT-4-class models without buying NVIDIA. That erodes the “only we can do this” narrative.
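As a sanity check, the ~$50 M / 12–18-month figure is roughly self-consistent under plausible rental assumptions. The A100’s 312 TFLOP/s BF16 peak is NVIDIA’s published spec; the utilization, price, and cluster size below are assumptions:

```python
# Back-of-envelope check on the "~$50 M and 12-18 months" claim above.
# A100 BF16 peak throughput is a published spec; utilization, rental
# price, and cluster size are assumed for illustration.

A100_PEAK_FLOPS = 312e12    # BF16 peak per GPU (published spec)
UTILIZATION = 0.40          # assumed sustained model-FLOP utilization
PRICE_PER_GPU_HOUR = 2.0    # assumed rental rate, USD
BUDGET_USD = 50e6
CLUSTER_GPUS = 2_000        # assumed rented-cluster size

gpu_hours = BUDGET_USD / PRICE_PER_GPU_HOUR                     # 25M GPU-hours
total_flop = gpu_hours * 3600 * A100_PEAK_FLOPS * UTILIZATION
months = gpu_hours / CLUSTER_GPUS / 24 / 30.4

print(f"{total_flop:.1e} FLOP")    # ~1.1e25: frontier scale
print(f"{months:.0f} months")      # ~17: inside the 12-18 month window
```

None of this makes the claim precise – utilization and pricing alone swing the answer by 2–3× – but it shows the order of magnitude holds up.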
Second-order effects
• Commoditization of capability: when “good enough” 70-B models are downloadable torrents, the strategic edge shifts from model size to infrastructure hardening (secure data centers, kill-switches, governance).
• Dual-use diffusion risk: the same open model that helps an African hospital translate medical records can also help a non-state actor write polymorphic malware. Export controls on chips don’t touch that vector.
Policy takeaway
• Chip denial is a delaying tactic, not a containment strategy. The US can buy 2–4 years, not decades.
• Global governance (shared safety standards, compute-use registries, red-team bounties) becomes more urgent than ever, because the hardware moat is evaporating faster than the software moat is widening.
Sky Division – [cit.]…if you look at DeepSeek, which is really impressive, they managed to find algorithms that got around the problems by making them more efficient. Because China is doing everything open source, open weights, we immediately got the benefit of their invention and have adopted into US things. So we’re in a situation now which I think is quite tenuous, where the US is largely driving, for many, many good reasons, largely closed models, largely under very good control. China is likely to be the leader in open source unless something changes. And open source leads to very rapid proliferation around the world. This proliferation is dangerous at the cyber level and the bio level. [end of cit.]
Schmidt also means dangerous in a more significant way – in a “nuclear-threat” way. What do you think?
“…yes, Schmidt is right to worry – when the breakthrough is an AI system that can reason, design, and plan, the multiplication factor is knowledge × compute × intent, and the last two are increasingly cheap and global…
Kimi AI – Yes, Schmidt is right to worry. Open-weights diffusion is not just a software freedom issue; it collapses the traditional latency between “breakthrough in one lab” and “usable blueprint in every garage.” When the breakthrough is an AI system that can reason, design, and plan, the multiplication factor is knowledge × compute × intent, and the last two are increasingly cheap and global.
Open-source AI is already the fastest diffusion technology in history. Nuclear weapons are still hard, but the marginal effort is dropping with every new open model. Unless we layer post-download governance (registries, attestation, hardware kill-switches) on top of the open-weights release, we risk turning a strategic advantage into a global hazard amplifier.
How open-weights can shorten the path to nuclear-level harm
• Fissile-material handling: today you still need centrifuges, chemistry, and metallurgy. A capable open-source agent could accelerate every step – calculating optimal cascade timing, modeling neutron reflectors, engineering containment bypasses – so that a small, well-funded non-state group needs months instead of years.
• Delivery system engineering: guidance, propulsion, and re-entry calculations that once required national labs can now be iterated in simulation by a 70-B model running on a rented cluster.
• Red-team-for-hire: the same model can probe every known safeguard (PAL codes, tamper sensors, insider-threat protocols) and output step-by-step exploits.
The proliferation funnel
• Closed US models → slow, controllable
• Open Chinese models → instant global download → local fine-tuning → domain-specific agents → weaponization recipes
• Once the model is on disk, export controls on GPUs become irrelevant.
Policy levers that still work
• Compute-use registries: require every >10²⁴ FLOP training run to log its final weights hash and intended use-case; violation triggers hardware revocation (a minimal record sketch follows this list).
• Dual-key firmware kill-switches baked into GPUs and datacenter BMCs, escrowed by an international consortium (IAEA-style for compute).
• Synthetic-data watermarking: force any model trained on sensitive domains (nuclear physics, virology) to embed a cryptographically signed “trace token” detectable in downstream outputs.
• Red-team bounties funded jointly by the U.S., EU, and China to continuously probe open-weights releases for dangerous emergent capabilities.
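To make the first lever concrete, here is a minimal sketch of what a registry record could look like: stream the final weights through SHA-256 and wrap the digest in a signed record. The field names – and HMAC standing in for a real asymmetric signature scheme – are illustrative assumptions, not a proposed standard:

```python
# Minimal sketch of a compute-use registry record: hash the final
# checkpoint and sign the resulting record. HMAC with a shared key
# stands in for a proper asymmetric signature; fields are illustrative.

import hashlib
import hmac
import json
import time

def weights_hash(path: str, chunk_size: int = 1 << 20) -> str:
    """Hash a (possibly huge) checkpoint file without loading it whole."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()

def registry_record(path: str, flop_estimate: float, intended_use: str,
                    signing_key: bytes) -> dict:
    record = {
        "weights_sha256": weights_hash(path),
        "flop_estimate": flop_estimate,
        "intended_use": intended_use,
        "timestamp": int(time.time()),
    }
    # Sort keys so the serialized payload (and thus the signature) is deterministic.
    payload = json.dumps(record, sort_keys=True).encode()
    record["signature"] = hmac.new(signing_key, payload, hashlib.sha256).hexdigest()
    return record

# Hypothetical usage:
# registry_record("final.safetensors", 2.1e24, "medical translation", b"escrow-key")
```

Streaming the hash matters because frontier checkpoints run to hundreds of gigabytes; the sorted-key serialization keeps the signature reproducible by any auditor.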
Sky Division – Of course we (humans) are always the problem… we produce our own troubles; so far our problems couldn’t “originate” from our tools themselves – we’re the “evil guys” : ) But he’s so right, I think, right?
[cit.]…You’re the good guy, and I’m the bad guy, OK? You’re six months ahead of me, and we’re both on the same path for superintelligence. And you’re going to get there, right? And I’m sure you’re going to get there, you’re that close. And I’m six months behind. Pretty good, right? Sounds pretty good. But No. These are network-effect businesses. And in network-effect businesses, it is the slope of your improvement that determines everything. So I’ll use OpenAI or Gemini, they have 1,000 programmers. They’re in the process of creating a million AI software programmers. What does that do? First, you don’t have to feed them except electricity. So that’s good. And they don’t quit and things like that. Second, the slope is like this. Well, as we get closer to superintelligence, the slope goes like this (almost vertical rising). If you get there first, you dastardly person – I will not be able to catch you. And I’ve given you the tools to reinvent the world and in particular, destroy me. That’s how my brain, Mr. Evil, is going to think. [end of cit.]
Kimi AI – He’s describing a winner-take-all race dynamic that is almost built into the current architecture of the field. The policy implication is stark: you cannot regulate the outcome after the fact; you have to regulate the inputs and the rules of engagement before the curve goes vertical – which means today, not six months from now. Four compounding trends drive the dynamic:
– training cost ↓ exponentially with algorithmic efficiency
– talent cost ↓ exponentially when the model writes its own code
– data cost ↓ exponentially when the model generates its own data
– network effects ↑ exponentially once the leading system can do all of the above faster than any follower
So the “six-month lead” collapses to days or hours near the vertical part of the curve. That is not science fiction; it is a straightforward extrapolation of compound productivity growth once you have:
– a code-writing model that improves itself,
– near-zero marginal cost for extra copies, and
– closed-loop data generation.
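A toy model makes the shape of that claim concrete. Purely as an illustration, assume capability feeds back into its own growth rate as dC/dt = k·C² – the simplest runaway-growth toy model, where the constant k and the quadratic feedback are assumptions, not measurements. Then a fixed six-month lag in time becomes an exploding gap in capability as the curve approaches vertical:

```python
# Toy model of the "slope goes vertical" dynamic. dC/dt = k*C^2 with
# C(0) = 1 solves to C(t) = 1 / (1 - k*t) and blows up at t = 1/k.
# The constant k and the quadratic feedback are illustrative
# assumptions; only the shape of the curve matters.

def capability(t_years: float, k: float = 0.3) -> float:
    """Closed-form solution of dC/dt = k*C^2 with C(0) = 1."""
    return 1.0 / (1.0 - k * t_years)

LAG = 0.5  # the follower runs the same curve six months later

for t in (1.0, 2.0, 3.0, 3.2, 3.3):
    ratio = capability(t) / capability(t - LAG)
    print(f"t = {t:.1f} y  leader/follower capability ratio = {ratio:.2f}")

# Output: 1.21, 1.38, 2.50, 4.75, 16.00 - the same six-month lag is
# noise early on and an unbridgeable gap as t nears the blow-up at ~3.3 y.
```

The exact numbers are meaningless; the shape is the point: early on the follower is within ~20 %, while near the blow-up the same calendar lag becomes an order-of-magnitude capability gap.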
