Friday, May 16, 2025

This week's interesting finds

This week in charts

Cumulative flows to U.S. equity funds

U.S. fund flows by market-cap

Small cap performance

Small cap vs. large cap 10-year performance

Nifty Fifty vs. Magnificent 7 vs. S&P 500 Index

Historical Dow Jones Industrial Average & Fed Funds Rate %

AI adoption rate by sector

U.S. consumer savings rate

Distribution of expenditures on imports

DeepSeek’s ‘Tech Madman’ Founder Is Threatening US Dominance in AI Race

DeepSeek cleared the fogged window through which Americans have viewed much of China’s AI scene: shrouded in mystery, easier to dismiss as an exaggerated specter but very likely more daunting than they’re willing to admit. Before the startup’s emergence, many US companies and policymakers held the comforting view that China still lagged significantly behind Silicon Valley, giving them time to prepare for eventual parity or to prevent China from ever getting there.


Where China sees innovation, many in the US continue to suspect malfeasance. An April report from a bipartisan House of Representatives committee alleged “significant” ties between DeepSeek and the Chinese government, concluding that the company unlawfully stole data from OpenAI and represented a “profound threat” to US national security. Dario Amodei, CEO of Anthropic, has called for more US export controls, contending in a 3,400-word blog post that DeepSeek must have smuggled significant quantities of Nvidia GPUs, including its state-of-the-art H100s. (Bloomberg News recently reported that US officials are probing whether DeepSeek circumvented export restrictions by purchasing prohibited chips through third parties in Singapore.)

The Chinese Embassy has rejected the House committee’s claims as “groundless.” Nvidia has said that DeepSeek’s chips were export-compliant and that more restrictions could benefit Chinese semiconductors. A spokesperson for the chipmaker says forcing DeepSeek to use more chips and services from China would “boost Huawei and foreign AI infrastructure providers.”

The company at the center of this debate continues to be something of an enigma. DeepSeek prides itself on open-sourcing its AI technology while not being open whatsoever about its inner workings or intentions. It reveals hyperspecific details of its research in public papers but won’t provide basic information about the general costs of building its AI, the current makeup of its GPUs or the origins of its data.

To further understand how the company works and how it fits into the country’s broader AI ambitions, Bloomberg Businessweek spoke with 11 former employees of Liang’s, along with more than three dozen analysts, venture capitalists and executives close to China’s AI industry.

The lack of a public presence has allowed critics such as Amodei and OpenAI head Sam Altman to fill the void with aspersions, which resonate with US audiences who are primed to see Chinese technology as a shadowy threat. But even those who remain wary of DeepSeek are being forced to grapple with the undeniable prowess of its AI. Dmitry Shevelenko, the chief business officer of Perplexity AI Inc., says not a single person at his company, which makes an AI-powered search product, has managed to communicate with any counterparts at DeepSeek. Nevertheless, Perplexity has embraced DeepSeek’s tech, hosting it only on servers in the US and Europe and post-training it to remove any datasets indicative of CCP censorship. Perplexity branded it R1 1776 (a reference to the year of the US’s founding), which Shevelenko describes as an homage to freedom. “We don’t know what DeepSeek’s true motivations are,” he says. “It’s a bit of a black box.”

DeepSeek had anticipated its AI might cause concerns abroad. In an overlooked virtual presentation at an Nvidia developer conference in March 2024, Deli Chen, a deep-learning researcher at DeepSeek, spoke of how values ought to be “decoupled” from LLMs and adapted to different societies. On one coldly logical slide, Chen showed a DeepSeek prototype for customizing the ethical standards built into chatbots being used by people of various backgrounds. With a quick tap of a button, developers could set the legality of issues including gambling, euthanasia, sex work, gun ownership, cannabis and surrogacy. “All they need to do is to select options that fit their needs, and then they will be able to enjoy a model service that is tailored specifically to their values,” Chen explained.

Finding such efficient workarounds was always the cultural norm at DeepSeek. Liang and his friends studied various technical fields at Zhejiang University in the mid-2000s—machine learning, signal processing, electronic engineering and more—and, apparently for kicks (and for cash), developed computer programs to trade stocks during the global financial crisis.

After graduating, Liang continued building quant-trading systems on his own, earning a small fortune before joining forces with several of his university friends in Hangzhou, where they launched what became known as High-Flyer Quant in 2015.

As would be the case with DeepSeek, High-Flyer cultivated a sense of mystery—its first social media post referred to Liang only as “Mr. L”—while committing itself to a kind of lemme-prove-it transparency. Every Friday, High-Flyer would post charts of the performance of its 10 original funds on the Chinese super-app WeChat. By the time it restricted the weekly data to registered investors in the summer of 2016, the portfolio was seeing average annualized returns of 35%.

Billions of dollars eventually flowed into High-Flyer’s holdings, and its investment and research group increased to more than 100 employees. Liang started recruiting in earnest for an AI division in 2019, aiming to mine gargantuan datasets to spot undervalued stocks, tiny price fluctuations for high-frequency trading and macro trends that industry-specific investors were missing. By the beginning of the Covid-19 pandemic, he and his team had constructed a high-performance computing system of interconnected processors running in tandem, a setup known as a cluster. For this cluster, High-Flyer said it had acquired 1,000 Nvidia 2080Ti chips—commonly used by gamers and 3D artists—and an additional 100 Volta-series GPUs. (The Volta GPU, aka the V100, was Nvidia’s first AI-optimized processor.) Whereas High-Flyer’s previous, smaller computing architecture required two months to train a new economic analysis model, its new equipment needed less than four days to process the same workload.

It’s unclear how much of this infrastructure was ultimately intended for quant trading versus Liang’s expensive hobby. The next spring, about five months after OpenAI introduced ChatGPT, he spun out DeepSeek as an independent research lab. At separate offices in Hangzhou and Beijing, finance was no longer the focus. In an unsigned manifesto rife with platitudes, High-Flyer vowed to shun mediocrity and tackle the hardest challenges of the AI revolution. Its ultimate goal: artificial general intelligence.

Throughout 2023 the DeepSeek lab raced to build an AI code assistant, a general-knowledge chatbot and a text-to-3D-art generator. Liang brought over engineers from High-Flyer and recruited more from Microsoft Corp.’s Beijing office and leading Chinese tech companies and universities. Bo “Benjamin” Liu, who joined as a student researcher that September prior to starting a Ph.D., says Liang frequently gave interns crucial jobs that elsewhere would be assigned to senior employees. “Take me as an example: When I got to the company, no one was working on the RLHF infra”—the infrastructure needed to support an important technique known as reinforcement learning from human feedback—“so he just let me do it,” Liu says. “He will trust you to do the things no one has done before.” (That trust came with a secondary benefit to DeepSeek: It paid interns the equivalent of $140 per day with a $420 monthly housing subsidy, generous compensation in China but about a third of what interns make at AI companies in the US, and a tiny fraction of what full-time Silicon Valley engineers earn.)

Liang placed a huge and early bet on sparsity, a technique for training and running LLMs more efficiently by breaking them down into specialties, according to two ex-DeepSeek researchers. When you asked the original ChatGPT a question, its entire LLM brain would activate to determine the ideal answer, whether you asked for the sum of 2 + 2 or a pie recipe. A sparse model, by contrast, would make better use of resources by being partitioned into “experts,” with only the relevant ones being activated in response to any particular prompt.
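The gating idea described above can be sketched in a few lines. This is a deliberately toy illustration of sparse expert routing, not DeepSeek's actual architecture: the expert names, keyword scoring and `top_k` choice are all invented for the example.

```python
# Toy sketch of sparse "mixture-of-experts" routing: instead of the
# whole model activating for every prompt, a gate scores the experts
# and only the most relevant ones run. All names and the keyword-based
# scoring here are illustrative assumptions.

def gate(prompt, experts, top_k=2):
    """Score each expert against the prompt and return the top_k names."""
    words = prompt.lower()
    scores = {name: sum(kw in words for kw in keywords)
              for name, keywords in experts.items()}
    # Highest-scoring experts are "activated"; the rest stay idle.
    return sorted(scores, key=scores.get, reverse=True)[:top_k]

# Hypothetical experts, each tagged with the topics it specializes in.
EXPERTS = {
    "math":    ["sum", "add", "integral"],
    "cooking": ["pie", "recipe", "bake"],
    "code":    ["python", "function", "bug"],
}

print(gate("What is the sum of 2 + 2?", EXPERTS))   # math expert ranks first
print(gate("Give me a pie recipe", EXPERTS))        # cooking expert ranks first
```

In a real sparse LLM the gate is itself a learned network and the "experts" are blocks of parameters, but the resource argument is the same: compute is spent only on the partitions relevant to the prompt.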

More breakthroughs followed, each shared publicly and increasingly catching the attention of Chinese competitors. Then, in late 2024, DeepSeek released V3, a general-purpose AI model that was about 65% larger than Meta Platforms Inc.’s equivalent, which was then the biggest open-source LLM available. But it was a lengthy V3 research paper that really grabbed the attention of executives at Google, OpenAI and Microsoft, about a month before DeepSeek broke into the wider consciousness with its R1 reasoning model. One shocking statistic that leapt off the PDF: DeepSeek implied that V3’s overall development had cost a mere $5.6 million. It’s likely this sum referred only to the final training run—a data-refinement process that transforms a model’s previous prototypes into a complete product—but many people perceived it as an insanely low budget for the entire project. By comparison, cumulative training for the most advanced frontier models can run $100 million or more. Anthropic’s Amodei even predicted (before the rise of DeepSeek) that next-generation models will each cost anywhere from $10 billion to $100 billion to train.

DeepSeek exhibited its rapid progress because Liang saw the open-source ethos as integral to his philosophy. He believed that hiding proprietary techniques and charging for powerful models—the approach taken by top US labs including OpenAI and Google—prioritized short-term advantage over more durable success. Making his models entirely accessible to the public, and largely free, was the most efficient way for DeepSeek to accelerate adoption and get startups and researchers building on its tech. The hope was that this would create a flywheel of product consumption and feedback. As DeepSeek wrote in the announcement of its first publicized LLM almost two years ago, quoting Linus Torvalds, creator of the open-source Linux operating system: “Talk is cheap, show me the code.”

DeepSeek, however, has presented itself as no different from any hot startup—the product of “pure garage-energy,” it said in a February post on X. After all, it operates on the same Beijing campus as Google, not far from a Burger King and two Tim Hortons. Just because the broader AI industry didn’t pay much attention to DeepSeek until now doesn’t mean something shady is happening behind the scenes. “The AI world didn’t expect DeepSeek,” says Arnaud Barthelemy, a partner at VC firm Alpha Intelligence Capital, which has invested in OpenAI and SenseTime. “They should have.”

Barthelemy says the real lesson to take from DeepSeek is how effectively Chinese tech companies are turning the constraints they operate under into a strength. “There are plenty of smart minds in China who did a lot of smart innovation with much lower compute requirements,” he says.


This week’s fun find

Unwind with the Ancient Japanese Art of Kumiko, a Wood Joinery Technique

If you’re familiar with the Japanese art of wood joinery, you’ll likely find kumiko equally intriguing. The traditional craft emerged in the Asuka era between about 600 and 700 C.E. and similarly eschews nails in favor of perfectly cut pieces that notch into place. Intricate fields of florals and geometric shapes emerge, creating a decorative panel that typically covers windows or divides a room.