This week in charts
U.S. equity risk premium
Manufacturing construction
U.S. presidents and the S&P 500 Index
PMR portfolio defaults
Trailing return contribution by region
U.S. vs global equities relative performance
European stocks
U.S. government expenditures
U.S.-China trade
Patents by region
10-year U.S. Treasuries
AI’s math problem: FrontierMath benchmark shows how far technology still has to go
Artificial intelligence systems may be good at generating text, recognizing images, and even solving basic math problems—but when it comes to advanced mathematical reasoning, they are hitting a wall. A groundbreaking new benchmark, FrontierMath, is exposing just how far today’s AI is from mastering the complexities of higher mathematics.
Developed by the research group Epoch AI, FrontierMath is a collection of hundreds of original, research-level math problems that require deep reasoning and creativity—qualities that AI still sorely lacks. Despite the growing power of large language models like GPT-4o and Gemini 1.5 Pro, these systems are solving fewer than 2% of the FrontierMath problems, even with extensive support.
FrontierMath was designed to be much tougher than the traditional math benchmarks that AI models have already conquered. On benchmarks like GSM-8K and MATH, leading AI systems now score over 90%, leaving those tests close to saturation. One major issue is data contamination: AI models are often trained on problems that closely resemble those in the test sets, making their performance less impressive than it might seem at first glance.
Mathematical reasoning of this caliber demands more than just brute-force computation or simple algorithms. It requires what Fields Medalist Terence Tao calls “deep domain expertise” and creative insight. After reviewing the benchmark, Tao remarked, “These are extremely challenging. I think that in the near term, basically the only way to solve them is by a combination of a semi-expert like a graduate student in a related field, maybe paired with some combination of a modern AI and lots of other algebra packages.”
Mathematics, especially at the research level, is a unique domain for testing AI. Unlike natural language or image recognition, math requires precise, logical thinking, often over many steps. Each step in a proof or solution builds on the one before it, meaning that a single error can render the entire solution incorrect.
This makes math an ideal testbed for AI’s reasoning capabilities. It’s not enough for the system to generate an answer—it has to understand the structure of the problem and navigate through multiple layers of logic to arrive at the correct solution. And unlike other domains, where evaluation can be subjective or noisy, math provides a clean, verifiable standard: either the problem is solved or it isn’t.
Ivy League endowments struggle with private market downturn
The drawn-out downturn in private market returns is hitting one group of investors especially hard: Ivy League university endowments.
Leading US university endowments, many of which allocate outsized portions of their portfolios to private equity and venture capital, have underperformed the university average for the second year in a row. Prominent ones like Yale and Princeton are lagging far behind their smaller peers as the once-lucrative asset class suffers from a plunge in dealmaking and stock listings.
Top endowments have long used aggressive exposure to private investments in pursuit of excess returns they believe are out of reach through public markets. Now, as those investments have yet to pay off, some large endowments like Princeton have issued bonds to meet funding needs, according to the New Jersey Educational Facilities Authority.
Six of the eight Ivy League universities reported returns for the 12 months ended June that were below the higher education average of 10.3%, according to Cambridge Associates, an investment consultancy. Yale and Princeton fared the worst, returning 5.7% and 3.9% respectively.
The underperformance follows an even weaker 2023, when no Ivy League school was able to match the 6.8% industry average. Yale gained 1.8% while Princeton lost 1.7% last year. Ivy League endowments, which are among the wealthiest in the world, reported mediocre returns due in large part to their aggressive bets on illiquid but high-return alternative investments, which have fallen victim to the prolonged high-interest-rate environment.
And the paltry returns are coming at a time when public markets have soared, with the S&P 500 equity index up 57% in the last two years and bonds frequently yielding more than 4%.
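To see how the two weak years add up, the annual figures quoted above can be compounded into a single two-year return. The sketch below is illustrative only: `compound` is a hypothetical helper, the inputs are the percentages reported in this article, and it assumes the 6.8% and 10.3% averages are comparable from one year to the next. The S&P 500 figure covers "the last two years" and may not line up exactly with the June fiscal-year window.

```python
def compound(annual_returns_pct):
    """Compound annual percentage returns into one cumulative percentage."""
    growth = 1.0
    for r in annual_returns_pct:
        growth *= 1 + r / 100
    return (growth - 1) * 100


# Inputs are the prior-year and latest-year returns quoted in the article.
two_year = {
    "Higher-ed average": compound([6.8, 10.3]),
    "Yale": compound([1.8, 5.7]),
    "Princeton": compound([-1.7, 3.9]),
}

for name, pct in two_year.items():
    print(f"{name}: {pct:.1f}%")
# Higher-ed average: 17.8%
# Yale: 7.6%
# Princeton: 2.1%
```

On these assumptions, both endowments trail the two-year average by roughly 10 to 16 percentage points, and all three figures sit well below the 57% gain cited for the S&P 500.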
Most Ivy League endowments had allocated more than 30% of their assets to private equity and venture capital by the first half of this year, and Yale and Princeton at least 40%, according to Old Well Labs, a consultancy. In contrast, a survey of 121 university endowments by Cambridge Associates found that their allocation to private equity and venture capital averaged 22% over the same period.
The struggle by elite university endowments to generate excess returns has raised fresh concerns about their investment model, which has been emulated by asset allocators around the world, from sovereign wealth funds to community foundations.
This week’s fun finds
Nigel Pickford has spent a lifetime searching for sunken treasure—without leaving dry land.
This Atlantic wreck was beguiling. An R.O.V.—a remotely operated vehicle, connected by a cable to the exploration vessel—was sent down to take a closer look. It was the remains of an old wooden sailing ship, stuffed with cargo, lying some six thousand metres below the surface—much deeper than the Titanic. The contents seemed to be Asian in origin: intricate lacquered screens and bolts of cloth, thousands of slender rattan canes, and an extraordinary array of porcelain, all preserved in the darkness of the ocean. “It was just cascading in these spills down around the slopes and undulations of the seabed,” [marine archaeologist Mensun] Bound recalled. “And there were barrels there, which hadn’t been opened. They were sitting there intact.”
There is something almost dangerously tantalizing about an undiscovered shipwreck. It exists on the edge of the real, containing death and desire. Lost ships are lost knowledge, waiting to be regained. “It’s like popping the locks on an old suitcase and you lift the lid,” Bound told me. Bound grew up on the Falkland Islands in the nineteen-fifties. In 2022, he found the Endurance, Ernest Shackleton’s polar-exploration ship, under the ice of the Weddell Sea, off Antarctica. “On a shipwreck, everything, in theory, that was there on that ship when it went down is still there,” he said. “It’s all the product of one unpremeditated instant of time.”
What was the ship? There was an obvious person to ask. In 1993, Bound had been searching for the remains of a nineteenth-century English trading vessel, the Caroline, in the Straits of Malacca, in Southeast Asia, when he and his colleagues pulled up a much older, bronze cannon instead. The cannon was marked with a relief of a sailing ship, the name of the Dutch East India Company, and a date, 1604. “I had no idea what it was doing there or anything,” Bound said. But he had heard of a self-taught shipwreck researcher, based in England, who was said to have an unusually broad grasp of the world’s lost vessels. Bound contacted the researcher, Nigel Pickford, by satellite phone from the ship.
Within twenty-four hours, Pickford replied, saying that Bound and his team were on the site of the Battle of Cape Rachado, which was fought between Portuguese and Dutch fleets over several days in August, 1606. The cannon probably belonged to a ship called the Nassau. “He said, ‘O.K., you found one wreck by itself,’” Bound recalled. “‘There should be three wrecks nearby.’ And he even gave us a rough direction.”