Dated but functioning sewing machines sit under maps of Italy at Philip's Shoe Repair in Washington, D.C., in 2017. Sewing machines were among the most influential inventions of the nineteenth century. The machine on the left is more than 50 years old, and the one at right is more than 100 years old. (Jahi Chikwendiu/The Washington Post)

The U.S. patent office has stockpiled the text of more than 10 million patents. But that’s often all it has: an enormous amount of text. Many early patents lack any form of citation or industry specification, which researchers could use to understand the history of American invention.

Now a team of economists has created a clever algorithm that processes that text -- often the only consistent data we have for many of the country’s most famous inventions -- to create a measure of the influential inventors and industries of the past 180 years.

To measure a patent’s influence, they first look for patents whose text, once cleaned up and scrubbed of common words and outliers, has little in common with those that came before it. Like the electromagnetic motor patented in 1888 by Nikola Tesla, such inventions are revolutions, not iterations.

But they also aren’t flashes in the pan. Like the nylon fiber patented by DuPont chemist Wallace Carothers in 1938, they shape the ideas and vocabulary of the inventors who come after them. So in addition to looking for inventions that have little historical precedent, the algorithm also looks for inventions that were followed by a wave of patents using similar terminology. Such a wave suggests that, like Tesla’s motor or Carothers’ nylon, the inventions had a lasting impact on American innovation.
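For readers who think in code, the two-sided test described above can be sketched roughly as follows. This is a minimal illustration, not the researchers' implementation: the tiny stop-word list, the plain word-count vectors, the cosine similarity, the function names and the choice to combine the two sides by subtraction are all assumptions made for the example.

```python
import math
from collections import Counter

# Tiny illustrative stop-word list (an assumption); a real pipeline would be far more thorough.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "the", "to", "with"}


def vectorize(text):
    """Bag-of-words counts for one patent, with common words removed."""
    words = (w.strip(".,;:()").lower() for w in text.split())
    return Counter(w for w in words if w and w not in STOPWORDS)


def cosine(a, b):
    """Cosine similarity between two word-count vectors (0.0 means nothing shared)."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def mean_similarity(vec, texts):
    """Average similarity between one patent and a group of other patents."""
    sims = [cosine(vec, vectorize(t)) for t in texts]
    return sum(sims) / len(sims) if sims else 0.0


def influence(patents, target_id, window=5):
    """Hypothetical score: similarity to later patents minus similarity to earlier ones.

    `patents` maps an id to a (year, text) pair. Only patents issued within
    `window` years before or after the target are compared.
    """
    year, text = patents[target_id]
    vec = vectorize(text)
    earlier = [t for pid, (y, t) in patents.items()
               if pid != target_id and year - window <= y < year]
    later = [t for pid, (y, t) in patents.items()
             if pid != target_id and year < y <= year + window]
    return mean_similarity(vec, later) - mean_similarity(vec, earlier)
```

Under this sketch, a patent that shares little vocabulary with its predecessors but a lot with its successors scores high, while an incremental tweak to existing technology scores low or negative.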

Innovation defies measurement. It’s intangible. But economists continue to try because innovation helps determine productivity, and productivity helps determine economic growth.

Previous attempts to compare innovation across eras counted patents or patent citations. Such counts are easily distorted by corporate practices or a patent clerk’s quirks. Patents rarely cited other patents prior to 1946. An early telegraph patent earned almost no citations, yet it’s not unusual for modern patents to rack up hundreds or even thousands of citations.


The new analysis was released as a National Bureau of Economic Research working paper by Bryan Kelly (Yale School of Management), Dimitris Papanikolaou (Kellogg School of Management, Northwestern University), Amit Seru (Stanford Graduate School of Business) and Matt Taddy. Taddy now works at Amazon, an online retailer whose founder and chief executive, Jeffrey P. Bezos, also owns The Washington Post.

The team’s database begins in 1836. Earlier records were destroyed in December of that year when the hotel housing the U.S. patent office, one of the few buildings spared by the British in the War of 1812, went up in smoke. Casualties included fan favorites such as Eli Whitney’s cotton gin (1794), Cyrus McCormick’s reaper (1834) and Samuel Colt’s revolver (1836), none of which could be included in the study.

The algorithm requires five years of data from both before and after a patent’s issuance. As a result, the researchers only rated patents issued between 1840 and 2010.


Their text-similarity measure compares favorably to current methods. It correlates with citation-based rankings when they’re available and tends to give higher ratings to prominent patents on lists such as those compiled by patent office historians. Unlike citations, the algorithm can compare patents across eras. Unlike historians, the researchers' algorithm can review the text of millions of patents at a time.

A simple count of patents per person appears flat for most of U.S. history, but such a measure is distorted by innumerable incremental advances. Sharp peaks and valleys emerge once researchers consider only the most influential patents.

The rate of important inventions gained steam during the 1920s and peaked in 1932, near the nadir of the Great Depression. A Depression-era peak seems counterintuitive. But Santa Clara University economic historian Alexander Field argued for just such a surge in a 2003 American Economic Review paper. At the time, he said, it was a “radical hypothesis.”

“Most people wouldn’t think of a decade in which the unemployment rate was mostly in double digits as being a likely candidate for a lot of technological advances,” Field said.

Field argues the innovations of that era across a broad frontier of the U.S. economy laid the groundwork for the country’s rapid industrial and economic expansion during and after World War II.

The researchers’ algorithm was designed to detect trends at the industry level, but we can also use it to pick influential individual patents out of all those issued since 1836. There will be some noise. The same is true of citation-based and qualitative measures, Papanikolaou said.

Modern inventions often require dozens or even hundreds of patents, and it’s not always possible to determine which of those best represents a drug, smartphone or jumbo jet. To eliminate the noise from a list that might otherwise place pharmaceutical formulations and cryptography methods next to skateboard components, we combined the researchers' algorithm with a list of about 250 historically important patents compiled by their team.

The resulting items fell within the top 0.5 percent of all patents the team rated and have also been listed as notable by human experts.
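The filtering step just described amounts to intersecting two sets: the highest-scoring patents and the expert-compiled list. A minimal sketch, assuming each patent already has a numeric score and that the function and argument names here are purely illustrative:

```python
def notable_breakthroughs(scores, expert_list, top_share=0.005):
    """Return patents that are both top-ranked and on the expert list.

    `scores` maps a patent id to its influence score (higher is better);
    `expert_list` is a set of ids flagged as historically important;
    `top_share` is the fraction of rated patents to keep (0.5 percent here).
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    cutoff = max(1, int(len(ranked) * top_share))
    # Keep ranking order so the most influential patents come first.
    return [pid for pid in ranked[:cutoff] if pid in expert_list]
```

Requiring both conditions trades completeness for clarity: a high-scoring patent with no expert endorsement is dropped, as is a famous patent whose text did not stand out.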


The researchers found a strong relationship between a company’s portfolio of most-influential patents and its market value. The most influential firm in their data set was International Business Machines. IBM issued about three times as many breakthrough patents as its nearest competitors, AT&T and Motorola.

Petra Moser, a New York University Stern School of Business economist, has long studied patents and innovation, most notably in a widely cited analysis of two 19th-century world’s fairs. Patents don’t capture every human advance, she said, but “they’re our most complete record of innovation.”

Moser said the measure created by Papanikolaou and his colleagues would be “super helpful” in her work, and that it confirms her understanding of when industries such as chemical manufacturing and crop production were at their most innovative — trends which wouldn’t show up in traditional patent-based measures. Innovation, she said, allows humanity to improve its quality of life without consuming more scarce resources.

“We need to understand what drives innovation so that we can keep people fed and not destroy our planet,” Moser said. “Innovation is the engine of growth.”

In the current era, gene-splicing patents laid the groundwork for companies such as Genentech. Herbert Boyer, who with academic Stanley Cohen pioneered the methods that sit atop the most-influential patent list, founded the biotech firm in 1976.


Boyer and Cohen’s work with recombinant DNA is typical of the modern wave of American innovation, which leans heavily on software and bioscience. That wave appears to have crested. Fewer influential patents have been issued almost every year since 1998.

There may be a silver lining for those concerned about the nation’s long run of low productivity. Economists have long quantified the productivity of workers and machinery, but they have struggled to measure the gains provided by new inventions. By measuring innovation, researchers hope to gain insight into one of the hardest-to-measure components of economic growth. In their research, Papanikolaou and his colleagues found it takes years for America’s inventiveness to be reflected in the country’s economic output.

“Years where there was a lot of innovation tend to be followed by periods of high-productivity growth,” Papanikolaou said.

Even on its downslope, the third wave of American innovation could continue to pay dividends.