Where are new inventions made? Geocoding of worldwide patent data

We have created a database of worldwide patent applications and assigned geographic coordinates ('geocoding') to the addresses of the inventors and applicants. The dataset contains the geographic coordinates of each address as well as the city and region in which it is located.
Published in Research Data

The database allows, for example, the identification of all patented inventions by inventors based in Zurich or in the canton of Zug in Switzerland. In total, we collected and geocoded 7 million inventor and applicant addresses from 19 million patent documents, and assigned them to 46 countries and approximately 50,000 cities.
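A lookup of this kind can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual interface: the record layout, the field names (`patent_id`, `role`, `city`, `region`, `country`, `lat`, `lon`) and the sample rows are all assumptions invented for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeocodedAddress:
    # All field names are hypothetical; the published dataset may differ.
    patent_id: str
    role: str      # "inventor" or "applicant"
    city: str
    region: str
    country: str
    lat: float
    lon: float


def patents_by_location(records, *, city=None, region=None, role="inventor"):
    """Return the sorted, unique patent ids with a matching address."""
    ids = set()
    for rec in records:
        if rec.role != role:
            continue
        if city is not None and rec.city != city:
            continue
        if region is not None and rec.region != region:
            continue
        ids.add(rec.patent_id)
    return sorted(ids)


# Invented sample rows with approximate coordinates, for illustration only.
sample = [
    GeocodedAddress("CH-0001", "inventor", "Zurich", "Zurich", "CH", 47.38, 8.54),
    GeocodedAddress("CH-0001", "applicant", "Zug", "Zug", "CH", 47.17, 8.52),
    GeocodedAddress("CH-0002", "inventor", "Zug", "Zug", "CH", 47.17, 8.52),
    GeocodedAddress("CH-0003", "inventor", "Baar", "Zug", "CH", 47.20, 8.53),
    GeocodedAddress("CH-0003", "inventor", "Zurich", "Zurich", "CH", 47.38, 8.54),
    GeocodedAddress("DE-0004", "inventor", "Munich", "Bavaria", "DE", 48.14, 11.58),
]

zurich_inventions = patents_by_location(sample, city="Zurich")
zug_canton_inventions = patents_by_location(sample, region="Zug")
```

Filtering on `city` or on `region` separately mirrors the two cases in the text: a single city (Zurich) versus a whole canton (Zug), which can contain several cities.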

Original data of the most important patent offices in the world

For our project, we collected original data from the five largest patent offices (those of the USA, Europe, Japan, China and South Korea). In addition, we supplemented the data with addresses from three national patent offices in Europe (Germany, the United Kingdom and France). Although we did not collect further data from the patent offices of smaller countries, we also achieve very good coverage for smaller European countries. This is mainly because small but economically strong countries are highly internationalized, which is reflected in their patent applications. For example, many patents are filed not only at the national patent office, but also at the patent offices of larger countries (for example in Germany or the USA) or at the European Patent Office, which standardises the registration procedure within Europe. We were therefore able to impute missing address information from applications at other patent offices.
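The imputation idea can be illustrated with a small sketch. It assumes that filings of the same invention at different offices are linked by a shared identifier, here called `family_id`, and that inventors can be matched by name. Both are assumptions made for the example; the text does not describe the matching procedure actually used, and the sample rows are invented.

```python
def impute_addresses(filings):
    """Fill missing addresses from other filings of the same invention.

    `filings` is a list of dicts with the hypothetical keys
    "family_id", "office", "inventor" and "address" (None if missing).
    Returns new dicts; the input list is left unchanged.
    """
    # First pass: remember the first known address per (family, inventor).
    known = {}
    for f in filings:
        key = (f["family_id"], f["inventor"])
        if f["address"] is not None and key not in known:
            known[key] = f["address"]

    # Second pass: fill gaps where a sibling filing supplies an address.
    result = []
    for f in filings:
        filled = dict(f)
        if filled["address"] is None:
            filled["address"] = known.get((f["family_id"], f["inventor"]))
        result.append(filled)
    return result


# Invented sample: the national filing lacks the address, the EP filing has it.
filings = [
    {"family_id": "F1", "office": "CH", "inventor": "A. Example", "address": None},
    {"family_id": "F1", "office": "EP", "inventor": "A. Example", "address": "Zurich, CH"},
    {"family_id": "F2", "office": "CH", "inventor": "B. Example", "address": None},
]
imputed = impute_addresses(filings)
```

An address stays missing when no other filing of the same invention provides one, as for family `F2` in the sample.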

Data for researchers and decision makers

The dataset is aimed at researchers in innovation economics and economic geography. Accurate location data enables a fine-grained measurement of the transport and communication costs caused by the spatial distance between cooperation and trading partners. Another application is the analysis of technology clusters. For example, there is so far little evidence on the emergence of such clusters in emerging markets such as China.
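As an illustration of how coordinates translate into spatial distance, the great-circle distance between two geocoded addresses can be computed with the haversine formula. This is a generic sketch rather than a method prescribed by the project; the function name and the approximate coordinates for Zurich and Zug are assumptions chosen for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Approximate city-centre coordinates, used here only as an example.
zurich = (47.3769, 8.5417)
zug = (47.1662, 8.5155)
distance_km = haversine_km(*zurich, *zug)
```

The haversine formula treats the Earth as a sphere, which is usually accurate enough for comparing distances between co-inventors or trading partners. Studies that need higher precision would use an ellipsoidal model instead.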

Accurate geographic information on inventive activity also matters for policymakers, who are increasingly interested in the location decisions of companies and highly skilled workers. For this purpose, they need to know where the most important innovation centers are. Our dataset makes it possible to create accurate technology profiles for each region and city.

The resulting dataset contains a unique identification number for each patent application as well as the coordinates of the inventor’s and applicant’s place of residence, the associated city and the country.

More information about the project and visualizations can be found on our website: https://www.worldwide-patents.com/
