How machine learning can help in patent monitoring and other IP tasks: First results

Sébastien Ragot

Swiss and European Patent Attorney, Representative before the UPC, PhD.

Published: January 28, 2019

I am keen to report first results that I have obtained with AI-related techniques applied to intellectual property (IP) and, in particular, to patent monitoring, for which machine learning appears to work very well.

Before getting to the heart of the matter, let me briefly recall the background. Machine learning is a core component of artificial intelligence (AI), which is itself one of the most disruptive ingredients of Industry 4.0. Computer scientists are developing machine learning techniques at a rapid pace, and applications to various technical fields are proliferating. More than hype, AI is today a reality in many areas.

As a patent attorney, I wish I had a versatile and efficient tool to assist me in some tedious IP tasks, especially where large numbers of IP rights are involved. But applications to IP are still in their infancy. Proprietary solutions exist and a few machine learning applications to IP have been advertised. However, such solutions are mostly integrated into large IP platforms and, as such, have a rather limited scope and/or lack flexibility. This has prompted me to develop my own tools, based on available text analysis and machine learning algorithms. The advantage of homemade tools is that their efficiency and accuracy can be clearly assessed. Moreover, they can be quickly reconfigured to match any use case of interest to IP users.

Let me illustrate this with an example showing how one can actively use machine learning to gain a clearer view of the patent landscape in a given field. Assume that you (an “IP user”) want to monitor adverse patents granted in your field, as most innovators do. It is easy to set up a monitoring process with a standard patent database. Having done so, you may for example receive monthly digests of newly granted patents, which you would then review to detect patents that are relevant to your activities. The additional workload is usually not an issue.

What is more difficult, however, is to go through the long list of patents already in force, typically a few thousand to a few tens of thousands in your specific technical field, meaning a few weeks of work for a trained reader. Not all companies have this kind of time budget or the required in-house expertise. Nor can they necessarily afford to outsource this task to a patent attorney, especially as the same problem recurs each time a new activity is started, for example when launching a new product or service.

This is where machine learning becomes useful. Indeed, a cognitive model can be trained based on your own ratings, i.e., scores you assign to the patents that you regularly review. For example, you may rate the relevance of newly received patents with a score varying between 0 (not relevant) and 1 (fully relevant). Once you have rated a sufficient number of patents (forming your “training set”), you can train a cognitive model based on this training set. Upon completion of the training phase, the model can be run (“inference phase”) to automatically assess the potential relevance of thousands of other patents. Finally, the rated patents may be ranked in descending order of relevance, as illustrated below for a test pool of patents of potential relevance to a given company.
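To make the train-then-rank workflow concrete, here is a minimal sketch in Python. It is not the tool used for the results reported in this article, whose algorithms are not disclosed here. As assumptions for illustration only, it represents claims as TF-IDF vectors and scores an unseen claim as the similarity-weighted average of the ratings of its nearest rated neighbors. The names `RelevanceModel` and `rank_patents`, the stop-word list, and the toy claims are all hypothetical.

```python
import math
import re
from collections import Counter

# Illustrative stop-word list (an assumption); a real list would be far longer.
STOP_WORDS = {"a", "an", "the", "of", "and", "with", "for",
              "comprising", "using", "method"}


def tokenize(text):
    """Lowercase a claim, split it into words, and drop stop words."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOP_WORDS]


def fit_idf(tokenized_docs):
    """Compute smoothed inverse document frequencies from tokenized claims."""
    n_docs = len(tokenized_docs)
    doc_freq = Counter(term for doc in tokenized_docs for term in set(doc))
    return {t: math.log((1 + n_docs) / (1 + df)) + 1.0 for t, df in doc_freq.items()}


def tfidf_vector(tokens, idf):
    """Return an L2-normalized sparse TF-IDF vector; unseen terms are ignored."""
    counts = Counter(t for t in tokens if t in idf)
    vec = {t: c * idf[t] for t, c in counts.items()}
    norm = math.sqrt(sum(w * w for w in vec.values()))
    return {t: w / norm for t, w in vec.items()} if norm else {}


def cosine(u, v):
    """Cosine similarity of two normalized sparse vectors (a dot product)."""
    return sum(w * v.get(t, 0.0) for t, w in u.items())


class RelevanceModel:
    """Similarity-weighted nearest-neighbor scorer trained on user ratings."""

    def __init__(self, k=3):
        self.k = k

    def fit(self, claims, ratings):
        """Training phase: store TF-IDF vectors of the rated claims."""
        tokenized = [tokenize(c) for c in claims]
        self.idf = fit_idf(tokenized)
        self.vectors = [tfidf_vector(t, self.idf) for t in tokenized]
        self.ratings = list(ratings)
        return self

    def predict(self, claim):
        """Inference phase: estimate a relevance score between 0 and 1."""
        query = tfidf_vector(tokenize(claim), self.idf)
        neighbors = sorted(
            ((cosine(query, v), r) for v, r in zip(self.vectors, self.ratings)),
            reverse=True,
        )[: self.k]
        total = sum(sim for sim, _ in neighbors)
        if total == 0:
            return 0.0  # nothing in the training set resembles this claim
        return sum(sim * r for sim, r in neighbors) / total


def rank_patents(model, pool):
    """Score every patent in the pool and sort by descending relevance."""
    scored = [(pid, model.predict(claim)) for pid, claim in pool.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy training set: claim text with the user's rating (made-up examples).
claims = [
    "A microfluidic device comprising a channel and a pump for moving liquid",
    "A microfluidic chip with a capillary channel and a valve",
    "A method of brewing coffee using a paper filter",
    "A bicycle frame comprising carbon fibre tubes",
]
ratings = [1.0, 0.9, 0.0, 0.0]

# Toy test pool of unrated patents, keyed by a hypothetical identifier.
pool = {
    "P1": "A bicycle saddle with a carbon fibre rail",
    "P2": "A microfluidic channel with a capillary pump",
    "P3": "A coffee filter holder",
}

model = RelevanceModel(k=2).fit(claims, ratings)
ranked = rank_patents(model, pool)
for patent_id, score in ranked:
    print(f"{patent_id}: {score:.2f}")
```

A nearest-neighbor scorer is used here only because it fits in a few lines and keeps the scores within the range of the ratings; any regression or ranking model trained on the same ratings could take its place.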

At this point, it only remains to review the claims of the ranked patents, starting from the most relevant patents retrieved by the cognitive model. The review can be restricted to the top fraction of the ranked patents. In practice, the truly relevant patents essentially rank among the first 50 patents returned, which will require at most a few days of careful reading instead of weeks.

Qualitatively, the results obtained are convincing. Dependency statistics computed on validation sets (thanks to kind beta testers) show substantial correlations between the top-ranked patents and their prima facie relevance. I have tested this approach for several companies active in distinct technical fields (e.g., IT, materials science, microfluidics, and IC chips), using datasets of several hundred to several thousand patents each.
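The article does not say which dependency statistic was computed. As one possible choice, and purely as an assumption, the sketch below computes Spearman's rank correlation between the scores returned by a model and the ratings a reviewer later gives to the same patents. The function names and the numbers are made up for illustration.

```python
import math


def average_ranks(values):
    """Assign 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for pos in range(i, j + 1):
            ranks[order[pos]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman's rho: the Pearson correlation of the rank-transformed data."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up validation data: model scores vs. ratings assigned by a reviewer.
model_scores = [0.9, 0.7, 0.4, 0.1]
reviewer_ratings = [1.0, 1.0, 0.0, 0.0]
rho = spearman(model_scores, reviewer_ratings)
print(f"Spearman rho = {rho:.3f}")
```

A rank-based statistic is a natural fit because the model output is used for ranking: only the order of the scores matters, not their absolute values.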

In particular, tests have been performed where some of the patents granted to a selected company were deliberately left out of the training sets and placed in the test pool instead, to estimate the relevance of the ranked patents. The trained models were able to suitably retrieve such patents (based solely on the claim language; no metadata was fed as input to the models). That is, patents as rated by a selected IP user in the training set (which includes only a fraction of the patents owned by this IP user) make it possible to retrieve earlier patents (orange points) of that same IP user with the highest relevance scores, as seen in the figures above. That said, some of the third-party patents (blue points) turn out to be more relevant than the least relevant patents of the selected company. This, in my view, confirms that the present approach can indeed suitably identify patents that are most relevant to a given company.
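The hold-out test described above can be summarized with a simple retrieval metric. The sketch below, which assumes recall at k as the measure (the article does not name one), computes the fraction of the deliberately held-out patents that appear among the first k entries of the ranked list. The identifiers are hypothetical.

```python
def recall_at_k(ranked_ids, held_out_ids, k):
    """Fraction of held-out patents found among the top k ranked patents."""
    held_out = set(held_out_ids)
    if not held_out:
        raise ValueError("held_out_ids must not be empty")
    return len(held_out & set(ranked_ids[:k])) / len(held_out)


# Made-up ranking, most relevant first; "own-*" are the company's own patents
# that were kept out of the training set, "third-*" belong to third parties.
ranked_ids = ["own-1", "third-7", "own-2", "third-3", "own-3", "third-9"]
held_out_ids = ["own-1", "own-2", "own-3"]

for k in (3, 5):
    print(f"recall@{k} = {recall_at_k(ranked_ids, held_out_ids, k):.2f}")
```

A value close to 1 for a small k means that the held-out patents concentrate at the top of the ranking, even if a few third-party patents are interleaved with them.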

Interestingly, similar algorithms can be applied to the analysis of prior art documents (for deciding whether to file a new patent application or not) or the detection of potential licensees/infringers. And beyond patents, machine learning can be used to objectively measure similarities between trademarks, logos, and other IP signs. I have tested such algorithms for comparing trademark signs, on which I will report soon. Here again, the results are promising.

To conclude, such investigations show that machine learning and other AI-related techniques can be profitably exploited to circumvent problems posed by large numbers of IP rights and associated documents. Ultimately, the question boils down to the resources you are willing to invest in discovering potentially relevant IP rights. Assuming that your resources are limited, the prospects offered by machine learning for reducing the time and cost of IP analyses are enticing enough to seriously start exploring the potential of AI-related algorithms for IP.

Sébastien Ragot
