AI search tools for patents: How to improve results as a user? Part I
Linus Wretblad
Innovation Advisor * Boosting IP decisions * QPIP Qualified Patent Information Professional * Founder & CEO
Part I (of II): Best practice for AI-enhanced searchers
In previous articles I elaborated on how to measure the quality of search tools that take plain text as input. This included thoughts on how to define an evaluation platform for measuring the performance of AI search tools. A test methodology was defined using patent examination citations as ground truth. I also showed some graphs illustrating what performance results across different technology domains could look like. This post will, based on feedback and further analysis, review how to use AI search tools more efficiently. The aim is to identify some possible best practices.
We should simply accept that text-based queries add a brand new gadget to the searcher's toolbox, and that we need more knowledge on how to use them properly. This is well described in Aalt van de Kuilen's article (read more here), addressing how to embrace new tools and possibilities. However, this probably also demands a somewhat different mindset for the search approach.
For an information professional, or "searcher" for prior art, the work traditionally involves understanding a technical concept, composing search strategies, performing the search, assessing the relevance of documents and finally communicating the findings on the state of the art in a proper report format. The part affected by the new options is the conversion of a technical description into a representative search strategy, which today is an essential part of an information professional's skill set. Thus, I am trained to extract the core of the idea and tailor my own search strategies from scratch by applying Boolean and proximity operators and combining essential keywords with synonyms, applicant names, classification codes, meta-tagged data etc. There are numerous good articles on this topic, especially by Evert Nijhof [1] (online version), [2] (online version), [3] (online version). I demonstrate this procedure of creating a search profile with a schematic example applied to a simplified text:
We are proposing a new trapping device for catching unwanted animals (e.g. mice) in home or office environments. The cage construction has a door that is activated when a mouse is touching a substance (e.g. cheese) on a sensor provided inside the cage.
In an associated query strategy, you would try to identify relevant words, synonyms and suitable distances between key words. The searcher then runs the queries, reviews the documents found and iteratively refines the query to generate more relevant results. For the text above, a very simplified figurative query could look like:
((mouse or mice or rodent+) proximity (cage+ or trap+)) proximity (sensor+) and (group) in cpc/ipc-class
By adjusting the proximity distance (the number of words allowed between terms) as well as the granularity of the classification code, the search string becomes either broader or narrower. That is, a longer distance and a higher-level class will catch more documents (higher recall, lower precision), whereas a shorter distance and a precise class will catch fewer documents (accordingly, lower recall and higher precision). So as searchers we are accustomed to manually converting a text into a corresponding query representation and then successively adjusting the search strategy.
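To make this tuning concrete, here is a minimal sketch in Python of how such a query string could be assembled programmatically. The operator syntax (OR groups, a NEARx proximity operator, "+" truncation) and the CPC codes are generic illustrations of common patent search conventions, not any specific platform's syntax:

```python
# A minimal sketch (not any real platform's syntax) of assembling a
# boolean/proximity query. Operators and class codes are illustrative.

def or_group(terms):
    """Join synonyms into one OR group, e.g. (mouse OR mice OR rodent+)."""
    return "(" + " OR ".join(terms) + ")"

def build_query(animal_terms, trap_terms, sensor_terms, proximity, cpc_class):
    """Combine keyword groups with a proximity operator and a class filter.

    A larger proximity window and a higher-level class broaden the query
    (higher recall, lower precision); tighter values narrow it.
    """
    core = (f"{or_group(animal_terms)} NEAR{proximity} "
            f"{or_group(trap_terms)} NEAR{proximity} "
            f"{or_group(sensor_terms)}")
    return f"({core}) AND CPC={cpc_class}"

# Broad variant: wide proximity window, high-level class
print(build_query(["mouse", "mice", "rodent+"], ["cage+", "trap+"],
                  ["sensor+"], proximity=10, cpc_class="A01M"))

# Narrow variant: tight window, more precise subclass
print(build_query(["mouse", "mice", "rodent+"], ["cage+", "trap+"],
                  ["sensor+"], proximity=3, cpc_class="A01M23/00"))
```

The two variants at the bottom illustrate the broad-versus-narrow trade-off described above.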
However, when using AI-based tools we should feed in a text instead. Thus, in the example above you would simply start off by using the whole paragraph, containing the problem, the solution and maybe the background as well. Compared to the manual approach, this input obviously includes irrelevant information such as prepositions and other noise terms. For the algorithmic analysis, however, this semantic data provides additional teaching and grounds for training. Consequently, a first challenge for the information professional is to accept using a "raw" text (such as an invention disclosure or application) and to avoid transforming the text into one or more manual queries.
Another insight is the possible need to adapt the search text to the technical domain. We saw in Chapter III that you would have to review quite different numbers of hits depending on the technical field to reach a similar recall score (the example showed that for an area less suited to text-based searches, such as "Mechanical elements", you could need to review up to the top 100 documents to match the recall score given by the top 10 documents in a high-performing area such as "Biotechnology"). These results were obtained using the same query text format (in that example, the title and the abstract only).
In view of those insights, some questions arise: what text parts should be entered, and how could I as a searcher change the input to get better performance? Also, what tuning towards a technology domain is needed, and how long should an input text then be? This is especially interesting for areas with performance challenges, where a possibility to boost the score is wanted. Simply put, could an information professional give the AI a manual "push" towards better results by adapting the text?
Consequently, the question is what amount of input data to use for best performance. To achieve a better understanding of this, we did a study on how different input text lengths actually affect the quality of the output. The first test collection is a technology mix of 50 000 patent applications, each with more than 200 000 characters of text. The analysis was based on baseline runs varying the input from just the title to the full document. Non-text parts such as charts, chemical formulas etc. were excluded.
In general, we see that adding more text to the input yields a considerable performance boost at first, and only with much larger texts does performance slowly decrease. This is shown in the graph below, where recall is plotted against the amount of text (number of characters) entered. The score is the recall within the top 100 hits retrieved.
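For readers who want to reproduce this kind of measurement, the sketch below shows the core recall@100 computation under the methodology described in the earlier articles (examiner citations as ground truth). The search(text, top_k) function is a hypothetical wrapper around whichever AI tool is being evaluated:

```python
# Minimal sketch of measuring recall@100 for varying input lengths.
# `search(text, top_k)` is a hypothetical wrapper around the AI search
# tool under test; `citation_ids` are the examiner-cited documents used
# as ground truth, as in the test methodology described earlier.

def recall_at_k(retrieved_ids, citation_ids, k=100):
    """Fraction of examiner citations found within the top-k results."""
    if not citation_ids:
        return 0.0
    hits = set(retrieved_ids[:k]) & set(citation_ids)
    return len(hits) / len(citation_ids)

def length_sweep(full_text, citation_ids, search,
                 lengths=(500, 3_000, 10_000, 50_000, 100_000, 200_000)):
    """Truncate the input text to each length and record recall@100."""
    return {n: recall_at_k(search(full_text[:n], top_k=100), citation_ids)
            for n in lengths}
```

Averaging such per-document scores over thousands of applications gives recall-versus-length curves of the kind discussed here.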
This behavior is probably logical: too little data may be too vague and lead to misunderstanding of the actual focus, whereas too much data adds noise, making it impossible to identify a distinct target description. Our indicative first analysis has two main takeaways:
Firstly, it was indeed surprising how much data counts as "too much" input, i.e. where the peak is reached and recall starts to decrease. We actually see the performance increase slowly all the way up to a peak at about 100 000 characters. Simply put, this represents about 10-15 pages of technical text before we see a performance dip. "This will introduce too much noise" would be the normal reaction, where it is in fact shown to be quite the opposite. As a reference, our baseline example in Chapter II of 175 000 patent applications sampled over all technology domains has an average length of 89 000 characters per patent document.
Secondly, the analysis shows that the performance increases dramatically when using about a page (around 3 000 characters) as input, below which the quality is quite poor. Seemingly, shorter texts do not provide enough data for the text-analyzing algorithms to perform on. As a follow-up, we analyzed the statistics of query lengths used in our pilot programs. The interesting finding was that the average input was only 500 characters long, even though the users were asked to use at least one page. Maybe this was just because they wanted to test how good the system is by throwing in a short phrase. Or maybe searchers are so used to tailoring slimmer search strategies that they reduce the text volume by habit, and using larger text input simply feels uncomfortable from a user perspective. Thoughts on this are welcome!
In the worst case we have a "lose-lose" scenario when exploring text-based search engines: the searcher spends time analyzing and reducing a text to what is thought to be a better, more query-like input, but in the end the result may be too short for the algorithm to deliver on, and the effort actually degrades the performance.
This is rather easily remedied by thinking in terms of larger paragraphs instead of traditional keyword analysis. The challenge is not to mix up the manual search approach with the algorithmic one. For the latter, as indicated by the graph, you should as a starting point just use whatever descriptive text you have as input, even if it is long and seemingly too broad. The results can then be used to boost and support analysis and further manual search strategies.
In an additional follow-up to verify the findings, we compared the performance of the lower-performing technical domain "Mechanical elements" with the high-performing "Biotechnology" with respect to the input, i.e. a title/abstract type query compared to a full-text-based one. Looking at the baseline measurements (i.e. how many of the relevant citations found by the patent office were also found by the tool), the recall ratio within the top 100 hits is given by the following figure.
In short, using a full-text document as input improves recall for the mechanical domain considerably, on average boosting performance by around 50%. For biotechnology, however, we got a boost of only about 10%, and only when also capping the input text at a maximum of 100 000 characters (the peak value for best recall shown in the previous graph).
The optimal input length is thus also somewhat dependent on the technical domain, i.e. on whether the technical field is better or less suited to text-based queries. This suggests a need to run a baseline on your specific patent portfolio, or at least within the technical domain reflecting your research. This helps you better understand best use, optimize the workflow and identify the information needed to improve performance.
So, back to the initial question: can I do something as a user to improve the results? The answer is yes. It still holds that a better input yields a better output, as long as the text used is large enough for the algorithm to analyse properly. The grey bar "hybrid" in the diagram indeed shows a further boost of the recall when a user manually selects the most appropriate text sections and uses this assembled text (about 1-5 pages in total) as input.
This "hybrid" study consists of spot-checks on a limited number of about 100 patent documents. It indicates that the more challenging technical fields are still improved more than the already good one. This was, as you might guess, especially true when the text was noisy, had too much background focus or where it contained inconsistencies or dealing with numerous completely different technical ideas. Proving the query text with appropriate patent IPC/CPC classes to steer the analysis against, actually yield even a further performance boost. This indicates that a professional user indeed may make use of traditional manual search skills also in a AI text based search procedure.
It is then all about setting up a proper process for when and how to perform a prior art search, and with what tools. We have identified one trade-off between i) spending more time initially, improving the query by selecting text paragraphs to get better accuracy in the first output, and ii) using the text as is, with potentially lower accuracy, and spending more time on manual searching afterwards based on the results retrieved. One consideration could then be whether to use the tool more as a rough pre-screening to get a first indication of the prior art, or as a more integrated search tool within the manual search process. I believe this is not "one" or "the other", but rather "both", depending on where in the innovation workflow it is being used and for what purpose.
So, to summarize: based on our extensive analysis of input format versus output performance, we discovered a few hands-on insights which should be part of best practice for text-based searches:
- Avoid using too short query texts. We discovered that test users used input texts with an average length of 500 characters, while the performance boost starts at a threshold of around 3 000 characters. This could relate to the fact that searchers used to query-based searching are accustomed to creating shorter search strings, which indeed contain limited text data. With a small text input the algorithms have little information to "work on", giving a poorer result than they potentially could have achieved. A good aim is therefore to enter much more text than for a search query; a recommended minimum is at least one page. For very long texts, and especially in high-performing domains, you could consider cutting out data, e.g. using only the first 1-5 pages or selected paragraphs (a small sketch of these length guidelines follows after this list).
- Adapt text length and strategy to the technical field. There is a correlation between the technical field and the need for longer descriptive texts: the harder it is to describe the subject area in text, the more text you should add. Thus, in areas like "Mechanical elements" you could increase the recall considerably by feeding in the whole document, with both descriptive passages and even the background, to support better search results. However, do avoid too large text documents as well. Keep in mind that it may make sense to manually select and combine the most appropriate sections to use as input, which especially applies to exhaustive and inconsistent documents. You might also consider adding patent classes for fields with lower recall, if the tool offers that option.
- Explain short topics in more detail. As in real life, the more comprehensive and concise you are, the better you will be understood by the other party. Algorithms cannot guess your underlying intentions, so always avoid general concepts, especially when working with very short query texts of a few sentences. If you want to use a tool for ideation and inspiration to find related prior art, do try to almost over-explain the topic. Adding details about the problem/solution when describing an idea will improve the performance. One comment I got from a user was a very to-the-point instruction: "Explain the new case as detailed as if you talked to your teenager, then it works best".
- Always run a baseline on your technical domain. You need hundreds of queries per class for statistical relevance of the performance, as discussed in Chapter I; a manual verification of a few cases is merely indicative. I thus suggest using your own patent portfolio (possibly combined with a competitor's) as the basis for running a performance test showing the recall scores. By knowing how a text-based search tool performs in your technical domain, you also get information for optimizing your prior art searching. The analysis may also help improve the design of invention disclosures (e.g. amount of text, format for summary, problem and/or solution, potential key terms etc.) to boost the search performance further.
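To tie the length-related points together, here is a minimal sketch of the rule-of-thumb length checks referenced in the first bullet. The thresholds are the indicative values from this study, not universal constants:

```python
# Minimal sketch of the length guidance above: warn on inputs below
# roughly one page (~3 000 characters) and cap very long inputs at
# ~100 000 characters, where recall was observed to dip. The thresholds
# are the indicative values from this study, not universal constants.

MIN_CHARS = 3_000    # roughly one page of technical text
MAX_CHARS = 100_000  # roughly 10-15 pages; the observed recall peak

def prepare_input(text: str) -> str:
    """Apply rule-of-thumb length checks before submitting a query."""
    text = text.strip()
    if len(text) < MIN_CHARS:
        print(f"Warning: only {len(text)} characters; consider adding "
              f"more descriptive text (aim for at least one page).")
    elif len(text) > MAX_CHARS:
        print(f"Note: truncating from {len(text)} to {MAX_CHARS} characters.")
        text = text[:MAX_CHARS]
    return text
```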
The bullet points above are only an indicative summary, and we are running further studies to explore how a user may get better performance. Also, the findings disclosed in this article are based on IPscreener only. Of course, the optimal scenario would be to compare several tools run on the same data to understand differences and similarities; optimal text input versus quality of output will certainly vary among the providers. I look forward to a common standard in the near future! I also reckon there are more tips & tricks on best practice from other studies as well. You are very welcome to share comments to gather further experience on this topic.
In the next chapter I will share my thoughts on when to use AI tools and on the future of searching; look out for "The future role of AI-enhanced searchers".
[1] E. Nijhof, "Subject analysis and search strategies – Has the searcher become the bottleneck in the search process?", World Patent Information, Volume 29, Issue 1, 2007
[2] E. Nijhof, "Searching? Or actually trying to find something? – The comforts of searching versus the challenges of finding", World Patent Information, Volume 33, Issue 4, 2011
[3] E. Nijhof, "Want to find? Break the rules!", World Patent Information, Volume 52, 2018
Linus Wretblad is the co-founder of Uppdragshuset and IPscreener. He has a Master of Science degree in Technical Physics and Electrical Engineering from Linköping University, Sweden, and holds a French DEA degree in microelectronics. He studied an MBA in Innovation and Entrepreneurship at the University of Stockholm. Linus has 20 years' experience of innovation processes and IPR with a focus on prior art searches and analysis, starting as an examiner at the Swedish Patent Office. Since 2008 he has been on the steering committee of the Swedish IP Information Group (SIPIG), and during 2012-2017 he was a board member and president of the Confederacy of European Patent Information User Groups (CEPIUG). Linus is one of the coordinators in the certification program for information professionals. He has recently been involved in a EUROSTAR research project together with the Technical University of Vienna on automated, text-based and AI-supported prior art screening.
This article and its content are copyright of Linus Wretblad - © IPscreener 2019. All rights reserved. Any redistribution or reproduction of part or all of the contents in any form is prohibited other than the following:
- you may print or download to a local hard disk extracts for your personal and non-commercial use only
- you may copy the content to individual third parties for their personal use, or disclose it in a presentation to an audience, only if you acknowledge this as the source of the material
You may not, except with express written permission, commercially exploit the content or store it on any other website.