This research project is being developed with the support of the European Union Intellectual Property Office
Reference: Grant GR/002/23 - Agreement number 1320230017

SIMILIS – Exploiting semantics and Deep Learning to provide intelligence to the syntactic search engine for Help & FAQs on the EUIPO website thanks to a Transformer model fine-tuned on custom datasets.

Duration: 1 year (28/9/23-28/9/24)
Principal researcher: Yolanda Blanco Fernández

Project description

The European Union Intellectual Property Office (EUIPO) oversees the registration of European Union trademarks and designs, providing exclusive usage rights and legal protection to owners. To help users access these services, the websites of IP management entities feature 'Help & FAQs' sections in which frequently asked questions are grouped into categories such as trademarks and designs. Users can typically look for specific information by entering queries into the search engines on these websites. Although these search engines usually retrieve pertinent FAQs from the relevant sections, they rely solely on the syntax of the user's query, which often leads to imprecise or irrelevant results, even for specific queries. For instance, if a user seeks information on the trademark registration process, a syntax-driven search tool could return somewhat relevant results, such as FAQs on registration fees, duration, or pre-registration trademark use. However, it might also return entirely unrelated results, such as FAQs about opposition procedures for already registered trademarks or about specific sections of the application form, simply because they contain some of the user's search terms, such as 'registration' or 'trademark'.
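The limitation described above can be illustrated with a minimal sketch. It is not the EUIPO search engine or the SIMILIS implementation: the FAQ titles, the query, and the helper names (tokens, keyword_overlap) are hypothetical, and plain word overlap (Jaccard similarity) stands in for whatever syntactic matching a real site search uses.

```python
import re


def tokens(text: str) -> set[str]:
    """Lowercase the text and return its set of alphabetic word tokens."""
    return set(re.findall(r"[a-z]+", text.lower()))


def keyword_overlap(query: str, faq: str) -> float:
    """Jaccard overlap between the word sets of a query and an FAQ title."""
    q, f = tokens(query), tokens(faq)
    union = q | f
    return len(q & f) / len(union) if union else 0.0


# Hypothetical FAQ titles, invented for illustration only.
faqs = [
    "What are the fees for trademark registration?",
    "How can I oppose the registration of a trademark?",
    "What steps do I follow to protect my brand name?",
]

# Hypothetical user query about the registration process.
query = "How do I apply for trademark registration?"

# Rank FAQs purely by shared words, highest overlap first.
ranked = sorted(faqs, key=lambda faq: keyword_overlap(query, faq), reverse=True)

for faq in ranked:
    print(f"{keyword_overlap(query, faq):.2f}  {faq}")
```

In this toy ranking the opposition FAQ comes first because it shares the most words with the query, while the FAQ about protecting a brand name, arguably the closest in meaning, comes last because it shares almost none. A semantic approach is meant to close exactly this gap.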

The SIMILIS research project aims to build an intelligent semantic tool that improves user interaction with the 'Help & FAQs' sections on the websites of IP management entities such as the EUIPO. A key objective is to detect whether a user-submitted query has already been addressed in an existing FAQ, so that staff do not have to answer the same question again. SIMILIS is designed to identify duplicate or equivalent questions by exploiting query semantics, word relationships, and context. This approach gives a more accurate picture of the user's information needs and, in turn, better-targeted search results.

To achieve these objectives, SIMILIS will apply Natural Language Processing (NLP) techniques and supervised learning with Deep Learning models. The aim is to improve the user experience by assessing the relevance of retrieved results within the specific context of each query. In practical terms, SIMILIS lets users receive quicker and more precise answers, which should increase their satisfaction. At the same time, it saves staff time that they can devote to new or complex queries.

Research team