How to use stemming and wildcards for better patent search results

How stemming and wildcards can enhance your patent search

Searching for patents can be a time-consuming and often frustrating task, especially when you’re struggling to find the results you need. If you fail to uncover all the relevant patents, you risk missing crucial information, which could land you in litigation. Stemming and wildcards can help you get more relevant and accurate results, and they can also reduce the time you spend on a patent search.

Stemming and wildcards refine results in different ways: stemming broadens your search while keeping the results relevant, and wildcards narrow it down when you’re looking for something specific.

Stemming

Stemming is the way search engines look for variations of a keyword. PatSnap’s stemming is based on the Porter Stemming Algorithm, written and maintained by Martin Porter. It is an industry standard for search engines; Google adopted it in 2003. Stemming is particularly useful for patent searches in English, because the English language has so many suffixes. It can help you find relevant patents much faster: a search that would previously have taken days can take a few hours.

How to use stemming to broaden your search

Whether you’re scouting for technologies, searching for prior art or analysing whitespace, stemming extends your query to include variations of your keywords, so the results show all the possible uses of a technology. This gives your search flexibility without you having to type out every form of a word, and allows a broader search for less effort. There is no limit on the number of stem variations of a word.

If you are searching for a patent with the keyword “heat”, you will be able to find all relevant results with variations of the word “heat” such as “heated”, “heating” or “heater”. This is useful because sometimes when patents are filed, inventors may change the suffix of a root word to make it harder to find. 
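The idea of matching on a shared stem can be illustrated with a minimal Python sketch. It is a toy suffix stripper, not the Porter algorithm and not PatSnap's implementation; the suffix list, the function names and the sample titles are all assumptions made up for illustration.

```python
# Toy suffix-stripping stemmer: NOT the Porter algorithm, only an illustration.
SUFFIXES = ("ing", "ed", "er", "s")  # hypothetical, deliberately tiny suffix list


def toy_stem(word: str) -> str:
    """Lowercase the word and strip the first matching suffix, keeping at least 3 letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def matches_with_stemming(query: str, text: str) -> bool:
    """True when every query word shares a stem with some word in the text."""
    text_stems = {toy_stem(w) for w in text.split()}
    return all(toy_stem(w) in text_stems for w in query.split())


# Made-up titles standing in for patent records.
titles = ["heating plate assembly", "heat plate", "heated plate", "cooling plate"]
hits = [t for t in titles if matches_with_stemming("heated plate", t)]
print(hits)
```

With these sample titles, the query "heated plate" also picks up "heating plate assembly" and "heat plate", because all three reduce to the stems "heat" and "plate". A real stemmer applies many more rules than this sketch.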

For example, a search for patents using the keywords “heated plate” reveals 17,624 patents with stemming off.

Search result for the keywords “heated plate” with stemming off (Source: PatSnap platform)

After turning stemming on, 81,114 patents are revealed and many of those are likely to be relevant to your search. Someone may have filed a patent using the term “heating plate” or “heat plate”, which could be missed if you leave stemming off. 

Search result for “heated plate” with stemming on (Source: PatSnap platform)

Wildcards

Wildcards are a more selective, manual form of stemming. They are useful if you know which variations of your keyword you want to include and which to avoid, which helps narrow down your search. If you’re doing freedom-to-operate (FTO) or state-of-the-art searches and want to look for a specific technology or process covered in a patent, wildcards are your best option. To use wildcards effectively, turn stemming off so that it doesn’t expand your keywords beyond the variations you chose.

Using wildcards to narrow down your search

There are two wildcards on PatSnap: the question mark "?", which represents one character, and the asterisk "*", which represents an unlimited number of characters. Wildcards can only be used in the middle or at the end of a keyword. They are most useful when you are looking for something very specific, such as chemical names that aren’t common in English, or different English spellings of a word.

For example, if you search for Ethyl* you can expect to get results such as “ethylene”, “ethylenic” or “ethylenically”. 

Patent search results for the keyword ethyl* (Source: PatSnap platform)
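The two wildcards described above can be approximated by translating a keyword into a regular expression. The Python sketch below is one possible reading of the rules, not PatSnap's actual matching engine: the function name `wildcard_to_regex` is hypothetical, and it assumes that "*" may also match zero characters and that matching ignores case.

```python
import re


def wildcard_to_regex(term: str) -> re.Pattern:
    """Translate a keyword containing ? and * wildcards into a compiled regex."""
    if term[:1] in ("?", "*"):
        # The article says wildcards go in the middle or at the end of a keyword.
        raise ValueError("wildcards are not allowed at the start of a keyword")
    parts = []
    for ch in term:
        if ch == "?":
            parts.append(r"\w")   # exactly one character
        elif ch == "*":
            parts.append(r"\w*")  # any number of characters (assumed to include zero)
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE)


pattern = wildcard_to_regex("Ethyl*")
# Made-up word list standing in for terms found in patent text.
words = ["ethylene", "ethylenic", "ethylenically", "methyl", "ether"]
matched = [w for w in words if pattern.fullmatch(w)]
print(matched)
```

In this sketch "methyl" and "ether" are rejected because the whole word has to start with "ethyl", which is the kind of control that stemming alone does not give you.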

If you want to control and refine your search even more, you can use the question mark "?" to limit the results to variations that differ by a single character. This is useful if you are searching for keywords that are spelt differently in British English and American English.

For example, if you search for “industrialisation”, you may not get any results for "industrialization". You can use this wildcard "?" to search for "industriali?ation" which will give you all the variations of that word. 

As you can see below, a search for “industrialisation” only brought back 212 results.  

Search result for patents containing the keyword “industrialisation” (Source: PatSnap platform)

After using the wildcard, a search for “industriali?ation” brought back 14,673 results. That’s 14,461 patents you could have missed. The search returns results for both “industrialisation” and “industrialization”.

Patent search results for “industriali?ation” (Source: PatSnap platform)
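The single-character wildcard can be tried out locally with Python's standard `fnmatch` module, whose shell-style "?" also stands for exactly one character. This is only a stand-in for illustration, since PatSnap's engine may treat edge cases differently, and the candidate word list is made up. The two result counts are the ones quoted in this article.

```python
from fnmatch import fnmatchcase

pattern = "industriali?ation"
# Made-up candidates: two real spellings, one with a letter missing, one shorter word.
candidates = ["industrialisation", "industrialization", "industrialiation", "industrialise"]
matched = [w for w in candidates if fnmatchcase(w, pattern)]
print(matched)

# Result counts quoted in the article.
without_wildcard = 212
with_wildcard = 14_673
print(with_wildcard - without_wildcard)
```

Both spellings match, while the word with no letter in the wildcard position does not, because "?" requires exactly one character there. The difference between the two counts is the 14,461 extra results mentioned above.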

Start your patent search with stemming to get a broad and relevant set of results, especially if you’re using patent data for ideation. If you find that you are getting too many results for your chosen keyword, for example from an industry that isn’t relevant to your project, switch to wildcards.


Learn more about effective patent search techniques

Watch this webinar, hosted by D'vorah Graeser from KISSPatents, who explains how you can make your patent searches more effective with Boolean operators.
