The RMK keyword search does not return the expected results. Why?
SAP SuccessFactors Recruiting Marketing
Keyword search is used in many forms across the Recruiting Marketing (RMK) platform, both on the front-end career site and in back-end rules and processes. RMK keyword search is powered by an indexing and search process modeled after large web search engines such as Google, and it uses several types of logic to display jobs, or sometimes other content, based on relevance. This document describes the search logic and gives a basic functional overview of it.
Basic Search Types
- Single Term: A search using a single keyword. Example: ‘Sales’.
- Multi Term Search: Two or more keywords are used to perform the search. Example: ‘Project Manager’.
Description: When two or more terms are used to perform the search, an ‘OR’ operator is invisibly placed between the keywords.
In the example ‘Project Manager’, all jobs containing the keyword ‘Project’ or ‘Manager’ are returned, with the highest-scoring (most relevant) jobs appearing first in the list.
- Searching Without a Search Term: Performing a job search without entering any keyword search terms into the search box.
Description: When an open search without any keywords is performed, all active jobs are returned in the result set. Relevancy scoring is still applied to determine the job order; however, since there is no keyword for the jobs to be scored against, they are returned in a seemingly random order.
- Search using “Quotations”: Two or more keyword terms wrapped in “Quotations” are treated as a single term.
Description: When keywords/phrases are wrapped with “Quotations”, only jobs with the exact same set of keywords in the same order are returned.
For example, the search “Project Manager” is treated as a single keyword search. Jobs are returned only if the terms ‘Project’ and ‘Manager’ appear next to each other somewhere within the job.
It is important to note that when using “quotations” to perform a search, stemming is still applied and stop words are removed (see stemming and stop words for more information).
- Search using (Parentheses): When two or more keyword terms are wrapped with (Parentheses), the database is searched for jobs with either term.
Description: While it is not initially evident how this type of search is any different from a basic ‘Multi Term Search’, the use of (Parentheses) can be valuable when performing more advanced/complex search queries that involve specific individual data points.
- Search using a Boolean Operator: A search using a Boolean Operator such as AND/NOT/OR (in full caps) will further refine the search.
Description: An example of using a Boolean Operator is placing AND or NOT (in full caps) between two or more keyword terms. The search ‘project AND manager’ returns jobs only when both keywords are found. Conversely, the search ‘project NOT manager’ returns jobs only when ‘project’ is found without ‘manager’. Note: the keyword terms do not need to appear next to each other in the job description.
- Wildcard Characters (question marks, asterisks, and tilde): Wildcards are characters that can be used as search criteria to narrow or expand the alternatives in the search results.
Multiple Character (*) – the asterisk (*) can be used as a wildcard when any number of characters in a term are not known. The asterisk produces the greatest diversity of matches, so it is the best option when the search needs to include a wide variety of results.
Fuzzy searches (~) – to perform a fuzzy logic search, add the tilde (~) symbol at the end of a single term. Fuzzy logic helps find terms with similar spellings. For example, to search for terms similar in spelling to ‘roam’, enter “roam~”. This search finds terms like foam, roam, roams, room, road, roads, etc.
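The basic search types above can be illustrated with a small Python sketch. This is not RMK's implementation: the job texts, the function names, and the edit-distance threshold of 2 used for fuzzy matching are assumptions chosen only to reproduce the behaviour described in this section, and stemming and stop-word removal are deliberately left out.

```python
import fnmatch
import re

# Hypothetical job texts; a real career site searches its indexed job data.
JOBS = {
    1: "Project Manager for construction sites",
    2: "Sales Manager",
    3: "Research project coordinator",
    4: "Road maintenance crew",
}

def tokens(text):
    """Split text into lowercase word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def matching(predicate):
    """Return the sorted ids of jobs whose token list satisfies predicate."""
    return sorted(job_id for job_id, text in JOBS.items() if predicate(tokens(text)))

def search_or(*terms):
    """Multi-term search: any of the terms may appear (implicit OR)."""
    return matching(lambda toks: any(t.lower() in toks for t in terms))

def search_and(*terms):
    """'a AND b': every term must appear, in any position."""
    return matching(lambda toks: all(t.lower() in toks for t in terms))

def search_not(wanted, unwanted):
    """'a NOT b': the first term must appear and the second must not."""
    return matching(lambda toks: wanted.lower() in toks and unwanted.lower() not in toks)

def search_phrase(phrase):
    """Quoted search: the words must appear next to each other, in order."""
    want = tokens(phrase)
    n = len(want)
    return matching(lambda toks: any(toks[i:i + n] == want for i in range(len(toks) - n + 1)))

def search_wildcard(pattern):
    """'*' stands for any number of unknown characters within a term."""
    return matching(lambda toks: any(fnmatch.fnmatchcase(t, pattern.lower()) for t in toks))

def edit_distance(a, b):
    """Levenshtein distance: single-character inserts, deletes, substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def search_fuzzy(term, max_distance=2):
    """'term~': match similarly spelled tokens (the threshold is an assumption)."""
    return matching(lambda toks: any(edit_distance(term.lower(), t) <= max_distance for t in toks))
```

With these sample jobs, search_or("Project", "Manager") returns jobs 1, 2 and 3, whereas search_and("project", "manager") and search_phrase("Project Manager") both return only job 1. search_fuzzy("roam") matches job 4 through the similarly spelled word "road".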
Stemming: Stemming refers to the process of identifying the stem or root of the keyword search term and searching all variations of the word. Example: the search term ‘Fishing’ is used to perform a search.
The stem of the word ‘Fishing’ is ‘Fish’. Therefore, the search will include results for not only ‘Fishing’ but also ‘Fish’ and other variations of the word ‘Fish’ such as ‘Fished’ and ‘Fisher’. The stemming process is automatically applied to every English keyword search.
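The idea can be sketched in Python as follows. The article does not say which stemming algorithm RMK uses, so the suffix list and the `stem` function below are a deliberately crude, hypothetical stand-in that only covers the ‘Fish’ example.

```python
# Toy suffix list, checked in order; real stemmers use far more careful rules.
SUFFIXES = ("ing", "ed", "er", "es", "s")

def stem(word):
    """Strip one common English suffix, keeping at least three letters."""
    word = word.lower()
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word

def same_stem(query_term, job_word):
    """A job word matches the query term when both reduce to the same stem."""
    return stem(query_term) == stem(job_word)
```

Under this sketch, same_stem("Fishing", "Fished") and same_stem("Fishing", "Fisher") are both true because every variation reduces to "fish".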
Stop words: These are words that are excluded when used in a search string. This process is designed to increase search accuracy by removing small words such as ‘the’, ‘a’, ‘is’, ‘as’, ‘in’, and so on. Such words are present throughout all jobs in large quantities and are removed from the search string to avoid skewing a job’s relevancy. This process is built into the framework and cannot be circumvented.
The Case of "IT": "it" is one of the stop words. This means that IT (standing for Information Technology) cannot be used as a search term, as it will not return any results.
As for the possibility of removing "it" from the stop word list, the risk is that every job containing the word “it” would then match the search request, which would render the search useless. Even an upper-case IT search would return unwanted results, as it would show all jobs in Italy. In any case, the search could not be made case sensitive, as this would be too restrictive.
What is more, changes to field data types and how they work have deep impacts on other searches for categories, rules, etc. that previously worked, so this change would simply be too risky.
Our recommendation is that "IT" jobs should always be spelled out completely, using Information Technology instead.
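The effect of stop-word removal on an "IT" search can be sketched in Python. The stop-word set below contains only the words named in this article; the full RMK list is not given here, and the function name is hypothetical.

```python
import re

# Only the stop words mentioned in this article; the real list is longer.
STOP_WORDS = {"the", "a", "is", "as", "in", "it"}

def remove_stop_words(query):
    """Return the query terms that survive stop-word removal.

    The comparison is case-insensitive, so 'IT' is dropped just like 'it'.
    """
    words = re.findall(r"[A-Za-z0-9]+", query)
    return [w for w in words if w.lower() not in STOP_WORDS]
```

In this sketch, remove_stop_words("IT") returns an empty list, so there is nothing left to search for, while remove_stop_words("Information Technology") keeps both terms.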
Scoring: Scoring is the process of determining a job’s relevancy. When a keyword search is performed, every job in the database is scored based on its relevance to the keyword search term(s). The higher a job’s score, the more relevant it is deemed to be and the higher it appears in the results list.
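As a rough illustration of score-based ordering, the Python sketch below ranks jobs by how often the search terms occur in them. The article does not describe RMK's actual scoring formula, so plain term counting, the sample jobs, and the function names are all assumptions made for the example.

```python
import re
from collections import Counter

def score(job_text, terms):
    """Assumed relevance score: total occurrences of the search terms."""
    counts = Counter(re.findall(r"[a-z0-9]+", job_text.lower()))
    return sum(counts[t.lower()] for t in terms)

def rank(jobs, terms):
    """Return ids of matching jobs, highest score first (ties by id)."""
    scored = [(score(text, terms), job_id) for job_id, text in jobs.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [job_id for s, job_id in scored if s > 0]

# Hypothetical sample data.
SAMPLE_JOBS = {
    "a": "Project Manager leading a project team",
    "b": "Sales Manager",
    "c": "Accountant",
}
```

Here rank(SAMPLE_JOBS, ["project", "manager"]) lists job "a" before job "b", because "a" contains three term occurrences and "b" only one; job "c" contains none and is not returned.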
search syntax, how to search in RMK, IT jobs, Information Technology, RMK search logic, KBA, LOD-SF-RMK, Recruiting Marketing, How To