Using a Custom xPlore Thesaurus

In this post I examine using a custom thesaurus in xPlore to solve three common problems encountered when conducting full text searches:

  1. Accounting for frequently misspelled words.  For example, Walmart is often spelled:  Walmart (correct), Wal-mart, Wal_mart, or Wal Mart. Or, Webtop can be spelled Webtop (correct), WebTop, or Web Top.  Note that case-insensitive searching already eliminates some of these variations.
  2. Accounting for product names that have changed, for example:  Site Caching Services (SCS) is now Interactive Delivery Services (IDS).
  3. Expanding the scope of a search by including synonyms and related concepts, such as: Tylenol, acetaminophen, ibuprofen.

My first task was to learn about xPlore thesauri and how to create them.  The xPlore v1.3 Administration and Development Guide (pages 213-217) does a good job of explaining what you need to know to create and install thesauri in xPlore.  xPlore thesauri use the SKOS model and are simple to create (I used TextPad) and install, though the lessons I learned below hint at the thought required to use thesauri effectively.

Misspelled Words

Here is my first thesaurus to address problem #1.  Initially I created one Concept block for ‘Walmart’ and included all the variations as altLabels.

[Image: Walmart RDF]

I learned that only the prefLabel values are matched against search terms; the altLabels are the expansions applied when a prefLabel matches. When I included only the single Concept block for ‘Walmart’, searching on ‘Wal-mart’ did not expand the term to include the other variations. Therefore, to cover all the misspelling possibilities, I included each spelling as its own Concept, with the other spellings as its altLabels. Here is a screenshot to illustrate the results.

[Image: Walmart search]
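
The original RDF listing was an image and is not reproduced here. As a minimal sketch of the one-Concept-per-spelling approach described above, a thesaurus might look something like the following. The enclosing rdf:RDF element and namespace declarations are standard SKOS RDF/XML; the http://www.my.com/ URIs are placeholders, and the exact layout of the original file is an assumption.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical sketch: the URIs under http://www.my.com/ are placeholders. -->
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">

  <!-- One Concept per spelling; the other spellings are its expansions. -->
  <skos:Concept rdf:about="http://www.my.com/#Walmart">
    <skos:prefLabel>Walmart</skos:prefLabel>
    <skos:altLabel>Wal-mart</skos:altLabel>
    <skos:altLabel>Wal_mart</skos:altLabel>
    <skos:altLabel>Wal Mart</skos:altLabel>
  </skos:Concept>

  <skos:Concept rdf:about="http://www.my.com/#Wal-mart">
    <skos:prefLabel>Wal-mart</skos:prefLabel>
    <skos:altLabel>Walmart</skos:altLabel>
    <skos:altLabel>Wal_mart</skos:altLabel>
    <skos:altLabel>Wal Mart</skos:altLabel>
  </skos:Concept>

  <skos:Concept rdf:about="http://www.my.com/#Wal_mart">
    <skos:prefLabel>Wal_mart</skos:prefLabel>
    <skos:altLabel>Walmart</skos:altLabel>
    <skos:altLabel>Wal-mart</skos:altLabel>
    <skos:altLabel>Wal Mart</skos:altLabel>
  </skos:Concept>

  <skos:Concept rdf:about="http://www.my.com/#Wal%20Mart">
    <skos:prefLabel>Wal Mart</skos:prefLabel>
    <skos:altLabel>Walmart</skos:altLabel>
    <skos:altLabel>Wal-mart</skos:altLabel>
    <skos:altLabel>Wal_mart</skos:altLabel>
  </skos:Concept>

</rdf:RDF>
```

The repetition is deliberate: because only a prefLabel triggers expansion, every spelling a user might type needs its own Concept.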

Product Rename

My approach for solving the second problem (renamed products) was similar to the first. I created a Concept for ‘IDS’ and included ‘SCS’ as an expansion term, and to ensure the reverse was also true, I created a Concept for ‘SCS’ and included ‘IDS’ as an expansion term.  Therefore, anyone searching on one product would also find information pertaining to the other.  Here are the thesaurus entries for IDS and SCS.

[Image: IDS RDF]
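
Since the image with the entries is not available, here is a rough sketch of what a symmetric pair of Concepts for the rename could look like. It assumes the same standard SKOS RDF/XML wrapper and placeholder http://www.my.com/ URIs; the original entries may have differed in detail.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical sketch: the URIs under http://www.my.com/ are placeholders. -->
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">

  <!-- Searching for the new name also finds the old name. -->
  <skos:Concept rdf:about="http://www.my.com/#IDS">
    <skos:prefLabel>IDS</skos:prefLabel>
    <skos:altLabel>SCS</skos:altLabel>
  </skos:Concept>

  <!-- Searching for the old name also finds the new name. -->
  <skos:Concept rdf:about="http://www.my.com/#SCS">
    <skos:prefLabel>SCS</skos:prefLabel>
    <skos:altLabel>IDS</skos:altLabel>
  </skos:Concept>

</rdf:RDF>
```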

And here is how that search worked out.

[Image: IDS search]

Related Concepts

Addressing problem #3 (related concepts) was a lot more interesting and delved into the realm of knowledge management and taxonomies.  To address this problem, I wanted to create the concept of ‘analgesics’ and include in it sub-concepts of ‘generic’ and ‘brand name’ types of medication.  For example, the analgesics concept would include the two generic medications, ‘acetaminophen’ and ‘ibuprofen’, and then two commercial brands of those drugs, ‘Tylenol’ and ‘Motrin’.  Though the SKOS model supports many knowledge management constructs like broader, narrower, and related, xPlore seems limited to just the prefLabel and altLabel tags.  With these limitations, my thesaurus entry for analgesics turned out looking much like the thesaurus entry for the product renaming problem in #2.  To thoroughly flesh out analgesics, a Concept entry would also need to be made for each of the altLabels, so that term expansion works when searching on any of them.

[Image: Tylenol RDF]
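
The image of this entry is not available either. A sketch of a single Concept keyed on ‘Tylenol’ might look like the following. The choice of ‘Tylenol’ as the prefLabel and the list of altLabels are assumptions, based on the terms in the Concept quoted in the comments below; the wrapper is standard SKOS RDF/XML and the URI is a placeholder.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Hypothetical sketch: labels and URI are illustrative assumptions. -->
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:skos="http://www.w3.org/2004/02/skos/core#">

  <!-- A search on 'Tylenol' expands to the related generics and brands. -->
  <skos:Concept rdf:about="http://www.my.com/#Tylenol">
    <skos:prefLabel>Tylenol</skos:prefLabel>
    <skos:altLabel>acetaminophen</skos:altLabel>
    <skos:altLabel>ibuprofen</skos:altLabel>
    <skos:altLabel>Motrin</skos:altLabel>
    <skos:altLabel>Advil</skos:altLabel>
  </skos:Concept>

</rdf:RDF>
```

As with the misspelling case, a search on ‘Motrin’ or ‘acetaminophen’ would expand only if each of those terms also had its own Concept.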

Here is an example of the results obtained when searching for ‘tylenol’.

[Image: tylenol search]

These use cases are fairly trivial, but I hope they whet your appetite for what can be done with a simple thesaurus in xPlore.  Other interesting use cases might be expanding search terms in one language to include synonyms from another language, or loading an industry-specific thesaurus.

UPDATE:  As is usually the case, after posting an article I stumble upon some more great info.  In this case, two ECN posts by Ed Bueche regarding cool things you can do with the xPlore thesaurus.

UPDATE:  This post also resulted in a derivative post here:  http://www.armedia.com/blog/2014/03/expanding-documentums-full-text-search-capability-with-a-thesaurus/

About Scott
I have been implementing Documentum solutions since 1997. In 2005, I published a book about developing Documentum solutions for the Documentum Desktop Client (ISBN 0595339689). In 2010, I began this blog as a record of interesting and (hopefully) helpful bits of information related to Documentum, and as a creative outlet.

5 Responses to Using a Custom xPlore Thesaurus

  1. pitch (former emcer, still fond of xplore) says:

    Note that you can use multiple prefLabel in a single concept, hence merging in a single entry IDS and SCS. That way you don’t need to duplicate the altLabel definitions.

    • Scott says:

      Ah, I thought I tried that with no success. I will examine it again, because that approach makes the thesaurus much easier to create and maintain.

    • Scott says:

      Pitch, I tried as you suggested with no success. I used the following concept definition and searched for ‘tylenol’. The only returns I got from xPlore were for ‘tylenol’; therefore, no term expansion of the altLabel tags. When I replace the concept with the original definition with altLabels I get the expected results.

      <skos:Concept rdf:about="http://www.my.com/#Tylenol">
      <skos:prefLabel>Tylenol</skos:prefLabel>
      <skos:prefLabel>acetaminophen</skos:prefLabel>
      <skos:prefLabel>ibuprofen</skos:prefLabel>
      <skos:prefLabel>Motrin</skos:prefLabel>
      <skos:prefLabel>Advil</skos:prefLabel>
      </skos:Concept>

  2. Pingback: Testing Thesauri in xPlore | dm_misc: Miscellaneous Documentum Information

  3. Pingback: Expanding Documentum’s Full Text Search Capability with a Thesaurus | Armedia Blog
