Query Results Truncated?

Recently, some colleagues and I were discussing whether the Content Server truncates result sets for large queries. They insisted that it does, and that the largest result set Documentum will return is 1000 rows in total, or 350 rows from any single source (the default values for dfc.search.max_results and dfc.search.max_results_per_source in the dfc.properties file). “Ridiculous!” I exclaimed. I had run queries that returned thousands of rows and could prove it. So, I set out on this little research project.

To prove my point, I decided to run a query that returned a known result set from a variety of clients, while changing the settings of dfc.search.max_results and dfc.search.max_results_per_source.  To set these properties, I added the following lines to the dfc.properties file on both the Content Server and the DA web application server.  I set these properties artificially low to make the results obvious.

dfc.search.max_results = 100
dfc.search.max_results_per_source = 10

The query I ran was select r_object_id from dm_folder.  In my repository, this query returned 743 rows (from iDQL, which I used as my baseline).  I also ran this query from the RepoInt utility, the DA DQL Editor and the DA Advanced Search page.   If there was any truth to the claims of my colleagues, I should see a result set no larger than 100 rows when the properties were in effect.  See the table below for the results.

Client            No Config Changes    Content Server Only    App Server Only
iDQL32            743                  743                    743
RepoInt           743                  743                    743
DA DQL Editor     743                  743                    743
DA Adv. Search    350                  350                    10

Interestingly, the Advanced Search did truncate the result set, but not as I expected.  It truncated the result set to 350 when these properties were not explicitly set, leading me to believe there was some sort of default in play.  It also truncated the result set to 10, not 100, when the properties were set.  What’s going on here?

After reading up a bit on the dfc.search.max_results and dfc.search.max_results_per_source properties, I concluded that these configuration settings only affect ECIS/FS2 searches and not “regular” client queries (e.g., iDQL, RepoInt, the DQL Editor). However, since Webtop (and DA) are configured to use ECIS/FS2 when they are installed, the Advanced Search does respect the dfc.search.max_results and dfc.search.max_results_per_source properties when they are set. Here’s how it works:

The dfc.search.max_results property dictates how large the final result set can be. The default value is 1,000. In my testing, I set it to 100 rows. However, this setting caps the entire result set, and the result set is further constrained by the dfc.search.max_results_per_source property.

The dfc.search.max_results_per_source property dictates the maximum number of results that can be returned from a single source. The default value is 350. Since my testing only involved one repository, the maximum number of results returned was 10. If I had searched across 2 repositories, the final result set would have contained 20 rows (max). Following this logic, if I had searched across 20 repositories, the result would have been 100 rows (the maximum size allowed by the dfc.search.max_results property), not the 200 that 20 sources of 10 rows each would otherwise suggest.
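
The interaction between the two properties can be sketched as a small calculation. This is only a model of the behavior observed in my tests, not DFC code: the function name rows_returned and its parameters are made up for illustration, and it assumes each source is clipped to the per-source limit before the overall limit is applied.

```python
from typing import Sequence


def rows_returned(matches_per_source: Sequence[int],
                  max_results: int = 1000,
                  max_results_per_source: int = 350) -> int:
    """Model the Advanced Search row count (hypothetical helper, not a DFC API).

    matches_per_source holds the number of matching rows in each repository.
    The defaults mirror the default property values quoted in this post.
    """
    # Each source contributes at most max_results_per_source rows.
    clipped_total = sum(min(m, max_results_per_source) for m in matches_per_source)
    # The merged result set is then capped at max_results.
    return min(clipped_total, max_results)


if __name__ == "__main__":
    # One repository with 743 matching folders, as in my test.
    print(rows_returned([743]))                 # default property values
    print(rows_returned([743], 100, 10))        # the artificially low settings
    print(rows_returned([743] * 2, 100, 10))    # two repositories
    print(rows_returned([743] * 20, 100, 10))   # twenty repositories
```

Under this model, a single repository never returns more than max_results_per_source rows, no matter how high max_results is set.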

My advice: if you are only searching one repository, set the dfc.search.max_results and dfc.search.max_results_per_source properties equal to each other to ensure your Advanced Searches return the largest possible result sets. Which value gives the best balance of performance and efficiency is up to you to determine.

So, my colleagues and I were both right; we just needed to specify how we were running our queries.

8 Responses to Query Results Truncated?

  1. Pingback: One Year Ago « dm_misc: Miscellaneous Documentum Tidbits and Information

  2. Lorenz Vijay says:

    Scott,

    Is there a methodology or set of recommendations I can follow to arrive at the value of dfc.search.max_results_per_source needed for my repository? (single repository only)

    What would be the impact if I increase this value to 1000000, or is there a way I can set it to return the results without any truncation, since we are using only a single repository in our environment?

    • Scott says:

      I am not aware of any best practices for determining the values of these constraints. Unfortunately, trial and error and good record keeping may be your only answer. I would hesitate to allow users to return a million rows. You will notice some latency and perhaps a performance hit returning result sets that size.

  3. Benjamin says:

    Hello

    I’ve observed a different behavior using the DFS Search Service (DCTM 6.7SP1): I can control the number of results actually presented through the dfc.properties parameters, but no matter what I do I can’t get the total hit count (the “theoretical” result set size) to be more than 10000.

    Any ideas?

    • Shoeb Haque says:

      Can you try the same using QueryService instead of Search Service? Have you tried playing with dfs-runtime.properties (dfs.query_cache_policy.query_max_result)? From memory, I think it is pre-set and cannot be increased because of performance implications. Not sure if you can extend it arbitrarily.

      • Benjamin says:

        I’m bound to use SearchService because I need to work with StructuredQuery objects.

        I tried working with dfs-runtime.properties without effect, but I found a new lead: my problem might be linked to faceting and the maximum hit count retrieved by facets. This is not set in a properties file but at the object level in the code.

        I shall post more information when I have finished my tests.

    • Shoeb Haque says:

      Facets have a default upper limit of 10000 specified in indexserverconfig.xml. You should beware of the ramifications of changing it.

      • Benjamin says:

        That seems to be the cause of my problem. I’ve changed this parameter in the StructuredQuery instantiation and it works fine for us.

        So far, I haven’t observed any side effects.
