Documentum Performance Corner
January 22, 2012 Leave a Comment
January 19, 2012 Leave a Comment
I updated the code for the dmRecordSet class here. The only change is that I removed some debug statements that made it into the released code.
January 16, 2012 Leave a Comment
In December, Cast software, the maker of software quality tools, released their second annual CRASH (Cast Report on Application Software Health) report. The report rated the “health” of world-wide software applications by examining the source code of 745 applications (~365 million lines of code), from 160 different companies, spanning 10 industry sectors, and 8 programming languages. The code examination flagged 1800 different types of development and architecture violations that compromise application “health” in 5 major categories.
Though the report does not mention Documentum or ECM directly, it is an interesting and insightful read to say the least. I wrote a longer blog post about it here. The 22-page executive summary can be downloaded from Cast here.
January 8, 2012 Leave a Comment
I learned an interesting thing this week. I had to sort a set of query results according to these rules: object names starting with lowercase letters should precede object names starting with uppercase letters, and object names starting with numbers should follow those starting with uppercase letters. In pseudo-code, the rule looks like this: [a-z] < [A-Z] < [0-9]. As you might have noticed, this is exactly the reverse of how strings are naturally sorted (i.e., according to the ASCII table). So, what I learned is that it is very simple to create a rule-based collator object that defines sort orders, and use it to enable this sort.
The code looks like this:
// the constructor throws java.text.ParseException if the rules are malformed
RuleBasedCollator coll = new RuleBasedCollator(
    "< a,A < b,B < c,C < d,D < e,E < f,F < g,G < h,H < i,I " +
    "< j,J < k,K < l,L < m,M < n,N < o,O < p,P < q,Q < r,R " +
    "< s,S < t,T < u,U < v,V < w,W < x,X < y,Y < z,Z " +
    "< 0 < 1 < 2 < 3 < 4 < 5 < 6 < 7 < 8 < 9");
Collections.sort(list_of_objects, coll);
where list_of_objects is an ArrayList of object names.
Simple, clean and uses built-in Java sorting logic. The nice thing about collators is that you can create any sorting precedence you want. For example, if you wanted to swap the precedence of S and T in the collation above, simply change the order in the definition of the collator and voilà, your strings will now sort such that T precedes S.
If you want more on collators, check here.
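One detail of the rule syntax is worth spelling out. In java.text.RuleBasedCollator rules, a comma marks a tertiary (case-level) difference, so in a rule such as "< a,A < b,B" lowercase only wins as a tie-breaker between names that are otherwise equal. If every name starting with a lowercase letter must sort ahead of every name starting with an uppercase letter, one option is to make each character a primary difference. The following is a minimal sketch of that variant; the class name StrictCaseCollation and the sample names are made up for illustration.

```java
import java.text.ParseException;
import java.text.RuleBasedCollator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical helper class; the name is illustrative only.
class StrictCaseCollation {

    // Builds rules of the form "< a < b ... < z < A ... < Z < 0 ... < 9".
    // Every character is a primary difference, so case is never just a tie-breaker.
    static RuleBasedCollator build() {
        StringBuilder rules = new StringBuilder();
        for (char c = 'a'; c <= 'z'; c++) rules.append("< ").append(c).append(' ');
        for (char c = 'A'; c <= 'Z'; c++) rules.append("< ").append(c).append(' ');
        for (char c = '0'; c <= '9'; c++) rules.append("< ").append(c).append(' ');
        try {
            return new RuleBasedCollator(rules.toString());
        } catch (ParseException e) {
            // Only reachable if the generated rule string is malformed.
            throw new IllegalStateException(e);
        }
    }

    // Returns a sorted copy and leaves the caller's list untouched.
    static List<String> sorted(List<String> names) {
        List<String> copy = new ArrayList<>(names);
        Collections.sort(copy, build());
        return copy;
    }

    public static void main(String[] args) {
        // Made-up object names covering all three groups.
        System.out.println(sorted(Arrays.asList(
                "1report", "Zeta", "alpha", "Beta", "zulu", "9lives")));
    }
}
```

With these rules the lowercase names (alpha, zulu) come first, then the uppercase ones (Beta, Zeta), then the names starting with digits. Whether you want this strict grouping or the tie-breaker behavior depends on how mixed-case names should interleave in your listing.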
January 3, 2012 Leave a Comment
In 2003, Documentum began performing benchmark performance and load tests to definitively answer the questions we all get asked: How fast is Documentum? How far can Documentum scale? Documentum claims to be the market leader in performance and scalability; prove it. So they did, in a series of benchmark tests performed over the course of the last 8 years. Here are links to the ones I know about. Perhaps you know of more:
December 19, 2011 Leave a Comment
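In this final post of the IDfCollection series, I offer an alternative to the IDfCollection object, the dmRecordSet. The dmRecordSet is an object I created to extend the capabilities of the IDfCollection and overcome many of the limitations I have been discussing here. For example, as I previewed in the last post, the dmRecordSet knows whether it is empty and how many rows it contains; it can be added to; it can be traversed forward, backward, or randomly; and it can be reset.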
The following code provides some examples of how easy it is to use the dmRecordSet object.
Instantiate a dmRecordSet:
IDfCollection col = null;
String dql = "select r_object_id, object_name, "
+ "r_creation_date, a_content_type, r_full_content_size, a_is_template "
+ "from dm_document where folder('/Temp',descend)";
IDfQuery q = new DfQuery();
IDfTypedObject tObj = null;
q.setDQL(dql);
col = q.execute(session, DfQuery.DF_READ_QUERY);
// get record set
dmRecordSet dmRS = new dmRecordSet(col);
Test for empty set and count rows:
System.out.println("Record count = " + dmRS.getRowCount());
if (dmRS.isEmpty()) {
System.out.println("dmRecordSet is empty");
} else {
System.out.println("dmRecordSet is NOT empty");
}
Process record set:
while (dmRS.hasNext()) {
tObj = dmRS.next();
System.out.print(tObj.getString("r_object_id") + "\t");
System.out.println(tObj.getString("object_name"));
}
Move to end and process set backwards:
tObj = dmRS.last();
while (dmRS.hasPrevious()) {
tObj = dmRS.previous();
System.out.print(tObj.getString("r_object_id") + "\t");
System.out.println(tObj.getString("object_name"));
}
The class, source code and Javadoc for the dmRecordSet can be downloaded here.
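For readers who want a feel for the general idea before downloading, here is a minimal sketch of a buffered, bidirectional record set. It is not the dmRecordSet source: the class name BufferedRows and all of its methods are hypothetical, and a plain java.util.Iterator stands in for the forward-only IDfCollection so the example runs without the DFC.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.ListIterator;

// Hypothetical illustration only; this is not the dmRecordSet implementation.
class BufferedRows<T> {
    private final List<T> rows = new ArrayList<>();
    private ListIterator<T> cursor;

    // Drain the forward-only source once; after this the source can be closed.
    BufferedRows(Iterator<T> source) {
        while (source.hasNext()) {
            rows.add(source.next());
        }
        cursor = rows.listIterator();
    }

    boolean isEmpty()     { return rows.isEmpty(); }
    int getRowCount()     { return rows.size(); }
    boolean hasNext()     { return cursor.hasNext(); }
    T next()              { return cursor.next(); }
    boolean hasPrevious() { return cursor.hasPrevious(); }
    T previous()          { return cursor.previous(); }

    // Position the cursor before the first row.
    void reset()          { cursor = rows.listIterator(); }

    // Position the cursor after the last row, ready for backward traversal.
    void moveToEnd()      { cursor = rows.listIterator(rows.size()); }
}
```

The trade-off in a design like this is memory: every row is held in the list, so very large result sets may be better processed in a single forward pass.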
I hope you have enjoyed this series on the IDfCollection object, one of the most commonly used but least functional objects in the DFC. As always, I appreciate your comments and feedback.
December 12, 2011 Leave a Comment
In this post I will show you how to do a recursive operation using IDfCollection objects. Perhaps more accurately, I will show you how to do a recursive operation in spite of IDfCollection objects. As I mentioned in Part I of this series, the number of available IDfCollections is limited and it is very important to close collection objects as soon as you are done processing them. This leads to two problems when considering a recursive operation: 1) you don’t know ahead of time how many IDfCollection objects will be required; and 2) some of the collection objects may need to remain open for a long time.
The obvious solution to the problem is to replace the IDfCollection object with something that is a little more functional. In the code below I simply use an ArrayList of IDfTypedObjects to represent the contents of the IDfCollection object. I know, this solution runs contrary to my warning in Part I of this series never to instantiate the IDfTypedObject from the collection. Well, in this case it is warranted, and the overhead involved with retrieving and instantiating the object is lost in the recursion itself.
I won’t go through all of this code but simply state that its purpose is to traverse a folder structure and print a directory listing (similar to an old DOS directory listing). It’s probably not the most practical or useful code, but it illustrates recursion with IDfCollections pretty well. The key to the code is the while loop in doRecursiveDir(), where the contents of the collection are transferred to the ArrayList of IDfTypedObjects. The advantage of using IDfTypedObjects is that later, when the array contents are processed (the for loop that follows the call to doRecursiveDir()), you have information about each column of the collection (e.g., the attribute type, the attribute name, etc.). If your application calls for less information, you could get away with a different or custom data structure in the ArrayList.
IDfFolder folder = (IDfFolder) session.getObjectByQualification("dm_cabinet where object_name = 'Temp'");
// do recursion
ArrayList<IDfTypedObject> dirList = doRecursiveDir(folder);
// process results
for (IDfTypedObject tObj : dirList) {
if (tObj.getString("r_object_id").startsWith("0b")) {
IDfFolder f = (IDfFolder) session.getObject(new DfId(tObj.getString("r_object_id")));
System.out.println(f.getFolderPath(0));
} else {
System.out.println("\t" + tObj.getString("object_name")
+ "\t\t" + tObj.getString("r_object_id") + "\t"
+ tObj.getString("r_creation_date") + "\t"
+ tObj.getString("a_content_type") + "\t"
+ tObj.getString("r_full_content_size"));
}
}
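private static ArrayList<IDfTypedObject> doRecursiveDir(IDfFolder folder) {
IDfQuery q = new DfQuery();
IDfCollection col = null;
// typed lists so the for-each loops can iterate IDfTypedObject directly
ArrayList<IDfTypedObject> tempList = new ArrayList<IDfTypedObject>();
ArrayList<IDfTypedObject> objList = new ArrayList<IDfTypedObject>();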
try {
// get contents of current folder
String dql = "select r_object_id, object_name, r_creation_date, a_content_type, r_full_content_size from dm_sysobject "
+ "where folder(id('" + folder.getObjectId().toString() + "'))";
q.setDQL(dql);
col = q.execute(session, DfQuery.DF_READ_QUERY);
while (col.next()) {
tempList.add(col.getTypedObject());
}
col.close();
// add objects to list or call recursion if necessary
for (IDfTypedObject tObj : tempList) {
objList.add(tObj);
if (tObj.getString("r_object_id").startsWith("0b")) {
IDfFolder f = (IDfFolder) session.getObject(new DfId(tObj.getString("r_object_id")));
objList.addAll(doRecursiveDir(f));
}
}
} catch (DfException e) {
e.printStackTrace();
}
return objList;
}
The output of this code looks like this:
/Temp/Jobs
/Temp/Jobs/dm_GwmTask_Alert
11/11/2011 1:59:26 PM dm_GwmTask_Alert 090000018002c503 11/11/2011 1:59:33 PM crtext 0
11/11/2011 3:15:26 PM dm_GwmTask_Alert 090000018002c52b 11/11/2011 3:15:28 PM crtext 0
11/11/2011 4:10:53 PM dm_GwmTask_Alert 090000018002c535 11/11/2011 4:10:53 PM crtext 0
/Temp/Jobs/dm_FTIndexAgentBoot
/Temp/Jobs/dm_ContentWarning
11/11/2011 1:59:50 PM dm_ContentWarning 090000018002c507 11/11/2011 2:00:09 PM crtext 3909
11/14/2011 11:17:21 AM dm_ContentWarning 090000018002c907 11/14/2011 11:17:40 AM crtext 3909
12/3/2011 10:18:40 PM dm_ContentWarning 090000018002d107 12/3/2011 10:19:15 PM crtext 3906
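If the listing only needs an ID and a name, a lighter holder could stand in for IDfTypedObject in the ArrayList. The sketch below shows one possible shape. The DirEntry class and its members are hypothetical and not part of the DFC; the only thing borrowed from the code above is the test that folder IDs start with "0b".

```java
// Hypothetical lightweight row holder; names are illustrative, not part of the DFC.
final class DirEntry {
    final String objectId;
    final String objectName;

    DirEntry(String objectId, String objectName) {
        this.objectId = objectId;
        this.objectName = objectName;
    }

    // Same test the listing code uses: folder IDs start with "0b".
    boolean isFolder() {
        return objectId.startsWith("0b");
    }

    // Documents are indented with a tab, folders are not.
    @Override
    public String toString() {
        return (isFolder() ? "" : "\t") + objectName + "\t" + objectId;
    }

    // Inside the col.next() loop, a row could then be captured with, for example:
    //   tempList.add(new DirEntry(col.getString("r_object_id"),
    //                             col.getString("object_name")));
}
```

The cost of this approach is that the column metadata (attribute names and types) is no longer available when the list is processed, so it suits cases where the columns are known in advance.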
I haven’t had to do recursive operations too often with Documentum — in fact, I think the only times I have had to do recursion is for building GUI widgets that represented a tree or folder structure. So, you be the judge of how practical this is.
In the next post, I will present a replacement for the IDfCollection object, the dmRecordSet, that doesn’t suffer from the IDfCollection object’s shortcomings discussed here over the past few weeks. For example, the dmRecordSet knows if it is empty; it knows how many rows it contains; it can be added to; it can be traversed forward, backward, or randomly; it can be reset, etc. Check back next week.
December 6, 2011 1 Comment
In the last post I discussed how to determine if an IDfCollection was empty, and how to process a generic IDfCollection (i.e., not knowing anything about its columns). This week I want to look at two ways to determine the size of an IDfCollection. As mentioned previously, the IDfCollection object does not have a method or attribute that identifies how many (if any!) rows it contains. The first method discussed below is easy to implement, but has several drawbacks. The second method is a little trickier to implement, but does not suffer from the same drawbacks as the first method.
Method #1
This method loops through the IDfCollection and increments a counter. The approach is simple, but will consume processing time if the query results are large. The code looks like this:
dql = "select r_object_id, object_name, r_creation_date, a_content_type, r_full_content_size, a_is_template from dm_document where folder('/Temp', descend)";
q = new DfQuery();
q.setDQL(dql);
col = q.execute(session, DfQuery.DF_READ_QUERY);
// count the rows in the collection
int cnt = 0;
while (col.next()) {
cnt++;
}
// close the exhausted collection before re-running the query
col.close();
System.out.println("Collection = " + cnt + " rows");
// if there were results, re-run and process
if (cnt > 0) {
col = q.execute(session, DfQuery.DF_READ_QUERY);
processGenericCollection(col);
}
Notice the “gotcha” here. After counting the rows in the while() loop, the IDfCollection object’s pointer has been advanced beyond the end of the collection. There is no way to reset the pointer to the beginning of the IDfCollection short of re-running the query. This is a major deficiency in my book. It means you get one shot at processing your query results, and you can only plow through them linearly.
Method #2
This second method offloads the task of counting the rows in the result set (i.e., the IDfCollection object) to the database. To accomplish this, you must issue two queries: the first returns the count of the database objects that meet the query criteria; the second returns the result set. The tricky part of this approach is correctly parsing the query and converting it to a ‘count’ query. Usually a simple string match will do the trick, but I have seen some queries blow up when the select variables are removed. Here is the basic code for this approach.
dql = "select r_object_id, object_name, r_creation_date, a_content_type, r_full_content_size, a_is_template from dm_document where folder('/Temp', descend)";
System.out.println("Results = " + countQuery(dql) + " rows");
q = new DfQuery();
q.setDQL(dql);
col = q.execute(session, DfQuery.DF_READ_QUERY);
processGenericCollection(col);
. . .
// convert query to count query and count results
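private static int countQuery(String dql) {
    IDfQuery q = new DfQuery();
    IDfCollection col = null;
    int cnt = -1;
    try {
        // find "from" regardless of case; this simple match can pick the wrong
        // spot if "from" also appears inside the select list
        int fromPos = dql.toLowerCase().indexOf("from");
        if (fromPos < 0) {
            return cnt;
        }
        q.setDQL("select count(*) as cnt " + dql.substring(fromPos));
        col = q.execute(session, DfQuery.DF_READ_QUERY);
        if (col.next()) {
            cnt = col.getInt("cnt");
        }
    } catch (DfException e) {
        e.printStackTrace();
    } finally {
        // always release the collection, even if the query fails
        if (col != null) {
            try {
                col.close();
            } catch (DfException e) {
                e.printStackTrace();
            }
        }
    }
    return cnt;
}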
This solution isn’t so much about the IDfCollection object as it is about the query. Still it seems a bit excessive and awkward just to determine how many rows are in an IDfCollection object.
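The string conversion itself can be tried without a repository. The sketch below is one possible, slightly more defensive version of the "simple string match": it finds the first whole-word FROM in any case and drops a trailing ORDER BY, since ordering by a column that is no longer selected is one plausible way for the converted query to fail. The class name CountQueryBuilder and the method toCountQuery are hypothetical, and the sketch assumes FROM and ORDER BY do not appear earlier in the statement (for example in a subquery or a quoted literal).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; names are illustrative and not part of the DFC.
class CountQueryBuilder {

    // First whole-word FROM, in any case.
    private static final Pattern FROM =
            Pattern.compile("\\bfrom\\b", Pattern.CASE_INSENSITIVE);

    // An ORDER BY clause running to the end of the statement.
    private static final Pattern ORDER_BY =
            Pattern.compile("\\s+order\\s+by\\s+.*$",
                    Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Replaces the select list with count(*) and strips any ORDER BY.
    static String toCountQuery(String dql) {
        Matcher m = FROM.matcher(dql);
        if (!m.find()) {
            throw new IllegalArgumentException("No FROM clause found: " + dql);
        }
        String tail = ORDER_BY.matcher(dql.substring(m.start())).replaceFirst("");
        return "select count(*) as cnt " + tail;
    }

    public static void main(String[] args) {
        System.out.println(toCountQuery(
                "SELECT r_object_id, object_name FROM dm_document "
                + "WHERE folder('/Temp', descend) ORDER BY object_name"));
    }
}
```

Queries with GROUP BY, DISTINCT, or subqueries would need more than pattern matching, so treat this as a starting point for simple statements only.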
There you have it, two ways to determine the size of an IDfCollection – neither of them ideal. I would love to hear from you if you have a different/better approach. Next week I’ll show you some code that does recursion using IDfCollections.
November 29, 2011 Leave a Comment
In case you missed the announcement, here it is again: Documentum 6.7 SP1 and xPlore 1.2 are now available for download! Here is the full announcement. Notable enhancements include: JBoss 5.1 for the Java Method Server, a new DQL hint, and tons of new xPlore features including thesaurus support. Check it out!
November 27, 2011 Leave a Comment
In the previous post, we looked at the basics of processing the contents of an IDfCollection object. In this post I will show how to process a generic IDfCollection, and how to determine if an IDfCollection is empty.
First, the code for processing a generic IDfCollection. The following method takes as input an IDfCollection object about which it knows nothing. It will “explore” the collection and retrieve the attribute values it contains according to the column types. You might embed code like this in your application as a generic way to extract IDfCollection content. In this example, I just print the content as it is extracted. This limits the method’s usefulness as a generic process, but it works well as an illustration. You can change it to do something more meaningful.
private void processGenericCollection(IDfCollection col) {
// #6 check for empty collections
boolean isEmpty = true;
try {
// #1 print column names
int colNum = col.getAttrCount();
for (int i=0; i<colNum; i++) {
System.out.print(col.getAttr(i).getName() + "\t");
}
System.out.println();
// #2 print contents of each row
while (col.next()) {
// #7 collection is not empty
isEmpty = false;
// #3 process each column individually according to its data type
for (int j=0; j<colNum; j++) {
// #4 get col data type
int colType = col.getAttr(j).getDataType();
// get col value based on type
String colValue = "";
if (colType == IDfType.DF_BOOLEAN)
colValue = Boolean.toString(col.getBoolean(col.getAttr(j).getName()));
else if (colType == IDfType.DF_DOUBLE)
colValue = Double.toString(col.getDouble(col.getAttr(j).getName()));
else if (colType == IDfType.DF_ID)
colValue = col.getId(col.getAttr(j).getName()).toString();
else if (colType == IDfType.DF_INTEGER)
colValue = Integer.toString(col.getInt(col.getAttr(j).getName()));
else if (colType == IDfType.DF_STRING)
colValue = col.getString(col.getAttr(j).getName());
else if (colType == IDfType.DF_TIME)
colValue = col.getTime(col.getAttr(j).getName()).toString();
// #5 print value
System.out.print(colValue + "\t");
}
System.out.println();
}
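} catch (DfException e) {
e.printStackTrace();
} finally {
// close() can itself throw DfException, so it needs its own try block
try {
col.close();
} catch (DfException e) {
e.printStackTrace();
}
}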
// #8 process empty collection
if (isEmpty)
System.out.println("The IDfCollection contains no results (zero rows)");
}
#1 Even before calling the next() method, you can “explore” the IDfCollection and retrieve information about the columns. Unfortunately, you can’t retrieve information about the rows, such as how many there are (we’ll look at that next week). In this example, I just print the names of the columns in the IDfCollection (these will correspond to the values in the SELECT statement).
#2 Call the next() method to advance the row pointer in the IDfCollection. The col object now represents the first row in the IDfCollection.
As I mentioned in #1, you can’t determine how many rows an IDfCollection has, or whether it has any at all, by interrogating it. Unfortunately, an empty collection is not returned as null from the IDfQuery.execute() method because it contains column definitions. Therefore, testing for a null IDfCollection object will not indicate an empty collection. The next few bullets present a simple flag that can be set to determine if a collection is empty (i.e., the query returned no results), as opposed to some other error that returned an empty IDfCollection object.
#6 Initialize the flag to true (assume the collection is empty until content is actually retrieved).
#7 When the loop driven by the next() method executes for the first time, set the flag to false. If the collection is in fact empty, the code in the while loop will not execute.
This code is a direct expansion of the basic processing loop presented in the previous post. It has been enhanced to accommodate a generic collection (i.e., not knowing what the column names or types are), and to handle empty collections. Next week I’ll show you two methods for determining the size of an IDfCollection object.