
n. Slang a rough lawless young Kuali developer.
[perhaps variant of Houlihan, Irish surname]
kualiganism n

Blog of an rSmart Java Developer. Full of code examples, solutions, best practices, et al.

Friday, April 30, 2010

Setting up Eclipse to use Jetty for KFS Development

Sometimes, you just want to use a straight, no-nonsense appserver for development. You want it to be fast, simple, and to integrate with Eclipse and its debugger. Sounds like a tall order, but it's not. It's possible to set up Jetty to run KFS within Eclipse and hook in the debugger.

Move Required Libraries into Your KFS Project

The following JARs need to be added to the .classpath of your Eclipse project:
  • jetty-6.1.15.rc5.jar
  • jetty-util-6.1.15.rc5.jar
  • connector-1_5.jar

Create a Startup Server Class in Your KFS Project

Here's ours

package edu.arizona.jetty;

import org.mortbay.jetty.*;
import org.mortbay.jetty.webapp.*;
import org.mortbay.jetty.nio.*;

import static java.lang.System.getProperty;


/**
 *
 * @author Leo Przybylski (przybyls@arizona.edu)
 */
public class HappyMeal {
    public static void main(String[] args) {
        final String WEBROOT = getProperty("webroot.directory");
        final int SERVER_PORT = Integer.parseInt(getProperty("jetty.port"));
        final String CONTEXT_PATH = "/kfs-dev";

        Server server = new Server();

        Connector[] connectors = new Connector[1];
        connectors[0] = new SelectChannelConnector();
        connectors[0].setPort(SERVER_PORT);
        server.setConnectors(connectors);

        WebAppContext webapp = new WebAppContext();
        webapp.setContextPath(CONTEXT_PATH);
        webapp.setWar(WEBROOT);
        server.setHandler(webapp);

        try {
            server.start();
            server.join();
        }
        catch (Exception e) {
            e.printStackTrace();
        }
    }
}


Create a .launch for Jetty Server


<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<launchConfiguration type="org.eclipse.jdt.launching.localJavaApplication">
<listAttribute key="org.eclipse.debug.core.MAPPED_RESOURCE_PATHS">
<listEntry value="/kfs"/>
</listAttribute>
<listAttribute key="org.eclipse.debug.core.MAPPED_RESOURCE_TYPES">
<listEntry value="4"/>
</listAttribute>
<booleanAttribute key="org.eclipse.debug.core.appendEnvironmentVariables" value="true"/>
<listAttribute key="org.eclipse.debug.ui.favoriteGroups">
<listEntry value="org.eclipse.debug.ui.launchGroup.debug"/>
<listEntry value="org.eclipse.debug.ui.launchGroup.run"/>
</listAttribute>
<stringAttribute key="org.eclipse.jdt.launching.MAIN_TYPE" value="edu.arizona.jetty.HappyMeal"/>
<stringAttribute key="org.eclipse.jdt.launching.PROJECT_ATTR" value="kfs"/>
<stringAttribute key="org.eclipse.jdt.launching.VM_ARGUMENTS" value="-Dwebroot.directory=work/web-root/ -Djetty.port=8080 -Xmx2g -XX:MaxPermSize=256m -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled -Denvironment=dev -DlogDefault=WARN"/>
</launchConfiguration>


The parts to pay attention to are the org.eclipse.jdt.launching.MAIN_TYPE, org.eclipse.jdt.launching.PROJECT_ATTR, and org.eclipse.jdt.launching.VM_ARGUMENTS attributes. You'll need to adjust these to match your own project and preferences.
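If you want HappyMeal to come up even when those -D arguments are missing (say, when it's launched outside of Eclipse), one small optional tweak — not part of the class above — is to give the system properties defaults that mirror the launch configuration:

// Optional (not in the original HappyMeal): fall back to the same values the
// .launch file passes in, so the class also runs without the -D arguments.
final String WEBROOT = getProperty("webroot.directory", "work/web-root/");
final int SERVER_PORT = Integer.parseInt(getProperty("jetty.port", "8080"));

The two-argument System.getProperty overload returns the second value when the property isn't set, so nothing else in the class needs to change.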

Test it out



This will add the Jetty launch configuration to your Run and Debug favorites menus in Eclipse.


Hooking up the Debugger

The Jetty launch configuration should now be available under the Debug menu.



REPOST KFS Inheritance/Composition with Private Methods

Why the Repost?

I got some feedback that I
  • Never resolved the namespace issue
  • Didn't show original code to explain the problem
Both are very good points, so I'm going to do that now.

The Problem Scenario

Suppose a situation arises where you need to change the functionality of KFS slightly, but the method you want to override is declared private. Need an example? I have one. Read on.

Digester and Namespace Validation

By default, Digester within KFS has namespace validation off. Suppose you want to turn it on. Now you run into exactly such an issue. Below is the original code from XmlBatchInputFileTypeBase:
...
...
public class XmlBatchInputFileTypeBase extends BatchInputFileTypeBase {
    /**
     * @see org.kuali.kfs.sys.batch.BatchInputFileType#parse(byte[])
     */
    public Object parse(byte[] fileByteContent) throws ParseException {
        if (fileByteContent == null) {
            LOG.error("an invalid(null) argument was given");
            throw new IllegalArgumentException("an invalid(null) argument was given");
        }

        // handle zero byte contents, xml parsers don't deal with them well
        if (fileByteContent.length == 0) {
            LOG.error("an invalid argument was given, empty input stream");
            throw new IllegalArgumentException("an invalid argument was given, empty input stream");
        }

        // validate contents against schema
        ByteArrayInputStream validateFileContents = new ByteArrayInputStream(fileByteContent);
        validateContentsAgainstSchema(getSchemaLocation(), validateFileContents);

        // setup digester for parsing the xml file
        Digester digester = buildDigester(getSchemaLocation());

        Object parsedContents = null;
        try {
            ByteArrayInputStream parseFileContents = new ByteArrayInputStream(fileByteContent);
            parsedContents = digester.parse(validateFileContents);
        }
        catch (Exception e) {
            LOG.error("Error parsing xml contents", e);
            throw new ParseException("Error parsing xml contents: " + e.getMessage(), e);
        }

        return parsedContents;
    }

}

The problem is with buildDigester(getSchemaLocation()). XmlBatchInputFileTypeBase#buildDigester() is declared private. That is, if you try to extend XmlBatchInputFileTypeBase, not only can you not override the buildDigester() method, you can't even call it from the inheriting class.

We would love to do
        Digester digester = buildDigester(getSchemaLocation());

But we can't. The alternative is to use reflection and call the private buildDigester method anyway.

Here's an example:
package edu.arizona.kfs.sys.batch;
...
...
public class XmlBatchInputFileTypeBase extends org.kuali.kfs.sys.batch.XmlBatchInputFileTypeBase {
    /**
     * @see org.kuali.kfs.sys.batch.BatchInputFileType#parse(byte[])
     */
    public Object parse(byte[] fileByteContent) throws ParseException {
        if (fileByteContent == null) {
            LOG.error("an invalid(null) argument was given");
            throw new IllegalArgumentException("an invalid(null) argument was given");
        }

        // handle zero byte contents, xml parsers don't deal with them well
        if (fileByteContent.length == 0) {
            LOG.error("an invalid argument was given, empty input stream");
            throw new IllegalArgumentException("an invalid argument was given, empty input stream");
        }

        // validate contents against schema
        ByteArrayInputStream validateFileContents = new ByteArrayInputStream(fileByteContent);
        validateContentsAgainstSchema(getSchemaLocation(), validateFileContents);

        // setup digester for parsing the xml file
        Digester digester = null;
        try {
            Method digesterMethod = org.kuali.kfs.sys.batch.XmlBatchInputFileTypeBase.class
                    .getDeclaredMethod("buildDigester", String.class, String.class);
            digesterMethod.setAccessible(true); // Private? Who cares?! I sure don't.
            digester = (Digester) digesterMethod.invoke(this, getSchemaLocation(), getDigestorRulesFileName());
        }
        catch (NoSuchMethodException e) {
            // The superclass no longer declares buildDigester(String, String)
        }
        catch (IllegalAccessException e) {
            // Not since it's accessible now.
        }
        catch (InvocationTargetException e) {
            // Something else naughty happened
        }

        if (digester == null) {
            // throw some exception because this is very bad
        }

        digester.setNamespaceAware(true); // Enabling the namespace checking! Yeay!

        Object parsedContents = null;
        try {
            ByteArrayInputStream parseFileContents = new ByteArrayInputStream(fileByteContent);
            parsedContents = digester.parse(parseFileContents);
        }
        catch (Exception e) {
            LOG.error("Error parsing xml contents", e);
            throw new ParseException("Error parsing xml contents: " + e.getMessage(), e);
        }

        return parsedContents;
    }

}

The trick is in

Digester digester = null;
try {
    Method digesterMethod = org.kuali.kfs.sys.batch.XmlBatchInputFileTypeBase.class
            .getDeclaredMethod("buildDigester", String.class, String.class);
    digesterMethod.setAccessible(true); // Private? Who cares?! I sure don't.
    digester = (Digester) digesterMethod.invoke(this, getSchemaLocation(), getDigestorRulesFileName());
}
catch (IllegalAccessException e) {
    // Not since it's accessible now.
}

By calling digesterMethod.setAccessible(true), we can still call the private method. Many call this a hack because it ignores the private access modifier. I think many people are confused about what that modifier means. Access modifiers are there for managing scope and maintaining OO encapsulation; they are not intended for security. If we wanted to restrict method access, a SecurityManager implementation would be more appropriate. IMHO, this is an entirely acceptable thing to do in this situation, since we are being restricted by a flaw in the API from doing something we should be allowed to do. This happens often, and this approach is a much better alternative than duplicating code.
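To illustrate the difference, here is a minimal sketch (a hypothetical class, not anything in KFS or Rice) of what actually restricting reflective access with a SecurityManager would look like:

// Hypothetical example: a SecurityManager that refuses setAccessible(true).
// This, not the private keyword, is the mechanism for actually locking a method down.
// Everything else is allowed here for brevity.
public class NoReflectionSecurityManager extends SecurityManager {
    @Override
    public void checkPermission(java.security.Permission perm) {
        if ("suppressAccessChecks".equals(perm.getName())) {
            throw new SecurityException("setAccessible() is not allowed here");
        }
    }
}

Installing it with System.setSecurityManager(new NoReflectionSecurityManager()) would make the reflection trick above fail with a SecurityException, which is the point: security is the SecurityManager's job, not the private keyword's.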

Then later, we append our changes to the digester with...

digester.setNamespaceAware(true); // Enabling the namespace checking! Yeay!

With this, we can get away with extending code and overriding behavior (setting namespace awareness).
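If the inline reflection bothers you inside parse(), one option (a hypothetical helper, not part of KFS or the listing above) is to pull the whole lookup-and-invoke dance into its own method so parse() stays readable:

// Hypothetical helper: build a namespace-aware Digester by reflectively calling
// the superclass's private buildDigester(String, String).
private Digester buildNamespaceAwareDigester() throws ParseException {
    try {
        Method digesterMethod = org.kuali.kfs.sys.batch.XmlBatchInputFileTypeBase.class
                .getDeclaredMethod("buildDigester", String.class, String.class);
        digesterMethod.setAccessible(true);
        Digester digester = (Digester) digesterMethod.invoke(this, getSchemaLocation(), getDigestorRulesFileName());
        digester.setNamespaceAware(true);
        return digester;
    }
    catch (Exception e) {
        throw new ParseException("Could not build a namespace-aware Digester: " + e.getMessage(), e);
    }
}

parse() would then just call buildNamespaceAwareDigester() where it currently does the reflection inline.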

Tuesday, April 27, 2010

KFS Inheritance/Composition with Private Methods

The Problem Scenario

Suppose a situation arises where you need to change the functionality of KFS slightly, but the method you want to override is declared private. Need an example? I have one. Read on.

Digester and Namespace Validation

By default, Digester within KFS has namespace validation off. Suppose you want to turn it on. Now you run into exactly such an issue.
package edu.arizona.kfs.sys.batch;
...
...
public class XmlBatchInputFileTypeBase extends org.kuali.kfs.sys.batch.XmlBatchInputFileTypeBase {
    /**
     * @see org.kuali.kfs.sys.batch.BatchInputFileType#parse(byte[])
     */
    public Object parse(byte[] fileByteContent) throws ParseException {
        if (fileByteContent == null) {
            LOG.error("an invalid(null) argument was given");
            throw new IllegalArgumentException("an invalid(null) argument was given");
        }

        // handle zero byte contents, xml parsers don't deal with them well
        if (fileByteContent.length == 0) {
            LOG.error("an invalid argument was given, empty input stream");
            throw new IllegalArgumentException("an invalid argument was given, empty input stream");
        }

        // validate contents against schema
        ByteArrayInputStream validateFileContents = new ByteArrayInputStream(fileByteContent);
        validateContentsAgainstSchema(getSchemaLocation(), validateFileContents);

        // setup digester for parsing the xml file
        Digester digester = null;
        try {
            Method digesterMethod = org.kuali.kfs.sys.batch.XmlBatchInputFileTypeBase.class
                    .getDeclaredMethod("buildDigester", String.class, String.class);
            digesterMethod.setAccessible(true); // Private? Who cares?! I sure don't.
            digester = (Digester) digesterMethod.invoke(this, getSchemaLocation(), getDigestorRulesFileName());
        }
        catch (NoSuchMethodException e) {
            // The superclass no longer declares buildDigester(String, String)
        }
        catch (IllegalAccessException e) {
            // Not since it's accessible now.
        }
        catch (InvocationTargetException e) {
            // Something else naughty happened
        }


        Object parsedContents = null;
        try {
            ByteArrayInputStream parseFileContents = new ByteArrayInputStream(fileByteContent);
            parsedContents = digester.parse(parseFileContents);
        }
        catch (Exception e) {
            LOG.error("Error parsing xml contents", e);
            throw new ParseException("Error parsing xml contents: " + e.getMessage(), e);
        }

        return parsedContents;
    }

}

The trick is in

Digester digester = null;
try {
    Method digesterMethod = org.kuali.kfs.sys.batch.XmlBatchInputFileTypeBase.class
            .getDeclaredMethod("buildDigester", String.class, String.class);
    digesterMethod.setAccessible(true); // Private? Who cares?! I sure don't.
    digester = (Digester) digesterMethod.invoke(this, getSchemaLocation(), getDigestorRulesFileName());
}
catch (IllegalAccessException e) {
    // Not since it's accessible now.
}

By calling digesterMethod.setAccessible(true), we can still call the private method. Many call this a hack because it ignores the private access modifier. I think many people are confused about what that modifier means. Access modifiers are there for managing scope and maintaining OO encapsulation; they are not intended for security. If we wanted to restrict method access, a SecurityManager implementation would be more appropriate. IMHO, this is an entirely acceptable thing to do in this situation, since we are being restricted by a flaw in the API from doing something we should be allowed to do. This happens often, and this approach is a much better alternative than duplicating code.

With this, we can get away with extending code and overriding behavior.

KIS of the Dragon - Testing Liquibase Changelogs

Here is a screencast on how to test changelogs with Liquibase tagging and rollback commands.


The Breakdown


  1. Tag the database
    You want to tag before any changes are made; this gives you a point to roll back to later.

    % liquibase --defaultsFile=liquibase.properties tag <your tag name>

  2. Run your update

    % liquibase --defaultsFile=liquibase.properties --changeLogFile=<your changelog> update

  3. Rollback
    You can roll back if you make a mistake and then run your update again, but to make sure you leave the database the same as when you came in, clean up with a rollback.

    % liquibase --defaultsFile=liquibase.properties --changeLogFile=<your changelog> rollback <tag you used earlier>

Monday, April 26, 2010

KISmet - Game Changer: Structuring the Project to Manage Database Changes with Liquibase

Structuring your project to handle configuration management of your RDBMS can be one of the most difficult parts of managing a project. If you plan to use Liquibase to manage your database migrations, then this is even more the case. At the University of Arizona, we first followed the instructions laid out by this tutorial (Tutorial Using Oracle) since we're using Oracle.

We retain in our methodology much from that article, but rather than explain the differences, I'll just explain what we did.

The University of Arizona Liquibase Methodology in a Nutshell

The goals we wanted to achieve with simplified data migration using Liquibase are:
  • Structure for isolating database changes.
  • Integration with a process that versions a database from SVN, Jira, and Continuous Integration.
  • A database version coupled with the application version.
  • Integration with a process that facilitates rollback, update, and complete schema rebuilds.

Structure for Isolating Database Changes

A project called kfs-cfg-dbs was created at the University of Arizona. Within that project, we created the structure to manage our database migrations. We followed the example outlined in Tutorial Using Oracle. We found that we could create two paths: one path for updates (update/), and the other for building the latest schema entirely (latest/).

latest/

Here, we followed the convention of using 3-character paths according to the changelog content.


Pathname   Content
cst        constraint-related changelog
dat        data-related changelog
idx        index-related changelog
seq        sequence-related changelog
tab        table-related changelog
vw         view-related changelog

constraints.xml

During our database migrations, we load schema changes and data changes. These data changes can sometimes affect constraints. For full schema rebuilds, we load constraints last to allow data loads to process faster. Therefore, we separate our constraint changelog information into its own file to run last:
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<databaseChangeLog xmlns="http://www.liquibase.org/xml/ns/dbchangelog/1.9"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog/1.9 http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-1.9.xsd">
<include file="latest/cst/LD_EXP_TRNFR_DOC_T.xml" />
<include file="latest/cst/FP_PMT_MTHD_T.xml" />
<include file="latest/cst/FP_PMT_MTHD_CHART_T.xml" />
</databaseChangeLog>


Notice that the entries are simply includes that point to files named by table in the cst/ directory within the latest/ path. For example, latest/cst/LD_EXP_TRNFR_DOC_T.xml refers to constraints on the LD_EXP_TRNFR_DOC_T table. Because cst/ lives under latest/, we know that this file relates to new schema migrations. 100% of the time, includes in constraints.xml will point to latest/. That is our convention.

data.xml

Similar to constraints.xml, data.xml has entries for data, named by table, in latest/ for new schema migrations. Here is ours:

<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<databaseChangeLog xmlns="http://www.liquibase.org/xml/ns/dbchangelog/1.9"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog/1.9 http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-1.9.xsd">
<include file="latest/dat/FP_PMT_MTHD_T.xml" />
<include file="latest/dat/FP_PMT_MTHD_CHART_T.xml" />
<include file="latest/dat/KREW.xml" />
<include file="latest/dat/KRIM.xml" />
<include file="latest/dat/KRIM3.xml" />
<include file="latest/dat/GL_OFFSET_DEFN_T.xml" />
<include file="latest/dat/KRIM2.xml" />
<include file="latest/dat/KRNS_PARM_T.xml" />

install.xml

Anything that needs to be migrated before data and constraints is added to the install.xml. It follows exactly the same convention as the previous two:

<include file="latest/tab/FP_PRCRMNT_LVL3_ADD_ITEM_T.xml" />
<include file="latest/tab/FP_PRCRMNT_LVL3_FUEL_T.xml" />
<include file="latest/tab/FP_PRCRMNT_CARD_HLDR_DTL_T.xml" />
<include file="latest/tab/FP_PRCRMNT_CARD_TRN_T.xml" />
<include file="latest/tab/FP_PRCRMNT_CARD_HLDR_LD_T.xml" />
<include file="latest/tab/PDP_SHIPPING_INV_TRACKING_T.xml" />
<include file="latest/seq/ERROR_CERT_ID_SEQ.xml" />
<include file="latest/seq/CM_CPTLAST_AWARD_HIST_NBR_SEQ.xml" />
<include file="latest/seq/PUR_PDF_LANG_ID.xml" />

update/

The update changelog is responsible for database migrations on existing schemas. It simply updates a schema already in use, so its entries are all incremental changes.
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<databaseChangeLog xmlns="http://www.liquibase.org/xml/ns/dbchangelog/1.9"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog/1.9 http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-1.9.xsd">
<include file="update/KITT-958.xml" />
</databaseChangeLog>


The update.xml refers to files in the update/ path, which contains files named by Jira issue number. By this convention, we know that update/KITT-958.xml contains a change in this version for KITT-958. This is how we get our Jira coupling. Using this convention, we can migrate perpetually and let Jira handle linkage with our data migration/issue management.

Integrates with a process that versions a database from SVN, Jira, and Continuous Integration.

I have already shown how this works with Jira, but how do we get unique database versions? We use SVN to handle changelog ids for us. Changelog ids are what Liquibase uses to identify which changes are to be run; each change gets its own id to distinguish it from the others. To lessen developer overhead, we simply use the $Revision$ keyword from SVN.

<databaseChangeLog xmlns="http://www.liquibase.org/xml/ns/dbchangelog/1.9"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog/1.9 http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-1.9.xsd">
<changeSet id="$Revision$" author="$Author$" >
<comment>Adding a new kim type of patisserie and a new eclair baker role. Yum!!</comment>


This will put a new revision id on each checkin. We also use $Author$ to identify the person that made the change. This convention falls apart when a person commits more than one changelog at a time, because then two changelogs end up with the same id, and that will cause problems.

Part of our methodology is the concept of one-change-per-checkin. That is, when creating a changelog for update/, we put all changes relating to KITT-958 for this release into a single changeset in our KITT-958.xml. If we have changes for another issue (KITT-959), then we put those changes in KITT-959.xml. That is fine. It is crucial to understand that only one of these files gets checked in at a time, though. That is, first commit KITT-958.xml, then KITT-959.xml. The reason is that when one gets committed, it gets a revision number; then, when the next one is committed, it gets a different revision number. This helps us keep a consistent set of changelog ids, and also prevents changelogs from stepping on each other.

Integrates with a process that facilitates rollback, update, and complete schema rebuilds.

I will explain this in more detail when I discuss testing changes to data migration. I will say, though, that Liquibase supports rollback of changes. Combined with our configuration management system, this allows us to couple data migrations with a release of KFS. We can move between versions very easily by undoing what we have done. In most cases, Liquibase can automatically roll back changes by analyzing the refactoring pattern. In some cases, though, this is very difficult (consult the Liquibase manual for more details on what does and doesn't auto-rollback). One example is data-related migrations. When inserting data, Liquibase cannot work out how to undo an insert or an update of a record. For this, changelogs have a rollback directive to explicitly define a rollback pattern.

<databaseChangeLog xmlns="http://www.liquibase.org/xml/ns/dbchangelog/1.9"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.liquibase.org/xml/ns/dbchangelog/1.9 http://www.liquibase.org/xml/ns/dbchangelog/dbchangelog-1.9.xsd">
<changeSet id="$Revision$" author="$Author$" >
<comment>Adding a new kim type of patisserie and a new eclair baker role. Yum!!</comment>
<sql splitStatements="false" endDelimiter=""><![CDATA[
declare 
ktyp_id krim_typ_t.kim_typ_id%TYPE;
BEGIN
INSERT INTO KRIM_TYP_T 
(KIM_TYP_ID, OBJ_ID, VER_NBR, NM, SRVC_NM, ACTV_IND, NMSPC_CD) VALUES 
(KRIM_TYP_ID_S.NEXTVAL, SYS_GUID(), 1,'Patisserie',null,'Y','KUALI')
RETURNING kim_typ_id into ktyp_id;

INSERT INTO KRIM_ROLE_T 
(ROLE_ID,OBJ_ID,VER_NBR,ROLE_NM,NMSPC_CD,DESC_TXT, KIM_TYP_ID, ACTV_IND) VALUES 
(KRIM_ROLE_ID_S.NEXTVAL,SYS_GUID(),1,'Eclair Baker','KUALI',null,ktyp_id,'Y');
END;
]]>
</sql>
<rollback>
<sql><![CDATA[
delete from KRIM_TYP_T where NM = 'Patisserie';

delete from KRIM_ROLE_T where ROLE_NM = 'Eclair Baker';
]]>
</sql>
</rollback>
</changeSet>
</databaseChangeLog>

Case Study: Adding a Data-Only Change

Here is a screencast on how to use Liquibase to migrate database changes using the University of Arizona's methodology for change management. It includes an example of making a data-related change.



Looking Ahead

Be sure to read my next blog entry which will describe how to test this change against a database using rollbacks.

Sunday, April 25, 2010

Prelude to a KIS - Setting up and Getting Started with Mylyn

This is my first screencast. I am going to discuss how to set up and get started with Mylyn, since both Jira and Eclipse are prominent Kuali development tools.





My reasons for using Mylyn

  • Enforces a one-issue-at-a-time practice. Complex check-ins that come from working on multiple issues at once are often the cause of bugs.

  • Organizes files by issue. This helps you focus on the task by blocking out files that don't matter and helping you remember what you were working on.

  • Issue management integration connects what you're working on with ... what you're working on. Issue metadata only available in Jira is something that can be lost when developing. Mylyn keeps your work connected with your tools.

Friday, April 23, 2010

Tuning the Garbage Collection in KFS

KFS does not ship with a JVM configuration that optimizes the application. Implementing institutions are expected to do that themselves. At the University of Arizona, we have spent some time tuning KFS garbage collection for what seems to be pretty decent performance. Our results came mostly from trial-and-error, research from the literature, and Tuning Garbage Collection with the 5.0 Java[tm] Virtual Machine.

This is what we've come up with.

-Xms2g -Xmx2g -XX:MaxPermSize=512m -Doracle.jdbc.Trace=true -Djava.util.logging.config.file=/usr/share/tomcat5/conf/ojdbclog.conf -server -XX:+UseParNewGC -XX:MaxNewSize=256m -XX:NewSize=256m -XX:SurvivorRatio=128 -XX:MaxTenuringThreshold=0 -XX:+UseTLAB -XX:+UseConcMarkSweepGC -XX:+CMSClassUnloadingEnabled -XX:+CMSPermGenSweepingEnabled


Concurrent Low Pause Collector

What we decided to use is the Concurrent Low Pause Collector. We run our application out of a 64-bit Red Hat Enterprise Linux VM running on VMware ESX Server with 2 processors and 4 GB of RAM. Now that's a mouthful. You may be wondering, "With 2 processors, why don't you use the parallel GC?" Well, we tried both the parallel and the concurrent low pause GC. Really, the reason you would use a parallel collector is not that you have an extra processor, but rather that sacrificing that processor during collection is better than bringing the system to its knees. Still, there has to be a way to run the garbage collector without sacrificing processing power at what might be a crucial time. That's where the concurrent low pause collector comes in. Concurrent means it runs in a separate thread, so you may sacrifice a processor here, yes. Low Pause means that, unlike the parallel collector, it only pauses the application for a short amount of time. Win-win!! So what's this other stuff? Read on.

Heap Memory Requirements

We set the min and max to be the same. This way the JVM doesn't have to reallocate memory. It's greedy and takes everything it can right away. This saves time. In a server environment, you don't want to be conservative. Greedy is a better way to go.

Optimizing for Caching

The -XX:MaxNewSize=256m -XX:NewSize=256m properties grow the young (Eden) generation to 256 MB immediately. Just like with the heap, we don't want to reallocate on a server platform. Let's be as greedy as possible. We've settled on 256 MB because we expect a large amount of cache rewriting. Tuning JVM Garbage Collection for Production Deployments recommends setting it to 32 MB for caching systems, but we've found that for larger retained caches like ours, 256 MB is closer to what we want. This is especially the case with the number and size of HashMap instances in use in KFS between Spring and Rice. We also set the survivor space for the young generation to be 1/128th the size of the Eden space. MaxTenuringThreshold is turned off so that the new generation space becomes reusable with each collection. This is actually really, really small, and works well for caching.

Optimizing for Concurrency

CMSClassUnloadingEnabled, CMSPermGenSweepingEnabled, and UseConcMarkSweepGC give us our concurrent collection. Together, they force different algorithms for the GC that are optimal for a multiprocessor system, like forcing class unloading to prevent the GC from being intrusive on the application. UseTLAB "uses thread-local object allocation blocks. This improves concurrency by reducing contention on the shared heap lock," according to Tuning JVM Garbage Collection for Production Deployments.

Conclusion

We tried numerous configurations, and this one worked out the best for us. It gave us a 4x improvement when processing large batch jobs during the day. In most cases, we won't process large batch jobs while users are in the system; however, for testing we are limited on servers, and sometimes we try to get more out of our systems than they can give.

Turning on Oracle Driver Tracing

You are probably asking right about now, "Why would I do this if I have p6spy?" Well, here are my reasons.
  • P6Spy is a proxy on the driver. It watches SQL that goes in and prints it.
  • OJDBC in debug mode prints what goes in (not just SQL).
  • OJDBC in debug mode reports what comes out (including exceptions!)
Basically, if there's an exception or any kind of warning handled by the driver, you will find out about it. This is useful for too many reasons to count. Further, it doesn't just spit out the SQL you send it; it spits out any requests to recover connections or API-level communication happening in the backend. Really, anything that happens over Net8 will get reported. That's huge if you're working with Oracle.

Without further ado, here it is. Pass the following in when starting your JVM.

-Doracle.jdbc.Trace=true -Djava.util.logging.config.file=<location of your java logging configuration file>



Note: You must use the ojdbcX_g.jar.
Oracle optimizes its driver library for each JVM. ojdbc14.jar is optimized for JDK 1.4, ojdbc5.jar is optimized for Java 5, and ojdbc6.jar is optimized for Java 6. _g is appended to the debugging-enabled jar. For example, ojdbc14_g.jar is the JDK 1.4 jar with debugging enabled.

ojdbc debug mode uses the java.util.logging framework for its logging; you simply configure it. Here is what our config looks like:

.level=SEVERE
oracle.jdbc.level=WARNING
oracle.jdbc.handlers=java.util.logging.FileHandler
java.util.logging.FileHandler.level=WARNING
java.util.logging.FileHandler.pattern=/usr/share/tomcat5/logs/jdbc.log
java.util.logging.FileHandler.count=3
java.util.logging.FileHandler.formatter=java.util.logging.SimpleFormatter


We use a java.util.logging.FileHandler, but you can use the ConsoleHandler instead, which is useful for development. Keep in mind that the FileHandler merges all of your JDBC logging into a file, which may not always be ideal.

That's it. If you're working with Tomcat, just set this in your CATALINA_OPTS environment variable. You can also set it in the JRE settings of your Kuali project in Eclipse. If you're using Eclipse, you may also want to look at the Eclipse Log Viewer plugin, which will allow you to follow external logs from within Eclipse.

Thursday, April 22, 2010

CSV Export in Internet Explorer

At the University of Arizona, we have had a problem for some time now when opening exported CSV using Internet Explorer. We knew the problem stemmed from security settings in Internet Explorer. Until now, we have not had a solution that did not involve weakening security settings in Internet Explorer. This was important to us because our institution has security policies that are enforced internally for applications on our domain like Internet Explorer. Weakening security settings just was not an option we could use.

Just the other day, Andrew Hollamon discovered a solution to this problem that had been plaguing us. He discovered that a security setting in Internet Explorer disables caching of pages rendered through HTTPS (details here: http://support.microsoft.com/kb/812935). Basically, what happens is that the security settings in Internet Explorer require it to request pages without caching them. Caching matters because the file has to be cached before it can be opened; if the page is not cached, then there is nothing to open. Andrew's solution was to add some custom code to the Rice WebUtils class that changes the headers only for requests for reports in CSV.

We plan to contribute this back, but any institution can implement a solution easily enough by simply modifying the necessary headers in the servlet response according to this document: http://support.microsoft.com/kb/812935.
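For reference, here is a minimal sketch of what that kind of header change can look like (a hypothetical helper, not the actual WebUtils patch), assuming a standard servlet response for a CSV report served over HTTPS:

// Hypothetical helper: set headers on a CSV download so Internet Explorer can
// cache the file over SSL and then open it (see http://support.microsoft.com/kb/812935).
// The key is to avoid sending Cache-Control: no-cache/no-store (and Pragma: no-cache)
// on the attachment, since those prevent IE from saving the file it needs to open.
public static void prepareCsvDownload(javax.servlet.http.HttpServletResponse response, String fileName) {
    response.setContentType("text/csv");
    response.setHeader("Content-Disposition", "attachment; filename=\"" + fileName + "\"");
    // Allow the browser to keep a private copy instead of forbidding caching outright
    response.setHeader("Cache-Control", "private");
}

The exact header values are a policy decision for your institution; the point is simply that the no-cache directives are what break CSV downloads over HTTPS in Internet Explorer.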