7.28.2008

Hibernate Bites...

...us in the butt

Our team recently chose to use Hibernate for read-only access to data in our configuration and archive databases. Our primary reason was the baked-in caching provided by ehcache. Since then, we have had three major releases: the first two each contained a bit more Hibernate than the one before, and the third, our architecture redesign, spread Hibernate into all of the read-only data access classes. Our love affair with Hibernate was deep and passionate.

Then disaster struck. While verifying the installation of our new architecture in the disaster recovery environment, we found that many of our web-service responses were taking significantly longer than our service-level agreement allowed. One full week later, we have determined that most of the SessionFactory instances that we thought were caching were in fact not caching. One particular request to our web service generated 43,000 queries from a single entity retrieval repeated 1,000 times.

This problem did not rear its ugly head in the primary environment because the app server and the database are much closer together on the network there. The network timing differences between our primary and disaster recovery environments were not great, but when 43,000 queries are being issued, each little bit adds up.

We diagnosed our problems with a simple JSP that called SessionFactory.getStatistics() to get the collected statistics, and then used Statistics.getEntityStatistics(), Statistics.getQueryStatistics(), and Statistics.getSecondLevelCacheStatistics() to display counts for cache misses, entity loads, cache puts, and cache hits. When you have caches defined that are not being used correctly, you will see more misses than puts, or you will see no misses and many puts.

To solve our problems, we are rethinking some of the basic assumptions we had about how Hibernate does caching.
We are removing some of the mappings in which a parent class was created simply to have a single object to retrieve for each id instead of a list of child objects. We are also now using the query cache in the places where we call Criteria.list. In the future, we are looking to expand our use of Spring's JdbcTemplate, Select, and Update classes in conjunction with using ehcache directly or a home-grown caching solution.

Lesson Learned: Verify as early in the project as you can that your SessionFactory instances are actually caching your cacheable entities. If you can, have your integration tests assert on cache statistics, for both the query cache and the entity cache.
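
As a rough illustration of what asserting on cache statistics could look like, here is a minimal sketch of the rule of thumb above (more misses than puts, or puts with no misses) written against plain counters. The class, enum, and method names are hypothetical and not part of Hibernate. In a real test the three numbers would be read for one cache region from Hibernate's SecondLevelCacheStatistics (getHitCount(), getMissCount(), getPutCount()). The extra "no hits" condition on the second case is an assumption added here so that a region that is served entirely from cache is not flagged.

```java
/**
 * Hypothetical helper: classifies one cache region from its hit, miss, and
 * put counters. The counters are plain longs here; in a real test they would
 * come from Hibernate's SecondLevelCacheStatistics for that region.
 */
final class CacheRegionCheck {

    enum Verdict { NO_ACTIVITY, MORE_MISSES_THAN_PUTS, PUTS_NEVER_READ, LOOKS_USED }

    static Verdict classify(long hits, long misses, long puts) {
        // Nothing touched the region at all: the cache is probably not wired in.
        if (hits == 0 && misses == 0 && puts == 0) {
            return Verdict.NO_ACTIVITY;
        }
        // Lookups keep missing and little or nothing is being stored.
        if (misses > puts) {
            return Verdict.MORE_MISSES_THAN_PUTS;
        }
        // Entries are stored but never looked up (puts > 0 is implied here).
        // The hits == 0 condition is an assumption added for this sketch.
        if (misses == 0 && hits == 0) {
            return Verdict.PUTS_NEVER_READ;
        }
        return Verdict.LOOKS_USED;
    }

    /** Fails loudly, in the style of a test assertion, if the region looks unused. */
    static void assertRegionIsUsed(String region, long hits, long misses, long puts) {
        Verdict verdict = classify(hits, misses, puts);
        if (verdict != Verdict.LOOKS_USED) {
            throw new AssertionError("cache region '" + region + "': " + verdict
                    + " (hits=" + hits + ", misses=" + misses + ", puts=" + puts + ")");
        }
    }
}
```

An integration test could repeat the same entity retrieval many times, read the counters for the region, and call assertRegionIsUsed with them. This assumes statistics collection is switched on (for example via the hibernate.generate_statistics setting), since otherwise the counters stay at zero.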

1 comment:

Max said...

Rule #1 Don't cache anything before it is actually proven to be needed.

Rule #2 If proven to be needed, implement it and then verify with actual numbers that enabling caching actually worked.

Those two rules should be followed in *all* cache-related development, with or without Hibernate.