Hibernate Bites... (revisited)

An update to this post: we have been able to make the caching work with very few changes to our mappings. Moving the cache element from the child entity definition into the collection declaration in the parent entity got the caching to work.


Robert said...

You might find you want both... the reason is that the caches aren't caching what you think.

An entity cache will cache the actual entity, but not the relationships. A collection (or query) cache will cache the relationship, but not the entity. A few examples will help clarify the difference.

You mention a web service that brings back 1000 objects. Let's call that object Foo, and say that each Foo has a collection of Bar.

You run a query that will bring back some Foo instances. The query runs against the database, and Hibernate gets a list of IDs (and other data); these IDs are used to look up the Foo instances in the second-level cache (if needed; Hibernate will happily re-create the instances if the query has enough data).

So now Hibernate has a collection of Foo instances. But you want the Bar instances for each Foo... and that wasn't in the cache (because the cache stores the entity only). So Hibernate needs to run a query to get the IDs of the Bar instances. Hibernate needs to run this query N times - where N is the number of Foo instances.

You can cache this query by caching the collection, but all that this cache stores are the matched IDs. Hibernate then needs to create the Bar instances, either from the Bar cache, from the query that got the IDs, or via another query.

So the result is that you want three caches: a cache on Foo, a cache on Bar, and a cache on the Foo->Bar collection.
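
To make the interaction concrete, here is a toy model in plain Java. It does not use Hibernate at all: the class CacheSketch, its fields, and the rule that every cache miss costs exactly one query are assumptions made for illustration, and the Foo instances are treated as already loaded. It only counts how many "database queries" a second pass over the data would need under each cache combination.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of entity and collection caches; hypothetical, not Hibernate code. */
class CacheSketch {
    // Stand-in for the database: Foo id -> ids of its Bar children.
    private final Map<Integer, List<Integer>> fooToBarIds = new HashMap<>();
    // Collection cache: stores only the Bar ids per Foo, never the Bar state.
    private final Map<Integer, List<Integer>> collectionCache = new HashMap<>();
    // Entity cache for Bar: id -> entity state (a String here).
    private final Map<Integer, String> barCache = new HashMap<>();
    private final boolean useCollectionCache;
    private final boolean useBarCache;
    private int dbQueries = 0;

    CacheSketch(int fooCount, int barsPerFoo, boolean useCollectionCache, boolean useBarCache) {
        this.useCollectionCache = useCollectionCache;
        this.useBarCache = useBarCache;
        int nextBarId = 0;
        for (int fooId = 0; fooId < fooCount; fooId++) {
            List<Integer> ids = new ArrayList<>();
            for (int j = 0; j < barsPerFoo; j++) {
                ids.add(nextBarId++);
            }
            fooToBarIds.put(fooId, ids);
        }
    }

    /** Returns the Bar ids of one Foo; a collection-cache miss costs one query. */
    private List<Integer> barIdsFor(int fooId) {
        if (useCollectionCache && collectionCache.containsKey(fooId)) {
            return collectionCache.get(fooId);
        }
        dbQueries++;
        List<Integer> ids = fooToBarIds.get(fooId);
        if (useCollectionCache) {
            collectionCache.put(fooId, ids);
        }
        return ids;
    }

    /** Returns one Bar; an entity-cache miss costs one query (a simplification). */
    private String loadBar(int barId) {
        if (useBarCache && barCache.containsKey(barId)) {
            return barCache.get(barId);
        }
        dbQueries++;
        String bar = "Bar#" + barId;
        if (useBarCache) {
            barCache.put(barId, bar);
        }
        return bar;
    }

    /** Loads every Bar of every Foo; returns the queries issued by this call. */
    int loadAll() {
        int before = dbQueries;
        for (int fooId : fooToBarIds.keySet()) {
            for (int barId : barIdsFor(fooId)) {
                loadBar(barId);
            }
        }
        return dbQueries - before;
    }

    public static void main(String[] args) {
        boolean[][] configs = {{false, false}, {true, false}, {false, true}, {true, true}};
        for (boolean[] c : configs) {
            CacheSketch sketch = new CacheSketch(1000, 3, c[0], c[1]);
            sketch.loadAll(); // first pass warms whatever caches are enabled
            System.out.println("collection=" + c[0] + " bar=" + c[1]
                    + " second-pass queries=" + sketch.loadAll());
        }
    }
}
```

In this model the collection cache on its own still leaves one lookup per Bar on the second pass, and the Bar cache on its own still leaves one ID query per Foo; only the two together bring the second pass down to zero queries.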

But wait... that's not all! Any update to any Bar instance will invalidate the Foo->Bar collection cache. So if Bar is an object that regularly changes, or new Bar instances are regularly created, then the collection cache won't help you out very much in production.

There are two options that may help:

a) make the Foo->Bar collection lazy (only good if you don't need the Bar collection in the web service)

b) have a very large fetch size, so that multiple Bar instances can be brought back at a time (thus reducing the value of N).
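
As a rough picture of option (b), the sketch below assumes it means loading the Bar collections for several Foo instances with one query. That reading is an assumption, and the class BatchLoadSketch and its method batches are hypothetical names; nothing here calls Hibernate. Each chunk of Foo IDs stands for one query, so the number of chunks is the new value of N.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration of batching lookups; not Hibernate code. */
class BatchLoadSketch {
    /** Splits the Foo ids into chunks of at most batchSize; one chunk = one query. */
    static List<List<Integer>> batches(List<Integer> fooIds, int batchSize) {
        if (batchSize < 1) {
            throw new IllegalArgumentException("batchSize must be >= 1");
        }
        List<List<Integer>> result = new ArrayList<>();
        for (int start = 0; start < fooIds.size(); start += batchSize) {
            int end = Math.min(start + batchSize, fooIds.size());
            // Copy the view so each chunk is independent of the source list.
            result.add(new ArrayList<>(fooIds.subList(start, end)));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> fooIds = new ArrayList<>();
        for (int i = 0; i < 1000; i++) {
            fooIds.add(i);
        }
        System.out.println(batches(fooIds, 1).size());  // 1000 queries, one per Foo
        System.out.println(batches(fooIds, 50).size()); // 20 queries
    }
}
```

With 1000 Foo instances, a batch of 50 cuts the collection lookups from 1000 to 20 in this model; the cost is that each query returns more rows.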

Ryan David Ransford said...

In our case, we decided to cache only the parent entity and its embedded collection.
1. All of our Hibernate objects are completely read-only. We control the database and only touch it at release time. This way we can rule out the update/freshness issue.
2. Only the parent entities are directly retrieved. The child entities are never directly fetched.