Probably the best testimony to GORM/Hibernate complexity is that the Grails project itself has a hard time deciding what to do. Interestingly, I recently found that the issue I am about to present is also discussed in The Definitive Guide to Grails 2 on page 217 (Automatic Session Flushing), and that what the book says is no longer true if you are using Grails 2.4.3!
Examples in all my previous posts have been verified on Grails 2.4.3 (currently latest) as well as on 2.3.3 (what I use at work). This post shows GORM behavior that has changed since Grails 2.3.3. I have tested both versions by creating new projects and accepting all default configurations.
Code Example:
Assume a very simple domain class which looks like this:
class Simple {
    String name
}
and a saved record with name 'simple1'. Here is a test case (asserts will fail using Grails 2.3.3 and will pass with 2.4.3):
void testMysteriousSave() {
    Simple.withNewSession() {
        def simple1 = Simple.findByName('simple1')
        simple1.name = 'simple1b'
        Simple.findAll()
        simple1.discard()
    }
    Simple.withNewSession() {
        assert Simple.findByName('simple1')
        assert !Simple.findByName('simple1b')
    }
}
(I have used findAll() just for simplicity, but many other queries involving the Simple class should cause a similar issue.) What is going on here? GORM/Hibernate decides when to actually save objects (this is sometimes called the write-behind approach, but in this case it really is 'write-ahead of the developer'). Queries in GORM/Hibernate have (yet one more) serious side-effect: they can persist objects stored on the session. In the test above, under Grails 2.3.3 the findAll() call writes the renamed simple1 to the database before discard() gets a chance to run, which is why the asserts fail. This behavior is controlled using FlushMode (see https://docs.jboss.org/hibernate/core/3.6/javadocs/org/hibernate/FlushMode.html). The GORM implementation must have recently changed how it uses FlushMode to avoid the problem shown in the above code.
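To make the 'flush before query' idea concrete, here is a minimal sketch in plain Groovy. It does not use Hibernate at all: FlushingToySession, ToyFlushMode and the map standing in for the database are names I made up for illustration, and the real mechanism is far more involved.

```groovy
// Toy model of "flush before query". Nothing here is Hibernate API:
// ToyFlushMode, FlushingToySession and the 'database' map are assumptions.
enum ToyFlushMode { AUTO, MANUAL }

class FlushingToySession {
    Map<Long, String> database              // id -> name, stands in for a table
    ToyFlushMode flushMode = ToyFlushMode.AUTO
    private Map<Long, Map> attached = [:]   // objects loaded through this session

    Map load(Long id) {
        if (!attached.containsKey(id)) {
            attached[id] = [id: id, name: database[id]]
        }
        attached[id]
    }

    // Writes every attached object whose name differs from the stored row.
    void flush() {
        attached.each { id, obj ->
            if (database[id] != obj.name) {
                database[id] = obj.name
            }
        }
    }

    // A "finder": in AUTO mode pending changes are written before the query runs.
    List<String> findAllNames() {
        if (flushMode == ToyFlushMode.AUTO) {
            flush()
        }
        database.values().toList()
    }

    // Forget an object without saving it, like discard() in the test above.
    void discard(Map obj) {
        attached.remove(obj.id as Long)
    }
}

// AUTO: the query writes the rename before discard() is reached.
def db = [(1L): 'simple1']
def auto = new FlushingToySession(database: db)
def simple1 = auto.load(1L)
simple1.name = 'simple1b'
auto.findAllNames()
auto.discard(simple1)
assert db[1L] == 'simple1b'

// MANUAL: the same steps leave the stored row untouched.
def db2 = [(1L): 'simple1']
def manual = new FlushingToySession(database: db2, flushMode: ToyFlushMode.MANUAL)
def other = manual.load(1L)
other.name = 'simple1b'
manual.findAllNames()
manual.discard(other)
assert db2[1L] == 'simple1'
```

The only difference between the two runs is the flush mode, which mirrors the difference I observed between Grails 2.3.3 and 2.4.3 for this test.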
Side Note: Here is how a similar code example is described in "The Definitive Guide to Grails 2" book: "You may consider the behavior of automatic flushing to be a little odd, but if you think about it, it depends very much on your expectations. If the object weren't flushed to the database, then the change made to it on line 2 [refers to a line analogous to simple1.name = 'simple1b'] would not be reflected in the results [result of the finder on the following line]. That may not be what you're expecting either!". Hmm, that appears to be simply not true. GORM/Hibernate does not return the latest data from the database; that is why we have the 'repeatable finder' problem discussed in post 2. But if somehow this was all changed and Hibernate started working more like, say, Active Record... Why would that be confusing? No, sorry, unexpected side-effects are never part of my expectations!
Side Note 2: A new project in Grails 2.4.3 gets this (new to me) default setting in DataSource.groovy:
hibernate {
    flush.mode = 'manual'
}
Changing this to 'auto' does not seem to make any difference in how my test runs. I believe the new behavior is a GORM code change, not just a change in the default project configuration. (Please correct me if I am wrong.)
Why is (or was) this behavior dangerous:
There are many good reasons why I may want to keep validation logic outside of the domain object, but this is/was a very risky thing to do. Consider this simplistic controller pseudo-code (assume the validate method sets errors on the object and returns false if validation fails):
def createNew() {
    DomainObject domainObject = DomainObject.findById(params.id)
    domainObject.parameters = params.data
    if (validate(domainObject)) {
        domainObject.save(flush: true, ...)
    } else {
        domainObject.discard()
    }
    return domainObject
}
Using Grails 2.3.3, that code was likely to save the object during the validate call if the validate method executed a 'wrong' query! So here we have it again: an example of code that works fine but will break if you add a 'finder' query.
My personal experience with this issue:
Non-transactional save/validation logic is only for brave Grails developers, and I am not that brave. The code I work with uses transactional services to validate/create/update domain objects. If such code fails, any 'write-ahead of the developer' saves are rolled back. The problem, however, still exists. Since the actual persisting can happen before the validation logic completes, it is possible for Grails to try to save an object that, for example, violates some DB-enforced integrity checks. It is quite surprising when finder queries start throwing database update or insert errors!
How to find these errors:
Grails unit tests will not find them (EDITED: use of HibernateTestMixin may change that. I have not tried it). Integration tests and functional tests will find them.
Hindsight is 20/20?
The idea that side-effects are evil and that the ability to manage/isolate side-effects is what differentiates good programs from bad programs is not new, it predates Hibernate by decades. Insert/update operations are serious side effects. How can I manage these serious side-effects in my programs, if GORM/Hibernate hide these from me? I find FlushMode a poor way to manage how objects are persisted to the database and the very idea of hiding these operations wrong in principle.
Future "I don't like" posts:
So far, I have taken a test-driven approach to these posts. For each post I have created a new Grails project 'from scratch' with the default configuration and written a test or series of tests to verify every problematic behavior I wanted to talk about. This is not always easy to do. There are some interesting side-effects that are hard to reproduce, possibly because of layers of configuration or something specific to a particular domain class. For example, figuring out why grailsApplication.isDomainClass sometimes does not work (http://jira.grails.org/browse/GRAILS-11630) took some effort.
There are quite a few very interesting side effects for which I do not yet have a 'working' test case. For example: GORM/Hibernate auto-saving is triggered by GORM/Hibernate thinking that the object is dirty. I have seen very surprising behavior around how the dirty flag is set but, currently, I have no way to reproduce this behavior 'from scratch'. This is a slow process and I am not sure I will succeed in reproducing all issues.
I want to give it a couple of weeks before I write the next installment. At some point I may just write about things that I experienced, as opposed to things that I can demonstrate with a test case.
So this will not be my last post about Grails, but most likely the next one will take a couple of weeks to prepare.
Added 2014/08/25:
I have created this JIRA: http://jira.grails.org/browse/GRAILS-11687
Current behavior of FlushMode is confusing. This does not change the story in this post, but maybe will help clarify current behavior.
Added 2014/09/04:
My next post shows FlushMode behavior in Grails 2.4.3 to be even more confusing. Example shown in my next post exhibits flush mode behavior opposite to what I get when running the code example from this post. Seems like Grails 2.3.3 was at least more consistent.
Added 2014/10/25:
Comments on these JIRA tickets: GRAILS-11797, GRAILS-11536 shed some light on the confusing behavior of FlushMode in 2.4.3.
"It should be noted that HibernateTransaction manager switches the flush mode of the current session to AUTO when the transaction starts."
"Flush mode doesn't have any effect within a transaction. Changes get always flushed at the end of the transaction (a non-read only transaction) if it's not rolledback (the flush mode doesn't matter in transactions)."
The second quote does not seem to be exactly correct in light of the examples in my next post.
Software has two ingredients: opinions and logic (=programming). The second ingredient is rare and is typically replaced by the first.
I blog about code correctness, maintainability, testability, and functional programming.
This blog does not represent views or opinions of my employer.
Friday, August 22, 2014
Friday, August 15, 2014
I don't like Grails/Hibernate, part 4. Hibernate proxy objects.
Hibernate uses proxy classes to implement lazy loading of nested domain objects. As a result, instead of the BankAccount class (defined in post 1) I may sometimes get an object with a class named something like 'BankAccount_$$_javassist_27'.
The idea behind the general concept of a proxy is that using a proxy object should be identical to using the real object. This utopian wish is impossible to accomplish in practice, and Hibernate proxies are no exception. Hibernate proxies can cause several problems, including:
- gotchas related to equals method implementation (not discussed),
- the famous LazyInitializationException (which I have promised not to discuss)
- casting problems caused by use of polymorphic associations (this is a well documented issue and I will stay away from it as well).
Probably for that reason, proxies had a rough start in Grails. The term 'proxy hell' was coined and some interesting bugs like this one: http://jira.grails.org/browse/GRAILS-2570 have been reported.
Code Examples:
This code will get me a proxy object:
import org.hibernate.proxy.HibernateProxy

def id
BankAccount.withNewSession {
    id = BankAccount.findByName('a1').id
}
def a1 = BankAccount.load(id)
assert a1 instanceof HibernateProxy
Note that, as usual with Hibernate, I can break the above code by adding an isolated query:
BankAccount.findByName('a1') //added this line
def id
BankAccount.withNewSession {
    id = BankAccount.findByName('a1').id
}
def a1 = BankAccount.load(id)
assert a1 instanceof HibernateProxy //now fails: the session already holds the real object
You may find this last code example somewhat unrealistic, but I hope you agree with me that it explains the intermittent aspect of how proxies work (or why they often do not work).
How proxy code typically breaks:
Sooner or later, each Grails app will need to use domain object class to do various things with it like:
- check if an object is a domain object or something else
- introspect GORM properties on a domain object
- find beans named after a domain object (for example, find BankAccountService if object is BankAccount)
To do any of these, I need to remember that code like
entity.getClass().simpleName
may give me the wrong class name! And if I do forget, my code will still work, only not always.
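Here is a small plain-Groovy sketch of that trap and of the usual way around it. Everything in it is a stand-in: ToyProxy plays the role of HibernateProxy, BankAccountGeneratedProxy plays the role of a generated class such as 'BankAccount_$$_javassist_27', and realClassOf is a hypothetical helper. In a real application, Hibernate's own Hibernate.getClass(entity) is, as far as I know, the method meant for this job.

```groovy
// Toy stand-ins; none of these are Hibernate types.
interface ToyProxy {}                        // plays the role of HibernateProxy

class BankAccount {
    String name
}

// Plays the role of the class Hibernate generates at runtime.
class BankAccountGeneratedProxy extends BankAccount implements ToyProxy {}

// Hypothetical helper: returns the domain class even when given a proxy.
Class realClassOf(Object entity) {
    entity instanceof ToyProxy ? entity.getClass().superclass : entity.getClass()
}

def plain = new BankAccount(name: 'a1')
def proxied = new BankAccountGeneratedProxy(name: 'a1')

// The trap: the runtime class is not the domain class.
assert proxied.getClass().simpleName == 'BankAccountGeneratedProxy'

// Going through the helper gives the same answer for both objects.
assert realClassOf(proxied).simpleName == 'BankAccount'
assert realClassOf(plain).simpleName == 'BankAccount'
```

The point is that every place that derives something from the class (a service bean name, an artefact lookup) has to go through such a helper, and it takes only one forgotten call site to get the intermittent failure.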
If you are very diligent in making sure that you always use the un-proxied classes only, then rest assured that someone else is not that diligent. For example, here is a currently open Grails JIRA: http://jira.grails.org/browse/GRAILS-11630 and if you search source of, for example, various template libraries available out there you will find things like this code:
grailsApplication.getArtefact(
DomainClassArtefactHandler.TYPE,
element.getClass().name
)
which, again, I am sure was tested and works ... only not always.
How to test for proxy problems:
Integration tests may find them, but few developers write integration tests that separate these essential test elements into separate Hibernate sessions:
- data setup (given)
- test itself (when)
- test asserts (then)
Next topic:
So far I have focused on a class of Hibernate side-effects, where:
BankAccount.findByName('a1') //(1)
someOtherCode() //(2)
adding isolated query (1) changes behavior of code (2).
I want to continue the discussion about side-effects, but from a somewhat different angle. In post 1, I promised not to talk about auto-saving. I will break that promise. There are some very 'interesting' aspects of auto-saving worth examining. For example, I sometimes get DB update operation errors from issuing a finder query. I will try to examine a few of these interesting things: things that happen as a consequence of the Hibernate design decision to make a major side-effect, like an SQL update or insert operation, implicit and invisible.
Monday, August 11, 2014
I don't like Grails/Hibernate part 3. DuplicateKeyException: Catch it if you can.
This post follows a pattern I used so far: it documents a case where adding an isolated query can break Grails/Hibernate code.
I often think that a measure of a well-designed library is how it handles exceptional cases. Hibernate does not 'exception' well, but Hibernate behind Spring Framework Templates, and then behind GORM, can be really puzzling. In this post, I will examine one such puzzling case.
Call for short Hibernate Sessions:
By design, Hibernate sessions are implicit in Grails and it is the framework's responsibility to manage the Hibernate session life cycle when processing HTTP requests. Taking over that role does not seem to be a good idea. With that said, since there are no good solutions to the 'repeatable finder' problem, a bad solution may still be the best I have! Also, there are other, better acknowledged reasons why I may need to control Hibernate sessions, such as performance, long running scheduled jobs, and Grails integration tests.
I find the whole issue a bit ironic. Web developers have been, by now, conditioned to minimize the use of the HTTP session. REST wants to ban it altogether. Yet, the Hibernate design is to maximize the use of Hibernate sessions. Both are shared application state; if one is bad, so should be the other! Is this the idea: that a shorter-lived evil is less evil, so it is OK to use it for everything? If that is so, here you have it, one more argument for making the Hibernate session shorter.
There is an API to interact with hibernate sessions. I can use sessionFactory bean directly to flush/close/create sessions or I can use withNewSession method available on any GORM domain object.
Unfortunately, dealing with more than one Hibernate session exposes me to a bunch of Hibernate/Spring exceptions that would be otherwise unknown to me as a Grails developer: HibernateSystemException, NonUniqueObjectException (hibernate), and DuplicateKeyException (spring) are among them. I will focus on the last 2.
NonUniqueObjectException (hibernate), and its twin DuplicateKeyException (spring):
In my experience so far, they seem to be linked to each other (DuplicateKeyException wraps Hibernate's NonUniqueObjectException). I had a hard time finding good documentation about these two, that is, documentation that is relevant to how Grails works. The Hibernate JavaDoc for NonUniqueObjectException gives me only this:
"This exception is thrown when an operation would break session-scoped identity. This occurs if the user tries to associate two different instances of the same Java class with a particular identifier, in the scope of a single Session." (http://docs.jboss.org/
This is not something I, as a Grails developer, want to identify with. Instead, I would prefer the framework to enforce that objects returned by a query in one session are not used in another session. But that is not exactly what the error indicates.
Please note that Hibernate is not very logical here either: it is not exactly that the user always 'associates' instances with the session. They can get associated sometimes in ways that would surprise most of the users! (I may need to post about it too.) Hibernate does not provide any public API to query for what is associated with the session. It considers this 'private' information. Well, if it is so private that I can't even query for it, why am I seeing it then in the exception?
DuplicateKeyException documentation seems simply incorrect for the context in which I am seeing this error:
"Exception thrown when an attempt to insert or update data results in violation of an primary key or unique constraint. Note that this is not necessarily a purely relational concept; unique primary keys are required by most database types." (http://docs.spring.io/spring/
Fortunately, the message I typically get is more descriptive: "a different object with the same identifier value was already associated with the session". So both the documentation and the exception design seem to be a mess here, but the real mess is still ahead of us.
Code Examples:
As a Grails developer, you may find it surprising that this code even works:
def ac1 = BankAccount.findByName('a1')
BankAccount.withNewSession { session2 ->
    ac1.name = 'a1b'
    ac1.save(flush: true, failOnError: true)
}
It is easier to see the problem if I make the code more Hibernate-explicit (which still works just fine):
def ac1 = BankAccount.findByName('a1')
BankAccount.withNewSession { session2 ->
    ac1.name = 'a1b'
    session2.saveOrUpdate(ac1)
}
If ac1 is associated with my first session and not session2, why does session2 allow me to save it? Would it not be more logical if this code threw an exception with something like 'not associated with session'? This may make sense for the more general case that Hibernate tries to accommodate, but it makes Grails behave inconsistently.
Now, I can break it by adding a finder:
def ac1 = BankAccount.findByName('a1')
BankAccount.withNewSession {
    BankAccount.findByName('a1') //added this line
    ac1.name = 'a1b'
    println shouldFail(org.springframework.dao.DuplicateKeyException) {
        ac1.save(flush: true, failOnError: true)
    }
}
or to be more Hibernate-explicit:
def ac1 = BankAccount.findByName('a1')
def id = ac1.id
BankAccount.withNewSession { session ->
    session.get(BankAccount, id)
    ac1.name = 'a1b'
    println shouldFail(org.hibernate.NonUniqueObjectException) {
        session.update(ac1)
    }
}
And the names find/get sound so innocent ... Imagine running a diff, comparing what changed from last stable source code version to figure out what caused the problem: and finding only extra finder methods!
Again, one sane way to think of this issue is that I am using ac1 associated with session1 on a wrong session (session2) and that is wrong. But if that is the case WHY does my first example work!
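My best explanation of the 'why' can be sketched with a toy session in plain Groovy. This is only a model built from the exception message quoted earlier, not Hibernate code: IdentityToySession, ToyAccount and ToyNonUniqueObjectException are made-up names, and I am assuming that the check is simply 'is a different instance already registered under this id?'.

```groovy
// Toy model of session-scoped identity. None of these classes are Hibernate classes.
class ToyNonUniqueObjectException extends RuntimeException {
    ToyNonUniqueObjectException(String message) { super(message) }
}

class ToyAccount {
    Long id
    String name
}

class IdentityToySession {
    // id -> the one instance this session knows about
    private Map<Long, ToyAccount> identityMap = [:]

    // Simulates a finder/get: registers an instance for the id on first use.
    ToyAccount get(Long id) {
        if (!identityMap.containsKey(id)) {
            identityMap[id] = new ToyAccount(id: id, name: 'a1')
        }
        identityMap[id]
    }

    // Simulates update(): adopts an unknown instance, rejects a second instance.
    void update(ToyAccount entity) {
        def known = identityMap[entity.id]
        if (known != null && !known.is(entity)) {
            throw new ToyNonUniqueObjectException(
                'a different object with the same identifier value was already associated with the session')
        }
        identityMap[entity.id] = entity
    }
}

def session1 = new IdentityToySession()
def ac1 = session1.get(1L)

// First example: session2 has never seen id 1, so it happily adopts ac1.
def session2 = new IdentityToySession()
session2.update(ac1)

// Second example: the 'innocent' finder registers another instance first.
def session3 = new IdentityToySession()
session3.get(1L)
def failed = false
try {
    session3.update(ac1)
} catch (ToyNonUniqueObjectException e) {
    failed = true
}
assert failed
```

In this model the session never asks 'does this object belong to me?'. It only asks whether the id is already taken by someone else, which is why the outcome depends on which finders happened to run earlier.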
SIDE NOTE: In my experience, this is not the only way to get into DuplicateKeyException trouble and I have not figured out all Grails code triggers for it. In most cases, I was able to solve the problem by 'bringing' some domain object into the current session. So the mechanics of the problem seem to be always on some level similar to what I have described.
Grails unit test coverage will be useless for finding DuplicateKeyException/NonUniqueObjectException (EDITED: use of HibernateTestMixin may change that).
Both integration and functional tests are capable of finding this issue.
Why has Grails done it this way?
From what I know, GORM tries to be a thin Groovy layer around Spring Hibernate Templates. In addition, Hibernate does not expose any public API to query what domain objects have been attached to the session, so GORM would have to remember that. One solution could be for GORM to store the 'owning' session on each domain object created by Grails and use it to provide a more meaningful and consistent exception if client code tries to use the object in the context of another session.
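A rough sketch of that 'owning session' idea, again in plain Groovy with made-up names (OwnerAwareSession, OwnedAccount, WrongSessionException). It only shows the shape of the check I have in mind and says nothing about how GORM could actually implement it.

```groovy
// Sketch of the 'owning session' idea. All names are hypothetical; this is not GORM code.
class WrongSessionException extends RuntimeException {
    WrongSessionException(String message) { super(message) }
}

class OwnedAccount {
    Long id
    String name
    Object owner          // the session that produced this object
}

class OwnerAwareSession {
    String label

    // A finder stamps every object it returns with its own session.
    OwnedAccount get(Long id) {
        new OwnedAccount(id: id, name: 'a1', owner: this)
    }

    // save() refuses objects produced by another session, extra finders or not.
    void save(OwnedAccount account) {
        if (!account.owner.is(this)) {
            throw new WrongSessionException(
                'object ' + account.id + ' belongs to session ' + account.owner.label +
                ', not to session ' + label)
        }
        // ... the actual write would happen here
    }
}

def session1 = new OwnerAwareSession(label: 'session1')
def session2 = new OwnerAwareSession(label: 'session2')
def ac1 = session1.get(1L)

session1.save(ac1)                       // fine: same session

def message = null
try {
    session2.save(ac1)                   // rejected every time, not only after a finder
} catch (WrongSessionException e) {
    message = e.message
}
assert message == 'object 1 belongs to session session1, not to session session2'
```

With a check like this, both of my examples would fail in the same way and with a message that points at the real mistake.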
References:
Grails JIRA: https://jira.grails.org/browse/GRAILS-11652
Summarizing examples shown so far:
In its ORM pursuit, Hibernate has lost something much more fundamental and infinitely more important than any purist ORM thing can possibly be: the ability to manage unwanted side effects. As we have seen in a couple of examples already, in code like this:
BankAccount.findByName('a1') //(1)
someOtherCode() //(2)
(1) can change behavior of (2), in most extreme case it can break it.
As a result, ability to decouple application logic is largely lost if I use Grails/Hibernate stack. I consider this a major Hibernate design flaw, but because GORM Domain Objects are likely to be used extensively in Grails apps, Grails applications are more impacted by it.
Also note that problems like this may be very hard to troubleshoot. Even if I somehow manage to have a mental image of every finder and every eagerly loaded association, Grails/Hibernate can (and will) put objects in the cache that will surprise anybody. (I may write about it too.)
In my next post I will examine the same pattern (adding isolated query breaks Grails code) in a context of Hibernate proxies and talk about another related Grails bug.
Sunday, August 3, 2014
I don't like Hibernate/Grails, part 2, repeatable finder problem: trust in nothing!
I was hoping for some 'inalienable truths' developer can rely on.... Like, that things that should obviously return true never return false.
(A more correct technical term for this is property but I find inalienable truth more fun).
The assert code from last post is one such example:
assert BankAccount.findAllByBranch(myBranch).every{
it.branch == myBranch
}
Repeatable finders:
(I use this term as a reverse analogy to non repeatable reads.) In Hibernate/GORM the above assertion does not need to be true. For each finder/getter hibernate stores returned objects in its session and when next finder/getter is called hibernate will use the stored objects whenever it can. It will not refresh them. So you have to assume that any finder will return some (or many) of the domain objects found by previous finders. What if something has happened that hibernate session does not know about between the time of your current finder and the time previous finders ran?
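The mechanism can be modeled in a few lines of plain Groovy. This is a toy, not Hibernate: CachingToySession and the list of maps standing in for the table are assumptions of mine, and the only behavior modeled is 'an object already known to the session is returned as is, never refreshed'.

```groovy
// Toy model of the session cache. CachingToySession and 'table' are illustrative assumptions.
class CachingToySession {
    List<Map> table                      // shared "database": rows with id, name, branch
    private Map<Long, Map> cache = [:]   // objects this session has already handed out

    // Every finder funnels rows through the cache: known ids are NOT refreshed.
    Map attach(Map row) {
        if (!cache.containsKey(row.id)) {
            cache[row.id] = new HashMap(row)   // snapshot taken on first sight
        }
        cache[row.id]
    }

    Map findByName(String name) {
        def row = table.find { it.name == name }
        row ? attach(row) : null
    }

    // The WHERE clause runs against current rows, the results come from the cache.
    List<Map> findAllByBranch(String branch) {
        table.findAll { it.branch == branch }.collect { attach(it) }
    }
}

def table = [[id: 1L, name: 'ac1', branch: 'branch2']]

def user1 = new CachingToySession(table: table)
user1.findByName('ac1')                        // an earlier, unrelated finder

table[0].branch = 'branch1'                    // someone else moves ac1 to branch1

def accounts = user1.findAllByBranch('branch1')
assert accounts*.name == ['ac1']               // the query matched the new row...
assert accounts[0].branch == 'branch2'         // ...but the returned object is stale
assert !accounts.every { it.branch == 'branch1' }

// The same finder in a fresh session sees consistent data.
def fresh = new CachingToySession(table: table).findAllByBranch('branch1')
assert fresh.every { it.branch == 'branch1' }
```

The last two lines also preview the partial workaround discussed near the end of this post: a session with nothing in it has nothing stale to hand back.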
So here is one example showing how to break the above BankBranch assert:
(assume ac1.branch == branch2)
... HTTP request for User1:
... 'componentA' executes:
println 'Tracing something about '+BankAccount.findByName('ac1')
... some other expensive computation executes
... HTTP request for User2:
def acc = BankAccount.findByName('ac1')
acc.branch = branch1
acc.save(...)
... HTTP request for User1 continues:
... 'componentB' executes:
def myBranchAccounts = BankAccount.findAllByBranch(branch1)
(myBranchAccounts includes ac1 but Hibernate returns old, not refreshed version of it so ac1.branch == branch2 is still true)
... myBranchAccounts are rendered on a view page
(User1 is presented a list of all accounts from branch1 including ac1, which is happily displayed showing branch2. User1 is surprised.)
This is not necessarily a strict concurrency problem. You may have code which bypasses Hibernate (maybe uses the groovy.sql.Sql class directly) and get into very similar issues.
It is also interesting to think about the componentA and componentB code from the point of view of the unit test coverage leak problem I described in the 'part 1' post.
Here are 2 other inalienable truths (properties) that are no longer:
Uniqueness constraints:
My BankAccount was declared with unique constraint on the name field (database enforced uniqueness on the name column). So if I do this:
def accounts = BankAccount.findAll()
I will never see the same name repeated, right? Wrong. Here is a concurrent usage that shows how that breaks:
...HTTP request for User1:
... 'componentA' executes:
println 'Tracing something about ' +BankAccount.findByName('ac1')
... some other expensive computation executes
...HTTP request for User2:
def ac1 = BankAccount.findByName('ac1')
def ac2 = BankAccount.findByName('ac2')
ac1.name = 'ac1_b'
ac1.save(...)
ac2.name = 'ac1'
ac2.save(...)
... HTTP request for User1 continues:
... 'componentB' executes:
def allAccounts = BankAccount.findAll()
(allAccounts contains the old version of ac1 with ac1.name == 'ac1'
and ac2 with ac2.name == 'ac1')
... allAccounts are rendered on a view page
(User1 is presented a list of all accounts and account 'ac1' shows up twice. User1 is upset.)
You may find it unrealistic that User2 can perform 2 renames within the short period between 2 finder calls in one HTTP request. What if there were 2 users renaming one object each? At any rate, there are other possible domain objects than BankAccount and other fields that may need a uniqueness constraint. I think the issue is demonstrated well enough.
Again, you can get into similar problems if you use groovy.sql.Sql directly.
Results which look like uncommitted reads:
If I transfer money between accounts inside a transaction I should never ever see the transfer applied to one account and not the other. Right? Wrong again:
...HTTP request for User1:
... 'componentA' executes:
println 'Tracing something about ' + BankAccount.findByName('ac1')
... some other expensive computation executes
...HTTP request for User2:
def ac1 = BankAccount.findByName('ac1')
def ac2 = BankAccount.findByName('ac2')
transferMoney(ac1, ac2, 1000) //transfer 1000 dollars
...HTTP request for User1 continues:
def allAccounts = BankAccount.findAll()
(allAccounts contains the old amount in ac1 but the new amount in ac2)
... allAccounts are rendered on a view page
(User1 is presented a list of all accounts and the list looks inconsistent. Where did the 1000 go? User1 is a software tester, he/she is now furious, spends hours figuring out what went wrong, and the problem magically just disappears.)
And again, just use groovy.sql.Sql directly to get into the same issue without concurrency.
Why Does Hibernate Work Like This?
I imagine that it must have been tempting to use a single Java object to represent a single record. This is also the purist approach to ORM: node.children.first().parent.is(node). But with Hibernate this may not have been just a temptation. Hibernate designers decided at some point that Hibernate will save objects attached to the session automatically. I imagine that it would be very hard to deal with auto-saving if you had more than one domain object representing the same record (which one would you save?).
So why not refresh existing object each time it is retrieved? Maybe because that would be a serious side-effect ;) If some objects have been changed locally and also concurrently changed in the database: have Hibernate designers been concerned about throwing ConcurrentModificationException from a finder?
Well I do not see why, because Hibernate finders already save objects and you are likely to get a collection of interesting save errors when calling a finder. (Talk about aversion to side-effects!)
Can I be just careful?
Be careful not to pollute hibernate session - that may not be so easy. For example, consider that the BankBranch class has something like featuredAccount association back to BankAccount. If that gets eagerly loaded at the time branch1 is retrieved hibernate session is already polluted with one BankAccount at the onset of the first example (without any artificial println statements).
More complex applications may want to do some HTTP filter before and after logic which uses hibernate objects. Applications may have layers of complexity which share the same hibernate session. Controlling what is and what is not in that session is unrealistic.
Why is this bad?
Developers experienced with relational databases are likely to expect certain behaviors from the code that, with GORM/Hibernate, are just gone. The unexpected behavior may be very intermittent and impossible to troubleshoot. In my example I have 'broken' the code by inserting a println statement printing something about one account. What if the finder 'polluting' the Hibernate session was executed only under some (rare) conditions?
I think developers tend to think of finders and getters as safe methods to call. Almost always getters do not mutate anything. With Hibernate getter/finder have side-effects, one of them is mutated Hibernate session and this is easy to forget when you design and code your application.
I believe that any nontrivial Grails app most likely has issues/bugs related to repeatable finder.
In addition, Hibernate offers NO public API to query what is stored on the session. So if you think of some programmatic ways to solve this issue think some serious hacking.
New Session a Solution?
There is one very tempting partial solution to this.
If you really need consistent results from a finder, keep it in an isolated Hibernate session. (LazyInitializationException alert flashing. OK, I promise not to talk about LazyInitializationException. :)
The idea is that if this line (from first example):
def myBranchAccounts = BankAccount.findAllByBranch(branch1)
runs in a new (and therefore unpolluted) hibernate session the domain objects returned from the finder will all have the most recent values and none of the surprising things will ever happen.
Or ...
Enter the next Hibernate gotcha: DuplicateKeyException (and friends), the subject of my next post.
My current thinking is that using a new session on certain 'crucial' selects seems to be a very good option to reduce the impact of the repeatable finder problem. This technique could work well, especially if the finder we want to protect is the very last Hibernate call in the HTTP request. It will not solve the problem but may reduce its impact. I may return to this thought later.
I believe there is currently no full solution to repeatable finder problem described in this post.
Added 2014/08/12: Thinking more about impacts:
I find it helpful to think about application code from the point of view of properties (logical consistency rules that need to be true). An application may rely on such properties explicitly (for example, your logic will throw an exception or return incorrect results if the name property is not unique) or implicitly (it will be embarrassing to show the user a list of items with seemingly broken uniqueness), and it can actually be enforcing such properties (for example, Grails code is used to enforce some more complex uniqueness rules).
There are many, many possible properties related to domain objects (we have seen only 3 in this post). Some are derived from software requirements but many have a much lower level nature. For example, each 'where' or 'join' criterion in the underlying SQL statements is likely to imply a property (as shown by the BankAccount.findAllByBranch() example).
Some properties are meant to be enforced by application code, some by DB, some by the framework.
The repeatable finder problem is likely to affect or invalidate a large portion of such properties in your application!
And you cannot rely on even a DB-enforced property to hold.
The impact seems very big.
(EDITED 2014/09/13) References:
I have posted a question about this issue here: Stack Overflow Question
I have now posted one (terrible) solution to how one can identify and fix such problems as answer to the Stack Overflow Question.
Here is Grails JIRA ticket for it: http://jira.grails.org/browse/GRAILS-11645
Another relevant JIRA: http://jira.grails.org/browse/GRAILS-11644
Forum: http://groups.google.com/forum/#!topic/grails-dev-discuss/wzekMGC0ibE
Hibernate JIRA: https://hibernate.atlassian.net/browse/HHH-9367
(A more correct technical term for this is property but I find inalienable truth more fun).
The assert code from last post is one such example:
assert BankAccount.findAllByBranch(myBranch).every{
it.branch == myBranch
}
Repeatable finders:
(I use this term as a reverse analogy to non-repeatable reads.) In Hibernate/GORM the above assertion does not need to be true. For each finder/getter, Hibernate stores the returned objects in its session, and when the next finder/getter is called, Hibernate will use the stored objects whenever it can. It will not refresh them. So you have to assume that any finder will return some (or many) of the domain objects found by previous finders. What if, between the time the previous finders ran and the time of your current finder, something has happened that the Hibernate session does not know about?
So here is one example showing how to break the above branch assert:
(assume ac1.branch == branch2)
... HTTP request for User1:
... 'componentA' executes:
println 'Tracing something about '+BankAccount.findByName('ac1')
... some other expensive computation executes
... HTTP request for User2:
def acc = BankAccount.findByName('ac1')
acc.branch = branch1
acc.save(...)
... HTTP request for User1 continues:
... 'componentB' executes:
def myBranchAccounts = BankAccount.findAllByBranch(branch1)
(myBranchAccounts includes ac1, but Hibernate returns the old, not refreshed, version of it, so ac1.branch == branch2 is still true)
... myBranchAccounts are rendered on a view page
(User1 is presented with a list of all accounts from branch1, including ac1, which is happily displayed as belonging to branch2. User1 is surprised.)
This is not necessarily a strict concurrency problem. You may have code which bypasses Hibernate (maybe it uses the groovy.sql.Sql class directly) and run into very similar issues.
It is also interesting to think about the componentA and componentB code from the point of view of the unit test coverage leak problem I described in the 'part 1' post.
Here are 2 other inalienable truths (properties) that no longer hold:
Uniqueness constraints:
My BankAccount was declared with a unique constraint on the name field (database-enforced uniqueness on the name column). So if I do this:
def accounts = BankAccount.findAll()
I will never see the same name repeated, right? Wrong. Here is a concurrent usage that shows how that breaks:
... HTTP request for User1:
... 'componentA' executes:
println 'Tracing something about ' + BankAccount.findByName('ac1')
... some other expensive computation executes
... HTTP request for User2:
def ac1 = BankAccount.findByName('ac1')
def ac2 = BankAccount.findByName('ac2')
ac1.name = 'ac1_b'
ac1.save(...)
ac2.name = 'ac1'
ac2.save(...)
... HTTP request for User1 continues:
... 'componentB' executes:
def allAccounts = BankAccount.findAll()
(allAccounts contains the old version of ac1 with ac1.name == 'ac1'
and ac2 with ac2.name == 'ac1')
... allAccounts are rendered on a view page
(User1 is presented with a list of all accounts and account 'ac1' shows up twice. User1 is upset.)
You may find it unrealistic that User2 can perform 2 renames in the short period between 2 finder calls in one HTTP request. What if there were 2 users renaming one object each? At any rate, there are other possible domain objects than BankAccount and other fields that may need a uniqueness constraint. I think the issue is demonstrated well enough.
Again, you can get into similar problems if you use groovy.sql.Sql directly.
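The uniqueness scenario above can be replayed in plain Groovy. This is only a toy sketch: FakeDb, its rename method and FakeSession are hypothetical stand-ins for the database, its unique constraint and a session that reuses already loaded objects. Nothing here is a Hibernate or GORM API.

```groovy
// Toy model only: FakeDb and FakeSession are hypothetical names,
// not Hibernate or GORM classes.
class FakeDb {
    Map rows = [:]               // id -> committed column values

    // Stand-in for a database unique constraint on the name column
    void rename(int id, String newName) {
        assert !rows.values().any { it.name == newName } : 'unique constraint violated'
        rows[id].name = newName
    }
}

class FakeSession {
    FakeDb db
    Map cache = [:]              // id -> object already handed out by this session

    // Matches against committed rows, but returns the cached object
    // (without refreshing it) for any id loaded earlier.
    List findAll(Closure where) {
        db.rows.findAll { id, row -> where(row) }.collect { id, row ->
            if (!cache.containsKey(id)) {
                cache[id] = new LinkedHashMap(row)
            }
            cache[id]
        }
    }
}

def db = new FakeDb(rows: [1: [name: 'ac1'], 2: [name: 'ac2']])
def user1 = new FakeSession(db: db)

user1.findAll { it.name == 'ac1' }      // 'componentA' loads ac1 into User1's session

// User2 renames both accounts; the database never holds a duplicate name
db.rename(1, 'ac1_b')
db.rename(2, 'ac1')
assert db.rows.values()*.name == ['ac1_b', 'ac1']

// 'componentB': stale ac1 from the session plus freshly loaded ac2
def allAccounts = user1.findAll { true }
assert allAccounts*.name == ['ac1', 'ac1']
```

The constraint is never violated in the toy database, yet the list handed to 'componentB' contains the same name twice.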
Results which look like uncommitted reads:
If I transfer money between accounts inside a transaction I should never ever see the transfer applied to one account and not the other. Right? Wrong again:
...HTTP request for User1:
... 'componentA' executes:
println 'Tracing something about ' +BankAccount.findByName('ac1')
... some other expensive computation executes
... HTTP request for User2:
def ac1 = BankAccount.findByName('ac1')
def ac2 = BankAccount.findByName('ac2')
transferMoney(ac1, ac2, 1000) //transfer 1000 dollars
...HTTP request for User1 continues:
def allAccounts = BankAccount.findAll()
(allAccounts contain old amount in ac1 but new amount in ac2)
... allAccounts are rendered on a view page
Why Does Hibernate Work Like This?
I imagine that it must have been tempting to use a single Java object to represent a single record. This is also the purist approach to ORM: node.children.first().parent.is(node). But with Hibernate this may not have been just a temptation. Hibernate designers decided at some point that Hibernate would automatically save objects attached to the session. I imagine that it would be very hard to deal with auto-saving if you had more than one domain object representing the same record (which one would you save?).
So why not refresh an existing object each time it is retrieved? Maybe because that would be a serious side-effect ;) If some objects have been changed locally and also concurrently changed in the database: were Hibernate designers concerned about throwing ConcurrentModificationException from a finder?
Well, I do not see why, because Hibernate finders already save objects and you are likely to get a collection of interesting save errors when calling a finder. (Talk about aversion to side-effects!)
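The 'one object per record' property and the 'which one would you save?' question can be illustrated with plain Groovy objects. TreeNode is a hypothetical class I use only for this sketch; it is not a GORM domain class and nothing is persisted.

```groovy
// Plain Groovy objects; TreeNode is a hypothetical class, not a GORM domain.
class TreeNode {
    String name
    TreeNode parent
    List<TreeNode> children = []
}

def root = new TreeNode(name: 'root')
def child = new TreeNode(name: 'child', parent: root)
root.children << child

// With one object per record, navigating down and back up
// lands on the very same instance.
assert root.children.first().parent.is(root)

// Now suppose the same record were represented by two separate objects
// that are edited independently.
def copyA = new TreeNode(name: 'root')
def copyB = new TreeNode(name: 'root')
copyA.name = 'renamed by A'
copyB.name = 'renamed by B'

// An automatic save has no obvious way to pick between them.
assert !copyA.is(copyB)
assert copyA.name != copyB.name
```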
Can I just be careful?
Be careful not to pollute the Hibernate session? That may not be so easy. For example, consider that the BankBranch class has something like a featuredAccount association back to BankAccount. If that gets eagerly loaded at the time branch1 is retrieved, the Hibernate session is already polluted with one BankAccount at the onset of the first example (without any artificial println statements).
More complex applications may want to run some HTTP filter logic, before and after the request, which uses Hibernate objects. Applications may have layers of complexity which share the same Hibernate session. Controlling what is and what is not in that session is unrealistic.
Why is this bad?
Developers experienced with relational databases are likely to expect certain behaviors from the code that are just gone with GORM/Hibernate. The unexpected behavior may be very intermittent and impossible to troubleshoot. In my example I 'broke' the code by inserting a println statement printing something about one account. What if the finder 'polluting' the Hibernate session was executed only under some (rare) conditions?
How to test for this:
Repeatable finder problems arising from the use of direct SQL, or from the use of several Hibernate sessions within a single HTTP request, can be discovered by both integration and functional tests.
The issue is not unit testable even if you think of Unit Tests as 'atomic tests' which are implemented as Grails integration tests.
Unfortunately, in many cases the issue will be triggered by concurrent access to your application.
Testing concurrency issues is never trivial. So I do not really know how to answer this question.
Ideally, the tools we use will be better designed to handle concurrency (have you read Simon Peyton Jones' 'Beautiful Concurrency', http://research.microsoft.com/en-us/um/people/simonpj/papers/stm/?). I am afraid Hibernate may never be one of these tools.
Saturday, August 2, 2014
I don't like Hibernate (and Grails), PART 1
My goal here is to write about Hibernate and GORM functionality that you could call 'It is not a bug, it is a feature' and which yields very negative and often nontrivial consequences.
These topics are (I believe) documented nowhere and surprise every developer I know.
I also want to comment on what are the best ways to test for these.
Testing Grails Unit vs. Integration vs. Functional:
Is there some sort of a 'pyramid' consensus that you need a lot of unit tests, many integration tests, and maybe just some (if any) functional tests?
Grails creates the appearance of a test-friendly framework with a big focus on unit tests.
Out of the box, Grails will put test/unit and test/integration folders in your project.
(You really should have test/functional there as well, but you need to work a bit harder to get it.)
For each domain class you add to your project Grails will, by default, create an empty Unit Test for it (not integration, only unit).
Grails Unit Tests do not interact with the database, and anything related to Hibernate/GORM has to be mocked. This makes Unit Tests the wrong choice for uncovering problems related to how Hibernate/GORM is used/misused in your project.
I find this a bit ironic. Here are two well-known Hibernate gotchas: automatic saving of domain objects and LazyInitializationException. Talking about these 2 feels like beating a dead horse, so I will not.
I just want to point out that your domain objects may be saved when they should not be (do you have non-transactional validation logic outside of your domain class?) or your views may throw LazyInitializationException (did you forget to hydrate your model and the transaction has rolled back?) but all your Grails unit tests will pass, and your Grails integration tests will also pass. Did you write Functional Tests?
You may be tempted to think about Unit Tests as simply tests that exercise atomic blocks of your code in isolation. If this is your thinking, you may be putting your 'Unit Tests' into the test/integration folder and still think of them as 'unit'. I admit to thinking this way.
There is an issue with this too. Side effects do not work well with unit tests and, well, Grails is very much a side-effect framework. Consider this high-level hypothetical: you have programmed 'component A' and have full logical unit coverage for it (if that is even possible), then you coded 'component B' and wrote full coverage for it as well. You pat yourself on the back for having fully covered everything.
But wait... if executing A creates side-effects which impact logic in B, then, well, your coverage has leaked on you. You may be confident that you wrote a well-tested app, but you really did not. I think this is not well understood because, if it was, I would expect much more interest in FP.
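Here is a small plain Groovy sketch of such a coverage leak. ComponentA, ComponentB and the shared cache map are hypothetical; the map merely stands in for state shared between components, the way a Hibernate session is shared within a request.

```groovy
// Hypothetical components sharing one mutable cache, standing in for
// code that shares a Hibernate session.
class ComponentA {
    // Looks like a harmless read, but leaves the value in the shared cache
    static String trace(Map cache, Map db, String key) {
        cache[key] = db[key]
        "Tracing something about ${cache[key]}"
    }
}

class ComponentB {
    // Prefers whatever is already cached over the current database value
    static Object read(Map cache, Map db, String key) {
        cache.containsKey(key) ? cache[key] : db[key]
    }
}

// 'Unit test' for A in isolation: passes
assert ComponentA.trace([:], [ac1: 'branch2'], 'ac1') == 'Tracing something about branch2'

// 'Unit test' for B in isolation: passes
assert ComponentB.read([:], [ac1: 'branch1'], 'ac1') == 'branch1'

// Both together, with a database change in between: B now sees stale data
def cache = [:]
def db = [ac1: 'branch2']
ComponentA.trace(cache, db, 'ac1')
db.ac1 = 'branch1'
def seenByB = ComponentB.read(cache, db, 'ac1')
assert seenByB == 'branch2'   // stale, although the database says 'branch1'
```

Each component passes its own isolated test, and neither test reveals that running A first changes what B returns.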
My take on testing for possible Hibernate gotchas is to reverse the pyramid: focus on Functional Tests, and write more integration than unit tests. But I will probably address this in more detail in the future.
Example:
(I may reuse this simple Domain Class in future posts):
class BankAccount {
    String name
    BankBranch branch
    Float amount
    static constraints = {
        name unique: true
    }
}
Here is my 'component B' logic: (Assume properly defined equals/hashCode methods are in place - not shown.)
assert BankAccount.findAllByBranch(myBranch).every{
it.branch == myBranch
}
Would you think the above statement is guaranteed to be true?
If you are a sane person you will answer yes, this assertion has to hold no matter what.
If you have worked with Hibernate and/or GORM long enough then you (have lost your sanity by now and) can figure out what code to add in front of the above code block to have it break.
I call it the 'repeatable finder problem' and I will write more about it next time.