Stuff about programming, programming style, maintainability, testability. Dedicated to my coworkers and friends. Everyone is welcome to leave comments or disagree with me. This blog does not represent views or opinions of my employer.

Saturday, October 25, 2014

I don't like Hibernate/Grails part 11. Final thoughts.

This series was born out of my frustration with Grails. But, instead of making it a comprehensive criticism of the framework, I have decided to focus on a few GORM and Hibernate issues. I had several reasons to do that.

Why GORM/Hibernate focus?
There is quite a few blogs which basically say: Grails is very buggy and then provide few or no details. There are also many blogs saying Grails is great and then provide equal amount of fluff to support their claim. All of this becomes very subjective.

(Section Edited for clarity, Oct 30, 2014)
It is not that hard to demonstrate that this is a very buggy environment. It has been founded on Groovy, and, in my experience, Groovy is and always was is a very buggy language.  Here is one curious example (tested with Groovy 2.3.6, other versions I checked behave the same way):
  1 as Long == 1 as Integer //true (note, false in Java)
  1 as Integer == 1 as Long //true (note, false in Java)
  [(1 as Integer)].contains((1 as Long))  //false (inconsistent with equals!)
  (1 as Long) in [(1 as Integer)] //false

That is some scary stuff, right?  I think it is scary. This one is not very fixable,  but most Groovy bugs will eventually get fixed.  Hibernate bugs will be with us forever. This series was about bugs other people call functionality and sophisticated design.

Designing a good web framework is not easy.  I think the key is: the framework must be intuitive. It should do what developers expect to happen. That is one more reason for my focusing on GORM/Hibernate. It is hard to claim that these 2 are intuitive, but a similar claim becomes much more subjective in other parts of Grails.
Here is one example of a non-intuitive behavior: How does controller forwarding work. The intuitive behavior would be that forwarded method executes in a separate thread. It does not, it executes as part of the calling method. Instead of wasting time on disagreeing that this is a bad design, do this experiment: Create a controller with methods ‘a’ and ‘b’ and have 'a' forward to 'b'.  Add a filter which simply prints action name in ‘before’ and ‘afterView’.  Here is what you will get with Grails 2.4.3:
    before a,  before  b,  afterView b,  afterView b 
(both afterView print the same action name!).  Would it not be nicer if we got:
    before a,  afterView a,  before b,   afterView b      
Confusing design + mutating state == bugs. But how can I argue that this is not just an innocent overlook on the part of Grails/Spring framework?

One of the most irritating aspects of Grails is that everything is so intermittent, and that is by design. The fail-fast philosophy is totally foreign to this framework. For example, if calling object.save() no longer always saves the object (yes you are reading it right, see GRAILS-11797GRAILS-11536) then would you not want object.save() to fail if the object is not going to be ever saved?  Again, focusing on GROM/Hibernate simplified my job of demonstrating examples with a very intermittent behavior.
The uncanny ability to exacute bad code in Grails goes way beyond what I was able to demonstrate. This is the very scary: How did that ever work before? thing. I suspect something is wrong with Grails/Groovy compilation and a bad code sometimes magically works until some totally not relevant change exposes the problem. I cannot justify this claim. The only think I have is anecdotal evidence.

I have very strong opinions about what are the main causes of OOP bugs. That does not mean you would have agreed with me. Without good examples, this blog would have been labeled as a one more guy that thinks that 'FP is the new silver bullet'. Focusing on GORM allowed me to pinpoint the problems in a way that is hard to dispute. (And yet, I still got the label.)

Is it all about the cache?
All of the GORM problems listed in this series can be attributed in some way to Hibernate session/1st level cache.  You can argue that having cache is beneficial and that some problems are unavoidable with any cache implementation.
Ideally, caching should behave as if it was not there: application using a cache should work exactly the same way if the cache was removed. However, synchronization of ORM cache and DB state is a very hard problem and achieving ‘opaque’ implementation may be hard or even impossible.

Dear GORM/Hibernate:  If you can’t implement cache which behaves ‘like it is not there’ then don’t design your API like the cache ‘is not there’.  Make the cache very explicit and optional (Identity Map?). Invalidate cache immediately, or at least, provide a way for the application to learn as soon as you know that cached data is stale. Design invalidating (you like to call it session.clear()) your cache in a way that does not make half of objects used by the application useless.  Remember, you are just a cache, the data is still there!  The data is what is important, not you.  If you want to call yourself a cache, stop being so bossy!  :)

More criticism of Hibernate
I must point out that I am not the only Hibernate hater. Here are some examples:
http://www.slideshare.net/alimenkou/why-do-i-hate-hibernate
http://mentablog.soliveirajr.com/2012/11/hibernate-is-more-complex-than-the-problem-it-tries-to-solve/
http://brian.pontarelli.com/2007/04/03/hibernate-pitfalls-part-2/

In my blog, I have just scratched the surface.  Some examples of 'bug generating' design that could use more discussion include: defaulted and recommend 'flush:false' in save() (maybe a moot point now), or what happens when Hibernate session closes suddenly and unexpectedly (like, when transactional code fails).
I need to stop somewhere and this post seems a good place and time to stop.

Sneaking-in FP and other things that interest me
I decided that complaining about Grails not working right is much less powerful than pointing out why it does not work right.  I think analyzing and criticizing bad programs is a great way to advance programming skills. With each installment I tried to sneak-in some concept that explained why bad is bad: properties, shared state/side-effects, fail-fast, unit testability, even a bit of combinators. These are all common sense things that explain design flaws.

One thing I could not fit into my posts was types.  I decided that this will be too foreign concept in the context of Java and Groovy. Types are very powerful and I regret not finding a good place for them in this series.

Is FP the ‘new’ silver bullet?
I was asked this question and it makes sense that I try to answer it.
FP is not new, FP predates OOP,  Haskell is older than Java, combinatory logic is older than Turing machines, lambda calculus is about the same age.  The silver bullet is and probably always was the ability to logically reason on the code.

In this series, I tried to emphasize the importance of logic.  Programs should be logically simple and ‘mappable’ to logic. Programming and Logic are very related on a theoretical level (google: Curry-Howard).  This 3 are called the trinity of CS:  Type Theory, Category Theory and Proof Theory (google: Curry-Howard-Lambek).

Programs I write using Groovy and Grails may look straightforward and shorter than Java and Spring but from the point of view of logic these are still just spaghetti threads of programming instructions. Add to it a total disregard for side-effects and this exercise becomes equivalent to building a house of cards on a foundation that is shaking.

Quiz Question:  Recalling Logic 101, here is a logical ‘formula’:
    (a^b)=>c   ⇔   a=>(b=>c)
Do you know/can you figure out what that corresponds to in programming?  Answer at the end of this post.

I used to love Grails
I started this blog site with a series about 'Imperative curlies'.  My idea at that time was that I can count the number of curly braces ({}) in my code and use that number as a measure of how good my code is. The fewer 'curlies', the better the code.  The idea was to break away from coding and thinking using imperative sequences of instructions (for loops, if statements all use 'curlies').  I remember it worked very well for me. If you look at these old posts, you will see that there was a time when I really liked Grails.

Is all OOP bad?
I think that is a complex question.  Good OOP is about things like decoupling, separation of concerns, eliminating shared state, meaningful polymorphism, etc. These things may achieve some of the same goals FP is fighting for. The concept of a shared session state (Hibernate session) is not very OO.  Hibernate StatelessSession interface which does not extend Session is not a great example of OO polymorphism. Ability to decouple is mostly gone due to Hibernate non-localized side-effects. Hibernate is simply not a good OOP.

What any OOP will always lack is this: a clear and simple correspondence to logic. This is what makes FP unique.

Parting thought:
Answer to the Quiz Question:  It is currying. To see it, compare these 2 lines:
   (a^b)=>c              ⇔    a=>(b=>c)
   (a,b)->c              ⇔    a->(b->c)
   (2 argument function)       (function returning a function) 
First line is the logical formula. I have changed arrow-like symbols '=>' to look slightly different '->'. I have replaced '^' with ',' and ended up in FP!  This process is a mini-Category Theory in action. Cool, is it not?

There is no helping it, Groovy and Grails are OOP not FP. It is still important to be able to think outside of that box. Otherwise we will start convincing ourselves of things like ‘static definitions are always bad’, ‘unit testing is not about finding bugs’, ‘using refresh() resolves stale object problems in Hibernate session’,  or some other nonsense.

Thinking in C++, Thinking in Java: it is worth trying to stop it, even if you program in these languages.

Grails is and will be a very popular and buggy framework. We can only blame ourselves for that. Writing this series was a big effort for me. My biggest hope is that it made some of my readers stop for a moment with a 'hmm'.

The End (for now).

(I ended up republishing this blog due to some weird formatting issues -  if I created a chaos in your RRS/Atom feed - sorry!)

Saturday, October 18, 2014

I don't like Hibernate/Grails part 10: Repeatable finder, lessons learned

Repeatable finder (concurrency issue described in part 2) is what started/motivated this series. I had hoped that that this issue will draw some reaction from the community. It did not. Why? Tallying up all the answers/responses from the last 2 moths amounts to: 6, none of them useful or even correct. What does that mean?

In this series I tried to sneak in some things that interest me like FP and logical reasoning of code correctness.  I will use repeatable finder problem as a way to sneak in a bit more of this stuff later in this post.

Denial isn't just a river in Egypt?
This has been a twilight-zone.
To refresh you memory:  if more than one query is executed in a single Hibernate session and the result sets intersect then the second query returns a weird combination of old and new data.  That can break the logic in your code,  for example:
       Users.findAllByNickName('bob')

can return records with nickName != 'bob'. Other things can go wrong too: Maybe you have used a DB unique key to define equals()?  Or maybe you have used a DB unique key as a key in a Map? Any of this could go very wrong.

At first, I thought that the issue must be well know and I am missing some way of handling it. This, unfortunately, is not the case. Very recently, I came across this blog from 2009: orm-sucks-hibernate-sucks-even-more
"... take a look how even a silly CRUD application would suffer, once you've got "not-very-recent" object from the session"
that quote points to a (now non-existing) page on the hibernate website. Did we know more in 2009 that we know now?  If we did know, why have we allowed for this issue to stay unresolved? Well, this is all speculation.

I tried my best to do 2 things:  make the community aware and persuade Hibernate to fix it. I have failed miserably on both accounts. Here are the results of my efforts (as of Oct 17, 2014, tallied after a bit over 2 months since I started my crusade):
  • post part 2:  effectively no replies, but over 1100 reads.
  • Grails JIRA: incorrect comments and then ignored
  • Hibernate JIRA:  rejected (works as intended) with suggested work-around which is incorrect
  • Stack Overflow question:  a whooping +4 score (started at -1) and bunch of incorrect or meaningless answers
  • Grails forum: 0 replies
Hibernate ticket was the weirdest experience.  It got rejected very fast (not a bug) with a comment to just use refresh().  After pointing out that this workaround is a total nonsense, I was sent to read some completely not relevant documentation about concurrency.  After that, my (and Tim's) comments have been ignored.

What can I conclude from these 3 facts?:
  • nobody seems to know how to resolve or even work-around this issue
  • experts provide advice that is incorrect
  • there is no interest in solving, discussing or even acknowledging it as a problem
I do not know, but probably nothing good. I think it is interesting to try to puzzle out the few responses that the problem did generate. I will try to do that here.

The replies I got from the expects fall into 2 categories. The first category are answers like this:
  • It is any ORM issue
  • Any database application will have an issue like this 
It is true that the issue can be resolved with DB locking.  In particular, I could prevent repeatable finder by having all HTTP requests wrapped in long transactions and configuring higher (repeatable read) isolation level. Indeed, it is a big framework design failure, if we need to resort to things like this.
It is NOT true that any ORM and any database application will have this problem.  The most likely explanation for this type of response is that developers do not think about side-effects. They see Hibernate query and think of a SELECT statement only.  If I see a problem, it must be from the SELECT, where else would it come from?  This is consistent with the point I tried to make in my previous posts.

The second category are answers that suggest using refresh() or discard() to fix the problem:
'To fix your problem'
  • add refresh() to your code
  • or:  add discard()/evict() to your code
My first reaction was: Grrr, my second:  Hmm.  If I could only continue this conversation I am sure it would go like this:
Me: Where do I add these?  Expert's Reply: Add them where you have that problem. 

If you have been following various Hibernate discussion forums, you must have noticed that the same type of advice (either to add refresh() or to add evict()) shows up very frequently. This advice is never right.

Grails and Hibernate experts:  I am very disappointed in you.

Add refresh()...  add evict()... Thinking in Hibernate.
(Here is where I sneak-in some interesting stuff.)
This is how we typically reason about our code: the problem is on line 57 because variable xyz is ... and then on line 89 we do that..., and then on line 127 we have an if statement that goes like that...
We reason about our code by examining chains of programming instructions.

This is called imperative thinking and imperative programming.  If you read my previous posts you may assume that I consider such programs not logical,  they are logical, only the logic is very cumbersome and complex.
A well designed OO program is where lines 57, 89 and 127 are all in the same class and the chains of instructions we need to examine are relatively short.  In a procedural program lines 57, 89, and 127 can be anywhere and chains are long.  Badly designed OO programs behave like procedural programs.

Repeatable finder is a great example where imperative reasoning fails.  The problem is not something between lines 57 and 89 or something on line 115.  The problem is (or can be) anywhere.
Answer 'add refresh()' or 'add discard()' is a very imperative thinking: it assumes I can add it on line 89.  (I can only conclude that this expert advice is not to sprinkle refresh() all over my code just for fun ... and because my code will run too fast without it.)

So what is the alternative?  The idea is to think about a block of code in a way that can prove certain behavior of that block. If we know that a code block 'A' exhibits the same behavior no matter where it is placed or how is used, then we no longer need to think about lines 57, 89, and 127 or about chains of computing instructions.

This is called declarative thinking and programming.  Logical reasoning is now simplified, I no longer need to follow chains of programming steps to reason about the correctness.

Declarative thinking works great, except, if I cannot trust any property, even that:
    Users.findAllByNickName('bob').every{it.nickName == 'bob'}
then I am stuck.
That may sound like a limitation of declarative programming:  I can still keep going using imperative approach. That is true, and we all 'keep going'.  I did not stop programming my project and no, I did not add refresh() all over my code. That is why our applications are so buggy: we ignore logical problems unless we can pin them to line 127.

Side Note: The fun starts when I start combining my declarative code blocks into bigger blocks. Code needs to be logically composable.  I want a bigger block (composed of smaller blocks) to have properties too. Some like to call it programming with combinators.

Conclusions:
I would like to suggest this as a new rule of thumb: 
  the answer to use Hibernate/GORM refresh() or evict()/discard() is wrong regardless of the question.
(with exception of Functional Tests - which may need to refresh some records used in asserts).  Please comment below if you find a counter example to this rule.

I am not claiming that I know the solution to Repeatable Finder.  Maintaining Hibernate cache synchronized with the DB is hard, maybe impossible.  One way of dealing with hard problems is: make them somebody else's.  If GORM/Hibernate just told me when the query is lying (returns stale data even if it has new data) or allowed me to request/configure the query to refresh all records... That would go a long way.

It looks like the community has decided to not acknowledge Repeatable Finder as a problem. There is really no good solution for it and acknowledging it would be admitting to that fact. This issue is likely to remain unsolved and ignored. More complex Grails apps are doomed to work incorrectly under heavier concurrent use.

I have added a label to my posts (which does not work so do not click on it - and that convinces me that blogger must be using Hibernate):  'Stop thinking in C++ and Java'.  I think we need to stop thinking imperative or at least stop thinking only in imperative terms.

Next post:  I need to do one more to wrap-up. I will be finishing this series next week.

I have a busy period ahead of me.  I started going over a set of courses published online by University of Oregon (Oregon Programming Languages Summer School) and that will be many hours of not very easy listening and learning.  I also have to start preparing/training for a ski camp in early November (I live in CO and skiing has started here already).  No, this will not be a SKI calculus camp;) - but then, believe it or not, technical skiing is (or should be) a fun intellectual activity too.

Friday, October 10, 2014

I don't like Hibernate/Grails part 9: Testable code

"I get paid for code that works, not for tests, so my philosophy is to test as little as possible to reach a given level of confidence". Kent Beck.
Finding more problems with less effort is something I totally agree with.  Which approach to testing will give me the most bang for the buck?  I like to think about tests in a pragmatic way: I write tests to find bugs and guard against bugs. I consider manual testing rather ineffective and, in most cases, inferior when compared to automated test.

Testing seems to be deeply related to my last two posts.  It is obviously related to software correctness. Less obviously: out approach to testing impacts how we perceive the framework, how we test can explain why we like or don't like Hibernate and Grails.

I have moved away from unit test in Grails.  My approach to testing is an 'inverted pyramid'. I test internal implementation details using Grails integration tests and I write a lot of functional tests.  My test are less 'unit' than you may like them to be (they interact with actual database, mocking is replaced with data setups) but are closer to the reality. This post explains my reasoning behind this decision.

I am considering the unit/integration/functional division from the 'testing philosophy' point of view.  I care less if tests are placed in test/unit, test/integration or test/functional folder.  I will stay away from the hate-love TDD debate.

Testing choices: (Just to make sure we are on the same page.)
Testing spectrum has these 2 extremes with very different characteristics:
(1) Unit Tests:  testing expected behavior; testing in a fake environment; testing internals not visible to the end users 
Also: white box testing; bottom-up testing
I will test parts (units) of my software in isolation from anything else.  I will 'mock' interaction with the rest of the system.  Since I know exactly what can go wrong, I can write mocks that exercise the tested part under a specific situation.
(The mocking aspect is specific to OOP, nobody writes mocks in FP... with that said, there is this great book:  http://en.wikipedia.org/wiki/To_Mock_a_Mockingbird ;))

(2) Functional Tests: testing for unexpected problems; testing in a real environment; testing functionality visible to the end users 
Also: black (more gray than black - I need testing 'hooks') box testing; top-down testing
I will test my application as a whole.  I identify a list of specifications and I write tests to verify that my application works correctly with respect to these specifications. I do not need to know everything that can go wrong, I assume that since I covered a comprehensive range of data and scenarios representing actual software usage, most of what can go wrong will be uncovered.

Integration tests fall between 1 and 2.

In this post I argue that testing in Grails needs to focus on unexpected problems, and should be done in a real (or close to real) environment. That moves the testing away from unit and towards integration and functional.

Why Unit Tests are Great:
  • execution speed: no need to startup the whole infrastructure, etc
  • good coupling: the tested unit becomes married to the unit test, this can be a happy, strong marriage that is going to survive ups and downs of refactoring.
  • atomic/unit nature: the idea is to make a perfect whole from perfect units. (Except, that principle is logically flawed in OOP.) This is sometimes called bottom-up testing. 
  • unit tests can aid software design and coding.
  • aggressive conditions: once you know what kind of problems to expect you can stress the code 'more aggressively' introducing scenarios which are rare or complex to setup using integration of functional tests.  This is rarely done.
This post is not a criticism of Unit Tests.  It is a criticism of how they are used.
Where is the most bang?
Some people disagree with me when I say:  What really needs testing is side-effects. Errors caused by side-effects are the hardest to troubleshoot and fix and are often very intermittent.  Side-effects limit developer ability to do logical reasoning on the code.  Side-effects can be very confusing.
There are many books and articles about testing and side-effects have not been discussed much.  Why is that?

Case study:  This example is repeated from my last post: 
    def users = User.findAllByOffice(office1) //code (A)

Assume user has userName (with unique constraint),  office (of type Office) and userPreferences (of type UserPreferences). This code:
  1. will issue a SELECT statement (with whatever locks) 
  2. can save some changed objects to the database (part 5)
  3. can save some unchanged objects too  (part 6)
  4. will impact some record types returned by queries that follow it (from proxies to actual objects).  Some records returned from this query will use Hibernate Proxy to implement preferences and some will use the actual UserPreferences class. (Some developers may be surprised that the same goes for the office association). (part 4)
  5. will impact the data content of some records returned by queries that follow it.  Similarly, the content of some records returned from this query may be different from the data returned by the underlying SELECT statement. (part 2)
Can simply adding line (A) break my code?  Clearly it can!

Side Note 1: Why are side-effects confusing:  There is this theory that human memory and reasoning work by 'chunking'.   The idea is that the human brain stores knowledge in chunks.  Each chunk gets a label which works something like a DB index. Human brain can recall the whole chunk using that label. There is this very prominent chunk we have all formed:  SQL SELECT query. When you look at code (A) you inadvertently call for that chunk.  Yet 4 out of 5 side-effects associated with that code have nothing to do with 'SELECT' chunk your brain has just found.

Side Note 2: Why are side-effects not logical:  Maybe a better term would be: not logic-friendly.  I have dedicated large parts of this post to explaining why, but I will add a quick high-level explanation from a slightly different angle.  You can safely skip this side-note if you are allergic to academic CS.

Logic needs connectives, to start with, it needs conjunction ('and', '^'). Logical conjunction follows this reduction rule:
  if A is true 'and' B is true then B is true.
You can think of this as one of the axioms. Logic cannot start without it. The programming equivalent of this is the 'beta reduction' rule which looks more or less like this:
  second (A, B) '=='  B
and means: if I do computations A and B (compute a pair) and then ignore A, then this should be equivalent to just computing B.  This is obviously not true if A has side-effects impacting computation B! Side-effects inflict a mortal wound to the (straightforward) correspondence between logic and programming.

The above paragraph may as well have been copied from first pages of an introductory Type Theory book. Keeping side-effects on a tight leash allows to recover correspondence to logic, but, this is no longer first pages (google: 'Hoare Type Theory'). A comment by Mark, on my software correctness post, used the term 'effect' (contrast it with the more unruly 'side-effect', google: 'lax logic', 'monads in lax logic' or 'Effect System').
With unruly 'anything goes' side-effects logical reasoning on a program becomes very, very complex and much less potent, Type Theory-Logic correspondence is lost. I am trying to learn this stuff but even at my current newbe level: it is eye opening.

If you decided to skip the last side-note this is a good point to return:
Welcome back!  You can ignore all of this academic mumbo-jumbo. Just be aware that whatever voodoo your brain does to form judgments about your GORM code (or any other code with unruly side-effects) should be a suspect. This voodoo is far from any straightforward logic and it is easy to get things wrong. Hopefully, I managed to scientifically convince you that: 
  • Side-effects need extra attention when testing
  • This will not be: 'testing expected behavior'
Here is a more pragmatic argument:  As I explained in my last post: side-effects 2-5 are a no-show for very simple CRUD applications and are not that frequent for simple CRUD apps.  The question is: are you satisfied with your app if it works 95% of the time?
The point is: complex Grails apps or apps that strive for more than 95% correctness need to make serious attempt to test side-effects 2-5.

Unit Test Ostrichism
Keep your nose out of trouble, and no trouble'll come to you. - The Lord of the Rings movie
I have been scratching my head asking: why my current project is finding so many issues that nobody else has reported.  One possible answer is:  everyone else has been unit testing and 'assuming away' the reality.  If that is true, then maybe everyone else has also decided that Hibernate is OK.

Here is a question for you:  When writing unit tests for code that includes GORM queries (assume something similar to code (A)), do you use mocks to verify that:
  • the query does not save any objects? 
  • your code works correctly even if the query returns User object with office2 != office1?
  • your code works correctly even if the query returns objects that violate DB enforced constraints (such as more than one user with the same userName)?
  • your code works correctly not only on actual domain objects but also on hibernate proxies? 
I do not. I decided that the amount of work needed to write tests like this would be prohibitive.  But if you disagree please drop me a comment!

Here is another question:  The above bullet list spells out some impacts of side-effects 2-5. This list is not complete.  Can you think of other examples?
I know that I do not know all impacts.  I am not even sure if I understand all GORM side-effects. Testing for expected problems almost guarantees that I will not find what I do not know.

Code with non-localized impacts is not testable.
OO code needs decoupling.  In a well written OO code, state mutation in object A does not impact other objects. OO software designed without decoupling is not testable.  There is a technical term for it: spaghetti.

Almost every post in my 'I don't like' series has shown an example of GORM code where the behavior changes (even breaks) when an isolated GORM query is added to or removed from the code. Hibernate queries create impacts that cannot be easily localized to one or few classes.

How can GORM and Hibernate defend their design as testable?

Fail Fast and Test Easy:
This is something very much missing in Grails.  Very often an incorrect code is likely to work just fine 50% maybe even 80% of the time. Data changes or a query is added or removed somewhere and things break.
Most of my posts in this series pointed out examples like this (see part 3 and part 6).  Ideally, incorrect code should either fail to compile or should fail during my first attempt to execute it.  This is often not that easy to accomplish with a languages like Java or Groovy, but there is just no excuse for, for example, allowing me to use GORM object obtained using hibernate session S1 within session S2.  If such code fails intermittently it should ALWAYS fail.

Without fail-fast philosophy unit tests are not worth much.

Unit Testing and FP:
"Writing unit tests is reinventing functional programming in non-functional languages" (Christian Sunesson - on github).  (I found the linked post a very interesting reading).
Here is a mental exercise:  when reading the following blog about 'testable code': http://googletesting.blogspot.in/2008/08/by-miko-hevery-so-you-decided-to.html , think how each of the guidelines relates to FP. Notice they are all N/A!
'Testable code' term is OOP self admitting to its limitations (my OO code is not testable unless I follow these list of rules...). FP code does not need any tweaking or special guidelines to be testable.

In many respects Hibernate design is the opposite of FP. If one is very unit testable the other one probably is the opposite of testable.

Not just Hibernate:
Grails/GORM/Hibernate stack is complex and has some complex bugs. So does Groovy. I know that lots of my app functionality will break next time I upgrade Grails. Is testing for 'expected' problems a the best investment in this environment?

Conclusions:
Grails contradicts itself on unit testing:  Grails framework provides rich tools to unit test domain objects, only these tests will be rather useless. I think, this confusion is not something that Grails has introduced. Java community in general does not perceive side-effects as something to worry much about. Unfortunately, perception != reality.

More on the confusion:  it should be clear by know that the blame for a lot of this resides with how Hibernate session works. Non-localized side-effects are really hard to test.  With that stated, I would expect each JUnit integration tests in Grails to run in its own session.  The very idea of several tests sharing the same Hibernate session is repulsive.  Take a look that these JIRA tickets: GRAILS-11644GRAILS-11706.

I find Grails attitude towards testing insanely confusing.

Testable Code
Testable code should be defined as: a code that makes unit tests effective in bug prevention.
This statement is not something everyone agrees with. Unit tests are often viewed as a way to aid the design and coding process (TDD) and this becomes their main purpose in life.  
"Unit testing is not about finding bugs", see Writing Great Unit Tests. Don't you agree that something is very wrong with this sentence?  Instead of saying "unit testing is not about finding bugs", maybe we should rethink how we write the tested code so it is?  How about: start paying attention to the side-effects?

If you are doing something over and over and it does not work you have 3 choices:
  • keep on doing it and expect it will work (called insanity, also know as Grails approach to unit testing)
  • keep doing it and null out your expectations ("unit testing is not about finding bugs")
  • change the way you do it (redesign the tested code or/and the way you test).
I am suggesting the 3rd bullet is the way to go.  I cannot redesign Grails but I can rethink how I test.

Functional Test Testable
I like Geb. Writing good functional tests is not trivial and, like with unit tests, it impacts the design of the tested code. The term 'testable code' needs to be extended to functional tests.  ... it maybe an idea for a long post somewhere in the future.

Next Post:
I want to start wrapping up my 'I don't like' series.  In my next post, I plan to give an update about the repeatable finder issue (post 2) and include a few final thoughts.