Software has two ingredients: opinions and logic (=programming). The second ingredient is rare and is typically replaced by the first.
I blog about code correctness, maintainability, testability, and functional programming.
This blog does not represent views or opinions of my employer.
Showing posts with label Hibernate. Show all posts

Saturday, October 25, 2014

I don't like Hibernate/Grails part 11. Final thoughts.

This series was born out of my frustration with Grails. But, instead of making it a comprehensive criticism of the framework, I have decided to focus on a few GORM and Hibernate issues. I had several reasons to do that.

Why GORM/Hibernate focus?
There are quite a few blogs that basically say 'Grails is very buggy' and then provide few or no details. There are also many blogs saying Grails is great that provide an equal amount of fluff to support their claim. All of this becomes very subjective.

(Section Edited for clarity, Oct 30, 2014)
It is not that hard to demonstrate that this is a very buggy environment. It has been founded on Groovy, and, in my experience, Groovy is and always was a very buggy language.  Here is one curious example (tested with Groovy 2.3.6; other versions I checked behave the same way):
  1 as Long == 1 as Integer //true (note, false in Java)
  1 as Integer == 1 as Long //true (note, false in Java)
  [(1 as Integer)].contains((1 as Long))  //false (inconsistent with equals!)
  (1 as Long) in [(1 as Integer)] //false

That is some scary stuff, right?  I think it is scary. This one is not very fixable,  but most Groovy bugs will eventually get fixed.  Hibernate bugs will be with us forever. This series was about bugs other people call functionality and sophisticated design.
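
For the Java side of the comparison above, here is a minimal sketch. I am assuming that 'false in Java' refers to equals() on boxed values (as far as I know, == between a Long reference and an Integer reference does not even compile). The class and method names are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;

final class BoxedEquality {
    // Long.equals returns true only for another Long with the same value.
    static boolean longEqualsInteger() {
        return Long.valueOf(1L).equals(Integer.valueOf(1));
    }

    // Integer.equals returns true only for another Integer with the same value.
    static boolean integerEqualsLong() {
        return Integer.valueOf(1).equals(Long.valueOf(1L));
    }

    // List.contains is specified in terms of equals, so it agrees with the two checks above.
    static boolean integerListContainsLong() {
        List<Integer> ints = Arrays.asList(1);
        return ints.contains(Long.valueOf(1L));
    }

    public static void main(String[] args) {
        System.out.println(longEqualsInteger());
        System.out.println(integerEqualsLong());
        System.out.println(integerListContainsLong());
    }
}
```

All three checks give the same answer in Java, so equals() and contains() stay consistent with each other. That consistency is what the Groovy example loses.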

Designing a good web framework is not easy.  I think the key is: the framework must be intuitive. It should do what developers expect to happen. That is one more reason for my focusing on GORM/Hibernate. It is hard to claim that these 2 are intuitive, but a similar claim becomes much more subjective in other parts of Grails.
Here is one example of non-intuitive behavior: how controller forwarding works. The intuitive behavior would be that the forwarded method executes in a separate thread. It does not; it executes as part of the calling method. Instead of wasting time arguing about whether this is bad design, do this experiment: create a controller with methods ‘a’ and ‘b’ and have 'a' forward to 'b'.  Add a filter which simply prints the action name in ‘before’ and ‘afterView’.  Here is what you will get with Grails 2.4.3:
    before a,  before b,  afterView b,  afterView b
(both afterView calls print the same action name!).  Would it not be nicer if we got:
    before a,  afterView a,  before b,  afterView b
Confusing design + mutating state == bugs. But how can I argue that this is not just an innocent oversight on the part of the Grails/Spring framework?

One of the most irritating aspects of Grails is that everything is so intermittent, and that is by design. The fail-fast philosophy is totally foreign to this framework. For example, if calling object.save() no longer always saves the object (yes, you are reading it right, see GRAILS-11797, GRAILS-11536) then would you not want object.save() to fail if the object is never going to be saved?  Again, focusing on GORM/Hibernate simplified my job of demonstrating examples with very intermittent behavior.
The uncanny ability to execute bad code in Grails goes way beyond what I was able to demonstrate. This is the very scary 'How did that ever work before?' thing. I suspect something is wrong with Grails/Groovy compilation and bad code sometimes magically works until some totally unrelated change exposes the problem. I cannot justify this claim. The only thing I have is anecdotal evidence.

I have very strong opinions about what the main causes of OOP bugs are. That does not mean you would have agreed with me. Without good examples, this blog would have been labeled as one more guy who thinks that 'FP is the new silver bullet'. Focusing on GORM allowed me to pinpoint the problems in a way that is hard to dispute. (And yet, I still got the label.)

Is it all about the cache?
All of the GORM problems listed in this series can be attributed in some way to Hibernate session/1st level cache.  You can argue that having cache is beneficial and that some problems are unavoidable with any cache implementation.
Ideally, caching should behave as if it was not there: application using a cache should work exactly the same way if the cache was removed. However, synchronization of ORM cache and DB state is a very hard problem and achieving ‘opaque’ implementation may be hard or even impossible.

Dear GORM/Hibernate:  If you can’t implement cache which behaves ‘like it is not there’ then don’t design your API like the cache ‘is not there’.  Make the cache very explicit and optional (Identity Map?). Invalidate cache immediately, or at least, provide a way for the application to learn as soon as you know that cached data is stale. Design invalidating (you like to call it session.clear()) your cache in a way that does not make half of objects used by the application useless.  Remember, you are just a cache, the data is still there!  The data is what is important, not you.  If you want to call yourself a cache, stop being so bossy!  :)

More criticism of Hibernate
I must point out that I am not the only Hibernate hater. Here are some examples:
http://www.slideshare.net/alimenkou/why-do-i-hate-hibernate
http://mentablog.soliveirajr.com/2012/11/hibernate-is-more-complex-than-the-problem-it-tries-to-solve/
http://brian.pontarelli.com/2007/04/03/hibernate-pitfalls-part-2/

In my blog, I have just scratched the surface.  Some examples of 'bug generating' design that could use more discussion include: the default and recommended 'flush:false' in save() (maybe a moot point now), or what happens when the Hibernate session closes suddenly and unexpectedly (like when transactional code fails).
I need to stop somewhere and this post seems a good place and time to stop.

Sneaking-in FP and other things that interest me
I decided that complaining about Grails not working right is much less powerful than pointing out why it does not work right.  I think analyzing and criticizing bad programs is a great way to advance programming skills. With each installment I tried to sneak-in some concept that explained why bad is bad: properties, shared state/side-effects, fail-fast, unit testability, even a bit of combinators. These are all common sense things that explain design flaws.

One thing I could not fit into my posts was types.  I decided that this would be too foreign a concept in the context of Java and Groovy. Types are very powerful and I regret not finding a good place for them in this series.

Is FP the ‘new’ silver bullet?
I was asked this question and it makes sense that I try to answer it.
FP is not new: FP predates OOP, Haskell is older than Java, combinatory logic is older than Turing machines, and lambda calculus is about the same age.  The silver bullet is, and probably always was, the ability to reason logically about the code.

In this series, I tried to emphasize the importance of logic.  Programs should be logically simple and ‘mappable’ to logic. Programming and Logic are closely related on a theoretical level (google: Curry-Howard).  These 3 are called the trinity of CS:  Type Theory, Category Theory and Proof Theory (google: Curry-Howard-Lambek).

Programs I write using Groovy and Grails may look straightforward and shorter than Java and Spring but from the point of view of logic these are still just spaghetti threads of programming instructions. Add to it a total disregard for side-effects and this exercise becomes equivalent to building a house of cards on a foundation that is shaking.

Quiz Question:  Recalling Logic 101, here is a logical ‘formula’:
    (a^b)=>c   ⇔   a=>(b=>c)
Do you know/can you figure out what that corresponds to in programming?  Answer at the end of this post.

I used to love Grails
I started this blog site with a series about 'Imperative curlies'.  My idea at that time was that I can count the number of curly braces ({}) in my code and use that number as a measure of how good my code is. The fewer 'curlies', the better the code.  The idea was to break away from coding and thinking using imperative sequences of instructions (for loops, if statements all use 'curlies').  I remember it worked very well for me. If you look at these old posts, you will see that there was a time when I really liked Grails.

Is all OOP bad?
I think that is a complex question.  Good OOP is about things like decoupling, separation of concerns, eliminating shared state, meaningful polymorphism, etc. These things may achieve some of the same goals FP is fighting for. The concept of a shared session state (Hibernate session) is not very OO.  The Hibernate StatelessSession interface, which does not extend Session, is not a great example of OO polymorphism. The ability to decouple is mostly gone due to Hibernate's non-localized side-effects. Hibernate is simply not good OOP.

What any OOP will always lack is this: a clear and simple correspondence to logic. This is what makes FP unique.

Parting thought:
Answer to the Quiz Question:  It is currying. To see it, compare these 2 lines:
   (a^b)=>c              ⇔    a=>(b=>c)
   (a,b)->c              ⇔    a->(b->c)
   (2 argument function)       (function returning a function) 
First line is the logical formula. I have changed arrow-like symbols '=>' to look slightly different '->'. I have replaced '^' with ',' and ended up in FP!  This process is a mini-Category Theory in action. Cool, is it not?
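
To make the quiz answer concrete, here is a small sketch of currying in Java, using only the standard java.util.function interfaces. The names curry, uncurry and the add example are mine, chosen for illustration.

```java
import java.util.function.BiFunction;
import java.util.function.Function;

final class Currying {
    // (a,b)->c  becomes  a->(b->c)
    static <A, B, C> Function<A, Function<B, C>> curry(BiFunction<A, B, C> f) {
        return a -> b -> f.apply(a, b);
    }

    // a->(b->c)  becomes  (a,b)->c  (the other direction of the equivalence)
    static <A, B, C> BiFunction<A, B, C> uncurry(Function<A, Function<B, C>> f) {
        return (a, b) -> f.apply(a).apply(b);
    }

    // Applies a curried addition one argument at a time.
    static int addCurried(int a, int b) {
        BiFunction<Integer, Integer, Integer> add = (x, y) -> x + y;
        return curry(add).apply(a).apply(b);
    }

    // Curries and then uncurries; the result behaves like the original function.
    static int addRoundTrip(int a, int b) {
        BiFunction<Integer, Integer, Integer> add = (x, y) -> x + y;
        return uncurry(curry(add)).apply(a, b);
    }

    public static void main(String[] args) {
        System.out.println(addCurried(2, 3));
        System.out.println(addRoundTrip(2, 3));
    }
}
```

The two directions, curry and uncurry, are the programming counterpart of reading the '⇔' in the formula left to right and right to left.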

There is no helping it, Groovy and Grails are OOP not FP. It is still important to be able to think outside of that box. Otherwise we will start convincing ourselves of things like ‘static definitions are always bad’, ‘unit testing is not about finding bugs’, ‘using refresh() resolves stale object problems in Hibernate session’,  or some other nonsense.

Thinking in C++, Thinking in Java: it is worth trying to stop it, even if you program in these languages.

Grails is and will be a very popular and buggy framework. We can only blame ourselves for that. Writing this series was a big effort for me. My biggest hope is that it made some of my readers stop for a moment with a 'hmm'.

The End (for now).

(I ended up republishing this blog due to some weird formatting issues. If I created chaos in your RSS/Atom feed, sorry!)

Saturday, October 18, 2014

I don't like Hibernate/Grails part 10: Repeatable finder, lessons learned

Repeatable finder (the concurrency issue described in part 2) is what started/motivated this series. I had hoped that this issue would draw some reaction from the community. It did not. Why? Tallying up all the answers/responses from the last 2 months amounts to: 6, none of them useful or even correct. What does that mean?

In this series I tried to sneak in some things that interest me, like FP and logical reasoning about code correctness.  I will use the repeatable finder problem as a way to sneak in a bit more of this stuff later in this post.

Denial isn't just a river in Egypt?
This has been a twilight-zone.
To refresh your memory:  if more than one query is executed in a single Hibernate session and the result sets intersect, then the second query returns a weird combination of old and new data.  That can break the logic in your code,  for example:
       Users.findAllByNickName('bob')

can return records with nickName != 'bob'. Other things can go wrong too: Maybe you have used a DB unique key to define equals()?  Or maybe you have used a DB unique key as a key in a Map? Any of this could go very wrong.
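
Here is a toy model of the mechanism, written in plain Java. It is NOT Hibernate code and makes no claim about how Hibernate is implemented. It only assumes what is described above: a session keeps one object per id and hands that cached object back when a later query matches the same row. All names (ToySession, the 'db' map, and so on) are hypothetical.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class RepeatableFinderToy {
    static final class User {
        final long id;
        final String nickName;
        User(long id, String nickName) { this.id = id; this.nickName = nickName; }
    }

    // Toy session: an identity map from id to the first object loaded for that id.
    static final class ToySession {
        private final Map<Long, User> cache = new HashMap<>();

        // The row selection uses current 'db' data, but a cached object wins over the fresh row.
        List<User> findAllByNickName(Map<Long, String> db, String nickName) {
            List<User> result = new ArrayList<>();
            for (Map.Entry<Long, String> row : db.entrySet()) {
                if (row.getValue().equals(nickName)) {
                    result.add(cache.computeIfAbsent(row.getKey(),
                            id -> new User(id, row.getValue())));
                }
            }
            return result;
        }
    }

    // Does every record returned for 'bob' really have nickName 'bob'?
    static boolean propertyHolds(boolean reuseSession) {
        Map<Long, String> db = new HashMap<>();   // toy table: id -> nickName
        db.put(1L, "alice");

        ToySession first = new ToySession();
        first.findAllByNickName(db, "alice");     // loads user 1 as 'alice'

        db.put(1L, "bob");                        // a concurrent update renames user 1

        ToySession second = reuseSession ? first : new ToySession();
        List<User> bobs = second.findAllByNickName(db, "bob");
        return !bobs.isEmpty() && bobs.stream().allMatch(u -> u.nickName.equals("bob"));
    }

    public static void main(String[] args) {
        System.out.println("same session:  " + propertyHolds(true));
        System.out.println("fresh session: " + propertyHolds(false));
    }
}
```

In this toy, the query run in the reused session matches the row by its new nickName but returns the old cached object, so the 'all results are bob' property fails. With a fresh session it holds.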

At first, I thought that the issue must be well known and that I was missing some way of handling it. This, unfortunately, is not the case. Very recently, I came across this blog from 2009: orm-sucks-hibernate-sucks-even-more
"... take a look how even a silly CRUD application would suffer, once you've got "not-very-recent" object from the session"
That quote points to a (now non-existing) page on the Hibernate website. Did we know more in 2009 than we know now?  If we did know, why have we allowed this issue to stay unresolved? Well, this is all speculation.

I tried my best to do 2 things:  make the community aware and persuade Hibernate to fix it. I have failed miserably on both counts. Here are the results of my efforts (as of Oct 17, 2014, tallied a bit over 2 months after I started my crusade):
  • post part 2:  effectively no replies, but over 1100 reads.
  • Grails JIRA: incorrect comments and then ignored
  • Hibernate JIRA:  rejected (works as intended) with suggested work-around which is incorrect
  • Stack Overflow question:  a whopping +4 score (started at -1) and a bunch of incorrect or meaningless answers
  • Grails forum: 0 replies
The Hibernate ticket was the weirdest experience.  It got rejected very fast (not a bug) with a comment to just use refresh().  After I pointed out that this workaround is total nonsense, I was sent to read some completely irrelevant documentation about concurrency.  After that, my (and Tim's) comments were ignored.

What can I conclude from these 3 facts?:
  • nobody seems to know how to resolve or even work-around this issue
  • experts provide advice that is incorrect
  • there is no interest in solving, discussing or even acknowledging it as a problem
I do not know, but probably nothing good. I think it is interesting to try to puzzle out the few responses that the problem did generate. I will try to do that here.

The replies I got from the experts fall into 2 categories. The first category is answers like this:
  • It is an issue with any ORM
  • Any database application will have an issue like this 
It is true that the issue can be resolved with DB locking.  In particular, I could prevent repeatable finder by having all HTTP requests wrapped in long transactions and configuring higher (repeatable read) isolation level. Indeed, it is a big framework design failure, if we need to resort to things like this.
It is NOT true that any ORM and any database application will have this problem.  The most likely explanation for this type of response is that developers do not think about side-effects. They see Hibernate query and think of a SELECT statement only.  If I see a problem, it must be from the SELECT, where else would it come from?  This is consistent with the point I tried to make in my previous posts.

The second category are answers that suggest using refresh() or discard() to fix the problem:
'To fix your problem'
  • add refresh() to your code
  • or:  add discard()/evict() to your code
My first reaction was: Grrr, my second:  Hmm.  If I could only continue this conversation I am sure it would go like this:
Me: Where do I add these?  Expert's Reply: Add them where you have that problem. 

If you have been following various Hibernate discussion forums, you must have noticed that the same type of advice (either to add refresh() or to add evict()) shows up very frequently. This advice is never right.

Grails and Hibernate experts:  I am very disappointed in you.

Add refresh()...  add evict()... Thinking in Hibernate.
(Here is where I sneak-in some interesting stuff.)
This is how we typically reason about our code: the problem is on line 57 because variable xyz is ... and then on line 89 we do that..., and then on line 127 we have an if statement that goes like that...
We reason about our code by examining chains of programming instructions.

This is called imperative thinking and imperative programming.  If you read my previous posts you may assume that I consider such programs not logical. They are logical, only the logic is very cumbersome and complex.
A well designed OO program is where lines 57, 89 and 127 are all in the same class and the chains of instructions we need to examine are relatively short.  In a procedural program lines 57, 89, and 127 can be anywhere and chains are long.  Badly designed OO programs behave like procedural programs.

Repeatable finder is a great example where imperative reasoning fails.  The problem is not something between lines 57 and 89 or something on line 115.  The problem is (or can be) anywhere.
The answer 'add refresh()' or 'add discard()' is very imperative thinking: it assumes I can add it on line 89.  (I can only conclude that this expert advice is not to sprinkle refresh() all over my code just for fun ... and because my code will run too fast without it.)

So what is the alternative?  The idea is to think about a block of code in a way that can prove certain behavior of that block. If we know that a code block 'A' exhibits the same behavior no matter where it is placed or how it is used, then we no longer need to think about lines 57, 89, and 127 or about chains of computing instructions.

This is called declarative thinking and programming.  Logical reasoning is now simplified: I no longer need to follow chains of programming steps to reason about correctness.

Declarative thinking works great, except, if I cannot trust any property, even that:
    Users.findAllByNickName('bob').every{it.nickName == 'bob'}
then I am stuck.
That may sound like a limitation of declarative programming:  I can still keep going using imperative approach. That is true, and we all 'keep going'.  I did not stop programming my project and no, I did not add refresh() all over my code. That is why our applications are so buggy: we ignore logical problems unless we can pin them to line 127.

Side Note: The fun starts when I start combining my declarative code blocks into bigger blocks. Code needs to be logically composable.  I want a bigger block (composed of smaller blocks) to have properties too. Some like to call it programming with combinators.

Conclusions:
I would like to suggest this as a new rule of thumb: 
  the answer to use Hibernate/GORM refresh() or evict()/discard() is wrong regardless of the question.
(with exception of Functional Tests - which may need to refresh some records used in asserts).  Please comment below if you find a counter example to this rule.

I am not claiming that I know the solution to Repeatable Finder.  Maintaining Hibernate cache synchronized with the DB is hard, maybe impossible.  One way of dealing with hard problems is: make them somebody else's.  If GORM/Hibernate just told me when the query is lying (returns stale data even if it has new data) or allowed me to request/configure the query to refresh all records... That would go a long way.

It looks like the community has decided to not acknowledge Repeatable Finder as a problem. There is really no good solution for it and acknowledging it would be admitting to that fact. This issue is likely to remain unsolved and ignored. More complex Grails apps are doomed to work incorrectly under heavier concurrent use.

I have added a label to my posts (which does not work so do not click on it - and that convinces me that blogger must be using Hibernate):  'Stop thinking in C++ and Java'.  I think we need to stop thinking imperative or at least stop thinking only in imperative terms.

Next post:  I need to do one more to wrap-up. I will be finishing this series next week.

I have a busy period ahead of me.  I started going over a set of courses published online by University of Oregon (Oregon Programming Languages Summer School) and that will be many hours of not very easy listening and learning.  I also have to start preparing/training for a ski camp in early November (I live in CO and skiing has started here already).  No, this will not be a SKI calculus camp;) - but then, believe it or not, technical skiing is (or should be) a fun intellectual activity too.

Friday, October 10, 2014

I don't like Hibernate/Grails part 9: Testable code

"I get paid for code that works, not for tests, so my philosophy is to test as little as possible to reach a given level of confidence". Kent Beck.
Finding more problems with less effort is something I totally agree with.  Which approach to testing will give me the most bang for the buck?  I like to think about tests in a pragmatic way: I write tests to find bugs and guard against bugs. I consider manual testing rather ineffective and, in most cases, inferior when compared to automated tests.

Testing seems to be deeply related to my last two posts.  It is obviously related to software correctness. Less obviously: our approach to testing impacts how we perceive the framework. How we test can explain why we like or don't like Hibernate and Grails.

I have moved away from unit tests in Grails.  My approach to testing is an 'inverted pyramid'. I test internal implementation details using Grails integration tests and I write a lot of functional tests.  My tests are less 'unit' than you may like them to be (they interact with an actual database, mocking is replaced with data setups) but are closer to reality. This post explains my reasoning behind this decision.

I am considering the unit/integration/functional division from the 'testing philosophy' point of view.  I care less whether tests are placed in the test/unit, test/integration or test/functional folder.  I will stay away from the love-hate TDD debate.

Testing choices: (Just to make sure we are on the same page.)
Testing spectrum has these 2 extremes with very different characteristics:
(1) Unit Tests:  testing expected behavior; testing in a fake environment; testing internals not visible to the end users 
Also: white box testing; bottom-up testing
I will test parts (units) of my software in isolation from anything else.  I will 'mock' interaction with the rest of the system.  Since I know exactly what can go wrong, I can write mocks that exercise the tested part under a specific situation.
(The mocking aspect is specific to OOP, nobody writes mocks in FP... with that said, there is this great book:  http://en.wikipedia.org/wiki/To_Mock_a_Mockingbird ;))

(2) Functional Tests: testing for unexpected problems; testing in a real environment; testing functionality visible to the end users 
Also: black (more gray than black - I need testing 'hooks') box testing; top-down testing
I will test my application as a whole.  I identify a list of specifications and I write tests to verify that my application works correctly with respect to these specifications. I do not need to know everything that can go wrong, I assume that since I covered a comprehensive range of data and scenarios representing actual software usage, most of what can go wrong will be uncovered.

Integration tests fall between 1 and 2.

In this post I argue that testing in Grails needs to focus on unexpected problems, and should be done in a real (or close to real) environment. That moves the testing away from unit and towards integration and functional.

Why Unit Tests are Great:
  • execution speed: no need to startup the whole infrastructure, etc
  • good coupling: the tested unit becomes married to the unit test, this can be a happy, strong marriage that is going to survive ups and downs of refactoring.
  • atomic/unit nature: the idea is to make a perfect whole from perfect units. (Except, that principle is logically flawed in OOP.) This is sometimes called bottom-up testing. 
  • unit tests can aid software design and coding.
  • aggressive conditions: once you know what kind of problems to expect you can stress the code 'more aggressively', introducing scenarios which are rare or complex to set up using integration or functional tests.  This is rarely done.
This post is not a criticism of Unit Tests.  It is a criticism of how they are used.

Where is the most bang?
Some people disagree with me when I say:  What really needs testing is side-effects. Errors caused by side-effects are the hardest to troubleshoot and fix and are often very intermittent.  Side-effects limit developer ability to do logical reasoning on the code.  Side-effects can be very confusing.
There are many books and articles about testing and side-effects have not been discussed much.  Why is that?

Case study:  This example is repeated from my last post: 
    def users = User.findAllByOffice(office1) //code (A)

Assume user has userName (with unique constraint),  office (of type Office) and userPreferences (of type UserPreferences). This code:
  1. will issue a SELECT statement (with whatever locks) 
  2. can save some changed objects to the database (part 5)
  3. can save some unchanged objects too  (part 6)
  4. will impact some record types returned by queries that follow it (from proxies to actual objects).  Some records returned from this query will use Hibernate Proxy to implement preferences and some will use the actual UserPreferences class. (Some developers may be surprised that the same goes for the office association). (part 4)
  5. will impact the data content of some records returned by queries that follow it.  Similarly, the content of some records returned from this query may be different from the data returned by the underlying SELECT statement. (part 2)
Can simply adding line (A) break my code?  Clearly it can!

Side Note 1: Why are side-effects confusing:  There is this theory that human memory and reasoning work by 'chunking'.   The idea is that the human brain stores knowledge in chunks.  Each chunk gets a label which works something like a DB index. Human brain can recall the whole chunk using that label. There is this very prominent chunk we have all formed:  SQL SELECT query. When you look at code (A) you inadvertently call for that chunk.  Yet 4 out of 5 side-effects associated with that code have nothing to do with 'SELECT' chunk your brain has just found.

Side Note 2: Why are side-effects not logical:  Maybe a better term would be: not logic-friendly.  I have dedicated large parts of this post to explaining why, but I will add a quick high-level explanation from a slightly different angle.  You can safely skip this side-note if you are allergic to academic CS.

Logic needs connectives, to start with, it needs conjunction ('and', '^'). Logical conjunction follows this reduction rule:
  if A is true 'and' B is true then B is true.
You can think of this as one of the axioms. Logic cannot start without it. The programming equivalent of this is the 'beta reduction' rule which looks more or less like this:
  second (A, B) '=='  B
and means: if I do computations A and B (compute a pair) and then ignore A, then this should be equivalent to just computing B.  This is obviously not true if A has side-effects impacting computation B! Side-effects inflict a mortal wound to the (straightforward) correspondence between logic and programming.

The above paragraph may as well have been copied from the first pages of an introductory Type Theory book. Keeping side-effects on a tight leash allows one to recover the correspondence to logic, but this is no longer first-pages material (google: 'Hoare Type Theory'). A comment by Mark, on my software correctness post, used the term 'effect' (contrast it with the more unruly 'side-effect', google: 'lax logic', 'monads in lax logic' or 'Effect System').
With unruly 'anything goes' side-effects, logical reasoning on a program becomes very, very complex and much less potent, and the Type Theory-Logic correspondence is lost. I am trying to learn this stuff, but even at my current newbie level it is eye opening.
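
The second (A, B) rule from this side-note can be tried out with a small Java sketch. The helper second and the two scenarios are hypothetical names of my own. The only point is to show the rule holding for pure computations and breaking once A mutates state that B reads.

```java
import java.util.function.Supplier;

final class SecondProjection {
    // Runs both computations, ignores the result of the first, returns the second.
    static <A, B> B second(Supplier<A> a, Supplier<B> b) {
        a.get();
        return b.get();
    }

    // Pure A and B: second(A, B) gives the same result as B alone.
    static boolean ruleHoldsForPureCode() {
        Supplier<Integer> a = () -> 1;
        Supplier<Integer> b = () -> 2;
        return second(a, b).equals(b.get());
    }

    // A writes to shared state that B reads: second(A, B) no longer equals B alone.
    static boolean ruleHoldsWithSideEffect() {
        int[] shared = {0};
        Supplier<Integer> a = () -> { shared[0] = 40; return 1; };
        Supplier<Integer> b = () -> shared[0] + 2;
        int bAlone = b.get();          // computed before A has run
        int bAfterA = second(a, b);    // computed after A has changed the shared state
        return bAlone == bAfterA;
    }

    public static void main(String[] args) {
        System.out.println("pure:        " + ruleHoldsForPureCode());
        System.out.println("side-effect: " + ruleHoldsWithSideEffect());
    }
}
```

Replace A with a GORM query and B with any code that follows it, and this is the same shape as code (A) in the case study.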

If you decided to skip the last side-note this is a good point to return:
Welcome back!  You can ignore all of this academic mumbo-jumbo. Just be aware that whatever voodoo your brain does to form judgments about your GORM code (or any other code with unruly side-effects) should be suspect. This voodoo is far from any straightforward logic and it is easy to get things wrong. Hopefully, I managed to scientifically convince you that: 
  • Side-effects need extra attention when testing
  • This will not be: 'testing expected behavior'
Here is a more pragmatic argument:  As I explained in my last post: side-effects 2-5 are a no-show for very simple CRUD applications and are not that frequent for simple CRUD apps.  The question is: are you satisfied with your app if it works 95% of the time?
The point is: complex Grails apps or apps that strive for more than 95% correctness need to make serious attempt to test side-effects 2-5.

Unit Test Ostrichism
Keep your nose out of trouble, and no trouble'll come to you. - The Lord of the Rings movie
I have been scratching my head asking: why is my current project finding so many issues that nobody else has reported?  One possible answer is:  everyone else has been unit testing and 'assuming away' the reality.  If that is true, then maybe everyone else has also decided that Hibernate is OK.

Here is a question for you:  When writing unit tests for code that includes GORM queries (assume something similar to code (A)), do you use mocks to verify that:
  • the query does not save any objects? 
  • your code works correctly even if the query returns User object with office2 != office1?
  • your code works correctly even if the query returns objects that violate DB enforced constraints (such as more than one user with the same userName)?
  • your code works correctly not only on actual domain objects but also on hibernate proxies? 
I do not. I decided that the amount of work needed to write tests like this would be prohibitive.  But if you disagree please drop me a comment!

Here is another question:  The above bullet list spells out some impacts of side-effects 2-5. This list is not complete.  Can you think of other examples?
I know that I do not know all impacts.  I am not even sure if I understand all GORM side-effects. Testing for expected problems almost guarantees that I will not find what I do not know.

Code with non-localized impacts is not testable.
OO code needs decoupling.  In well-written OO code, state mutation in object A does not impact other objects. OO software designed without decoupling is not testable.  There is a technical term for it: spaghetti.

Almost every post in my 'I don't like' series has shown an example of GORM code where the behavior changes (even breaks) when an isolated GORM query is added to or removed from the code. Hibernate queries create impacts that cannot be easily localized to one or few classes.

How can GORM and Hibernate defend their design as testable?

Fail Fast and Test Easy:
This is something very much missing in Grails.  Very often incorrect code is likely to work just fine 50%, maybe even 80%, of the time. Data changes or a query is added or removed somewhere and things break.
Most of my posts in this series pointed out examples like this (see part 3 and part 6).  Ideally, incorrect code should either fail to compile or should fail during my first attempt to execute it.  This is often not that easy to accomplish with languages like Java or Groovy, but there is just no excuse for, say, allowing me to use a GORM object obtained using Hibernate session S1 within session S2.  If such code fails intermittently it should ALWAYS fail.
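
Here is what 'it should ALWAYS fail' could look like, as a minimal Java sketch. This is a made-up toy, not GORM or Hibernate API: ToySession, Entity and update() are hypothetical names, and the only assumption is that an object can remember which session loaded it.

```java
final class FailFastSession {
    static final class Entity {
        final long id;
        final ToySession owner;   // the session that loaded this object
        Entity(long id, ToySession owner) { this.id = id; this.owner = owner; }
    }

    static final class ToySession {
        final String name;
        ToySession(String name) { this.name = name; }

        Entity load(long id) {
            return new Entity(id, this);
        }

        // Fail fast: an object loaded by another session is rejected every time, not just sometimes.
        void update(Entity e) {
            if (e.owner != this) {
                throw new IllegalStateException("entity " + e.id + " was loaded by session "
                        + e.owner.name + " and cannot be used in session " + name);
            }
            // a real implementation would write the changes here
        }
    }

    static boolean crossSessionUseFails() {
        ToySession s1 = new ToySession("S1");
        ToySession s2 = new ToySession("S2");
        Entity e = s1.load(1L);
        try {
            s2.update(e);
            return false;
        } catch (IllegalStateException expected) {
            return true;
        }
    }

    static boolean sameSessionUseWorks() {
        ToySession s1 = new ToySession("S1");
        s1.update(s1.load(1L));   // no exception expected
        return true;
    }

    public static void main(String[] args) {
        System.out.println("cross-session use fails: " + crossSessionUseFails());
        System.out.println("same-session use works:  " + sameSessionUseWorks());
    }
}
```

The check is cheap and deterministic, so the very first test that mixes sessions would catch the mistake.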

Without fail-fast philosophy unit tests are not worth much.

Unit Testing and FP:
"Writing unit tests is reinventing functional programming in non-functional languages" (Christian Sunesson - on github).  (I found the linked post very interesting reading.)
Here is a mental exercise:  when reading the following blog about 'testable code': http://googletesting.blogspot.in/2008/08/by-miko-hevery-so-you-decided-to.html , think how each of the guidelines relates to FP. Notice they are all N/A!
The term 'testable code' is OOP self-admitting its limitations (my OO code is not testable unless I follow this list of rules...). FP code does not need any tweaking or special guidelines to be testable.

In many respects Hibernate design is the opposite of FP. If one is very unit testable the other one probably is the opposite of testable.

Not just Hibernate:
The Grails/GORM/Hibernate stack is complex and has some complex bugs. So does Groovy. I know that lots of my app functionality will break the next time I upgrade Grails. Is testing for 'expected' problems the best investment in this environment?

Conclusions:
Grails contradicts itself on unit testing:  the framework provides rich tools to unit test domain objects, only these tests will be rather useless. I think this confusion is not something that Grails has introduced. The Java community in general does not perceive side-effects as something to worry much about. Unfortunately, perception != reality.

More on the confusion:  it should be clear by now that the blame for a lot of this resides with how the Hibernate session works. Non-localized side-effects are really hard to test.  With that stated, I would expect each JUnit integration test in Grails to run in its own session.  The very idea of several tests sharing the same Hibernate session is repulsive.  Take a look at these JIRA tickets: GRAILS-11644, GRAILS-11706.

I find Grails attitude towards testing insanely confusing.

Testable Code
Testable code should be defined as: code that makes unit tests effective in bug prevention.
This statement is not something everyone agrees with. Unit tests are often viewed as a way to aid the design and coding process (TDD) and this becomes their main purpose in life.  
"Unit testing is not about finding bugs", see Writing Great Unit Tests. Don't you agree that something is very wrong with this sentence?  Instead of saying "unit testing is not about finding bugs", maybe we should rethink how we write the tested code so it is?  How about: start paying attention to the side-effects?

If you are doing something over and over and it does not work you have 3 choices:
  • keep on doing it and expect it will work (called insanity, also known as the Grails approach to unit testing)
  • keep doing it and null out your expectations ("unit testing is not about finding bugs")
  • change the way you do it (redesign the tested code or/and the way you test).
I am suggesting the 3rd bullet is the way to go.  I cannot redesign Grails but I can rethink how I test.

Functional Test Testable
I like Geb. Writing good functional tests is not trivial and, like with unit tests, it impacts the design of the tested code. The term 'testable code' needs to be extended to functional tests.  ... it may be an idea for a long post somewhere in the future.

Next Post:
I want to start wrapping up my 'I don't like' series.  In my next post, I plan to give an update about the repeatable finder issue (post 2) and include a few final thoughts.



Friday, October 3, 2014

I don't like Hibernate/Grails, part 8, but some like Hibernate and Grails. Why?

Small change in plans. I wanted to write about testing, but this topic logically precedes testing. Writing the last 2 posts made me realize something: each application is different, and the Grails/Hibernate problems I am likely to notice are very, very much dependent on the app I am working on.

Simple App:  Think of something you can generate by asking Grails to do the coding for you. Simple app is a CRUD application with these characteristics:
  • Simple domain Objects without relationships
  • No transactions/services
  • Not more than one hibernate query per request
  • Simple validation logic (contained in domain objects)
This example repeats part of my last post.  Assume that I have a domain object User which looks similar to this:
    class User {
        ...
        Office office
        UserPreferences preferences
    }

In a simple app this code:
    User.findAllByOffice(office1) //Code Example (A)

will exhibit only one type of side-effect: it issues a SELECT statement (plus some DB locks).

Complex App:  In my last post I listed several side-effects that can be associated with (A). Clearly, complex apps are a different ball game!  Here are the side-effects listed again.  Code (A)
  1. will issue a SELECT statement (with whatever locks) 
  2. can save some changed objects to the database (part 5)
  3. can save some unchanged objects too  (part 6)
  4. will impact some record types returned by queries that follow it (from proxies to actual objects).  Some records returned from this query will use Hibernate Proxy to implement preferences and some will use the actual UserPreferences class. (Some developers may be surprised that the same goes for the office association). (part 4)
  5. will impact the data content of some records returned by queries that follow it.  Similarly, the content of some records returned from this query may be different from the data returned by the underlying SELECT statement. (part 2)
Side-effects 2-5 have very non-local impacts (they can affect unrelated parts of code sharing the same hibernate session) and can be extremely hard to avoid.  This leads to code that is unpredictable.  So, how is it possible for Hibernate to be so popular?
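Side-effect 5 is the easiest one to picture with a toy model. The sketch below is plain Groovy, not Hibernate: a map stands in for the database and the made-up CachingSession class stands in for a session that keeps already loaded objects by id. It only illustrates the shape of the problem, under the assumption that a later query in the same session hands back the object loaded earlier.

```groovy
// Toy model only: CachingSession is hypothetical and much simpler than a Hibernate session.
class CachingSession {
    Map database               // stands in for the real database
    Map cache = [:]            // objects already loaded in this session, by id

    // Runs a "SELECT", but returns the already loaded object when the id is known.
    Map findById(int id) {
        def row = database[id]
        if (!cache.containsKey(id)) {
            cache[id] = new HashMap(row)   // first load: copy the row into the session
        }
        cache[id]
    }
}

def database = [1: [id: 1, name: 'alice']]
def session = new CachingSession(database: database)

def first = session.findById(1)            // an earlier, seemingly isolated query
database[1].name = 'bob'                   // concurrent change made by someone else
def second = session.findById(1)           // a later query in the same session

assert second.name == 'alice'              // stale: not what the underlying SELECT saw

// Without the earlier query (here: a fresh session) the same call returns current data.
assert new CachingSession(database: database).findById(1).name == 'bob'
```

Removing the first findById call changes what the second one returns, even though the two calls may live in unrelated classes.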

Counting the number of side-effects in (A) that impact my code is just one possible measure of my app complexity.  As my application becomes more complex, these side-effects become 'louder' (impacts become more frequent and more noticeable).  In this post, I am less interested in how many side-effects impact my code, more in how 'loud' they are. 

The following table shows a fictional CRUD application that started very simple and became more complex over time.  
H5 - total number of side-effects in queries similar to (A)
H6 - frequency of problems related to queries similar to (A)



| # | Complexity | Exposure to Problems | Explanation of impacts | H5 | H6 |
|---|---|---|---|---|---|
| 1 | Simple App | Happy Days! | Happy Days! | 1 | |
| 2 | Add Domain Object Relationships | Hibernate Proxy Objects | | 2 | low |
| 3 | Add more complex validation - more than one query in the same session | Repeatable Finder | | 3 | low |
| 4 | Move validation logic outside of domain objects | Auto-flushing | | 4 | low-mid |
| 5 | Add Services and Transactions | LazyInitializationException. Auto-flushing becomes less loud (unwanted saves are rolled back). Transactional integration tests diverge from reality. | Auto-flushing can still be a problem: saving can fail. | | mid |
| 6 | More complex data passed from client | Unmarshalling of client data keeps changing between Grails versions | Small intro to version upgrade problems | | mid |
| 7 | Added logic in Filters | Higher probability of repeatable finder issue. Need for more complex test infrastructure. | Side Note: What is the before-after-afterView ordering if Controller does a forward? Potentially interesting filter cleanup or setup ordering issues. | | mid |
| 8 | Application managed hibernate sessions | DuplicateKeyException and similar hard to troubleshoot errors. | Queries can save unmodified objects. | 5 | high |

Row 8 needs more explanation.  Here is one example of why I need to create small hibernate sessions:
Integration tests may need to include scenarios mimicking activity performed over several HTTP requests.  Typically, I see such tests mashing all logic into one test method and executing everything in the scope of the same hibernate session.  In real life, each of the requests will be performed in a separate hibernate session. Tests have diverged from reality.  Side-effects 2-5 listed above will have a dramatically different impact on the code when run in the test environment.

Audacious App:
Consider Grails/Hibernate implementing a Type 2 Slowly Changing Dimension (the one using time slices).  To do that, I may want to use something like a session variable (available in most databases, including Oracle or PostgreSQL) to define a time point and use read-only views to get a snapshot of all my data at the specified time point.  I will configure my GORM domain objects against these views and see what happens.

To use session variables, I will need to wrestle with Spring framework DB connection management to make sure that the connection is not swapped under me, because that would change the definition of session variable.  That is not trivial but doable.

How does that change my exposure to Hibernate side-effects?  Consider this: moving the time-point one day forward applies a day worth of user activity in just a few milliseconds!  Many concurrency problems in the application just became super loud. That includes repeatable finder (side-effect  5, my blog part 2).  Because of that I need to wrap each use of a time-point session variable with a withNewSession() block (or something equivalent).  That exposes my code to issues documented in part 3 and 6 (side-effect number 3!).  All of these are now super loud.
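For readers who have not worked with time slices, here is a toy sketch of the idea in plain Groovy. A list of maps stands in for the history table and a closure stands in for the read-only view; the column names (validFrom, validTo), the office names and the integer 'days' are all made up for illustration.

```groovy
// Toy model only: a list of maps stands in for a Type 2 slowly changing dimension table.
// Each row is valid for timePoint values with validFrom <= timePoint < validTo.
def officeHistory = [
    [id: 1, name: 'Denver',      validFrom: 1, validTo: 5],
    [id: 1, name: 'Denver West', validFrom: 5, validTo: Integer.MAX_VALUE],
    [id: 2, name: 'Boulder',     validFrom: 3, validTo: Integer.MAX_VALUE]
]

// Plays the role of the read-only view: rows whose slice contains the time point.
def snapshotAt = { int timePoint ->
    officeHistory.findAll { it.validFrom <= timePoint && timePoint < it.validTo }
}

assert snapshotAt(2)*.name == ['Denver']
assert snapshotAt(4)*.name == ['Denver', 'Boulder']
assert snapshotAt(5)*.name == ['Denver West', 'Boulder']
```

Moving the time point from 4 to 5 changes the content of office 1 in a single step, without any UPDATE being issued. To code holding objects loaded at the earlier time point, this looks exactly like a burst of concurrent modifications.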

Conclusions:
This is my best attempt to explain the discrepancy in perception of Hibernate and Grails. I think there is more going on that I do not understand, but this is my best answer to-date.

The 5 side-effects listed in this post are worth investigating more.  I will refer to them again when talking about testing (next post).

Friday, September 12, 2014

I don't like Hibernate/Grails part 7: working on more complex project

Grails' big claim to fame is productivity.  A simple CRUD application can be up and running in just a few hours of development work.  Grails makes writing a small CRUD application unbelievably easy.
But small simple apps sometimes need to grow up to become bigger and more complex.  How easy is it to do that with Grails?  Here is my experience with how that works.

In my previous posts I tried to rigorously document framework issues that I believed are not well known by the Grails community, some not known at all.  You may find some surprising things in this post too, but instead of providing exact code examples I simply 'ramble' about my experience.  I have grouped this post into sections so you can pick and choose what is of interest to you.
If you find that boring, the next installment, about Unit Testing, is coming soon.

My current project:
I am not going to bore you with the details, just some very high level bits and pieces.   I need to put 'more complex' in some context.  As often happens, my project started relatively simple and the requirements grew more and more complex.

Now, the project enjoys complex domain model with nontrivial business logic (even CRUD is complex).  Some domain objects can be shared between users from one office, some across several offices, some across regions, some across everywhere (add model relationships and this becomes very interesting).  Objects are time dependent, maintain historical information and some relationships can go across points in time (object A at time T1 is in relationship with B at time T2).  Complex uniqueness constraints are in place, with overrides for users with certain roles and certain object types.  Users can view 'deleted' objects... We do some fun algorithmic work too, like programming on graphs and trees.

All of this may sound crazy complicated but it mostly boils down to more complex business logic, need for more complex queries and more complex testing requirements. I keep asking myself:  is Hibernate simply a wrong choice for more complex apps like this?

High level design:  All logic is kept in or behind services.  Services form deep class inheritance trees and use mixins.  Controllers are thin, but use inheritance too.  Domain objects are very anemic (almost no logic).  We used domain objects with mixins at one point, but these have been refactored to use JPA style and class inheritance.  Level 2 cache is used on certain domain objects which are never updated.  GORM criteria queries are used extensively (and almost exclusively) for querying. Some direct SQL is used for performance. Processing of large imports is often split into small hibernate sessions for performance.

Groovy Magic:
I hear that commonly as a complaint,  but I do not really agree.  With improving tool support (such as IntelliJ), introspecting framework code becomes easier and easier.   I think the Grails team has done a phenomenal job in providing a terse interface to the Spring/Hibernate stack.  My problems are typically not with the Groovy visible top but with the whole stack.  Terseness makes things look nice, but then 's..t' is a very terse word too.

Complexity of Grails/Spring/Hibernate stack:
Beast sitting on a beast sitting on a beast.  This stuff is complex.  I did not work with Spring much nor with Hibernate (lucky 'previous' me).  A lot of stuff can be configured but how safe is it? How much configuration can I change in Hibernate to not upset Spring?  How much in Spring to not upset GORM?
For example,  some time ago Spring/Hibernate used to hold on to the database connection for the duration of the whole HTTP request before returning it to the pool. Our app needs this behavior.  There is a legacy setting for it (hibernate.connection.release_mode), which accepts the on_close value. But it does not work on its own; other settings need to be modified too ... up to the point where we ended up implementing Hibernate interceptors to handle some of the problems. It seems like this configuration ('on_close') is no longer supported by the stack.  Not that we found much documentation about it.  (Please comment on this thread started by Tim and Chris if you have more knowledge about this:  http://groups.google.com/forum/#!topic/grails-dev-discuss/w3MnxautR1c)

There are some problems reported in JIRA that many developers have experienced but Grails/Spring project developers cannot reproduce.  I imagine there are many that do not get reported at all. One example is the notorious problem with hot reloading of code changes to services.  (http://jira.spring.io/browse/SPR-4153). We are seeing this problem.  Other developers that I know are seeing this too.  Reproducing this behavior on a freshly created Grails app seems impossible.  Tim was determined to figure out how to reproduce it 'from scratch' but so far, no luck.

Framework Bugs:
The prize for the most interesting bug ever goes to Chris:  If a controller tries to render JSON on a detached criteria query (instead of the results of that query), the source code is wiped out. Yes, you are reading this correctly: invoking a controller method ends up removing your code. Thank GOD for version control!    Chris entered a bug for it into Grails JIRA; sadly, it was never assigned or acknowledged in any way:  http://jira.grails.org/browse/GRAILS-9169.   I have not tried to reproduce it so I do not know if it is still a problem. 
Side Note: Groovy/Grails is in a very good company here!  Once upon a time, a similarly embarrassing bug has happened with GHC (Glasgow Haskell Compiler) which would have erased the source if it did not like some code.
As entertaining (and embarrassing) as this problem was/is, I consider it not as big a deal as, say, some Hibernate 'features'.  The scariest to me is code where adding unrelated functionality (such as an isolated database query) can cause the code to break or behave differently.  With Hibernate this is not even considered a bug, it is how Hibernate works.

None of the problems that I have discussed in my previous posts have been exclusive to complex apps. I guess, there must be some law which makes the relationship between project complexity and exposure to framework bugs go exponential.  We are seeing a lot of problems now, much more than when the app was 'young and naive'.

Keeping up with Grails releases:
Historically, upgrades have been painful.  Here are some examples of things that got us in trouble: mixins, sending JSON from the client,  ivy, having both unit and integration tests.  The project started on  2.0.1.  Release 2.1 had some bugs which were a no-go for us so we waited.  Eventually we decided to bite the bullet and upgrade to 2.3.2.   The cost of that upgrade was somewhere in the 2 developer week range!   Updating to 2.3.3 was easy, but 2.3.4 broke our code again and we decided to wait ...
We are currently in the process of upgrading to 2.4.3... and this time it looks like a much easier process, done in about 2-3 developer days.  I hope this means that the Grails project is starting to stabilize and the next upgrades will get easier and easier.

Update process is important,  just the improvements in Groovy compiler are really making it worth the high cost... but the cost is high.

Testability:
For reasons that should be apparent from my previous posts, we have reversed the typical testing pyramid:  We have a lot of functional tests,  many integration tests and no unit tests.  Our 'atomic'/unit testing is implemented in Grails Integration Tests. This differentiates my current project from other Grails projects and may explain, in part, why we are seeing problems that others are not.

Can popularity of Hibernate be explained with how developers approach the testing process?  Can Grails bugs be explained by how Grails project tests itself?  Can unit tests be blamed for all of that?

This approach has a drawback:  our tests are slow to run, which impacts the overall productivity.
I will write about testing in Grails in a separate post (or posts).

Performance:
Fortunately, my project has no high scalability requirements. My problems are more related to the need for our custom graph and tree algorithms to work fast.   Grails and Groovy are working on improving performance, which is great. The biggest headache for me is the cost of the call stack.  Basically the cost of method or closure call in Groovy is much higher than in Java.  
My take on this is that some performance critical parts of the code need to be implemented directly in Java.  Hopefully there will be very few of these.

Concurrency:
Concurrency is not something exclusive to bigger projects.  Any web app needs to consider it.
My previous posts have identified some concurrency issues in GORM.  I consider 'repeatable finder' a serious Hibernate design flaw.  I expect the framework to help me with concurrency, not to make things worse by throwing at me code that is ill designed for handling it.   The worst case is where a concurrency problem is thrown at me and I have no way of solving it.   (Again, these issues appear to be inherited from the Spring/Hibernate technology stack.)

We have some functional and integration tests for concurrency, but not that many.  What we did differently is:  we used a lot of 'withNewSession' logic in our tests (which is a good thing).  We also spent some time simply thinking about the potential logical problems with how the Hibernate session works.    That has exposed two interesting issues (documented in previous posts) that probably nobody else knew about.

It may be interesting to compare Grails with other frameworks.  Ruby on Rails' 'share nothing' philosophy offloads handling of complex concurrency issues to the database engine.  The main motivating factors behind 'share nothing', as I understand, are scalability and concurrency.  Active Record is session-less.   Play/Ebean is like this too.  But Play started to move to JPA/Hibernate.  That is a very surprising move for me.   I always viewed the Scala community as more FP-oriented. 

Community:
Being able to simply google to find answers is priceless.  Grails documentation is growing too (and has a hard time staying current).  However, we rarely had any luck with posting questions and getting answers.  The silence about the 'repeatable finder problem', which I have posted in 2 JIRAs,  stack overflow and the Grails forum, is a good example.  Maybe other people just do not see these problems often enough to notice?  It seems like once we passed a certain level of complexity in our project, we are sailing these waters on our own.

This may be less true than it was before, but it seems like people get in trouble when they criticize Hibernate, and the resolution of such criticism is to label the critic ignorant or stupid.  That really amazes me. So there is this library where just adding a new query can change the behavior or break your previous code and you cannot criticize that?

Here is a conversation between two fictional programmers A and B (it resembles some I have seen on the internet when searching for answers to my Hibernate questions):
A:  I am finding how Hibernate works hard to swallow,  my application is very hard to maintain.
B:  I have used Hibernate in 4 projects and never had a problem,  the problem may be you.
.. and the discussion stops here

I am very puzzled by all of this. What does B do differently than I do?  Less complex projects - is that it?

How did that ever work?
Did you ever experience this?  I am looking at some code, typically a test that is failing, and scratching my head: how did that ever work?  I have experienced situations where, for example, a test had a local variable declared and defined in one closure that was used in another closure.   This test had been passing for months on developer computers and on Jenkins...  I made some unrelated change and it started to fail with an undefined variable error...

The only logical explanation I can come up with is some incremental compilation issue that is not resolved by cleaning.  That, or .. I am crazy.
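For reference, this is the shape of such code as a standalone Groovy script. It is only a sketch: 'setup' and 'check' are made-up names, and the original test lived inside a Grails project, not a script. Run this way, Groovy rejects the second closure every time, which is why months of passing tests are so puzzling.

```groovy
// Standalone Groovy script; 'setup' and 'check' are made-up names.
def setup = {
    def counter = 42     // local to this closure only
    counter
}
def check = {
    counter + 1          // 'counter' is not defined in this scope
}

assert setup() == 42

def error = null
try {
    check()
} catch (MissingPropertyException e) {
    error = e            // "No such property: counter ..."
}
assert error != null
```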


Clean, clean-all, super-clean, delete folders  ...
And then sometimes nothing works until you clean, super-clean, restart IntelliJ,  reboot...   Again this is a complex technology stack.

IntelliJ:
It is improving, syntax detection works better and better. Ability to drill into source code is improving.  Still there are some glitches but considering how complex this must be, it is impressive.
I have used STS in the past but not for some time, so I cannot comment on STS.

Overall Productivity:
As a developer,  I would much rather focus on complex business functionality than on figuring out Spring or Hibernate gotchas or concurrency issues.   This is a hard project management problem.  How can we estimate the cost of new functionality if resolving gotchas takes a significant chunk of our time? What gotchas?  Please look at the part 1 - part 6 posts on this blog site.

Not being able to trust in unit or integration tests is another big problem.  Functional tests have many benefits, but these tests are slow to run and running them after each code change is not very feasible.  

It would be good to figure out some stats for what percentage of time is spent on dealing with framework related issues. I have no such stats but it feels like Grails/Hibernate does not scale well to more complex projects...  but then, would I prefer to use Spring/Hibernate directly? NO! NO! NO!

Maybe we are seeing more of Spring/Hibernate issues because Grails 'overlay' allows us to spot problems that have been previously overlooked by developers bogged down with writing verbose Java code?   'No good deed ever goes unpunished' , am I blaming Grails framework for making things better?  Quite possibly, I do.

Finally, I think a big lesson from all of this is maybe that Hibernate and 'outside of the box' thinking do not mix.  Keeping things orthodox and simple may help reduce the volume of issues that I and you will see.   My previous projects have been very vanilla like,  I have seen some problems but not enough to write home (or blog) about.  True, I knew less back then on what can go wrong, so there is some ostrichism in this advice, but the likelihood of things going wrong appears to be significantly smaller when things are kept simple.

Next Posts:
I want to take a short break from Grails/Hibernate 'bashing' and write about what absolutely fascinates me:  My dream is to have the ability to write code that has no bugs at all. Is that even possible?  (If you read my "I don't like" series, you probably can see why that is my dream.... and yes it is possible!)
I also want to discuss these questions: Is Groovy 'functional', how functional is Grails?

After that: I will talk about challenges and problems I see with testing in Grails.

Thursday, September 4, 2014

I don't like Hibernate/Grails part 6, how to save objects using refresh()


This one took some time to figure out, but the closer I got to unraveling the problem, the more my jaw dropped.  I have seen mysterious errors in Geb/Spock tests; the errors were not hard to fix but the behavior was very scary.  Tests were trying to save a domain object (and failing to save it), but that object was not modified by the test, nor was there any explicit logic to save it!

Eventually, I was able to reproduce it 'from scratch'  (using current Grails 2.4.3 and 2.3.3 project with all new project defaults).  I use this simple domain class:
   class Simple {
      String name
   }


and I assume I have a record for it with name 'simple1' saved in the database. Consider this code:

   def simple1 = Simple.findByName('simple1') //(1)

   Simple.withNewSession {
     simple1.refresh() //(2)
   }

   Simple.findAll() //(3)


If simple1 is concurrently modified, GORM will actually attempt to save simple1 on the line marked with (3).  If this happens, the above code will fail with HibernateOptimisticLockingFailureException because the Simple class has an implicit version property.

Side Note:  I think, developers who configure 'version false' on their domain objects are indeed very brave!

Please note that the above code has NOT modified simple1 in any way. What happens here: the line marked as (2) sets the dirty flag on the simple1 object!  This should be a somewhat embarrassing bug, because developers may use refresh() to reset the object content and to prevent any saving from happening. In reality, the very use of refresh() can actually trigger the save!

Here is a test method which demonstrates the issue (this test passes in the current latest version, 2.4.3, and in 2.3.3): 

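import groovy.sql.Sql

// Method of a Grails integration test; sessionFactory is assumed to be injected by Grails.
void testNewSessionWithRefresh() {
  def simple = Simple.findByName('simple1')

  //start concurrent mod: change the row behind Hibernate's back using direct SQL
  def session = sessionFactory.currentSession
  Sql sql = new Sql(session.connection())
  sql.execute("""
     update simple
     set version=version + 1, name='new_' || name
  """.toString())

  def rec = sql.firstRow(
     "select name from simple where id=${simple.id}".toString()
  )
  assert rec[0] == 'new_simple1'
  //end concurrent mod

  Simple.withNewSession {
      simple.refresh() //refresh in a session other than the one that loaded simple
  }

  //simple was never modified by this test, yet it is now flagged as dirty
  assert simple.dirty
}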

Maybe it is the nature of ugly side-effects that if you combine them (auto-save + session) new ugly (and uglier) side-effects are born?

Fail Fast?:  You can argue that the refresh() call should not be made in the context of a different hibernate session.  Yes, but if this is the case, should I not expect an exception thrown from that refresh() call?   This is yet another of the cases, so typical of the Grails/Hibernate technology stack, where bad code works fine most of the time. A lot of code in Grails is like a ticking bomb.

I uncovered this particular problem by analyzing weird behavior in my Geb functional tests. Geb code triggers server activity,  which is concurrent with the Geb test itself.  How can I be sure that all my code is safe from this problem?  Should I avoid using more than one hibernate session per, say, one HTTP request?  Yes, that would go well with Grails/Spring design in general, but there are situations where having more than one session is needed (as discussed in my post part 3).

Exercise for the reader:  Change last line in the above test method to:
   assert !simple.dirty

The test now will fail.  Fix the test so it passes by inserting this line of code (and only this line) somewhere in the test: 
   Simple.findAll()

Yeah, seriously, this is one more of 'these' problems as well.

Note on Flush Mode:  As discussed in the previous post, and in http://jira.grails.org/browse/GRAILS-11687: Grails 2.4.3 Flush Mode behavior is very inconsistent. Line (3) in the above code snippet fails with Grails 2.4.3 so, in this case, GORM decided to use FlushMode auto-like behavior.  Examples from the previous post had GORM behave in manual-like fashion with Grails 2.4.3.  The flush mode configuration in both examples is the same, and if you decided to print the value on the current session it would print 'AUTO'.

Testing for such problems:  Concurrency problems are hard to test for.  I expect these to be handled for me by the framework as much as possible.  Complex concurrency issues may be a consequence of business requirements that my code is trying to implement;  I can handle these.  But concurrency issues should not be something the framework throws at me 'just because'.  As we have seen already (in post part 2), concurrency is not something Grails does well.  

This is simply one more example of a not testable problem.
Unit tests are a non-starter here, I may have some luck uncovering these with integration or functional tests if I know where to look.  But this is not very likely.

References:
http://jira.grails.org/browse/GRAILS-11701

Next Post:  My plan is to write one post about what it is like to work on a more complex project that uses Grails.   Something that goes beyond a simple CRUD application.  Instead of providing concrete code examples, I will simply describe some difficulties my team has encountered.   After that I want to come back to the discussion started in part 1: testing in Grails.