Software has two ingredients: opinions and logic (=programming). The second ingredient is rare and is typically replaced by the first.
I blog about code correctness, maintainability, testability, and functional programming.
This blog does not represent views or opinions of my employer.

Saturday, October 25, 2014

I don't like Hibernate/Grails part 11. Final thoughts.

This series was born out of my frustration with Grails. But, instead of making it a comprehensive criticism of the framework, I have decided to focus on a few GORM and Hibernate issues. I had several reasons to do that.

Why GORM/Hibernate focus?
There are quite a few blogs which basically say: Grails is very buggy and then provide few or no details. There are also many blogs saying Grails is great and then provide an equal amount of fluff to support their claim. All of this becomes very subjective.

(Section Edited for clarity, Oct 30, 2014)
It is not that hard to demonstrate that this is a very buggy environment. It has been founded on Groovy, and, in my experience, Groovy is and always was a very buggy language.  Here is one curious example (tested with Groovy 2.3.6, other versions I checked behave the same way):
  1 as Long == 1 as Integer //true (note, false in Java)
  1 as Integer == 1 as Long //true (note, false in Java)
  [(1 as Integer)].contains((1 as Long))  //false (inconsistent with equals!)
  (1 as Long) in [(1 as Integer)] //false

That is some scary stuff, right?  I think it is scary. This one is not very fixable,  but most Groovy bugs will eventually get fixed.  Hibernate bugs will be with us forever. This series was about bugs other people call functionality and sophisticated design.

Designing a good web framework is not easy.  I think the key is: the framework must be intuitive. It should do what developers expect to happen. That is one more reason for my focusing on GORM/Hibernate. It is hard to claim that these 2 are intuitive, but a similar claim becomes much more subjective in other parts of Grails.
Here is one example of non-intuitive behavior: how controller forwarding works. The intuitive behavior would be that the forwarded method executes in a separate thread. It does not, it executes as part of the calling method. Instead of wasting time on arguing whether this is a bad design, do this experiment: Create a controller with methods ‘a’ and ‘b’ and have 'a' forward to 'b'.  Add a filter which simply prints the action name in ‘before’ and ‘afterView’.  Here is what you will get with Grails 2.4.3:
    before a,  before  b,  afterView b,  afterView b 
(both afterView print the same action name!).  Would it not be nicer if we got:
    before a,  afterView a,  before b,   afterView b      
Confusing design + mutating state == bugs. But how can I argue that this is not just an innocent oversight on the part of Grails/Spring framework?

One of the most irritating aspects of Grails is that everything is so intermittent, and that is by design. The fail-fast philosophy is totally foreign to this framework. For example, if calling object.save() no longer always saves the object (yes, you are reading it right, see GRAILS-11797, GRAILS-11536) then would you not want object.save() to fail if the object is never going to be saved?  Again, focusing on GORM/Hibernate simplified my job of demonstrating examples with very intermittent behavior.
The uncanny ability to execute bad code in Grails goes way beyond what I was able to demonstrate. This is the very scary 'How did that ever work before?' thing. I suspect something is wrong with Grails/Groovy compilation and bad code sometimes magically works until some totally unrelated change exposes the problem. I cannot justify this claim. The only thing I have is anecdotal evidence.

I have very strong opinions about what the main causes of OOP bugs are. That does not mean you would have agreed with me. Without good examples, this blog would have been labeled as one more guy who thinks that 'FP is the new silver bullet'. Focusing on GORM allowed me to pinpoint the problems in a way that is hard to dispute. (And yet, I still got the label.)

Is it all about the cache?
All of the GORM problems listed in this series can be attributed in some way to Hibernate session/1st level cache.  You can argue that having cache is beneficial and that some problems are unavoidable with any cache implementation.
Ideally, caching should behave as if it was not there: application using a cache should work exactly the same way if the cache was removed. However, synchronization of ORM cache and DB state is a very hard problem and achieving ‘opaque’ implementation may be hard or even impossible.

Dear GORM/Hibernate:  If you can’t implement cache which behaves ‘like it is not there’ then don’t design your API like the cache ‘is not there’.  Make the cache very explicit and optional (Identity Map?). Invalidate cache immediately, or at least, provide a way for the application to learn as soon as you know that cached data is stale. Design invalidating (you like to call it session.clear()) your cache in a way that does not make half of objects used by the application useless.  Remember, you are just a cache, the data is still there!  The data is what is important, not you.  If you want to call yourself a cache, stop being so bossy!  :)
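
To make the 'explicit and optional' wish a bit more concrete, here is a minimal sketch of an identity map that the application owns and invalidates itself. This is plain Groovy, not GORM or Hibernate API: `IdentityMap`, `fetch`, `invalidate`, `loadUser` and the `db` map are all hypothetical names, with `db` standing in for a real database.

```groovy
// Hypothetical sketch: an identity map the application owns explicitly.
// Nothing here is Hibernate or GORM API; 'db' stands in for a real database.
class IdentityMap {
    private final Map<Object, Object> entries = [:]

    // Return the cached object for this id, or load it and remember it.
    def fetch(Object id, Closure loader) {
        if (!entries.containsKey(id)) {
            entries[id] = loader(id)
        }
        entries[id]
    }

    // Invalidation only forgets the entry; objects already handed out stay usable.
    void invalidate(Object id) {
        entries.remove(id)
    }
}

def db = [1: [id: 1, nickName: 'bob']]
// The loader copies the row, so cached objects are snapshots independent of 'db'.
def loadUser = { id -> new LinkedHashMap(db[id]) }

def cache = new IdentityMap()
def first = cache.fetch(1, loadUser)
assert cache.fetch(1, loadUser).is(first)   // same identity while cached

db[1].nickName = 'robert'                   // somebody else updates the row
cache.invalidate(1)                         // the application learns the entry is stale
def second = cache.fetch(1, loadUser)

assert second.nickName == 'robert'          // fresh data after invalidation
assert first.nickName == 'bob'              // the old object is still a valid snapshot
```

The point of the sketch is only the shape of the API: the cache is visible in the calling code, and invalidating it does not turn previously loaded objects into landmines.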

More criticism of Hibernate
I must point out that I am not the only Hibernate hater. Here are some examples:
http://www.slideshare.net/alimenkou/why-do-i-hate-hibernate
http://mentablog.soliveirajr.com/2012/11/hibernate-is-more-complex-than-the-problem-it-tries-to-solve/
http://brian.pontarelli.com/2007/04/03/hibernate-pitfalls-part-2/

In my blog, I have just scratched the surface.  Some examples of 'bug generating' design that could use more discussion include: the default and recommended 'flush:false' in save() (maybe a moot point now), or what happens when a Hibernate session closes suddenly and unexpectedly (like, when transactional code fails).
I need to stop somewhere and this post seems a good place and time to stop.

Sneaking-in FP and other things that interest me
I decided that complaining about Grails not working right is much less powerful than pointing out why it does not work right.  I think analyzing and criticizing bad programs is a great way to advance programming skills. With each installment I tried to sneak-in some concept that explained why bad is bad: properties, shared state/side-effects, fail-fast, unit testability, even a bit of combinators. These are all common sense things that explain design flaws.

One thing I could not fit into my posts was types.  I decided that this will be too foreign concept in the context of Java and Groovy. Types are very powerful and I regret not finding a good place for them in this series.

Is FP the ‘new’ silver bullet?
I was asked this question and it makes sense that I try to answer it.
FP is not new, FP predates OOP,  Haskell is older than Java, combinatory logic is older than Turing machines, lambda calculus is about the same age.  The silver bullet is and probably always was the ability to logically reason on the code.

In this series, I tried to emphasize the importance of logic.  Programs should be logically simple and ‘mappable’ to logic. Programming and Logic are very related on a theoretical level (google: Curry-Howard).  These 3 are called the trinity of CS:  Type Theory, Category Theory and Proof Theory (google: Curry-Howard-Lambek).

Programs I write using Groovy and Grails may look straightforward and shorter than Java and Spring but from the point of view of logic these are still just spaghetti threads of programming instructions. Add to it a total disregard for side-effects and this exercise becomes equivalent to building a house of cards on a foundation that is shaking.

Quiz Question:  Recalling Logic 101, here is a logical ‘formula’:
    (a^b)=>c   ⇔   a=>(b=>c)
Do you know/can you figure out what that corresponds to in programming?  Answer at the end of this post.

I used to love Grails
I started this blog site with a series about 'Imperative curlies'.  My idea at that time was that I can count the number of curly braces ({}) in my code and use that number as a measure of how good my code is. The fewer 'curlies', the better the code.  The idea was to break away from coding and thinking using imperative sequences of instructions (for loops, if statements all use 'curlies').  I remember it worked very well for me. If you look at these old posts, you will see that there was a time when I really liked Grails.

Is all OOP bad?
I think that is a complex question.  Good OOP is about things like decoupling, separation of concerns, eliminating shared state, meaningful polymorphism, etc. These things may achieve some of the same goals FP is fighting for. The concept of a shared session state (Hibernate session) is not very OO.  Hibernate StatelessSession interface which does not extend Session is not a great example of OO polymorphism. Ability to decouple is mostly gone due to Hibernate non-localized side-effects. Hibernate is simply not a good OOP.

What any OOP will always lack is this: a clear and simple correspondence to logic. This is what makes FP unique.

Parting thought:
Answer to the Quiz Question:  It is currying. To see it, compare these 2 lines:
   (a^b)=>c              ⇔    a=>(b=>c)
   (a,b)->c              ⇔    a->(b->c)
   (2 argument function)       (function returning a function) 
First line is the logical formula. I have changed arrow-like symbols '=>' to look slightly different '->'. I have replaced '^' with ',' and ended up in FP!  This process is a mini-Category Theory in action. Cool, is it not?
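
The same correspondence can be tried out in Groovy itself. The snippet below is a small sketch; `addPair`, `addCurried`, `addTwo` and `uncurried` are names I made up for illustration, while `curry` is the standard method on Groovy closures.

```groovy
// Two-argument function: corresponds to (a ^ b) => c
def addPair = { int a, int b -> a + b }

// Function returning a function: corresponds to a => (b => c)
def addCurried = { int a -> return { int b -> a + b } }

assert addPair(2, 3) == 5
assert addCurried(2).call(3) == 5

// Groovy's built-in Closure.curry fixes the leading argument of a closure.
def addTwo = addPair.curry(2)
assert addTwo(3) == 5

// Going the other way: rebuild the two-argument form from the curried one.
def uncurried = { int a, int b -> addCurried(a).call(b) }
assert uncurried(2, 3) == addPair(2, 3)
```

Both directions work, which is what the '⇔' in the formula promises.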

There is no helping it, Groovy and Grails are OOP not FP. It is still important to be able to think outside of that box. Otherwise we will start convincing ourselves of things like ‘static definitions are always bad’, ‘unit testing is not about finding bugs’, ‘using refresh() resolves stale object problems in Hibernate session’,  or some other nonsense.

Thinking in C++, Thinking in Java: it is worth trying to stop it, even if you program in these languages.

Grails is and will be a very popular and buggy framework. We can only blame ourselves for that. Writing this series was a big effort for me. My biggest hope is that it made some of my readers stop for a moment with a 'hmm'.

The End (for now).

(I ended up republishing this blog due to some weird formatting issues -  if I created chaos in your RSS/Atom feed - sorry!)

Saturday, October 18, 2014

I don't like Hibernate/Grails part 10: Repeatable finder, lessons learned

Repeatable finder (concurrency issue described in part 2) is what started/motivated this series. I had hoped that this issue would draw some reaction from the community. It did not. Why? Tallying up all the answers/responses from the last 2 months amounts to: 6, none of them useful or even correct. What does that mean?

In this series I tried to sneak in some things that interest me like FP and logical reasoning of code correctness.  I will use repeatable finder problem as a way to sneak in a bit more of this stuff later in this post.

Denial isn't just a river in Egypt?
This has been a twilight-zone.
To refresh your memory:  if more than one query is executed in a single Hibernate session and the result sets intersect then the second query returns a weird combination of old and new data.  That can break the logic in your code,  for example:
       Users.findAllByNickName('bob')

can return records with nickName != 'bob'. Other things can go wrong too: Maybe you have used a DB unique key to define equals()?  Or maybe you have used a DB unique key as a key in a Map? Any of this could go very wrong.
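
The mechanism can be imitated without any Hibernate at all. The following is a simulation only, under my own simplifying assumptions: plain Groovy maps stand in for the database and for the session's first-level cache, and `FakeSession` is a hypothetical class, not a Hibernate or GORM type. It only mirrors the behavior described above: the query condition is evaluated against current data, but rows already loaded in the session are returned from the cache.

```groovy
// Simulation only: plain maps stand in for the database and for the
// session's first-level cache. No Hibernate or GORM classes are involved.
class FakeSession {
    Map<Integer, Map> db                   // shared "database" rows, keyed by id
    final Map<Integer, Map> cache = [:]    // per-session cache, keyed by id

    // The condition is checked against current DB rows, but any row already
    // cached in this session is returned from the cache instead.
    List<Map> findAllByNickName(String nick) {
        def matching = db.values().findAll { it.nickName == nick }
        matching.collect { row ->
            if (!cache.containsKey(row.id)) {
                cache[row.id] = new LinkedHashMap(row)
            }
            cache[row.id]
        }
    }
}

def db = [1: [id: 1, nickName: 'alice'], 2: [id: 2, nickName: 'bob']]
def session = new FakeSession(db: db)

// First query in the session caches row 1 as 'alice'.
assert session.findAllByNickName('alice')*.id == [1]

// Another user renames row 1 to 'bob' and commits.
db[1].nickName = 'bob'

// Second query: the DB matches rows 1 and 2, but row 1 comes from the cache.
def bobs = session.findAllByNickName('bob')
assert bobs*.id == [1, 2]
assert bobs*.nickName == ['alice', 'bob']    // a 'bob' query returned 'alice'
assert !bobs.every { it.nickName == 'bob' }
```

Note that neither query is wrong when looked at in isolation. The surprise comes only from the combination, which is why it is so hard to pin to one line of code.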

At first, I thought that the issue must be well known and I am missing some way of handling it. This, unfortunately, is not the case. Very recently, I came across this blog from 2009: orm-sucks-hibernate-sucks-even-more
"... take a look how even a silly CRUD application would suffer, once you've got "not-very-recent" object from the session"
That quote points to a (now non-existent) page on the Hibernate website. Did we know more in 2009 than we know now?  If we did know, why have we allowed this issue to stay unresolved? Well, this is all speculation.

I tried my best to do 2 things:  make the community aware and persuade Hibernate to fix it. I have failed miserably on both counts. Here are the results of my efforts (as of Oct 17, 2014, tallied a bit over 2 months after I started my crusade):
  • post part 2:  effectively no replies, but over 1100 reads.
  • Grails JIRA: incorrect comments and then ignored
  • Hibernate JIRA:  rejected (works as intended) with suggested work-around which is incorrect
  • Stack Overflow question:  a whopping +4 score (started at -1) and a bunch of incorrect or meaningless answers
  • Grails forum: 0 replies
The Hibernate ticket was the weirdest experience.  It got rejected very fast (not a bug) with a comment to just use refresh().  After I pointed out that this workaround is total nonsense, I was sent to read some completely irrelevant documentation about concurrency.  After that, my (and Tim's) comments have been ignored.

What can I conclude from these 3 facts?:
  • nobody seems to know how to resolve or even work-around this issue
  • experts provide advice that is incorrect
  • there is no interest in solving, discussing or even acknowledging it as a problem
I do not know, but probably nothing good. I think it is interesting to try to puzzle out the few responses that the problem did generate. I will try to do that here.

The replies I got from the experts fall into 2 categories. The first category are answers like this:
  • It is any ORM issue
  • Any database application will have an issue like this 
It is true that the issue can be resolved with DB locking.  In particular, I could prevent repeatable finder by having all HTTP requests wrapped in long transactions and configuring higher (repeatable read) isolation level. Indeed, it is a big framework design failure, if we need to resort to things like this.
It is NOT true that any ORM and any database application will have this problem.  The most likely explanation for this type of response is that developers do not think about side-effects. They see Hibernate query and think of a SELECT statement only.  If I see a problem, it must be from the SELECT, where else would it come from?  This is consistent with the point I tried to make in my previous posts.

The second category are answers that suggest using refresh() or discard() to fix the problem:
'To fix your problem'
  • add refresh() to your code
  • or:  add discard()/evict() to your code
My first reaction was: Grrr, my second:  Hmm.  If I could only continue this conversation I am sure it would go like this:
Me: Where do I add these?  Expert's Reply: Add them where you have that problem. 

If you have been following various Hibernate discussion forums, you must have noticed that the same type of advice (either to add refresh() or to add evict()) shows up very frequently. This advice is never right.

Grails and Hibernate experts:  I am very disappointed in you.

Add refresh()...  add evict()... Thinking in Hibernate.
(Here is where I sneak-in some interesting stuff.)
This is how we typically reason about our code: the problem is on line 57 because variable xyz is ... and then on line 89 we do that..., and then on line 127 we have an if statement that goes like that...
We reason about our code by examining chains of programming instructions.

This is called imperative thinking and imperative programming.  If you read my previous posts you may assume that I consider such programs not logical. They are logical, only the logic is very cumbersome and complex.
A well designed OO program is where lines 57, 89 and 127 are all in the same class and the chains of instructions we need to examine are relatively short.  In a procedural program lines 57, 89, and 127 can be anywhere and chains are long.  Badly designed OO programs behave like procedural programs.

Repeatable finder is a great example where imperative reasoning fails.  The problem is not something between lines 57 and 89 or something on line 115.  The problem is (or can be) anywhere.
Answer 'add refresh()' or 'add discard()' is a very imperative thinking: it assumes I can add it on line 89.  (I can only conclude that this expert advice is not to sprinkle refresh() all over my code just for fun ... and because my code will run too fast without it.)

So what is the alternative?  The idea is to think about a block of code in a way that can prove certain behavior of that block. If we know that a code block 'A' exhibits the same behavior no matter where it is placed or how it is used, then we no longer need to think about lines 57, 89, and 127 or about chains of computing instructions.

This is called declarative thinking and programming.  Logical reasoning is now simplified, I no longer need to follow chains of programming steps to reason about the correctness.

Declarative thinking works great, except, if I cannot trust any property, even that:
    Users.findAllByNickName('bob').every{it.nickName == 'bob'}
then I am stuck.
That may sound like a limitation of declarative programming:  I can still keep going using imperative approach. That is true, and we all 'keep going'.  I did not stop programming my project and no, I did not add refresh() all over my code. That is why our applications are so buggy: we ignore logical problems unless we can pin them to line 127.

Side Note: The fun starts when I start combining my declarative code blocks into bigger blocks. Code needs to be logically composable.  I want a bigger block (composed of smaller blocks) to have properties too. Some like to call it programming with combinators.
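
Here is a tiny sketch of what 'bigger blocks with properties' can look like. The closures `normalize`, `normalizeAll`, `dedupe` and `cleanNames` are made-up examples; `>>` is Groovy's standard left-to-right closure composition and `unique(false)` is the non-mutating variant of `unique`.

```groovy
// Two small blocks, each with a property that holds for any input.
def normalize = { String s -> s.trim().toLowerCase() }      // result is trimmed and lower case
def normalizeAll = { List<String> xs -> xs.collect(normalize) }
def dedupe = { List<String> xs -> xs.unique(false) }        // result has no duplicates, input untouched

// A bigger block built by composition: normalizeAll first, then dedupe.
def cleanNames = normalizeAll >> dedupe

def input = [' Bob', 'bob ', 'ALICE']
def result = cleanNames(input)
assert result == ['bob', 'alice']

// The composed block keeps the properties of its parts...
assert result.every { it == it.trim().toLowerCase() }
assert result.size() == (result as Set).size()
// ...and, because neither part has side-effects, the input is unchanged.
assert input == [' Bob', 'bob ', 'ALICE']
```

None of these asserts depend on where `cleanNames` is called from or on what ran before it. That is exactly the guarantee a finder with hidden session state cannot give.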

Conclusions:
I would like to suggest this as a new rule of thumb: 
  the answer to use Hibernate/GORM refresh() or evict()/discard() is wrong regardless of the question.
(with exception of Functional Tests - which may need to refresh some records used in asserts).  Please comment below if you find a counter example to this rule.

I am not claiming that I know the solution to Repeatable Finder.  Maintaining Hibernate cache synchronized with the DB is hard, maybe impossible.  One way of dealing with hard problems is: make them somebody else's.  If GORM/Hibernate just told me when the query is lying (returns stale data even if it has new data) or allowed me to request/configure the query to refresh all records... That would go a long way.

It looks like the community has decided to not acknowledge Repeatable Finder as a problem. There is really no good solution for it and acknowledging it would be admitting to that fact. This issue is likely to remain unsolved and ignored. More complex Grails apps are doomed to work incorrectly under heavier concurrent use.

I have added a label to my posts (which does not work so do not click on it - and that convinces me that blogger must be using Hibernate):  'Stop thinking in C++ and Java'.  I think we need to stop thinking imperative or at least stop thinking only in imperative terms.

Next post:  I need to do one more to wrap-up. I will be finishing this series next week.

I have a busy period ahead of me.  I started going over a set of courses published online by University of Oregon (Oregon Programming Languages Summer School) and that will be many hours of not very easy listening and learning.  I also have to start preparing/training for a ski camp in early November (I live in CO and skiing has started here already).  No, this will not be a SKI calculus camp;) - but then, believe it or not, technical skiing is (or should be) a fun intellectual activity too.

Friday, October 10, 2014

I don't like Hibernate/Grails part 9: Testable code

"I get paid for code that works, not for tests, so my philosophy is to test as little as possible to reach a given level of confidence". Kent Beck.
Finding more problems with less effort is something I totally agree with.  Which approach to testing will give me the most bang for the buck?  I like to think about tests in a pragmatic way: I write tests to find bugs and guard against bugs. I consider manual testing rather ineffective and, in most cases, inferior when compared to automated tests.

Testing seems to be deeply related to my last two posts.  It is obviously related to software correctness. Less obviously: our approach to testing impacts how we perceive the framework, how we test can explain why we like or don't like Hibernate and Grails.

I have moved away from unit tests in Grails.  My approach to testing is an 'inverted pyramid'. I test internal implementation details using Grails integration tests and I write a lot of functional tests.  My tests are less 'unit' than you may like them to be (they interact with an actual database, mocking is replaced with data setups) but are closer to reality. This post explains my reasoning behind this decision.

I am considering the unit/integration/functional division from the 'testing philosophy' point of view.  I care less if tests are placed in test/unit, test/integration or test/functional folder.  I will stay away from the hate-love TDD debate.

Testing choices: (Just to make sure we are on the same page.)
Testing spectrum has these 2 extremes with very different characteristics:
(1) Unit Tests:  testing expected behavior; testing in a fake environment; testing internals not visible to the end users 
Also: white box testing; bottom-up testing
I will test parts (units) of my software in isolation from anything else.  I will 'mock' interaction with the rest of the system.  Since I know exactly what can go wrong, I can write mocks that exercise the tested part under a specific situation.
(The mocking aspect is specific to OOP, nobody writes mocks in FP... with that said, there is this great book:  http://en.wikipedia.org/wiki/To_Mock_a_Mockingbird ;))

(2) Functional Tests: testing for unexpected problems; testing in a real environment; testing functionality visible to the end users 
Also: black (more gray than black - I need testing 'hooks') box testing; top-down testing
I will test my application as a whole.  I identify a list of specifications and I write tests to verify that my application works correctly with respect to these specifications. I do not need to know everything that can go wrong, I assume that since I covered a comprehensive range of data and scenarios representing actual software usage, most of what can go wrong will be uncovered.

Integration tests fall between 1 and 2.

In this post I argue that testing in Grails needs to focus on unexpected problems, and should be done in a real (or close to real) environment. That moves the testing away from unit and towards integration and functional.

Why Unit Tests are Great:
  • execution speed: no need to startup the whole infrastructure, etc
  • good coupling: the tested unit becomes married to the unit test, this can be a happy, strong marriage that is going to survive ups and downs of refactoring.
  • atomic/unit nature: the idea is to make a perfect whole from perfect units. (Except, that principle is logically flawed in OOP.) This is sometimes called bottom-up testing. 
  • unit tests can aid software design and coding.
  • aggressive conditions: once you know what kind of problems to expect you can stress the code 'more aggressively' introducing scenarios which are rare or complex to set up using integration or functional tests.  This is rarely done.
This post is not a criticism of Unit Tests.  It is a criticism of how they are used.
Where is the most bang?
Some people disagree with me when I say:  What really needs testing is side-effects. Errors caused by side-effects are the hardest to troubleshoot and fix and are often very intermittent.  Side-effects limit developer ability to do logical reasoning on the code.  Side-effects can be very confusing.
There are many books and articles about testing and side-effects have not been discussed much.  Why is that?

Case study:  This example is repeated from my last post: 
    def users = User.findAllByOffice(office1) //code (A)

Assume user has userName (with unique constraint),  office (of type Office) and userPreferences (of type UserPreferences). This code:
  1. will issue a SELECT statement (with whatever locks) 
  2. can save some changed objects to the database (part 5)
  3. can save some unchanged objects too  (part 6)
  4. will impact some record types returned by queries that follow it (from proxies to actual objects).  Some records returned from this query will use Hibernate Proxy to implement preferences and some will use the actual UserPreferences class. (Some developers may be surprised that the same goes for the office association). (part 4)
  5. will impact the data content of some records returned by queries that follow it.  Similarly, the content of some records returned from this query may be different from the data returned by the underlying SELECT statement. (part 2)
Can simply adding line (A) break my code?  Clearly it can!

Side Note 1: Why are side-effects confusing:  There is this theory that human memory and reasoning work by 'chunking'.   The idea is that the human brain stores knowledge in chunks.  Each chunk gets a label which works something like a DB index. Human brain can recall the whole chunk using that label. There is this very prominent chunk we have all formed:  SQL SELECT query. When you look at code (A) you inadvertently call for that chunk.  Yet 4 out of 5 side-effects associated with that code have nothing to do with 'SELECT' chunk your brain has just found.

Side Note 2: Why are side-effects not logical:  Maybe a better term would be: not logic-friendly.  I have dedicated large parts of this post to explaining why, but I will add a quick high-level explanation from a slightly different angle.  You can safely skip this side-note if you are allergic to academic CS.

Logic needs connectives, to start with, it needs conjunction ('and', '^'). Logical conjunction follows this reduction rule:
  if A is true 'and' B is true then B is true.
You can think of this as one of the axioms. Logic cannot start without it. The programming equivalent of this is the 'beta reduction' rule which looks more or less like this:
  second (A, B) '=='  B
and means: if I do computations A and B (compute a pair) and then ignore A, then this should be equivalent to just computing B.  This is obviously not true if A has side-effects impacting computation B! Side-effects inflict a mortal wound to the (straightforward) correspondence between logic and programming.
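
A small sketch of the rule and of how a side-effect breaks it. All names here (`second`, `pureA`, `pureB`, `impureA`, `impureB`, `counter`) are invented for the illustration.

```groovy
// 'second' takes a pair of computed values and keeps only the second one.
def second = { a, b -> b }

// Pure case: dropping A changes nothing.
def pureA = { 1 + 1 }
def pureB = { 40 + 2 }
assert second(pureA(), pureB()) == pureB()

// Impure case: A mutates state that B reads.
def counter = [value: 0]
def impureA = { counter.value += 10; 'done' }
def impureB = { counter.value + 1 }

def withA = second(impureA(), impureB())   // A runs first, so B sees value 10
counter.value = 0                          // reset the shared state
def withoutA = impureB()                   // B alone sees value 0

assert withA == 11
assert withoutA == 1
assert withA != withoutA                   // second(A, B) is no longer equivalent to B
```

In the impure case the result of A is thrown away, yet A still changed what B computes. Replace `impureA` with a GORM query and `counter` with the Hibernate session and you have the situation from the case study.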

The above paragraph may as well have been copied from the first pages of an introductory Type Theory book. Keeping side-effects on a tight leash allows one to recover the correspondence to logic, but this is no longer first-pages material (google: 'Hoare Type Theory'). A comment by Mark, on my software correctness post, used the term 'effect' (contrast it with the more unruly 'side-effect', google: 'lax logic', 'monads in lax logic' or 'Effect System').
With unruly 'anything goes' side-effects, logical reasoning on a program becomes very, very complex and much less potent, and the Type Theory-Logic correspondence is lost. I am trying to learn this stuff but even at my current newbie level: it is eye opening.

If you decided to skip the last side-note this is a good point to return:
Welcome back!  You can ignore all of this academic mumbo-jumbo. Just be aware that whatever voodoo your brain does to form judgments about your GORM code (or any other code with unruly side-effects) should be a suspect. This voodoo is far from any straightforward logic and it is easy to get things wrong. Hopefully, I managed to scientifically convince you that: 
  • Side-effects need extra attention when testing
  • This will not be: 'testing expected behavior'
Here is a more pragmatic argument:  As I explained in my last post: side-effects 2-5 are a no-show for very simple CRUD applications and are not that frequent for simple CRUD apps.  The question is: are you satisfied with your app if it works 95% of the time?
The point is: complex Grails apps or apps that strive for more than 95% correctness need to make serious attempt to test side-effects 2-5.

Unit Test Ostrichism
Keep your nose out of trouble, and no trouble'll come to you. - The Lord of the Rings movie
I have been scratching my head asking: why my current project is finding so many issues that nobody else has reported.  One possible answer is:  everyone else has been unit testing and 'assuming away' the reality.  If that is true, then maybe everyone else has also decided that Hibernate is OK.

Here is a question for you:  When writing unit tests for code that includes GORM queries (assume something similar to code (A)), do you use mocks to verify that:
  • the query does not save any objects? 
  • your code works correctly even if the query returns User object with office2 != office1?
  • your code works correctly even if the query returns objects that violate DB enforced constraints (such as more than one user with the same userName)?
  • your code works correctly not only on actual domain objects but also on hibernate proxies? 
I do not. I decided that the amount of work needed to write tests like this would be prohibitive.  But if you disagree please drop me a comment!

Here is another question:  The above bullet list spells out some impacts of side-effects 2-5. This list is not complete.  Can you think of other examples?
I know that I do not know all impacts.  I am not even sure if I understand all GORM side-effects. Testing for expected problems almost guarantees that I will not find what I do not know.

Code with non-localized impacts is not testable.
OO code needs decoupling.  In a well written OO code, state mutation in object A does not impact other objects. OO software designed without decoupling is not testable.  There is a technical term for it: spaghetti.

Almost every post in my 'I don't like' series has shown an example of GORM code where the behavior changes (even breaks) when an isolated GORM query is added to or removed from the code. Hibernate queries create impacts that cannot be easily localized to one or few classes.

How can GORM and Hibernate defend their design as testable?

Fail Fast and Test Easy:
This is something very much missing in Grails.  Very often an incorrect code is likely to work just fine 50% maybe even 80% of the time. Data changes or a query is added or removed somewhere and things break.
Most of my posts in this series pointed out examples like this (see part 3 and part 6).  Ideally, incorrect code should either fail to compile or should fail during my first attempt to execute it.  This is often not that easy to accomplish with languages like Java or Groovy, but there is just no excuse for, for example, allowing me to use a GORM object obtained using Hibernate session S1 within session S2.  If such code fails intermittently it should ALWAYS fail.
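
To show what 'ALWAYS fail' could mean for the S1/S2 case, here is a hypothetical sketch. `StrictSession` and `Owned` are invented for this illustration. Hibernate does not work this way, and I am not claiming it could be retrofitted this easily. The only idea shown is that an object remembers which session produced it and any other session rejects it on first use.

```groovy
// Hypothetical sketch of fail-fast session ownership. 'StrictSession' and
// 'Owned' are invented names; this is not how Hibernate or GORM behave.
class StrictSession {
    final String name
    StrictSession(String name) { this.name = name }

    // Wrap a value so that it remembers which session produced it.
    Owned load(Object value) { new Owned(owner: this, value: value) }

    // Every read goes through a session, which rejects foreign objects at once.
    def read(Owned o) {
        if (!o.owner.is(this)) {
            throw new IllegalStateException(
                'object loaded in session ' + o.owner.name + ' used in session ' + name)
        }
        o.value
    }
}

class Owned {
    StrictSession owner
    Object value
}

def s1 = new StrictSession('S1')
def s2 = new StrictSession('S2')
def user = s1.load([nickName: 'bob'])

assert s1.read(user).nickName == 'bob'   // same session: fine

def failed = false
try {
    s2.read(user)                        // wrong session: fails on the very first attempt
} catch (IllegalStateException e) {
    failed = true
}
assert failed
```

With a rule like this, the mistake shows up in the first test run instead of intermittently in production.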

Without fail-fast philosophy unit tests are not worth much.

Unit Testing and FP:
"Writing unit tests is reinventing functional programming in non-functional languages" (Christian Sunesson - on github).  (I found the linked post a very interesting reading).
Here is a mental exercise:  when reading the following blog about 'testable code': http://googletesting.blogspot.in/2008/08/by-miko-hevery-so-you-decided-to.html , think how each of the guidelines relates to FP. Notice they are all N/A!
The term 'testable code' is OOP self-admitting to its limitations (my OO code is not testable unless I follow this list of rules...). FP code does not need any tweaking or special guidelines to be testable.

In many respects Hibernate design is the opposite of FP. If one is very unit testable the other one probably is the opposite of testable.

Not just Hibernate:
The Grails/GORM/Hibernate stack is complex and has some complex bugs. So does Groovy. I know that lots of my app functionality will break next time I upgrade Grails. Is testing for 'expected' problems the best investment in this environment?

Conclusions:
Grails contradicts itself on unit testing:  the Grails framework provides rich tools to unit test domain objects, only these tests will be rather useless. I think this confusion is not something that Grails has introduced. The Java community in general does not perceive side-effects as something to worry much about. Unfortunately, perception != reality.

More on the confusion:  it should be clear by now that the blame for a lot of this resides with how the Hibernate session works. Non-localized side-effects are really hard to test.  With that stated, I would expect each JUnit integration test in Grails to run in its own session.  The very idea of several tests sharing the same Hibernate session is repulsive.  Take a look at these JIRA tickets: GRAILS-11644, GRAILS-11706.

I find Grails' attitude towards testing insanely confusing.

Testable Code
Testable code should be defined as: code that makes unit tests effective in bug prevention.
This statement is not something everyone agrees with. Unit tests are often viewed as a way to aid the design and coding process (TDD) and this becomes their main purpose in life.  
"Unit testing is not about finding bugs", see Writing Great Unit Tests. Don't you agree that something is very wrong with this sentence?  Instead of saying "unit testing is not about finding bugs", maybe we should rethink how we write the tested code so it is?  How about: start paying attention to the side-effects?

If you are doing something over and over and it does not work you have 3 choices:
  • keep on doing it and expect it will work (called insanity, also known as the Grails approach to unit testing)
  • keep doing it and null out your expectations ("unit testing is not about finding bugs")
  • change the way you do it (redesign the tested code or/and the way you test).
I am suggesting the 3rd bullet is the way to go.  I cannot redesign Grails but I can rethink how I test.

Functional Test Testable
I like Geb. Writing good functional tests is not trivial and, like with unit tests, it impacts the design of the tested code. The term 'testable code' needs to be extended to functional tests.  ... it may be an idea for a long post somewhere in the future.

Next Post:
I want to start wrapping up my 'I don't like' series.  In my next post, I plan to give an update about the repeatable finder issue (post 2) and include a few final thoughts.



Friday, October 3, 2014

I don't like Hibernate/Grails, part 8, but some like Hibernate and Grails. Why?

Small change in plans. I wanted to write about testing, but this topic logically precedes testing. Writing the last 2 posts made me realize something: each application is different, and the Grails/Hibernate problems I am likely to notice are very, very much dependent on the app I am working on.

Simple App:  Think of something you can generate by asking Grails to do the coding for you. Simple app is a CRUD application with these characteristics:
  • Simple domain Objects without relationships
  • No transactions/services
  • Not more than one hibernate query per request
  • Simple validation logic (contained in domain objects)
This example repeats part of my last post.  Assume that I have a domain object User which looks similar to this:
    class User {
        ...
        Office office
        UserPreferences preferences
    }

In a simple app this code:
    User.findAllByOffice(office1) //Code Example (A)

will exhibit only one type of side-effect: it issues a SELECT statement (plus some DB locks).

Complex App:  In my last post I listed several side-effects that can be associated with (A). Clearly, complex apps are a different ball game!  Here are the side-effects listed again.  Code (A)
  1. will issue a SELECT statement (with whatever locks) 
  2. can save some changed objects to the database (part 5)
  3. can save some unchanged objects too  (part 6)
  4. will impact some record types returned by queries that follow it (from proxies to actual objects).  Some records returned from this query will use Hibernate Proxy to implement preferences and some will use the actual UserPreferences class. (Some developers may be surprised that the same goes for the office association). (part 4)
  5. will impact the data content of some records returned by queries that follow it.  Similarly, the content of some records returned from this query may be different from the data returned by the underlying SELECT statement. (part 2)
Side-effects 2-5 have very non-local impacts (they can affect unrelated parts of code sharing the same hibernate session) and can be extremely hard to avoid.  This leads to code that is unpredictable.  So, how is it possible for Hibernate to be so popular?

Counting the number of side-effects in (A) that impact my code is just one possible measure of my app complexity.  As my application becomes more complex, these side-effects become 'louder' (impacts become more frequent and more noticeable).  In this post, I am less interested in how many side-effects impact my code, more in how 'loud' they are. 

The following table shows a fictional CRUD application that started very simple and became more complex over time.  
H5 - total number of side-effects in queries similar to (A)
H6 - frequency of problems related to queries similar to (A)



| # | Complexity | Exposure to Problems | Explanation of impacts | H5 | H6 |
|---|---|---|---|---|---|
| 1 | Simple App | Happy Days! | Happy Days! | 1 | |
| 2 | Add Domain Object Relationships | Hibernate Proxy Objects | | 2 | low |
| 3 | Add more complex validation - more than one query in the same session | Repeatable Finder | | 3 | low |
| 4 | Move validation logic outside of domain objects | Auto-flushing | | 4 | low-mid |
| 5 | Add Services and Transactions | LazyInitializationException. Auto-flushing becomes less loud (unwanted saves are rolled back). Transactional integration tests diverge from reality. | Auto-flushing can still be a problem: saving can fail. | | mid |
| 6 | More complex data passed from client | Unmarshalling of client data keeps changing between Grails versions | Small intro to version upgrade problems | | mid |
| 7 | Added logic in Filters | Higher probability of repeatable finder issue. Need for more complex test infrastructure. | Side Note: What is the before-after-afterView ordering if Controller does a forward? Potentially interesting filter cleanup or setup ordering issues. | | mid |
| 8 | Application managed hibernate sessions | DuplicateKeyException and similar hard to troubleshoot errors. Queries can save unmodified objects. | | 5 | high |

Row 8 needs more explanation.  Here is one example of why I need to create small hibernate sessions:
Integration tests may need to include scenarios mimicking activity performed over several HTTP requests.  Typically, I see such tests mashing all logic into one test method, executing everything in the scope of the same hibernate session.  In real life, each of the requests will be performed in a separate hibernate session. Tests have diverged from reality.  Side-effects 2-5 listed above will have a dramatically different impact on the code when run in the test environment.

Audacious App:
Consider Grails/Hibernate implementing Type 2 Slowly Changing Dimension (the one using time slices).  To do that, I may want to use something like a session variable (available in most databases, including Oracle or PostgreSQL) to define a time point and use read-only views to get a snapshot of all my data at the specified time point.  I will configure my GORM domain objects against these views and see what happens.

To use session variables, I will need to wrestle with Spring framework DB connection management to make sure that the connection is not swapped under me, because that would change the definition of session variable.  That is not trivial but doable.

How does that change my exposure to Hibernate side-effects?  Consider this: moving the time-point one day forward applies a day's worth of user activity in just a few milliseconds!  Many concurrency problems in the application just became super loud. That includes the repeatable finder (side-effect 5, my blog part 2).  Because of that I need to wrap each use of a time-point session variable with a withNewSession() block (or something equivalent).  That exposes my code to issues documented in parts 3 and 6 (side-effect number 3!).  All of these are now super loud.

Conclusions:
This is my best attempt to explain the discrepancy in perception of Hibernate and Grails. I think there is more going on that I do not understand, but this is my best answer to-date.

The 5 side-effects listed in this post are worth investigating more.  I will refer to them again when talking about testing (next post).

Saturday, September 20, 2014

My dream: software without any bugs ... and is Groovy functional? How about Grails?

This post is about a topic that absolutely fascinates me: completely bug free software (is that even possible?).  This may not be very easy reading, but I do hope you will stay with me to the end of this post.  I divided it into smaller sections, so you can do a quick skim and come back to drill into details.  The intended audience is Groovy developers. (But if you do not program in Groovy you may find a lot of this relevant too.)

This post was motivated by several things, some Groovy, some Grails, some Java, but one of the biggest motivators was Hibernate's rejection of this JIRA ticket: HHH-9367.  

Death, Taxes and Software Errors:

Software tests provide empirical evidence of software correctness.  With all the software projects humanity has completed, we also have (a much stronger) evidence that no matter how well tested, the software will have bugs.  Tests (assume 100% code coverage) stress the software under a fraction of possible combinations of things that impact it.  We write tests because that is the best thing we know how to do.

Is there any way to achieve software correctness other than testing? 

In one of my job interviews I was asked about my attitude towards TDD and testing in general, and I answered:
  Ideally, I would love to write code that does not need to be tested. 
As you can guess, I did not get that job.

Functional Programming:

"The Groovy programming language, since its inception, has always been pretty functional"
(http://glaforge.appspot.com/article/functional-groovy-presentation).  This is a very good and informative presentation. Growing level of interest in Groovy/Grails community in functional programming is a great thing.  I like the gradient, Groovy language is moving in the FP direction.

But ... there seems to be quite a bit of confusion about what FP is. This state of confusion is very normal for our industry.  Here is an example:  What is OO? Tim Rentsch (1982): "Every manufacturer will promote his products as supporting it. Every manager will pay lip service to it. Every programmer will practice it (differently). And no one will know just what it is".  These proverbial words can equally well be applied to Functional Programming, RESTful design and many other concepts.

So what is FP?  Some will say that FP is about using the new stream API in Java (are we confusing Functional P with Fluent P?), but open an introductory paper about FP and you are likely to be greeted with terms like Kleisli category (where is my category theory book?).  It is hard to find any middle-ground here.  I will focus on a better question: what is FP for?  The 'why?' question is often less confusing and easier to understand than the 'what?'.


Back to software without errors:

Fluff ends here.  'Correct software' means nothing until I spell out how exactly I expect the software to work.  I need to formalize this somehow.  I will use the concept of property to do that.
Property is simply a logical condition about the code.  Verifying a property is verifying that the code works as expected.
So how can I verify with 100% certainty that software satisfies a property?   For this demonstration, I need something that is easy to reason about:
Recursion:  I chose recursion as my tool of choice for this post because: it is a well understood concept and because it is very well suited to proving correctness (there is this Math 101 thing called mathematical induction).

Examples of Properties:  Here are just some examples to think about.
Example 1:  Groovy allows operator overloading, in particular for '+'.  What properties does '+' have?  Can I assume that
   (a + b) + c  == a + (b + c)
(which you would expect for '+' in algebra)?   The answer is NO.   It is OK for a developer to code whatever he/she pleases when overloading '+'.
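
A quick sketch of Example 1. The Avg class below is made up for this illustration (it is not from any library): its '+' averages two values, which Groovy happily accepts, and associativity is gone.

```groovy
// Hypothetical class for illustration: '+' is overloaded to mean "average".
class Avg {
    double v
    Avg plus(Avg other) { new Avg(v: (v + other.v) / 2) }
}

def a = new Avg(v: 1.0d)
def b = new Avg(v: 3.0d)
def c = new Avg(v: 5.0d)

def left  = ((a + b) + c).v   // ((1 + 3) / 2 + 5) / 2
def right = (a + (b + c)).v   // (1 + (3 + 5) / 2) / 2

assert left != right          // '+' is not associative here
```

Nothing about the '+' symbol tells the caller which of its algebraic properties survived the overload.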

Example 2:  If I only had a penny for each time when equals() and hashCode() have not been implemented consistently in a Java app:
      if a == b then a.hashCode() == b.hashCode()
Java documentation tells developers to do it, but they often don't.  This is part of imperative language culture:  polymorphism does not mean much.
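
Here is a small sketch of what breaking this property looks like in practice. The Name class is hypothetical and deliberately wrong: equals() ignores case while hashCode() does not.

```groovy
// Hypothetical class for illustration: equals() ignores case, hashCode() does not.
class Name {
    String value
    boolean equals(Object o) { o instanceof Name && value.equalsIgnoreCase(o.value) }
    int hashCode() { value.hashCode() }
}

def a = new Name(value: 'Bob')
def b = new Name(value: 'bob')

assert a == b                              // Groovy == delegates to equals() here
assert a.hashCode() != b.hashCode()        // the property from Example 2 is broken
assert [a].contains(b)                     // ArrayList only uses equals()
assert !(new HashSet([a]).contains(b))     // HashSet compares hashes first and misses
```

The same two objects are 'equal' in a list and 'different' in a set, and no compiler or runtime check complains.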

Example 2b: Another property related to Java equals() is this:
    if a == b and b == c then a == c
It is supposed to be always true. Is it?

Example 3:   Groovy has a method defined on collections called collect. It applies a closure to each element of a list returning a new list. This is often called map in other places.  Recall that Closures overload << to mean function composition.  Is there a property associated with these? Here is one:
  list.collect(c1).collect(c2) == list.collect(c2 << c1)
If I used fpiglet, I could write this in a much cleaner form:
  (map(c1) << map(c2)) (list) == map(c1 << c2)(list)
and, if it was possible to implement a meaningful equals for closures, we could simply say:
  map(c1) << map(c2) == map(c1 << c2)
In other words: map 'preserves' function composition.   Incidentally, there is a name for this property: it is called the second Functor law.
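
To see the property from Example 3 at work, here is a minimal check on one sample list. The closures c1 and c2 are arbitrary pure closures picked for this sketch; one passing sample is of course not a proof.

```groovy
def list = [1, 2, 3, 4]
def c1 = { int x -> x + 1 }    // pure: no side-effects
def c2 = { int x -> x * 10 }   // pure: no side-effects

// (c2 << c1)(x) is c2(c1(x))
def twoPasses = list.collect(c1).collect(c2)
def onePass   = list.collect(c2 << c1)

assert twoPasses == [20, 30, 40, 50]
assert twoPasses == onePass
```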

Properties are software requirements that are 'logic friendly'.  They typically have the form:
     For all [list of symbols]  [logical expression]
because 'For some' just doesn't cut it in logical reasoning.  Property testing is one of the things missing in OO programming. But more on this later.

Example 4 (Advanced):  Try formulating some of your application business requirements as a property. (Example: username has to be unique across non-retired users. Where would I formulate it: DB records, domain objects, JSON results, ...?)

What is wrong with Java equals():

I will not be able to go far with logical reasoning about source code if I cannot say that code A is equivalent to code B.  Consider this Groovy code:
 Closure c1 = {int x -> x+1}
 Closure c2 = {int x -> x+1}
 assert c1 == c2 //FAILS

c1 and c2 are logically the same function but == comparison between them fails.  Is this a bug in Groovy?  No, this is a general (any computer language) problem.  It is impossible to computationally verify that 2 functions are the same.  Programmatic == is not something that always makes sense.  (Language can be more logical than Java about this and if you attempt to use == where it does not make computational sense, the compiler could reject your code - Haskell does that.)
Also, see Example 2 and 2b,  to do logical reasoning we need something stronger, something that does not depend on a developer's whim.

I will use '==' to indicate logically equivalent code.  So 1 '==' 1 and c1 '==' c2.  Basically, A '==' B if I can guarantee that replacing code A with code B will not change the behavior of the program.


What is wrong with side-effects:

I want to be able to reason logically about the correctness of parts of my code.  This is close to impossible if my code has side-effects.  To be able to do formal reasoning I need my code to be predictable.  If c is a closure and x is its parameter, I would like to have this 'predictability' property:
    c(x) '==' c(x)
... basically if I call a function twice with the same arguments I expect the same result.

Example:  Consider this Groovy code:
    def i = 0
    Closure f = { x -> i++}
    assert f(1) == f(1) //FAILS

Things take a dramatic turn if I remove possibility of side-effects:  the above property has to be true.   Lack of side-effects makes programs behave in a very predictable way.  This is the key needed to do logical reasoning about source code correctness.

But we do need side-effects!  We need to write/read files, database records, sockets, etc.  If you think of predictability and logical reasoning as underlying goals for FP, the following are obvious conclusions:
  • Side-effects need to be somehow isolated/decoupled 
  • Side-effects need to be as explicit as possible (for example, in Haskell, compiler can distinguish between code that wants to mutate content of a file from code that wants to mutate content of a variable, both are isolated from each other).

What is wrong with OO:

Objects are meant to encapsulate both state and behavior.  This provides some level of control over state mutation but the mutating state ends up spread out all over the application. And there is typically a lot of it too. We cannot completely avoid side-effects but OO programming uses side-effects where they are not necessary. A good example to think about is the Java Beans architecture, where each property is a piece of mutable state.

Objects are not a good match for formal reasoning for several reasons (one of them is the internal state) and are not very composable.
So what is the alternative that is composable and allows for formal reasoning?  The answer is: function, not the old C function, but the even older mathematical function.

FP language definition: 

I will use a very strict definition of FP.  I will call language functional if programming in it is done using functions and the language can guarantee no side-effects (at least on parts of the application code).

This definition makes FP language code behave like symbolic calculations in math.  If x=1 and y=2 then nobody expects x or y to change because I wrote equation f(x) = g(y).  Functions behave like mathematical functions - hence the term Functional Programming.  And math is ... yeah, the science in charge of logical correctness.

For FP language to be useful we need the ability to create side-effects.  I simply assume that the language has the ability to clearly isolate such code. Anything without side-effects will be called pure and anything with side-effects will be called impure, and I assume the language to have a way to force separation between pure code and impure code.  Ability to do logical reasoning will be limited in the impure part.

There is currently only one commercially available language satisfying this definition:  Haskell.   To make room for the likes of Erlang, Clojure, Scala some people try to relax what FP language means (see FP style section of this post).
Side Note: Here is a cool thing to consider.  Functions need arguments.  To be able to write f(x) we need a concept of a variable x.  We can think about x as a special 'constant' function.  In FP language 'everything' is a function!  Well, almost everything.

Bug free proven code:

I will show that the '2nd functor law' property (Example 3) is true based on how map and function composition are implemented. I will do that purely by formal reasoning.

This is how map is coded in Haskell (I have omitted the type declaration, but this is full implementation code):
map _ []     = []
map f (x:xs) = (f x):(map f xs)
If you have never seen this code, it is worth spending some time thinking about it and puzzling it out. The syntax should be very intuitive.  I use it because it is super short.  Here is a quick explanation: '_' means anything (here any function); '[]' means empty list; 'x:xs' means a list with first element 'x' and tail 'xs';  'f x' means function f applied to x;  'map f xs' means map function applied to function 'f' and list 'xs';  and parentheses are used only for logical grouping.
(... If you still have problems parsing this code:  This is a recursive implementation.  Read it like so:
line1:  for any function (_) and empty list ([ ]) result of map is an empty list ([ ]);  
line2:  for function 'f' and list starting with 'x' and tailing with 'xs' the result is a list starting with 'f(x)' and tailing with (recursion) 'map f xs'.)  
You may be thinking: Stack Overflow.   I am not going into details, but that is not a problem.  We do not need to understand why and how stack overflow is prevented to move forward with this post. (The keyword here is: laziness.)
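
For Groovy readers, the two Haskell equations for map can be transcribed almost line by line. This is only a sketch: myMap is a name made up here, and, unlike the Haskell version, the Groovy closure is eager, so it can exhaust the stack on very long lists.

```groovy
// Groovy transcription of the two map equations; myMap is a made-up name.
def myMap
myMap = { Closure f, List xs ->
    xs.isEmpty() ? []                                     // map _ []     = []
                 : [f(xs.head())] + myMap(f, xs.tail())   // map f (x:xs) = (f x):(map f xs)
}

assert myMap({ it * 2 }, []) == []
assert myMap({ it * 2 }, [1, 2, 3]) == [2, 4, 6]
```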

We also need implementation code for function composition (which will be denoted with '.'). Here is the full implementation code, which had me break a sweat:
(f . g) x = f (g x) 
So, I want to actually prove that, for the above implementation code, the following line has to be logically true:
 map f (map g xs) '==' map (f.g) xs       (2FL)
for all functions f, g and any list xs.  (Side note: it would be cool to write LHS as: '(map f . map g) xs', this is called currying, and I am not going there in this post).

The Proof:
I will do that in 2 steps, first for empty list that is simple:
map f (map g []) '==' [] '==' map (f.g) [] 
(both equalities are true because of the first line of implementation of map).

Second, I will prove (2FL) for list (x:xs) assuming that (2FL) is true for xs (mathematical induction).
Using second line of implementation of map I have:
map f (map g (x:xs)) '==' map f (g x : map g xs) '==' f(g x): map f (map g xs) 
because of how function composition is implemented, I get this:
'==' (f.g)x : map f (map g xs) 
using inductive assumption gives me:
'==' (f.g)x : map (f.g) xs 
and again second line of implementation of map yields:
'==' map (f.g) (x:xs) 
Done!   I have proven it.   There is no logical possibility for a bug here!  This property is something we can trust to be always true.  So here we have it:  the strongest test ever written for a computer program.  If you use unit tests in your coding, think of this as a unit-proof!  Notice the power of declarative code: map implementation code is really a set of 2 'math formulas'. (Actually, you may have noticed that implementation of map is really a set of 2 properties).

So is writing proofs in the job description for future developers?  Should I be studying Bird–Meertens Formalism or something?  I would not mind if this was the case, but I do not think so.  I don't believe there is a commercially available language or a code analysis tool that can prove a simple property like this today, but tomorrow...

For very diligent readers:  Can something still go wrong with (2FL)?  I made an implicit assumption that the language itself will execute the code correctly.  The instruction sets on our CPUs are imperative, and a functional programming language needs to have an imperative implementation layer.  So the language itself can only be 'tested' for correctness.  Despite that limitation, I hope you agree that combining math with FP code leads to something quite amazing.

Exercise (only for the brave):  Prove that Groovy's collect() implementation code satisfies Groovy equivalent of (2FL).  Let me know when you succeed ;)

Exercise for the reader:  Where did I assume side-effect free code in the proof?  Come up with list a and closures f and g in Groovy so that:
  a.collect(g).collect(f) != a.collect(f << g)

Oh, NO!  And I was so excited that I have something I KNOW and can trust is always true. This is not a bug, Groovy works here as 'expected'. Yes, the 'brave reader exercise' was a booby trap, but the journey was the important part of that exercise. There are three issues:
  • it is very hard to formally reason about code that has side-effects
  • it is very hard to formally reason about imperative code even if you assume no side-effects
  • with side-effects, most of the properties that we can come up with will not be true or will need to be weakened.
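
One possible answer to the reader exercise (a sketch, many others exist): let g bump a shared counter and let f read it. The two sides of (2FL) call the closures in a different order, so they see different counter values.

```groovy
def a = [1, 2]
def counter = 0
def g = { x -> counter++; x }    // side-effect: bumps the shared counter
def f = { x -> x + counter }     // result depends on the shared counter

counter = 0
def twoPasses = a.collect(g).collect(f)   // call order: g, g, f, f
counter = 0
def onePass = a.collect(f << g)           // call order: g, f, g, f

assert twoPasses == [3, 4]
assert onePass == [2, 4]
assert twoPasses != onePass
```

Both closures look harmless on their own; the property only breaks because of what they share.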

FP as programming style:

Let us look at yet another example to see that side-effects mess up everything:  If I asked for a vote if the following property has to be true, I bet the overwhelming answer would be: yes it has to.
   list.findAll(predicate).every(predicate)  

Well it is not:
 def list = [1,2,3,4]
 def b = true
 def predicate = {x -> b=!b; return (b && x==2)}
 assert list.findAll(predicate).every(predicate)  //FAILS

This is something to ponder for a few moments:  We could make the above property work if we considered only predicates that have no side-effects.  But it is not possible for me to know if a Closure I use in my program has side effects or not unless I have access to the source code. This is one of the reasons why in imperative languages the term Functional Programming is used to describe something much weaker: a programming style.   Still, this style can be very powerful.  Language cannot verify that code is pure, but this knowledge can be established with code organization, naming conventions, etc.
  

QuickCheck:  Less than proving but more than traditional testing: 

There is a big synergy between writing functional code and writing unit-testable code, but OO Unit Testing is a bit different, harder to implement, and significantly weaker.  In the expression 'f(x)', 'x' does not have any behavior to test; you need to test 'f', not 'x'.  We can test properties of 'f'.  This is done by running a particular property against a large number of 'randomly' generated inputs.  Such 'random' generation needs to create a comprehensive set of data.  This method works very, very well in uncovering problems developers can fail to envision.  I think of this as something that gives me very high probability (very close to 1) that my code works, not just the empirical evidence that OO testing provides.

So, instead of actually proving the second functor law, we could have tried to throw lots of data at it.  That would mean lots of different lists and lots of different functions. Can functions be generated randomly?  Yes they can!  There is a fantastic test library (QuickCheck) that has been ported to various languages, but these ports are not as good as the original. There is a catch (as far as I know): you have to run the tests type by type (lists of integers, lists of doubles, lists of strings, etc).
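
To give a feel for the idea in Groovy, here is a hand-rolled sketch. It is not QuickCheck and not the API of any real library: it only generates random integer lists, while the closures f and g are fixed pure closures that I picked for the illustration (QuickCheck can generate the functions too).

```groovy
// Hand-rolled property check: random lists, fixed pure closures.
def random = new Random(42)          // fixed seed, so a failing run can be repeated
def f = { int x -> x + 1 }
def g = { int x -> x * 2 }

// The property under test: the Groovy form of (2FL)
def secondFunctorLaw = { List xs ->
    xs.collect(g).collect(f) == xs.collect(f << g)
}

def failures = []
200.times {
    def xs = []
    random.nextInt(20).times { xs << (random.nextInt(2001) - 1000) }  // 0..19 elements
    if (!secondFunctorLaw(xs)) failures << xs
}

assert failures.isEmpty()
```

Keeping the failing inputs around (instead of just a boolean) matters: the value of this style of testing is in the counterexamples it hands back.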

I will, again, throw a bit of Haskell code here to show the test implementation for the 2nd functor law. With properly declared 'f', 'g' and 'x', this is it: ('fmap' generalizes 'map' and works across many other types, not just collections)

... = (fmap (g . f) x) == ((fmap g . fmap f) x)
This is the full test implementation code.  Neat, is it not?
I wanted to show this code, because it demonstrates terseness at its best. Terseness that Groovy can learn a lot from.   For me the definition of readable code is: 
                    code that expresses the problem, not the solution.
This program is extremely polymorphic: this test can be run on lists of any type (that supports ==) as well as on other functors (whatever that means).

Code that just works:  

There is this belief among Haskell developers that if the code compiles it will work as intended (http://www.haskell.org/haskellwiki/Why_Haskell_just_works).  GHC is a very, very smart compiler, but a lot of this is not Haskell specific and it is true about FP in general.
FP has this almost unreal thing going for it:  if it 'type checks', it works.  If my code is wrong - most likely the type signatures are wrong.  I have experienced this even when writing fpiglet (which is not strongly typed and compiles with Groovy so I was the 'strong' compiler).

Conclusions:

I need to stop here because the likelihood of readers continuing on is probably diminishing rapidly. If you came this far with me: thank you! I hope the journey was worth the effort!  If you have not seen much of FP before, you probably have a headache, but I hope it is a good headache!

I have combined some very basic Math with a simple functional program and got (I think you will agree) some amazing results.  We have actually managed a formal mathematical proof of code correctness.  The FP-Math relationship is very strong and well established,  it is one of the things that makes FP what it is.  I think of that this way:  we want bug free software and there is this couple thousand years old science dealing with logical correctness.  Seems like a no-brainer to put these together ... and FP puts these together!

What is FP for?  That has many answers, including:  bug free code, concurrency safe code,  parallel computing, performance, extreme expressiveness and composability.  The underlying goals are predictability and logical reasoning.  Bug free code is to me the strongest reason for FP. Some claim that testable code is simply better code. If that is true, then 'provable' code is simply much, much better code.

So how do we write FP code?  Here are the common denominators:  Code needs to be declarative. Pure code (without side-effects) needs to be isolated from non-pure code (with side-effects).   Any side-effects need to be as explicit as possible.  Properties need to be identified and tested.

State without state:  Most side-effects in OO programming are caused by the need to handle internal object state. I am sure you remember this viral quote:
         'hypermedia is the engine of application state'.  
Yeah, we do not need that stinky HTTP session. Stop using it and state disappears, as do many bugs.  FP is a lot like that.  You can use partial application (curry in Groovy)  as 'an engine of state' but there are other 'engines of state' (reader or state monad, lenses, and more).
This is how you know that your code is functional:  you start using all these scary concepts some people talk about.
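
A tiny sketch of partial application used as an 'engine of state'. The names and the 20% rate are invented for this example: instead of an object holding a mutable rate field, the rate is fixed once with curry and the resulting closure stays pure.

```groovy
// Hypothetical example: the 'state' (rate) is baked in by partial application.
def addTax = { BigDecimal rate, BigDecimal price -> price + price * rate }
def addVat = addTax.curry(0.20)     // fixes the leftmost parameter

assert addVat(100.0) == 120.0
assert addVat(100.0) == addVat(100.0)   // same input, same output
```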

So is Groovy functional?  I have placed that bar too high. Here are better questions to think about: Do Groovy programs (or we as Groovy developers) :
  • isolate pure code?
  • isolate specific side-effects?
  • avoid imperative (write declarative) code?
  • use properties to define and verify behavior?
  • favor methods or closures (used as functions) in API design?  
Maybe the most important thing in FP is not the language but the programming community?  That is what makes Erlang or Scala more functional.

How functional is Grails?    Let's look at this code:
    def users = User.findAllByOffice(office1)

and analyze its side-effects from the point of view of predictability, suitability for logical reasoning, and explicitness.  Well, in addition to issuing a SELECT statement, the above code :
  • can save some changed objects to the database
  • can save some unchanged objects too  
  • will impact the data content of some records returned by queries that follow it
  • will impact some record types returned by queries that follow it (from proxies to actual objects)
  • will return records that may be different from what is currently in the database, and you have no way of finding which of them are different
What kind of properties can we assert about this code?  Not many, for example the obvious candidates:
     users.every{it.office == office1}
or (assume database enforced unique key on userName)
    users.collect{it.userName}.unique().size() == users.size()
do NOT need to be true even if your code did not modify any objects.  But I am repeating my previous posts.

Grails' choice of ORM is the strongest argument against the Groovy community's claim of being FP-friendly.

In my previous post I wrote about my experience with bugs in Grails and pointed to the Spring/Hibernate technology stack underneath.   Some things in life are certain. Is 'software will have bugs' one of these things?  I hope not.  But, I do know this: software using Hibernate will have lots of bugs.  If you cannot logically reason about your code, your code will have errors.

Some Cool FP References:
http://learnyouahaskell.com/ - free and excellent, funny, and easy to read book that will open your mind about FP.
http://youtu.be/52VsgyexS8Q - 'Hole Driven Development' is a bit of a toy concept but is very interesting to think about.  The first 5 minutes are free of overly Haskell-specific concepts.

Dear Reader:  This may not have been very easy reading, but I truly hope that you will think it was worth your time.  It took many evenings of adding, deleting, retyping and rethinking this post.  If you finished reading it, I would really, really appreciate some feedback, a note of approval or disapproval, or a Google+ recommendation, so I know that my effort was received one way or the other. 
After the HHH-9367 experience I needed some venue to vent my frustration, and writing this post provided it for me.

Edit (6/2018): I have learned things since writing this post. We can do much better than paper and pencil proofs when verifying software correctness. For example see this blog about proofs of laws, or this example in my IdrisTddNotes project Functor laws, Idris vs Haskell.   The topic of verified functional programming is big with programming languages and books devoted to it.

Next Post:  I am planning to go back to 'I don't like' series: testing.