Stuff about programming, programming style, maintainability, testability. Dedicated to my coworkers and friends. Everyone is welcome to leave comments or disagree with me. This blog does not represent views or opinions of my employer.

Saturday, September 20, 2014

My dream: software without any bugs ... and is Groovy functional? How about Grails?

This post is about a topic that absolutely fascinates me: completely bug free software (is that even possible?).  This may be not a very easy reading but I do hope you will stay with me to the end of this post.  I divided it into smaller sections, so you can do a quick skim, come back to drill into details.  The intended audience is Groovy developers. (But if you do not program in Groovy you may find a lot of this relevant too.)

This post was motivated by several things, some Groovy, some Grails, some Java, but one of the biggest motivators was Hibernate rejection of this JIRA ticket: HHH-9367.  

Death, Taxes and Software Errors:

Software tests provide empirical evidence of software correctness.  With all the software projects humanity has completed, we also have (a much stronger) evidence that no matter how well tested, the software will have bugs.  Tests (assume 100% code coverage) stress the software under a fraction of possible combinations of things that impact it.  We write tests because that is the best thing we know how to do.

Is there any different way to achieve software correctness than testing? 

On one of my job interviews I have been asked about my attitude towards TDD and testing in general, and I answered:
  Ideally, I would love to write code that does not need to be tested. 
As you can guess, I did not get that job.

Functional Programming:

"The Groovy programming language, since its inception, has always been pretty functional"
(http://glaforge.appspot.com/article/functional-groovy-presentation).  This is a very good and informative presentation. Growing level of interest in Groovy/Grails community in functional programming is a great thing.  I like the gradient, Groovy language is moving in the FP direction.

But ... there seems to be quite a bit of confusion about what FP is. This state of confusion is very normal for our industry.  Here is an example:  What is OO? Tim Rentsch (1982): "Every manufacturer will promote his products as supporting it. Every manager will pay lip service to it. Every programmer will practice it (differently). And no one will know just what it is".  This proverbial words can equally well be applied to Functional Programming,  RESTful design and many other concepts.

So what is FP?  Some will say that FP is about using the new stream API in Java (are we confusing Functional P with Fluent P?), but open an introductory paper about FP and you are likely to be greeted with terms like Kleisli category (where is my category theory book?).  It is hard to find any middle-ground here.  I will focus on a better question: what is FP forThe why? question is often less confusing and easier to understand than the what?  


Back to software without errors:

Fluff ends here.  'Correct software' means nothing until I spell out how exactly I expect the software to work.  I need to formalize this somehow.  I will use the concept of property to do that.
Property is simply a logical condition about the code.  Verifying a property is verifying that the code works as expected.
So how can I verify with 100% certainty that software satisfies a property?   For this demonstration, I need something that is easy to reason about:
Recursion:  I chose recursion as my tool of choice for this post because: it is a well understood concept and because it is very well suited to proving correctness (there is this Math 101 thing called mathematical induction).

Examples of Properties:  Here are just some examples to think about.
Example 1:  Groovy allows to overload operators, in particular '+'.  What properties does the '+' have?  Can I assume that
   (a + b) + c  == a + (b + c)
(which you would expect for '+' in algebra)?   The answer is NO.   It is OK for developer to code whatever he/she pleases when overloading '+'.

Example 2:  If I only had a penny for each time when equals() and hashCode() have not been implemented consistently in a Java app:
      if a == b then a.hashCode() == b.hashCode()
Java documentation tells developers to do it, well they often don't.  This is part of imperative language culture:  polymorphism does not mean much.

Example 2b: Another property related to Java equals() is this:
    if a == b and b == c then a == c
It is supposed to be always true. Is it?

Example 3:   Groovy has a method defined on collections called collect. It applies a closure to each element of a list returning a new list. This is often called map in other places.  Recall that Closures overload << to mean function composition.  Is there a property associated with these? Here is one:
  list.collect(c1).collect(c2) == list.collect(c2 << c1)
If I have used fpiglet, I could write this in a much cleaner form:
  (map(c1) << map(c2)) (list) == map(c1 << c2)(list)
and, if it was possible to implement a meaningful equals for closures, we could simply say:
  map(c1) << map(c2) == map(c1 << c2)
another words: map 'preserves' function composition.   Incidentally, there is a name for this property, it is called second Functor law.

Properties are software requirements that are 'logic friendly'.  They typically have the form:
     For all [list of symbols]  [logical expression]
because 'For some' just doesn't cut it in logical reasoning.  Property testing is one of the things missing in OO programming. But more on this later.

Example 4 (Advanced):  Try formulating some of your application business requirements as a property. (Example: username has to be unique across non-retired users. Where would I formulate it: DB records, domain objects, JSON results, ...?)

What is wrong with Java equals():

I will not be able to go far with logical reasoning about source code if I cannot say that code A is equivalent to code B.  Consider this Groovy code:
 Closure c1 = {int x -> x+1}
 Closure c2 = {int x -> x+1}
 assert c1 == c2 //FAILS

c1 and c2 are logically the same function but == comparison between them fails.  Is this a bug in Groovy?  No, this is a general (any computer language) problem.  It is impossible to computationally verify that 2 functions are the same.  Programmatic == is not something that always makes sense.  (Language can be more logical than Java about this and if you attempt to use == where it does not make computational sense, the compiler could reject your code - Haskell does that.)
Also, see Example 2 and 2b,  to do logical reasoning we need something stronger, something that does not depend on a developer's whim.

I will use '==' to indicate logically equivalent code.  So 1 '==' 1 and c1 '==' c2.  Basically, A '==' B if I can guarantee that replacing code A with code B will not change the behavior of the program.


What is wrong with side-effects:

I want to be able to logically reason correctness of parts of my code.  This is close to impossible if my code has side-effects.  To be able to do formal reasoning I need my code to be predictable.  If c is a closure and x is its parameter, I would like to have this 'predictability' property:
    c(x) '==' c(x)
... basically if I call a function twice with the same arguments I expect the same result.

Example:  Consider this Groovy code:
    def i = 0
    Closure f = { x -> i++}
    assert f(1) == f(1) //FAILS

Things take a dramatic turn if I remove possibility of side-effects:  the above property has to be true.   Lack of side-effects makes programs behave in a very predictable way.  This is the key needed to do logical reasoning about source code correctness.

But we do need side-effects!  We need to write/read files, database records, sockets, etc.  If you think of predictability and logical reasoning as underlying goals for FP, the following are obvious conclusions:
  • Side-effects need to be somehow isolated/decoupled 
  • Side-effects need to be as explicit as possible (for example, in Haskell, compiler can distinguish between code that wants to mutate content of a file from code that wants to mutate content of a variable, both are isolated from each other).

What is wrong with OO:

Objects are meant to encapsulate both state and behavior.  This provides some level of control over state mutation but the mutating state ends up spread out all over the application. And there is typically a lot of it too. We cannot completely avoid side-effects but OO programming uses side-effects where they are not necessarily. Good example to think about is Java Beans architecture where each parameter is a state.

Objects are not a good match for formal reasoning for several reasons (one of them is the internal state) and are not very composable.
So what is the alternative that is composable and allows for formal reasoning?  The answer is: function, not the old C function, but the even older mathematical function.

FP language definition: 

I will use a very strict definition of FP.  I will call language functional if programming in it is done using functions and the language can guarantee no side-effects (at least on parts of the application code).

This definition makes FP language code behave like symbolic calculations in math.  If x=1 and y=2 then nobody expects x or y to change because I wrote equation f(x) = g(y).  Functions behave like mathematical functions - hence the term Functional Programming.  And math is ... yeah, the science in charge of logical correctness.

For FP language to be useful we need the ability to create side-effects.  I simply assume that the language has the ability to clearly isolate such code. Anything without side-effects will be called pure and anything with side-effects will be called impure, and I assume the language to have a way to force separation between pure code and impure code.  Ability to do logical reasoning will be limited in the impure part.

There is currently only one commercially available language satisfying this definition:  Haskell.   To make room for the likes of Erlang, Clojure, Scala some people try to relax what FP language means (see FP style section of this post).
Side Note: Here is a cool thing to consider.  Functions need arguments.  To be able to write f(x) we need a concept of a variable x.  We can think about x as a special 'constant' function.  In FP language 'everything' is a function!  Well, almost everything.

Bug free proven code:

I will show that the '2nd functor law' property (Example 3) is true based on how map and function composition are implemented. I will do that purely by formal reasoning.

This is how map is coded in Haskell (I have omitted the type declaration, but this is full implementation code):
map _ []     = []
map f (x:xs) = (f x):(map f xs)
If you have never seen this code, it is worth spending some time thinking about it and puzzling it out. The syntax should be very intuitive.  I use it because it is super short.  Here is quick explanation: '_' means anything (here any function); '[]' means empty list; 'x:xs' means a list with first element 'x' and tail 'xs';  'f x' means function f applied to x;  'map f xs' means map function applied to function 'f 'and list 'xs';  and parentheses are used only for logical grouping.
(... If you still have problems parsing this code:  This is a recursive implementation.  Read it like so:
line1:  for any function (_) and empty list ([ ]) result of map is an empty list ([ ]);  
line2:  for function 'f' and list starting with 'x' and tailing with 'xs' the result is a list starting with 'f(x)' and tailing with (recursion) 'map f xs'.)  
You may be thinking: Stack Overflow.   I am not going into details, but that is not a problem.  We do not need to understand why and how stack overflow is prevented to move forward with this post. (The keyword here is: laziness.)

We also need implementation code for function composition (which will be noted with '.'). Here is the full implementation code which had me break some sweat:
(f . g) x = f (g x) 
So, I want to actually prove that, for the above implementation code, the following line has to be logically true:
 map f (map g xs) '==' map f.g xs       (2FL)
for all functions f, g and any list xs.  (Side note: if would be cool to write LHS as: '(map f . map g) xs', this is called currying, and I am not going there in this post).

The Proof:
I will do that in 2 steps, first for empty list that is simple:
map f (map g []) '==' [] '==' map f.g [] 
(both equalities are true because of the first line of implementation of map).

Second, I will prove (2FL) for list (x:xs) assuming that (2FL) is true for xs (mathematical induction).
Using second line of implementation of map I have:
map f (map g (x:xs)) '==' map f (g x : map g xs) '==' f(g x): map f (map g xs) 
because of how function composition is implemented, I get this:
'==' (f.g)x : map f (map g xs) 
using inductive assumption gives me:
'==' (f.g)x : map (f.g) xs 
and again second line of implementation of map yields:
'==' map (f.g) (x:xs) 
Done!   I have proven it.   There is no logical possibility for a bug here!  This property is something we can trust to be always true.  So here we have it:  the strongest test ever written for a computer program.  If you use unit tests in your coding, think of this as a unit-proof!  Notice the power of declarative code: map implementation code is really a set of 2 'math formulas'. (Actually, you may have noticed that implementation of map is really a set of 2 properties).

So is writing proofs in the job description for future developers?  Should I be studying Bird–Meertens Formalism or something?  I would not mind if this was the case, but I do not think so.  I don't believe there is a commercially available language or a code analysis tool that can prove a simple property like this today, but tomorrow...

For very diligent readers:  Can still something go wrong with (2FL)?  I made an implicit assumption that the language itself will execute the code correctly.  The instruction sets on our CPUs are imperative and functional programming language needs to have imperative implementation layer.  So the language itself can only be 'tested' for correctness.  Despite that limitation, I hope you agree that combining math with FP code leads to something quite amazing.

Exercise (only for the brave):  Prove that Groovy's collect() implementation code satisfies Groovy equivalent of (2FL).  Let me know when you succeed ;)

Exercise for the reader:  Where did I assume side-effect free code in the proof?  Come up with list a and closures f and g in Groovy so that:
  a.collect(g).collect(f) != a.collect(f << g)

Oh, NO!  And I was so excited that I have something I KNOW and can trust is always true. This is not a bug, Groovy works here as 'expected'. Yes, the 'brave reader exercise' was a booby trap, but the journey was the important part of that exercise. There are three issues:
  • it is very hard to formally reason if code has side-effects
  • it is very hard to formally reason imperative code even if you assume no side-effects
  • with side-effects, most of the the properties that we can come up with will not be true or will need to be weakened.

FP as programming style:

Let us look at yet another example to see that side-effects mess up everything:  If I asked for a vote if the following property has to be true, I bet the overwhelming answer would be: yes it has to.
   list.findAll(predicate).every(predicate)  

Well it is not:
 def list = [1,2,3,4]
 def b = true
 def predicate = {x -> b=!b; return (b && x==2)}
 assert list.findAll(predicate).every(predicate)  //FAILS

This is something to ponder for a few moments:  We could make the above property work if we considered only predicates that have no side-effects.  But it is not possible for me to know if a Closure I use in my program has side effects or not unless I have access to the source code. This is one of the reasons why in imperative languages the term Functional Programming is used to describe something much weaker: a programming style.   Still, this style can be very powerful.  Language cannot verify that code is pure, but this knowledge can be established with code organization, naming conventions, etc.
  

QuickCheck:  Less than proving but more than traditional testing: 

There is a big synergy between writing functional code and writing unit-testable code, but OO Unit Testing is a bit different, harder to implement, and significantly weaker.  In expression 'f(x)':  'x' does not have any behavior to test, you need to test 'f,' not 'x'.  We can test properties of 'f'.  This is done by running a particular property against a large number of 'randomly' generated input.  Such 'random' generation needs to create a comprehensive set of data.  This method works very, very well in uncovering problems developers can fail to envision.  I think of this as something that gives me very high probability (very close to 1) that my code works, not just some empirical evidence that OO testing provides.

So, instead of actually proving the second functor law, we could have tried to throw lots of data at it.  That would mean lots of different lists and lots of different functions. Can functions be generated randomly?  Yes they can!  There is a fantastic test library (QuickCheck) that has been ported to various languages, but these ports are not as good as the original. There is a catch (as far as I know): you have to run the tests type by type (lists of integers, lists of doubles, lists of strings, etc).

I will, again, throw a bit of Haskell code here to see test implementation for 2'nd functor law. With properly declared 'f', 'g' and 'x', this is it: ('fmap' generalizes 'map' and works across many other types, not just collections)

... = (fmap (g . f) x) == ((fmap g . fmap f) x)
This is the full test implementation code.  Neat, is it not?
I wanted to show this code, because it demonstrates terseness at its best. Terseness that Groovy can learn a lot from.   For me the definition of readable code is: 
                    code that expresses the problem, not the solution.
This program is extremely polymorphic: this test can be ran on lists of any type (that supports ==) as well as on other functors (whatever that means).

Code that just works:  

There is this belief among Haskell developers that if the code compiles it will work as intended (http://www.haskell.org/haskellwiki/Why_Haskell_just_works).  GHC is a very, very smart compiler, but a lot of this is not Haskell specific and it is true about FP in general.
FP has this almost unreal thing going for it:  if it 'type checks', it works.  If my code is wrong - most likely the type signatures are wrong.  I have experienced this even when writing fpiglet (which is not strongly typed and compiles with Groovy so I was the 'strong' compiler).

Conclusions:

I need to stop here because the likelihood of readers continuing on is probably diminishing rapidly. If you went that far with me: Thank you!, I hope the journey was worth the effort!  If you have not seen much of FP before, you probably have a headache, but I hope it is a good headache!

I have combined some very basic Math with a simple functional program and got (I think you will agree) some amazing results.  We have actually managed a formal mathematical proof of code correctness.  The FP-Math relationship is very strong and well established,  it is one of the things that makes FP what it is.  I think of that this way:  we want bug free software and there is this couple thousand years old science dealing with logical correctness.  Seems like a no-brainer to put these together ... and FP puts these together!

What is FP for?  That has many answers, including:  bug free code, concurrency safe code,  parallel computing, performance, extreme expressiveness and composability.  The underlying goals are predictability and logical reasoning.  Bug free code is to me the strongest reason for FP. Some claim that testable code is simply a better code. If that is true than 'provable' code is simply a much, much better code.

So how do we write FP code?  Here are the common denominators:  Code needs to be declarative. Pure code (without side-effects) needs to be isolated from non-pure code (with side-effects).   Any side-effects need to be as explicit as possible.  Properties need to be identified and tested.

State without state:  Most side-effects in OO programming are caused by the need to handle internal object state. I am sure you remember this viral quote:
         'hypermedia is the engine of application state'.  
Yeah, we do not need that stinky HTTP session. Stop using it and state disappears, as do many bugs.  FP is a lot like that.  You can use partial application (curry in Groovy)  as 'an engine of state' but there are other 'engines of state' (reader or state monad, lenses, and more).
This is how you know that your code is functional:  you start using all these scary concepts some people talk about.

So is Groovy functional?  I have placed that bar too high. Here are better questions to think about: Do Groovy programs (or we as Groovy developers) :
  • isolate pure code?
  • isolate specific side-effects?
  • avoid imperative (write declarative) code?
  • use properties to define and verify behavior?
  • favor methods or closures (used as functions) in API design?  
Maybe the most important thing in FP is not the language but the programming community?  That is what makes Erlang or Scala more functional.

How functional is Grails?    Let's look at this code:
    def users = User.findAllByOffice(office1)

and analyze its side-effects from the point of view of predictability, suitability for logical reasoning, and explicitness.  Well, in addition to issuing a SELECT statement, the above code :
  • can save some changed objects to the database
  • can save some unchanged objects too  
  • will impact the data content of some records returned by queries that follow it
  • will impact some record types returned by queries that follow it (from proxies to actual objects)
  • it will return records that may be different from what is currently in the database and you have no way of finding which of them are different
What kind of properties can we assert about this code?  Not many, for example the obvious candidates:
     users.every{it.office == office1}
or (assume database enforced unique key on userName)
    users.collect{it.userName}.unique().size() == users.size()
do NOT need to be true even if your code did not modify any objects.  But I am repeating my previous posts.

Grails choice for its ORM is the strongest argument against Groovy's community claim for being FP-friendly.

In my previous post I wrote about my experience with bugs in Grails and pointed to the Spring/Hibernate technology stack underneath.   Some things in life are certain. Is 'software will have bugs' one of these things?  I hope not.  But, I do know this: software using Hibernate will have lots of bugs.  If you cannot logically reason about your code, your code will have errors.

Some Cool FP References:
http://learnyouahaskell.com/ - free and excellent, funny, and easy to read book that will open your mind about FP.
http://youtu.be/52VsgyexS8Q - 'Hole Driven Development' is a bit of a toy concept but is very interesting to think about.  Safe from too haskellish concepts for the first 5 minutes.

Dear Reader.:  It may have been not a very easy reading, but I truly hope that you will think it was worth your time.  It took many evenings of adding/deleting/retyping rethinking this post.  If you finished reading it, I would really, really appreciate some feedback, note of approval or disapproval or a google + recommendation, so I know that my effort was received in one way or the other. 
After HHH-9367 experience I needed some venue to vent my frustration and writing this post provided it for me.

Next Post:  I am planning to go back to 'I don't like' series: testing.

8 comments:

  1. Hibernate and any ORM for that matter are essentially dealing with side effects. How on earth writing Hibernate in Haskell is going to make it bug free? Or help to fix your bug? It looks to me that FP is the new "Silver Bullet".

    ReplyDelete
    Replies
    1. You are right any ORM needs side-effects, the question is how explicit and isolated from each other these side-effects are. Not all ORMs are created equal in that regard.

      'Writing Hibnerate in Haskell': no I am not suggesting it, and that is funny. 'your bug' ... if you are referring to HHH-9367 I do not consider it 'my' bug, it will break your code as well.

      Delete
    2. Any ORM needs to deal with effects. What we want is to do this without side effects so we have all the nice properties of referentially transparent code: greater reasoning, greater reuse, higher productivity.

      The point is that compilers can already do a lot more to make sure that your code is correct. For most languages, that burden is shifted from a single compiler to every user of that compiler.

      We already know how to do type-safe SQL without the use of side effects. An earlier version of this that is quite readable is the paper, "Domain Specific Embedded Compilers", Leijen and Meijer, https://www.usenix.org/legacy/events/dsl99/full_papers/leijen/leijen.pdf (99).

      We want to reuse effects, but be able to combine them in novel, unforeseen ways to create new abstractions so that programs don't have to deal with the buggy stateful connections, sessions, statements, etc. For an example see "Connection Management, FP Style: A Case Study", Wesley-Smith, https://www.youtube.com/watch?v=qPhNdNMoStY, slides http://yowconference.com.au/slides/yowlambdajam2013/Wesley-Smith-Connection-Management.pdf.

      It is very difficult to create abstractions that re-use side effects. To understand code that side effects you need to understand the side effects of that method and all other methods (and any transitively called methods) that method calls. If side effects are littered throughout the system then pretty soon reusing any method requires understanding the side-effects of the entire system. This is obviously not modular, scalable or productive.

      Delete
  2. About your particular problem, I am not sure but I think you can programmatically clear the 1st level cache for that entity:
    session.evict(User);
    or clear the whole cache all together
    session.clear();

    ReplyDelete
    Replies
    1. I am assumming you are referring to HHH-9367. Yes, I can add logic to evict/discard all records returned from all hibernate queries. I can also precede every query with session.clear(). Both approaches will prevent the issue.
      Do you do that in your code? What sense does it make for hibernate to keep addding records to session and for me to keep removing them?

      Delete
  3. I find the concept very interesting; I'd think the best application would be at a language-author level to provide identification and isolation language constructs. I find FP hard to digest largely because examples (as in this post) tend to focus on simple maps or abstract functions. I'd love to see a build-up from these simple concepts to an actual application.

    What's your opinion on closures? By definition they cannot be pure, since they are functions with free variables. The free variables will always introduce a risk of side effects. I was surprised to see that Haskell has closures at all.

    ReplyDelete
    Replies
    1. If you are thinking of a language which is functional inside and provides more imperative outer shell for developers, that is seems to be somewhat possible (and has been done using things like monadic comprehensions) but that is a bit of a cheat.
      In this blog I tried to stay away from Types, FP types are very powerful, much more so than OO types. FP types in language like haskell are very close to Type Theory which very very close (actually equivalent) to Proof Theory. I would expect application developer to be exposed to functional Types because they create this 'if it type checks if works phenomena'.

      My take on Groovy Closures:
      I love them but I wish for some improvements (like a way to implicitly curry them).
      Obviously, they can mutate whatever they can see and have internal state (delegate), but avoiding using/overusing side-effects can become a coding practice.

      Haskell has functions and anonymous functions (lambdas). They cannot have side-effects. Similarly to Groovy, they 'enclose' things that they can see so you can do stuff like this:

      z = 1
      myLambda = (\x -> z + x)
      main = print (myLambda 1) -- will print 2

      This is not dengerous, you can think of 'z' as a constant function.
      Variables cannot be mutated except in very special cases, like inside STM monad, which executes as part of main.

      Non trivial examples:
      Definately you can do some very sophisticated applications with FP. There are Haskell equivalents of Grails, for example Yesod. I have not used these yet.
      Because type system is so strong, equivalents of .gsp pages are compiler checked! Plus other amazing things like Type-safe security (code injection prevention, etc).

      In this post, I have implemented functional Map. FP is great for parallel computing.
      For example, you can implement map-reduce using very functional (and quite simple) code in Haskell to take advantage of (now probably) 8 cores on your PC. You can use it for things that map-reduce is typically used for, like parsing long log files, etc. Did I say parsing .. that is another excellent match for FP. That story could continue for a while...

      But your point is well taken. Maybe we have seen too many functional examples using collections. I think we see them a lot because they are easy to demonstrate.

      Delete
    2. The combination of nice "FP-style" Groovy features and totally imperative stateful OO-style stuff in Grails is certainly a strange one.

      Like Tim, I've been interested in FP for a while now but have not put in enough continuous effort to get past the wall between toy examples and real-world useful programs that do stateful things in a sane way. That wall feels very steep, but maybe after climbing it, it will only look like a gentle bump. We'll see!
      I'd love to see if the very attractive QuickCheck (or other property-based test libraries) can also scale to bigger problems. Perhaps this is determined by the degree to which you cleanly isolate the pure and impure parts of your programs.

      Delete