This post is about a topic that absolutely fascinates me: completely bug free software (is that even possible?). This may be not a very easy reading but I do hope you will stay with me to the end of this post. I divided it into smaller sections, so you can do a quick skim, come back to drill into details. The intended audience is Groovy developers. (But if you do not program in Groovy you may find a lot of this relevant too.)
This post was motivated by several things, some Groovy, some Grails, some Java, but one of the biggest motivators was Hibernate rejection of this JIRA ticket:
HHH-9367.
Death, Taxes and Software Errors:
Software tests provide empirical evidence of software correctness. With all the software projects humanity has completed, we also have (a much stronger) evidence that no matter how well tested, the software will have bugs. Tests (assume 100% code coverage) stress the software under a fraction of possible combinations of things that impact it. We write tests because that is the best thing we know how to do.
Is there any different way to achieve software correctness than testing?
On one of my job interviews I have been asked about my attitude towards TDD and testing in general, and I answered:
Ideally, I would love to write code that does not need to be tested.
As you can guess, I did not get that job.
Functional Programming:
"The Groovy programming language, since its inception, has always been pretty functional"
(
http://glaforge.appspot.com/article/functional-groovy-presentation). This
is a very good and informative presentation. Growing level of interest in Groovy/Grails community in functional
programming is a great thing. I like the gradient, Groovy language is moving in the FP direction.
But ... there seems to be quite a bit of confusion about what FP is. This state of confusion is very normal for our industry. Here is an example:
What is OO?
Tim Rentsch (1982): "Every manufacturer will promote his products as
supporting it. Every manager will pay lip service to it. Every
programmer will practice it (differently). And no one will know just
what it is". This proverbial words can equally well be applied to Functional Programming, RESTful design and many other concepts.
So
what is FP? Some will say that FP is about using the new stream API in Java (are we confusing F
unctional P with F
luent P?), but open an introductory paper about FP and you are likely to be greeted with terms like Kleisli category (where is my category theory book?). It is hard to find any middle-ground here. I will focus on a better question:
what is FP for?
The why? question is often less confusing and easier to understand than
the what?
Back to software without errors:
Fluff ends here. 'Correct software' means nothing until I spell out how exactly I expect the software to work. I need to formalize this somehow. I will use the concept of
property to do that.
Property is simply a logical condition about the code. Verifying a property is verifying that the code works as expected.
So how can I verify with 100% certainty that software satisfies a property? For this demonstration, I need something that is easy to reason about:
Recursion: I chose recursion as my tool of choice for this post because: it is a well understood concept and because it is very well suited to proving correctness (there is this Math 101 thing called mathematical induction).
Examples of Properties: Here are just some examples to think about.
Example 1: Groovy allows to overload operators, in particular '+'. What properties does the '+' have? Can I assume that
(a + b) + c == a + (b + c)
(which
you would expect for '+' in algebra)? The answer is NO. It is OK for developer to code
whatever he/she pleases when overloading '+'.
Example 2:
If I only had a penny for each time when equals() and hashCode() have
not been implemented consistently in a Java app:
if a == b then a.hashCode() == b.hashCode()
Java documentation
tells developers to do it, well they often don't. This is part
of imperative language culture: polymorphism does not mean much.
Example 2b: Another property related to Java equals() is this:
if a == b and b == c then a == c
It is supposed to be always true. Is it?
Example 3: Groovy has a method defined on collections called
collect. It applies a closure to each element of a list returning a new list. This is often called
map in other places.
Recall that Closures overload << to mean function
composition. Is there a property associated with these? Here is one:
list.collect(c1).collect(c2) == list.collect(c2 << c1)
If I have used fpiglet, I could write this in a much cleaner form:
(map(c1) << map(c2)) (list) == map(c1 << c2)(list)
and, if it was possible to implement a meaningful equals for closures, we could simply say:
map(c1) << map(c2) == map(c1 << c2)
another words: map 'preserves' function composition.
Incidentally, there is a name for this property, it is called
second Functor law.
Properties are software requirements that are 'logic friendly'. They typically have the form:
For all [list of symbols] [logical expression]
because 'For some' just doesn't cut it in logical reasoning. Property testing is one of the things missing in OO programming. But more on this later.
Example 4 (Advanced): Try formulating some of your application business requirements as a property. (Example: username has to be unique across non-retired users. Where would I formulate it: DB records, domain objects, JSON results, ...?)
What is wrong with Java equals():
I will not be able to go far with logical reasoning about source code if I cannot say that code A is equivalent to code B. Consider this Groovy code:
Closure c1 = {int x -> x+1}
Closure c2 = {int x -> x+1}
assert c1 == c2 //FAILS
c1 and c2 are logically the same function but == comparison between them fails. Is this a bug in Groovy? No, this is a general (any computer language) problem. It is impossible to computationally verify that 2 functions are the same. Programmatic == is not something that always makes sense. (Language can be more logical than Java about this and if you attempt to use == where it does not make computational sense, the compiler could reject your code - Haskell does that.)
Also, see Example 2 and 2b, to do logical reasoning we need something stronger, something that does not depend on a developer's whim.
I will use '==' to indicate logically equivalent code. So 1 '==' 1 and c1 '==' c2. Basically, A '==' B if I can guarantee that replacing code A with code B will not change the behavior of the program.
What is wrong with side-effects:
I want to be able to logically reason correctness of parts of my code. This is close to impossible if my code has side-effects. To be able to do formal reasoning I need my code to be predictable. If c is a closure and x is its parameter, I would like to have this 'predictability' property:
c(x) '==' c(x)
... basically if I call a function twice with the same arguments I expect the same result.
Example: Consider this Groovy code:
def i = 0
Closure f = { x -> i++}
assert f(1) == f(1) //FAILS
Things take a dramatic turn if I remove possibility of side-effects: the above property has to be true. Lack of side-effects makes programs behave in a very predictable way. This is the key needed to do logical reasoning about source code correctness.
But we do need side-effects! We need to write/read files, database records, sockets, etc. If you think of predictability and logical reasoning as underlying goals for FP, the following are obvious conclusions:
- Side-effects need to be somehow isolated/decoupled
- Side-effects need to be as explicit as possible (for example, in Haskell, compiler can distinguish between code that wants to mutate content of a file from code that wants to mutate content of a variable, both are isolated from each other).
What is wrong with OO:
Objects are meant to encapsulate both state and behavior. This provides some level of control over state mutation but the mutating state ends up spread out all over the application. And there is typically a lot of it too. We cannot completely avoid side-effects but OO programming uses side-effects where they are not necessarily. Good example to think about is Java Beans architecture where each parameter is a state.
Objects are not a good match for formal reasoning for several reasons (one of them is the internal state) and are not very composable.
So what is the alternative that is composable and allows for formal reasoning? The answer is: function, not the old C function, but the even older mathematical function.
FP language definition:
I will use a very strict definition of FP. I will call language functional if programming in it is done using functions and the language can guarantee no side-effects (at least on parts of the application code).
This definition makes FP language code behave like symbolic calculations in math. If x=1 and y=2 then nobody expects x or y to change because I wrote equation f(x) = g(y). Functions behave like mathematical functions - hence the term Functional Programming. And math is ... yeah, the science in charge of logical correctness.
For FP language to be useful we need the ability to create side-effects. I simply assume that the language has the ability to clearly isolate such code. Anything without side-effects will be called pure and anything with side-effects will be called impure, and I assume the language to have a way to force separation between pure code and impure code. Ability to do logical reasoning will be limited in the impure part.
There is currently only one commercially available language satisfying this definition: Haskell. To make room for the likes of Erlang, Clojure, Scala some people try to relax what FP language means (see FP style section of this post).
Side Note: Here is a cool thing to consider. Functions need arguments. To be able to write f(x) we need a concept of a variable x. We can think about x as a special 'constant' function. In FP language 'everything' is a function! Well, almost everything.
Bug free proven code:
I will show that the '2nd functor law' property (Example 3) is true based on how map and function composition are implemented. I will do that purely by formal reasoning.
This is how map is coded in Haskell (I have omitted the type declaration, but this is full implementation code):
map _ [] = []
map f (x:xs) = (f x):(map f xs)
If you have never seen this code, it is worth spending some time thinking about it and puzzling it out. The syntax should be very intuitive. I use it because it is super short. Here is quick explanation: '_' means anything (here any function); '[]' means empty list; 'x:xs' means a list with first element 'x' and tail 'xs'; 'f x' means function f applied to x; 'map f xs' means map function applied to function 'f 'and list 'xs'; and parentheses are used only for logical grouping.
(... If you still have problems parsing this code: This is a recursive implementation. Read it like so:
line1: for any function (_) and empty list ([ ]) result of map is an empty list ([ ]);
line2: for function 'f' and list starting with 'x' and tailing with 'xs' the result is a list starting with 'f(x)' and tailing with (recursion) 'map f xs'.)
You may be thinking: Stack Overflow. I am not going into details, but that is not a problem. We do not need to understand why and how stack overflow is prevented to move forward with this post. (The keyword here is: laziness.)
We also need implementation code for function composition (which will be noted with '.'). Here is the full implementation code which had me break some sweat:
(f . g) x = f (g x)
So, I want to actually prove that, for the above implementation code, the following line has to be logically true:
map f (map g xs) '==' map f.g xs (2FL)
for all functions f, g and any list xs. (Side note: if would be cool to write LHS as: '(map f . map g) xs', this is called currying, and I am not going there in this post).
The Proof:
I will do that in 2 steps, first for empty list that is simple:
map f (map g []) '==' [] '==' map f.g []
(both equalities are true because of the first line of implementation of map).
Second, I will prove (2FL) for list (x:xs) assuming that (2FL) is true for xs (mathematical induction).
Using second line of implementation of map I have:
map f (map g (x:xs)) '==' map f (g x : map g xs) '==' f(g x): map f (map g xs)
because of how function composition is implemented, I get this:
'==' (f.g)x : map f (map g xs)
using inductive assumption gives me:
'==' (f.g)x : map (f.g) xs
and again second line of implementation of map yields:
'==' map (f.g) (x:xs)
Done! I have proven it. There is no logical possibility for a bug here! This property is something we can trust to be always true. So here we have it: the strongest test ever written for a computer program. If you use unit tests in your coding, think of this as a unit-proof! Notice the power of declarative code: map implementation code is really a set of 2 'math formulas'. (Actually, you may have noticed that implementation of map is really a set of 2 properties).
So is writing proofs in the job description for future developers? Should I be studying Bird–Meertens Formalism or something? I would not mind if this was the case, but I do not think so. I don't believe there is a commercially available language or a code analysis tool that can prove a simple property like this today, but tomorrow...
For very diligent readers: Can still something go wrong with (2FL)? I made an implicit assumption that the language itself will execute the code correctly. The instruction sets on our CPUs are imperative and functional programming language needs to have imperative implementation layer. So the language itself can only be 'tested' for correctness. Despite that limitation, I hope you agree that combining math with FP code leads to something quite amazing.
Exercise (only for the brave): Prove that Groovy's collect() implementation code satisfies Groovy equivalent of (2FL). Let me know when you succeed ;)
Exercise for the reader: Where did I assume side-effect free code in the proof? Come up with list a and closures f and g in Groovy so that:
a.collect(g).collect(f) != a.collect(f << g)
Oh, NO! And I was so excited that I have something I KNOW and can trust is always true. This is not a bug, Groovy works here as 'expected'. Yes, the 'brave reader exercise' was a booby trap, but the journey was the important part of that exercise. There are three issues:
- it is very hard to formally reason if code has side-effects
- it is very hard to formally reason imperative code even if you assume no side-effects
- with side-effects, most of the the properties that we can come up with will not be true or will need to be weakened.
FP as programming style:
Let us look at yet another example to see that side-effects mess up everything: If I asked for a vote if the following property has to be true, I bet the overwhelming answer would be: yes it has to.
list.findAll(predicate).every(predicate)
Well it is not:
def list = [1,2,3,4]
def b = true
def predicate = {x -> b=!b; return (b && x==2)}
assert list.findAll(predicate).every(predicate) //FAILS
This is something to ponder for a few moments: We could make the above property work if we considered only predicates that have no side-effects. But it is not possible for me to know if a Closure I use in my program has side effects or not unless I have access to the source code. This is one of the reasons why in imperative languages the term Functional Programming is used to describe something much weaker:
a programming style. Still, this style can be very powerful. Language cannot verify that code is pure, but this knowledge can be established with code organization, naming conventions, etc.
QuickCheck: Less than proving but more than traditional testing:
There is a big synergy between writing functional code and writing unit-testable code, but OO Unit Testing is a bit different, harder to implement, and significantly weaker. In expression 'f(x)': 'x' does not have any behavior to test, you need to test 'f,' not 'x'. We can test properties of 'f'. This is done by running a particular property against a large number of 'randomly' generated input. Such 'random' generation needs to create a comprehensive set of data. This method works very, very well in uncovering problems developers can fail to envision. I think of this as something that gives me very high probability (very close to 1) that my code works, not just some empirical evidence that OO testing provides.
So, instead of actually proving the second functor law, we could have tried to throw lots of data at it. That would mean lots of different lists and lots of different functions. Can functions be generated randomly? Yes they can! There is a fantastic test library (QuickCheck) that has been ported to various languages, but these ports are not as good as the original. There is a catch (as far as I know): you have to run the tests type by type (lists of integers, lists of doubles, lists of strings, etc).
I will, again, throw a bit of Haskell code here to see test implementation for 2'nd functor law. With properly declared 'f', 'g' and 'x', this is it: ('fmap' generalizes 'map' and works across many other types, not just collections)
... = (fmap (g . f) x)
== ((fmap g . fmap f) x)
This is the full test implementation code. Neat, is it not?
I wanted to show this code, because it demonstrates terseness at its best. Terseness that Groovy can learn a lot from. For me the definition of
readable code is:
code that expresses the problem, not the solution.
This program is extremely polymorphic: this test can be ran on lists of any type (that supports ==) as well as on other
functors (whatever that means).
Code that just works:
There is this belief among Haskell developers that if the code compiles it will work as intended (
http://www.haskell.org/haskellwiki/Why_Haskell_just_works). GHC is a very, very smart compiler, but a lot of this is not Haskell specific and it is true about FP in general.
FP has this almost unreal thing going for it: if it 'type checks', it works. If my code is wrong - most likely the type signatures are wrong. I have experienced this even when writing fpiglet (which is not strongly typed and compiles with Groovy so I was the 'strong' compiler).
Conclusions:
I need to stop here because the likelihood of readers continuing on is probably diminishing rapidly. If you went that far with me:
Thank you!, I hope the journey was worth the effort! If you have not seen much of FP before, you probably have a headache, but I hope it is a good headache!
I have combined some very basic Math with a simple functional program and got (I think you will agree) some amazing results. We have actually managed a
formal mathematical proof of code correctness. The FP-Math relationship is very strong and well established, it is one of the things that makes FP what it is. I think of that this way: we want bug free software and there is this couple thousand years old science dealing with logical correctness. Seems like a no-brainer to put these together ... and FP puts these together!
What is FP for? That has many answers, including: bug free code, concurrency safe code, parallel computing, performance, extreme expressiveness and composability. The underlying goals are predictability and logical reasoning. Bug free code is to me the strongest reason for FP. Some claim that testable code is simply a better code. If that is true than 'provable' code is simply a much, much better code.
So how do we write FP code? Here are the common
denominators: Code needs to be declarative. Pure code (without side-effects) needs to be isolated from non-pure code (with side-effects). Any side-effects need to be as explicit as possible. Properties need to be identified and tested.
State without state: Most side-effects in OO programming are caused by the need to handle internal object state. I am sure you remember this viral quote:
'hypermedia is the engine of application state'.
Yeah, we do not need that stinky HTTP session. Stop using it and state disappears, as do many bugs. FP is a lot like that. You can use partial application (curry in Groovy) as 'an engine of state' but there are other 'engines of state' (reader or state monad, lenses, and more).
This is how you know that your code is functional: you start using all these scary concepts some people talk about.
So
is Groovy functional? I have placed that bar too high. Here are better questions to think about: Do Groovy programs (or we as Groovy developers) :
- isolate pure code?
- isolate specific side-effects?
- avoid imperative (write declarative) code?
- use properties to define and verify behavior?
- favor methods or closures (used as functions) in API design?
Maybe the most important thing in FP is not the language but the programming community? That is what makes Erlang or Scala more functional.
How functional is Grails? Let's look at this code:
def users = User.findAllByOffice(office1)
and analyze its side-effects from the point of view of predictability, suitability for logical reasoning, and explicitness. Well, in addition to issuing a SELECT statement, the above code :
- can save some changed objects to the database
- can save some unchanged objects too
- will impact the data content of some records returned by queries that follow it
- will impact some record types returned by queries that follow it (from proxies to actual objects)
- it will return records that may be different from what is currently in the database and you have no way of finding which of them are different
What kind of
properties can we assert about this code? Not many, for example the obvious candidates:
users.every{it.office == office1}
or (assume database enforced unique key on userName)
users.collect{it.userName}.unique().size() == users.size()
do NOT need to be true even if your code did not modify any objects. But I am repeating my previous posts.
Grails choice for its ORM is the strongest argument against Groovy's community claim for being FP-friendly.
In my previous post I wrote about my experience with bugs in Grails and pointed to the Spring/Hibernate technology stack underneath. Some things in life are certain. Is 'software will have bugs' one of these things? I hope not. But, I do know this: software using Hibernate will have lots of bugs. If you cannot logically reason about your code, your code will have errors.
Some Cool FP References:
http://learnyouahaskell.com/ - free and excellent, funny, and easy to read book that will open your mind about FP.
http://youtu.be/52VsgyexS8Q - 'Hole Driven Development' is a bit of a toy concept but is very interesting to think about. Safe from too haskellish concepts for the first 5 minutes.
Dear Reader.: It may have been not a very easy reading, but I truly hope that you will think it was worth your time. It took many evenings of adding/deleting/retyping rethinking this post. If you finished reading it, I would really, really appreciate some feedback, note of approval or disapproval or a google + recommendation, so I know that my effort was received in one way or the other.
After HHH-9367 experience I needed some venue to vent my frustration and writing this post provided it for me.
Edit (6/2018): I have learned things since writing this post. We can do much better than
paper and pencil proofs when verifying software correctness. For example see this blog about
proofs of laws,
or this example in my IdrisTddNotes project
Functor laws, Idris vs Haskell. The topic of verified functional programming is big with programming languages and books devoted to it.
Next Post: I am planning to go back to 'I don't like' series: testing.