Software has two ingredients: opinions and logic (=programming). The second ingredient is rare and is typically replaced by the first.
I blog about code correctness, maintainability, testability, and functional programming.
This blog does not represent views or opinions of my employer.

Sunday, August 20, 2017

Comprehensive real life functional programming examples



Several years ago, I think at Uberconf, I was listening to yet another talk that showed some list manipulations and called it functional programming.  I also remember being very confused about how to use FP in real life.  Seeing a bunch of examples did not translate, for me, into being able to write an end-to-end application using these concepts.  I remember telling my colleague that it would be really good to have a comprehensive example, not just isolated five-line snippets.


About a year ago, I decided to just do it.  I decided to work on a simple CRUD application and implement it in a fully functional way.  The idea grew and became: a polyglot application made out of interchangeable front-ends and back-ends implemented using various languages and functional libraries.  https://github.com/rpeszek/typesafe-web-polyglot. At this moment I have 3 back-ends (Yesod-Haskell, Servant-Haskell, Http4s-Scala) and 2 front-ends (Elm, Halogen-PureScript).  I will write short blog posts describing my goals behind each of these projects.  This post describes my general goals.


Comprehensive, what do I mean?

These are relatively small apps (a bit over 1000 lines each) that still facilitate the amount of code reuse I would expect to see in a large application.  There is quite a bit of similarity between domain objects that expose a create-retrieve-update-delete interface; thus, I expect code reuse too. These apps do what typical web apps do today: talk to a database, do Ajax communication, deal with JSON, etc.
This is just about CRUD, so there is no business logic to speak of and things are rather dull, but I think it becomes clear what adding business logic to that code would look like.


What is still not implemented?

Security, better styling (completely ignored in my current PureScript app), and more complex (has-many) relationships.


Fully functional way, what do I mean?

Referential transparency (same computation, same inputs == same results) - this ought to be the defining property of FP since it enables everything else.  Effectful computations can still be referentially transparent.  All of my apps treat this aspect with due respect. For example, my Scala backend app does only one unsafe operation: it creates an in-memory H2 DB at startup.
(Side-note: sadly this aspect did not make the cut in the design of Java 8 streams - I say streams are based on opinions, not programming).
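To show what I mean by effectful yet referentially transparent, here is a minimal Haskell sketch (fetchUser is a hypothetical stand-in, not code from any of these apps).  The IO value describes an effect; binding that description to a name and reusing the name never changes the meaning of the program:

fetchUser :: Int -> IO String
fetchUser uid = pure ("user-" ++ show uid)  -- stand-in for a real DB call

main :: IO ()
main = do
  let action = fetchUser 42   -- a description of an effect; nothing runs here
  a <- action                 -- run the description
  b <- action                 -- run the very same description again
  putStrLn (a ++ " == " ++ b)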


Reasoning about code -  I find that the use of type class polymorphism retains a good correspondence to logic.  Method signatures look like rotated Gentzen notation with clearly identifiable assumptions (type constraints) and a conclusion (result type).  Here are two examples:

PureScript:
type AppM eff = Aff (ajax :: AJAX, appconf :: APPCONFIG, nav :: NAVIGATION | eff)
ui :: forall eff model.
                 Show model =>
                 EntityRoute model =>
                 EntityReadHTML model =>
                 EntityGET eff model =>
                 Proxy model -> H.Component HH.HTML Query Input Void (AppM eff)

Scala:
case class CrudHandler[K, D, E[_]](uri: String)(implicit evM: Monad[E],
                                              evPersitAsTask: E ~> Task,
                                              evConvertKey: IntId[K],
                                              evPersist: PersistCrud[K, D, E],
                                              evJsonK: Encoder[K],
                                              evJsonD: Encoder[D],
                                              evDecodeJsonD: Decoder[D]) { ... }

Even without knowing what these are, or without wanting to admit that logic is important in programming, it is evident that having the information organized into logical assumptions and conclusions makes things clear. For example (PureScript), I know that I am only using GET REST effects (no PUT, DELETE, etc.); I know that my code is not using == comparison and has no sorting logic...


Polymorphism and generalization - most of my examples try to be polymorphic both in the domain objects (being CRUD-ed) and in the effects used.  For example, the same code works with database back-ends and STM persistence and, at the same time, works across domain objects.  In the OO world that approach would probably violate KISS.  In FP, generalization is an essential tool because it limits the solution space (parametricity) and makes it harder to write programs that compile and do not work.
(Some projects have a 'simple' branch which is more 'hard-coded' and less polymorphic).


Type safety - I am interested in type safe FP (hence the choice of languages).  All applications provide type safety over effects. Unique types are used as much as possible (for example, integer IDs used in URLs or as DB keys get their own types instead of being just Integer). A big advantage of typed functional languages is type safety in deconstruction (pattern matching), whose violation shows up as the non-exhaustive pattern match error. This type safety feature comes for free (except perhaps in Scala, which needs the sealed annotation) and my example apps, obviously, take advantage of it.
Just looking at the method signatures shown above, one has to at least entertain the possibility that this amount of type information could imply stronger type checking.
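To illustrate the "unique types" point, here is a minimal Haskell sketch (hypothetical names, not the actual project code): one newtype per kind of key, so a user id can never be passed where an item id is expected, even though both wrap an Int.

newtype UserId = UserId Int deriving (Eq, Show)
newtype ItemId = ItemId Int deriving (Eq, Show)

lookupUserName :: UserId -> Maybe String
lookupUserName (UserId 1) = Just "alice"
lookupUserName _          = Nothing

-- lookupUserName (ItemId 1)   -- rejected by the type checker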


Composability - standard monadic and applicative approaches are used to achieve composability. I think real-world examples like these CRUD apps make monadic computations feel right at home and very readable.
Side-note: It was somewhat of a revelation for me when I finally realized (Eugenio Moggi's original paper: https://core.ac.uk/download/pdf/21173011.pdf) that sequential composability == monads.  Basically, composability == category, composable computations == Kleisli.  If you want composable code, you need monads.  Maybe trivial, but it sends shivers down my spine.



Saturday, September 3, 2016

How I got addicted to Haskell. And so can you!

  ...Or, what I did this summer.

Let me start with this statement:  I have learned ways to write and structure code I never thought possible.  Ways which my OO-trained mind has a hard time accepting as real or even believable.  Yet, in many ways, these ways are more correct and real than the ways I knew before.

This post is intended for OO developers.  It will talk about some amazing code that is possible in FP.  It will present high-level intuitions and concepts behind that code.  It will skip the steep theoretical background.   When you see monad (or whatever) and do not know what it is, simply think 'a computation structure that allows me to do that' and move on.  The focus is on 'what is possible', not on the details of 'how it's done'.

If you are an experienced Haskeller and would like to comment or add to my presentation you are welcome to do so as well.

It all started innocently.  I just wanted to check if I could implement, in Haskell, a particularly complex piece of imperative logic I wrote for my work.  So I did just that.

This logic was about graphs (as in wikipedia).  At work I had to create a lot of graphs to test various special cases in the software.  So,  I decided to check how graph construction would work in Haskell.  If I can build graphs, I should be able to adjust them too....  

I was always interested in polymorphism.  What it is, what it can do, what it can be.  So what exactly is a graph?  Can I define it using Haskell type constraints?   So I worked on  that.

Now I have these polymorphic Haskell graph type classes: a type class for read-only graphs, one for buildable graphs, one for adjustable graphs, with inheritance going between them.  I also have five or six specific graph instance types that `implement` my type classes, like an adjacency map (edges-by-vertices) type, an adjacency matrix type, simple list-of-edges and list-of-vertices types, and so on.  I start to play with graph creation.

Polymorphic data.   Polymorphic what?  So I created a polymorphic diamond (Wikipedia) graph data structure. Which instance is it?  It is all of them.  It is polymorphic, so it can be whichever you like!  You want to look at an adjacency matrix, just select the AdjacencyMatrix type; you want to look at it as an EdgesByVertices map, just select that type; and so on. How weird is that for an OO programmer?  A polymorphic data structure!
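To make this concrete, here is a minimal Haskell sketch of the idea (the class and names are hypothetical illustrations, not the actual GraphPlay API): the diamond is written against a type-class interface, so whoever consumes it decides which concrete instance it becomes.

{-# LANGUAGE MultiParamTypeClasses, FunctionalDependencies, FlexibleInstances, FlexibleContexts #-}

class BuildableGraph g e | g -> e where
  emptyGraph :: g
  addEdge    :: e -> g -> g

-- The polymorphic diamond: not committed to any representation yet.
diamond :: BuildableGraph g (Char, Char) => g
diamond = foldr addEdge emptyGraph [('a','b'), ('a','c'), ('b','d'), ('c','d')]

-- One concrete instance: a graph as a plain list of edges.
newtype EdgeList = EdgeList [(Char, Char)] deriving Show

instance BuildableGraph EdgeList (Char, Char) where
  emptyGraph              = EdgeList []
  addEdge e (EdgeList es) = EdgeList (e : es)

-- 'diamond :: EdgeList' picks this instance; asking for another instance type
-- gives another representation of the very same polymorphic value.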

But it gets better than that.  How about a polymorphic graph distance calculation which can either return an integer distance or the shortest path, depending on which type I select?

To demystify this a bit: Haskell type class polymorphism is the opposite of OO inheritance.  OO is based on the exists quantifier, Haskell on the forall quantifier.   That allows the Haskell typechecker to enforce that polymorphic code does only things that apply to all instances.  For example, this OO code is not truly polymorphic:

Graph createDiamond() { def res = new AdjacencyMap(); return res; }

because that code can manipulate AdjacencyMap, which is a specific instance. Did I muddy the waters even more? Just think of Haskell polymorphism as an upside-down version of OO (whatever that means) and let's move on.

Now I am thinking:  I have all these implementations of my graph type class (or classes, as there is Graph, DiGraph, EdgeSemantics, AdjacencyIndex and so on...); are they correct?  How do I test them?

I have written automated tests before.  But what does it mean for a graph to be correct or not correct?  There must be some graphish things that make graphs ... well, graphs.  Call them graph laws or graph properties.  I need to define them.

I came up with a set of graphish properties to verify my implementing instance types.  How do I actually test these? How about I generate lots of random graphs and check that each satisfies the property in question.  How hard would that be?  Easy!  Property-based testing turns out to be the most popular way to test stuff in Haskell.  The library is called QuickCheck.   What an interesting way to test!  I have always believed that good testing should have some random aspect to it.  And so I did it and got addicted a bit more.
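To give a flavor of such a property test, here is a minimal QuickCheck sketch over a deliberately simple edge-list representation (a simplified "graphish" law, not the actual GraphPlay laws):

import Test.QuickCheck (quickCheck)
import qualified Data.Set as Set

-- One concrete instance: a graph as a plain list of edges.
type EdgeGraph = [(Int, Int)]

vertices :: EdgeGraph -> Set.Set Int
vertices g = Set.fromList (map fst g) `Set.union` Set.fromList (map snd g)

-- Law: every endpoint of every edge is a vertex of the graph.
prop_endpointsAreVertices :: EdgeGraph -> Bool
prop_endpointsAreVertices g =
  all (\(a, b) -> a `Set.member` vs && b `Set.member` vs) g
  where vs = vertices g

main :: IO ()
main = quickCheck prop_endpointsAreVertices  -- checks 100 random graphs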

Do-it-yourself polymorphism.     But are type classes the only way to do polymorphism in Haskell?   Do I even need Haskell to do polymorphism for me?  Can I just do it myself?  So what is polymorphism again?  How about:

  Ability to write one program and use it (consume it) in different ways

  Or, write different programs and consume them the same way  

How can I do that without interfaces, type classes, or language inheritance?  Here is an idea: I write my code as a bunch of specialized DSLs (Domain Specific Languages) and write several interpreters to run them.  DSL programs are abstract syntax trees and nothing more; interpreters are computations that actually do something (or many things) with these trees.  This way I decouple production (DSL) from consumption (interpreter) and things should be good.  I will have do-it-myself polymorphism, strong decoupling of concerns, and more.

Does the very idea sound weird to you?  Actually, this approach is now a recognizable pattern for structuring code in functional programming.  I may do DSL-interpreter decoupling simply because of code clarity, maintainability, and composability benefits.  Composa-what?

By composable DSLs, I mean DSLs that do not know of each other's existence, nor of the polyglot DSL that uses them.   Similarly, interpreters need to be agnostic of other interpreters and compose into a polyglot interpreter.

Good to pause here.  Think about how you would do this in an OO language.  Try to visualize a syntax tree of a polyglot program.  It will look like a multicolor 'zebra tree', a complete mishmash of instructions from various DSLs.  How could any interpreter ever make sense of it?  But they compose just fine...

I will present several different ways this can be done, all with a computation structure totally foreign to OO.  And all of this in a type system that is strong, second to none, no duck typing.

DSLs for free:    Each DSL language has its own type, and inhabitants of that type are simply programs in that DSL. Free is a type, and it is (among other things) a language generator.

Creating a DSL is as simple as typing this:

 type MyDSL  = Free MyDslInstructions

Well, there is also a completely mechanical conversion of instructions into the new DSL type. The cost is minimal and will probably get automated with Template Haskell at some point.

I can now start writing programs in that DSL and the Haskell type checker will verify that they are correct.

This does not look real!  That Free must be some large chunk of code!  Actually, it is one line, 45 characters long.  So what gives?  It is all about finding the right computation structure to do the job.  I will try to provide some intuitions.

Intuition:  Start with a set of, preferably orthogonal, abstract instructions for a language.  That is the MyDslInstructions type; it needs to be a sum type (more on this later).   The Free type definition is recursive.  Free explodes the MyDslInstructions type into a type that includes all possible sequences of instructions.  Programs are sequences of instructions and, bingo, we have what we want.
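To make the intuition concrete, here is a minimal sketch using the free package (KeyValF is a hypothetical instruction set, not taken from the graph library):

{-# LANGUAGE DeriveFunctor #-}
import Control.Monad.Free (Free, liftF)

-- The instruction set: a sum type, one constructor per instruction.
data KeyValF next
  = Put String String next
  | Get String (String -> next)
  deriving Functor

-- The DSL is just Free applied to the instruction functor.
type KeyValDSL = Free KeyValF

-- The mechanical conversion of instructions into DSL "smart constructors".
put :: String -> String -> KeyValDSL ()
put k v = liftF (Put k v ())

get :: String -> KeyValDSL String
get k = liftF (Get k id)

-- A program in the DSL: a plain value (a sequence of instructions),
-- type checked but not interpreted yet.
program :: KeyValDSL String
program = do
  put "name" "haskell"
  get "name"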

I also need to write an interpreter for my new DSL, but in this post let me assume this has been done already.   So how could DSL composition work?

   

Computing programs or computing on programs?     I have one DSL that specializes in graph traversal and another one that knows how to build a tree.  How about I use both to create a spanning (Wikipedia) tree for the traversal?   Each traversal step can be mapped to a simple program that adds that step to a tree.  Then, all the simple DSL-Tree programs can be reduced into one big program.  That program is returned as a result and interpreted separately.  It builds the tree.

I have just done map-reduce, and my reduction step was done on programs!   How cool!  And I am getting addicted just a little bit more.

Stacking DSLs one on top of the other (like an onion) is another way DSLs can work together.  The onion analogy also describes how interpretation works: it peels layer after layer, working on one DSL at a time. That approach involves a very clever trick:

Intuition:  A language of type A can host an arbitrary language of type B by providing an instruction that wraps state B (the internal computation state of that language) in a 'bubble' that has type A.  This way A has consistent state and can host B instructions.  Clever!

This trick is not specific to DSLs and is widely used in Haskell (keyword: monad transformer).
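Here is a minimal sketch of that onion using a standard monad transformer from the transformers package (the example itself is hypothetical): a State layer hosting IO underneath, with 'lift' wrapping the inner-language actions so the outer layer can host them.

import Control.Monad.Trans.Class (lift)
import Control.Monad.Trans.State (StateT, evalStateT, get, modify)

countingEcho :: StateT Int IO ()
countingEcho = do
  line <- lift getLine              -- inner-layer (IO) instruction, lifted
  modify (+ 1)                      -- outer-layer (State) instruction
  n <- get
  lift (putStrLn (show n ++ ": " ++ line))

main :: IO ()
main = evalStateT countingEcho 0    -- peel the State layer, leaving plain IO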

Actually, as sophisticated as this is, there are more interesting ways to compose DSLs.  But before I can explain what they are, I need to take a side-tour and introduce two other unbelievable concepts:

Sidetour: Adding and multiplying types.    Imagine that we could define operators that work, hmm, on types!  We could do crazy stuff like this (assume that the 'ingredients' Word4, Name, Email, … Get, Put, Up, etc. are all types):

  type IpAddress = Word4 :*: Word4 :*: Word4 :*: Word4

  type Contact   = Name :*: Address :*: [Email] :*: [Phone]

  type HttpProtocol =  Get :+: Put :+: Post :+: Delete

  type StockTicker  =  Up Float :+: Down Float  

:*: represents a type product (type inhabitants must contain all the ingredients),

:+: represents a type sum, also known as a coproduct (type inhabitants inhabit one of the ingredient types).  Product and coproduct are dual concepts.  If you do not know what dual means, just think about it on an intuitive level as 'exact opposite'.

Even more interesting is the case of parameterized types; it would be nice to have something like this:

  type Collection a = (List :+: Set) a

in other words, Collection Int is either List Int or Set Int but is not Set String.

This sci-fi is reality.  Type operators are mostly syntactic sugar, but the real eureka moment is that types can be products or coproducts and that you can do full-fledged algebra on types.   So there is a type representing 0 and a type representing 1 and the usual laws like a :*: (b :+: c) == (a :*: b) :+: (a :*: c) (with some relaxed definition of ==, of course).   Algebras are a good thing, algebras do not have bugs.

FYI, all or almost all of the types in traditional OO are products.  Java class instances have all of the class methods, not one of its methods.   Some coproduct behavior can be modeled with inheritance, but that approach largely misses the boat.  And, incidentally, TypeOperators is a Haskell language extension, so you can write code very similar to the pseudo-code examples I just wrote.
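For instance, the functor-level versions of these operators (the ones that matter for the Free story below) take only a few lines.  This is a sketch in the spirit of Data types a la carte and GHC.Generics, not a library you import as-is:

{-# LANGUAGE TypeOperators #-}

data (f :+: g) a = Inl (f a) | Inr (g a)   -- coproduct: one of the two
data (f :*: g) a = (f a) :*: (g a)         -- product: both at once

infixr 6 :+:
infixr 7 :*:

-- For plain (non-parameterized) types the same idea is morally Either and (,):
type StockTicker = Either Float Float      -- "Up Float :+: Down Float"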

I hope you can see that it is hard to stop learning this stuff.   But how is this related to the Free DSL story?   It turns out that Free plays nicely with coproduct types.  Free 'preserves' coproducts.  We will use this fact in a bit.

Sidetour:  Extensible effects.  Probably the most touted advantage of pure FP is that it provides type safety over programs that can engage in criminal activity: side effects.  Haskell wraps effects (IO, variable mutations, etc.) in a monad type.  That type discloses the crime and the mark of shame cannot be erased.   Unfortunately, these monad types are a bit coarse-grained.  In particular, the IO monad allows for any side effect there is.

Sidenote: I have heard of only two commercial languages that can provide this level of type safety:  Haskell and PureScript (which is very similar to Haskell). But I know nothing about OCaml, so maybe three?

With the DSL-interpreter pattern, effect type safety really starts to shine.  Effectful type signatures distribute nicely between DSL and interpreter.  Here is how the type signature of a program that uses a FileSystem DSL and a REST DSL and returns a String may look:

     myDslProgram :: Free (FileSystem :+: REST) String

Types are amazing!  There is much more to extensible effects than what I have described; let's get back to composing DSLs.

DSL injections, polymorphic DSLs, and interpreters as folds.    This idea is conceptually simple to explain at a very high level, but there is a bit of depth to it and it took me most of this summer to get it.  As before, I have:

 type MyDSL  = Free MyDslInstructions

If I can map MyDslInstructions into a bigger sum BigLanguageInstructions, it should be intuitive and expected that sequences of MyDslInstructions (programs) become sequences of mapped instructions (programs) in the bigger language.  Thus, I can 'inject' MyDSL into any DSL that is an 'extension' of my language (one that can speak my instruction set).  In particular, I should be able to inject MyDSL into:

 type BiggerDSL = Free (MyDslInstructions :+: SomeOtherDsl)

And the same is true about the Free SomeOtherDsl language.  This means that DSLs compose and instructions (adjusted with an 'inject' function) can be used in the polyglot composition. It turns out that the power of Haskell's upside-down polymorphic type classes allows me to specify 'injection' without even knowing BiggerDSL!  And I can define MyDSL instructions so they can simply be used 'as is' in polyglot programs.
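Here is a sketch of that injection machinery, simplified from the Data types a la carte encoding (the class and instances below are illustrative; the paper has the full treatment):

{-# LANGUAGE TypeOperators, MultiParamTypeClasses, FlexibleInstances, FlexibleContexts #-}
import Control.Monad.Free (Free, liftF)

data (f :+: g) a = Inl (f a) | Inr (g a)

instance (Functor f, Functor g) => Functor (f :+: g) where
  fmap h (Inl x) = Inl (fmap h x)
  fmap h (Inr y) = Inr (fmap h y)

-- 'sub :<: sup' says: the smaller instruction set injects into the bigger one.
class (Functor sub, Functor sup) => sub :<: sup where
  inj :: sub a -> sup a

instance Functor f => f :<: f where
  inj = id

instance (Functor f, Functor g) => f :<: (f :+: g) where
  inj = Inl

instance {-# OVERLAPPABLE #-} (Functor f, Functor g, Functor h, f :<: g)
      => f :<: (h :+: g) where
  inj = Inr . inj

-- Smart constructors written against ':<:' work in ANY bigger DSL,
-- without ever naming that bigger DSL.
inject :: (f :<: g) => f a -> Free g a
inject = liftF . inj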

Interpreters are generalized folds.  If you have worked with folds before (e.g. map-reduce), this may be one of those simultaneous aha and wow moments.  A program is a sequence of instructions that needs to be folded to produce a result.   It turns out that this can also be done polymorphically, allowing interpreters to run in the context of a larger, polyglot program.
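A minimal sketch of an interpreter-as-fold, again with a hypothetical instruction set: foldFree from the free package folds one handler per instruction through the whole program.

{-# LANGUAGE DeriveFunctor #-}
import Control.Monad.Free (Free, foldFree)

data KeyValF next
  = Put String String next
  | Get String (String -> next)
  deriving Functor

runKeyVal :: Free KeyValF a -> IO a
runKeyVal = foldFree step
  where
    -- One case per instruction; 'foldFree' takes care of the sequencing.
    step :: KeyValF x -> IO x
    step (Put k v next) = putStrLn ("PUT " ++ k ++ " = " ++ v) >> pure next
    step (Get k reply)  = putStrLn ("GET " ++ k) >> pure (reply ("value-of-" ++ k))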

Sidenote (Edited):  A good intuition for DSL 'injection' is the analogy with sets.  Sets can be added (sum) and multiplied (cartesian product), and so can types.  Sets have a partial order (inclusion) defined by injections between sets; well, so have types!   How is that for polymorphism!?

Sidenote:  There is a good amount of depth here, as this solves the Expression Problem in Haskell and is a testimony to FP expressiveness (Wikipedia and Data types a la carte).

But this is not the end of it.  Now the story gets really unreal. 

Program-interpreter duality.  DSLs add, interpreters multiply!    This amazing program simplicity is achieved by pairing dual concepts (types).   Product types are dual to coproduct types.  The Free type has its dual: the Cofree type. Each of the DSL's abstract instructions can be paired with a corresponding interpreter that has a dual type signature.  By exploiting all of these dualities, DSLs combine almost without any code!  Interpreters are simply co-programs.

Earlier I created a DSL using this line of code:

  type MyDSL = Free MyDslInstructions

For this to work properly, the MyDslInstructions type needs to be a coproduct type.

There is a dual version of the whole Free story.  In a nutshell we get this:

  type MyDSL = Free MyDslInstructions

  type MyInterpreter = Cofree MyInterpretations

MyInterpretations contains one abstract interpreter per abstract instruction, and as you are probably guessing, it needs to be a product type.  In addition, each of the abstract 'interpretations' needs to have a signature dual to that of the corresponding abstract instruction.

What this all amounts to is that languages add and interpreters multiply.  That makes languages and interpreters very composable:

type HighLevelDSL         = Free   (abstrInstructions1    :+: abstrInstructions2    :+: ..)

type HighLevelInterpreter = Cofree (abstrInterpretations1 :*: abstrInterpretations2 :*: ..)

This was at the type level, but at the implementation level things are equally easy.
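Here is a small sketch of the pairing machinery behind "DSLs add, interpreters multiply", roughly following Dave Laing's Cofun posts (illustrative names, not the exact library API):

{-# LANGUAGE TypeOperators, MultiParamTypeClasses, FlexibleInstances #-}

data (f :+: g) a = Inl (f a) | Inr (g a)   -- instruction sets add
data (f :*: g) a = (f a) :*: (g a)         -- interpretations multiply

-- 'pair' matches one interpreter step (f) against one program step (g).
class Pairing f g where
  pair :: (a -> b -> r) -> f a -> g b -> r

-- The interpreter carries a product (it can handle every instruction);
-- the program issues a sum (one instruction at a time), so pairing
-- simply selects the matching interpretation.
instance (Pairing f1 g1, Pairing f2 g2) => Pairing (f1 :*: f2) (g1 :+: g2) where
  pair p (i1 :*: _ ) (Inl x) = pair p i1 x
  pair p (_  :*: i2) (Inr y) = pair p i2 y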

If you would like to see more details, you can find examples in the Dave Laing blog linked below or in my github graph library, also linked.

This sounds to me like the best thing since sliced bread.  It allows for writing very composable code and for a very nice separation of concerns.

Limitations:  There is one which, to me, makes things even more interesting:  if I want my DSLs to be monadic (see the Effects section of this post), my interpreters will be comonadic and very un-effectful.

Basically, as a software designer, I have some decisions to make about how I compose my DSLs.  Hey, that's what I am paid for as a developer!  Wait, I am paid to work on mutating instance variables or something as criminal as Hibernate.  Scratch that idea.

Vertical DSLs.   With the DSL-interpreter pattern, I have some choices to make.  Do I code a fat DSL and a thin interpreter, or a thin DSL and a fat interpreter?  My OO mind shouts: fat interpreter, fat interpreter...   But there is often a better way, as described here: A Modern Architecture for FP.

In a nutshell, one quite straightforward way DSLs can work together is by implementing higher-level DSLs in terms of lower-level DSLs.

Intuition: Basically, to implement a higher-level DSL_HL in terms of DSL1, ..., DSLn, I need to implement every instruction in my DSL_HL as a polyglot program using DSL1, ..., DSLn.  There is quite a bit of structure to this approach as well, but I will spare you the theory.
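Here is a minimal sketch of one such vertical layer (hypothetical instruction sets, one low-level DSL instead of several): each high-level instruction is implemented as a program in the lower-level DSL, and foldFree translates whole high-level programs.

{-# LANGUAGE DeriveFunctor #-}
import Control.Monad.Free (Free, liftF, foldFree)

-- Lower layer: raw key-value storage instructions.
data StoreF next = PutKV String String next | GetKV String (String -> next)
  deriving Functor
type Store = Free StoreF

putKV :: String -> String -> Store ()
putKV k v = liftF (PutKV k v ())

-- Higher layer: domain-speak instructions.
data UserF next = RenameUser Int String next
  deriving Functor
type UserDSL = Free UserF

-- One low-level program per high-level instruction; 'foldFree' then
-- translates whole high-level programs into low-level ones.
toStore :: UserDSL a -> Store a
toStore = foldFree step
  where
    step :: UserF x -> Store x
    step (RenameUser uid name next) = do
      putKV ("user:" ++ show uid ++ ":name") name
      pure next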

So maybe the answer is that nothing should ever get fat.  There will just be more vertical DSL layers. This way we could get clear code separation that could accomplish the, so far hopeless, quest for separation of concerns (Wikipedia).

So here is the sci-fi for you again:  think of coding your next app as a layered set of DSLs. Your app functionality, all business logic, everything, is decomposed into thin layers of single-purpose DSLs.

I hope I did not lose you.  The rest of the post should be easier to read.

 

Unreal FP reality.   If the things I have described in this post look unreal to you, it is probably because your programming background matches mine.

Polymorphic data:   On a daily basis we write code that uses LSP (Liskov substitution), so the results of polymorphism based on type constraints and 'forall' quantifiers will appear unintuitive and unreal.

Free DSLs:  As OO programmers we work almost exclusively with product types; coproduct types are very much foreign to us. Monadic computing is totally foreign to us, even though it is often a better match to the real-life problems we are trying to model and solve.  The structure we use in our code is comonadic, but do we even know that?  We use some free types like lists, but we think of them as 'arrays' and totally block out different ways of thinking, like recursive data types.  Y-combinator means a company to us.

Types and effects:  As OO programmers, we have no need to learn Type Theory.  As for the 'Trinity of Computer Science' (Proof Theory, Type Theory, Category Theory), we typically know none. Our types lack generality and, thus, are weak. They also do not provide much code transparency, as we can hide almost any implementation under any type signature.  The concept of types that actually restrict programs (at the extreme end, think about a type that cannot be inhabited by any program) feels totally foreign.  So does any computing on types, like type algebra, type recursion, or functors. Basically, our types just sit there and look pretty.

There is no magic here; we just lack intuition and conceptual understanding of how the code can be structured.  The unreal becomes real after you figure out the structure that makes it work.   What we are missing out on are concepts, not Haskell.

Convictions I have abandoned this summer:  I used to say stuff like this, and I hear other people talk like this today:

"There is no such thing as a free lunch,  the code needs to be somewhere to do these things"

The idea is to program with concepts, not with brute force!  Placing a computation in the correct computational structure cuts a lot of lines of code.

"KISS" (wikipedia)

Haskell has a very different approach to generalization, with very different end goals. Generalization in Haskell applies to large chunks of code as well as to small sections of each line. Generalization yields more strongly typed code, often uncovers the structure of the computation, and opens up code reuse.  It is a different ball game.  KISS no longer applies.

"Programming is a branch of computer science"

I may have overdone this one a bit :)   Did you know that there is a branch of math devoted to studying, well, polymorphism and concepts that generalize computations?  It is called Category Theory.   In this post, I have purposefully silenced evidence of Category Theory use.  Programming using category theory concepts is really polymorphic, polymorphic across different domains of science!  Most importantly, these concepts really work.

"Anything can be implemented, the question only is how much time is going to take"

Your compiler must be very forgiving.  Try working with GHC ;-).  

This is a manifestation of imperative 'here are the instructions' thinking.  OO design provides a recipe to do anything. This expressiveness is a problem.  It yields brute-force solutions and unmaintainable code.  As an imperative programmer I am a jack of all trades and master of none.  Imperative is sometimes needed.   But I refuse to accept that this is all there is.

The End:   Thank you for staying with me for so long and reading my blog.  This post is 8 pages already, and it covered just one path into FP; there are many other paths and unreal things to explore.  Even the coverage of the DSL-interpreter pattern is far from complete (how about DSL language optimizations?).

There is an amazing FP programming world out there.   The best way to embrace it is to start playing with a true FP language like Haskell.   Nobody will ever learn monads by reading 'monads in C#' or 'monads in Java' articles.  You will just get a headache.  Get inspired to start playing with FP using FP!   It is worth noting that Haskell has inspired a lot of Scalaz concepts.  That includes the Free DSL pattern in Scala.  These people are playing with Haskell!

The Code for this post:  My graph library is on github.

  https://github.com/rpeszek/GraphPlay

  https://github.com/rpeszek/GraphPlay/wiki 

Wiki pages for that project are all literate programs.  Literate programs are better than Blogger!  ...But since my typing is terrible, these may not be as literate as one would have hoped.

Acknowledgements:

To the generous FP community: thank you for spending time to share and, thus, helping me learn!  I cannot possibly include here a complete list of the excellent sources that should be acknowledged in this post.  I do feel guilty about it. Here is my very abbreviated list:

 Data types a la carte, the original paper by Wouter Swierstra, defines many of the presented ideas and describes them in much more detail.

 Cofun with cofree comonads is an amazing sequence of posts by Dave Laing that includes the Free-Cofree pattern.  These posts also contain a better list of due credits.

 Haskell for all by Gabriel Gonzalez is a great learning resource

 A Modern Architecture for FP by John A De Goes is a good read which will make you think about what is really important in all of this.

 Algebra on types, a very nice introduction by Chris Taylor.

And again, thank you for reading.  R.    

 

Wednesday, October 9, 2013

Functional Groovy: Y-Combinator. Learning Groovy from Haskell ;)

Intro

I guess that the Y-combinator (maybe together with Monad) is now on the Geek-Should-Know concept list.  Also, YCombinator is the name of a venture capital firm, so even managers may know about it!

I intend this to be a very simple intro to the Y-combinator using Groovy closures.  There are other blog posts about the Y-combinator in Groovy.

My goal is to explain/derive the Y-combinator, not just to show how to use it.
An excellent post with code examples in Scheme can be found here:
 http://mvanier.livejournal.com/2897.html

Combinators and Things: The History

"Computer Scientists do not know their history"  Eric Meijer
My background (education) is mathematics, not CS.  As a mathematician, I was never much into logic.  I am learning this stuff now.  I believe CS (maybe this is changing) education does not typically include topics like Combinatory Logic,  Lambda Calculus or Category/Functor Theory.
Universities produce an army of excellent engineers which think in only imperative terms.
(Correct me if I am wrong, I am sure this is very much school dependent.)

SKI calculus and combinators (1920s, Moses Schönfinkel, Haskell Curry) are older than Turing Machines (1930s).  Lambda calculus (1930s, Alonzo Church) is contemporary with Turing's work. Lots of this is very cool stuff and includes the work of many other mathematicians, most notably David Hilbert.

The SKI calculus motivation was logic, not computing, but, in fact, Church's work was largely motivated by the idea of a computer (same as Turing's work was).  You can think of lambda calculus as `software oriented` thinking, while Turing machines were `hardware oriented`.
That last statement is somewhat questionable: since the two base combinators S and K (I can be derived) can express complete mathematical logic, it may be possible to build a CPU with S, K, I as the instruction set.  That idea has been worked on a lot (however, it is hard to google for:  "SKI hardware", "SKI instruction set"  ;).
There is some learning involved; I have recently found a few 'quick' guides that help.

Enough of this intro. (I am not sure it even belongs in this post.)

Recursion as Fix-point

Functions are often defined using recursion. In this post, I will use the Fibonacci sequence with its (super-slow) recursive definition, which may look like this in Groovy:

 Closure fib
 fib = {int n -> n < 2 ? n: fib(n-1) + fib(n-2)}
but everything discussed here will generalize to other recursive function definitions.

Here is a different (and interesting) way to look at this.  We can create a functional FIB (a function that takes a function as argument and returns a new function) like so:

Closure FIB = {Closure f ->
  {int n -> n<2 ? n: f(n-1) + f(n-2)} //returns new closure
}
With Fpiglet you could alternatively define FIB like this (implicit currying):

 Closure FIB = FpigBase.f {Closure f, int n -> 
    n<2 ? n: f(n-1) + f(n-2)
 }

Our FIB functional maps functions into functions. The fix-point (if it exists, and here it does) is a function f0 such that:

 FIB(f0) == f0

It should be obvious that this f0 fix-point must be the same as the fib function defined above, and this type of argument easily generalizes to other recursive definitions.

So the name of the game here is: find a way to 'solve' functionals like the FIB functional defined above for fix-points.

Fix function:

Wouldn't it be cool if we had a 'magic' function called fix which could calculate f0 like so:

 f0 = fix FIB

Obviously, Haskell has one ;) and it is defined like this:

fix :: (a -> a) -> a
fix f = f (fix f)

This is exactly what we need: if you blindly plug in FIB for f and f0 for fix FIB, the function implementation line becomes:

 f0 = FIB f0

so if fix returns anything, it will be a fix-point!

With the above we can define the Fibonacci sequence like this (note fib' is the same as the FIB functional; we just do not use upper-case function names in Haskell):

fib' :: (Integer -> Integer) -> Integer -> Integer
fib' f 0 = 0
fib' f 1 = 1
fib' f n = f (n-1) + f (n-2)

f0 :: Integer -> Integer
f0 = fix fib'

numbs = map f0 [0..6] --numbs equals to [0,1,1,2,3,5,8] 

Let us do that in Groovy!  There is a little glitch: Haskell is a lazy language, so the above implementation of fix does not have to stack overflow; this simple Groovy equivalent will always stack overflow:

def fix
fix = {Closure f ->
    {x -> (f << fix(f)) x} 
}

but not if you do it like this:

def fix
fix = {Closure f ->
    f({x -> (fix(f))(x)}) 
}

you can run the following to convince yourself that it works:

def f0 = fix(FIB)
assert [0,1,1,2,3,5,8] == (0..6).collect(f0)
assert 8 == f0(6)


The trick is that the fix function passes the closure {x -> (fix(f))(x)} to the next iteration of FIB, and that closure does not need to be invoked if n < 2.
This is cool, but the definition of `fix` itself is recursive.  So we are swapping one recursion for another.  It would be better if we could do recursion without ANY recursion!

Y-Combinator

The Y combinator is, well, a combinator (whatever that means) invented by Haskell Curry.
It is typically represented by the following lambda expression (whatever that means):

  λf.(λx.f (x x)) (λx.f (x x))

Instead of explaining its meaning, let me try to derive it (using Groovy) and the meaning may become clear. Forget about the fix function. I would really like to do something like this:

FIB(FIB)(6) //or FIB(FIB, 6)

This will not work with the current definition of FIB, but it will work if you define it like so:

Closure FIB = {Closure f, int n ->
    n < 2 ? n: f(f, n-1) + f(f, n-2)
}

FIB(FIB, 6) //returns 8

or using curried versions:

Closure FIB = {Closure f ->
  {int n -> n < 2 ? n: f(f)(n-1) + f(f)(n-2) }
}

assert 8 == FIB(FIB)(6)

This is it, this is the main trick behind the Y-combinator.  The rest is simply a bunch of refactoring steps.

This trick may look like cheating, and to some extent it is.  Instead of being able to call the function from itself (which would be recursion), we pass a copy of that function to itself so that the function can call it. On one hand this seems like recursion in disguise; on the other hand it is function composition.  On some deep level, composing a function with itself and recursion are similar concepts.

I like to think that function composition is a stronger concept than recursion.  Y supports that claim.

I can refactor my code so it looks like this:

Closure c2 = {f ->
   return {x -> f(f)(x)}
}

Closure FIB= {Closure f2 ->
  return {int n -> n < 2 ? n: f2(n-1) + f2(n-2)}
}

c2({Closure f ->
  return FIB(c2(f))
}) (6)  //returns 8

and now I can re-write my code like so:

Closure c2 = {f ->
   {x -> f(f)(x)}
}

Closure FIB = {Closure f ->
  {int n -> n < 2 ? n: f(n-1) + f(n-2) }
}

def Y = {Closure fx -> c2({Closure f ->
   fx(c2(f))
})}  //fx represents functional, f function

def f0 = Y(FIB)
assert [0,1,1,2,3,5,8] == (0..6).collect(f0)

and this is it, we have the Y combinator in Groovy.  And if you look long enough you will see the original lambda expression!
Actually, what we got is this (Groovy is a strict/eager evaluation language):

 λf.(λx.f (λv.((x x) v))) (λx.f (λv.((x x) v)))

Formal Mathematical Proof

It is actually not very hard.   This PDF: http://www.inf.fu-berlin.de/lehre/WS03/alpi/lambda.pdf
has a good introduction to lambda calculus, but the proof that Y acts as a fix-point finder requires more explanation:

Y g = (λf . (λx . f (x x)) (λx . f (x x))) g   //1. definition
= (λx . g (x x)) (λx . g (x x))                //2. β-reduction of λf: applied to g
= g ((λx . g (x x)) (λx . g (x x)))            //3. β-reduction of λx done on x x
= g (Y g)                                      //4. definition again

The only thing that needs explanation is how we get from 2 to 3.  We apply the left (λx . g (x x)) to the right one, substituting (λx . g (x x)) for x in the body g (x x).

Useless or useful?

If you visited Hamlet D'Arcy's post on the Y-combinator, you may have noted that he classified the Y-combinator as cool but useless.
I think the main applicability of the Y-combinator is that it lets you keep your closures as anonymous functions.   You can define a recursive closure without giving it a name, like this:

Y {Closure f ->
    { ... }
}

I think that could make some of your code more expressive.  Also, if point-free (pointless) programming is important to you, then you should love Y!  Now, pointless and useless are not exactly the same ;).

There is also the issue of recursion being logically dangerous.  You can look up 'Curry's Paradox' and read about that.  But hopefully compilers will catch all of that for us, and I do not think this is related to Groovy programming anyway.

So is this just cool theoretical CS?  My take is that, for working programmers, it may be more important to know what combinators are in general and try to learn to 'think in combinators'. In that, the Y-combinator itself will probably not help much.  'Thinking in combinators' will involve higher-level combinators like the function sets defining Monad, Arrows, Lenses, etc.