Software has two ingredients: opinions and logic (=programming). The second ingredient is rare and is typically replaced by the first.
I blog about code correctness, maintainability, testability, and functional programming.
This blog does not represent views or opinions of my employer.

Saturday, September 3, 2016

How I got addicted to Haskell. And so can you!

  ...Or, what I done this summer.

Let me start with this statement:  I have learned ways to write and structure code I did not ever thought possible. Ways which my OO trained mind has hard time accepting as real or even believable.  Yet, in many ways, these ways are more correct and real than the ways I knew before.  

This reading is intended for OO developers.  This post will talk about some amazing code that is possible in FP.  It will present high level intuitions and concepts behind that code.  It will skip the steep theoretical background.   When you see monad (or whatever) and do not know what it is, simply think it is ‘a computation structure that allow me to do that’ and move on.  The focus is on ‘what is possible’, not on details of ‘how it's done’.

If you are an experienced Haskeller and would like to comment or add to my presentation you are welcome to do so as well.

It all started innocent.  I just wanted to check if I can implement, in Haskell, a particularly complex piece of imperative logic I wrote for my work.  So I did just that.

This logic was about graphs (as in wikipedia).  At work I had to create a lot of graphs to test various special cases in the software.  So,  I decided to check how graph construction would work in Haskell.  If I can build graphs, I should be able to adjust them too....  

I was always interested in polymorphism.  What it is, what it can do, what it can be.  So what exactly is a graph?  Can I define it using Haskell type constraints?   So I worked on  that.

Now I have these polymorphic Haskell graph type classes, I have a type class for readonly graphs, one for buildable graphs, one for adjustable graphs with inheritance going between them.  I also have 5 or six specific graph instance types that `implement` my type classes, like adjacency map (edges-by-vertices) type, adjacency matrix type, simple list of edges and vertices types and so on.  I start to play with graph creation.

Polymorphic data.   Polymorphic what?  So I created a polymorphic diamond (wikipedia) graph data structure. Which instance is it?  It is all of them.  It is polymorphic so it can be whichever you like!  You want to look at adjacency matrix just select AdjacencyMatrix type, you want to look at is as EdgesByVertices map just select that type, and so on. So how weird that is for an OO programmer?  Polymorphic data structure!

But it gets better than that.  How about polymorphic graph distance calculation which can either return integer distance or shortest path depending on which type I select?

To demystify this a bit,  Haskell type class polymorphism is opposite to the OO inheritance.  OO is based on exists quantifier, Haskell on forall quantifier.   That allows Haskell typechecker to enforce that polymorphic code does only things that apply to all instances.  For example, this OO code is not truly polymorphic:  

Graph createDiamond(){def res=new AdjacencyMap(); return res;} 

because that code can manipulate AdjacencyMap, which is a specific instance. Did I muddy the waters even more? Just think about Haskell polymorphism as an upside-down version of OO (whatever that means) and let’s move on.

Now I am thinking:  I have all these implementations of my graph type class (or -sses as there is Graph, DiGraph, EdgeSemantics, AdjacencyIndex and so on...) are they correct?  How do I test these?  

And, I thought, I wrote automated tests before.  What does it mean for a graph to be correct or not correct?  There must be some graphish things that make graphs ... well graphs.  Call them graph laws or graph properties.  I need to define them.  

I came up with a set of graphish properties to verify my implementing instance types.  How do I actually test these? How about I generate lots of random graphs and check that each satisfies the property in question.  How hard would that be?  Easy!  Property based testing appears to be the most popular way to test stuff in Haskell.  The library is called QuickCheck.   What an interesting way to test!  I have always believed that good testing should have some random aspect to it.  And so I did it and got addicted a bit more.

Do-it-yourself polymorphism.     But are type classes the only way to do polymorphism in Haskell?   Do I even need Haskell to do polymorphism for me?  Can I just do it myself?  So what is polymorphism again?  How about:

  Ability to write one program and use it (consume it) in different ways

  Or, write different programs and consume them the same way  

How can I do that without interfaces, type classes, language inheritance?  Here is an idea: I write my code as bunch of specialized DSLs (Domain Specific Languages) and write several interpreters to run these.  DSL programs are abstract syntax trees and nothing more,  interpreters are computations that actually do something (or many things) with these trees.  This way I decouple production (DSL) from consumption (Interpreter) and things should be good.  I will have a do-it-myself polymorphism, strong decoupling of concerns, and more.  

Does the very idea sound weird to you?  Actually, this approach is now a recognizable pattern for structuring code in functional programming.  I may do DSL - interpreter decoupling simply because of code clarity, maintainability and composability benefits.  Composa what? 

By composable DSLs,  I mean DSLs that do not know of each other existence and of the polyglot DSL that uses them.   Similarly,  interpreters need to be agnostic of other  interpreters and compose into a polyglot interpreter.  

Good to pause here.  Think how easy would it be to do this in OO language.  Try to visualize a syntax tree of a polyglot program.  It will look like a multicolor ‘zebra tree’ with complete mishmash of instructions from various DSLs.  How any interpreter could ever make sense of it?  But, they compose just fine...  

I will present several different ways this can be done, all with computation structure totally foreign to OO.  And  all of this in a type system that is strong, second to none strong,  no duck typing.

DSLs for free:    Each DSL language has its own type and inhabitants of that type are simply programs in that DSL. Free is a type, and is (among other things) a language generator.  

Creating a DSL is a simple as typing this:

 type MyDSL  = Free MyDslInstructions

Well, there is also a completely mechanical conversion of instructions into the new DSL type. The cost is minimal and will probably will get automated with Template Haskell at some point.

I can now start writing programs in that DSL and Haskell type checker will verify that they are correct.

This does not look real!  That Free must be some large chunk of code!  Actually it is 1 line 45 characters long.  So what gives?  It is all about finding right computation structure to do the job.  I will try to provide some intuitions.

Intuition:  Start with a set of,  preferably orthogonal, abstract instructions for a language.  That is the  MyDslInstructions type,  it need to be a sum type (more on this later).   Free type definition is recursive.  Free explodes MyDslInstructions type to a type that includes all possible sequences of instructions.  Programs are sequences of instructions and, bingo, we have what we want.

I also need to write an interpreter for my new DSL,  but in this post let me assume this has been done already.   So how could DSL composition work?

   

Computing programs or computing on programs?     I have one DSL that specializes in a graph traversal and another one that knows how to build a tree.  How about I use both to create a spanning (wikipedia) tree for the traversal?   Each traversal step can be mapped to a simple program that adds that step to a tree.  Then, all the simple DSL-Tree programs can be reduced into one big program.  That program is returned as result and interpreted separately.  It builds the tree.  

I just have done map-reduce and my reduction step was done on programs!   How cool!  And I am getting addicted just a little bit more.  

Stacking DSLs  one on top of the other (like an onion) is another way DSLs can work together.  Onion analogy also describes how interpretation would work, it peels layer after layer, working on one DSL at the time. That approach involves a very clever trick:  

Intuition:  Language of type A can host an arbitrary language of type B by providing an instruction that wraps state B (internal computation state of that language) in a 'bubble' that has type A.  This way A has consistent state and can host B instructions.  Clever!

This trick is not specific to DSLs and is widely used in Haskell.  (keyword: monad transformer).

Actually as sophisticated as this is, there are more interesting ways to compose DSLs.  But,  before I can explain what they are,  I need to take a sidetour and introduce 2 other unbelievable concepts:

Sidetour: Adding and multiplying Types.    Imagine that we could define operators that work,  hmm,  on types!  We could do crazy stuff like this (assume that the ‘ingredients’  Word4, Name, Email,… Get, Put, Up  etc are all types):  

  type IpAddress = Word4 :*: Word4 :*: Word4 :*: Word4

  type Contact   = Name :*: Address :*: [Email] :*: [Phone]

  type HttpProtocol =  Get :+: Put :+: Post :+: Delete

  type StockTicker  =  Up Float :+: Down Float  

:*: represents type product (type inhabitants must contain all the ingredients),  

:+: represents a type sum, also known as coproduct (type inhabitants inhabit one of the ingredient types).  Product and coproduct are dual concepts.  If you do not know what dual means, just think about it on intuitive level as ‘exact opposite’.

Even more interesting is the case of parameterized types, it would be nice to have something like this:

  type Collection a = (List :+: Set) a

another words,  Collection Int is either List Int or Set Int but is not Set String.  

This sci-fi is reality.  Type operators are more a syntax sugar,  but the real eureka moment is that types can be product or coproduct and that you can do full fledged algebra on types.   So there is a type representing 0 and type representing 1 and the usual laws like a :*: (b :+: c) == (a :*: b) :+: (a :*: c) (with some relaxed definition of == of course).   Algebras are a good thing, algebras do not have bugs.    

FYI, all or almost all of the types in traditional OO are product.  Java class instances have all the class methods, not one of its methods.   Some coproduct behavior can be modeled with inheritance but that approach is largely missing the boat.  And, incidentally,  OperatorTypes is Haskell language extension so you can write a very similar code to the pseudo-code examples I just wrote.  

I hope you can see that it is hard to stop learning this stuff.   But how is this related to the Free DSL story?   It turns out that Free plays nice with coproduct types.  Free ‘preserves’ coproduct.  We will use this fact in a bit.

Sidetour:  Extensible Effects.  Probably,  the most touted advantage of pure FP is that it provides type safety over programs that can do criminal activity:  the side-effects.  Haskell wraps effects  (IO, variable mutations, etc) in a monad type.  That type discloses the crime and the mark of shame cannot be erased.   Unfortunately these monad types are a bit coarse grained.  In particular, IO monad allows for any side effect there is.

Sidenote: I have heard of only 2 commercial languages that can provide this level of type safety:  Haskell and Purescript (which is very similar to Haskell). But, I know nothing about OCaml, so maybe 3?.

With DLS-Interpreter pattern effect type safety really starts to shine.  Effectfull type signatures distribute nicely between DSL and Interpreter.  Here is how type signature of a program that uses a FileSystem DSL and a REST DSL and returns a String may look like:

     myDslProgram :: Free (FileSystem :+: REST) String

Types are amazing!  There is much more to extensible effects than what I have described,  let’s get back to composing DSLs:.

DSL Injections,  Polymorphic DSLs, and Interpreters are folds.    This idea is conceptually simple to explain on a very high level,  but there is a bit depth to it and it took me most of this summer to get it.  As before, I have:

 type MyDSL  = Free MyDslInstructions

If I can map MyDslInstructions into bigger sum BigLanguageInstructions, it should be intuitive and expected that sequences of MyDslInstructions (programs) become sequences of mapped instructions (programs) in the bigger language.  Thus,  I can ‘inject’ MyDSL into any DSL that is an ‘extension’ of my language (can talk my instructions set).  In particular I should be able to inject MyDSL into:

 type BiggerDSL = Free (MyDslInstructions :+: SomeOtherDsl)

And the same is true about Free SomeOtherDsl language.  This means that DSLs compose and instructions (adjusted with ‘inject’ function) can be used in the polyglot composition. It turns out that the power of Haskell upside-down polymorphic type classes allows me to specify ‘injection’ without even knowing  BiggerDSL!  And I can define MyDSL instructions so they can be  simply be used ‘as is’ in polyglot programs.  

Interpreters are generalized folds.  If you worked with folds before (e.g. map-reduce), this maybe one of these simultaneous aha and wow moments.  Program is a sequence of instructions that need to be folded to produce result.   It turns out that this can be also done polymorphically allowing interpreters to run in the context of a larger, polyglot program.

Sidenote (Edited):  A good intuition for DSL ‘injection’ is to use analogy with Sets.  Sets can be added (sum) and multiplied (cartesian product) and so can Types.  Sets have a partial order (inclusion) defined by injections between sets, well, so have Types!   How is that for polymorphism!?

Sidenote:  There is a good amount of depth here as this solves Expression Problem in Haskell and is a testimony to FP expressiveness (wikipedia and Data types a la carte).

But this is not the end of it.  Now the story gets really unreal. 

Program-interpreter duality.  DSLs add, Interpreters multiply!    This amazing program simplicity is achieved by pairing dual concepts (types).   Product types are dual to coproduct types.  Free type has its dual: the Cofree type. Each of the DSL abstract instruction can be paired with corresponding interpreter that has dual type signature.  By exploring all of these dualities DSLs combine almost without any code!  Interpreters are simply co-programs.

Earlier I have  created a DSL using this line of code:

  type MyDSL = Free MyDslInstructions

For this to work properly MyDslInstructions type needs to be a coproduct type.

There is a dual version of the whole Free story.  In nutshell we get this:

  type MyDSL = Free MyDslInstructions

  type MyInterpreter = Cofree MyInterpretations

MyInterpretations  contains one abstract interpreter per abstract instruction, and as you are probably guessing it needs to be a product type.  In addition, each of the abstract ‘interpretations’ needs to have signature dual to that of corresponding abstract instruction.

What this all amounts to is that languages add and interpreters multiply.  That makes languages and interpreters very composable:

type HighLevelDSL

   = Free (abstrInstructions1 :+: abstrInstructions2 :+: ..)

type HighLevelInterpreter

   = Cofree (abstrInterpretations1 :*: abstrInterpetations2 :*: ..)

This was on type level but on the implementation level things are equally super easy.  

If you would like to see more details, you can find examples in the Dave Laing blog linked below or in my github graph library, also linked.

That does sound to me like the best thing since the sliced bread.  It allows for writing very composable code and for a very nice separation of concerns.  

Limitations:  There is one which, to me, makes things even more interesting:  If I want my DSLs to be monadic (see Effects section of this blog) my interpreters will be comonadic and very un-effectful.  

Basically,  as a software designer,  I have some decisions to make how do I compose my DSLs.  Hey that’s  what I am paid for as a developer!  Wait, I am paid to work on mutating instance variables or something as criminal as Hibernate.  Scratch that idea.

Vertical DSLs.   With the DSL-Interpreter pattern, I have some choices to make.  Do I code a fat DSL and a thin Interpreter or a thin DSL and a fat Interpreter?  My OO mind shouts: fat interpreter, fat interpreter ...   But there is often a better way,  as described here: A Modern Architecture for FP   

In nutshell,  one quite straightforward way DSLs could work together is by implementing higher level DSLs in terms of lower level DSLs.

Intuition: Basically, to implement higher level DSL_HL in terms of DSL1, … DSLn,  I need to implement every instruction in my DSL_HL as a polyglot program using DSL1, … DSLN.  There is quite a bit of structure to this approach as well but I will spare you the theory.

So maybe the answer will be that nothing should ever get fat.  There will just be more vertical DSL layers. This way we could get clear code separation that could accomplish the, so far hopeless, quest for separation of concerns (wikipedia).

So here is the sci-fi for you again:  Think of coding your next app as a layered set of DSLs. Your app functionality, all business logic, everything,  is decomposed into thin layers or single purpose DSLs.    

I hope I did not lose you.  The rest of the post should be easier to read.

 

Unreal FP reality.   If the things I have described in my post look unreal to you, it is probably because your programming background matches mine.

Polymorphic data:   On a daily basis we write code that uses LSP (Liskov substitution) so results of polymorphism that is based on type constraints and 'forall' quantifiers will appear unintuitive and unreal.

Free DSLs:  As OO programmers we work almost only with product types; coproduct types are very much foreign to us. Monadic computing is totally foreign to us even though it is often a better match to real life problems we are trying to model and solve.  The structure we use in our code is comonadic but we do we even know that?  We use some free types like Lists but we think of them as 'arrays' and totally block out different ways of thinking:  like recursive data types.  Y-combinator means a company to us.

Types and Effects:  As OO programmers,  we have no need to learn Type Theory.  As for the 'Trinity of Computer Science' (Proof Theory, Type Theory, Category Theory) we typically know none. Our types lack generality and, thus, are weak. They also do not provide much for code transparency as we can hide almost any implementation under any type signature.  The concept of types that actually restrict programs (at the extreme end, think about a type that cannot be inhabited by any program) feels totally foreign.  So does any computing on types like: type algebra, type recursion, or functors. Basically,  our types just sit there and look pretty.  

There is no magic here, we just lack intuition and conceptual understanding of how the code can be structured.  The unreal becomes real after you figure out the structure that makes it work.   What we are missing out on are concepts not Haskell.

Convictions I have abandoned this summer:  I used to say stuff like this and I hear other people talk like this today:

"There is no such thing as a free lunch,  the code needs to be somewhere to do these things"

The idea is to program with concepts, not with brute force!  Placing computation in a correct computational structure will cut lots of lines of code.   

"KISS" (wikipedia)

Haskell has very different approach to generalization with very different end goals. Generalization in Haskell applies to large chunks of code as well as to all small sections of each line. Generalization yields a stronger typed code,  often uncovers structure of the computation, and opens up the code reuse.  It is a different ball game.  KISS no longer applies.

"Programming is a branch of computer science"

I may have overdone this one a bit :)   Did you know that there is a branch of math devoted to studying, well polymorphism and concepts that generalize computations?  It is called Category Theory.   In this post, I have purposefully silenced evidence of Category Theory use.  Programming using category theory concepts is really polymorphic,  across  different domains of science polymorphic!  Most importantly these concepts really work.

"Anything can be implemented, the question only is how much time is going to take"

Your compiler must be very forgiving.  Try working with GHC ;-).  

This is a manifestation of imperative ‘here are the instructions’ thinking.  OO design provides a recipe to do anything. This expressiveness is a problem.  It yields brute force solutions and not maintainable code.  As imperative programmer I am jack of all trades and master of none.  Imperative is sometimes needed.   But I refuse for this to be all there is.

The End:   Thank you for staying with me for so long and reading my blog.  This post has 8 pages already,  it covered just one path into FP,  there are many other paths and unreal things to explore.  Even the coverage of DSL-Interpreter pattern is far from complete (how about DSL language optimizations?).

There is an amazing FP programming world out there.   The best way to embrace it is to start playing with a true FP language like Haskell.   Nobody will ever learn monads by reading ‘monads in C#’ or ‘monads in Java’ articles.  You will just get a headache.  Get inspired to start playing with FP using FP!   It is worth noting that Haskell has inspired a lot of Scalaz concepts.  That includes Free DSL pattern in Scala.  These people are playing with Haskell!

The Code for this post:  My graph library is on github.

  https://github.com/rpeszek/GraphPlay

  https://github.com/rpeszek/GraphPlay/wiki 

Wiki pages for that project are all literate programs.  Literate programs are better than blogger!  ... But since my typing is terrible, these maybe not as literate as one would have hoped.

Acknowledgements:

To the generous FP community: thank you for spending time to share and, thus, help me learn!  I cannot possibly include here a complete list of excellent sources that should have been acknowledged in this post.  I do feel guilty about it. Here is my very abbreviated list:

 Data types a la carte  original paper by Wouter Swierstra defines many of the presented ideas and describes them in much more detail.

 Cofun with cofree comonads is amazing sequence of posts that includes Free-Coffree pattern by Dave Laing.  These post also contain a better list of due credits.

 Haskell for all by Gabriel Gonzalez is a great learning resource

 A Modern Architecture for FP by John A De Goes is a good read which will make you think about what is really important in all of this.

 Algebra on types very nice introduction by Chris Taylor

And again, thank you for reading.  R.    

 

No comments:

Post a Comment