Software has two ingredients: opinions and logic (=programming). The second ingredient is rare and is typically replaced by the first.
I blog about code correctness, maintainability, testability, and functional programming.
This blog does not represent views or opinions of my employer.
Showing posts with label SCALA. Show all posts

Sunday, August 20, 2017

Comprehensive real life functional programming examples



Several years ago, I think at Uberconf, I was listening to yet another talk that showed some list manipulations and called it functional programming.  I also remember being very confused about how to use FP in real life.  Seeing a bunch of examples did not translate, for me, into being able to write an end-to-end application using these concepts.  I remember telling my colleague that it would be really good to have a comprehensive example, not just isolated 5-line snippets.


About a year ago, I decided to just do it.  I decided to work on a simple CRUD application and implement it in a fully functional way.  The idea grew and became:  Polyglot application made out of interchangeable front-ends and back-ends implemented using various languages and functional libraries.  https://github.com/rpeszek/typesafe-web-polyglot. At this moment I have 3 back-ends (Yesod-Haskell, Servant-Haskell, Http4s-Scala) and 2 front-ends (Elm, Halogen-PureScript).   I will write short blogs describing my goals behind each of these projects.  This post describes my general goals.


Comprehensive, what do I mean?

These are relatively small apps (a bit over 1000 lines each) that still facilitate the amount of code reuse I would expect to see in a large application.  There is quite a bit of similarity between domain objects that expose a create-retrieve-update-delete interface, thus I expect code reuse too.  These apps do what typical web apps do today: talk to a database, do Ajax communication, deal with JSON, etc.
This is just CRUD, so there is no business logic to speak of and things are rather dull, but I think it becomes clear what adding business logic to that code would look like.


What is still not implemented?

Security,  better styling (completely ignored in my current PureScript app), more complex (has-many) relationships.


Fully functional way, what do I mean?

Referential transparency (same computation, same inputs == same results) - this ought to be the defining property of FP since it enables everything else.  Effectful computations can still be referentially transparent.  All of my apps treat this aspect with due respect. For example, my Scala backend app does only one unsafe operation: it creates an in-memory H2 DB at startup.
(Side-note: sadly this aspect did not make the cut in design of Java 8 streams - I say streams are based on opinions, not programming).
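To make the idea concrete, here is a minimal sketch of my own (illustrative only, not code from the apps) of how an effectful computation can stay referentially transparent: the effect is a suspended thunk, so building or transforming the description performs nothing.

```scala
// Minimal IO sketch: a computation is just a description; nothing runs
// until unsafeRun() is called at the edge of the program.
final case class IO[A](unsafeRun: () => A) {
  def map[B](f: A => B): IO[B] = IO(() => f(unsafeRun()))
  def flatMap[B](f: A => IO[B]): IO[B] = IO(() => f(unsafeRun()).unsafeRun())
}

// Same computation, same inputs, same result: replacing `twice` with its
// definition never changes the meaning of the program.
val answer: IO[Int] = IO(() => 21)
val twice: IO[Int] = answer.map(_ * 2)
```

Only the single call to unsafeRun at the very edge of the application is unsafe; everything built on top of it remains referentially transparent.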


Reasoning about code - I find that the use of type class polymorphism retains a good correspondence to logic.  Method signatures look like a rotated Gentzen notation with clearly identifiable assumptions (type constraints) and a conclusion (result type).  Here are 2 examples:

PureScript:
type AppM eff = Aff (ajax :: AJAX, appconf :: APPCONFIG, nav :: NAVIGATION | eff)
ui :: forall eff model.
                 Show model =>
                 EntityRoute model =>
                 EntityReadHTML model =>
                 EntityGET eff model =>
                 Proxy model -> H.Component HH.HTML Query Input Void (AppM eff)

Scala:
case class CrudHandler[K, D, E[_]](uri: String)(implicit evM: Monad[E],
                                              evPersistAsTask: E ~> Task,
                                              evConvertKey: IntId[K],
                                              evPersist: PersistCrud[K, D, E],
                                              evJsonK: Encoder[K],
                                              evJsonD: Encoder[D],
                                              evDecodeJsonD: Decoder[D]) { ... }

Even without knowing what these are, or wanting to admit that logic is important in programming, it is evident that having the information organized into logical assumptions and conclusions makes things clear. For example (PureScript), I know that I am using only GET REST effects (no PUT, DELETE, etc), and I know that my code is NOT using == comparison and has no sorting logic...


Polymorphism and generalization - most of my examples try to be polymorphic in both the domain objects (being CRUD-ed) and in the effects used.  For example, the same code works with database back-ends and STM persistence and, at the same time, works across domain objects.  In the OO world that approach would probably violate KISS.  In FP, generalization is an essential tool because it limits the solution space (parametricity) and makes it harder to write programs that compile and do not work.
(Some projects have ‘simple’ branch which is more ‘hard-coded’ and less polymorphic).
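A hypothetical sketch of what such double polymorphism can look like (all names here are mine, not the project's): CRUD code written against a persistence type class works for any key K, domain object D, and effect E[_], so the same handler can be backed by a database or by in-memory storage.

```scala
import scala.collection.mutable

// Hypothetical persistence type class: polymorphic in key, domain and effect.
trait PersistCrud[K, D, E[_]] {
  def insert(d: D): E[K]
  def retrieve(k: K): E[Option[D]]
}

// One possible instance: in-memory storage, with Function0 standing in
// for a "real" effect type such as Task.
class InMemoryCrud[D] extends PersistCrud[Int, D, Function0] {
  private val store = mutable.Map[Int, D]()
  private var nextKey = 0
  def insert(d: D): () => Int =
    () => { nextKey += 1; store(nextKey) = d; nextKey }
  def retrieve(k: Int): () => Option[D] =
    () => store.get(k)
}
```

Handler code that only demands a PersistCrud instance never learns which back-end it talks to, which is exactly what makes swapping DB and STM persistence possible.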


Type safety - I am interested in type-safe FP (hence the language choices).  All applications provide type safety over effects. Unique types are used as much as possible (for example, integer IDs used in URLs or as DB keys get unique types instead of just Integer).  A big advantage of typed functional languages is type safety in deconstruction (pattern matching), surfaced as the non-exhaustive pattern match error. This type safety feature comes for free (except maybe in Scala, which needs the sealed keyword) and my example apps, obviously, take advantage of it.
Just looking at the method signatures shown above, one has to at least entertain the possibility that this amount of type information could imply stronger type checking.


Composability - standard monadic and applicative approaches are used to achieve composability. I think real-world examples like these CRUD apps make monadic computations feel right at home and very readable.
Side-note: It was somewhat of a revelation for me when I finally realized (from Eugenio Moggi's original paper: https://core.ac.uk/download/pdf/21173011.pdf) that sequential composability == monads.  Basically, composability == category, composable computations == Kleisli. If you want composable code, you need monads.  Maybe trivial, but it sends shivers down my spine.
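To illustrate the composability == Kleisli point with a small sketch of my own (not code from the apps): effectful functions of shape A => Option[B] compose sequentially precisely because Option is a monad.

```scala
// Two effectful steps: each may fail, so each returns Option.
def parse(s: String): Option[Int] =
  scala.util.Try(s.toInt).toOption

def reciprocal(i: Int): Option[Double] =
  if (i == 0) None else Some(1.0 / i)

// Kleisli composition via flatMap: the failure effect is threaded for us.
val parseThenReciprocal: String => Option[Double] =
  s => parse(s).flatMap(reciprocal)
```

Swap Option for Task, Either, or State and the same composition pattern holds; that is the "composable computations == Kleisli" observation in miniature.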



Saturday, July 29, 2017

Scala what’s wrong with you!

I want to share this as, hopefully, a somewhat unique take on Scala by a complete Scala newb. My team at work is evaluating new technology choices and I was tasked with evaluating Scala.

A little about me:  I program mostly Groovy/Grails at work and Haskell at home.  I have been coding Haskell for a bit over 5 years and Groovy/Grails for maybe a year longer.  I started as a big enthusiast of OO (a long time ago); now I enjoy different things (like the TAPL book).  I am disclosing this because words like important, practical, right, and wrong tend to change meaning depending on the background.

To make my transition to Scala easier, I decided to use Scalaz. I ended up using both Scalaz and the Scala standard library side-by-side. That led to somewhat shocking realizations:

val ints: List[Int] = List(1,2,3)
val strings: List[String] = List("hi", "there")
val test: List[Int] = ints diff strings

That will never ever subtract anything from the list, but it compiles just fine and runs without error.  Just switch to scalaz IList and nonsense like this disappears.  This is disappointing. Code like this is the reason why I want to run away from Groovy.  Even C# would not allow such code to compile (C# happens to be the other technology choice evaluated by my group).

Type inference

I get that figuring out a decent type reconstruction algorithm for a language with subtyping was a seriously impressive achievement.  But what is the practicality of it?  To make programming more error prone?  Here are some examples:

val test  = "a" :: List(1,2,3)                       
  res: List[Any] = List(a, 1, 2, 3)
val test = List(1,2,3) ++ List("a","b")              
  res:  List[Any] = List(1, 2, 3, a, b)

The pragmatic take on this is: if the inferred upper bound is Scala's Any, there is a 99% chance this is a program bug and a 1% chance that this is what the programmer intended (and I am generous here). I find this really concerning.  Refactoring will introduce bugs; think about changing a 1-1 relation to 1-many.

You can switch to using scalaz where type inference appears to be magically safe:

val test = "a" :: IList(1,2,3)
  error: type mismatch
   found   : String
   required: Int
        "a" :: IList(1,2,3)
        ^
val test  = IList(1,2,3) ++ IList("a","b")   
  error: type mismatch
        ...

Googling on the subject tells me that type safety concerns like these tend to be answered with 'just use invariant types'. Invariant types help, but:

case class TwoOfTheSame[A](x: A, y: A)
val test = TwoOfTheSame(1,"hello")  //TwoOfTheSame[Any] = TwoOfTheSame(1,hello)               

Equals

In a language where people can (and do) write code like:

list.map(_._2).filter(_._3.id == someId)  

Lack of type safety on == is just crazy.  Quoting Paul Phillips (youtube): Java has jumped off a bridge and Scala followed.  Incidentally, just use scalaz === and you will be fine.
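The idea behind scalaz's === can be sketched in a few lines (a simplified illustration of mine, not scalaz's actual implementation): equality compiles only when both sides share a type that has an Equal instance.

```scala
// Simplified Equal type class: comparing values of different types
// (e.g. 1 === "one") is a compile error, not a silent false.
trait Equal[A] { def equal(x: A, y: A): Boolean }

implicit class EqualOps[A](x: A) {
  def ===(y: A)(implicit ev: Equal[A]): Boolean = ev.equal(x, y)
}

implicit val intEqual: Equal[Int] =
  new Equal[Int] { def equal(x: Int, y: Int): Boolean = x == y }
```

With this in scope, 1 === 2 type-checks and evaluates to false, while 1 === "one" is rejected by the compiler, which is exactly what the filter examples above need.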

Type safety ... add more types   

Adding more types makes things better; at least that ought to be the conclusion from the Mars Climate Orbiter crash.  As an example, database primary keys tend to be Ints or Longs.  I could use distinct types instead!  Unfortunately, that will just exacerbate the type safety problem with equals:

case class DepartmentPkey(id: Long)
case class EmployeePkey(id: Long)
case class Employee(id: EmployeePkey, name: String, deptId: DepartmentPkey, ...)
val deptId:  DepartmentPkey = …
val employees: List[Employee] = ..


//all of these compile
employees.filter(_.id == deptId)           
employees.filter(_.id.id == deptId)       
employees.filter(_.deptId.id == deptId)
employees.contains(depId)

These kinds of issues are what makes some of us very upset (see the above Paul Phillips talk on youtube, and also this scalaz ticket).  Others do not see them as a problem.  People are just so used to achieving correctness by hours of staring at the code, debugging, and testing that most of us do not recognize these as problems.  Correctness by effort is the status quo.

So is Scalaz the only good thing about Scala?  

No, there is also the Cats library.  Just kidding!  I think Scala has a lot of good things going for it.  It is the only language option that allows for a slow migration path from OO to FP.  It has a community that mostly tries to work together across the OO-FP divide and seems to benefit from this diversity. (See this infoq video.)
It has the DOT calculus, which took 8 years of research (youtube), making Scala one of the very few languages that recognize the importance of a solid theoretical foundation.  It has the new dotty compiler project, which should result in a more maintainable, sane, and much faster compiler.  It has plans for language-supported effects!  (dotty shows Effect in the feature list with status 'considered'.)

In my opinion:  

All programming languages (possibly with the exception of lambda calculi and such) create a closed, limiting environment of thought.  Programming is thinking inside the box.  Compared to mainstream alternatives, Scala makes it a larger box.

Scala is this amazing place where OO and FP can be placed side-by-side for an eye opening comparison.  There is a need for a library that is more approachable than Scalaz or Cats to champion this. The assumption has to be made that people will not invest time to learn FP unless they see immediate benefits and the learning curve is minimal.  It is sad that the standard library lacks basic type safety to play this role.

I am very excited about the possibility of language-level support for effects.  This will introduce effects into the mainstream, and other languages will eventually follow.  Effects may finally be treated with due diligence!  I think about software engineering as a predator with logic as the prey.  It is important not to kill for the sake of killing (think mutable variables), and to kill only for food (think updating a DB record).  Programming and logic can live happily together, and if you do it right they are the same species.

Scala was designed with the OO engineer in mind.  It is a big compromise for functional programmers (see the scalaz ticket linked above).  FP-ers choose to code in Scala because it is the only (established) choice on the JVM that has the things they need (like type classes or higher kinded types).
Scala is also overly complex.  Things could be made much simpler if OO were dropped from the table.  Solutions like the Eta project could prove a better choice for bridging OO and FP on the JVM.  The cost of that bridge is in a more structured FFI, not in the complexity of the language and compiler.  Bridging OO and FP can happen using two languages, not one.  This option would make functional programmers happy.

I have done two months of Scala.  I have worked on 3 small projects using it.  One of them is on my github.  I am very impressed with the quality of the libraries I have used: http4s, the SQL library with the funny name doobie, scalaSTM (I have implemented a minimal set of scalaz bindings for it), shapeless, circe, scalacheck, Scalatags.  It was quite an inspiring experience after Groovy.  You can write really nice code in Scala.

Thank you for reading.

Friday, April 13, 2012

Imperative curlies 5: good conditionals

Continuation of my previous bashing of curlies.

In the previous posts I have proposed 2 rules: eliminate the curlies for better code, and shorten the distance between curlies for better code. This post adds the following rule (3): go an extra mile in eliminating the curlies and cool stuff will happen.

If and switch statements are examples of curlies. I still have a hard time wrapping my mind around conditional statements in programming. When are they bad and when are they OK?

Side Note:  Conditional statements in the imperative and functional worlds are very different beasts.  For example, the Java if-else statement is all about conditional side-effects.  In pure functional programming, an if-else cannot have side-effects; it is really an expression returning a value. Clearly, functional ifs are more likable.

If statements have been considered villains for a very, very long time. For example, the Visitor Design Pattern has been advocated as an OO alternative to if/switch statements. The worst offenders are the pieces of code with nested if statements which split imperative logic into logical branches and smaller branches and smaller branches… Code like this is where bugs like to thrive, good test coverage is hard or impossible to achieve, and the interaction between different conditions is hard to analyze. Code like this typically implies bad design. We used to call such design procedural and lacking Object Orientation. I believe it is this type of code that gave if/switch statements a bad name. But are conditional statements evil PERIOD, or is it how we use them that makes them the bad guys?

Many conditional statements are declarative in nature. The concept of pattern matching is fundamental to functional programming. Similarly to for loops, they are sometimes simply unavoidable. A good example is specifying boundary conditions: think of differential equations in math, or a typical recursive function definition. Here is an example; the Fibonacci sequence is de facto the hello-world of recursive functions:

fib(0) = 0
fib(1) = 1
fib(n) = fib(n-1) + fib(n-2) for n=2,3,…

If you wanted to define this sequence in Java you would write something like this (Java):
int fib(int n) {
    switch(n) {
        case 0:
        case 1: return n;
        default: return fib(n-1) + fib(n-2);
    }
}
Who cares about Fibonacci numbers? Will I use them on my next programming project? I doubt it. The point, however, is that problems in real life have boundary conditions (is handling nulls a boundary condition?). Fibonacci numbers are just a good example of boundary conditions and recursion all in one.

Instead of looking at memoization or numeric optimization aspects, I will approach the fib sequence from the point of view of curlies: I will try to remove them.

So how can this be done? Is there a way to define fibs without the boundary conditions? One way is to try to replace the recursive definition with a formula (a closed form expression), if there is one. In the case of fibs there is the Binet formula and more. So if you go this route you will rub elbows with things like the golden ratio and end up studying Da Vinci's work, ancient philosophy, and flowers. All very cool stuff! Replacing the declarative definition of fib with the Binet formula (maybe not super computer friendly, but still) gets rid of the curlies and introduces some cool, albeit old, science!
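For illustration, here is a sketch of the Binet formula in SCALA (my own example; it is exact only while double precision holds, so only for small n):

```scala
// Binet's closed form: fib(n) = round(phi^n / sqrt(5)), phi = golden ratio.
// No conditionals, no curlies -- floating point limits how far it stays exact.
val phi: Double = (1 + math.sqrt(5)) / 2

def fib(n: Int): Long =
  math.round(math.pow(phi, n) / math.sqrt(5))
```

Note that both boundary conditions and recursion are gone; the price is reliance on floating point arithmetic.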

I came across another way of approaching the fib sequence (and similar recursive problems). Imagine the impossible: a computer program understanding the concept of an infinite sequence. Let's make it even better: in this utopian world you can define infinite sequences recursively! You may want to define the fib sequence by saying that it starts with 0, followed by 1, followed by the sequence obtained by adding corresponding elements of the fib sequence itself (recursively) to the fib sequence shifted left by one position. So before the impossible dream evaporates in a puff of smoke, let me write the SCALA code (example found in the Wampler/Payne Programming Scala book):
val fib: Stream[Int] =
    Stream.cons(0,
       Stream.cons(1,
          fib.zip(fib.tail).map(pair => pair._1 + pair._2)))

fib.take(7).print  //prints: 0, 1, 1, 2, 3, 5, 8, empty
Notice that the conditional statements and curlies have disappeared and we have introduced some deep programming concepts at the same time. It seems like cool stuff happens when curlies disappear!

Short explanation of what this code does: lazily evaluated Streams are infinite sequences in SCALA. The infinity does not come into play because the sequence is not evaluated until needed (the take() call). Streams are like linked lists, so they have a head and a tail. Stream.cons(a, b) returns a stream starting at a with b as its tail. Zip is the functional concept of combining 2 sequences into one sequence of pairs (the 2 first elements, the 2 second elements, etc). SCALA can be even less verbose (from SCALADoc):
val fib: Stream[Int] = 0 #:: 1 #::
       fib.zip(fib.tail).map(pair => pair._1 + pair._2)
There is also a quick ‘Am I an imperative programmer’ self-test here: 
We have defined fib(n: Int) without using ‘n’. Test question: Do you have a problem with this?

The point of all of this nonsense is that conditional statements are part of life and have to be part of my programs. This does not mean I will not try to avoid them. Only, there is no universal solution, no magic pill. Binet's formula works for Fibonacci numbers but will not help me write null-pointer-exception-free code without if statements. For this I may end up using other tricks like SCALA's Option concept, JavaScript's &&, Groovy's ?., or maybe even decide to learn what a monad is. Going the extra mile in removing the curlies will force a deeper understanding of how to write beautiful programs into me.
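As a small sketch of my own of the Option trick mentioned above: the "no value" case becomes part of the type, so it gets handled declaratively, without any if statement:

```scala
// Option instead of null checks: absence is part of the type and is
// handled with getOrElse rather than a conditional statement.
val config: Map[String, Int] = Map("port" -> 8080)

val port: Int    = config.get("port").getOrElse(80)       // present: 8080
val timeout: Int = config.get("timeout").getOrElse(30)    // absent: default 30
```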

Thursday, March 29, 2012

Imperative curlies 3: for comprehensions and powder skiing

Continuation of Previous Bashing of Curlies
Over the many years of my involvement in Java I have seen very little code reuse around loops. For loops (and other loops) in Java are yet another category of hard to test, hard to maintain code. By now we know they are no good: they are surrounded by the curlies ; ).

Functional programmers have for loops too, only they call them comprehensions! Functional programming often deals with collections of data, so loops are unavoidable. So what is the difference?

The difference is really in the attitude. It is like powder skiing. I am a developer and a ski bum. I am very much into safe (inbounds) powder skiing. Like many other skiers, I had a hard time learning how to ski powder at first. Frustrated, I decided that what I needed to do was start pretending. So I started pretending that I am really good: with posture and everything else, making sure it appeared that I knew what I was doing. (Side note: this technique is very effective in very deep powder because no one will see what I am doing anyway ; ) Obviously I sucked big time; I only appeared to be a good powder skier. (Think of this as writing a for loop which looks very pretty.) After some years of pretending I learned that my skiing consists of simple reusable elements such as tipping, retracting, pulling back my feet, etc. So for typical everyday tasks on the snow I can now stop pretending and just do these elemental tasks and ski! (Think of this as not using for loops any more: code reuse.) When I need to do something new on skis (like trying teles), I will go back to pretending (or to writing a for loop).

The first step is the acknowledgement that what my for loop is doing should have a single purpose: comprehending a collection. (I also think of this step as an admission of being guilty of using the curlies.) The second step is code reuse for the tasks we perform often. What kinds of comprehensions will we typically be doing? How about: joining, reducing, folding, mapping, finding (any), finding all, etc.
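In SCALA these reusable comprehensions are just library methods; here is a quick sketch of my own showing the ones listed above on a single list:

```scala
val xs = List(1, 2, 3, 4)

val joined    = xs.mkString(";")        // joining:  "1;2;3;4"
val doubled   = xs.map(_ * 2)           // mapping:  List(2, 4, 6, 8)
val total     = xs.foldLeft(0)(_ + _)   // folding:  10
val firstEven = xs.find(_ % 2 == 0)     // find any: Some(2)
val evens     = xs.filter(_ % 2 == 0)   // find all: List(2, 4)
```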

Assume that we need to produce a custom version of toString().  Let’s look at some old Java first:
public class PackRat {
  private List<String> stuff = new ArrayList<String>();

  public void addToStuff() { ... }

  public String toString() {
    StringBuffer res = new StringBuffer();
    res.append("PackRat: ");
    for (int i = 0; i < stuff.size(); i++) {
      res.append(stuff.get(i));
      if (i < stuff.size() - 1) {
        res.append(";");
      }
    }
    return res.toString();
  }
}
Same thing done in Groovy, which adds some reusable methods to avoid writing explicit for loops:
class PackRat {
  List stuff = []
  def addToStuff …
  String toString() {
    "PackRat: " + stuff.join(";")
  }
}
Note: Libraries like Guava or Apache Commons provide a join() method as an external utility; sadly, the Java List interface does not have a join method.

The Groovy code looks much better, but there are still 2 curlies we should be able to get rid of. Unfortunately, we are trying to override a Java method in Groovy, so we are stuck with the limitations methods have (methods are not closures). So what could we do if methods were closures? Let's look at SCALA, where functions are functions, not methods or closures. I want my code to simply state that my toString function is really the same as a prefixed stuff.join(";"). I should be able to declare it, not implement it!

So here is the more declarative and curlyless version done in SCALA:
class PackRat {
  private var stuff: List[String] = List[String]()
  def addToStuff = …
  override def toString = "PackRat: " + stuff.mkString(";")
}
(SCALA glossary: var makes stuff a mutable instance variable; the def keyword indicates a function definition. List[String] is somewhat different from Java's List; for example, it is immutable. Note that the Java/Groovy code above is not thread safe; SCALA's version does not have such a problem.)

Here is yet another proof of the curly count being a good measure of code quality. On one end of the spectrum there is the imperative code with the loop spelled out in Java (please count the curlies in that code); on the other end there is SCALA's beautiful code where a particular kind of comprehension logic is simply declared!

Many developers think of the fact that SCALA allows you to drop curly brackets from one-liner functions as just syntactic sugar, and find the syntax iffy. My view is the opposite. Curlyless functions support the declarative style of coding and allow the developer to define functions using expressions supported by the SCALA language (contrast this with implementing all the methods in Java).

Obviously, there are other kinds of reusable comprehensions. For example reducing is far more general than joining. Here is SCALAs version of the above toString function using reduce:


override def toString = "PackRat: " + stuff.reduceLeft(_ + ";" + _)

It might look like compiler sugar (and again, conceptually deep stuff, not iffy stuff). Here is a spelled out version which is still purely declarative with no curlies:

def myConcatenateStrings(s1: String, s2: String):  String = s1 + ";" + s2
override def toString = "PackRat: " + stuff.reduceLeft(myConcatenateStrings)

Groovy code can be designed with more closures and fewer methods. This could facilitate a much more declarative type of coding than methods allow. We have seen some of it in our previous post (Post on Curlies and GORM). The same declarative approach can be used to get rid of many for loops and make Groovy code closer to SCALA.

So what is the point of all of this? First, the obvious: code reuse and testability should be a good thing no matter the language, and reusable for loop logic can be used in Java too. It will look like someone is trying hard to be functional in Java (you can always say you are doing fluent coding and no one will know ;) ). Second, the idea of one-liners being closer to declarative programming warrants more thought and is more than a coding style, no matter what your language is.

I hope to write more about it in the future.

Side Note: To all Java programmers (that would include me) I want to point out the obvious big difference between imperative for loops and functional comprehensions: an imperative for loop is really a sequence of side-effects executed in order; pure functional programming cannot have side-effects, so comprehensions always return a new collection.

Next Bashing of Curlies.