Opinions and Programming: 2012

Wednesday, December 12, 2012

Polymorphism on Steroids - Eat My Dust Java

I remember an interview question from one of my employers. It was about the 3 things which define Object Oriented Programming. The expected answer was: Encapsulation, Inheritance, and Polymorphism... Well, Haskell is NOT an OO language. It supports encapsulation, it has a sophisticated inheritance-like concept called type classes, and it supports polymorphism in ways an OO language could not even dream about. I guess programming is changing, and the questions and answers need to change too.

While learning functional programming, and Haskell in particular, my most surprising and unexpected eureka moments have all been related to polymorphism.
The functional world offers polymorphism and code reuse in places which are unexpected to an OO programmer (or, at least, unexpected to me). I blogged about code reuse and for-loops (for example: this curly braces post), but that was just scratching the surface.

Consider these examples of functionality you may need to code:
1) Lists viewed as non-deterministic computations: normal computing deals with one value, non-deterministic computing deals with a (discrete/finite) collection of possible values.
The result of operating on several such values is typically an even bigger collection of possibilities.
Example: if a = [1,2,3,4,5,6] represents the result of tossing one die, then
b = [2, 3, 3, 4, 4, 4, 5, 5, 5, 5, 6, 6, 6, 6, 6, 7, 7, 7, 7, 7, 7, 8, 8, 8, 8, 8, 9, 9, 9, 9, 10, 10, 10, 11, 11, 12] could represent the possible sums of tossing two dice. (You may be tempted to write something along the lines of b = a + a.) Also, if you are very ambitious you could combine equal values and attach probabilities for another good example... but I will not go there.

2) Lists viewed as 'vectors': If we want to add two lists we would add corresponding coordinates forming a vector of the same size. This is the most straightforward view of the 'list' concept.
Zip functions are based on this approach.

3) The Maybe concept (Option in Scala). This concept provides a better alternative to null values in traditional languages. You want a type which encapsulates having a value or having Nothing.

4) Improving on exceptions. If you want something better than the exception throwing supported by various OO languages, you may decide to invent something like the Either type, which can be either Right (a result) or Left (an error).

5) Function composition... if you remember enough math to know what it is ;)

The above list can be made longer. But just these 5 concepts should look like apples and oranges, and some pepperoni, a car, and the color blue. Could you ever envision an OO program designed to handle ALL of the above concepts in the same way? There are many 'huh?' moments in learning functional programming, but this is what makes it so fascinating.

Yes, the above concepts are all very much related in some very deep ways. These deep ways are:
- they all can be made into applicative functors
- they all can be made into monads

These similarities (or, to be more accurate, this similarity, because being a monad implies being an applicative functor) allow Haskell programmers to write the same code across all of these 5 concepts (and many concepts not listed here). To me this is a wow! No, it is a big WOW!
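
To make "the same code" concrete, here is a minimal sketch using only the applicative side of the story. The names addA and die are made up for this example. The one definition of addA is reused, unchanged, for all 5 concepts; the "vector" view goes through the ZipList wrapper from Control.Applicative, which in the standard library is an applicative functor only.

```haskell
import Control.Applicative (ZipList (..), liftA2)
import Data.List (sort)

-- One definition, reused unchanged for every concept in the list above.
-- (addA is a name invented for this sketch.)
addA :: (Applicative f, Num a) => f a -> f a -> f a
addA = liftA2 (+)

die :: [Int]
die = [1 .. 6]

main :: IO ()
main = do
  -- 1) lists as non-deterministic values: all 36 sums of two dice
  print (sort (addA die die))
  -- 2) lists as vectors: coordinate-wise sum through ZipList
  print (getZipList (addA (ZipList [1, 2, 3]) (ZipList [10, 20, 30 :: Int]))) -- [11,22,33]
  -- 3) Maybe: Nothing on either side gives Nothing
  print (addA (Just 1) (Just 2 :: Maybe Int))   -- Just 3
  print (addA (Just 1) (Nothing :: Maybe Int))  -- Nothing
  -- 4) Either: a Left (the error) wins over a Right (the result)
  print (addA (Right 1) (Left "boom" :: Either String Int)) -- Left "boom"
  -- 5) functions: feed the same argument to both, then add the results
  print (addA (* 2) (+ 10) (5 :: Int)) -- 25
```

The first print produces exactly the list b from the two dice example, which is as close to writing b = a + a as one could hope for.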

So I will try to write a bit more about it in the future.

Thursday, October 18, 2012

Curlies 10: recursion, why language matters

This is the 10th installment about curly braces: me arguing that curlies are the reason for all the evil in programming ;) And my team has introduced mustache to my current project :D.

I wrote about for loops and how they compare to functional alternatives before. Today's topic is a bit different but is still relevant to the imperative for loops. The point is that recursion can be a very expressive alternative to the for loop.

Let's look at some code examples. Groovy collections have a method called collect which applies a given closure to each collection element, like so:

assert [10,20,30] == [1,2,3].collect { it * 10 }

If Groovy did not have the collect method and we wanted to write our own, how would we do that?
Probably with a for loop! In fact, this is what the underlying implementation of the Groovy method looks like:

    public static Collection collect(Object self, Collection collection,
            Closure closure) {
        for (Iterator iter = InvokerHelper.asIterator(self); iter.hasNext();) {
            collection.add(closure.call(iter.next()));
        }
        return collection;
    }

Let's compare this with the functional language equivalent (let's do that in Haskell!). So if Haskell did not have a built-in map function, here is how we would write one:

    map :: (a->b) -> [a] -> [b]
    map _ [] = []
    map f (head : tail) = f head : map f tail

We are done! I do not want to say we 'implemented' or 'coded' the map function. This would imply that we told the computer what to do, and we did not! We told the computer what map is.
Haskell to English: (1st line) is the declaration. In Haskell all functions are curried, so to declare a function c = f(a, b) you write f :: a -> b -> c. So, this declaration says that map accepts a function a->b and a list [a] as parameters and returns a list [b].
(2nd line) Says that map of an empty list is an empty list no matter what the function is.
(3rd line) Says that to map a non-empty list you apply the function to the first element of that list and apply map with this function to the remaining elements of the list (recursive induction).
Side Note: some very cool pattern matching is happening here. If the list [1,2,3] is matched against x:xs, that means that x==1 and xs==[2,3]. The parentheses in (head : tail) are there only to disambiguate the order of things. In Haskell, to call function f on x you just type: f x
I find the Haskell version much, much more readable!

Next, let's look at something a bit more complex: a Haskell alternative to Groovy's collection findAll method, which works like so:

assert [3] == [1,2,3].findAll { it > 2 }

Can you imagine how it is implemented? Obviously with a for loop! I will spare you the details. Here is how you could do that in Haskell (Haskell calls the same operation filter):

    filter :: (a->Bool) -> [a] -> [a]
    filter _ [] = []
    filter p (head : tail)
        | p head = head : filter p tail
        | otherwise = filter p tail

Haskell to English: These bar ( | ) things are called guards and are somewhat reminiscent of our old switch statement.
(1st line) is the declaration.
(2nd line) filter applied to an empty list results in an empty list no matter what the predicate function is.
(3rd line and on) for a predicate function p and a list with its first element called head and the remaining elements called tail: if the predicate is satisfied for the head element then the head element is included, otherwise it is not. Either way, the rest of the result is the filtered tail.
That is all there is to it!

You can see the expressiveness of Haskell! Recursion allows us to almost translate the definition from English directly to the program! Readability!   Imperative programs tell the computer what to do, declarative programs tell the computer what things are.
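
For anyone who wants to run the two definitions, here is a small self-contained sketch. It hides the built-in map and filter so the names do not clash, and uses x and xs for the pattern variables (as in the side note above) instead of head and tail. The two print lines mirror the Groovy asserts.

```haskell
import Prelude hiding (filter, map)

map :: (a -> b) -> [a] -> [b]
map _ [] = []
map f (x : xs) = f x : map f xs

filter :: (a -> Bool) -> [a] -> [a]
filter _ [] = []
filter p (x : xs)
  | p x = x : filter p xs       -- keep x, then filter the rest
  | otherwise = filter p xs     -- drop x, then filter the rest

main :: IO ()
main = do
  print (map (* 10) [1, 2, 3 :: Int])   -- Groovy: [1,2,3].collect { it * 10 }, prints [10,20,30]
  print (filter (> 2) [1, 2, 3 :: Int]) -- Groovy: [1,2,3].findAll { it > 2 }, prints [3]
```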

So why do languages matter? Many programmers (I was one of them) tend to think that the multiplicity of new languages today comes from some form of developer vanity. Hey, why not create one more language? But why do we assume that all advancements in computer science can be expressed in, say, Java? Obviously they can't! New work in type systems, in compilers, ... the point is clear, but we still look down on all these new languages!

Recursion is another little reason why new languages are needed. You could try to use recursion in Java, but with big collections prepare yourself for stack overflow errors. You need some serious language help for that!
Now I am NOT saying that there are no stack overflow problems in Haskell or Scala. In fact, knowing how to avoid stack overflow errors is part of learning the language. However, functional language compilers are armed with a bunch of tricks such as tail-call optimization, trampolines... These tools allow the compiler to turn recursion into a loop!
Added Note: As I have learned recently Groovy is catching up and you can trampoline your closures (that is not a compiler function and requires explicit coding). Also there is some tail-call optimization support for methods (not closures) in both Groovy and Java. I may write more about it in the future.
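
What the compiler needs from the programmer for this trick can be sketched in Haskell. The names sumNaive and sumAcc are made up for this example. In the first function an addition is still pending after each recursive call returns; in the second the recursive call is the last thing that happens (a tail call) and the running total travels in an accumulator. The bang on acc is there because Haskell is lazy and would otherwise pile up unevaluated additions. This is a sketch of the idea only; how much stack each version really uses depends on the compiler and its settings.

```haskell
{-# LANGUAGE BangPatterns #-}

-- Naive recursion: the (+) runs only after the recursive call returns,
-- so every element leaves a pending addition behind.
sumNaive :: [Integer] -> Integer
sumNaive [] = 0
sumNaive (x : xs) = x + sumNaive xs

-- Tail-recursive version: the recursive call to go is the last step,
-- and the strict accumulator keeps the running total evaluated.
sumAcc :: [Integer] -> Integer
sumAcc = go 0
  where
    go !acc [] = acc
    go !acc (x : xs) = go (acc + x) xs

main :: IO ()
main = do
  print (sumNaive [1 .. 100])    -- 5050
  print (sumAcc [1 .. 1000000])  -- 500000500000
```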

Final little point (and I tried to make this point before): I hope I have shown some very expressive code examples. However, terseness and expressiveness were not the goal, rather they were a consequence. Declarative programming yields terse and expressive code, but not all terse and expressive code is declarative or functional. So what is more important: expressive or declarative, expressive or functional? But this may be a good topic for another post some day, maybe.

Wednesday, October 17, 2012

Semantic Web Presentation

This post is dedicated to all who took part in my recent Semantic Web presentation.

There was no time to cover everything I wanted. I will use this post page to touch on some things I should have included but did not.

Here is the link to my presentation:

Semantic Web Presentation on Google Docs

I have mentioned that there are no good books to read, or at least none that I could find. I ended up reading A Developer's Guide to the Semantic Web. I cannot say I recommend it, but it is better than the other one I have tried: Semantic Web Programming. I have also briefly looked at the O'Reilly book, but it seemed to provide less comprehensive coverage of the subject.

A large topic I decided I would have no time to talk about during my presentation is microformats. It appears that microformats are liked by the REST community, while the semantic/linked data community is not crazy about them. To me a microformat is 'hardcoding' semantic data, RDF is not. I do see the flexibility of RDF as a big advantage. If microformats are new to you, the idea is to simply add class attributes to your HTML and express the semantic information through class names which the community has agreed upon. The catch is that these names have to be what the community has agreed to. That agreeing takes time and often is not easy (maybe even not possible).

Quick intro to this idea can be found here: http://microformats.org/get-started

Interesting (and relevant) is the species microformat: http://microformats.org/wiki/species

It was proposed in 2006 and still has straw-man status. There are several good reasons why it is hard to agree on a microformat for species. Consider this simple question: how would you identify a species, would you use, say, the scientific name or the common name? Both are likely to change and do change... If only microformats had owl:sameAs... But if they did they would not be microformats, would they?

Now add more complex questions of how to handle several different taxonomies, etc.

The species microformat is a good example of why the RDF direction may be a better choice for creating semantic information about biology and wildlife.

During the presentation, I have decided to focus on the aspect of publishing semantic data. Another big and important topic is how to take advantage of semantic data published internally or by other entities. How do you consume semantic info? That would open a conversation about programming libraries like Jena, Pellet, Sesame..., tools like Protégé, Neon....

The classic idea in this direction would be: find all publications which talk about Bald Eagle needing Water as resource. Now the publication may be tagged under Bald Eagle and River, or Bald Eagle and Lake, or Raptors and Fish. Not your everyday DB query is it?

And finally there is the 'semantic first' and 'semantic last' discussion. It may seem like a topic for later, but is not. The point is that we would be designing software differently if part of the thought process involved semantic considerations. It is hard to design for something you do not understand.

One aspect of such consideration is: is there a relevant ontology out there (example: Dublin Core for published works). If so, should I design my software in a similar way (even if I do not intend to publish RDFs near term)? The broader question in this area is: how do we design the software so that adding semantic presence will not be hard in the future...

Lastly, lots of people think of the semantic web/linked data as one big database. Let me reiterate that: the whole web is one big database! Will this be a game changer, will we see 'Macro' computing (if that even is a valid term)? Looking at things from a very large scale point of view could be big (so what is the relationship between the number of cancer cases and geographical latitude? No problem, let me run a query...).
Such uniformity (one database!) will not be achieved with SOAP web services. How about with microformats? Except for the few markups the community agrees on, unlikely! REST is an idea which may provide the needed uniformity (but I doubt it will, see my previous posts on why). With RDF and ontologies we may get uniformity without uniformity (we do not need to agree on using the same terms as long as the relevant terms are connected). As long as owl:sameAs, rdfs:seeAlso, rdfs:isDefinedBy, foaf:isPrimaryTopicOf, etc. are used, there may still be uniformity within all the diversity.

Thursday, July 5, 2012

Imperative curlies 9: Haskell

If you read any of my posts about bashing curly braces and you worked with Haskell then I am sure you have thought: wait until someone shows Haskell to this guy. Well you have been right.

I am reading Learn You a Haskell for Great Good! by Miran Lipovača. My pride got bruised because of the book's subtitle: A Beginner's Guide. The guy showing me Haskell is a university student from Slovenia. I may have been a bit skeptical when buying a copy, but I am more than happy. If only any of the semantic web writers knew how to write as well as Miran (my semantic web reading, or struggling through it, is another story).

Haskell learning in many ways revalidates my opinions. The concept sitting behind curly braces in Java practically does not exist in Haskell.

If you see curly braces {} in Haskell you are probably reading record syntax. The underlying dislike of imperative code simply permeates the language.

I think learning Haskell is a must-do exercise for every imperative programmer like myself. It is an eye-opening experience to see a language where the if-else statement (even though unpopular in Haskell) is really a function (well, so it is in Scala, but it is more so in Haskell ;). You will start thinking of Java if statements as ugly conditional side-effects. You will think of Java for-loops as ordered collections of side-effects. You will because, well, they are.
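
A minimal sketch of what "if-else is really a function" can look like. Strictly speaking the built-in if/then/else is syntax for an expression that yields a value, but because Haskell is lazy the same thing can be written as a plain function. The names if' and describe are made up for this example.

```haskell
-- If-then-else written as an ordinary function (a hypothetical helper).
-- Laziness means only the chosen branch is ever evaluated.
if' :: Bool -> a -> a -> a
if' True t _ = t
if' False _ e = e

describe :: Int -> String
describe n = if' (n > 2) "big" "small"

main :: IO ()
main = do
  putStrLn (describe 3)                                -- big
  putStrLn (if' True "safe" (error "never evaluated")) -- safe
  -- The built-in form is an expression too, so it can sit inside another one.
  print ((1 :: Int) + (if describe 1 == "small" then 10 else 20)) -- 11
```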

So go get the book and enjoy it as much as I do.

Friday, June 15, 2012

REST 4: Web App with some REST on a side

Probably my last post of the series inspired by the RESTful Web Services book. This post is about the very few topics where I have a problem fully agreeing with the authors.

JavaScript Ajax

More and more web apps are JavaScript-heavy and end up moving data back and forth with Ajax. The book suggests this is a great opportunity for REST. I have some skepticism.

I can be persuaded that SPAs (single page applications: only one HTML page sporting an empty page body, and lots of JavaScript) are more likely to be implemented with true REST. (The authors call these types of projects Ajax projects and make the point that they are really true Web Services under the hood.) But I am still skeptical.

Web application development tends to focus on ... well, on user interface development. Unless the project has an explicit goal to create a web service interface easily accessible by the 'rest' of the world, the implementation will end up a REST-RPC hybrid at best.

I am probably repeating myself here. My point is that REST is not on the path of least resistance for Web App development. Simply put: in today's task-focused development, true REST will not make it into the story line. As long as JavaScript gets the data it needs and the server gets the data it needs from JavaScript, the development goals are satisfied. REST as part of Ajax will most likely only mean using the GET, POST, PUT, DELETE verbs. The data will likely be sent in JSON format and will contain whatever is needed by the UI, not a meaningful representation of a resource. The URLs will be constructed by JavaScript and will likely use ? and database IDs. Status codes will be 500 and 200 only (well, maybe 404 and 401, but that will be about it)...

Grails withFormat and Rails respond_to

It seems that both Grails and Rails made it not very easy for the programmer to separate controllers serving the same URL base: to have one for serving REST and a twin for serving web pages.

The seemingly easiest and most tempting alternative is to use withFormat (Grails) or respond_to (Rails) blocks of code. The book seems to approve of the Rails respond_to method (Grails is not covered).
This is tempting because you can do stuff like this:

    def list() {
        def books = Book.list()

        withFormat {
            html { [bookList:books] } 
            json { render books as JSON } 
            xml { render books as XML }
        }
    }

I do not like this approach. Here is why:

If your web application is serving HTML pages to the user, the controller is already busy doing what it is supposed to be doing in an MVC web framework: flowing the pages (we are not talking Ajax here).

Page flow and serving REST uniform interface are 2 different problems and have very little in common.

Separation of Concerns?: Ironically, my first post about REST compared it to OO.

Let me rephrase this: in an MVC web application serving web pages it is the Controller's job to decide what the user will see next. In a RESTful Web Service it is the client's responsibility to decide what to do or see next. If there is a user sitting behind a REST client (there does not have to be), the UI presentation flow is the REST client's responsibility. The difference is that of push vs pull.

Using the same controller for flowing pages and serving REST is mixing apples and oranges.

So my take on this is: if you are writing a RESTful Web Service and are using Grails or Rails, then using withFormat/respond_to is a great idea. You are serving different representations of the same thing and it is convenient to see the rendering logic for all of these representations in one place.

If you are adding REST on a side of a Web app, then keeping your REST logic together with page flow logic is a bad idea.

Concluding Remarks:

Reading the book was overall a great learning experience for me. Hopefully I have managed to stir some unREST into my coworkers and friends with these posts.

I am not a REST expert. I still have a lot to learn. I am looking forward to discussing this stuff.

My next step should be to read up on the semantic web, but this will need to wait a bit ... I got myself a Haskell book...

Thursday, June 14, 2012

REST (3): why NOT easy: the details

Continuation of the previous post motivated by the RESTful Web Services book.

In the previous post I have tried to identify the elements of REST and pointed out general problems around software development of RESTful web services. I am considering REST based on the HTTP protocol only.
REST is an important concept and can yield many benefits to applications which decide to be strictly RESTful.

Here is a more low-level, detailed list of REST things which are not very easy to do. (It is my private list of what I can screw up (or did screw up) and where I should be careful):

URLS

Constructing URLs is NOT the client's job: I believe we will see very few applications where this is met, and it is obviously hard to achieve. It requires excellent 'connectedness'. Designs where the client needs to know how to construct URLs are not RESTful.
? Restrictions: Except for a few spelled-out exceptional cases, using ? is not fashionable any more. Unfortunately ?-marks are so easy to use: just throw some parameters in and you are done, you never have to rethink your URLs, you can simply add stuff to them to piggyback new functionality. If you do that, it is not RESTful.
No Database IDs: Rails with ActiveRecord and Grails with GORM make using IDs easy. These are not very useful, especially when the service is not well connected. Imagine a fantasy Active Directory exposing a REST interface: you want to get a phone number for jsmith. Would you prefer to issue a GET to .../users/jsmith or .../users/2001234? Frankly, the second URL format is useless, yet you will find it in most Rails or Grails projects which claim RESTfulness, and in fact this is what you get if you use the Rails REST scaffold.
Lack of support for templates: (example: /users/{userId}/posts/{topic}). Few tools support templates. Grails has the UrlMappings file, but there are limitations to it as well. Rails, as I understand it, is more limited than Grails.
No Verbs please: This may require extra effort and even rethinking what your resources are. URLs represent resources, so URLs containing verbs (.../myserver/mydatabase/runDefragmenter) are not RESTful. How about POSTing to .../myserver/mydatabase/defragjobs instead? API-like thinking is very much ingrained in us. API and REST should not be used in the same sentence (unless you think of the uniform interface).
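
To show what template support amounts to, here is a minimal sketch of filling a URL template on the server side (so that finished URLs, not templates, reach the client). The function name expand and the sample values are made up for this example; it is not how Grails or Rails do it, and it skips URL-encoding entirely.

```haskell
-- Expand {name} placeholders in a URL template from a list of bindings.
-- Placeholders without a binding are left untouched so the gap stays visible.
expand :: [(String, String)] -> String -> String
expand _ [] = []
expand env ('{' : rest) =
  case break (== '}') rest of
    (name, '}' : after) ->
      maybe ('{' : name ++ "}") id (lookup name env) ++ expand env after
    _ -> '{' : expand env rest -- no closing brace: keep the text as it is
expand env (c : cs) = c : expand env cs

main :: IO ()
main = do
  let env = [("userId", "jsmith"), ("topic", "rest")]
  putStrLn (expand env "/users/{userId}/posts/{topic}") -- /users/jsmith/posts/rest
  putStrLn (expand [] "/users/{userId}")                -- /users/{userId}
```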

Uniform Interface

HTTP protocol is the base for achieving the uniformity. This goes beyond the 4 (or 6, or 8) HTTP verbs and requires understanding of the HTTP protocol itself with advanced use of headers, status codes, etc. Defining HTTP protocol and explaining how it applies to REST is not what I want to do here. I just want to point out that RESTful is not RESTful without it. Here are some examples.

Status Codes: Example 1: consider a PUT to a URL representing a resource with a unique constraint on its name. The PUT tries to rename the resource but a resource with the new name already exists. I bet many apps will simply return status 500. The correct code should be: 409 (Conflict).

Example 2: As above, consider a PUT request to rename a resource and assume that the resource name is part of the URL. I bet many web services will simply return 200 and a new representation, leaving it to the client to 'construct' the new URL. This is not optimal use of the HTTP protocol and is not RESTful.
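
The two examples can be sketched as a small decision function. Everything here (RenameOutcome, Response, respond, the sample URL) is made up for illustration. The 409 comes straight from Example 1; answering a successful rename with 301 and a Location header is an assumption of this sketch, one way to tell the client the new URL instead of making it construct one.

```haskell
-- Hypothetical outcomes of a PUT that tries to rename a resource.
data RenameOutcome
  = Renamed String   -- success; carries the resource's new URL
  | NameTaken        -- another resource already uses the requested name
  | NoSuchResource   -- nothing lives at the URL that was PUT to
  deriving (Show, Eq)

-- A response reduced to the two parts discussed above.
data Response = Response
  { status  :: Int
  , headers :: [(String, String)]
  } deriving (Show, Eq)

respond :: RenameOutcome -> Response
respond (Renamed newUrl) = Response 301 [("Location", newUrl)] -- assumed choice of status
respond NameTaken        = Response 409 []                     -- Conflict, not 500
respond NoSuchResource   = Response 404 []

main :: IO ()
main = do
  print (respond (Renamed "/users/jsmith2"))
  print (respond NameTaken)
```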

Other Header Stuff: There is a bit to learn there and this will probably be overlooked by many developers writing RESTful apps. This could create lack of consistency.

Uniform Interface vs Serving Web Pages to the users: Typical 'MVC' web frameworks have you focus on the flow of web pages. Programming the REST Uniform Interface is different. Making both pieces of logic coexist is harder than it looks. I will explain more in the next post.

Formatting resource representations

Standard Interface helps but its benefits are less inspiring if REST client coded by someone else has no way of understanding the content.

Coming up with a standard vocabulary for your problem domain may be hard, one may simply not exist. Adopting something like Atom Feeds or XHTML may seem artificial or awkward (yet often are the best choices you have).

Grails or Rails make it easy to spill out XML or JSON, but this will not be any standard vocabulary and your resources will not be connected.

Even the much simpler problem of assuring that error messages are served in a format expected by the clients is often surprisingly easy to overlook and may not be so easy to implement.

Statelessness.

REST has no place for cookies, session objects, etc. Simply put: don't claim your app is RESTful if you store server state between HTTP requests. Hypermedia is the engine of application state in REST.

I may be opinionated on this, but to me it is not just about REST: if you are programming a web app and use a session object, you should rethink your design. But then, since people use server-side state in web apps a lot, they must be finding this technique convenient and the benefits of being stateless not worth the extra effort to them.

My personal opinion is that a possible exception to this rule is the use of some standard security infrastructure (like Spring Security) which may have a stateful implementation that you have no control over. This would still violate REST, but you may not have much choice.

POST is the worst of the 6 verbs.

If you can avoid using it you should. But that is obviously not easy.

I am not elaborating on this one ... go and read the book ;)

Connectedness.

If the client has to construct URLs then the server is not RESTful. Ability to discover URLs is what differentiates REST from API.

I wrote a bit about it in my previous post. This is the hardest thing to do right and will impact your decisions on representation formats. This topic is out of scope for this post.

My next post will focus on the coexistence of Web App and REST.

Sunday, June 10, 2012

REST (2): why NOT easy

Continuation of the previous post motivated by my reading of the RESTful Web Services book.

This post is a high-level overview of the topic. In my next posts I will provide more concrete examples of why REST is not so easy. I am talking about REST in the context of the HTTP protocol only. I assume the reader has basic familiarity with the 6 HTTP verbs (TRACE and CONNECT do not count ;) and with what REST is, or at least what it is typically painted to be.

When you develop RESTful web services you will be dealing with

Resources (think: flight reservation as a resource example)
URLs (which are the resource addresses, think: http://.../{userName}/travel/reservations/flights/{tripName})
Resource Representations (for example some XML vocabulary, think: XHTML document describing flight reservation, seat assignments, etc)

Thus, designing your web service application you will need to think about:

How to define URLs (some ways are more RESTful than others)
What standards to use to define resource representations (some standards are more RESTful than others)

In addition, the book identifies the following properties which make things RESTful (or, as the book calls them, ROA: Resource Oriented):

Addressability (measure of how well you can access your resources with URLs, ideally each resource should have a URL)
Uniform Interface (requires that the web service exposes subset of the 6 HTTP verbs and uses HTTP protocol itself to manage resources, think: GET gets current flight reservation, PUT makes changes to it, DELETE deletes it, think of using HTTP header and status codes)
Statelessness (no state on the server, state is in hypermedia)
Connectedness (measure of how well your resource representations are linked, think: link or a tags linking to the user account with a list of other reservations this user has made, a link to the airline flight itself, a link or form to a ticket resource (to buy the ticket for this reservation), maybe a form tag defining what can be changed using PUT, etc)

The first expense of adding REST to your project will be some design time: (1) split your application into many resources, (2) create URLs for all the resources, (3) map your problem domain to the Uniform Interface (that includes using standard status codes and use of more advanced HTTP header stuff), (4) design your representations (make formatting decisions), (5) connect the resources with some type of hypermedia.

There are 4 big problem categories which make it hard:

PROBLEM 1: Tool Support:

Most tools, including Grails and Rails, are designed to do something else with ease (to quickly create web apps), and the simplifying decisions which made them successful also make doing other things (like creating a RESTful web service) hard. The book laments this about Rails (Grails is not covered).

If you are using tools designed for REST (like restlet) this point does not apply.

PROBLEM 2: Finding Standard Vocabulary for Resource Representation:
Some form of XML seems to be the best bet, but a vocabulary designed for your problem domain simply may not exist. Using XHTML may be a good option. There may be some relevant microformats for your problem domain.

PROBLEM 3: How to implement Connectedness:

There are very few tools or even standards which could help you to connect your resources. Connectedness may be the most important part of being RESTful and is the hardest to implement. The point is that without it REST is just another API, only with fewer verbs and many, many nouns. Sure, there is the benefit of not needing to know which verb to use (you have to when using APIs), but there is the tradeoff of having to know many nouns (URLs). Connectedness implies that a REST client can discover the URLs: it does not need to know them and should not be constructing them!

Here is where semantic web becomes important. Things like microformats, RDF, etc.
XHTML has link and a tags and forms (HTML 5 forms get better). Atom feeds/APP, etc also define cross linking between resources. Microformats play a big role in qualifying the 'connections' between resources.

Future RESTful clients will be able to semantically know the links.

The HTTP protocol itself can help with connectedness a bit through its Location header and various status codes: 201 (Created), and many in the 3xx (Redirection) range. But mostly, connecting resources will be the job of your resource representations. No free lunch here. To connect your resources some more serious work needs to be done, and you will be looking at exposing your data through things like Atom feeds.
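
A minimal sketch of what "discovering" a URL means on the client side. The types, the follow function, and all the sample URLs are made up for this example; a real representation would be XHTML or Atom rather than a Haskell record. The point is only that the client asks for a link by its meaning and never assembles the URL itself.

```haskell
-- A link as it might appear in a representation: what it means and where it goes.
data Link = Link
  { rel  :: String  -- meaning of the link, e.g. "ticket"
  , href :: String  -- URL chosen by the server
  } deriving (Show, Eq)

-- A hypothetical flight reservation representation: some data plus links.
data Reservation = Reservation
  { seat  :: String
  , links :: [Link]
  } deriving (Show, Eq)

-- The client looks a URL up by meaning instead of constructing it.
follow :: String -> Reservation -> Maybe String
follow wanted r = lookup wanted [(rel l, href l) | l <- links r]

sample :: Reservation
sample =
  Reservation
    { seat = "12A"
    , links =
        [ Link "flight" "/flights/example-101"
        , Link "ticket" "/tickets/example-42"
        ]
    }

main :: IO ()
main = do
  print (follow "ticket" sample) -- Just "/tickets/example-42"
  print (follow "hotel" sample)  -- Nothing
```

If the server later moves tickets somewhere else, only the href changes and a client written this way keeps working.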

PROBLEM 4: Confusion:

Remember the big picture from my last post?: APIs are all different, REST is trying to be the same. The uniformity will not happen if everyone is confused about what REST is, yet claims to support it.

REST is often understood as simply some level of support for the 4 verbs in HTTP (GET, POST, PUT and DELETE). I may be repeating myself here, but this trivialized view is demonstrated by most frameworks (including Ext JS, Rails, Grails). Adopting this simplified viewpoint creates confusion and probably impedes support for more meaningful REST functionality these frameworks could provide.

Example: RESTful JSON. JSON is not connectable (at least not in any standard way), so how can a web service be truly RESTful if it is serving JSON only?

This was my high level overview. Next post will deal with more concrete examples of why it is hard to REST (and maybe easier to stay BUSY ;)

Saturday, June 9, 2012

REST (1): misunderstood and important, how and why

I want to take a break from declarative/curly posts and write a bit about REST. I hope this will help spawn some discussion among developers I work with and know.
I am on the last pages of the (rightfully acclaimed) book RESTful Web Services by Richardson and Ruby. The reading felt a bit repetitive, it felt like ideas being hammered into my head, but they probably needed hammering. Overall a great book, one that opened my mind to what REST is about (or should be about).

Somewhere halfway through the book things started to click. (I am trying to share some of these clicks in this and in the next few posts.) I had read quite a bit about REST before reaching for the book. The first question is: why did it take so long to click?

I came out of this reading with a much better understanding of the HTTP protocol itself, so this must be a part of the learning curve. But, I also blame my slowness on two things: first reason is the amount of confusion surrounding REST, second reason is harder to explain: REST is somewhat complex if you look at it as a set of implementation guidelines (the how), it is also quite straightforward if you look at it as what it is for (the why). It is hard to understand the how until you get the why.

Developers like to focus on the mechanics of things (the how). Developers also have deadlines and work in a task-oriented environment where there may be many whys, but the goals of Uniform Interface, ease of future integration and benefits of semantic linking are not likely to be among the whys which pay their paychecks. REST has become such a buzzword that it ends up being used even if the why is not on the radar. REST gets half-implemented but holds on to its full name. This fuels confusion.

REST Confused:

The term REST is used a lot these days, I hear it at work, I see a lot about it on the web. Many developers are very interested in it and know a lot about it (way more than I do). Last year's Uberconf REST offerings were very well attended. At the same time, there is quite a bit of misunderstanding and confusion about what REST is. (I was among the confused before reading this book.) And I think that the industry-wide confusion on this topic will be winning.
The book authors decided to stay away from the REST term most of the time and have used the term ROA (Resource-Oriented Architecture) to avoid the confusion. (I stay with the terms REST and RESTful.)
The book also introduced the term REST-RPC Hybrid to describe many existing web services and to differentiate them from true REST. There is quite a bit of wit in this self-contradicting term.

It is just like Object Oriented
A somewhat forgotten old paper by Tim Rentsch (back in 1982) included a prophecy about what OO will become: "Every manufacturer will promote his products as supporting it. Every manager will pay lip service to it. Every programmer will practice it (differently). And no one will know just what it is."

One more old OO related quote stays in my brain and does not want to leave: I was at some sort of a conference and talking to vendors in the booth area. One of them (he knew he was talking to a developer) was doing his spiel and said this: '... and we will be object oriented by the end of the next quarter'. I am not trying to contribute ideas for the next Dilbert episode. The scary thing is: that does not sound like something the marketing group would come up with on their own. I fear it came from the developers.

I think REST became like OO from these last 2 quotes. Part of the problem is that, like OO, everyone wants to claim it, yet it is not so easy to implement (my next posts will talk more about why it is hard), but the lip service is easy, so the term is used a lot.

Here is a quote from the above book: "Both REST and web services have become buzzwords. They are chic and fashionable. These terms are artfully woven into PowerPoint presentations by people who have no real understanding of the subject".

Side NOTE: a similarity between REST and OO: both are often used as check boxes and both are really progress bars.

THE WHY:
I believe the fact that the term REST is misused and misunderstood causes lack of its true adoption. We are missing out on some amazing opportunities. REST could be a big step towards fulfilling a fantasy in which computer programs can surf the web the way humans do today.

Why is REST such a good idea? REST is the opposite of API. APIs are all different, REST services are trying to be all the same.

Let me quote the book:

"One alternative to explaining everything is to make your service look like other services. If all services exposed the same representation formats, and mapped URIs to resources in the same way... well, we can't get rid of client programming altogether, but clients could work at higher level than HTTP."

I came up with this imprecise, sort-of definition of REST focusing on the why: You wrote a very RESTful app if you have exposed your app functionality to the programmable web without API specification and without ambiguity.

I think the problem is that people think that REST is all about using GET, PUT, POST, DELETE (maybe OPTIONS, HEAD), some will know that POST as it stands today is better not used, but that is where it often ends. We think of the mechanics and lose the big picture. The net result is that we use the term REST a lot, but what we develop typically has nothing to do with being easily accessible by the 'rest' of the programmable web.

Example:

Say, you are developing a web application serving web pages and need to expose Ajax data from server to your Java Script library in a few places (just a GET request for simplicity).

JSON is not very RESTful (point one: developers should be aware of the fact) but, if you decide on using JSON, still ask yourself these questions:

How cryptic is the URL, does it look anything like a location of a resource? Any database IDs in it? (Would you expect external REST client to know your database IDs?)
Can someone not very familiar with this project get any information from your JSON?
How much is this JSON specific to the current version of the UI:

Is it just the fields that you need or a more meaningful representation?
Are you pre-formatting the data for your UI on the server?

How are your error handling and status codes? (Say you are listing employees in a department and someone issues a URL GET with an invalid department: will your service just return 500 with the error message in plain text? Hey, you know your JavaScript would never do that, so why care about correct status codes or a consistent content type?...)
And finally how can another developer discover this URL? Do you expect them to simply construct the URLs?

Lots of REST is simply common sense if you understand the why.
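To make the error-handling question a bit more concrete, here is a minimal JavaScript sketch. Everything in it is an assumption made for illustration: the department data, the URL layout mentioned in the comment, and the function name are hypothetical and not taken from any real service. It only shows the idea of answering an unknown department with a 404 and the same content type as a successful response, instead of a plain-text 500.

```javascript
// Hypothetical in-memory data; department names and employees are made up.
const departments = {
  sales: ["Ann Lee", "Bob Ray"],
  support: ["Cy Poe"]
};

// Describes the response for GET /departments/{name}/employees
// (hypothetical URL layout). No real HTTP server is involved here.
function employeesResponse(departmentName) {
  const known = Object.prototype.hasOwnProperty.call(departments, departmentName);
  if (!known) {
    // The client asked for something that does not exist: 404, still JSON.
    return {
      status: 404,
      contentType: "application/json",
      body: JSON.stringify({ error: "Unknown department: " + departmentName })
    };
  }
  return {
    status: 200,
    contentType: "application/json",
    body: JSON.stringify({
      department: departmentName,
      employees: departments[departmentName]
    })
  };
}

console.log(employeesResponse("sales").status);   // 200
console.log(employeesResponse("no-such").status); // 404
```

A client written by someone else can then branch on the status code alone, without parsing an error page.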

We are missing on big opportunities:

The first reason is so other developers can access/integrate easily with your application.

The second reason is so that the future semantic web will find the data exposed by your app.

Such goal should be very important for businesses which want their products to be 'known' or educational institutions, organizations defining standards, many government institutions which want their data to be transparent and available to programs at large who may want to use it, study it, analyze it.

These 2 reasons should create a big push for writing RESTful apps, but the reality is the exact opposite. Today businesses run hundreds of applications, and these applications do not know how to talk to each other. Any integration is complex and expensive. New applications are being developed (I participated in developing some of them) and REST is the last thing on anyone's mind.

The targets of the semantic web today are not government, science, or even manufactured goods (although Best Buy made recent news to the contrary), but Twitter, Facebook, Blogger, etc. Who likes whom on Facebook will benefit from all of this, but is there any semantic web development for, say, endangered species? This is not to criticize the social web, kudos to them for innovating, this is to criticize everyone else who should be part of this and decides to stay behind.

Next post will have more mechanics:

I think a big part of the problem is that good REST is not so easy to implement. If it is hard and it is a buzzword, people are bound to overuse it. Buzzwords lose their meaning very fast, just like the OO term did.

My next post will be about what is so hard about REST.

Monday, May 28, 2012

Imperative curlies 8: Why Am I Writing This Stuff

A bit over one year ago I started working on my first Groovy/GRAILS/lots of JavaScript project. Before, I worked with Java, C#, C++ and I was a very, very imperative programmer. I was so imperative that I did not even know what imperative programming means (I did not know there can be something else out there).

I think I am not the only one who is going through this change. I am writing these posts to help myself by clarifying my thoughts on declarative program design and maybe to help someone make the transition I am trying to make.

What did my first Groovy or JavaScript code look like? Obviously I jumped to using closures ASAP, but using closures for the sake of using closures does not create beautiful code. I read Groovy in Action by König et al. The book sure showed a lot of cool stuff you can do with Groovy, but I was still thinking in Java and just applying more Groovy syntax when coding. In parallel to the Groovy book, I read JavaScript: The Good Parts by Crockford. Sometime during reading these 2 books, I realized that to be good in these languages, I need to change the way I think. I also sensed that today's programming coolness maxims (dynamic typing, fluent programming, terseness) are not really it.

Reading Programming Scala by Wampler and Payne was a great learning and eye-opening experience for me. I think I learned more about Java, JS and Groovy reading this book than I learned reading anything else, and this book is not even about Java, JS or Groovy. I think that learning Java made me a better programmer in whatever language I was using in 1997-98. Learning SCALA makes me a better programmer in any of the languages I am using today (that includes Java, JavaScript and Groovy). To get better in SCALA I will need to broaden my functional horizons and probably will need to learn Haskell. So I have a full queue of reading waiting for me. That will definitely include reading on Haskell and reading more on SCALA.

So why am I writing these posts? I got fascinated by the benefits of declarative programming. Yet declarative programming is not something the community pays much attention to.

If you ask a developer in the next office about what functional programming is about, you may hear a lot about immutable state (great) but declarative aspect is probably not going to make it into the conversation.

There are many reasons why you may not want SCALA on your next project, but what I hear a lot is that developers do not like SCALA syntax: they miss having curly brackets around function implementations. This is yet another proof that the software community has conditioned itself into imperative thinking.

Most developers want to program in Ruby or Groovy, few are left who still prefer Java. The most quoted reasons are the new programming coolness maxims (dynamic typing, fluent programming, and terseness).

One by one, if you drill into these maxims of today, they are all questionable.

Dynamic Typing: You need a dynamic language to have the cool features of Ruby and Groovy: you find a lot of opinions like this. In particular, my current reading (RESTful Web Services by Richardson and Ruby, an otherwise excellent book) keeps claiming that the Ruby goodness stems from its dynamic nature. The same stuff can be done equally well or better in SCALA which is, of course, statically typed.

Fluent programming: One often acknowledged problem with this type of approach is that it is easy to use but HARD to implement. Somewhat overlooked fact is that immutable collections in a functional language are ready for fluent programming. Functional programming implies fluent! Think functional!

Terseness: Java is bad because it is verbose, Groovy is good because it is terse. This over-simplifies the issue. Ideally language works towards better code. The fact that code is short does not mean it is good. I like when language supports terseness as a reward for doing something right. SCALA allows you to define short one-liner functions without curlies. This allows for declarative definitions of functions similar to how you would declare a function in math (good). SCALA or Groovy support of pattern matching is another example of good terse syntax complementing good programming style. But is the # or ## method name in SCALA such a great thing?

How about the Groovy Elvis operator? It simplifies the code around handling of nulls. But then, is null such a great language concept? It has been called a billion dollar mistake by the guy who invented it... Elvis is better than nothing, but an even better solution would be to get rid of the concept of null.

As I try to become a better programmer, I would love to have a rule of thumb on what makes my programs good and what does not. Declarative programming became such a guide for me. At least for now.

To make this guide even more explicit I came up with a quantitative measure: count the number of curly braces in your code, the fewer the better.

To me it is not about dynamic typing, fluent programming or terseness. It is about how declarative my code is.

Sunday, May 27, 2012

Imperative curlies 7: switch statements

My programming is changing. One of the visible aspects of this change is fewer if statements and more switch statements. During work hours I program in Groovy and GRAILS, but I think the changes in my coding style are more motivated by reading about functional/declarative programming and SCALA than any reading on Groovy. Still, I am happy with Groovy, the language fits my coding evolution very well.
(Side Note: if Groovy was more functional I might have still liked if statements, ifs are a different beast in functional languages.)

The functional programming concept of pattern matching is something I have learned from SCALA. Groovy has a great support for switch statements so pattern matching can be implemented in Groovy quite well.

Here is how I might have coded a year ago (Test is a class representing a test or an exam, it knows a test score):

float score = test.getScore(); // test is some object representing an exam or a test with a score
String grade = null;

if (score >= 90) {
  grade = "A";
} else if (score >= 80) {
  grade = "B";
} else if (score >= 70) {
  grade = "C";
} else if (score >= 60) {
  grade = "D";
} else {
  grade = "F";
}

This might have been my Java code, C# code, or my first Groovy code. Today I would prefer this code: (The rest of the post is in Groovy.)


Closure scoreForA = { Test t ->
  t.score >= 90
}
Closure scoreForB = { Test t ->
  t.score >= 80 && t.score < 90
}
Closure scoreForC = { Test t ->
  t.score >= 70 && t.score < 80
}
Closure scoreForD = { Test t ->
  t.score >= 60 && t.score < 70
}
Closure scoreForF = { Test t ->
  t.score < 60
}

Test test = . . .
String grade

switch(test) {
  case scoreForA: grade = 'A'; break
  case scoreForB: grade = 'B'; break
  case scoreForC: grade = 'C'; break
  case scoreForD: grade = 'D'; break
  case scoreForF: grade = 'F'; break
}

As per my previous posts, one liner functions do not contribute to the curly count.

So yes, I have a significant reduction in the number of curly braces. So what are the benefits?

One obvious difference is that I have separated the declaration of test score conditions for different grades from the conditional logic which calculates the grade. This allows me to manage these independently. Think of scoreForX closures as instance fields on my grade assignment class, think of them as something that can be set/dependency injected/configured without any changes to the test grade assignment logic itself.

So, to have some fun with this, let's define the following:

// class defining ranges
class Curve {
  ObjectRange rangeForA
  ObjectRange rangeForB
  ObjectRange rangeForC
  ObjectRange rangeForD
  ObjectRange rangeForF
}

and

// generic function defining score to grade conversion
Closure gradeAssignment = {Curve curve, String testgrade, Test t ->
  curve."rangeFor$testgrade".containsWithinBounds(t.score)
}

This code should illustrate the power of the declarative programming (note my conditional grade calculation logic has not changed):

def curve = new Curve(rangeForA: (89.0..100.0),
                      rangeForB: (79.0..<89.0),
                      rangeForC: (70.0..<79.0),
                      rangeForD: (60.0..<70.0),
                      rangeForF: (0.0..<60.0))

Closure scoreForA = gradeAssignment.curry(curve, "A")
Closure scoreForB = gradeAssignment.curry(curve, "B")
Closure scoreForC = gradeAssignment.curry(curve, "C")
Closure scoreForD = gradeAssignment.curry(curve, "D")
Closure scoreForF = gradeAssignment.curry(curve, "F")

Test test = . . .
String grade

switch(test) {
  case scoreForA: grade = 'A'; break
  case scoreForB: grade = 'B'; break
  case scoreForC: grade = 'C'; break
  case scoreForD: grade = 'D'; break
  case scoreForF: grade = 'F'; break
}

I probably should have added a default: handler in my switch statement, but I omitted it for simplicity.
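The same separation (conditions declared in one place, grade selection in another) is not specific to Groovy. Here is a rough JavaScript sketch of it. The names between, gradeRules and gradeFor are my own illustrative choices, and the thresholds simply repeat the uncurved ones from the first example; treat it as one possible shape, not a recommended library.

```javascript
// Builds a predicate matching tests with a score in [low, high).
function between(low, high) {
  return function (test) {
    return test.score >= low && test.score < high;
  };
}

// The conditions for each grade, declared separately from the selection logic.
// A different curve could be configured or injected here.
const gradeRules = [
  { grade: "A", matches: between(90, Infinity) },
  { grade: "B", matches: between(80, 90) },
  { grade: "C", matches: between(70, 80) },
  { grade: "D", matches: between(60, 70) },
  { grade: "F", matches: between(-Infinity, 60) }
];

// Returns the grade of the first matching rule, or undefined if none matches.
function gradeFor(test, rules) {
  const rule = rules.find(function (candidate) {
    return candidate.matches(test);
  });
  return rule ? rule.grade : undefined;
}

console.log(gradeFor({ score: 85 }, gradeRules)); // B
```

Changing the curve means replacing gradeRules; gradeFor stays untouched.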

Except for incorrect use of functional terminology (curry) Groovy has done OK in this example.
Groovy gets lots of credit for being less verbose than Java. I look at this differently: I like when the language rewards me for doing the right thing. Groovy has done just that with its cool switch statement. It rewarded me with more compact and more readable code, it rewarded me for thinking in more functional pattern matching terms.

Monday, April 23, 2012

Imperative curlies 6: removing ifs

In the last few posts I tried to argue that there is little or no code reuse around for loops. There is one notable exception, however: code reuse achieved by adding lots of additional curlies called if statements. This post contains my thoughts on how to change imperative ifs used in this way.

Often the developer is faced with two obvious options, repeat very similar (but not identical) imperative logic in several places or code the logic in one place and include lots of if statements within this ‘generalized’ logic to handle the differences. The generalized logic is often a private implementation method with several Boolean parameters. It is invoked from various public methods which will set the booleans to trigger the needed side-effects within the private method. Obviously many programming techniques have been developed to avoid such code (Template Method Design Pattern, or even the concept of polymorphism itself), but still the if statements are often easier to use.

Here is a piece of code taken directly from the open source Ext JS 4.0.7 (fragment of Ext.form.Basic):

getValues: function(asString, dirtyOnly, includeEmptyText, useDataValues) {
    var values = {};

    this.getFields().each(function(field) {
        if (!dirtyOnly || field.isDirty()) {
            var data = field[useDataValues ? 'getModelData' : 'getSubmitData'](includeEmptyText);
            if (Ext.isObject(data)) {
                Ext.iterate(data, function(name, val) {
                    if (includeEmptyText && val === '') {
                        val = field.emptyText || '';
                    }
                    if (name in values) {
                        var bucket = values[name],
                            isArray = Ext.isArray;
                        if (!isArray(bucket)) {
                            bucket = values[name] = [bucket];
                        }
                        if (isArray(val)) {
                            values[name] = bucket.concat(val);
                        } else {
                            bucket.push(val);
                        }
                    } else {
                        values[name] = val;
                    }
                });
            }
        }
    });

    if (asString) {
        values = Ext.Object.toQueryString(values);
    }
    return values;
}

Ext.form.Basic is an Ext class defining base/reusable behavior of Ext Forms. Ext JS uses the above method internally:

getFieldValues: function(dirtyOnly) {
    return this.getValues(false, dirtyOnly, false, true);
},

Also, the last argument (useDataValues) is not documented, making getValues behave as an old style form submit value retrieval. Instead of repeating the same logic and providing separate implementations to retrieve only dirty data from the form vs. all the data, retrieve data using new model semantics vs. using old form submit, etc, etc, the Ext library codes the logic once and provides if conditionals within the logic to handle different cases.

Looking at the above snapshot, can you easily see what the code is doing?

The idea of re-implementing the code by removing the curlies/shortening the distance between curlies is somewhat unfair to the reimplementer. After all, the method signature has all these Booleans (or JavaScript truthies), and I think the API designers had an imperative implementation in mind when defining the signature of this method. Declarative/functional thinking vs imperative thinking will impact the APIs.

Let's do it anyway! I am dropping the asString parameter and I will code it as a separate function. I personally dislike a function's result type changing based on the parameters. Standard function composition can be defined this way:

f: X -> Y
g: Y -> Z
g Compose f: X -> Z

(g Compose f)(x) = g(f(x))

The more I code with JavaScript the more I find that it is more convenient to do this type of composition (compose with function which is aware of the original argument set):

f: X -> Y
g: Y, X ->Z
g ComposePlus f: X -> Z

(g ComposePlus f) (x) = g(f(x), x)
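Both definitions fit in a few lines of JavaScript. This is only a sketch of what a FunctionUtil helper could look like for single-argument functions; the helper object and the small example functions (twice, increment, describe) are hypothetical.

```javascript
// Hypothetical helper object; sketched for single-argument functions only.
const FunctionUtil = {
  // compose(g, f)(x) = g(f(x))
  compose: function (g, f) {
    return function (x) {
      return g(f(x));
    };
  },
  // composePlus(g, f)(x) = g(f(x), x): g also sees the original argument
  composePlus: function (g, f) {
    return function (x) {
      return g(f(x), x);
    };
  }
};

const twice = function (x) { return 2 * x; };
const increment = function (x) { return x + 1; };
const describe = function (result, original) { return original + " -> " + result; };

const incrementThenTwice = FunctionUtil.compose(twice, increment);
const twiceAndDescribe = FunctionUtil.composePlus(describe, twice);

console.log(incrementThenTwice(3)); // 8
console.log(twiceAndDescribe(5));   // 5 -> 10
```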

Assume that we have bunch of utilities coded for things like flexible array concatenation, and that the ‘getFields()’ method returns a rich JavaScript collection object loaded with nice methods like map, fold, filter. My imaginary filterIf(condition, fn) method will return original collection if condition is not met and filter otherwise. So here is the new version of the code:

getValues: function(dirtyOnly, includeEmptyText, useDataValues) {
  // predicate used to keep only the dirty fields
  var dirtyOnlyAdjustF = function(field) { return field.isDirty(); };

  // choose how the data is read from a field
  var fieldRetrievalF = useDataValues ?
                          function(field) { return field.getModelData(includeEmptyText); } :
                          function(field) { return field.getSubmitData(includeEmptyText); };

  // replace empty values with the field's emptyText, leave other values alone
  var emptyFieldAdjustF = function(data, field) {
     Ext.iterate(data, function(name, val) {
        data[name] = (val === '') ? (field.emptyText || '') : val;
     });
     return data;
  };

  if (includeEmptyText) {
     fieldRetrievalF = FunctionUtil.composePlus(emptyFieldAdjustF, fieldRetrievalF);
  }

  // merge the data of one field into the aggregated result
  var foldingF = function(data, aggregator) {
     Ext.iterate(data, function(name, val) {
        var bucket = aggregator[name];
        aggregator[name] = (name in aggregator) ? ArrayUtil.concatenate(bucket, val) : val;
     });
     return aggregator;
  };

  return this.getFields()
             .filterIf(dirtyOnly, dirtyOnlyAdjustF)
             .map(fieldRetrievalF)
             .fold({}, foldingF);
},

// sketch only: this.getValues stands for the getValues method defined above
getValuesAsString: FunctionUtil.compose(Ext.Object.toQueryString, this.getValues)
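For readers wondering what the imaginary collection could look like, here is a small stand-alone JavaScript sketch of filterIf, map and fold. RichCollection and the sample field objects are hypothetical and have nothing to do with the real Ext JS collections. The fold callback takes (element, accumulator) in that order and is assumed to return the accumulator.

```javascript
// Hypothetical collection wrapper; none of these names come from Ext JS.
function RichCollection(items) {
  this.items = items;
}

// Filters only when the condition holds; otherwise returns the collection as is.
RichCollection.prototype.filterIf = function (condition, predicate) {
  return condition ? new RichCollection(this.items.filter(predicate)) : this;
};

RichCollection.prototype.map = function (fn) {
  return new RichCollection(this.items.map(fn));
};

// Left fold; fn receives (element, accumulator) and returns the new accumulator.
RichCollection.prototype.fold = function (initial, fn) {
  return this.items.reduce(function (acc, item) {
    return fn(item, acc);
  }, initial);
};

const fields = new RichCollection([
  { name: "first", value: "Ann", dirty: true },
  { name: "last", value: "Lee", dirty: false }
]);

const isDirty = function (field) { return field.dirty; };
const toPair = function (field) { return [field.name, field.value]; };
const collect = function (pair, acc) {
  acc[pair[0]] = pair[1];
  return acc;
};

const dirtyValues = fields.filterIf(true, isDirty).map(toPair).fold({}, collect);
const allValues = fields.filterIf(false, isDirty).map(toPair).fold({}, collect);

console.log(dirtyValues); // { first: 'Ann' }
console.log(allValues);   // { first: 'Ann', last: 'Lee' }
```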

Lots of developers will say that the second code is better because it is fluent. I think the fluency is more of a side-effect resulting from changing from imperative to declarative programming style.

Looking at the above code, how easy is it to figure out what the code is doing? Actually the code says what it does, if statements never do that.

Many developers do not like the ternary operator. I have used it here to emphasize the declarative aspect. If the distance between the curlies is short, then we are close to being declarative. I like when the language lets me drop the curlies to emphasize that the code has been sufficiently refactored from the imperative. In this example JavaScript ternary operator does just that!

I hope the code speaks for itself so I will not say anything more about it.

Friday, April 13, 2012

Imperative curlies 5: good conditionals

Continuation of my previous bashing of curlies.

In the previous posts I have proposed 2 rules: eliminate the curlies for better code and shorten the distance between curlies for better code. This post adds the following rule (3): go an extra mile in eliminating the curlies and cool stuff will happen.

If and switch statements are examples of curlies. I still have a hard time wrapping my mind around conditional statements in programming. When are they bad and when are they OK?

Side Note: Conditional statements in imperative and functional worlds are very different beasts. For example, Java if-else statement is all about conditional side-effects. In pure functional programming, if-else statement cannot have side-effects and it is really a function returning values. Clearly, functional ifs are more likable.

If statements have been considered villains for a very, very long time. For example, the Visitor Design Pattern has been advocated as an OO alternative to if/switch statements. The worst offenders are the pieces of code with nested if statements which split imperative logic into logical branches and smaller branches and smaller branches… Code like this is where bugs like to thrive, good test coverage is hard or impossible to achieve, the interaction between different conditions is hard to analyze. Code like this typically implies bad design. We used to call such design procedural and lacking Object Orientation. I believe it is this type of code that gave if/switch statements a bad name. But are conditional statements evil PERIOD, or is it how we use them that makes them the bad guys?

Many conditional statements are declarative in nature. The concept of pattern matching is fundamental to functional programming. Similarly to for loops they are sometimes simply unavoidable. A good example is specifying boundary conditions: think of differential equations in math, or a typical recursive function definition. Here is an example, Fibonacci sequence is de-facto the hello-world of recursive functions:

fib(0) = 0
fib(1) = 1
fib(n) = fib(n-1) + fib(n-2) for n=2,3,…

If you wanted to define this sequence in Java you would write something like this (Java):

int fib(int n) {
    switch(n) {
        case 0:
        case 1: return n;
        default: return fib(n-1) + fib(n-2);
    }
}

Who cares about Fibonacci numbers? Will I use it on my next programming project? I doubt that I will. The point is, however, that problems in real life have boundary conditions, is handling nulls a boundary condition? Fibonacci numbers are just a good example of boundary condition and recursion all in one.

Instead of looking at memoization or numeric optimization aspects, I will approach the fib sequence from the point of view of curlies: I will try to remove them.

So how can this be done? Is there a way to define fibs without the boundary conditions? One way is to try to replace the recursive definition with a formula (a closed-form expression), if there is one. In the case of fibs there is the Binet formula and more. If you go this route you will rub elbows with things like the golden ratio and end up studying Da Vinci’s work, ancient philosophy, and looking at flowers. All very cool stuff! Replacing the declarative definition of fib with the Binet formula (maybe not super computer friendly, but still) gets rid of the curlies and introduces some cool albeit old science!
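For the curious, Binet's formula says fib(n) = (phi^n - psi^n) / sqrt(5), where phi = (1 + sqrt(5)) / 2 is the golden ratio and psi = (1 - sqrt(5)) / 2. A minimal JavaScript sketch follows; the name fibBinet is mine, and because it relies on floating-point arithmetic it should only be trusted for small n.

```javascript
const SQRT5 = Math.sqrt(5);
const PHI = (1 + SQRT5) / 2; // golden ratio
const PSI = (1 - SQRT5) / 2;

// Binet's formula; rounding absorbs the floating-point error for small n.
const fibBinet = function (n) {
  return Math.round((Math.pow(PHI, n) - Math.pow(PSI, n)) / SQRT5);
};

const firstTen = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9].map(fibBinet);
console.log(firstTen.join(", ")); // 0, 1, 1, 2, 3, 5, 8, 13, 21, 34
```

No boundary conditions and no recursion are left, at the price of leaving exact integer arithmetic.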

I came across another way of approaching the fib sequence (or similar recursive problems). Imagine the impossible: a computer program understanding the concept of an infinite sequence. Let’s make it even better: in this utopian world you can define infinite sequences recursively! You may want to define the fib sequence by saying that it starts with 0, is followed by 1, and is then followed by the sequence obtained by adding corresponding elements of the fib sequence itself (recursion) and of the fib sequence shifted to the left by one position (recursion again). So before the impossible dream evaporates in a puff of smoke let me write the SCALA code (example found in the Wampler/Payne Programming Scala book):

val fib: Stream[Int] =
    Stream.cons(0,
       Stream.cons(1,
          fib.zip(fib.tail).map(pair => pair._1 + pair._2)))

fib.take(7).print  //prints: 0, 1, 1, 2, 3, 5, 8, empty

Notice that the conditional statements and curlies have disappeared and we have introduced some deep programming concepts at the same time. It seems like cool stuff happens when curlies disappear!

Short explanation of what this code does: lazy evaluated Streams are infinite sequences in SCALA. The infinity does not come into play because the sequence is not evaluated until needed (take() call). Sequences are like linked lists so they have head and tail. Stream.cons(a, b) returns a stream starting at a and b being a tail. Zip is a functional concept of combining 2 sequences into one sequence of pairs (2 first elements, 2 second elements, etc). SCALA can be even less verbose (from SCALADoc):

val fib: Stream[Int] = 0 #:: 1 #::
       fib.zip(fib.tail).map(pair => pair._1 + pair._2)
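JavaScript has no built-in lazy Stream, but the idea can be imitated with a tiny hand-rolled one. The sketch below is only an illustration: cons, zipAdd and take are hypothetical helpers written for this example, with the tail of each cell stored as a function that is called once, on first access.

```javascript
// A lazy cons cell: the tail is computed (once) only when it is first read.
function cons(head, tailThunk) {
  let cachedTail;
  return {
    head: head,
    get tail() {
      if (cachedTail === undefined) {
        cachedTail = tailThunk();
      }
      return cachedTail;
    }
  };
}

// Lazily adds corresponding elements of two infinite streams.
function zipAdd(s, t) {
  return cons(s.head + t.head, () => zipAdd(s.tail, t.tail));
}

// fib refers to itself, but only inside thunks that run after fib is defined.
const fib = cons(0, () => cons(1, () => zipAdd(fib, fib.tail)));

// Collects the first n elements of a stream into an array.
function take(n, stream) {
  const result = [];
  let current = stream;
  for (let i = 0; i < n; i++) {
    result.push(current.head);
    current = current.tail;
  }
  return result;
}

console.log(take(7, fib).join(", ")); // 0, 1, 1, 2, 3, 5, 8
```

The curlies are back in the helpers, but the definition of fib itself is again a single declarative line with no conditionals.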

There is also a quick ‘Am I an imperative programmer’ self-test here:
We have defined fib(n: Int) without using ‘n’. Test question: Do you have a problem with this?

The point of all of this nonsense is that conditional statements are part of life and have to be a part of my programs. This does not mean I will not try to avoid them. Only, there is no universal solution, no magic pill. Binet’s formula works for Fibonacci numbers but will not help me in writing null pointer exception free code without if statements. For this I may end up using other tricks like SCALA's Option concept, JavaScript && or Groovy ?. or maybe even decide to learn what a monad is. Going the extra mile in removing the curlies will force into me a deeper understanding of how to write beautiful programs.

Friday, March 30, 2012

Imperative curlies 4: shorten the distance.

Continuation of my previous bashing of curlies.
My previous posts can be summarized by stating the following rule: Look at curly brackets surrounding imperative code and redesign your code so they disappear. There are obviously various tricks for disappearing the curlies and I will write more as I keep discovering them.

Here is a somewhat relaxed version of this rule: If you cannot get rid of curlies, redesign your code so the distance between curlies is as short as possible. (Redesign your code so the imperative part between curly brackets does as little as possible.) Let me clarify my position: It is not about squeezing as much as possible into one line, the goal is simply to isolate reusable code, create reusable utilities, and keep the client imperative code to a minimum. If you cannot be declarative, at least make sure that the imperative code is reduced to a minimum. The distance between curlies is simply a guide to measure the progress.

We have seen in the previous post that SCALA lets you treat one-liner functions in declarative fashion by removing curly brackets. This allows you to define functions in a math formula-like style. Examples without type inference and without much syntactic sugar of using ‘_’:

def doubleIt(d: Double): Double = 2 * d
def halfIt(d: Double): Double = 0.5 * d
val sortOfIdentity: Double => Double = halfIt _ andThen doubleIt
// Note: the current SCALA compiler needs the ugly _ with andThen

Language like Groovy will not let you be as elegant. Here are the Groovy equivalents (also without using much syntax sugar):

def composition = {f, g, x -> return f(g(x))} // composition helper
Closure doubleIt = {Double d -> 2 * d}
Closure halfIt = {Double d -> 0.5 * d}
Closure sortOfIdentity = composition.curry(halfIt, doubleIt)

(I am repeating myself here, but note that the term curry is not used correctly in Groovy.)
The point is that the above code snapshots are equivalent. What is important is that the benefits for testability and maintainability are the same.

Example of Java code from an open source ArrayUtil found here: http://www.java2s.com/Code/Java/Collections-Data-Structure/Sumallelementsinthearray.htm

public class ArrayUtils {
...
 public static long sum(
   int[] source
  )
  {
   int iReturn = 0;

   if ((source != null) && (source.length > 0))
   {
     int iIndex;

     for (iIndex = 0; iIndex < source.length; iIndex++)
     {
      iReturn += source[iIndex];
     }
   }
   return iReturn;
  }
}

Big distance between curlies in the implementation of the sum method. How reusable is this method code? What if we wanted to create ArrayUtil.multiply(int[] source) or ArrayUtil.max(int[] source), or ArrayUtil.min(…)?

So how can I shorten the distance between the curlies? Let us move to Groovy to get some ideas. The fact that we have a for loop suggests that we can try to use one of the reusable pieces of logic available to us as a replacement for explicit for loops (see my previous post).

(Note: Functional programming uses the term folding; for some reason Groovy calls the equivalent method inject.) Compare the reusability of the above Java code with the following code in Groovy:

def myInts = [1, 2, 3, 4, 5]

def multipliedInts = myInts.inject(1) {acc, val -> acc * val}
def addedInts = myInts.inject(0) {acc, val -> acc + val}
def minFromInts = myInts.inject(Integer.MAX_VALUE) {acc, val ->
                                   Math.min(acc, val) }
def maxFromInts = myInts.inject(Integer.MIN_VALUE) {acc, val ->
                                   Math.max(acc, val) }

(Side Note: There is a problem here: what is the value of minFromInts if myInts were an empty list? Functional programming introduces the concept of monads, and SCALA has the Option type with Some and None, but these are not in scope for this post.)
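Just to make the empty-list problem concrete, here is a small Java sketch. It assumes Java 8 or later (newer than this post), where a seedless reduce returns an Optional; the class and method names are invented for the example.

```java
import java.util.List;
import java.util.Optional;

final class EmptyFoldSketch {
    // Without a seed, reduce yields Optional.empty() for an empty list
    // instead of leaking a sentinel such as Integer.MAX_VALUE.
    static Optional<Integer> min(List<Integer> ints) {
        return ints.stream().reduce(Integer::min);
    }
}
```

The caller then has to decide what an empty list means, which is roughly the job Some and None do in SCALA.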

Note that the distance between curlies in the above Groovy example is as close as one can get to defining functions using plain math-like formulas. However, personally I tend to prefer the more declarative style shown below:

class MathFormulas {
  // static, so the closures can be referenced as MathFormulas.add etc.
  static def add = {acc, val -> acc + val}
  static def multiply = {acc, val -> acc * val}
}

def multipliedInts = myInts.inject(1, MathFormulas.multiply)
def addedInts = myInts.inject(0, MathFormulas.add)
def minFromInts = myInts.inject(Integer.MAX_VALUE, Math.&min)
def maxFromInts = myInts.inject(Integer.MIN_VALUE, Math.&max)

You can find specialized methods in Groovy for all of these tasks; however, the point here is the code reuse aspect. Groovy's inject method is clearly reusable, while the for loop in Java is clearly not. How would you change the Java code above following these Groovy examples?

Is this Groovy magic? Not really. Sure, having closures helps, but in Java we can work with reusable interfaces (similar to some that can be found in Functional Java). We can write a simple collection utility with a fold method (it will be more verbose):

//direct from Functional Java project ...
public interface F2<A, B, R> {
  R f(A a, B b);
}

//our own utility to see what needs to be done ...
public class MyCollectionUtils {
  static <T, L> T fold(Iterable<L> list,
                       T iniValue,
                       F2<? super L, ? super T, ? extends T> fun){
    T res = iniValue;
    for(L l: list){
      res = fun.f(l, res);
    }
    return res;
  }
}

F2<Integer, Long, Long> multiply = new F2<Integer, Long, Long>() {
  public Long f(Integer a, Long b) {
    return b * a;
  }
};

List<Integer> list = ...
long multipliedInts = MyCollectionUtils.
               <Long, Integer>fold(list, 1L, multiply);

The difference is largely in how programmers think. New book title idea: Unlearn Java in 24 days?

Let me sum up what happened to the Java code: we went from no code reuse to high code reuse. The measure of distance between curlies is about 10 lines for the original code and 1 line for the function interface F2 we would need to implement in the final code:
return a + b;
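To see how far that one loop stretches, here is a hedged sketch of sum, product, and max all sharing a single fold. It assumes Java 8 or later (which postdates this post) and swaps the F2 interface for the standard java.util.function.BiFunction; the class and method names are hypothetical.

```java
import java.util.List;
import java.util.function.BiFunction;

final class FoldSketch {
    // The only loop in the class. Note the accumulator comes first here
    // (as in Groovy's inject), an arbitrary choice for this sketch.
    static <T, L> T fold(Iterable<L> items, T initial,
                         BiFunction<? super T, ? super L, ? extends T> step) {
        T acc = initial;
        for (L item : items) {
            acc = step.apply(acc, item);
        }
        return acc;
    }

    // Each aggregate is a one-liner that only supplies a seed and a formula.
    static long sum(List<Integer> ints) {
        return fold(ints, 0L, (acc, v) -> acc + v);
    }

    static long product(List<Integer> ints) {
        return fold(ints, 1L, (acc, v) -> acc * v);
    }

    static int max(List<Integer> ints) {
        return fold(ints, Integer.MIN_VALUE, (acc, v) -> Math.max(acc, v));
    }
}
```

For the list 1, 2, 3, 4, 5 these return 15, 120, and 5 respectively. As with the Groovy version, max on an empty list quietly returns the seed.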

I hope to continue the curlies bashing soon.

Thursday, March 29, 2012

Imperative curlies 3: for comprehensions and powder skiing

Continuation of Previous Bashing of Curlies
Over the many years of my involvement in Java I have seen very little code reuse around loops. For loops (and other loops) in Java are yet another category of hard-to-test, hard-to-maintain code. By now we know they are no good: they are surrounded by the curlies ; ).

Functional programmers have for loops too, only they call them comprehensions! Functional programming often deals with collections of data so loops are unavoidable. So what is the difference?

The difference is really in the attitude. It is like powder skiing. I am a developer and a ski bum. I am very much into safe (inbounds) powder skiing. Like many other skiers, I had a hard time learning how to ski powder at first. Frustrated, I decided that what I needed to do was to start pretending. So I started pretending that I was really good, with posture and everything else making it look like I knew what I was doing. (Side note: this technique is very effective in very deep powder because no one will see what I am doing anyway ; ) Obviously I sucked big time; I only appeared to be a good powder skier. (Think of this as writing a for loop which looks very pretty.) After some years of pretending I learned that my skiing consists of simple reusable elements such as tipping, retracting, pulling back my feet, etc. So for typical everyday tasks on the snow I can now stop pretending and just do these elemental tasks and ski! (Think of this as not using for loops any more: code reuse.) When I need to do something new on skis (like trying teles), I will go back to pretending (or to writing a for loop).

The first step is the acknowledgement that what my for loop is doing should have a single purpose: comprehending a collection. (I also think of this step as an admission of being guilty of using the curlies.) The second step is code reuse for the tasks we perform often. What kind of comprehensions will we typically be doing? How about: joining, reducing, folding, mapping, finding (any), finding all, etc.
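Here is a minimal sketch of that vocabulary in Java, one method per kind of comprehension. It assumes Java 8 or later streams (not available when this was written), and every name in it is made up for the example.

```java
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

final class ComprehensionSketch {
    // joining: glue the elements together with a separator
    static String join(List<String> items) {
        return String.join(";", items);
    }

    // mapping: transform each element, producing a new list
    static List<Integer> lengths(List<String> items) {
        return items.stream().map(String::length).collect(Collectors.toList());
    }

    // finding (any): the first match, if there is one
    static Optional<String> findStartingWith(List<String> items, String prefix) {
        return items.stream().filter(s -> s.startsWith(prefix)).findFirst();
    }

    // finding all: every match
    static List<String> findAllLongerThan(List<String> items, int n) {
        return items.stream().filter(s -> s.length() > n).collect(Collectors.toList());
    }

    // folding: collapse the list into a single value, starting from a seed
    static int totalLength(List<String> items) {
        return items.stream().reduce(0, (acc, s) -> acc + s.length(), Integer::sum);
    }
}
```

None of these methods spells out a loop; each one only names which comprehension it wants.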

Assume that we need to produce a custom version of toString(). Let’s look at some old Java first:

public class PackRat {
  private List<String> stuff = new ArrayList<String>();

  public void addToStuff() { ... }

  public String toString() {
    StringBuffer res = new StringBuffer();
    res.append("PackRat: ");
    for(int i=0; i<stuff.size(); i++) {
      res.append(stuff.get(i));
      if(i<stuff.size() -1) {
        res.append(";");
      }
    }
    return res.toString();
  }
}

Same thing done in Groovy, which adds some reusable methods to avoid writing explicit for loops:

class PackRat {
  List stuff = []
  def addToStuff …
  String toString() {
    "PackRat: " + stuff.join(";")
  }
}

Note: Libraries like Guava or Apache Commons provide a join() method as an external utility; sadly, the Java List interface does not have a join method.
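For readers on newer Java: String.join was added to the standard library in Java 8, after this post was written, so the loop can go away without an external library. A minimal sketch, where the class name and the addToStuff signature are assumptions of mine:

```java
import java.util.ArrayList;
import java.util.List;

final class PackRatSketch {
    private final List<String> stuff = new ArrayList<>();

    // Hypothetical signature; the original post elides this method.
    void addToStuff(String item) {
        stuff.add(item);
    }

    @Override
    public String toString() {
        // The joining logic is reused, not re-implemented.
        return "PackRat: " + String.join(";", stuff);
    }
}
```

The two curlies around toString are still there, but the distance between them is down to one line.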

The Groovy code looks much better. But there are still 2 curlies we should be able to get rid of. Unfortunately, we are trying to override a Java method in Groovy, so we are stuck with the limitations that methods have (methods are not closures). So what could we do if methods were closures? Let’s look at SCALA, where functions are functions, not methods or closures: I want my code to simply state that my toString function is really the same as a prefixed stuff.join(";"). I should be able to declare it, not implement it!

So here is the more declarative and curlyless version done in SCALA:

class PackRat {
  private var stuff: List[String] = List[String]()
  def addToStuff = …
  override def toString = "PackRat: " + stuff.mkString(";")
}

(SCALA glossary: var makes stuff a mutable instance variable; the def keyword indicates a function definition. List[String] is somewhat different from Java's List; for example, it is immutable. Note that the Java/Groovy code above is not thread safe: the list can change while toString iterates over it. SCALA's immutable list does not have that problem.)

Here is yet another proof of the curly count being a good measure of code quality. On one end of the spectrum there is the imperative code with the loop spelled out in Java (please count the curlies in that code); on the other end there is SCALA's beautiful code where a particular kind of comprehension logic is simply declared!

Many developers think of the fact that SCALA allows you to drop curly brackets from one-liner functions as just syntactic sugar and find the syntax iffy. My view is the opposite. Curlyless functions support the declarative style of coding and allow the developer to define functions using expressions supported by the SCALA language (contrast this with implementing all the methods in Java).

Obviously, there are other kinds of reusable comprehensions. For example, reducing is far more general than joining. Here is SCALA's version of the above toString function using reduce:

override def toString = "PackRat: " + stuff.reduceLeft(_ + ";" + _)

This might look like compiler sugar (and again, conceptually it is deep stuff, not iffy stuff). Here is a spelled-out version which is still purely declarative, with no curlies:

def myConcatenateStrings(s1: String, s2: String):  String = s1 + ";" + s2
override def toString = "PackRat: " + stuff.reduceLeft(myConcatenateStrings)
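A rough Java counterpart of the reduce version, assuming Java 8 or later streams and method references (both newer than this post). The class and method names are invented; the orElse("") fallback for an empty list is my own choice, made because a seedless reduce in Java returns an Optional.

```java
import java.util.List;

final class ReduceSketch {
    // The named combining function, mirroring myConcatenateStrings.
    static String concatenate(String s1, String s2) {
        return s1 + ";" + s2;
    }

    static String describe(List<String> stuff) {
        // The seedless reduce returns an Optional, so the empty case
        // has to be handled explicitly.
        return "PackRat: " + stuff.stream().reduce(ReduceSketch::concatenate).orElse("");
    }
}
```

The combining function is passed by name, much as in the SCALA code above.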

Groovy code can be designed with more closures and fewer methods. This could facilitate much more declarative type of coding than methods allow. We have seen some of it in our previous post (Post on Curlies and GORM). The same declarative approach can be used to get rid of many for loops and make Groovy code closer to SCALA.

So what is the point of all of this? First, the obvious: code reuse and testability should be a good thing no matter what the language is; reusable for-loop logic can be used in Java too. It will look like someone is trying hard to be functional in Java (you can always say you are doing fluent coding and no one will know ;) Second, the idea of one-liners being closer to declarative programming warrants more thought and is more than a coding style, no matter what your language is.

I hope to write more about it in the future.

Side Note: To all Java programmers (that would include me) I want to point out the obvious big difference between imperative for loops and functional comprehensions: an imperative for loop is really a sequence of side effects executed in order, whereas pure functional programming cannot have side effects, so comprehensions always return a new value, such as a new collection.
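A small Java sketch of that difference, assuming Java 8 or later streams; the class and method names are hypothetical.

```java
import java.util.List;
import java.util.stream.Collectors;

final class SideEffectSketch {
    // Imperative: the loop exists only for its side effect on the argument.
    static void doubleInPlace(List<Integer> ints) {
        for (int i = 0; i < ints.size(); i++) {
            ints.set(i, ints.get(i) * 2);
        }
    }

    // Functional: the input is left untouched and a new list is returned.
    static List<Integer> doubled(List<Integer> ints) {
        return ints.stream().map(i -> i * 2).collect(Collectors.toList());
    }
}
```

After doubled(list) the original list still holds its old values; after doubleInPlace(list) it does not.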

Next Bashing of Curlies.

Wednesday, March 28, 2012

Imperative curlies 2: GRAILS/GORM

Continuation of Previous Curlies bashing
GRAILS Domain Classes and GRAILS plugins can provide phenomenal examples of declarative programming where a simple declaration adds lots of functionality without any coding. Still the programmer is faced with choices, for example, custom domain class validation can be either done by hand (with curlies) or in a nice reusable way using more functional and declarative programming.

Here is a simple domain class to start:

class Meeting {
  Date start
  Date end
  String title
}

It looks like little ventured, little gained, but these looks are very deceiving. The above domain class is a feature-rich Hibernate DAO. You can do things with it like:

Meeting.findAllByStartBetween(new Date() -7, new Date())

Meeting.findByTitleLikeAndStartGreaterThan(…)

On a side note: this is true Groovy magic. These methods do not really exist; in Groovy a class responds to a method, it does not necessarily have a method.
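Java has nothing quite like this, but a dynamic proxy gives a rough feel for "responding to" a method nobody wrote. The sketch below is only an analogy and not how GORM works: the MeetingFinder interface, the map-based rows, and the name parsing are all assumptions made for illustration.

```java
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class DynamicFinderSketch {
    // Hypothetical finder interface; nobody implements these methods by hand.
    interface MeetingFinder {
        List<Map<String, Object>> findAllByTitle(Object title);
        List<Map<String, Object>> findAllByRoom(Object room);
    }

    static MeetingFinder finderOver(List<Map<String, Object>> rows) {
        return (MeetingFinder) Proxy.newProxyInstance(
            MeetingFinder.class.getClassLoader(),
            new Class<?>[] { MeetingFinder.class },
            (proxy, method, args) -> {
                String name = method.getName();
                if (!name.startsWith("findAllBy")) {
                    throw new UnsupportedOperationException(name);
                }
                // "findAllByTitle" -> "title": the field comes from the method name.
                String suffix = name.substring("findAllBy".length());
                String field = Character.toLowerCase(suffix.charAt(0)) + suffix.substring(1);
                List<Map<String, Object>> hits = new ArrayList<>();
                for (Map<String, Object> row : rows) {
                    if (args[0].equals(row.get(field))) {
                        hits.add(row);
                    }
                }
                return hits;
            });
    }
}
```

One handler answers every findAllBy... call. Unlike in Groovy, the method names still have to be declared up front in an interface.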

Add the following plugins to your grails project: audit-trail, spring security core, and searchable. Add simple declarative changes to the Meeting class:

@gorm.AuditStamp
class Meeting {
  Date start
  Date end
  String title

  static searchable = true
}

New magic has happened: the Meeting class has new fields (the names are configurable) whoCreated, createdDate, whoUpdated, and updatedDate, and obviously finding all Meetings that I have created in the last 7 days is as simple as calling

Meeting.findAllByWhoCreatedAndStartGreaterThan(...).

The searchable plugin allows you to do things like Meeting.search(...) or easily search across different domain classes with a similar declarative configuration. GRAILS/GORM provides you with a phenomenal declarative power!

The list of plugins that can be added goes on and on and the functionality you can add to your domains in this declarative fashion is boundless.

GOING BACK TO EARTH: By default no field is nullable, and GRAILS has no way of knowing that some data does not make sense; for example, the start Date should always be before the end Date! So let's make the corrections:

class Meeting {
  Date start
  Date end
  String title

  static constraints = {
    end nullable: false, validator: {value, record ->
      if (value && record.start && value < record.start) {
        'endDateNotAfterStart'
      }
    }
  }
}

The new version provides custom validation for the end date. The logic simply returns a message string (to be translated by the GRAILS i18n infrastructure) if the end date was entered, the start date was entered, and the end date is before the start date.
(Side note on Groovy: notice that not all paths return a value in the validating closure; this seems to be Groovy’s take on partial functions. As I understand it, Groovy allows this type of coding to support less verbose code, and the concept of partial functions is not fully supported as such.)

It is imperative to have curlies. The alarm bell starts ringing: I have imperative logic which now is a part of my domain class, yuck! Can I make code improvements and get rid of these curlies?
Note that each time I have a domain class with start and end timestamps I will probably need to write a similar closure on that domain class, write a separate unit test for it, and maintain the code in many places. So yes, if I could 'declare' endAfterStart validation on my domain class the code would benefit:

class Meeting {
  Date start
  Date end
  String title

  static constraints = {
    end nullable: false, validator: ValidationUtil.endAfterStart.curry('start')
  }
}

class ValidationUtil {
  // Returns a message code when the end date is before the start date
  static def endAfterStart = {String startDateFieldName, Date endDate, record ->
    if (endDate && record."$startDateFieldName" &&
               endDate < record."$startDateFieldName") {
      'endDateNotAfterStart'
    }
  }
}

Note that Groovy is clunky with functional programming terms; ‘curry’ should really be called ‘partial’. But the logic is clear: I am declaring my validation by using a function (Groovy closure) declared in a reusable (hopefully unit tested) utility class. The expected signature of a validating closure is:

 {value, record -> . . .}

The declared reusable validation needs additional information, so its signature is:

{startDateFieldName, value, record -> . . .}

So I need to convert one signature to the other; in functional terms this is called partial application. Groovy calls it (incorrectly) curry.
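The same signature conversion can be sketched in Java. This assumes Java 8 or later lambdas and stands in a plain Map for the domain record; the F3 interface, the partial helper, and all other names are hypothetical, not part of GRAILS or any library.

```java
import java.util.Date;
import java.util.Map;
import java.util.function.BiFunction;

final class PartialApplicationSketch {
    // Hypothetical three-argument shape: {startDateFieldName, value, record -> ...}
    interface F3<A, B, C, R> {
        R apply(A a, B b, C c);
    }

    // Partial application: fix the first argument, keep the other two open.
    static <A, B, C, R> BiFunction<B, C, R> partial(F3<A, B, C, R> f, A a) {
        return (b, c) -> f.apply(a, b, c);
    }

    // Reusable validation: returns a message code, or null when there is nothing to report.
    static final F3<String, Date, Map<String, Date>, String> END_AFTER_START =
        (startField, end, record) -> {
            Date start = record.get(startField);
            boolean invalid = end != null && start != null && end.before(start);
            return invalid ? "endDateNotAfterStart" : null;
        };

    // The two-argument validator shape {value, record -> ...}, declared rather than coded.
    static final BiFunction<Date, Map<String, Date>, String> MEETING_END_VALIDATOR =
        partial(END_AFTER_START, "start");
}
```

MEETING_END_VALIDATOR plays the role of ValidationUtil.endAfterStart.curry('start'): the field name is baked in, and what remains takes only the value and the record.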

Now my domain class is purely declarative. Is it better for it? I believe so!
I think you can see the benefits of this declarative improvement! I hope to write more about the evil of curlies in future posts.

Next bashing of curlies
