Software has two ingredients: opinions and logic (=programming). The second ingredient is rare and is typically replaced by the first.
I blog about code correctness, maintainability, testability, and functional programming.
This blog does not represent views or opinions of my employer.

Tuesday, February 27, 2018

Why read CTFP? Why study category theory?

Why read CTFP? Why study category theory?

Why read CTFP? Why study category theory?

This is a pandoc copy of introduction markdown from my notes-milewski-ctfp-hs project. This project contains literate Haskell notes from my reading of Bartosz Milewski’s excellent book Category Theory for Programmers.

I finished my second (I am sure not the last) reading of Milewski’s book. This note is about what motivated me to read CTFP and why I think we need this book.
At work, my group is in the midst of a very serious refactoring and a technology shift. That change follows months of technical discussions. My experiences at work have motivated this note as well. I cannot help but ask myself: how different would all of this have been if my coworkers have read CTFP? And, since we use Java at work, I will use the Java ecosystem as a source of examples.

We live in the google and stackoverflow times. Only very few read programming books. Deadlines are deadlines, there is just no time for reading and linear learning. Yet, there are limits to google search learning. Books are needed!

Why this particular book?
There are other books about category theory written for programmers. In my opinion, CTFP is unique in 2 ways:

  • it covers lots of ground, more than any of the other books
  • it manages to stay very intuitive and relevant to programming

Personally, I really appreciate the second bullet.

Why learn category theory?
Category theory is not intuitive (not to me). It is different from ground-up. It redefines everything starting with basic concepts like sum, product, exponent. Being different is good. Think about a toolbox with only one type of screwdriver in it, I want more tools in my toolbox.

Category theory uncovers common structures in computations. These structures allows us to understand computations in ways not possible otherwise. Category theory allows to identify similarities between seemingly disparate concepts. It tells us that the code has properties (obeys laws) and what these properties (laws) are. But it gets even more fundamental than this.

Trinity of Computer Science, Practical Trinitarianism

The term was coined by Robert Harper (post) and is relatively new (2011) but the concept is old (Curry-Howard correspondence between programming and logic can be placed between 1934-1969, Curry-Howard-Lambek correspondence that includes category theory is early 1970s). At some point in the future (when we discover and understand more) trinitarianism will become quadrinitarianism but that is another story.

To me, trinitarianism means that I have 3 tools when designing and developing programs:

  • Proof Theory
  • Type Theory
  • Category Theory

They may all be equivalent, but as toolsets each offers a unique set of benefits. They are the three manifestations of the notion of computation.

Philip Wadler distinguishes languages that were created (like Java) from languages that were discovered (like various Lambda Calculi). Other than the scope, this is equivalent to Harper’s concept of the divine.
Software engineers love arguing which code is better (can you get better than a divine?). Correspondence to logic is really the best comparator is in such arguments.

We can question many things, say, the use of Hibernate (a library in the Java ecosystem) or even something like the Java language design itself. Similar question do not even make sense with respect to the Lambek correspondence. These 3 manifestations of code simply are. Questioning that correspondence is like questioning Pythagorean theorem. What we can and often do, however, is to just ignore it!

Software engineering is founded on a belief that logical problems can be solved without any training in logic. Despite being self-contradictory, this approach works surprisingly well for many software products. The limitation of this approach is that it does not scale well with logical complexity. Majority of software engineers do not acknowledge this limitation. We all have a tendency to de-emphasize importance of things we do not understand. Learning the trinity (including category theory) is admitting to the existence of that limitation and admitting to the importance of logic.

Divine computation sounds like something very academic and out of reach of a mortal programmer. It is not! Remember, we live in the stackoverflow times and we copy-paste programs. Copy-paste of a divine is still divine. The only problem is how to recognize that divine aspect?
Hmm, only if there were books about it, or if there was some catalog of divine computations (like something called a category theory)…

Beginner Trinitarianist

First steps are as simple as a change in the attitude. When designing my app I look for soundness and strong theoretical properties. I want to use building blocks that are logically sound. I want to write code that is type-safe.

Let me start by discussing what the trinity is not.
Take type safety as an example and think about Java Object class with its 11 methods. These methods can be invoked on any object, "boo".equals(5) compiles just fine, comparison between a String and an Integer always returns false, and there is almost 100% chance that comparing a String to an Integer is an escaped bug!

Take strong theoretical properties and soundness as the next example. Category theory offers a set of programming tools such as monoids, monads, natural transformations. Any application has them, you really have no choice about it, the only choice that you have is whether to ignore these computational structures or not. All of these come with strong theoretical properties or laws.
Laws are what is important. Consider this code that uses Java standard library (I am not indicating the Java version, I am quite sure this will never get fixed)

groovy>  //using Groovy console as a replacement for REPL to see what Java will do
groovy>  import java.sql.Timestamp
groovy>  import java.util.Date
groovy>   
groovy>  Date d = new Date()
groovy>  Timestamp t = new Timestamp(d.getTime())
groovy>  [t.equals(d), d.equals(t)]

Result: [false, true]

nothing good can possibly come out of equals not being symmetric!
Exercise for Java coders: play with Java comparators using Dates and Timestamps, is < antisymmetric for objects in Java standard library?
Better Exercise: try to fix it (fix Java source for Date and Timestamp so equals is symmetric). It is not that easy!

Here is a fun example violating requirements defined by java.util.Set.equals, using Groovy (tested with Groovy v.2.4.10):

groovy>  def x= [] as Set
groovy>  def y= "".split(",") as Set
groovy>  
groovy>  [x == y, x, y]

Result:  [false, [], []]  

Bugs like this impact thousands (Groovy) to millions (Java) of software engineers, the economic cost must be significant.
Side-note: in the last example x and y are two different implementations of Java Set. I am exploiting a common pattern, binary relations and object inheritance do not mix well.

There is more than one point here:

  • Laws are important and so is proving/certifying them
  • Wadler was right and one way to spot a trinitarianist is by his/her selection of the computing language environment.

Trust. I had tried to convey this idea with a limited success. As an engineer I do not need to understand monoids, monads, functors, etc, to use them! I only need to trust that they are important and learn their plumbing. I still will rip the benefits of code that has better correspondence to logic, it will be less buggy and easier to maintain. This speaks to the very idea of good engineering. Engineering is about applying science and mathematics to solve practical problems. Great engineers do not need the depth of mathematical knowledge. What they need is a level of trust in The Queen.

CTFP is a great companion here. It covers lots of ground while being intuitive and interesting to read. It delivers a list of useful concepts and helps building that trust. CTFP replaces fear of formalism with curiosity. Curiosity is what writes interesting programs.
My goal, in my first reading of CTFP, was to just get the concepts down and to start applying them in my code (often blindly).

Intermediate Trinitarianist

Next steps are as simple as defining the application types and then thinking what kind of theorems I can prove about these types. Consider this very simple (and incomplete) example

As an engineer I can think that I am implementing HTML rendering using polymorphism and leveraging other code. As a trinitarianist I see theorems and proofs (well, not proofs because I was lazy and have used undefined).
instance definition is a theorem and so is the type signature of fancyTreeToHTML. For a trinitarianist, -> and => are the modus ponens!
Trinitarianist views code like this as proving ToHTML theorems about the application types.
The above code may differ or go beyond the straightforward mapping in the Curry-Howard correspondence, but from a pragmatic standpoint, the mapping to logical derivation remains clear and effective.
The slogan is: types are theorems and programs are proofs, and this code reflects the slogan well.

Continuing with the equals saga: not everything can be compared with everything, being able to compare elements of type a is a theorem Eq a. For simple types (called ADT) that theorem is a simple boilerplate but not so for more complex types.
So, for example, removing duplicates form [a] is a theorem that assumes evidence of Eq a

That sounds like “this’s just semantics”. Right on! It is just semantics, but only for code that is close to logic. Pick some Java program at random and try to think about it as theorems and proofs. That will not work so well, will it?

Since logic and programming are one and the same, it is nice when the programming language supports logical concepts such as universal or existential quantification (the forall and exists quantifiers in logic). These are the System F type constructions that also have important categorical representations (for example, see (co)limits, N_P2Ch02a_LimitsColimitsExtras).
Trinity complements each other.

Remember, the end goal is to have more correct or even certifiable programs.
More advanced uses of types allow compiler to verify all kinds of things. Here are some examples I have played with in this project: * N_P2Ch02b_Continuity compiler checks type cardinality N_P2Ch02b_Continuity, * N_P3Ch15b_CatsAsMonads compiler helps me verify that 2-cell mu in the bicategory of spans is the same as the composition in the underlying category.

There are limits to what compiler can do and pencil and paper proofs are still needed. However, this situation is changing, see this blog about proofs of laws,
or this example in my IdrisTddNotes project Functor laws, Idris vs Haskell.

A lot of programming is about verifying that programs are equivalent, replaceable. Finding if given 2 lambda expressions are equivalent is undecidable. Static analysis tools (AI or not) are unlikely to automate this. One big ticket item here is the task of refactoring where, typically, things should work the same before and after (think performance refactoring). A trinitarianist will use tools like equational reasoning, or maybe some form of structural induction to certify correctness of refactoring. Category theory augments this with a nice set of ready refactoring tools (like the Yonada transformation).

Or, instead of me refactoring, the compiler could rewrite the code and optimize it. This idea extends nicely to fusion/code rewrite optimizations available in Haskell’s Core (Haskell compiles to an intermediate language which is a lambda calculus and is programmable). Respect for logic == high performance! Tools like Yonada transformation are especially important in languages where fusion is not an option.

Any trinitarianist will be interested in the language design. Designing domain specific languages allows programmers to define their own semantic rules. These are the rules for formal reasoning on the code. Category theory supplies DSL creating tools such as free monad and cofree comonad. The power of trinity is in combining all of the 3!

More advanced use of categories means that I

  • Prove laws about my computations.
  • Use a toolbox of certified refactoring solutions.
  • Have certified tools for creating DSLs.
  • Use category theory to come up with how to do stuff, for example program with diagrams! See N_P1Ch08b_BiFunctorComposition, N_P3Ch02a_CurryAdj, or N_P3Ch02c_AdjProps
  • Understand why some code looks so similar to other N_P3Ch11a_KanExt
  • Better understand computational structure, for example iso-recursion N_P3Ch08a_Falg
  • Use categories in defining business requirements (that is something I would like to research more.)

and so on..

Current industry and social bias against formalism and mathematics may prevent people from writing code that looks like proofs. At the same time, I am very convinced that anyone capable of learning how to program can learn how to write proofs. The hardest part about either task is being able to survive the confinement of removed ambiguity (no hand-waving). Unfortunately, can does not mean will. Category theory offers a partial work-around here. It allows engineers to implement code using logically solid building blocks while someone else has certified the correctness. Again it is about that trust in the queen idea I wrote about earlier.

CTFP is a great companion to an intermediate trinitarianist. I can re-read parts of it, it builds base knowledge, prepares me to do more research. I do not want to admit to how many times I re-read some of the chapters.

Conclusions

There appear to be 2 very different parts to writing software: programming and software engineering.
They are different because programming has a direct correspondence to logic, software engineering does not.

Programming is about logic, proofs, types, categories.
Programming is about certifiable software and provable correctness.
Programming is about audacity of using logic when solving logical problems.
Software engineering is pragmatic, suspicious of formalism, and has deadlines.
Software engineering is about correctness by a lot and lot of testing effort (and, often, ignoring provable incorrectness).

Programming is a very small share of the overall software development (looking at language usage statistics suggest about 1%). My personal interest and bias is clearly on the programming side but I do think of myself as an engineer too. I think current balance point is just wrong, there needs to be more that 1% market share of programming in software!

Besides, that 1% just proves my point. Writing code is like skiing, driving a car, or anything else. Just look at what everyone is doing, do exactly the opposite and you will be just fine.

Many software engineers feel personally offended by some of this. If anywhere, the blame should be placed on education. I wish I had something like CTFP when I started learning how to program. I was a 3rd year student immersed in pure mathematics. The instruction made no use of anything I knew already. Not even mid school/high school algebra. Think about (a ^ b) ^ c = a ^ (c * a) or a ^ (b + c) = a^b * a^c we know as currying and pattern match, nope. The opportunity to leverage concepts I knew well was lost. I think CS education has failed me and has failed most of us. Hopefully this situation will change (CMU new curriculum). For us old-timers, there are some good books that connect the dots, books like CTFP.

Category theory is one of the big three. It offers logical solutions to programming problems. It offers understanding of the code structure. CTFP is currently the best book to learn it if you are a coder and not a working mathematician. It is not a very hard reading. What are you waiting for?

Would we be making different technology decisions if we read CTFP? Yes.

35 comments:

  1. Really great post, I simply unearthed your site and needed to say that I have truly appreciated perusing your blog entries. I want to say thanks for great sharing.
    Rprogramming Training in annanagar

    r-programming Training in marathahalli

    r-programming Training in rajajinagar

    r-programming Training in bangalore

    ReplyDelete
  2. I just see the post i am so happy the post of information's.So I have really enjoyed and reading your blogs for these posts.Any way I’ll be subscribing to your feed and I hope you post again soon.I just see
    best sap simple finance online training institute in hyderabad

    ReplyDelete
  3. Inspiring writings and I greatly admired what you have to say , I hope you continue to provide new ideas for us all and greetings success always for you..Keep update more information..

    rpa training in electronic-city | rpa training in btm | rpa training in marathahalli | rpa training in pune

    ReplyDelete
  4. I have read a few of the articles on your website now, and I really like your style of blogging. I added it to my favourites blog site list and will be checking back soon.
    Python training in marathahalli | Python training institute in pune

    ReplyDelete
  5. Its really an Excellent post. I just stumbled upon your blog and wanted to say that I have really enjoyed reading your blog. Thanks for sharing....

    java training in chennai | java training in electronic city

    java training in marathahalli | java training in btm layout

    ReplyDelete
  6. Your very own commitment to getting the message throughout came to be rather powerful and have consistently enabled employees just like me to arrive at their desired goals.
    Data Science training in Chennai
    Data science training in Bangalore
    Data science online training

    ReplyDelete
  7. Nice explanation and clearly mentioned the topic
    Best Play and Pre School for kids in Hyderabad,India. To give your kid a best environment and learning it is the right way to join in play and pre school were kids can build there physically, emotionally and mentally skills developed. We provide programs to kids like Play Group, Nursery, Sanjary Junior, Sanjary Senior and Teacher training Program.
    Preschool in hyderabad

    ReplyDelete
  8. Excellent blog information by the author
    Sanjary Academy is the best Piping Design institute in Hyderabad, Telangana. It is the best Piping design Course in India and we have offer professional Engineering Courses like Piping design Course, QA/QC Course, document controller course, Pressure Vessel Design Course, Welding Inspector Course, Quality Management Course and Safety Officer Course.
    Piping Design Course in Hyderabad ­

    ReplyDelete
  9. Good blog information provided of the author I liked it

    Pressure Vessel Design Course is one of the courses offered by Sanjary Academy in Hyderabad. We have offer professional Engineering Course like Piping Design Course,QA / QC Course,document Controller course,pressure Vessel Design Course,Welding Inspector Course, Quality Management Course, #Safety officer course.
    Document Controller course
    Pressure Vessel Design Course
    Welding Inspector Course
    Safety officer course
    Quality Management Course
    Quality Management Course in India

    ReplyDelete
  10. Good blog information
    Yaaron Studios is one of the rapidly growing editing studios in Hyderabad. We are the best Video Editing services in Hyderabad. We provides best graphic works like logo reveals, corporate presentation Etc. And also we gives the best Outdoor/Indoor shoots and Ad Making services.

    Best video editing services in Hyderabad,ameerpet
    Best Graphic Designing services in Hyderabad,ameerpet­
    Best Ad Making services in Hyderabad,ameerpet­

    ReplyDelete
  11. Your info is really amazing with impressive content..Excellent blog with informative concept. Really I feel happy to see this useful blog, Thanks for sharing such a nice blog..
    If you are looking for any Data science Related information please visit our website Data science courses in Pune page!

    ReplyDelete
  12. Your info is really amazing with impressive content..Excellent blog with informative concept. Really I feel happy to see this useful blog, Thanks for sharing such a nice blog..
    If you are looking for any Data science Related information please visit our website Data science courses in Pune page!

    ReplyDelete
  13. resolver
    We are an MRO parts supplier with a very large inventory. We ship parts to all the countries in the world, usually by DHL AIR. You are suggested to make payments online. And we will send you the tracking number once the order is shipped.

    ReplyDelete
  14. It’s always so sweet and also full of a lot of fun for me personally and my office colleagues to search your blog a minimum of thrice in a week to see the new guidance you have got.

    Best PHP Training Institute in Chennai|PHP Course in chennai
    Best .Net Training Institute in Chennai
    Oracle DBA Training in Chennai
    RPA Training in Chennai
    UIpath Training in Chennai

    ReplyDelete
  15. Thanks for sharing informaton
    "Pressure Vessel Design Course is one of the courses offered by Sanjary Academy in Hyderabad. We have offer professional
    Engineering Course like Piping Design Course,QA / QC Course,document Controller course,pressure Vessel Design Course,
    Welding Inspector Course, Quality Management Course, #Safety officer course."
    Piping Design Course
    Piping Design Course in India­
    Piping Design Course in Hyderabad
    QA / QC Course
    QA / QC Course in india
    QA / QC Course in Hyderabad
    Document Controller course
    Pressure Vessel Design Course
    Welding Inspector Course
    Quality Management Course
    Quality Management Course in india
    Safety officer course

    ReplyDelete
  16. This comment has been removed by the author.

    ReplyDelete
  17. I think you did an awesome job explaining it. Sure beats having to research it on my own. Thanks
    Maharshi Dayanand Saraswati University BCOM 1st, 2nd & Final Year Time Table 2020

    ReplyDelete
  18. We develop free teaching aids for parents and educators to teach English to pre-school children. For more info please visit here: English for children

    ReplyDelete
  19. Nice & Informative Blog !
    you may encounter various issues in QuickBooks that can create an unwanted interruption in your work. To alter such problems, call us at QuickBooks Phone Number 1-855-974-6537 and get immediate technical services for QuickBooks in less time.

    ReplyDelete
  20. Hey Nice Blog !
    Our team at QuickBooks Customer Service have been serving QuickBooks users for the past few years in light of the Corona Crisis. 

    ReplyDelete
  21. Hello, this weekend is good for me, since this time i am reading this enormous informative article here at my home.

    ReplyDelete
  22. HVAC & Plumbing Services
    Air Star Heating guarantees reliability and quality for all equipment and services.

    Air Star Heating specialists always try to deliver the most excellent quality of services to our customers at an affordable price. It is understood that every client has different needs and different problems. We try to accomplish the needs of every client according to their requests. We are having considerable experience in this field. Our specialists understand very well how things work. It doesn’t matter in which field of industry you are looking for services.
    Plumbing & HVAC Services in San Diego. Call now (858) 900-9977 ✓Licensed & Insured ✓Certified Experts ✓Same Day Appointment ✓Original Parts Only ✓Warranty On Every Job.
    Visit:- https://airstarheating.com

    ReplyDelete
  23. HVAC & Plumbing Services
    Air Star Heating guarantees reliability and quality for all equipment and services.

    Air Star Heating specialists always try to deliver the most excellent quality of services to our customers at an affordable price. It is understood that every client has different needs and different problems. We try to accomplish the needs of every client according to their requests. We are having considerable experience in this field. Our specialists understand very well how things work. It doesn’t matter in which field of industry you are looking for services.
    Plumbing & HVAC Services in San Diego. Call now (858) 900-9977 ✓Licensed & Insured ✓Certified Experts ✓Same Day Appointment ✓Original Parts Only ✓Warranty On Every Job.
    Visit:- https://airstarheating.com

    ReplyDelete
  24. Hey! Mind-blowing blog. Keep writing such beautiful blogs. In case you are struggling with issues on QuickBooks software, dial QuickBooks Customer Service Phone Number . The team, on the other end, will assist you with the best technical services.

    ReplyDelete
  25. Heyy!!!!
    I am so happy to service.If you want a best service. So quick go Quickbooks Customer Service you can contact us at.+1 346-414-8256,TX

    ReplyDelete
  26. You have lots of great content that is helpful to gain more knowledge. Best wishes.

    B.Com 1st 2nd 3rd year exam date sheet

    ReplyDelete
  27. Exploring software opinions and programming is fascinating, but delving into data analytics through courses in Glasgow can elevate your IT career to new heights, empowering you with the skills to extract valuable insights from data. Please also read Data Analytics courses in Glasgow

    ReplyDelete