C# and FP – episode 1 – Why?

Introduction

Functional programming is how programming should be. We want behaviours (functionalities), that receive an input and produce an output. Simple as that. Of course we might need to process again and again the data in hand, but this is also part of the expected behaviour: one’s function output is the other one’s input.

The ultimate goal is to deliver a software product built with reliable code, and the best way to do that is simplicity. Therefore programmers’ main responsibility is to to reduce code complexity. The whole picture is that OOP doesn’t deliver as excepted, nor in code quality nor in deadlines. It looks good in diagrams, but once the complexity starts increasing ,things, slower or faster, are getting out of hand. Especially when the state is mutable and shared, then a chaos is on the loose. Even full test coverage worth nothing, if the source code is complex and not maintainable.

In the seventies, the idea of “real OOP” was hugely powerful, but what was implemented was far from a complete set of ideas, especially with regard to scaling, networking, etc. How dynamic objects intertwined with ontologies and inference was explored by Goldstein and Bobrow at Parc. Their four papers on PIE and their implementation were the best extensions ever done to Smalltalk, and two of the ideas transcended the Smalltalk structure and deserved to be the start of a new language, and perhaps have a new term coined for it.

Alan Kay’s original idea about OOP

The term “Object Oriented Programming” was first coined in 1996 by Alan Kay. Based on his answers in 2003, via email Stefan Ram’s (a German computer science professor in Berlin at that time), his ideas on OOP were completely different from what we have today in our hands as OOP languages.

At Utah sometime after Nov 66 when, influenced by Sketchpad, Simula, the design for the ARPAnet, the Burroughs B5000, and my background in Biology and Mathematics, I thought of an architecture for programming. It was probably in 1967 when someone asked me what I was doing, and I said: “It’s object-oriented programming”.

– I thought of objects being like biological cells and/or individual computers on a network, only able to communicate with messages (so messaging came at the very beginning – it took a while to see how to do messaging in a programming language efficiently enough to be useful).

– I wanted to get rid of data. I realized that the cell/whole-computer metaphor would get rid of data, and that “<-” would be just another message token (it took me quite a while to think this out because I really thought of all these symbols as names for functions and procedures.

– My math background made me realize that each object could have several algebras associated with it, and there could be families of these, and that these would be very very useful.

Alan Kay answering to Paul Ram in 2003

So far, based on Alan Kay’s answers, he focuses on cells (objects) exchanging messages to each other. His true goal was messaging.

The term “polymorphism” was imposed much later (I think by Peter Wegner) and it isn’t quite valid, since it really comes from the nomenclature of functions, and I wanted quite a bit more than functions. I made up a term “genericity” for dealing with generic behaviors in a quasi-algebraic form. […]

OOP to me means only messaging, local retention and protection and hiding of state-process, and extreme late-binding of all things. It can be done in Smalltalk and in LISP. There are possibly other systems in which this is possible, but I’m not aware of them.

Alan Kay answering to Paul Ram in 2003

Inheritance and polymorphism are not even mentioned! In the end, according to Alan Kay, the three pillars of OOP are:

  1. Message passing
  2. Encapsulation
  3. Dynamic binding

Combining message passing and encapsulation we try to achieve the following:

  • Stop sharing mutable state among objects, by encapsulating it and allow only local state changes. State changes are at a local, cellular level rather than exposed to shared access.
  • A messaging API is the only way the objects communicate. Thus the objects are decoupled. The messages sender is loosely or not coupled at all to the message receiver.
  • Resilience and adaptability to changes at runtime via late binding.

In the “The Early History Of Smalltalk” Alan Kay writes the following:

[…] the whole point of OOP is not to have to worry about what is inside an object. Objects made on different machines and with different languages should be able to talk to each other […]

Alan Kay – The Early History Of Smalltalk

This sentence actually is talking about distributed and concurrent systems. Objects hide their states from each other and they just communicate (“talk to each other”) by exchanging messages. In simple words, objects should be able to broadcast that they did things (changed their state actually) and the other objects can ignore them or respond. This concept reminds of agents modelling or even actors. The key point that can improve the isolation among objects, is that the receiver is free to ignore any messages it doesn’t understand or care about.

Finally, let’s remember one more of Alan Kay’s quotes

I made up the term object-oriented, and I can tell you I did not have C++ in mind.

Alan Kay

It was in the eighties that “object-oriented languages” started to appear. C++ was part of a set of ideas starting around 1979 by Bjarne Stroustrup. C++ was designed to provide Simula’s facilities for program organization together with C’s efficiency and flexibility for systems programming. His approach was via “Abstract Data Types”, and this is the way “classes” in C++ are generally used. C++ was a pre-processor to C language. “Classes” were program code structuring conventions but didn’t show up as objects during run time.

Hence the quote, as Alan Kay states in Quora, which is not so much about C++ per se but about the term that we had been using to label a particular approach to program language and systems design.

OOP and human cognition

In 1995 a paper was published by Bill Curtis under the name ” Objects of Our Desire: Empirical Research on Object-Oriented Development“. Among others, there is the following sentence:

In careful experiments, Gentner (1981; Gentner & France, 1988) showed that, when people are asked to repair a simple sentence with an anomalous subject-verb combination, they almost always change the verb and leave the noun as it is, independent of their relative positions. This suggests that people take the noun (i.e. the object) as the basic reference point. Models based on objects may be superior to models based on other primitives, such as behaviours.

Objects of Our Desire: Empirical Research on Object-Oriented Development, Bill Curtis

So a paper published in the nineties cites experiments that were run in the eighties, to support OOP concept. Well, in the nineties that might made sense, based on how the software industry was at that time, but not today. Today’s software is moving towards serverless applications, which are functions as a service, rather than to complicated objects communicating to each other.

The line of business software in our times is so complex, that OOP + TDD, OOP + DDD or OOP + BDD are concepts that programmers still struggle with. What is the right number of objects? How deep the granularity of objects should be? How mutable the objects should be? What is the right architecture to follow? Although there are tons of books and articles about those issues, software projects fail, due to complexity.

Additionally Bill Curtis paper includes the following:

Miller (1991) described how nouns and verbs differ in their cognitive organizational form. Nouns – and hence the concepts associated with them – tend to be organized into hierarchically structured taxonomies, with class inclusion and part-whole relations as the most common linkages. These are also, of course, the most common relations in OO representations.

In human cognition, these hierarchies tend to be fairly deep for nouns – often six to seven layers. These hierarchies support a variety of important cognitive behaviours, including the inheritance of properties from super ordinate classes. In contrast, verbs tend to be organized in very flat and bushy structures. This again suggest a central place for objects, in that building inheritance hierarchies will mirror the way humans represent natural categories only if the basic building blocks are objects rather than processes or behaviours.

Objects of Our Desire: Empirical Research on Object-Oriented Development, Bill Curtis

So through linguistics principles, the paper supports OOP, that is objects (nouns), verbs (methods) and the hierarchy among them. But a thing missing here, is what about those programmers, whose native tongue is not English? How their brains work? Can they adapt easily, to the OOP logic or not? Even today sometimes I see variables named in other languages that English.

You will not find a single medical article that denotes that the human brain thinks, organizes, structures based on objects. We carry a “todo list”, not an “item hierarchy list”. Human brains can only hold about five items at a time in working memory. It is much easier to explain a piece of code based on what it does, rather than based on what variables change around the source code. Each language has a set or rules to constraint you, in order to speak and write correctly, it is called grammar! On the other hand in OOP programming you have so many options to solve the same problem, that in the end you can just throw any “grammar” out of the window.

Additionally, OOP code is non- deterministic. You can verify that by installing a cyclomatic complexity extension to your IDE and run it. Dependencies, null checking, type checking, conditional statements, all of them combined produce more outputs than expected. Let’s not forget Mock object in unit testing, you have to predefine its behaviour. So even if you have an a grammar, a structure of your objects, there is no guarantee that the functionalities implementation is going to be according to the grammar.

Finally, we have dependencies hell. And it”s only about nuget packages or maven dependencies. It’s the source code’s internal hierarchy. Inheritance, methods, constructors parameters, Law of Demeter, etc. So how nouns and verbs and objects hierarchy are equal to simple code without extra complexity is still a mystery.

Why Functional Programming (FP)?

Functional programming is a programming paradigm: a different way of thinking about programs than the mainstream, imperative paradigm you’re probably used to. FP is based on lambda calculus. Functions tend to provide a level of code modularity and reusability It manages nullability much better, and gives us a better way of error handling.

FP provides the following:

Power.—This simply means that you can get more done with less code. FP raises the level of abstraction, allowing you to write high-level code while freeing you from low-level technicalities that add complexity but no value.

Safety. This is especially true when dealing with concurrency. A program written in the imperative style may work well in a single-threaded implementation but cause all sorts of bugs when concurrency comes in. Functional code offers much better guarantees in concurrent scenarios cause of immutability, so it’s only natural that we’re seeing a surge of interest in FP in the era of multi core processors.

Clarity. We spend more time maintaining and consuming existing code than writing new code, so it’s important that our code be clear and intention-revealing.

So how functional a language is C#? Functions are first-class values in C#. C# had support for functions as first-class values from the earliest version of the language through the Delegate type, and the subsequent introduction of lambda expressions made the syntactic support even better. There are some quirks and limitations, but we will discuss about them in time.

Today we have LINQ. Language-Integrated Query (LINQ) is the name for a set of technologies based on the integration of query capabilities directly into the C# language. Traditionally, queries against data are expressed as simple strings without type checking at compile time or IntelliSense support. With LINQ, a query is a first-class language construct, just like classes, methods, events. You write queries against strongly typed collections of objects by using language keywords and familiar operators. The LINQ family of technologies provides a consistent query experience for objects (LINQ to Objects), relational databases (LINQ to SQL), and XML (LINQ to XML).

Query expressions are written in a declarative query syntax. By using query syntax, you can perform filtering, ordering, and grouping operations on data sources with a minimum of code. You use the same basic query expression patterns to query and transform data in SQL databases, ADO .NET Datasets, XML documents and streams, and .NET collections.

The disadvantage in the C# + FP try, is that everything is mutable by default, and the programmer has to put in a substantial amount of effort to achieve immutability. Fields and variables must explicitly be marked read-only to prevent mutation. (Compare this to F#, where variables are immutable by default and must explicitly be marked mutable to allow mutation.) Finally, collections in the framework are mutable, but a solid library of immutable collections is available.

To highlight the difference between are in OOP and FP, I provide an example: You run a company and you just decided to give all your employees a $10,000.00 raise.

OOP (imperative way)FP
1. Create Employee class which initializes with name and salary, has a change salary instance method

2. Create instances of employees

3. Use the each method to change salary attribute of employees by +10,000
1. Create employees array, which is an array of arrays with name and corresponding salary

2. Create a change_salary function which returns a copy of a single employee with the salary field updated

3. Create a change_salaries function which maps through the employee array and delegates the calculation of the new salary to change_salary

The FP approach uses pure functions and adheres to immutability by using map With OOP, we cannot easily identify if the object has had the function called on it unless we start from the beginning and track if this has happened, whereas in FP, the object itself is now a new object, which makes it considerably easier to know what changes have been made.

FP leans heavily on methods that do one small part of a larger job, delegating the details to other methods. This combining of small methods into a larger task is composition. In our example, change_salaries has a single job: call change_salary for each employee in the employees array and return those values as a new array. change_salary also has one job: return a copy of a single employee with the salary field updated. change_salaries delegates the calculation of the new salary to change_salary, allowing change_salaries to focus entirely on handling the set of employees and change_salary to focus on updating a single employee.

Conclusions

I believe that anyone of you have understood, that the main key words are code simplicity, state immutability, messaging. From distributed mutable state around the source code (objects), to code organized by expected behaviours (functions).

FP is a programming parading as OOP is. OOP is alive and it will be for next years. But can we rely on it anymore? After 10 years as a software developer, I believe not anymore. Maybe I am wrong!

But instead of asking a better OOP language, I try to smoothly move to FP. Unfortunately I can’t completely move away from C#, due to business restrictions, but I do my best to find a better alternative.

The next episode

In the next episode, the topic is “OOP today” and an analysis about objects state.

References