TL;DR

A divide by zero caused by the parameters is deterministic, so the function is still pure (its output depends only on its inputs).

A lost database connection is not deterministic: the exception depends on the environment rather than the parameters. The same goes for calls to external services and so on.

Cats IO monads have side effects and that means the FP folks understand not all code can be pure (outputs depend only on inputs).

Cats and FP Monads should be used as follows:

  • throw new Exception or .raiseError when the use case cannot be completed. Handle the exception at the top, where the event arrives.
  • Option.fold or Option.getOrElse or if/else when the use case has a green path and a single exception path which also results in some sort of execution (state update, events to other processes etc.)
  • Either or another structure when there is a single main green path to the use case, but multiple other green paths which can result in different successful outcomes.
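
The three cases above can be sketched with nothing but the standard library. This is a minimal illustration, not code from the project under review: the names reserve, greeting, fulfil, Shipped and BackOrdered are all hypothetical.

```scala
// Hypothetical domain: every name here is illustrative only.
object UseCaseShapes {
  final case class OutOfStock(item: String) extends RuntimeException(s"Out of stock: $item")

  // 1. The use case cannot be completed: raise, and let the top-level handler deal with it.
  def reserve(item: String, stock: Map[String, Int]): Int = {
    val qty = stock.getOrElse(item, 0)
    if (qty <= 0) throw OutOfStock(item)
    qty - 1 // remaining stock after the reservation
  }

  // 2. One green path and one alternative path: Option.fold covers both.
  def greeting(nickname: Option[String]): String =
    nickname.fold("Hello, guest")(n => s"Hello, $n")

  // 3. Several successful outcomes: both sides of the Either are business results,
  //    neither is a Throwable.
  final case class Shipped(orderId: Int)
  final case class BackOrdered(etaDays: Int)

  def fulfil(inStock: Boolean): Either[BackOrdered, Shipped] =
    if (inStock) Right(Shipped(1)) else Left(BackOrdered(7))

  def message(outcome: Either[BackOrdered, Shipped]): String =
    outcome.fold(b => s"arrives in ${b.etaDays} days", s => s"order ${s.orderId} shipped")
}
```

Only reserve can blow up, and nothing in its signature forces callers to deal with that. The other two keep the branch inside a small function and hand back a plain result.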

Do not pollute your code base with Either[Throwable, _], as you are bringing back functions with checked exceptions, and all of Scala and the world knows Java was wrong about this.

Why write this

I am reviewing a completed project and trying to tease out lessons learnt, so we can simplify the code for the maintenance devs, rather than make the code cleverer and harder to read.

While thinking about this I decided it’s the conditional processing and failure handling where we could do better, so I tried to set down my reasoning about why Either is often really bad.

What is a Monad?

Because it’s a really bad and stupid word, which stopped me reading a lot of this stuff when I was learning Scala, here is my attempt to describe it.

A Monad is code which executes code. It’s code that calls a callback, often in a loop.
However, it doesn’t have to do it now; often it builds up a pipeline which will only be executed later, when an event comes in.

In Scala that means it implements .map and a few others (.flatMap in particular). Rather than an imperative loop which calls your function, you write .map - which internally has a loop which calls your function. In Cats you will very quickly use the IO monad. Each IO is a function which Cats will call in its main run loop.
So:

def foo() : IO[Int] =
  for {
    i <- someIO()
    j <- someIOAgain()
  } yield j

Composing foo into your code base will NOT call it until the Cats run loop calls it, and then it can schedule the someIO and someIOAgain functions onto the run loop if they are async. Inside a for comprehension the steps still run one after the other; running them in parallel takes an explicit combinator such as parMapN. Either way Cats manages the execution thread pools and so on for you. Basically you get a lot for free, rather than doing it yourself. ‘For free’ actually means reading and studying for weeks to join the Cats club.
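
To make the "nothing runs until the run loop calls it" point concrete, here is a deliberately tiny stand-in for IO written with the standard library only. Lazy, step and calls are made-up names for this sketch, and the real Cats IO does far more (async, cancellation, error channels), so treat it as an analogy rather than how Cats is implemented.

```scala
object DeferredPipeline {
  // A toy "monad": it wraps a function and only runs it when asked to.
  final class Lazy[A](val run: () => A) {
    def map[B](f: A => B): Lazy[B] = new Lazy(() => f(run()))
    def flatMap[B](f: A => Lazy[B]): Lazy[B] = new Lazy(() => f(run()).run())
  }
  object Lazy {
    // By-name parameter: the body is captured, not evaluated.
    def apply[A](a: => A): Lazy[A] = new Lazy(() => a)
  }

  var calls = 0 // counts how many steps have actually executed

  def step(n: Int): Lazy[Int] = Lazy { calls += 1; n }

  // Building the pipeline executes nothing; it only describes the work.
  val pipeline: Lazy[Int] =
    for {
      i <- step(1)
      j <- step(2)
    } yield i + j
}
```

After the object is initialised calls is still 0. Calling DeferredPipeline.pipeline.run() executes both steps in order, which is roughly the job the Cats run loop does for an IO.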

The Checked Exception debate

Long ago, Java said exceptions are part of the function signature and everyone who calls a method should know the contract. Exceptions are exceptional, but they could also be used for validation (e.g. assert, or config missing etc.)

There were also environmental JVM type exceptions - OOM and so on. Checking these in every function sig would be silly, so we have checked exceptions and unchecked.

Time passes, and lots of devs get sick of the boilerplate and long function signatures when in reality the exceptions were basically errors shown to the user, or error messages sent back to the caller. Either way the handling tends to be done at the top level, in the code which is called by some sort of incoming event.

So later languages, C# and Scala among them, did without checked exceptions.

Functional programming

Scala devs don’t do checked exceptions.

When using Cats you are encouraged to think about pure functions far more, so having the stack unwind because of an exception thrown in a nested call is not considered great, even if it is deterministic. The preferred alternative is Either or Validated… which is basically shorthand for checked exceptions again.

Cats

In Cats, Either or Option can be used to short-circuit execution of your Monad pipeline.

something
  .flatMap(a => fn(a))
  .flatMap(a => fn1(a))

i.e.

for {
  s <- something
  s1 <- fn(s)
  s2 <- fn1(s1)
} yield s2

If you want a condition to break out of the pipeline then return an Either, which is right biased, i.e. if you return a Left then the remaining steps will not run.
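
A small standard-library sketch of that short circuit follows. parse, positive and the executed log are hypothetical names added purely to show which steps run.

```scala
import scala.util.Try

object ShortCircuit {
  var executed = List.empty[String] // records which steps ran, in order

  def parse(s: String): Either[String, Int] = {
    executed = executed :+ "parse"
    Try(s.toInt).toOption.toRight(s"'$s' is not a number")
  }

  def positive(n: Int): Either[String, Int] = {
    executed = executed :+ "positive"
    if (n > 0) Right(n) else Left(s"$n is not positive")
  }

  // Right-biased: the first Left stops the pipeline and becomes the result.
  def pipeline(s: String): Either[String, Int] =
    for {
      n <- parse(s)
      p <- positive(n)
    } yield p * 2
}
```

With a numeric input both steps run and the result is a Right. With a non-numeric input parse returns a Left, and positive is never called.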

If you use Exceptions within the Either then it pollutes your entire code base because you have to return Either all the way to the top event handler which then handles it.

Checked exception madness again.

IO Monads

IO is used for side effects, and side effects typically suffer from all sorts of environmental failures - disk full, database down, system X not contactable etc.

With Cats IO you can throw an Exception or you can IO.raiseError. In fact throwing an exception inside an IO has the same effect as raiseError, as the Cats IO run loop catches it and turns it into a failed IO.

You can then handle the error someplace in your pipeline:
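
The same equivalence can be seen, by analogy, in the standard library's Try, which also catches a thrown exception and turns it into a failed value. This sketch does not use Cats at all; DbDown and recoverWithDefault are invented for the example.

```scala
import scala.util.{Failure, Try}

object ThrowVsRaise {
  final case class DbDown(msg: String) extends RuntimeException(msg)

  // Throwing inside the wrapper...
  val thrown: Try[Int] = Try { throw DbDown("connection lost") }
  // ...ends up in the same state as building the failure explicitly.
  val raised: Try[Int] = Failure(DbDown("connection lost"))

  // One handler deals with both, wherever in the pipeline you choose to put it.
  def recoverWithDefault(t: Try[Int]): Int =
    t.recover { case _: DbDown => -1 }.getOrElse(0)
}
```

Both values are failures carrying a DbDown, and recoverWithDefault treats them identically.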

// Example of local error handling: a green path, an effectful failure path, and
// an Either used to show the user some sensible message
private def handleOrder1(itemId: Int, userNm: String): IO[Either[String, IO[Int]]] = {
  for {
    goods <- dbLookup(itemId)
    updateRes <- updateInventory(itemId)
      // Map the green path first; mapping after the recovery would overwrite the Left with a Right
      .map(_ => Right(getInventory(itemId)): Either[String, IO[Int]])
      .handleErrorWith(_ => {
        val ordered = placeOrder(itemId, userNm)
        IO { Left(s"Out of Stock of ${goods.name}, we will order another [$ordered] soon"): Either[String, IO[Int]] }
      })
  } yield updateRes
}

In fact an Exception with a top level interpreter would be easier to read. You could even argue the user doesn’t care that you have ordered more - as their order failed.

private def handleOrder1Again(itemId:Int, userNm:String):IO[Int] = {
  for {
    goods <- dbLookup(itemId)
    updateRes <- updateInventory(itemId)
      .handleErrorWith(_=>{
        val ordered = placeOrder(itemId, userNm)
        IO.raiseError(OutOfStockException(s"Out of Stock of ${goods.name}, we will order another [$ordered] soon"))
      })
      .flatMap(_=>getInventory(itemId))
  } yield updateRes
}

FP Cats conditional activity based on state

Sorry, but state is not all in the parameters. State is usually in the data store, or possibly in the accumulated event stream materialised in your KTable etc.

For instance, the event is to purchase a widget, so you check the inventory and the widget is either in stock or not. If in stock then sell it and update the inventory, or if not in stock tell the customer to go someplace else, but perhaps update our own stock system to order in some more. And so on.

So, as our previous example showed:

private def handleOrder1Again(itemId:Int, userNm:String):IO[Int] = {
  for {
    goods <- dbLookup(itemId)
    updateRes <- updateInventory(itemId)
      .handleErrorWith(_=>{
        val ordered = placeOrder(itemId, userNm)
        IO.raiseError(OutOfStockException(s"Out of Stock of ${goods.name}, we will order another [$ordered] soon"))
      })
      .flatMap(_=>getInventory(itemId))
  } yield updateRes
}

Our code is now a horrible mess rather than the lovely examples we usually see:

for {
  s <- something
  s1 <- fn(s)
  s2 <- fn1(s1)
} yield s2

We have dbLookup - presumably this could fail, maybe it returns an Either with a Left where the itemId doesn’t exist at all.

updateInventory uses Doobie (say), and Doobie loves to throw exceptions, so our DAO followed that pattern and does an IO.raiseError or throw new Exception(""). That is a branch point: either success (in the code above we call getInventory), or a call to placeOrder to get more in and then an exception with a user message.

All wrapped in an IO monad as this means our code is now ‘functional programming’.

If you log your exceptions, they will never give you a full call stack, only the stack for the little lump of code (i.e. function) which the Cats run loop is executing. To be fair, Akka and other frameworks which schedule functions all suffer from this, and Java SpringBoot exceptions are utterly terrible.

What the heck?

So, given all programming takes in events and then has to tie those events to current state (in whatever form), is FP even worth it? It is, because you will start to write pure functions, and your code will tend away from the monolith towards function pipelines. Those are very appropriate for elastic scaling on the cloud, which should mean you save money if you do it right - no idle CPU cycles to cope with peak demand.

But don’t use Either too much, as it can really hit your code base hard, bringing checked exceptions back into your functions.

So, what to do to avoid the .handleErrorWith all over the place (GoLang error approach)?

Is there an Answer?

Stop worrying. Most systems these days are Lambdas disguised as web servers, i.e. you write a monolith (or microlith) which has 25 REST endpoints, which deal with incoming events. Each of these events then calls a processing pipeline independent of the rest of your code.

  • Create some definition of the URLs mapped to the handlers
  • Wrap these routes in loggers and security checkers, with error handlers
  • Bootstrap the web server
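
The first two steps can be sketched in plain Scala. This is an illustration under loud assumptions: Handler, routes, dispatch and the status codes are all invented here, a real service would use an HTTP library, and the bootstrap step is left out entirely. The point is that handlers simply throw, and the one error handler lives at the top where the event arrives.

```scala
import scala.util.{Failure, Success, Try}

object TinyRouter {
  type Handler = String => String // request body in, response body out (hypothetical shape)

  final case class OutOfStock(item: String) extends RuntimeException(s"Out of stock: $item")

  // Handlers are small pipelines; they throw when the use case cannot be completed.
  val order: Handler = item => if (item == "widget") s"ordered $item" else throw OutOfStock(item)
  val health: Handler = _ => "ok"

  // 1. URLs mapped to handlers.
  val routes: Map[String, Handler] = Map("/orders" -> order, "/health" -> health)

  var log = Vector.empty[String] // stand-in for a real logger

  // 2. Every route is wrapped with logging and a single top-level error handler.
  def dispatch(path: String, body: String): (Int, String) = {
    log = log :+ s"-> $path"
    routes.get(path) match {
      case None => (404, s"no route for $path")
      case Some(handler) =>
        Try(handler(body)) match {
          case Success(res)           => (200, res)
          case Failure(e: OutOfStock) => (409, e.getMessage)
          case Failure(_)             => (500, "internal error")
        }
    }
  }
}
```

None of the handlers mention errors in their signatures, yet every failure ends up as a sensible response because the mapping from exception to reply lives in one place.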

Hopefully you have all played with AWS Lambdas and know that web servers are dead and you simply have 25 Lambdas - relying on the infra for security and CORS and so on.

Now, in this new world, state is all external as Lambdas are short lived (they are often on warm standby, but no state is maintained in memory ideally) which means Events are often processed against external state (OK, some processing has all the data in the event, but not where I work).

So, should you use FP? Is Cats useful? Yes, even though it can be obscure. Exceptions will exist (Doobie uses them), so do not wrap them all in Either - the checked exception debate is over and the world moved away from them.

The Answer

Given all your AWS Lambda functions will use IO monads, IO.raiseError or throw new Exception is fine if there really is a reason to fail the green path for this event. If you have conditional processing then Either could be handy for a use case with multiple green code paths, or an Option if there are only two. However, hide it away in a function so you do not pollute your entire code base with them.

Conditional programming should not be done like this:

private def handleOrder1Again(itemId:Int, userNm:String):IO[Int] = {
  for {
    goods <- dbLookup(itemId)
    updateRes <- updateInventory(itemId)
      .handleErrorWith(_=>{
        val ordered = placeOrder(itemId, userNm)
        IO.raiseError(OutOfStockException(s"Out of Stock of ${goods.name}, we will order another [$ordered] soon"))
      })
      .flatMap(_=>getInventory(itemId))
  } yield updateRes
}

but you should try to have your main function pipeline reading like the FP examples, just use really descriptive function names, and keep the functions small.

private def handle(itemId:Int, userNm:String) = {
  for {
    goods <- dbLookup(itemId)
    res <- updateInventoryOrFailAfterPlacingOrderForMore(goods, userNm)
  } yield res
}

i.e. as in OO, keep your code readable, and do worry about code complexity even when pipelining your functions. Code complexity increases with more branches and more nesting in a function.

p.s. both examples above are fine - it’s a small branch and you could argue both are very readable, but long function pipelines with multiple error handlers are terrible to read for any poor support coders. I have found small pipelines soon become long ones, so get used to writing lots of little descriptive functions - hiding away the branch points in the use case with well named functions.

Small functions are great, small pipelines are great, tests are great.