Getting the code to work

Scala 3 Rest Service with Http4s and Tapir - January 2022

2022-02-06T00:00:00+00:00

Tapir Http4s with Scala 3

I wanted to try out a Scala 3 REST service, and Rho is not ready for Scala 3 yet, so Tapir wins.

There isn’t much to say really, other than what’s in the README on GitHub:

https://github.com/PendaRed/scala3-service

FS2 Cribsheet - August 2021

2021-08-01T00:00:00+01:00

A very quick reference to the FS2 incantations that make up its API.

It is critical to understand that IO and SyncIO are basically classes holding your function, which means the libraries can chain operations together, place them in queues and so on. Rather than the JVM calling your code directly, the cats-effect and fs2 libraries build up queues of operations inside IOs, which they then get the JVM to run on a thread at some point.

IO: Something with a side effect, ie it’s a class which holds your function, so that it can be invoked synchronously or asynchronously with the eventual unsafeRunSync or unsafeRunAsync. Critically, an IO has a start API, so you can create a new fibre any time you like and schedule the work. But try not to… the fs2 streams API is where you should be heading.

SyncIO: Just like IO, but intended for synchronous effects, ie if your code should just be run and never done async then use this.

IO.as or SyncIO.as: This allows you to change the content and type of an effect’s result, while the original effect is still invoked. This is very useful if you have an effect producing the wrong type of output, which you still want to execute, but then treat as if it returned something else. In fact this is how evalTap works inside. Why do you care?

// Calling processor will still invoke the IGetAString side effect.
// otherwise you would have to start your own fibre, ie IO.start.
def iGetAString(c:String) : IO[String] = IO{println(s"${Thread.currentThread().getName()} $c :iGetAString");"Some DB Value"}
def processor():IO[Unit] =
  iGetAString("processor").as(())
def badProcessor():IO[Unit] =
  IO {
    println(s"${Thread.currentThread().getName()} inside badProcessor")
    val fibre = iGetAString("badProcessor").start // Note IO.start is possible, syncIO.start is not an API
    fibre.as(())
  }.flatten


IO error handling: Nice notes here: https://guillaumebogard.dev/posts/functional-error-handling/. About halfway down, he explains you should NOT throw exceptions inside IOs, but should instead do IO.raiseError(new Exception()). Then later you use .handleErrorWith(), or, to do some processing and still let the error be raised again, you can use onError.
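A minimal sketch of those three calls, assuming cats-effect 3. The findUser function and its messages are made up for illustration, they are not from the linked notes.

```scala
import cats.effect.IO
import cats.effect.unsafe.implicits.global

// Hypothetical lookup: the failure is a value inside the IO, nothing is thrown.
def findUser(id: Int): IO[String] =
  if id > 0 then IO.pure(s"user-$id")
  else IO.raiseError(new IllegalArgumentException(s"bad id $id"))

// onError runs an extra effect when the IO fails, then re-raises the same error.
val logged: IO[String] =
  findUser(-1).onError { case e => IO.println(s"lookup failed: ${e.getMessage}") }

// handleErrorWith recovers by switching to another IO.
val recovered: IO[String] =
  logged.handleErrorWith(_ => IO.pure("anonymous"))
```

Nothing happens until something like recovered.unsafeRunSync() is called; at that point the log line is printed first and the fallback value is returned.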

Simple APIs

See the code here
Pure stream

fs2.Stream(1,2,3)
Stream.emits(List(1,2,3))

Effectful stream

def Random() = SyncIO{ Math.random()*10}
val l = Stream.eval( Random())

‘Running’ the stream

Example of emits

Example of stream from list

Example of eval

Example of Zip
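In case the linked examples are not to hand, here is a small sketch of the same ideas, assuming fs2 3.x with cats-effect 3. The names (random, program and so on) are mine, not taken from the linked code.

```scala
import cats.effect.SyncIO
import fs2.{Pure, Stream}

// A pure stream has no effect type, so it can be collected directly.
val pure: Stream[Pure, Int] = Stream.emits(List(1, 2, 3))
val pureList: List[Int] = pure.toList

// An effectful stream is first compiled down to a single effect...
def random(): SyncIO[Double] = SyncIO(Math.random() * 10)
val program: SyncIO[List[Double]] = Stream.eval(random()).repeatN(3).compile.toList

// ...and nothing runs until that effect is executed.
val values: List[Double] = program.unsafeRunSync()

// zip pairs elements up and stops as soon as the shorter stream ends.
val zipped: List[(Int, String)] = Stream(1, 2, 3).zip(Stream("a", "b")).toList
```

The pure stream never needs compile, which is the practical difference between the two kinds of stream.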

EvalTap

Say you have a stream of IoT house light switch states, and when the last light in the house is turned off you invoke a cartel service which returns the burglary time as an IO[LocalDateTime], but your original stream is of [LightState].

So from LightState you invoke another effect returning a LocalDateTime, but you want to leave the original stream unchanged. Well, evalTap lets you evaluate a totally different effect, and then pretend that it didn’t have another type or effect after all.

EvalTap example
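A rough sketch of the light switch case, assuming fs2 3.x. LightState and notifyCartel are stand-ins I made up, and for brevity the service is called for every element rather than only for the last light.

```scala
import cats.effect.IO
import cats.effect.unsafe.implicits.global
import fs2.Stream
import java.time.LocalDateTime

enum LightState:
  case On, Off

// Hypothetical service call: its LocalDateTime result is of no interest downstream.
def notifyCartel(state: LightState): IO[LocalDateTime] =
  IO(LocalDateTime.now())

val lights: Stream[IO, LightState] =
  Stream(LightState.On, LightState.Off).covary[IO]

// evalTap runs the effect for each element, discards its result,
// and passes the original element through unchanged.
val tapped: Stream[IO, LightState] = lights.evalTap(notifyCartel)
```

The type of tapped is still Stream[IO, LightState], even though an IO[LocalDateTime] ran for each element.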

EvalTap to run a nested stream

When you compile and drain a stream down to an IO, it’s perfect for evalTap to run as a finite sub stream.

Nested sub finite stream
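As a sketch of what that can look like (assuming fs2 3.x and cats-effect 3; subStream and the counter are invented for the example), each element of the outer stream runs a finite inner stream to completion:

```scala
import cats.effect.{IO, Ref}
import cats.effect.unsafe.implicits.global
import fs2.Stream

// Hypothetical finite sub stream: adds 1..n into a shared counter.
def subStream(n: Int, total: Ref[IO, Int]): Stream[IO, Unit] =
  Stream.range(1, n + 1).covary[IO].evalMap(i => total.update(_ + i))

val program: IO[(List[Int], Int)] =
  for
    total <- Ref.of[IO, Int](0)
    // compile.drain turns the finite sub stream into an IO[Unit] that evalTap can run.
    outer <- Stream(1, 2, 3)
      .covary[IO]
      .evalTap(n => subStream(n, total).compile.drain)
      .compile
      .toList
    sum <- total.get
  yield (outer, sum)
```

The outer stream still emits 1, 2, 3 untouched, while the counter records that every inner stream ran.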

Using Signals and Topics

You can share state between streams in FS2. So for instance, you could get a Kafka topic to disconnect and then restart from another Kafka topic when a command arrives.

Again, see the code

If you want to start simpler, then look at these

Pull explained

See the code with notes

Normally when you read a stream you only deal with the current item, or chunks of items. Pull lets you write code which maintains state internally and calls itself, so the state is on the ‘stack’ and immutable within each call. The clever bit is that from this function which calls itself you can emit whatever result you want, whenever you want, back onto the stream. And then continue to call yourself.

In fact there is an operator >> which you use to invoke yourself later, which means it’s an fs2-based call and not a stack-based call. So it works for infinite streams.

Best thing to do is read the code linked above.
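If you just want the shape of it, here is a small sketch of my own (not the linked code), assuming fs2 3.x. The lines pipe carries a buffer as state between calls and emits a String whenever it sees a newline.

```scala
import fs2.{Pipe, Pull, Pure, Stream}

// Hypothetical pipe: groups characters into lines, carrying the partial line as state.
def lines[F[_]]: Pipe[F, Char, String] =
  def go(s: Stream[F, Char], buffer: String): Pull[F, String, Unit] =
    s.pull.uncons1.flatMap {
      // End of a line: emit the buffer, then call ourselves later via >> with fresh state.
      case Some(('\n', tail)) => Pull.output1(buffer) >> go(tail, "")
      // Any other character: emit nothing, just carry the new state forward.
      case Some((c, tail)) => go(tail, buffer + c)
      // Input finished: flush whatever is left.
      case None => if buffer.isEmpty then Pull.done else Pull.output1(buffer)
    }
  in => go(in, "").stream

val result: List[String] =
  Stream.emits("ab\ncd\ne".toList).through(lines[Pure]).toList
```

The buffer is never mutated; each call to go simply receives the next version of it.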

A simpler non-annotated pull

Revisiting fs2 stream start

It’s simple to start a stream, but you could call start and get the fibre and then cancel it, rather than use the stream’s takeWhile.

Example showing fibres vs compile.drain

The output is inlined in comments, again look at the thread names.
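A stripped-down sketch of the start and cancel route, assuming fs2 3.x and cats-effect 3. The tick counter and the timings are only there so the effect is visible; they are not from the linked example.

```scala
import cats.effect.{IO, Ref}
import cats.effect.unsafe.implicits.global
import fs2.Stream
import scala.concurrent.duration.*

val program: IO[Int] =
  for
    ticks <- Ref.of[IO, Int](0)
    // start runs the compiled (infinite) stream on its own fibre and hands back a handle.
    fibre <- Stream
      .awakeEvery[IO](10.millis)
      .evalTap(_ => ticks.update(_ + 1))
      .compile
      .drain
      .start
    _ <- IO.sleep(200.millis)
    // cancel stops the fibre, and with it the stream.
    _ <- fibre.cancel
    count <- ticks.get
  yield count
```

Without the cancel the stream would carry on ticking in the background, which is why a takeWhile or an interrupt on the stream itself is usually the tidier option.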

Parallel streams

Concurrent fs2 streams

An example using effectful and non effectful streams - 4 of them, all running at once using concurrently.

Example of infinite concurrent streams

Merge fs2 streams

Do not wait, take the first value from either…

import cats.effect.IO
import cats.effect.unsafe.implicits.global
import fs2.Stream

import scala.concurrent.duration.*
@main def StartingParStreams =
  val s1 = fs2.Stream.eval(IO{1}).metered(100.millis).repeatN(20)
  val s2 = fs2.Stream.eval(IO{2}).metered(100.millis).repeatN(20)

  val s3 = s1.merge(s2)
  val s4 = s3.map(x=>{println(x); x})
  val s5 = s4.compile.drain.unsafeRunSync()

This displays

1 2 1 2 2 1 1 2 2 1 1 2 2 1 .....

So it merges them, but whichever emits first gets displayed.

Interleave fs2 streams

import cats.effect.IO
import cats.effect.unsafe.implicits.global
import scala.concurrent.duration.*

@main def InterleaveFs2Streams = {
  val s1 = fs2.Stream.eval(IO{1}).metered(100.millis).repeatN(20)
  val s2 = fs2.Stream.eval(IO{2}).metered(100.millis).repeatN(20)

  val s3 = s1.interleave(s2)
  val s4 = s3.map(x=>{println(x); x})
  val s5 = s4.compile.drain.unsafeRunSync()
}

This displays

1 2 1 2 1 2 1 2 1 2 1 2 1 2 1 2 .....

FS2 chunking, impact of sleep and …

Example of chunking and sleeps

This example shows an effect being invoked which then sleeps (eg maybe it calls a db connection pool or something). You can see how each compute thread gets invoked with calls running in parallel. Look at the thread names!

FS2 Error Handling

Example of error handling

When to throw an exception and when to raise an error, and how the different handlers are invoked - all in the example above.

More on parallel

This is still coming, if I get time I will update it…..

Oh to code like Fabio

Sadly, I am still not fluent like Fabio.

Example of Fabio fs2

It’s from his talk.

Scala3 Scalajs - July 2021

2021-07-27T00:00:00+01:00

TL;DR

Click here for the github project

An example project in July 2021 using sbt, scala 3, scala js, javascript, react hooks, axios and ag-grid. No typescript, but scala being used in the UI and on the server in a GCP environment.

Why?

Having to learn typescript and then implement reactjs hooks and so on to create crud type UIs is fine. But since I have a scala team it’s actually boring having to train them all up in typescript. So, let’s see if we can use scala.js.

As ever, this is a reminder/notes for myself, taken from other parts of the web and my own hacking. Maybe you will find it useful.

links

I will not be repeating set-up stuff from the Scala js site

What’s the target state

I have decided to stick with React Hooks, axios and scss, which means javascript reducer and context and so on. This is because it gives you component libraries, and I want to use fluentui as I mostly do data intensive single page apps. I will pull in ag-grid to confirm this choice, rather than fluentui but the principle is the same.

Scala for GCP functions, Scala.js for client logic, shared cross compiled code for the occasional shared logic. Javascript, React, scss for the UI layout stuff.

This means I want an npm project as well as a client scala.js project.

Project structure

The question for me is where to stick the node and scala projects. I want one repo as it’s just simpler.

project
   client-js          -- client only logic, exporting javascript modules
       src/main/scala   -- scala 3 code
   shared             -- dto's and logic running in both client and server
       src/main/scala   -- scala 3 code
   gcp-functions      -- for serverless functionality
       src/main/scala   -- scala 3 code
   project            -- for sbt build stuff
   reactapp           -- My node project NOT part of the scala project

So in the project directory I initialise the node project (reactapp) like this.

cd project
npm init react-app ./reactapp
cd reactapp
npm start

Adding the output to node

I am happy to have my client-js monitoring for changes using

sbt ~clientJs/fastLinkJS

This generates javascript in client-js\target\scala-3.0.0\client-js-fastopt. It took me ages to find this, but the incantation to make it generate into the reactapp is:

lazy val clientJs = (project in file("client-js"))
  .enablePlugins(ScalaJSPlugin)
  .settings(
    name := "client-js",
    scalaJSLinkerConfig ~= { _.withModuleKind(ModuleKind.ESModule) },
    Compile / fastLinkJS / scalaJSLinkerOutputDirectory := new File( baseDirectory.value, "../reactapp/src/scalajs"),
  )

Making my new react app work

So I have npm start working, and it complains about BigInt undefined, which is an eslint issue apparently, so the next incantation in the reactapp dir is to add .eslintrc.json

{
  "env": {
    "es2020": true
  }
}

Having done this, every time I edit the client scala code, it generates the .js files in the reactapp, and npm then regenerates automatically and the page refreshes.

I also have an axios call, so I run the gcp functions locally in intellij, so the react app in Chrome gets refreshed, calls the gcp function, gets data and displays it in the ag-grid. Perfect.

No more typescript for me.

As mentioned above, the working code is in my github project

Scala 3 FS2 - July 2021

2021-07-17T00:00:00+01:00

I used alpakka and kafka streams, and our http4s kafka streams hit a brick wall, so it is time to properly explore what fs2 is, or discard it.

All details below are cribbed for my own reference, but maybe you will find them useful.

Let’s see some Scala

First off, the scala 3 fs2 libs up to 3.0.6 (the latest) are all broken. So unless you download the zip, stick with the 2.13 versions.

build.sbt

import sbt.Keys.libraryDependencies

val scala3Version = "3.0.0"

lazy val root = project
  .in(file("."))
  .settings(
    name := "scala3-fs2",
    version := "0.1.0",

    scalaVersion := scala3Version,

    // https://mvnrepository.com/artifact/co.fs2/fs2-core
    libraryDependencies += "co.fs2" % "fs2-core_2.13" % "3.0.6",
    // optional I/O library
    libraryDependencies += "co.fs2" % "fs2-io_2.13" % "3.0.6",

    // optional reactive streams interop
    libraryDependencies += "co.fs2" % "fs2-reactive-streams_2.13" % "3.0.6",
  )

Emit

So, no side effects, ie no IO, just a Stream of [Pure, A].

You can lift this stream into a stream of [IO, A] using .covary[IO] operation.

See the fs2 main docs for examples, it’s pretty easy.

Streams and Topics

There is example code below written for Scala 3 and Cats 3 in July 2021.

General examples of fs2 streams
fs2 Stream starting and stopping
Using fs2 topics and signals to coordinate
fs2 Pulls explained - maybe

If you have two Kafka consumers, one is a command channel and one is a reader. The command channel can ask the main consumer to disconnect and reconnect at another offset.

So, start an FS2 Topic, when the command channel gets the reset message, republish on the fs2 topic.

In the main consumer, create a signal, and two streams, one listens to the fs2 topic, and one to the kafka topic. When the fs2 topic gets the reset then it can signal the main consumer to terminate.

Wrap this second set - ie the fs2 topic consumer, and the kafka consumer in another stream which repeats.

So, the flow is now

Get the kafka command reset message
publish it to the fs2 topic
receive fs2 topic reset, signal the main consumer of business message to terminate.
repeat the fs2 topic and business message consumers.

So, how to code it - look here: https://github.com/PendaRed/scala3-fs2/tree/main/src/main/scala/com/jgibbons/fs2/c

Sorry, it’s a dummy Kafka, to simplify the code….
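For a feel of the signalling half of that flow without opening the repo, here is a much smaller sketch, assuming fs2 3.x. It has no Topic and no Kafka: the business and command streams are stand-ins I invented, and the outer repeat is left out.

```scala
import cats.effect.IO
import cats.effect.unsafe.implicits.global
import fs2.Stream
import fs2.concurrent.SignallingRef
import scala.concurrent.duration.*

val program: IO[List[Int]] =
  SignallingRef[IO, Boolean](false).flatMap { stop =>
    // Stand-in for the business consumer: emits 0, 1, 2, ... until the signal flips to true.
    val business: Stream[IO, Int] =
      Stream.iterate(0)(_ + 1).covary[IO].metered(10.millis).interruptWhen(stop)
    // Stand-in for the command channel: the "reset" arrives after 200ms.
    val command: Stream[IO, Unit] =
      Stream.sleep[IO](200.millis) ++ Stream.eval(stop.set(true))
    // Run the command stream in the background while the business stream is consumed.
    business.concurrently(command).compile.toList
  }
```

The business stream is infinite, so the only reason program ever completes is that the command side set the signal.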

What is fs2 pull good for

Streams are pure, and emit values. Or they are effectful (ie have an IO type) and also emit values.

If you want to aggregate a sequence of values in the stream and then continue, how do you do that? Well, pull is your friend. You must know the right API to call to change a stream into a pull, then you can pull more chunks from the stream inside the pull until you have what you want, then convert the pull back to a stream so it can continue.

eg collect all the characters until you have a full line. Or collect up to 5 items from the stream. This idea of working on multiple elements at once is called ‘stateful’ in the docs. Streams provide many APIs that do much of the above anyway, but the implementations all use Pulls, ie click into them and you will see.

The pull description is hard to get, talking about being monadic in results (or sometimes resources, because the docs are confused). ie if you call stream.pull.uncons you get a pull with a result, on which you can then call things like map, flatMap etc, ie it is monadic on the result. eg

val g: Pull[IO, INothing, Option[(Chunk[Int], Stream[IO, Int])]] = Stream.eval(IO{1}).repeatN(20).pull.uncons

Look at the pull API for functions which can takeN or echo or etc.

Once you have your Pull, you want to do stuff, before getting back into the stream (maybe). The crucial APIs are:

Pull.output() - this can be given chunks and will give you a pull with output rather than results. This is critical as you can then call .stream on it to get back to streaming stuff.
pull.stream - You need a pull with output - which you can create with Pull.output(). This converts the pull back to a stream.
stream.through - passes the stream into a pipe and gets a stream back
pipe is simply a function that takes a stream of one type and returns a stream of another type.
pull >> - this is a lazy append: if the first pull works, append the second pull. Good for appending the terminator Pull.done, but even better at appending a function call (to a helper for instance) which will get run sometime, but critically not using stack based recursion.
Pull.pure(None) >> somethingelse - This emits nothing, but lets you then use the >> operator, so you can have a function call after >> which FS2 will defer to be executed, avoiding stack based recursion.
evalTap - Magically takes the stream element and then will schedule any SyncIO or IO of any type without impacting the stream itself! The magic is that IO{yyy}.as(xxx) will return an IO{xxx}, so you can call one effectful function and then pretend it returned something else. Or as the documentation says, keep the effect but replace the return type. Or, if you are me, it is the same as running the effectful function, but then replacing the return type, which is how evalTap does its magic.
compile - gets a projection of the stream which you can drain into the IO[Unit] needed by IOApp, or convert with toList etc. Note it isn't needed on a pure stream with no side effects.
compile.drain again - say you have a stream, for instance the kafka assigned partitions on a rebalance, and you want to do an evalTap(consumer=>someFn(consumer)), and your someFn function uses the assigned partitions stream, but you just want the stream to run and give you an IO[Unit]. Well, that’s when the compile.drain above is great. Putting it more simply, if you use evalTap and you want to run a stream inside it, then you create the stream and call compile.drain, which gives an IO[Unit] which evalTap will then change back to the stream’s element type.

So let’s put it together, in pseudo code. The stream code, let’s call it streamA:

streamA.through( aPipe).compile.drain

The pipe code:

def aPipe : Pipe =
  def myRecursiveHelper(s:Stream) returns a pull = {
    // from the input stream, convert to a pull where we can flatMap
    // the results - uncons waits for some chunks to have been
    // emitted, ie they are the results in the pull.
    s.pull.uncons.flatMap {
        do some conditional stuff resulting in either a
        call to myRecursiveHelper
        or a Pull.output(some results) >> Pull.done
        or a Pull.pure() >> Pull.done to remove an element from the stream
        or a Pull.done
    }
  }
  inStream => myRecursiveHelper(inStream).stream

The critical magic is that calling Pull.output(foo) gives you a pull which you can then call .stream on. ie a Pull can contain an effect, output, and a result. Only a pull with output can go back to a stream.
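One concrete way to fill in that pseudo code is a pipe that keeps only the first n elements. This is a sketch assuming fs2 3.x, along the lines of the take example in the fs2 guide; takeN is my own name for it.

```scala
import fs2.{Pipe, Pull, Pure, Stream}

// Hypothetical pipe: emits the first n elements, then stops pulling from the input.
def takeN[F[_], O](n: Int): Pipe[F, O, O] =
  def go(s: Stream[F, O], remaining: Int): Pull[F, O, Unit] =
    if remaining <= 0 then Pull.done
    else
      // uncons waits for the next chunk and hands it over together with the rest of the stream.
      s.pull.uncons.flatMap {
        case Some((chunk, tail)) =>
          val taken = chunk.take(remaining)
          // Emit what we took, then lazily carry on with the rest via >>.
          Pull.output(taken) >> go(tail, remaining - taken.size)
        case None => Pull.done
      }
  in => go(in, n).stream

val firstTwo: List[Int] =
  Stream(1, 2, 3, 4, 5).through(takeN[Pure, Int](2)).toList
```

The remaining count is the state carried between calls, and the final .stream only type checks because go produces a pull with output and a Unit result.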

Gotcha with evalTap and infinite streams

Do not do an evalTap for a nested stream which doesn’t end. If you do then the evalTap will not end, ie that stream will sit there emitting. This bit me with the fs2 kafka consumer.assignmentStream, which is a stream which will emit on every revoke/assign. ie it doesn’t just emit once, but stays active. We played with calling .start to create a new fibre, but obviously the correct thing is to run it concurrently with the main message consumer stream.

Debate

Nothing about the fs2 api is obvious. If you look at the documentation and think it’s all impenetrable you are not alone.

So rather than try to understand it, you just have to remember the magic incantations above.

Pull! Got it?

If you want to work on multiple elements in a way not supported by Stream out of the box use a pull.

When you are done, use {Pull.output() >> helpFn()}.stream to get back to the stream and continue.

Actually, there is too much to say on Pull, so look at the code example here: fs2 Pulls explained - maybe

fs2 notes from youtube

Fabio Labella has some great YouTube talks - so watch them. Some of his quotes:

Streams are great for things that will not fit into memory.

A stream is a lazy emission.

IOs are your words, streams are your sentences.

FS2 is about dealing with the emitted items and causing effects. Each effect is an IO, and this is put into a fibre. A fibre has a stack of things to execute in order, and a fibre simply asks the execution context or scheduled execution context (ie thread pool) to run the effect and then yields every so often so other fibres can get some work done.

Zip will merge two streams, but it will wait for something to be emitted from each.

FS2 is all about effectful streams, ie things happen which cause other things to happen.

In cats an effect is called an IO.

IO is a monad, so F[_] in scala 2 can be used in your type classes, or you could just use IO.

Supports resource safety - ie auto closing them, and also concurrency constructs such as semaphores, ref, queues etc.

Simple pure streams have no side effects. You can lift them into IO using .covary[IO]. ie lifts it into an effectful stream.

Compile (followed by drain) will take a Stream[IO, A] and change it to a single IO[Unit]. You can then call .unsafeRunSync on it (but actually, extending IOApp.Simple is the way to go).

Compile will compile to a single effect, so you could call .toList after compile instead of .drain.

IO[Unit] (eg after compile) describes running the stream of IOs and dealing with all the effects.

This is a good description of the components https://devon-miller.gitbook.io/test_private_book/sstream_model

But the description of pull is basically mad.

Scala 3 Google Cloud Function - July 2021

2021-07-10T00:00:00+01:00

GCP Cloud functions in Scala 3

Let’s face it, Scala 3 is excellent, but how do you make it run in the cloud?

Let’s start by running the cloud function locally.

package com.jgibbons.eg1

import com.google.cloud.functions.HttpFunction
import com.google.cloud.functions.HttpRequest
import com.google.cloud.functions.HttpResponse
import java.io.BufferedWriter
import java.io.IOException

// https://cloud.google.com/functions/docs/first-java#gradle
@main def RunLocallyHelloWorldScala3 : Unit =
  println(s"Test Runner for HelloWorldScala3")
  com.google.cloud.functions.invoker.runner.Invoker.main(Array("--target", classOf[HelloWorldScala3].getName))

class HelloWorldScala3 extends HttpFunction :
  override def service(request: HttpRequest, response: HttpResponse) =
    val writer = response.getWriter()
    writer.write("Hello World from Google Cloud Function in Scala 3!")

That was easy.

And we all use SBT and don’t get pulled back to gradle or maven, so, surely that is hard.

build.sbt

ThisBuild / scalaVersion := "3.0.0"
ThisBuild / organization := "com.jgibbons"

lazy val gcpKowFunctions = (project in file("."))
  .settings(
    name := "gcp-kow-functions",
    libraryDependencies += "org.scalatest" % "scalatest_3" % "3.2.9" % Test,

    // Every function needs this dependency to get the Functions Framework API.
    libraryDependencies += "com.google.cloud.functions" % "functions-framework-api" % "1.0.1",

    // To run function locally using Functions Framework's local invoker
    libraryDependencies += "com.google.cloud.functions.invoker" % "java-function-invoker" % "1.0.0-alpha-2-rc5",
  )

project/build.properties

sbt.version=1.5.3

project/plugins.sbt

addSbtPlugin("com.eed3si9n" % "sbt-assembly" % "1.0.0")

So, well, the rest of it is the normal build and deployment using gcloud commands which are well covered by others.

eg https://cloud.google.com/functions/docs/first-java#gradle

But, for my notes:

Run the assembly target to create:

target/scala-3.0.0/gcp-kow-functions-assembly-0.1.0-SNAPSHOT.jar

cd target/scala-3.0.0/
gcloud functions deploy scala3HelloWorld --entry-point com.jgibbons.eg1.HelloWorldScala3 --runtime java11 --trigger-http --memory 512MB --allow-unauthenticated

gcloud functions describe scala3HelloWorld

Look for the url, and browse to it.

Magic, have a local runner for the function, and a deployed cloud function.

Time to ditch Kubernetes, Http4s, Rho, and the rest of it. Back to simpler Scala 3, and joy.

One last thing

Obviously the java-function-invoker isn’t needed in the cloud, so change the build.sbt to have % Test on the end.

    libraryDependencies += "com.google.cloud.functions.invoker" % "java-function-invoker" % "1.0.0-alpha-2-rc5" % Test,

Then move the runner into the test package, such as

package com.jgibbons.eg1

// https://cloud.google.com/functions/docs/first-java#gradle
@main def RunLocallyHelloWorldScala3 : Unit =
  println(s"Test Runner for HelloWorldScala3")
  com.google.cloud.functions.invoker.runner.Invoker.main(Array("--target", classOf[HelloWorldScala3].getName))

And that reduced the jar down to 6.73M from the original 12.2M.

Scala 3 migration - July 2021

2021-07-03T00:00:00+01:00

I have made my scala 3 terse notes here http://gibbons.org.uk/terse-scala3-notes-2021.

Now time to migrate my small crud rest app, how hard can it be? Well, the state of play in June 2021 is that loads of scala 2 libraries use macros, and they have not moved yet.

https://docs.scala-lang.org/scala3/guides/migration/compatibility-classpath.html So, I read this, and decided to just give it a go with only scala 3 libraries….

So, this is what I had to change: Change all your dependencies so they no longer use %% to match your scala version; instead append _2.13 or _3 or whatever maven central says is the supported version.
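For what it’s worth, there is an alternative to hard coding the suffix which I did not use here: sbt 1.5 and later has CrossVersion.for3Use2_13. A build.sbt fragment as a sketch, reusing the version vals from this post; check the libraries on maven central before trusting it.

```scala
// build.sbt fragment: let sbt append the suffix instead of hard-coding it.
libraryDependencies ++= Seq(
  // No Scala 3 artifact published: resolve the _2.13 one from a Scala 3 build.
  ("com.typesafe.scala-logging" %% "scala-logging" % ScalaLoggingVersion)
    .cross(CrossVersion.for3Use2_13),
  // Published for Scala 3: plain %% resolves the _3 artifact.
  "io.circe" %% "circe-core" % CirceVersion
)
```

This only helps for 2.13 libraries that do not rely on scala 2 macros, which is exactly the limitation the migration guide linked above describes.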

Ditch:

"com.typesafe.scala-logging" % "scala-logging_2.13"   % ScalaLoggingVersion,

Instead bringing in

"ch.qos.logback"             % "logback-classic"      % LogbackVersion,
"ch.qos.logback"             % "logback-core"         % LogbackVersion,
"org.slf4j"                  % "slf4j-api"            % "1.7.30",

Scala test has version 3 for scala 3.0.0, so import it using:

"org.scalatest" % "scalatest_3" % ScalaTestVersion % Test,

Ditch scala mock.

Circe does have scala 3 versions:

val jsonCircelibraryDependencies = Seq(
  "io.circe" % "circe-core_3",
  "io.circe" % "circe-generic_3",
  "io.circe" % "circe-parser_3"
).map(_ % CirceVersion)

Ditch all those little helpers, which don’t actually give you much

//    "io.chrisdavenport"          %% "log4cats-core"   % Log4CatsVersion,
//    "io.chrisdavenport"          %% "log4cats-slf4j"  % Log4CatsVersion,
//    "io.circe"                   % "circe-config_2.13"    % CirceConfigVersion

http4s is available, but rho-swagger isn’t, so drop swagger for now. Hand crank it if you need swagger.

val Http4sVersion = "1.0.0-M23"
val KowMsCatsDependencies = Seq(
  "com.zaxxer"      % "HikariCP" % HikariCpVersion,
  "org.http4s"      % "http4s-blaze-server_3" % Http4sVersion,
  "org.http4s"      % "http4s-blaze-client_3" % Http4sVersion,
  "org.http4s"      % "http4s-circe_3"        % Http4sVersion,
  "org.http4s"      % "http4s-dsl_3"          % Http4sVersion,
//    "org.http4s"      % "rho-swagger_2.13"         % RhoSwaggerVersion
) ++lamCommonDependencies

and fs2 is available

val fs2Version = "3.0.4"
"co.fs2" % "fs2-core_3" % fs2Version

On the first day I tried this scala logging was not available for scala3, but the next day it was. So things are moving.

Code Stuff

Along the way it didn’t like lambda parameters out of (), so below I had to parenthesise req: Request[IO]

GET / "swagger-ui" |>> { (req: Request[IO]) => fetchResource(swaggerDir + "/index.html", req, blocker) }

And then 3 hours later….

circe in scala 3 mostly doesn’t work. Http4s has trouble, intellij is complaining about TASTY versions. It’s a massive mess.

So, for now, sadly, it’s too early to port projects if you are using any of the normally available libraries. And I’ve wasted my Sunday. grrr.

Try again

Later on, I started from a simple scala3 project and added in the blaze dependencies. IntelliJ could not cope, and blaze did not work.

Try again again

I reverted the code, then repeated the steps above. Then I created circe encoders and decoders for all of my classes received and sent from http4s, and stuck them in a separate object which I imported only as needed.

It worked! I have my microservice back in scala 3 - no swagger, but I can live without it.

ie Circe with http4s circe was not working without encoders and decoders for each nested case class of the json.

object RenderAsJson :
  import io.circe._
  import io.circe.generic.semiauto._
  import io.circe.syntax.EncoderOps

  implicit val  myMapPointEncoder : Encoder[ClientMapPoint] = deriveEncoder
  implicit val  myMapPolyEncoder : Encoder[ClientMapPoly] = deriveEncoder
  implicit val  myMapTerrainEncoder : Encoder[ClientTerrain] = deriveEncoder
  implicit val  myMapEncoder : Encoder[ClientMapDesignDto] = deriveEncoder

  implicit val  myMapPointDecoder : Decoder[ClientMapPoint] = deriveDecoder
  implicit val  myMapPolyDecoder : Decoder[ClientMapPoly] = deriveDecoder
  implicit val  myMapTerrainDecoder : Decoder[ClientTerrain] = deriveDecoder
  implicit val  myMapDecoder : Decoder[ClientMapDesignDto] = deriveDecoder

JDBC is back, Doobie is gone - Feb 2021

2021-01-27T00:00:00+00:00

TL;DR

Pending Scala 3 release, the choice for the next service is:

Http4s, Circe, jdbc.

Dropping doobie

Given my one year dive into Cats and Doobie and Http4s, and that it is all in production and works… time to change my choices.

IO{} is simply a bad name for a wrapper around a callback. Much like Monad is a bad name for convertElements (or something similar). Not only that but you lose the stack and have to train the entire team up to understand non software terminology.

Basically, it isn’t worth it. The code is not simpler, it’s harder to understand.

Dropping cats

If we drop Doobie do we drop Cats - well as much as possible. But Circe is great. So we are keeping Cats in the project. Just not using it unless it’s used by a library.

Dropping Http4s?

This is tricky. Akka actors are overkill and have a large learning curve as well. Given microservices or Lambdas we do not need to do any of that stuff, it is far too easy to write your own without learning Akka.

If you remove Actors from Akka you are left with a smallish rest server, but the spray-json is rubbish compared with http4s json. Also Rho swagger support in http4s is better than akka swagger support - less boilerplate.

Keeping Http4s!

So, we like http4s, we like circe, and we like Rho. In the past we tried Slick and we looked at Quill, but basic JDBC with a few helper classes remains the most transparent code - and it’s easy to get to the Oracle batch and transactions if we need to do things that the Database frameworks don’t provide.

2021 Scala microservices

Pending Scala 3 release, the choice for the next service is:

Http4s, Circe, jdbc.

What does the DAO look like

Well, I don’t need a factory of factories or any of that 1998 overkill.

So my database helper could be:

case class DbCon(hikaryConfig:HikariConfig) extends LazyLogging {
  val ds = new HikariDataSource(hikaryConfig)
  sys.addShutdownHook(ds.close())

  def open(): Try[Connection] = Try( ds.getConnection())
  def close(con:Connection) :Unit = con.close()

  def using[A](debugStr:String, f:Connection=>A):Try[A] = {
    Try{
      open() match {
        case Success(con) =>
          try {
            val t0 = System.currentTimeMillis()
            val ret = f(con)
            logger.debug(s"$debugStr completed ${System.currentTimeMillis() - t0} ms")
            ret
          } catch {
            case e: Exception =>
              logger.error(s"$debugStr Failed to execute DB operation", e)
              throw e
          } finally {
            close(con)
          }
        case Failure(e) =>
          logger.error(s"$debugStr Could not get db connection, please abort", e)
          throw e
      }
    }
  }
}

and then I can bootstrap the Hikari pool by having a section of application.conf containing string properties, which I can change to HikariConfig like this:

def toProperties(config: Config): HikariConfig = {
  val properties = new Properties()
  config.entrySet.forEach((e) => properties.setProperty(e.getKey, config.getString(e.getKey)))
  new HikariConfig(properties)
}

ie application.conf is

// db is done as Hikari properties - which is why they are all string
db {
    hikariProperties {
          jdbcUrl : "jdbc:postgresql://localhost:5432/kow"
          username: "postgres"
          password: "mypassword"
    }
}

Then the DAO can have a case class which has DbCon but a companion which takes a Connection parameter - this allows my functions in the companion to be composable within my own transaction boundaries, much like a ConnectionIO in Doobie, but simpler, and I keep the stack trace.

case class GameMapDesignDao(dbCon: DbCon) {
  def findById(id: Long): Try[Option[GameMapDesignDbDto]] =
    dbCon.using(s"Find by id [$id]", con => GameMapDesignDao.findById(con, id))
}

object GameMapDesignDao extends LazyLogging {
  val CommonColumns = "MAPNAME, JSON, STATUS, VERSION, CREATED_AT, CREATED_BY, UPDATED_AT, UPDATED_BY"

  def findById(con: Connection, id: Long): Option[GameMapDesignDbDto] = {
    val ps = con.prepareStatement(s"SELECT ID, $CommonColumns FROM GAME_MAP_DESIGN WHERE ID=?")
    ps.setLong(1, id)
    executeQuery(ps)
  }

  // We will NOT write loads of functional helpers, as we prefer easy to read
  // and exposing JDBC to writing our own in house framework which new devs have to learn.
  private def executeQuery(ps: PreparedStatement): Option[GameMapDesignDbDto] = {
    val rs = ps.executeQuery()
    val ret = if (rs.next) Some(GameMapDesignDbDto(rs)) else None
    rs.close()
    ps.close()
    logPs(ps, s", returned $ret")
    ret
  }
}

You will notice that I wanted statement logging with a helper, logPs. Because I am using postgres I am lucky that the prepared statement will render to a string; if you are using Oracle you have to write a helper to generate the sql, or use another library.

The closing of result sets and prepared statements is really optional as the connection close will close them down. But since we want our DAO functions to be composable it is probably worth closing them on the green path. On an exception they still get closed right away, when the using function closes the connection.

import java.sql.PreparedStatement
import com.typesafe.scalalogging.LazyLogging

package object db extends LazyLogging {
  def logPs(ps: PreparedStatement): Unit = logPs("", ps, "")

  def logPs(ps: PreparedStatement, postFix: String): Unit = logPs("", ps, postFix)

  def logPs(prefix: String, ps: PreparedStatement, postfix: String): Unit =
    logger.debug(s"Query $prefix: ${ps.toString.replace("\r\n", "").replace("\n","")}$postfix")
}
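For drivers whose PreparedStatement does not render to SQL (the Oracle case mentioned above), a helper along these lines could build the log line instead. This is only a sketch: `SqlLog`, `render` and `literal` are made-up names, and the substitution is deliberately naive, so it is suitable for log output only and never for building SQL to execute.

```scala
object SqlLog {
  // Render one bound value roughly as it would appear in SQL text.
  private def literal(value: Any): String = value match {
    case null      => "NULL"
    case s: String => "'" + s.replace("'", "''") + "'"
    case other     => other.toString
  }

  // Naive: replaces each '?' in order, including any '?' inside string
  // literals in the SQL itself. Extra placeholders are left untouched.
  def render(sql: String, params: Seq[Any]): String = {
    val remaining = params.iterator
    val out = new StringBuilder
    sql.foreach {
      case '?' if remaining.hasNext => out.append(literal(remaining.next()))
      case c                        => out.append(c)
    }
    out.toString
  }
}
```

The DAO would then pass the SQL text and the bound values to the logger rather than the statement itself.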

Whenever you blog with code it looks like loads of code, but it isn’t. The JDBC code is about the same size as the Doobie code, with no need for IO or ConnectionIO stuff.

Obviously if you are writing CRUD operations for tens of tables this will create tens of DAOs which means lots of typing - or you write a code generator from the table definitions. Either way, because the code is SIMPLE, after a couple of days of typing you are done, and the new team members can easily understand the DB layer.

But how?

So the missing part is how to invoke non cats code from the RhoRoutes. Well, say each route delegates to a handler function, and those are now just Scala functions which return Try[T]. You can do this with IO.fromTry, so your RhoRoutes could be:

class KowRhoRoutes(handler: KowHandler) extends RhoRoutes[IO] {
  private def internalServerError(e: Throwable, uri: String) = {
    val tstamp = System.currentTimeMillis()
    logger.error(e)(s"[ID: $tstamp] FAILED $uri")
    InternalServerError(s"Internal Server Error, please try again later. [ID: $tstamp]")
  }

  private def logReq(req: Request[IO], debugStr: String = ""): String = {
    val uri = s"${req.method} - ${req.uri} $debugStr"
    logger.info(s"Received $uri")
    uri
  }

  private[kowms] def postMap(req: Request[IO], map: ClientMapDesignDto) = {
    val uri = logReq(req)
    IO.fromTry(handler.createMapDesign(map))
      .flatMap(n => Ok(GameMapDesignCodec.convDbToClient(n)))
      .handleErrorWith (internalServerError(_, uri))
  }
}

Two helper functions and then a post handler, taking in client facing JSON DTOs which are converted to database DTOs. I always separate the client JSON from the database objects, because if you change the database you do not want to have to change all the clients.

This class is where cats ends and the server side code begins. The server side is now simple Scala with normal spec tests and so on, no cats to be seen anywhere.
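To make the "simple Scala behind the routes" point concrete, here is a small sketch of a handler that returns Try and can be tested with no effect library at all. Every name in it (`ClientMapDesign`, `StoredMapDesign`, `MapDesignHandler`, the injected `save` function) is hypothetical and stands in for the real KowHandler and DTOs, which are not shown in this post.

```scala
import scala.util.{Failure, Success, Try}

// Hypothetical types standing in for the post's DTOs and handler.
final case class ClientMapDesign(name: String, json: String)
final case class StoredMapDesign(id: Long, name: String, json: String)

final class MapDesignHandler(save: ClientMapDesign => Try[StoredMapDesign]) {
  // Plain Scala: validate, then delegate to the injected persistence function.
  def createMapDesign(map: ClientMapDesign): Try[StoredMapDesign] =
    if (map.name.trim.isEmpty)
      Failure(new IllegalArgumentException("Map name must not be empty"))
    else
      save(map)
}

object MapDesignHandlerDemo {
  // An in-memory stand-in for the DAO, enough for a spec test.
  val handler = new MapDesignHandler(m => Success(StoredMapDesign(1L, m.name, m.json)))

  def main(args: Array[String]): Unit = {
    println(handler.createMapDesign(ClientMapDesign("forest", "{}")))
    println(handler.createMapDesign(ClientMapDesign(" ", "{}")).isFailure)
  }
}
```

Only the route class needs to know about IO; it would lift the returned Try with IO.fromTry as in postMap.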

Scala Cats Functional Programming, Exceptions or Either - January 2021

2021-01-02T00:00:00+00:00

TL;DR

Divide by zero because of parameters is deterministic, and so is pure (outputs only depend on inputs).

A lost database connection is not deterministic: it depends on the environment rather than the parameters. The same goes for calls to external services and so on.

Cats IO monads have side effects and that means the FP folks understand not all code can be pure (outputs depend only on inputs).

Cats and FP Monads should be used as follows:

throw new Exception or .raiseError when the use case cannot be completed. Handle the exception at the top, where the event arrives.
Option.fold or Option.getOrElse or if then else if the use case has a green path, and a single exception path which also results in some sort of execution (state update, events to other processes etc.)
Either or another structure when there is a single main green path to the use case, but multiple other green paths which can result in different successful outcomes.

Do not pollute your code base with Either[Throwable, _], as you are bringing back functions with checked exceptions, and all of Scala and the world know Java was wrong about this.
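The three cases in the list above can be sketched in plain Scala as follows. The stock table, the names and the outcomes are all invented for illustration, and the third case uses a small sealed trait as the "another structure" option rather than Either.

```scala
object FailureStyles {
  // 1. The use case cannot complete: throw, and handle it where the event arrives.
  final case class UnknownItem(itemId: Int) extends RuntimeException(s"Unknown item $itemId")

  // Hypothetical stock table, standing in for external state.
  private val stock = Map(1 -> 3, 2 -> 0)

  def stockLevel(itemId: Int): Int =
    stock.getOrElse(itemId, throw UnknownItem(itemId))

  // 2. One green path plus one alternative path: Option.fold is enough.
  def greeting(nickname: Option[String]): String =
    nickname.fold("Hello, guest")(n => s"Hello, $n")

  // 3. Several successful outcomes: model them as a small ADT (or an Either).
  sealed trait OrderOutcome
  final case class Sold(itemId: Int, remaining: Int) extends OrderOutcome
  final case class BackOrdered(itemId: Int) extends OrderOutcome

  def order(itemId: Int): OrderOutcome = {
    val level = stockLevel(itemId)
    if (level > 0) Sold(itemId, level - 1) else BackOrdered(itemId)
  }
}
```

Note that none of the signatures mention Throwable: the exception from case 1 simply travels up to whoever received the event.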

Why write this

I am reviewing a completed project and trying to tease out lessons learnt, so we can simplify the code for the maintenance devs, rather than make the code cleverer and harder to read.

While thinking about this I decided it’s the conditional processing and failure handling where we could do better, so I tried to set down my reasoning about why Either is often really bad.

What is a Monad?

Because it’s a really bad and stupid word which stopped me reading a lot of this stuff when I was learning Scala, here is my attempt to describe it.

A Monad is code which executes code. It’s code that calls a callback, often in a loop.
However, it doesn’t have to do it now, often it builds up a pipeline which will only be executed later, when an event comes in.

In Scala that means it implements .map and a few others. Rather than an imperative loop which calls your function, you write .map - which internally has a loop which calls your function. In Cats you will very quickly use the IO monad. Each IO is a function which cats will call in its main runloop.
So:

def foo() : IO[Int] =
  for {
    i <- someIO()
    j <- someIOAgain()
  } yield j

Composing foo into your code base will NOT call it until the Cats run loop calls it, and then it could schedule someIO and someIOAgain functions onto the runloop if they are async. Which means you can get parallel execution for free, all handled by Cats who manage execution thread pools and so on for you. Basically you get a lot for free, rather than doing it yourself. ‘for free’ actually means reading and studying for weeks to join the cats club.
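The "describe now, run later" idea can be shown with a toy class that only stores a function. `MiniIO` below is invented for this illustration: it is not how Cats Effect is implemented, and it has no scheduling, error handling or stack safety.

```scala
// A toy stand-in for IO: it only stores a function and runs it on demand.
final class MiniIO[A](thunk: () => A) {
  def unsafeRun(): A = thunk()
  def map[B](f: A => B): MiniIO[B] = new MiniIO(() => f(thunk()))
  def flatMap[B](f: A => MiniIO[B]): MiniIO[B] = new MiniIO(() => f(thunk()).unsafeRun())
}

object MiniIO {
  def apply[A](body: => A): MiniIO[A] = new MiniIO(() => body)
}

object MiniIODemo {
  var calls = 0

  def someIO(): MiniIO[Int] = MiniIO { calls += 1; 1 }
  def someIOAgain(): MiniIO[Int] = MiniIO { calls += 1; 2 }

  // Same shape as foo() above: building the pipeline runs nothing.
  def foo(): MiniIO[Int] =
    for {
      i <- someIO()
      j <- someIOAgain()
    } yield i + j
}
```

Calling `MiniIODemo.foo()` leaves `calls` at zero; only `unsafeRun()` on the result executes the two side effects.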

The Checked Exception debate

Long ago, Java said exceptions are part of the function signature and everyone who calls a method should know the contract. Exceptions are exceptional, but they could also be used for validation (eg assert, or config missing etc.)

There were also environmental JVM type exceptions - OOM and so on. Checking these in every function sig would be silly, so we have checked exceptions and unchecked.

Time passes, and lots of devs get sick of having to have boiler plate and long function signatures when in reality the exceptions were basically errors shown to the user, or error messages sent back to the caller. Either way the handling tends to be done at the top level in the code which is called by some sort of incoming event.

So, languages dropped checked exceptions - C#, Scala and so on.

Functional programming

Scala devs don’t do checked exceptions.

When using Cats you are encouraged to think about pure functions far more, so having the stack unwind because of an exception thrown in a nested call is frowned upon, even if the failure is deterministic. The advice is that it would be better to use Either or Validated… which is basically shorthand for checked exceptions again.

Cats

In Cats Either or Option could be used to short circuit execution of your Monad pipeline.

something
  .flatMap(a=> fn(a))
  .flatMap(a=> fn1(a))

i.e.

for {
  s <- something
  s1 <- fn(s)
  s2 <- fn1(s1)
} yield s2

If you want a condition to break out of the pipeline then return an Either, which is right biased, i.e. if you return a Left then the following .map or .flatMap steps will not happen.
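A minimal sketch of that short circuit with the standard library Either (right biased since Scala 2.12). The functions `fn` and `fn1` and the stock example are made up; the counter is only there to show that the later step is skipped.

```scala
object EitherShortCircuit {
  var fn1Calls = 0

  // Returns a Left to break out of the pipeline.
  def fn(stock: Int): Either[String, Int] =
    if (stock > 0) Right(stock - 1) else Left("out of stock")

  def fn1(remaining: Int): Either[String, String] = {
    fn1Calls += 1
    Right(s"sold, $remaining left")
  }

  def pipeline(something: Either[String, Int]): Either[String, String] =
    for {
      s  <- something
      s1 <- fn(s)
      s2 <- fn1(s1)
    } yield s2
}
```

With `pipeline(Right(0))` the Left from `fn` is returned as is and `fn1` is never invoked.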

If you use Exceptions within the Either then it pollutes your entire code base because you have to return Either all the way to the top event handler which then handles it.

Checked exception madness again.

IO Monads

IO is used for side effects, and side effects typically suffer from all sorts of environmental failures - disk full, database down, system X not contactable etc.

With Cats IO you can throw an Exception or you can IO.raiseError. In fact throwing an exception inside an IO is effectively the same as raiseError, as the Cats IO runloop catches it and turns it into a failed IO.

You can then handle the error someplace in your pipeline

// Example of local error handling: a green path, an effectful failure path, and then
// an Either to show the user some sensible message.
// The .map comes before .handleErrorWith, otherwise the recovered Left would be
// mapped straight back into a Right.
private def handleOrder1(itemId:Int, userNm:String): IO[Either[String, IO[Int]]] = {
  for {
    goods <- dbLookup(itemId)
    updateRes <- updateInventory(itemId)
      .map[Either[String, IO[Int]]](_=>Right(getInventory(itemId)))
      .handleErrorWith(_=>{
        val ordered = placeOrder(itemId, userNm)
        IO{Left[String, IO[Int]](s"Out of Stock of ${goods.name}, we will order another [$ordered] soon")}
      })
  } yield updateRes
}

In fact an Exception with a top level interpreter would be easier to read, you could even argue the user doesn’t care you have ordered more - as their order failed.

private def handleOrder1Again(itemId:Int, userNm:String):IO[Int] = {
  for {
    goods <- dbLookup(itemId)
    updateRes <- updateInventory(itemId)
      .handleErrorWith(_=>{
        val ordered = placeOrder(itemId, userNm)
        IO.raiseError(OutOfStockException(s"Out of Stock of ${goods.name}, we will order another [$ordered] soon"))
      })
      .flatMap(_=>getInventory(itemId))
  } yield updateRes
}

FP Cats conditional activity based on state

Sorry, but state is not all in the parameters, state is usually in the data store or possibly the accumulated event stream materialised in your KTable etc.

For instance, the event is to purchase a widget, so you check the inventory and the widget is either in stock or not. If in stock then sell it and update the inventory, or if not in stock tell the customer to go someplace else, but perhaps update our own stock system to order in some more. And so on.

So, as our previous example showed:

private def handleOrder1Again(itemId:Int, userNm:String):IO[Int] = {
  for {
    goods <- dbLookup(itemId)
    updateRes <- updateInventory(itemId)
      .handleErrorWith(_=>{
        val ordered = placeOrder(itemId, userNm)
        IO.raiseError(OutOfStockException(s"Out of Stock of ${goods.name}, we will order another [$ordered] soon"))
      })
      .flatMap(_=>getInventory(itemId))
  } yield updateRes
}

Our code is now a horrible mess rather than the lovely examples we usually see:

for {
  s <- something
  s1 <- fn(s)
  s2 <- fn1(s1)
} yield s2

We have dbLookup - presumably this could fail, maybe it returns an Either with a Left where the itemId doesn’t exist at all.

updateInventory uses Doobie (say), and Doobie loves to throw exceptions, so our DAO followed that pattern and does an IO.raiseError or throw new Exception(""). That is a branch point: either success (in the code above we call getInventory), or a call to placeOrder to get more in and then an exception with a user message.

All wrapped in an IO monad as this means our code is now ‘functional programming’.

If you log your exceptions, they will never give you a full call stack, only the stack for the little lump of code (i.e. function) which the Cats runloop is executing. To be fair, Akka and other frameworks which schedule functions all suffer from this, and Java SpringBoot exceptions are utterly terrible.

What the heck?

So, given all programming takes in events and then has to tie those events to current state (in whatever form), is FP even worth it? It is, because you will start to write pure functions, and your code will tend away from the monolith into function pipelines which are very appropriate for elastic scaling on the cloud, which should mean you save money if you do it right - no idle CPU cycles to cope with peak demand.

But don’t use Either too much as it can really hit your code base hard, bringing checked exceptions back into your functions.

So, what to do to avoid the .handleErrorWith all over the place (GoLang error approach)?

Is there an Answer?

Stop worrying. Most systems these days are Lambdas disguised as web servers. ie you write a monolith (or microlith) which has 25 rest end points, which deal with incoming events. Each of these events then calls a processing pipeline independent of the rest of your code.

Create some definition of the URLs mapped to the handlers
Wrap these routes in loggers and security checkers, with error handlers
Bootstrap web server

Hopefully you have all played with AWS lambdas and know that web servers are dead and you simply have 25 lambdas - relying on the infra for security and cors and so on.

Now, in this new world, state is all external as Lambdas are short lived (they are often on warm standby, but no state is maintained in memory ideally) which means Events are often processed against external state (OK, some processing has all the data in the event, but not where I work).

So, should you use FP? Is cats useful? Yes, even though it can be obscure. Exceptions will exist (Doobie uses them), so do not wrap them all in Either - the checked exception debate is over and the world went away from them.

The Answer

Given all your AWS Lambda functions will use IO monads, then IO.raiseError or throw new Exception is fine if it really is a reason to fail the green path for this event. If you have conditional processing then Either could be handy for a use case with multiple green code paths, or an Option if there are only two. However, hide it away in a function so you do not pollute your entire code base with them.

Conditional programming should not be done like this

private def handleOrder1Again(itemId:Int, userNm:String):IO[Int] = {
  for {
    goods <- dbLookup(itemId)
    updateRes <- updateInventory(itemId)
      .handleErrorWith(_=>{
        val ordered = placeOrder(itemId, userNm)
        IO.raiseError(OutOfStockException(s"Out of Stock of ${goods.name}, we will order another [$ordered] soon"))
      })
      .flatMap(_=>getInventory(itemId))
  } yield updateRes
}

but you should try to have your main function pipeline reading like the FP examples, just use really descriptive function names, and keep the functions small.

private def handle(itemId:Int, userNm:String) = {
  for {
    goods <- dbLookup(itemId)
    res <- updateInventoryOrFailAfterPlacingOrderForMore(goods, userNm)
  } yield res
}

i.e. as in OO, keep your code readable, and do worry about code complexity even when pipelining your functions. Code complexity increases with more branches and more nesting in a function.

p.s. both examples above are fine - it’s a small branch and you could argue both are very readable, but long function pipelines with multiple error handlers are terrible to read for any poor support coders. I have found small pipelines soon become long ones, so get used to writing lots of little descriptive functions - hiding away the branch points in the use case with well named functions.

Small functions are great, small pipelines are great, tests are great.

Language choice 2021

2020-09-25T00:00:00+01:00

tl;dr

You can start writing Scala as better Java/Kotlin/Javascript/Python. Then you can take the journey to learn the very best modern software engineering all within one language stack.

But really, Scala remains the most fun you can have in your day job, which is why it wins again and again compared to every other language I have tried in the last decade (or 3). In particular it wins against GoLang, Java, Kotlin, Typescript and Python, which I have had to use in the last 3 years.

Language choice 2021

Scala vs Kotlin 2021 Java vs Kotlin 2021 Typescript vs Javascript 2021 GoLang vs Java 2021 Typescript vs Scala 2021 Python vs Kotlin 2021

Or: ‘Scala is not too hard’

I have been a working dev in languages from Pascal to C to C++ to Java, via Ruby, into Javascript, then Scala, Python, Typescript, GoLang, Kotlin… not Rust (no time yet).

Scala is still my language of choice. I have been reading about its pending death for the last 3 years, ‘The death of Scala!’. This is certainly a great tag line for a blog.

However, they always end up saying that Scala is still the most powerful language. It has the most features but lots of the web say these features are ‘hard’, so use myNewLanguage instead. Then they go on about all the hard bits of their own language - structural types, null interop with Java, write only code - or the totally terrible set of tools you need to download in order to get HelloWorld onto a web page.

I was starting to agree that Scala is hard because so many people have been saying it. They say it right at the start of their docs about why you should learn their less fun language. Or why you should pay for their Kotlin training.

Even the functional Scala folks attack Scala at conferences, and some of their reasoning is totally strange. It is not Haskell! Err. Well, it’s not Haskell, but Scala is adaptable so if you want to you can treat it like Haskell by downloading some libraries. Better yet these folks could stop digging at Scala and go program in Haskell.

FP! I found this quote:

Michael Feathers, Nov 3, 2010
OO makes code understandable by encapsulating moving parts.  
FP makes code understandable by minimizing moving parts.

Along my recent journey I found ReactJs Hooks has pure functions - in the context and redux. Immutable state passed in with the event - so much so that in dev mode they replay events just to prove you have no state held in objects floating about. Meanwhile Typescript is pushing Classes, as well as supporting the functional side of Javascript. This push me pull me between FP and OO. It seems everyone designing languages wants to support the OO and the FP approach - which is one of the Scala initial tag lines.

The quote sums it up nicely, both approaches are about making the code understandable. If you add unit tests then both approaches are proven, and your code is great. Neither approach is ‘better’.

So FP vs OO is yet another religious war which is actually irrelevant. All the languages (well most) now embrace both. Scala also allows you to write imperative (if then else) logic and then move through that to functional (.filter.map.recover). If you love being a software engineer Scala still remains the most fun language to learn, and keep on learning.

Scala 3 has changed Scala quite a lot. This is a positive - this is why Scala is great. Java is pulling in more and more features of Scala, they have been playing catch-up for a decade now and will continue to. Improving the language to respond to changes in the field of software engineering is a really good thing, and rewriting software systems to take advantage of features is also a good thing. Scala 3 looked at Python, looked at Kotlin, looked at itself and the feedback, and changed some syntax. I look forward to how TASTy could be used in the future.

So, Scala vs. The argument remains that lots of web sites state/argue Scala is too big and powerful, too featured. They take the good bits (that they like) and then add their own complexities to their own smaller languages. The counter argument is to ‘Just Say No’. Scala is not too big or too powerful or too hard, if I can do it then you can too.

But Scala is too big!

So, you still feel this is true? In which case you should all use Javascript. There is no other choice. I don’t think the argument is ‘Scala is too big’, it is really ‘I can’t be bothered to learn it and I want to stick with easy language X’. The easiest language on the market right now is Javascript - you can write server and UI. So if your argument is about Bigness or Hardness or whatever, then you can only choose Javascript - it has the biggest dev base, and the cheapest resources.

FUD, Fear Uncertainty and death

Back in the Solaris Windows wars there was lots of Fake News. Back then it was called FUD. The one I hear over and over is ‘Scala is an academic language and it shows compared with Y which is commercial.’ Scala has been run by a commercial company for a decade now, and is ALSO having features designed with academic theory behind them. I want my software engineering language to have solid foundations in academia, and to also feed in commercial push back - purely theoretical languages do tend to fail.

Ignore the FUD (or Fake News). We want theory to back our languages.

Scala is only as big as you choose.

Don't worry about the bits you can't understand - Roald Dahl

Do the bits you can do, once you have that mastered, and you fancy learning again, do the next bits. You will end up being a totally modern software engineer who can work with any other language in the field.

It remains the most fun you can have in your day job - in either OO or FP.

Cats in use - August 2020

2020-08-08T00:00:00+01:00

Read the Book of Doobie and the Scala with Cats

also Effect Cheatsheet

why is there nothing here

Why I (almost) banned Cats and all its libraries and returned to Scala.

Normally I get my head round a tech and make notes.

It turns out I simply disagree with making Scala like Haskell. ‘If then else’, or ‘match case’, or even Option.fold beats Cats Alternative because everyone understands it. Cats uses too many Category Theory terms and not enough Software terms. The classes are named after Category Theory terminology, rather than as a well named library for achieving an aim in software. This is the tail wagging the dog.

The competition in Software languages is too fierce to get derailed by understanding the worth of an Applicative, Semigroupal or other. It is a spell book written in an obscure language.

How to do an if then else in Doobie

My best example on why Cats is obscure rather than readable is below.

I want to call a find function returning an Option[Person], and if it’s there I want to perform an action, and if it’s not there I want to do nothing.

// Written long hand, not as terse as I can make it....
def conditionalCall(personId:Option[Long]) : ConnectionIO[Option[AddressDto]] = {
  val ret : ConnectionIO[Option[AddressDto]] = personId match {
    case None => Async[ConnectionIO].liftIO(IO{Option[AddressDto](null)})
    case Some(pid) => AddressDao.updateResident(pid)
  }
  ret
}

Or the full code, without a transactor. ConnectionIO means you can compose your own transactions, which is really useful.

import cats.effect.{Async, IO}
import doobie.ConnectionIO

object ConditionalDoobie extends App {
  println(conditionalCall(None))  // Not a useful println as no transactor
  println(conditionalCall(Some(1)))  // Not a useful println as no transactor

  // The code, the Dao would normally do some doobie sql.
  case class AddressDto(rd:String)
  object AddressDao {
    def updateResident(id:Long) = {
      Async[ConnectionIO].liftIO(IO{Option[AddressDto](AddressDto("a road"))})
    }
  }

  def conditionalCall(personId:Option[Long]) : ConnectionIO[Option[AddressDto]] =
    personId
      .fold{
        println("The None branch")
        Async[ConnectionIO].liftIO(IO{Option[AddressDto](null)})
      }{pid=> {
        println("The Some branch")
        AddressDao.updateResident(pid)
      }}
}

ie this incantation to create a None inside a ConnectionIO

Async[ConnectionIO].liftIO(IO{Option[AddressDto](null)})

I only found that after loads of googling, not by working it out.

Back to the discussion

I understand moving effects into IO and scheduling them off Stack, I understand how Futures are eagerly run, and so on. The class names and the Category Theory convention is really turning me off this stuff, it does not make the code easy to read and maintain.

I should add that in my new world of services and serverless functions I had ditched Akka for http4s and doobie. I’m in a real dilemma about suggesting these again, even though I have now worked through all of the pain.

Scala syntax remains fabulous. Just cats syntax is too obscure for the stated benefit.

Scala is FP, Cats is just a library written by Haskell and/or maths folk who want to work in that domain.