New G8 Template: Kafka Streams

TL;DR: Kafka Streams Scala Application: sbt new monksy/kafka-streams.g8


I’ve been working on a new project, which I’ll give more details on later, that involves a few Kafka Streams applications. As with any new Scala project, you should use a G8 template to start out with it. Starting out with a premade template is just a good practice. Additionally, templates can do a lot of work for you when it comes to good practices, structures, and extra compilation tools.

Unfortunately, in the Confluent/Kafka world, there aren’t many Scala-based Kafka Streams templates. On GitHub I found two that came up in the search:

idarlington/kafka-streams.g8
sv3ndk/kafka-streams-scala.g8

The sv3ndk template used the Lightbend version of the kafka-streams-scala. It’s very out of date. So that was out. The idarlington template used Kafka Streams 2.0. Not ideal, but not unworkable. So I forked it.

What was done:

  • Upgraded the Kafka version from 2.0 to 2.5 (latest release)
  • Upgraded the testing utilities to use the non-deprecated (post 2.4) functionality
  • Improved the file layout
  • Added Assembly support
  • Added a larger gitignore file
  • Upgraded the project to Scala 2.13
  • Upgraded the other libraries in the project
  • Added dependency tree plugin support
  • Upgraded the g8 build file

What did I learn about?

  • I learned a lot about creating a G8 template and how the variables are substituted. The existing project had a lot of this work already done, however, I did have to do some of my own substitutions.
  • MergeStrategies and dealing with “module-info.class” in the assembly plugin. (Hint: add a merge strategy rule: case "module-info.class" => MergeStrategy.discard) The module-info.class file is a module descriptor that came from Java 9’s Jigsaw to define JVM modules. It’ll pop up in the Jackson libraries.
  • Some of the built-in Kafka Streams Test utilities. For the most part, I’ve been using the mocked streams library.
  • G8 Sbt plugin. Use sbt g8 to build an example copy of the application in target/g8. From there it’s a lot easier to test and build up your template. Use sbt g8Test to run an automated test. I’m not sure how to customize the sbt tasks. I’m sure it’s a configuration option.

TIL: Checking if an environment variable is set and exists

I’ve seen a lot of documentation for checking if a variable is set by calling [[ ! -z "$VARNAME" ]]. That conditional statement essentially checks to see if the resolved string "$VARNAME" is not an empty string, so it cannot tell an unset variable apart from one that is set to an empty string. Not very straightforward, is it?

However, I found there is a new (well to me) conditional statement to confirm that the variable is set without all of this. It’s -v.

From the documentation:

-v varname

True if the shell variable varname is set (has been assigned a value).

Bash Conditional Expressions (Bash Reference Manual)

Example:

$ [[ -v TEST_VAR ]] && echo "nope"
#(Nothing is shown)
$ export TEST_VAR="asdf"         
$ [[ -v TEST_VAR ]] && echo "set var"
set var
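
To make the difference between -v and the empty-string checks concrete, here is a minimal sketch. The function name describe_var and the variable DEMO_VAR are made up for illustration, and it assumes Bash 4.2 or newer, since older shells (including the Bash 3.2 that ships with macOS) don't support -v.

```shell
#!/usr/bin/env bash
# describe_var NAME (hypothetical helper): prints whether the variable
# called NAME is unset, set to an empty string, or set to a value.
describe_var() {
  local name="$1"
  if [[ ! -v "$name" ]]; then
    # -v is false only when the variable has never been assigned
    echo "unset"
  elif [[ -z "${!name}" ]]; then
    # set, but the value is the empty string (indirect expansion via ${!name})
    echo "empty"
  else
    echo "non-empty"
  fi
}

unset DEMO_VAR
describe_var DEMO_VAR   # prints: unset

DEMO_VAR=""
describe_var DEMO_VAR   # prints: empty

DEMO_VAR="asdf"
describe_var DEMO_VAR   # prints: non-empty
```

The middle case is the one that a plain -z or -n test can't distinguish from the first: both see an empty string, while -v still reports the variable as set.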


Help Request: Awesome-kafka

About a year ago I created a repository that is there to organize information, resources, and tools for Kafka.

I haven’t seen any PRs there in a while, and at the moment I haven’t been able to update the repository with new items. What I need help with is:

  1. What new tools are there to work with Kafka?
  2. What are some new resources out there for Kafka?
  3. Where can you find new training materials for it?

Comments and PRs are very appreciated. (PRs of course more so)

It’s been a while

I kind of shocked my friends a week ago by creating a blog post. It’s been a while since I’ve written a post. I’ve been quite busy in my work life, and I just didn’t prioritize writing anything here.

I miss it a bit, and in some ways I don’t. I enjoy the ability to document and journal here. However, I get really frustrated that the blog isn’t a very good community. (Note: that’s a blog post for another day.)

To cut this short, I may blog about the following things in the upcoming days: How jCrete was (that was last July), learning ESP, the Kafka Streams example, how I could see businesses adapt to these new rules of the quarantine.

Until then, stay awesome.

Well hello to you ESP32!

Due to the quarantine, I’ve been stuck in the house more. I’ve also had a bit more time to try to experiment with the embedded boards that I have. (See the picture below.)

From the left to the right: WIO node (gift from IBM at OracleCode), NanoPI air, ESP32-WROOM, and a RISCV HiFive1 Rev3 (Still haven’t booted it)

I identified the ESP32 boards that I bought off of AliExpress as the Espressif ones (ESP32-WROOM). I followed the instructions from:
https://docs.espressif.com/projects/esp-idf/en/stable/get-started/index.html#get-started-get-esp-idf

The result is the first response in this project:

It’s late, but that’s not a bad first attempt. From the hello world source it looks like a fairly straightforward C application. (This is going to take some refreshing, but I love the lean-ness)

What do I hope to accomplish with this? I hope to learn more about the ESP32 embedded systems and I hope to create 2 devices that will go on my HomeAssistant network: one for air quality monitoring, and another for humidity and temperature monitoring (DHT11 sensor). I also have a few extras for the device (like an e-ink screen that’ll be pretty cool to play around with).

I made a new thing: serialization-checker

So I just made a new thing, and open sourced it.

It’s called the serialization checker. From the readme page it’s here to solve:

The root problem that led to this project’s creation is that REST typically uses JSON, and that JSON is Schemaless. This makes it difficult to create data objects to interact with services. In the case of connecting to a third-party REST service, you typically have lots of examples. This project helps you, the developer, iterate through the creation of the data objects.

Where can you find this?

Github page: https://github.com/monksy/serialization-checker

Your project:

resolvers += Resolver.bintrayRepo("monksy","maven")

libraryDependencies += "com.mrmonksy" %% "serialization-checker" % "0.1.3"

Or even its Bintray: https://bintray.com/monksy/maven/serialization-checker

A suggestion for those who create projects for open source

Please include a tutorial on how to use your product.

An example of this:

I saw a talk on FiloDB. It sounds like an interesting time series database. The technical challenges sound interesting and the architecture sounds sensible. However, when going to the project, there’s nothing there that will get me off the ground and working with it quickly.

TL;DR: Write documentation in a way that leads the user to use your product/project successfully.