New G8 Template: Kafka Streams

TL;DR: Kafka Streams Scala Application: sbt new monksy/kafka-streams.g8

I’ve been working on a new project, which I’ll give more details on later, that involves a few Kafka Streams applications. As with any new Scala project, you should start from a G8 template. Starting out with a premade template is simply good practice, and templates can do a lot of the work for you when it comes to conventions, structure, and extra compilation tooling.

Unfortunately, in the Confluent/Kafka world, there aren’t many Scala-based Kafka Streams templates. On GitHub I found two that came up in the search:


The sv3ndk template used the Lightbend version of kafka-streams-scala, which is very out of date, so that was out. The idarlington template used Kafka Streams 2.0. Not ideal, but not unworkable. So I forked it.

What was done:

  • Upgraded the Kafka version from 2.0 to 2.5 (the latest release)
  • Upgraded the testing utilities to use the non-deprecated (post-2.4) functionality
  • Improved the file layout
  • Added sbt-assembly support
  • Added a larger .gitignore file
  • Upgraded the project to Scala 2.13
  • Upgraded the other libraries in the project
  • Added dependency tree plugin support
  • Upgraded the g8 build file

What did I learn about?

  • I learned a lot about creating a G8 template and how the variables are substituted. The existing project had a lot of this work already done; however, I did have to do some of my own substitutions.
  • Merge strategies and dealing with “module-info.class” in the assembly plugin. (Hint: add a merge strategy rule of case “module-info.class” => MergeStrategy.discard.) The module-info.class file is a new addition that came from Java 9’s Project Jigsaw to define JVM modules. They’ll pop up in the Jackson libraries.
  • Some of the built-in Kafka Streams test utilities. For the most part, I’ve been using the Mocked Streams library.
  • The G8 sbt plugin. Use sbt g8 to build an example copy of the application in target/g8. From there it’s a lot easier to test and build up your template. Use sbt g8Test to run an automated test. I’m not sure how to customize those sbt tasks, but I’m sure it’s a configuration option.
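The module-info.class merge rule mentioned above can be sketched in build.sbt like this — a minimal sketch assuming the 0.14.x-era sbt-assembly syntax (newer releases use the `assembly / assemblyMergeStrategy` slash form):

```scala
// build.sbt — discard Java 9 module descriptors (e.g. from the Jackson jars)
// and fall back to the default merge strategy for everything else.
assemblyMergeStrategy in assembly := {
  case "module-info.class" => MergeStrategy.discard
  case x =>
    val oldStrategy = (assemblyMergeStrategy in assembly).value
    oldStrategy(x)
}
```

Without a rule like this, sbt-assembly fails with a deduplicate error because each Jackson jar ships its own module-info.class at the same path.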

I made a new thing: serialization-checker

So I just made a new thing, and open-sourced it.

It’s called the serialization checker. From the readme, here’s the problem it aims to solve:

The root problem that led to this project’s creation is that REST typically uses JSON, and JSON is schemaless. This makes it difficult to create data objects to interact with services. When connecting to a third-party REST service, you typically have lots of examples. This project helps you, the developer, iterate through the creation of the data objects.

Where can you find this?

Github page:

Your project:

resolvers += Resolver.bintrayRepo("monksy","maven")

libraryDependencies += "com.mrmonksy" %% "serialization-checker" % "0.1.3"

Or even its Bintray:

First Thoughts: “Coders at Work” by Peter Seibel

I’ve just started reading the book Coders at Work. The book is a nice, recent collection of interviews with many big-name developers. I’ve read other developer-interview books before, but this one sticks out in an unusual way: with most “interview” books, the interviews are either completely boring or incredibly interesting. In Coders at Work, the interviews have varied between amazing and neutral. I haven’t gotten to a bad interview yet.

A few things jumped out at me and made me think. Jamie Zawinski’s interview made me wonder about the value of developers without formal education in “today’s market.” Brad Fitzpatrick’s interview reminded me of the “I’ll build everything” but you “must know everything” attitudes. Douglas Crockford’s interview didn’t inspire me, but it did make me consider other issues within software development.

Jamie Zawinski’s interview was an amazing conversation with a guy who has many interests in learning and doing work. He is a self-taught LISP developer who can occasionally get very opinionated. I found his work experience at Netscape fascinating. As a user of the early versions of Netscape, I never knew about all of the politics or construction going on behind the scenes. I also found it technically intriguing that the pre-3.0 mail reader within Netscape was not written in C++. I have a lot of respect for Mr. Zawinski for being able to identify a potential bias of his – he appeared very introspective when asked about hiring new developers. He understood that he could distinguish people he found reputable, but not those who would make good candidates.

One thing that struck me as a bit off-putting about Mr. Zawinski was his rejection of automated unit testing. I feel that if it had been as easy in the 90s as it is today, software would be VERY different now.

Brad Fitzpatrick’s interview left me with mixed feelings about the guy. I’m not sure if he is someone you would want to work with; however, he sounds like the kind of guy you would want to share war stories with over drinks. He has worked on many interesting projects, mainly LiveJournal, and is one of the early “growth hackers.” I like his recommendation that you should spend some time reading others’ code. He fights the immediate urge to ignore others’ code, and his approach sounds different from what I had expected: I expected his approach to making suggestions on other people’s code to be antagonistic. However, it was described as the following:

  1. Code copies are distributed to the audience – in digital and paper form

  2. The developer presents their code line by line

  3. Q&A time

  4. Suggestions / feedback from the audience

This struck me as different from my experience, where code reviews tend to be either technically or personally antagonistic (or both). This approach was more similar to proofreading a paper you just wrote or audience-testing a book you just finished.

Two things really put me off about Mr. Fitzpatrick: one of the questions he asks in interviews, and his insistence on knowing everything. Mr. Fitzpatrick’s “famous” interview/programming question is a recycled question from his AP CS exam. The question is to write a class that handles large-number arithmetic (rewrite BigDecimal). It appears that he uses his previous experience as a baseline for evaluation. I also got the feeling that it is a way for him to show superiority over others (over something he did many years ago as a high school student). Second, he is incredibly insistent on knowing everything about how every layer works. He ranted against high-level-language developers because they didn’t know that a specific way of polling may not work on a specific deployment machine. He even ranted about those who deployed to a VM because the VM’s virtual-to-native OS/hardware mapping has been abstracted away. I feel that in 98% of the cases he’s picking up pennies in front of a steamroller.

I was not very thrilled with Douglas Crockford’s interview. Primarily because it dealt with Javascript, it was a little too high-level for my taste. While reading this interview, my mind went back to Mr. Fitzpatrick’s interview. It made me wonder about how you find the “best” tools. I find it incredibly difficult to keep abreast of all the languages and tools available. Recently, for example, I learned how – and why – Git, Jenkins (plus the automated unit/lint/reporting/checkstyle plug-ins), and deep Maven knowledge are really good things to know if you’re developing in Java.

When new languages, tools, and frameworks come around, I love to read about them and learn how they work (if they’re useful and interesting enough). However, time is limited: how do you identify the tools that would solve your most pressing need? Prior to Jenkins, I built everything via an IDE. Why would I need an automated build tool? I’m the only developer on the project. Prior to Git, I used Subversion – it did everything I needed. Why would I want to make “sub-commits”? Prior to Maven, why would I want the build tool to automatically deploy the WAR file or require that all unit tests pass before generating an executable? (I’m running unit tests all the time anyway.)

Later, it made me think about the code-reading suggestion, and I realized: I’m not very happy with the code review tools I know about. ReviewBoard looks nice, but that is only for Ruby. Should I write my own? Where are the existing tools for Java (which can also integrate with Maven and Jenkins)? Are the tools good? Are there others out there who have solved this issue? Is it worth setting up a code review tool just for this? These are questions I’m not sure how to answer.

Overall, I really enjoyed that this book covers many topics – personal projects, interview questions, and famous debugging stories. I do occasionally enjoy a story bragging about how the developer’s language or tool was miles ahead. However, after reading about their accomplishments in a serial fashion, it just gets old. Perhaps interspersing their accounts in a more conversational form would have made this book more interesting, and easier to recommend.

Similarities of the individuals who were interviewed:

  1. All have a strong focus on one particular project

  2. Each interviewee has worked in many companies

  3. None of them focused on the reputation of the company they have worked for

  4. All have interesting debugging stories

Installing Maven on Centos 5 or 6/RHEL

At the moment there is no RPM package or yum install available for the latest version of Maven on CentOS. The user is left to install Maven manually. To overcome this, I created a script to install the latest version, currently 3.1.1. There are many things that should still be added to the script – they’re listed in the TODO section of the documentation – but those features may come later.

Instructions on how to run the script, and the script itself, may be found at:

Apache Wicket [In Action]: A Review and How It Relates to the Java World

Java is a great tool for creating software. It is well designed, modular, runs on a wide array of platforms, performs well, is very extensible, and has a large community with lots of support. However, its support for websites and related services is severely lacking. It’s bad enough that frameworks extending the existing infrastructure have massive pitfalls that you only discover later.

For the most part, Apache Wicket solves many of the web-related issues that J2EE (JSP) has. If you follow the prescribed way of doing things, it can actually be quite pleasant. However, there are a few thorny patches with Wicket. I will get to those later.

Wicket in Action is like a marriage. During the honeymoon, everything is great and everyone is happy; later, discontent grows and things go up and down. However, unlike a marriage, you don’t really get an ending. This is a rather good way to end things, but some of the lesser parts of the book were rather disappointing. One of the big selling points of Wicket is that it assumes the developer has already prototyped the pages in HTML prior to starting with Wicket [the pages are adapted into Wicket-ized and previewable pages]. This upside was enough to make me ignore the placement of the HTML files in the classpath rather than the web resources section.

I jumped into this book with lots of enthusiasm after reading the introduction. I even bore through some of the unstated setup issues in the book. The book starts off by creating a project from scratch; however, I went the Maven route (which I discovered is the better way to go). The book mentions Maven, but it doesn’t mention how to build your application or generate a project. I believe I went the correct way because Maven helped to set up all of the application servers and the folder structure. The book starts by having the user jump in, examine a few code segments, and then start on a sample (full-featured) e-commerce application. The store was oddly pleasant; the goal was to sell cheeses online. The application started from a few sample view pages, went on to creating a reusable shopping cart, and finally to a membership mechanism. This is a very straightforward and to-the-point way of starting a new framework. It immediately addresses the needs of the majority of its audience.

Another nice thing to point out about the introduction is that it did not try to cover all of the material at once. It would frequently describe what you were doing, but would mention the chapter where the concepts were explained in depth later on. Something that pleased me was that the code listings did not include a listing number – they were simply placed in the correct location in the text. After you’re done with the sample application, you should be quite proud of yourself. It feels similar to your first website.

However, the book got a little disappointing when describing the more detailed inner workings of Wicket: sessions, bookmarkable pages, and layering/rendering. The book improves when it gets to the Ajax functionality and a brief mention of dependency injection with Wicket. The book gets a little rough in the Spring and Hibernate sections, and then better in the testing section. The book ends on a rather low note with SEO, production configuration, and JMX. If I had known more about JMX, I would probably have had a better opinion of the ending.

Overall, I am not sure if I can say that the less-than-stellar sections of the book were entirely the author’s or the book’s fault. It may quite possibly be the technology’s fault. I would strongly recommend the book if you are new to Wicket.

Lastly, here are some direct tips that I had to discover on my own that helped out a lot:


Features I’d like to see Added in Wikis

For the last decade, wikis have not changed very much. Even minor features such as AJAX support are still uncommon. The following is a list of features I’d like to see added. A few of these items are available as plugins; however, I am referring to having these features baked into the actual product.

Live Collaborative Editing

If you use Google Documents with others, you will notice live collaborative editing. Usually, the first time someone notices this, it either freaks them out or blows their mind. Multiple users can edit the same document at the same time, while receiving each other’s changes in real time. Having this support in a wiki would make the page-lock / change-conflict problem go away.

Importing Data Framework

It would be immensely helpful to have some sort of feature within a wiki to import data from other sources: images, RSS feeds, CSV feeds, etc. With a uniform way of bringing in data, plugins could become a bit more generic and support the transformation. For example: let’s say you have a build system, and it produces a build log file. Wouldn’t it be helpful to have that file placed on the wiki [within a certain section] where it could be commented on? With a data framework, there could even be a plugin that greps the output and only shows the important sections.
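As a rough sketch of what such a framework could look like – every name here is hypothetical, since no wiki exposes this interface today – each source type would be adapted into uniform rows of key/value pairs that any plugin could then filter or render:

```scala
// Hypothetical import framework: every source becomes rows of key/value
// pairs, so rendering and filtering plugins can stay generic.
trait DataImporter {
  def importData(raw: String): Seq[Map[String, String]]
}

// Illustrative adapter for CSV input with a header row.
object CsvImporter extends DataImporter {
  def importData(raw: String): Seq[Map[String, String]] =
    raw.trim.split("\n").toSeq match {
      case header +: rows =>
        val cols = header.split(",").map(_.trim)
        // Pair each row's cells with the header names.
        rows.map(r => cols.zip(r.split(",").map(_.trim)).toMap)
      case _ => Seq.empty
    }
}
```

An RSS or build-log adapter would implement the same trait, and the grep-style plugin from the example above would just filter the resulting rows.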

Graphing Support

Charts are important for simple visualizations of data. Why this requires selecting and learning how to use a new plugin is beyond me.


SVG Picture Editing

Wikis seem to be text-based only. I have only found a small handful of plugins that support the editing of pictures. Even fewer use SVG. I’d like to see the ability to edit an SVG picture built into the wiki software.


OpenID Authentication

Why wiki software continues to roll its own authentication system by default is beyond me. This item seems like a no-brainer. Create an OpenID authentication mechanism that uses the top providers [Yahoo, Google, Facebook]. If need be, fall back to a local authentication method if there is a lack of internet access – or if the wiki is being deployed exclusively internally.

APIs for Dealing with Content On The Wiki

Last but not least: create and promote a uniform wiki API. This would allow other systems to automatically push content onto the wiki, or pull it off. This would be great for a monitoring system to:

  • create a new page on the wiki
  • post configuration details and current state
  • maybe even show statistics

Also, with the API, the same monitoring system could grab a wiki page and check for changes in the configuration details. Granted, there are potential issues with the configuration details changing on a wiki; however, at the moment this is more of an idea than a real-world implementation.
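The push/pull flow described above could be sketched against such an API like this – the trait and all the names in it are purely hypothetical, and the in-memory class only stands in for a real wiki backend:

```scala
// Hypothetical uniform wiki API – no existing wiki exposes exactly this.
trait WikiApi {
  def createPage(title: String, body: String): Unit
  def getPage(title: String): Option[String]
}

// In-memory stand-in, just enough to demonstrate the flow: a monitoring
// system pushes its configuration to a page, then later pulls the page
// back to check it against the live configuration.
class InMemoryWiki extends WikiApi {
  private val pages = scala.collection.mutable.Map.empty[String, String]
  override def createPage(title: String, body: String): Unit = pages(title) = body
  override def getPage(title: String): Option[String] = pages.get(title)
}
```

A monitor would call createPage("host-42", "threads=8") on boot and later compare getPage("host-42") against its current state to detect drift.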

Another example: WordPress could use wiki support to have multiple editors collaborate on a post within a wiki page. WordPress could also, with an API, pull the wiki page to make it a blog post.

Interesting finds of the Week [Week of 20 January 2013]

Here is yet another installment of things I found interesting/learned this week.

  • Typically, when you group disks together, there are two options: RAID and LVM. These only work if you have similarly sized disks. If you don’t, you can group them together using “JBOD”-related services (JBOD = just a bunch of disks). At the moment, the technology I’m learning is Greyhole. It looks really awesome for small home server setups.
  • Auditing your code base for dead/unused code can make a world of difference to your productivity and to the size of your executable. I do not have actual figures, but I can only assume that it would make class loading and loading of your executable faster.
  • Obvious statement: math and comparisons with doubles are extremely aggravating. Two numbers can be displayed the same, but be unequal when compared. Google’s Guava collection handles this in DoubleMath. The class was introduced in version 13.
  • Need a list of locale-friendly holidays for your Java application? There’s an open source library for that: JollyDay
  • BoardingArea found that it is possible to pull your basic marketing profile from Delta. This didn’t work for me; I can only assume that the data endpoint has been fixed.
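To illustrate the double-comparison point from the list above: the helper below mirrors the idea behind Guava’s DoubleMath.fuzzyEquals – treat two doubles as equal when they are within a tolerance – though it is a simplified sketch (Guava’s real implementation also handles NaN and infinities):

```scala
object DoubleCompare {
  // Two doubles are "fuzzily" equal if they are within a tolerance of each
  // other – the idea behind Guava's DoubleMath.fuzzyEquals (Guava 13+).
  def fuzzyEquals(a: Double, b: Double, tolerance: Double): Boolean =
    a == b || math.abs(a - b) <= tolerance

  def main(args: Array[String]): Unit = {
    val sum = 0.1 + 0.2                   // stored as 0.30000000000000004
    println(sum == 0.3)                   // false: exact comparison fails
    println(fuzzyEquals(sum, 0.3, 1e-9))  // true: within tolerance
  }
}
```

In production code, prefer the Guava class itself over a hand-rolled helper.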


Init.d Script for codeBeamer MR

codeBeamer ManagedRepositories is a free web interface for Subversion, Git, and Mercurial from Intland. The product is a standalone web application bundled with their distributed version of Tomcat. However, the Linux version does not include the init.d scripts to start/stop the service on boot or on demand. I’ve written a script that can do this. It can be found on my GitHub page, and the instructions can be found in the readme. At the time of writing, I cannot endorse or recommend against this product, as I haven’t used it yet. However, a review may be coming in a future post.

To bring down the script and the readme file [assuming that Git is installed], create a new directory and run the following command within it:

git clone