Things I’ve learned so far this week (6 Dec)

It’s been a very long time since I’ve done one of these, so I thought I’d share. Not all of this was learned in the last week; some of it comes from the recent past.

  • When you do port forwarding with SSH, the forwarded port binds to 127.0.0.1 by default. This is a security feature that prevents outside hosts from using the port.
    • If you want to expose the forwarded port to other machines, enable the GatewayPorts option in the sshd configuration. (See the first sketch after this list.)
  • Groovy has special DSL support for Cucumber. It makes writing Cucumber steps easier and simplifies managing environment hooks.
  • The obvious one, but rarely done in practice: creating a proof of concept in code is a much better way to get your core logic working than building out a full app and trying to make decisions later.
  • Bash: substring extraction can be done in a very concise and easy manner. (Like most things in Bash, there are magic characters for it.) The same syntax also makes sub-arrays possible.
  • Bash function parameters are, errrr, different. Unlike most C-like languages, they are referenced by position rather than by name (i.e. $1, $2, …).
    • You should define your local variables near the top of the function.
    • All variables in Bash are global unless you declare them as local. (See the second sketch after this list.)
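
A quick, hedged sketch of the port-forwarding behavior (the hosts and ports here are made up for illustration):

# Local forward: port 8080 on this machine binds to 127.0.0.1 by default,
# so only local processes can reach the tunnel.
ssh -L 8080:internal-db.example.com:5432 user@jumpbox.example.com

# Remote forward: for outside hosts to reach the remote port, the server's
# sshd_config needs "GatewayPorts yes" (or "clientspecified").
ssh -R 0.0.0.0:9090:localhost:3000 user@server.example.com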
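
And a small Bash sketch of the substring/sub-array and function-parameter points (the variable names are just examples):

#!/usr/bin/env bash

greeting="hello world"
echo "${greeting:6:5}"     # substring: offset 6, length 5 -> "world"

fruits=(apple banana cherry date)
echo "${fruits[@]:1:2}"    # sub-array slice: banana cherry

greet() {
  # Parameters are positional ($1, $2, ...), not named.
  # Declare locals near the top; otherwise variables leak into the global scope.
  local name="$1"
  local time_of_day="$2"
  echo "Good ${time_of_day}, ${name}"
}

greet "Steven" "morning"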

Other

  • Magic shows can be pretty cool (Thanks, Michael Carducci! [NFJS])
  • I need to see more fire-breathing shows. (Rammstein included lots of fire displays.)

Secure Services – It’s not just for REST or SOAP anymore

A note: I’ve meant to write this blog post for years. I am finally knocking it off of my todo list. Hurray!

In the beginning, there was the terminal, and it was great. After networking came around, the terminal was networked and called telnet. It was thought to be good, but turned out to be a disaster once people realized it was not secure. Then came the secure shell (SSH), a secure replacement for telnet. Over time, the features of SSH expanded to include:

  1. Port Forwarding (used as a jump box/service)
  2. Proxying network traffic
  3. X11 forwarding
  4. Secure shell environment
  5. Key based authentication
  6. Secure File Transfer (SFTP)

There are a lot of different uses of SSH, and you can configure it to do some pretty extraordinary things; most of that is out of the scope of this blog post. A great reference on SSH can be found in this book.

One of the features that caught my attention is that it is possible to create services that live purely in the Unix environment and are incredibly secure. The attack surface is small, communication is encrypted, and your environment is sandboxed (well, as much as you make it).

Authorized Keys

Passwords are an incredibly low-effort way to unlock a system. They tend to be short, they can be brute forced, and (even worse) they frequently have a small space of combinations because they are human-chosen. Randomly generated keys with lots of bits were created to avoid this issue. Key-based authentication was added to SSH to allow passwordless login and to avoid sharing passwords. All of the public keys allowed to log in are stored in the user’s authorized_keys file.

Within authorized_keys, each entry has the following format:

<key type (e.g. ssh-rsa)> <public key> <comment>

Within each line, it is possible to restrict or extend the behavior of that particular login (the shell, a forced command to run, environment variables, etc.) by prepending options. (See the sshd man page for more details.)

To build a secure service, use the command option. The section Your first service below walks through a full example.
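
As a hedged illustration, a locked-down entry might look like the following (the script path is a hypothetical example, and the key material is truncated):

command="/usr/local/bin/uptime-report.sh",no-pty,no-port-forwarding,no-X11-forwarding,no-agent-forwarding ssh-rsa AAAA....... steven@server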

The setup

  1. To set up an account, you are going to need a key pair. Generate one with ssh-keygen, without a passphrase. (You can use a passphrase, but it will make automation tough.)
  2. Add the entry to the authorized_keys file (~/.ssh/authorized_keys) on the server with the ssh-copy-id command (e.g. ssh-copy-id -i [your key file] [user]@[server]).
  3. The remote user’s ~/.ssh/authorized_keys file should now have entries similar to the following:
    1. cat ~/.ssh/authorized_keys
    2. ssh-rsa A……..iJu+ElF7 steven@server
      1. See the Authorized Keys section above for an explanation of what this means.
  4. Test access to the service by SSHing into the box as that user with that key. (Sample command: ssh -i [private key] user@server) A consolidated sketch of these steps follows the list.
    1. The first time you connect to a server with SSH, you will be asked to confirm that it is OK to connect to that particular box. (You need to account for this if you are automating the process as well.)
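
Pulled together, the setup looks roughly like this (the user and server names are placeholders; the key name matches the one used later in this post):

# 1. Generate a key pair with no passphrase
ssh-keygen -t rsa -b 4096 -N "" -f ~/.ssh/id_sitetest

# 2. Install the public key into the remote user's authorized_keys
ssh-copy-id -i ~/.ssh/id_sitetest.pub steven@server.example.com

# 3. Confirm the entry landed on the server
ssh steven@server.example.com 'cat ~/.ssh/authorized_keys'

# 4. Test key-based access
ssh -i ~/.ssh/id_sitetest steven@server.example.com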

Your first service

Your first service will be an incredibly simple example. Open up the authorized_keys file you modified in the setup section, and add the following in front of the ssh-rsa/ssh-dss line of the new entry.

command="echo hello the time is `date`"

The entry for that login should now look similar to the following:

command="echo hello the time is `date`" ssh-rsa A.......

Now you can call the configured service like so:

ssh -i .ssh/id_sitetest steven@localhost

You will receive output like:

hello the time is Fri Nov 11 22:28:37 CST 2016
Connection to localhost closed.

Congratulations: if you followed along with the previous instructions, you have created your first secure service. Everything sent to the service and everything coming back from it is encrypted. You have control over the output format and over how input is taken in. The beauty of this setup is that no terminal is left open, and only the defined command is run; once the command is done, the session automatically closes.

Suggestions

How should one best develop services?

Develop a bash script that replicates the functionality you would like the service to perform before setting it up to run over SSH. This gives you an isolated environment in which to test the process before it goes live; the SSH service setup is merely a layer above the script.
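
For example, a minimal stand-in for such a script might look like the following (the filename and output are a hypothetical illustration). Once it behaves as expected locally, point command= in authorized_keys at it.

#!/usr/bin/env bash
# uptime-report.sh - hypothetical service script; test it locally first, then
# wire it up via command="/usr/local/bin/uptime-report.sh" in authorized_keys.
set -euo pipefail

echo "host:   $(hostname)"
echo "time:   $(date)"
echo "uptime: $(uptime)"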

How should I get input into the service?

To take in input, you read it just like you would in any bash script. I would suggest using the ‘read’ command for this. See this guide.

A note on this: always validate the input. Attacks are unlikely (assuming that the key is managed properly), but it never hurts to be careful.
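
A rough sketch of reading and validating input with read (the accepted actions are invented for the example):

#!/usr/bin/env bash
set -euo pipefail

# Read one line of input from the caller (ssh passes stdin through to the forced command)
read -r action

# Validate against a whitelist before acting on it
case "$action" in
  status|uptime)
    echo "running: $action"
    ;;
  *)
    echo "unsupported action: $action" >&2
    exit 1
    ;;
esac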

Is it webscale?

Honestly, I do not know the answer to this question. I guess it is possible to drive this from the web, but I am not sure how stable it would be with lots of concurrent users. It is at most as scalable as the SSH service and the system underneath it.

I did a brief search to see if this was possible in JavaScript on the client side, and I could not find anything showing that it was.

Can you automate the use of these secure services?

Yes. However, when you create the key, you should never add a passphrase, as that requires manual interaction. A word to the wise: keys should have a defined life cycle and should be phased out periodically. Key management will be an issue in this situation.

No Fluff Just Stuff October 2016/Chicago

My workplace sent me to the No Fluff Just Stuff conference. Some of the interesting bits were:

  • Garbage Collection in the VM
    • Led to the following questions:
      • Is there a way to do leak testing in unit tests? (VC++ provides heap-dump checking, and you can do it there)
      • Where is the memory indexed?
      • Is it possible to modify the code cache?
    • Ken Sipe is a great speaker, and I managed to have lunch with him.
  • Tracer Bullet architecture
    • Advocates slow development (make sure things work on a smaller scale before escalating higher)
    • Parts of it redefine the existing development change process, and parts just give names to existing work patterns.
    • Advocated for many of the circuit breaker/fault-tolerance patterns that are already baked into Akka
  • Lambda Architecture
    • Mostly advocated running a more efficient datacenter using Mesos/DC/OS.
  • Metaprogramming and AOP with Groovy
    • This was a great talk on how closures can capture state
    • Went over the basics of extending the language through Abstract Syntax Tree transforms
  • Encryption/Security
    • Mostly reviewed encryption algorithms and briefly talked about exploits
    • The talks on the exploits were interesting
    • Oddly enough, the conference talked a lot about the Heartbleed bug.
      • Sidenote: The events and context surrounding this bug are very strange.

Other notes

The conference loaned out iPads to take notes on and to rebroadcast the sessions. I found this to be similar to having a laptop in the classroom: mostly it just led to distraction.

Things I would have liked to see more of at the conference

  • Some talks about Scala
    • There were many mentions of Groovy, but other than the Metaprogramming/AOP talk, they were just mentions
  • Discuss things like Akka or new methodologies
  • Try to avoid boilerplate talks on Spring
  • Label the sessions as Intro/Intermediate/Advanced

Akka/Scala vs Groovy for a Periodic Process

Last fall I created two different processes responsible for the recurring task of pulling data and broadcasting it elsewhere. The Akka-based project pulled the newest meetups from the Meetup page and then rebroadcast them via Reddit and Twitter. The Groovy-based application pulled posts from a forum and then created a summary post of the posts it pulled.

Both applications are recurring and only run on a schedule. However, there were two different outcomes: the Groovy-based jar is scheduled as a cron job and exits when done, while the Akka process is set up as a systemd service and remains up.

How did it turn out?

Both of the solutions work and have required very little maintenance. When running, the Groovy process takes up less than 70 MB of memory, and the Akka-based process takes more than 200 MB (it shows as 0.3% memory usage on a 65 GB machine; it’s written in Scala and brings in Akka). Neither process is intense enough to have a noticeable effect on the CPU. The final package sizes are: Akka 42 MB, Groovy 12 MB. (That is a little deceptive, in that the Groovy process contains fewer third-party libraries, while Scala and Akka bring in a lot.)

Now it comes down to probably the biggest concern: the time it took to develop. It took a month of lunches to develop the Scala and Akka based application. It took so long because I had to find and get up to speed on the Scala libraries for working with REST services (I spun my wheels with different clients), ScalikeJDBC, and Twitter4J. I learned another lesson: don’t use SBT. It’s a pain to use compared to Gradle or Maven. On top of all of that came a very challenging lesson: don’t mix Scala versions, i.e. don’t pull in dependencies that weren’t compiled against the Scala version your application is using.

The Groovy-based application took three lunches to write: one lunch (~30-40 min) to write the main logic, and the rest for the packaging. (The worst part of this was researching how to make a fat jar for Groovy under Gradle.)

What did I learn?

Akka is an amazing piece of technology. I like using it, and I like using Scala. However, it turns out that for a process meant to run periodically and make REST calls, you are far better off writing it in Groovy, letting Linux handle the scheduled execution, and getting it done quicker.

When would I use Akka? I would use it for a system that expects constant requests, has to be highly reactive, may involve real complexity, and needs high throughput. (A REST service is a good example of this.)

(Transitive) Dependency Hell Pt2: How it could have been avoided

In the last blog post, I went over an issue where multiple logging systems built into the dependencies caused problems in the main application. All of it could have been avoided by delaying the decision on which logging implementation should be used.

The initial fix that would have made my debugging session easier

Trying to resolve issues with conflicting logging frameworks is a huge pain. It would be great if there were an option to show which logging framework was actually set up in the first place. If documentation for this already exists, I would love to know where it is.

What are JSRs?

Java Specification Requests are community/corporate-created specifications that define an interface for a feature in Java. For the most part, they just define the interface, and an implementation can be included on the classpath; no concrete classes are included in the specification itself. If all of the sub-dependencies had been using an up-to-date JSR for logging, this issue would never have come up. The only issue that would have shown up would be “no concrete class for interface found.” A quick search for logging-based JSRs turned up JSR 47.

For example, the REST JSR (JSR-311) has an API JAR. If you didn’t understand JSRs, you’d believe that it includes the functionality to work with REST services. It doesn’t; it requires a JSR-311-compatible implementation/provider (such as Jersey). Another example of this is JPA and Hibernate’s JPA implementation.

On a related note, it would be better if library providers (e.g. the Amazon AWS SDK) provided a specification/interface collection for their libraries. If that were the case, code would be written against the interfaces, and the implementation could be chosen at the highest level (where the code is actually run). This would improve testing, since you’d be writing code against an interface rather than an implementation you have to mock out, and it would reduce dependency conflicts. The AWS SDK has had changes to its package structure and issues with deprecated methods.

If you take nothing else out of these two blog posts, take this: programming against collections of interfaces is far preferable to playing whack-a-mole with multiple conflicting dependencies. Filling your POM with exclusions is no fun, and it’s incredibly risky.

(Transitive) Dependency Hell (In Java) Part 1

Conflicting transitive dependencies are like letting 20 children fight over one piece of candy. At the end, one kid may have the candy, leaving everyone else crying, fighting, and/or hurt. In the best case the right dependency gets used, but frequently that’s not what happens, and it may change whenever the class loader feels like it. Before I get into the context of what happened, I should mention that this happened nearly a year ago, and this retelling is entirely from memory.

How did this happen? I was working on a project that required bringing in Spring CLI for a small utility. Spring CLI made wrapping methods into CLI interfaces incredibly easy. Side note: having to build up a CLI interface by hand is absurd and makes the task needlessly complicated. (I’m looking at many of the CLI options out there.)

My project had dependencies on java.util.logging (JUL), Log4j (whose configuration was conveniently ignored), and SLF4J. Since the project was logging what we needed (and we were lucky that it was), the conflicts between SLF4J and Log4j2 were ignored. JUL used SLF4J as its default output. After all, JUL is a weak logging framework that will jump in bed with the next available logging implementation. Each implementation has its own configuration format, and if the one actually being used doesn’t match the configuration you have, your configuration is simply ignored.

The downside to bringing in Spring CLI is:

Spring CLI brings in Apache Commons Logging (JCL): oh crap.

The even worse thing about this: when I started looking into the issue, it was not at all clear which logger was actually being used to write out the information. That means that changing what you believe is the correct configuration to increase the logging verbosity won’t work. To find out which logger was winning, I had to debug deep into the internals of where the logging was happening. It wasn’t much fun.

From here, the key is to decide which implementation you want to work with. I decided to go with Log4j2, since the configuration was written for it. The next step in the process was to eliminate/exclude the competing dependencies from coming in. (If one of your dependencies is a fat jar, you’re out of luck on this one; fortunately, I didn’t have that issue.) The Maven dependency plugin’s dependency:tree goal helps here: find all of the alternative implementations and get rid of them.
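
A hedged sketch of what that looks like on the command line (the grep pattern is just an example of artifacts to look for):

# Print the full dependency tree, then narrow it to the usual logging suspects
mvn dependency:tree
mvn dependency:tree | grep -iE 'log4j|slf4j|commons-logging'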

That still left the issue unresolved; it seemed like other implementations were leaking in and/or missing. Once you have eliminated the other dependencies, you’re going to have to use bridge libraries to resolve the issues with the now-missing transitive dependencies. The following new dependencies were added:

Oy, this has made the POM rather large and not very clear about what is going on or why you need all of that. For the most part, the issues should be resolved by this. However, JCL was still being redirected incorrectly within the Spring CLI output. *sigh*

After hours of debugging, I found that Spring CLI used JUL to write out its logging statements, and the logging was still trying to go through the commons logger (but not writing it out / respecting the preferred logging).

This was resolved by setting the default logging manager at startup to the Log4j manager via the system property:

java.util.logging.manager=org.apache.logging.log4j.jul.LogManager
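
On the command line that looks something like the following (the jar name is a placeholder, and it assumes the log4j-jul adapter is on the classpath):

java -Djava.util.logging.manager=org.apache.logging.log4j.jul.LogManager -jar my-cli-app.jar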


In the next blog post: I will discuss some of the ways that all of this could have been avoided.


In Support of Complex Software

We have some major communication and usability issues with software. I’m not referring to the lack of documentation (although quality documentation is needed), nor am I talking about UX. There are way too many choices of software product to use, but very little guidance on how to implement one to solve your problem. For example, there has been a bit of a fight between Swarm, Kubernetes, and Mesos. Even though they have similar usages, each of them has a vague description of what it actually does. When I’ve seen comparisons of the products, I tend to see complex routing strategies being discussed rather than what they have in common, or where one product succeeds over the others (and fails in other ways).

What am I ranting about? Well, a lot of products describe what they do, but the descriptions rarely make it easy for the reader to understand. All of the projects I mentioned earlier describe what they do (they tend to make sharing physical resources transparent for containers, and each has its strengths and weaknesses). All of them claim to help you build a cluster, but each tends to fail at describing how it does so at an infrastructure level, which is exactly what would give implementers a shot at making an intelligent decision for their environment and purposes. They build up a picture in the user’s head that they’ll make the scaling problems go away, only to let the user down. Unless users have made their application distributed (in the way the project expects), these technology choices won’t help them.

Silicon Valley has latched onto a harmful mindset. This has been encouraged by the Lean Startup/MVP mentality, where someone else decides what they believe you need, and only that gets built. Your actual needs from the product are dictated by someone else and are only implemented when they decide it’s going into the product. Complex products tend to be sidestepped out of insecurity about the people using them. It makes me think that the technology being produced by startups in the Valley is akin to promoting math education while denying the existence of operations beyond addition and subtraction. Complex products can work and can be used to their fullest extent. The problem is the communication about the product. Rarely have I seen a complex product described simply enough up front that the user can go on to explore different avenues after achieving proficiency in the basics.


Changing Power Adapter Types

Recently I bought a Fitbit. It’s been a great device for competing with my friends on walking. I’ve also bought a Nexus 6P, which is another great product.

The downside to both of these: they’ve changed the connector used to charge the device. The Nexus 6P requires a USB-C cable, and the Fitbit requires a very odd proprietary connector. (For which I’m sure they charge $50 or better for a 6” cable.)

Where I’ve had success buying the cables:

Project: Meetup Photo Downloader

As it currently stands, there is no OS-independent way to extract all of the photos from your Meetup group. There are other projects that run on Windows, but nothing written in Java. Meetup itself doesn’t offer the ability to export all of the photos. This short script gathers all of the photos in your group and downloads them into your local file system via the information provided by the Meetup API.

What technologies are used?

Groovy and the RESTClient (it’s fairly simple)

For the most part this was a fairly simple project that could be finished in a lunch or two, though only if you already have experience with the Meetup API. The only unique thing I learned from this project was that Groovy lets you write to an output stream via the left-shift operator.

Meetupphotodownloader source: https://github.com/monksy/meetupphotodownloader/


A few things that I would like to see in Docker

I’ve mentioned before that I’ve rather taken a liking to Docker. However, there are a few things that I wish were better.

Docker Doc

Javadoc did something extremely well (other than getting people to hate Checkstyle): it made sure there was a standard way to write documentation for your code base, and that it could be formatted in a proper, consistent way.

It would be great to have a similar tool for Docker that produces consistent documentation on what the Dockerfile builds. It would also be great to have registry support for this as well.

Some of the things that I would expect to see mentioned there (most of which map directly onto how the container is run; see the sketch after this list):

  1. Required environment variables for the application
  2. Documentation on maintenance to be done on the container (i.e. running particular commands via docker exec)
  3. Ports that should be shared
  4. What volumes are assumed to be there
  5. How the configuration should be set (is it a file, etc.)
  6. How to connect the application to other services (databases, message queues, etc.)
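
As a hedged sketch, documentation covering those items would let a user write an invocation like the following without reading the Dockerfile (the image name, variables, ports, and paths are all made up):

# Hypothetical run command: the required env vars, published port, and expected
# volumes are exactly the things Docker documentation should spell out.
docker run -d \
  -e APP_DB_URL="jdbc:postgresql://db:5432/app" \
  -e APP_LOG_LEVEL="info" \
  -p 8080:8080 \
  -v /srv/app/config:/etc/app \
  -v /srv/app/data:/var/lib/app \
  example/webapp:1.0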

Docker Templates

It would be great to be able to generate best-practice Dockerfiles based on whether you’re working with a Tomcat application or a standard jar. This would cut out the time needed to create a Dockerfile. Along with Dockerfile templates, it would be nice to see better Jenkins publishing support for Docker. However, that’s a subject for another day.

Docker Traits

Multiple inheritance has a lot of issues that Java avoids and C++ has to deal with. Python sidesteps the problem with mixins, and Scala/Groovy with traits. Docker traits would be a way to side-load content into an image, on top of the base image, without having to use inheritance. That means you could create a Docker container with utilities added rather than baking them into the base image.

Application Profiles

Applications typically have a limited set of configurations: they are either run-once applications or web applications. There are occasional deviations from this, but they’re not that common.

For the most part, a run-once application needs a set of configuration to go into the application and a location to write out its data. A web application mostly needs a port to be open, a connection to a data storage system (e.g. MySQL, MongoDB), and configuration information.

The best way I can foresee this being set up is to have a standard mount location for these folders outside of the container, and to have them set as required inputs for starting the container.

Another item on configuration: I would love to see the starter configuration copied from the container to its outside storage. This would mean that something with a complex configuration, like the Docker Registry, would start with a mounted volume outside the container that already contains a configuration ready to be edited.