Things I’ve learned so far this week (6 Dec)

It’s been a very long time since I’ve done one of these, so I thought I’d share. Not all of this was learned in the last week, but all of it is from the recent past.

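  • When you do port forwarding with SSH, the default binding address for the forwarded port is the loopback address (127.0.0.1). This is a security feature that prevents that port from being used by other hosts.
    • If you want to expose a remote-forwarded port (ssh -R), enable the GatewayPorts option in the SSHD configuration (sshd_config). For a local forward (ssh -L), the client-side equivalent is the -g flag or GatewayPorts in ssh_config.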
  • Groovy has special DSL support for Cucumber. It makes Cucumber steps easier to write and environment hooks simpler to manage.
  • The obvious one, but rarely done in practice: creating a POC in code is a much better way to get your core logic working than building out a full app and trying to make decisions later.
  • Bash: Substring extraction is possible in a very concise and easy manner with the ${variable:offset:length} expansion. (Like most things in Bash, there are magical characters for that.) The same syntax applied to an array, ${array[@]:offset:length}, makes sub-arrays possible.
  • Bash function parameters are, errrr, different. Unlike most C-like languages, parameters are referenced by position rather than by name (i.e. $1, $2, …).
    • You should declare your local variables near the top of the function
    • All variables in Bash are global unless you declare them as local
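
A small sketch of the Bash points above. The greet function and the sample values are made up for illustration.

```shell
#!/usr/bin/env bash

# Substring extraction: ${variable:offset:length}
word="organization"
echo "${word:0:5}"   # organ
echo "${word:5}"     # ization (everything from offset 5 onward)

# The same syntax on an array yields a sub-array
bands=(Rammstein "Die Toten Hosen" Kraftwerk Nena)
middle=("${bands[@]:1:2}")
echo "${#middle[@]} items: ${middle[0]}, ${middle[1]}"

# Parameters arrive by position; locals are declared at the top
greet() {
  local greeting="$1"
  local name="$2"
  leaked="set inside greet"   # not declared local, so it becomes global
  echo "$greeting, $name!"
}

greet "Hallo" "Welt"
echo "$leaked"   # still visible after the function has returned
```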

Other

  • Magic shows can be pretty cool (Thanks, Michael Carducci! [NFJS])
  • I need to see more firebreathing shows. (Rammstein included lots of fire displays)

Project: Music Organization System

Before iTunes and online music stores and streaming options came around, you had to build up a digital music collection if you wanted to load an MP3 player with it. I always preferred this option. It meant that I could manage my own collection, that I wasn’t tied to a service that would eat up all of my data, and that I could listen to what I wanted. For example, in most US music services you can’t find the band Die Toten Hosen. They’re a great German band, but they haven’t hit the US market. Also, when you have your own collection, it’s a lot easier to move it to other devices that don’t have direct integration with a service (such as my car stereo).

The downside to keeping your own music collection is that you have to manage it yourself. That means that a large collection can get unwieldy very quickly. Thankfully there are a few tools to help with that. I found the JThink tools (SongKong and Jaikoz) to be very helpful with keeping an organized music collection.

What is it?

This project is intended to automatically standardize files into a human-friendly collection without user intervention.

What technologies are used?

  • Bash
  • SongKong (Jthink.net): for file formatting, metadata correction, and metadata improvements (from an online source)
  • FFmpeg (for media conversion)
  • Docker
  • Docker Registry

How did I solve it?

To solve this issue I did the following:

  1. Created the Dockerfile and outlined the general steps used.
  2. Identified the software dependencies.
  3. Opened up X forwarding to test out SongKong (it’s mainly an X application that also offers a command line tool)
  4. Ensured that SongKong could operate from within the Docker container
  5. Moved over the Ogg2MP3 and Flac2Mp3 scripts (which can be found at Github.com/monksy)
  6. Created a Docker registry so that I can keep the Docker image local (SongKong is a licensed, paid product)
  7. Set up the CI pipeline with Jenkins
  8. Created a script to run on the box managing the music collection. It pulls the image down from the Docker registry and runs the organization utility
  9. Set up the crontab entries to run the container
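
Steps 8 and 9 might look roughly like the sketch below. The registry host, image name, collection path, and the /music mount point inside the container are all placeholders rather than the real values, and DRY_RUN defaults to 1 so the script only prints the docker commands until you switch it off.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper script; every name below is a placeholder.
IMAGE="${IMAGE:-registry.example.com/music-organizer:latest}"
MUSIC_DIR="${MUSIC_DIR:-/mnt/music}"
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it
run() {
  if [[ "$DRY_RUN" == "1" ]]; then
    echo "$*"
  else
    "$@"
  fi
}

organize_music() {
  # Fetch the current image from the private registry, then run it once
  run docker pull "$IMAGE" || return 1
  run docker run --rm -v "$MUSIC_DIR:/music" "$IMAGE"
}

organize_music
```

A crontab entry along the lines of `0 3 * * * DRY_RUN=0 /usr/local/bin/organize-music.sh` (path and schedule again made up) would then run it nightly.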

Some of the challenges that I had while doing all of this included:

  1. The difference between the RUN instruction and ENTRYPOINT. ENTRYPOINT defines the command that is executed when a container is started from the image. RUN, on the other hand, executes only while the image is being built.
  2. The Jenkins Docker plugins are a little difficult to set up and use. I tried using the Docker-build-step plugin; however, it had very little documentation, was very unhelpful about invalid input, and made it difficult to build and publish. Fortunately, the Cloudbees Docker Build and Publish plugin was just what I was looking for.
  3. Debugging the Docker Registry was a pain. For the most part you’ll have to depend on the container’s standard output. If that doesn’t help, do a docker exec -ti <container id> /bin/bash and look for the log files.
    1. This really needs to be improved to output what is broken and why
    2. When a login from the client fails on the Version 2 API request (e.g. because of a bad certificate file), the client falls back to Version 1 of the API. This is frustrating.
  4. If you have a LetsEncrypt certificate to use on the Docker registry, it’s not very well documented that you should be using the fullchain certificate file. Without it, the intermediate certificate is missing and clients can fail to verify the registry’s certificate.
    1. Another note on this: it should be a lot easier to create users on the registry than having to generate htpasswd files.
    2. If you do generate a user access file, you have to use bcrypt as the hashing option. Otherwise, the Docker registry won’t accept your credentials.
  5. The storage option that I used for storing the collection was a network mount point. Not having the proper permissions on the server side for the child folders caused a wild goose chase. That led to studying up on the options of the mount.cifs tool (for example the file_mode, dir_mode, noperm, and forceuid options).
  6. Reproducing the application’s environment was a little difficult, as it wasn’t clear where its private files were located.
  7. The id3 tagging command originally used no longer exists. I had to upgrade to the id3v2 software and reformat the command usage.
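
The bcrypt requirement on the registry’s user access file is easy to trip over, so here is a rough sanity check. It assumes the file was produced with Apache’s htpasswd tool, whose -B flag selects bcrypt; the all_bcrypt helper, the user name, and the hash strings are placeholders for illustration, not real credentials.

```shell
#!/usr/bin/env bash
# Generating a bcrypt entry (needs Apache's htpasswd tool installed):
#   htpasswd -Bbn musicuser 'example-password' > htpasswd

# Succeeds only if every non-empty "user:hash" line on stdin uses a bcrypt hash.
# bcrypt hashes start with $2a$, $2b$ or $2y$.
all_bcrypt() {
  local line hash
  while IFS= read -r line || [[ -n "$line" ]]; do
    [[ -z "$line" ]] && continue
    hash="${line#*:}"                       # drop the "user:" prefix
    [[ "$hash" == '$2'[aby]'$'* ]] || return 1
  done
  return 0
}

# Placeholder entries, not real hashes
all_bcrypt <<< 'musicuser:$2y$05$placeholderplaceholder' && echo "bcrypt: ok"
all_bcrypt <<< 'musicuser:$apr1$placeholder' || echo "not bcrypt: the registry would reject this"
```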

Exit Codes: Why Java Gets it Wrong

Exit Codes

The standard protocol for command line interface tools in Unix is based on a few things: standard in, standard out, standard error, and the exit code. The exit code is the reason why the main function of a C program has int as its return type. That value is passed back to whatever executed the application (typically the shell). By convention, 0 means success and anything non-zero is a failure code. This gives the developer a way to communicate what went wrong very quickly.
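
To make that concrete, here is a minimal shell sketch. The has_word helper and its three status values are invented for the example (they mirror grep’s convention of 0 for a match, 1 for no match, and 2 for an error).

```shell
#!/usr/bin/env bash
# Hypothetical helper: 0 = word found, 1 = not found, 2 = file unreadable
has_word() {
  local word="$1"
  local file="$2"
  [[ -r "$file" ]] || return 2
  grep -q -- "$word" "$file"   # grep's own status becomes the function's status
}

sample="$(mktemp)"
printf 'Reddit\nYahoo\n' > "$sample"

# The caller branches on the exit code without parsing any output
if has_word "Yahoo" "$sample"; then echo "found"; fi
has_word "Bing" "$sample" || echo "failed with status $?"
has_word "Yahoo" "$sample.missing" || echo "failed with status $?"

# Exit codes are also what make chaining with && work
has_word "Reddit" "$sample" && echo "Reddit is listed"

rm -f "$sample"
```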

Java is a weird beast in that regard. Its main method returns void, so there is no return value to hand back. Unless the JVM itself fails or an uncaught exception escapes main (which exits with status 1), Java will report back a 0 exit code. This can be incredibly irritating when you want to create Java applications that are meant to be executed in a Unix environment or in a chained fashion (as the Unix philosophy intends for an application to be run).

The workaround for returning a non-zero exit code is to call System.exit(<code>). This has two drawbacks. Firstly, it’s a very abrupt call, and can introduce issues later down the line. (It could cause confusion as to why the application just stopped, similarly to multiple return statements in a method.) Secondly, the way it shuts down the JVM is concerning: it doesn’t wait for any other threads running at the moment or give them a chance to finish before closing. For example: resources could remain unclosed or unfinished, temporary files may not be cleaned up, and network connections could be dropped. The only way to get a notification that this is happening is to set up a shutdown hook. (That is described in the documentation for System::exit)

Well That Was Silly Of Me, Issues with Sed….

Refreshing my memory on sed caused me to run into two issues tonight. Firstly, the -n option suppresses sed’s automatic printing, so that only the lines you explicitly print are shown. Secondly, the order of deleting and printing lines matters. It turns out that it matters a lot.

Let’s say you have a file named content. It contains:

Gooogle
GooooogleBot
Gooooogle Pictures
Google Plus
Reddit
Yahoo

Let’s assume that you wanted to show only the lines that contain “Gooogle” [and its similar brothers] with sed. You would write a line like this:

 
sed -e '/Goo[o]\+gle/p' content

Right? Nope. It’ll show all of the items (with the matching ones shown twice), even though you used the print command to display only the items that matched that pattern. To fix this, put in the -n option before -e.

That’s great… But that returns: Gooogle, GooooogleBot, and Gooooogle Pictures. In this example, we don’t like GooooogleBot. So let’s remove it. You may now write something like:

 
sed -n -e '/Goo[o]\+gle/p' -e '/Goo[o]\+gleBot/d' content

It seems like a logical extension. Right? The next regular expression should pass over the printed lines and make an evaluation. Nope, it doesn’t. It’ll display the same results as before the second expression was added. What’s going on? It’s not a bad expression. It’s not a bad command. It’s due to the order of the prints and deletes. sed doesn’t run the second expression over the output of the first; for each input line it runs the commands in the order given. The p command prints the line immediately, so by the time d discards the pattern space, GooooogleBot has already been printed. The correct way of doing it is to rearrange everything so that the deletes come first and the prints come after. (The -n option still has to be there to suppress the automatic printing, but it applies to the whole command wherever you put it.)

So the correct form is:

sed -e '/Goo[o]\+gleBot/d' -n -e '/Goo[o]\+gle/p' content

Bizarre? Yes, very much so, but it works.
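
Here is the same filter wrapped in a function with throwaway sample data, so it can be run as-is. The google_lines name and the temporary file are only for illustration, and -n is moved to the front, which behaves the same as the command above. It assumes GNU sed.

```shell
#!/usr/bin/env bash
# Assumes GNU sed: \+ is a GNU extension to basic regular expressions.

# Print the Goo...gle lines of a file, minus the Bot entries.
# The delete runs first, so a Bot line never reaches the print command.
google_lines() {
  sed -n -e '/Goo[o]\+gleBot/d' -e '/Goo[o]\+gle/p' "$1"
}

content="$(mktemp)"
printf '%s\n' Gooogle GooooogleBot 'Gooooogle Pictures' 'Google Plus' Reddit Yahoo > "$content"

google_lines "$content"
# Gooogle
# Gooooogle Pictures

rm -f "$content"
```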