Week #1 – Let’s start coding

It’s the first of March, which means that the Get Noticed! 2017 competition has officially begun! That also means there are now only 3 months left to finish Commutee. I’m really curious what I will achieve in this short time. There is a lot of work to be done, so let’s get down to the code.


First, let’s talk about the environment. As per the competition requirements, all source code is available on GitHub; you can find it here. It’s licensed under Apache 2.0, so you can use it and modify it as you need. I’m also open to all suggestions and questions, so feel free to contribute.

As I’m creating a few applications, I’ve decided to place each of them in a separate directory. For now, there is only a routes-service directory for the backend service, but in time there will be directories for the Android app, the Angular app and probably some Dockerfiles for setting up a dev environment. This approach may cause some problems in the future when it comes to building each app with CI, but we will see. I can always change the layout a bit or try some other hacks.

I have also decided to set up some basic continuous integration tooling. I hadn’t done that on GitHub before, so it seemed like a good idea to try. I have used Travis CI, which is free, so that’s a big plus. I’m more familiar with Jenkins and its pipelines, but hey, it shouldn’t be hard to launch a Maven build from some other tool, right?

And it was really simple. First I had to connect Travis to GitHub, which is done by logging into Travis with GitHub credentials, so it really can’t be any simpler. Then you just have to enable builds for the repository and commit a Travis build configuration file called .travis.yml. Yes, that’s not a typo, there is a dot at the beginning. It’s written in YAML, and here is my configuration:

First, I’m selecting the project language and choosing a JDK for the build. I’ve chosen jdk8 here, as you can see. In the next step, as the backend service is located in the routes-service directory, I had to change into it with the cd command. Yes, that’s the plain old cd shell command, so nothing fancy. After this, I just launch a Maven build with a clean workspace. As you can see, I have also specified a few build options here. First, skipTests=true, as I just need to build the project at this point, and I’ve also disabled Javadoc generation. The other two options are -B, which runs the build in batch mode, meaning no questions are asked during the build, and -V, which prints the Maven and Java version information at the start of the build without stopping it, which may be useful if there are problems during the build process. In the next step, only the tests are launched to validate that everything still works as intended.

After this, all that was left to do was to add a build status badge to the README.md file so that it displays nicely on the repository page. To do this, I had to add the following line to the file:

The Code

Now, let’s talk about the app itself. The routes-service app is a backend service. It will host all server logic, that is: talking to the database, periodically fetching and parsing new route data from ZTM, a Warsaw public transport provider, and of course serving all this data to client apps, which means all the sorts of queries that I have described in previous posts (for example here).

The base of this app was created with Spring Initializr, available in IntelliJ IDEA, where you can select all the required Spring modules and it will create a pre-configured Spring Boot project for you. It’s a really great tool, as it saves a lot of time. If you don’t have IntelliJ IDEA, you can do the same on the start.spring.io website: you select everything your project needs and download a zip with the project.

Parsing the data

The first thing I decided to tackle in this project is the parsing part, as this will let me learn about all the available data that I will somehow have to put into the database.

The data is available as a simple 220MB text file that is compressed into a 5MB archive. These files are published daily and contain route data for a few days. The file consists of sections; each section has a two-letter name and the number of items in it. Each line in a section represents a record split into fixed-width columns.
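To make the fixed-width idea a bit more concrete, here is a minimal sketch of how one such line could be cut into columns. The FixedWidthLine class, its split method and the 6/20/4 column layout are made up for illustration; the real ZTM column widths are not covered in this post.

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch: slices one fixed-width line into trimmed column values. */
final class FixedWidthLine {

    private FixedWidthLine() {
    }

    /**
     * Cuts the line into columns of the given widths, in order.
     * If the line is shorter than the declared layout, the missing
     * columns come back as empty strings instead of throwing.
     */
    static List<String> split(String line, int... widths) {
        List<String> columns = new ArrayList<>();
        int start = 0;
        for (int width : widths) {
            int from = Math.min(start, line.length());
            int to = Math.min(start + width, line.length());
            columns.add(line.substring(from, to).trim());
            start += width;
        }
        return columns;
    }

    public static void main(String[] args) {
        // Hypothetical layout: 6-character id, 20-character name, 4-character code.
        String line = String.format("%-6s%-20s%-4s", "1001", "Example Stop", "AB");
        System.out.println(split(line, 6, 20, 4)); // prints [1001, Example Stop, AB]
    }
}
```

Clamping the indexes to the line length is a deliberate choice in this sketch: text files like this often have trailing whitespace stripped, so a short line should not blow up the whole import.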

As the whole file consists of sections, and sections within sections, I’ve decided to create a kind of base section parser that I can easily reuse for each section. And that’s how the SectionReader interface was born 🙂

As you can see, it has only two methods. The first one reads the provided stream of data, and the second one returns the parsing result. Why have I separated this into two methods? I wanted to be able to host a bit more logic in these parsers, and I will also need to nest them, with each of them able to use the results of the others. That seemed like a good solution to me, as it gives me more room for future parsing optimizations. But we will see in the future whether that was a good call or not 🙂
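A two-method contract like this could be sketched roughly as follows. Only the SectionReader name comes from the project; the method names readSection and getResult, the use of BufferedReader as the stream type, and the LineCollectingReader class are assumptions made for this example.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

/** Two-step contract: consume the input first, then ask for the result. */
interface SectionReader<T> {

    /** Reads this section's lines from the stream (method name assumed). */
    void readSection(BufferedReader reader) throws IOException;

    /** Returns whatever the previous readSection call collected (method name assumed). */
    T getResult();
}

/** Trivial illustrative implementation: collects all non-blank lines, trimmed. */
final class LineCollectingReader implements SectionReader<List<String>> {

    private final List<String> lines = new ArrayList<>();

    @Override
    public void readSection(BufferedReader reader) throws IOException {
        String line;
        while ((line = reader.readLine()) != null) {
            String trimmed = line.trim();
            if (!trimmed.isEmpty()) {
                lines.add(trimmed);
            }
        }
    }

    @Override
    public List<String> getResult() {
        return lines;
    }
}
```

Because reading and fetching the result are separate calls, a parent parser can run a nested reader on the same stream and only later decide what to do with its result.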

The next piece of the puzzle was to create a base implementation of this interface that would host all the boring stream reading stuff. Here it is:

Here we have some logic, but nothing complicated. I just read line after line from the stream and check whether it is a section start, a section end, or just a line of content, calling the appropriate callback method in each case. This approach allows me to easily launch new, nested SectionReaders in the onSectionStart method when I encounter another section start, and to finish subsection parsing in onSectionEnd. Of course, each content line is parsed in onSectionContentLine. You may have noticed that I’ve used @NotNull annotations. These are provided by the org.jetbrains.annotations package and are natively supported by IntelliJ IDEA, which provides some null checks and code validation thanks to them. As far as I know, you can also use them in Eclipse; they just need some basic configuration there.
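The callback idea can be sketched in a standalone form like this. The callback names onSectionStart, onSectionEnd and onSectionContentLine are the ones described above; everything else is an assumption for illustration: the `*XX <count>` and `#XX` markers, the callback signatures, and the ListReader, RootReader and SkippingReader classes. The @NotNull annotations are left out so the sketch has no external dependencies.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Base class hosting the line loop; subclasses override only the callbacks they need. */
abstract class BaseSectionReader<T> {

    // Assumed markers, for illustration only: "*XX <count>" opens a section, "#XX" closes it.
    private static final String START = "*";
    private static final String END = "#";

    /** Reads lines until this section's end marker or the end of the stream. */
    final void readSection(BufferedReader reader) throws IOException {
        String line;
        while ((line = reader.readLine()) != null) {
            String trimmed = line.trim();
            if (trimmed.isEmpty()) {
                continue;
            }
            if (trimmed.startsWith(START)) {
                onSectionStart(sectionName(trimmed), reader);
            } else if (trimmed.startsWith(END)) {
                onSectionEnd(sectionName(trimmed));
                return;
            } else {
                onSectionContentLine(trimmed);
            }
        }
    }

    /** Extracts the two-letter name that follows the marker character. */
    private static String sectionName(String markerLine) {
        return markerLine.substring(1, Math.min(3, markerLine.length()));
    }

    /** Default: consume and ignore a nested section this reader does not care about. */
    void onSectionStart(String name, BufferedReader reader) throws IOException {
        new SkippingReader().readSection(reader);
    }

    void onSectionEnd(String name) { }

    void onSectionContentLine(String line) { }

    abstract T getResult();
}

/** Reads a section and throws its content away. */
final class SkippingReader extends BaseSectionReader<Void> {
    @Override
    Void getResult() {
        return null;
    }
}

/** Collects the content lines of a single section. */
final class ListReader extends BaseSectionReader<List<String>> {
    private final List<String> lines = new ArrayList<>();

    @Override
    void onSectionContentLine(String line) {
        lines.add(line);
    }

    @Override
    List<String> getResult() {
        return lines;
    }
}

/** Top-level reader: hands every section it meets to a nested ListReader. */
final class RootReader extends BaseSectionReader<Map<String, List<String>>> {
    private final Map<String, List<String>> sections = new LinkedHashMap<>();

    @Override
    void onSectionStart(String name, BufferedReader reader) throws IOException {
        ListReader nested = new ListReader();
        nested.readSection(reader); // the nested reader stops at the matching end marker
        sections.put(name, nested.getResult());
    }

    @Override
    Map<String, List<String>> getResult() {
        return sections;
    }
}

final class BaseSectionReaderDemo {
    public static void main(String[] args) throws IOException {
        String data = "*AA  2\n  first\n  second\n#AA\n*BB  1\n  third\n#BB\n";
        RootReader root = new RootReader();
        root.readSection(new BufferedReader(new StringReader(data)));
        System.out.println(root.getResult()); // prints {AA=[first, second], BB=[third]}
    }
}
```

In this sketch the parent consumes the start marker and the nested reader consumes everything up to and including the end marker, so both can share one BufferedReader without any look-ahead.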

Here’s a simple example of a finished SectionReader that I’ve made:

As you can see, the code is really simple and readable. As a big bonus, it is also really easy to test: each section parser can be tested separately. Here’s an example of such a test:

Simple, isn’t it? 🙂 We just read a sample file with the data of a single section, parse it, and check the results. It really can’t be any easier.

That’s all for now. I think it is a good start and a solid base to work on. Stay tuned for more progress reports; there is still a lot of work to be done.

Also published on Medium.