The Data Stack – Download the most complete overview of the data centric landscape.

This blog post offers an overview and PDF download of the data stack, that is, all the tools that might be needed for data collection, processing, storage, and analysis, up to integrated business intelligence solutions.

(Web) developers are used to stacks, the most prominent probably being the LAMP stack or, more recently, the MEAN stack. On the other hand, I have not heard many data scientists talk about data stacks, maybe because we think that in a lot of cases all you need is some Python, a CSV, pandas, and scikit-learn to do the job.

But when we sat down recently with our team, I realized that we do in fact use a myriad of different tools, frameworks, and SaaS solutions. I thought it would be useful to organize them into a meaningful data stack. I have not only included the tools we are using; I also sat down and started researching. It turned into an extensive list, a.k.a. the data stack PDF. This poster will:

  • provide an overview of solutions available in the 5 layers (Sources, Processing, Storage, Analysis, Visualization)
  • offer you a way to discover new tools and
  • offer orientation in a very densely populated area

So without further ado, here is my data stack overview (Click to open PDF). Feel free to share it with your friends too.

Liip Data stack version 1.0

Click here to get notified by email when I release version 2.0 of the data stack.

Let me lay out some of the questions that guided my research in each area and throw in my 5 cents on each of them:

Continue reading about The Data Stack – Download the most complete overview of the data centric landscape.

How to reduce your development and maintenance costs with APIs?

An API-based solution has many advantages. The biggest one is the significant savings in development and maintenance costs, thanks to a modular infrastructure. Using the example of a recent project for a watch manufacturer, this blog post explains in 4 points what added value an API can bring to your IT environment.

Context: a manufacturer’s production line without a central server

We were recently involved in the digital transformation of a manufacturer’s production line. The manufacturer’s main issue was that its control-desktops were not centrally managed. As a consequence, every code change required an update on each control-desktop. It was an expensive and time-consuming process.

Production line - A

This image represents a production line with standalone control-desktops, and their costly maintenance routines.

Continue reading about How to reduce your development and maintenance costs with APIs?

APIs for the public sector

I recently gave a presentation (in German) at the Beschaffungskonferenz. This is a conference where the public sector exchanges ideas around IT procurement. There were several tracks, some focusing more on legal aspects, different procurement processes, and agile development, while I presented in the tech track. In my talk I presented some of the more established new development paradigms of the past years. But the key message was that APIs need to become a key aspect of how IT projects are planned for the public sector. Specifically, I named transport.opendata.ch as a shining example of how providing existing data via a public API can lead to an entirely new economy of use cases on top of it. The idea is really: “Build it and they will come”.

Update: Another good example of a well-documented API in the public sector is api3.geo.admin.ch, which we have already used for various projects here at Liip.

Update 2: An article which provides an additional perspective: API First at data.gov.uk

Continue reading about APIs for the public sector

Hacking with Particle server and spark firmware

The particle server

In my previous blog post, I wrote about the concept of my project using Particle. Now I will explain what I had to do to increase the data transfer rate of my modules (remember, my goal is to get data at an interval as close to 1 ms as possible).

First, I installed the local API server (https://github.com/spark/spark-server).

Then I had to register the public keys of all my Photons on my server, and the server’s public key on my Photons.

Using this command:

Then, I launched the server to see if my Photons were responding with something like this:

Up to this point everything was working fine, but I also needed a JS library to get data through OAuth. The thing is that you have to do a lot of configuration to make it work, and that was not the goal of this project. I had to test as quickly as possible. So I did what you normally should not do with a library installed via npm.

Continue reading about Hacking with Particle server and spark firmware

Experimenting with React Create App and Google Sheets API

Since the opening of our Lausanne office, a person who shall remain anonymous has collected the most epic statements made in the open space, from weirdest to funniest, to fill a constantly growing database of quotes. Having moved from post-its to Google Docs to Google Spreadsheets, it definitely deserved a better interface to read, filter and… vote!

Quotes React app

Setting up the development environment

After having done some projects with it, it was clear React would be a good choice. But unlike on previous occasions, I did not want to waste time setting everything up; experiments are about coding, right? There are plenty of React boilerplates out there, and while some are great, most include too many tools and dependencies.

Continue reading about Experimenting with React Create App and Google Sheets API

A recommender system for Slack with Pandas & Flask

Recommender systems have been a pet interest of mine for a long time, and recently I thought: why not use these things to make my life easier at Liip? We have a great community within the company, and most of our communication takes place on Slack. To the people born before 1990: Slack is something like IRC channels, only that you use it for your company and try to replace email communication with it. (Whether it is a good idea to replace email with Slack is quite a debated topic.)

So at Liip we have a Slack channel for everything: #machine-learning (for topics related to machine learning), #zh-staff (where Zürich staff announcements are made), #lambda (my team’s Slack channel) and so on. Everybody can create a Slack channel, invite people, and discuss interactively there. What I always found a little bit hard was «How do I know which channels to join?», since we have over 700 of those nowadays.

Wouldn’t it be cool if I had a tool that tells me: well, if you like #machine-learning, why don’t you join our #bi (Business Intelligence) channel? Since Slack does not have this built in, I thought, let’s build it and show you how to integrate the Slack API, Pandas (a multipurpose data science tool), Flask (a tiny Python web framework) and Heroku (a place to host your apps).
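To give a rough idea of what such a recommendation could look like, here is a minimal sketch in Pandas. It is not necessarily the approach taken in the full post: it assumes a simple co-membership score (channels that share many members with the channels you are already in rank higher), and the user names, the membership data and the `recommend` helper are all made up for illustration. In practice the memberships would come from the Slack API.

```python
import pandas as pd

# Made-up membership data: one row per (user, channel) pair.
# In a real setup this table would be built from Slack API responses.
memberships = pd.DataFrame(
    {
        "user": ["ann", "ann", "bob", "bob", "bob", "cat", "dan"],
        "channel": [
            "machine-learning", "bi",
            "machine-learning", "bi", "lambda",
            "machine-learning",
            "lambda",
        ],
    }
)


def recommend(memberships, user, top_n=3):
    """Hypothetical helper: suggest channels the user has not joined yet,
    ranked by how many members they share with the user's own channels."""
    # users x channels matrix with 1 where the user is a member, else 0
    matrix = pd.crosstab(memberships["user"], memberships["channel"]).clip(upper=1)
    # channels x channels matrix: number of members two channels share
    co_membership = matrix.T.dot(matrix)
    joined = matrix.loc[user]
    own_channels = joined[joined == 1].index
    # Sum the shared-member counts over all channels the user is already in
    scores = co_membership.loc[own_channels].sum(axis=0)
    # Keep only channels the user has not joined and that share at least one member
    scores = scores[joined == 0]
    scores = scores[scores > 0]
    return scores.sort_values(ascending=False).head(top_n).index.tolist()


if __name__ == "__main__":
    print(recommend(memberships, "cat"))
```

With this toy data, `cat` is only in #machine-learning, so #bi (two shared members) ranks ahead of #lambda (one shared member). A real system would probably normalize the scores, since very large channels otherwise dominate every recommendation.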

Continue reading about A recommender system for Slack with Pandas & Flask

Symfony: A Tool to Convert NelmioApiDocBundle to Swagger PHP

We have an API built with Symfony that outputs its specification in the Swagger format. We needed to upgrade from version 1 to 2. Because we also switched the library that generates the specification as part of the upgrade, we had to convert the configuration. In our case that configuration was so extensive that we decided to build a script to do the conversion.

Swagger is a standard to document REST APIs. Using a JSON file, an application can document its API. Swagger specifies the path for each resource and allowed HTTP methods, as well as input parameters and the returned data. On top of this specification, tools like Swagger UI can automatically provide an API client in a browser. This is an excellent way to explore the documentation and also very helpful when investigating data issues.
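To make this more concrete, here is a minimal sketch of what such a JSON document can look like in the Swagger 2 format, written as a Python dictionary. The `/quotes/{id}` resource, its fields and the `list_operations` helper are hypothetical and only serve as illustration; they are not taken from our application.

```python
import json

# Hypothetical minimal Swagger 2.0 document describing a single resource.
spec = {
    "swagger": "2.0",
    "info": {"title": "Example API", "version": "1.0.0"},
    "paths": {
        "/quotes/{id}": {
            "get": {
                # Input parameters of the operation
                "parameters": [
                    {"name": "id", "in": "path", "required": True, "type": "integer"}
                ],
                # Returned data, keyed by HTTP status code
                "responses": {
                    "200": {
                        "description": "A single quote",
                        "schema": {
                            "type": "object",
                            "properties": {"text": {"type": "string"}},
                        },
                    }
                },
            }
        }
    },
}

# Keys of a path item that denote HTTP methods in Swagger 2.0
HTTP_METHODS = {"get", "put", "post", "delete", "options", "head", "patch"}


def list_operations(document):
    """Return (METHOD, path, [parameter names]) for every operation in the spec."""
    operations = []
    for path, item in sorted(document["paths"].items()):
        for method, operation in sorted(item.items()):
            if method not in HTTP_METHODS:
                continue  # skip non-method keys such as path-level "parameters"
            names = [p["name"] for p in operation.get("parameters", [])]
            operations.append((method.upper(), path, names))
    return operations


if __name__ == "__main__":
    print(json.dumps(spec, indent=2))  # the JSON file a tool like Swagger UI reads
    print(list_operations(spec))
```

Because the whole API is described as plain data, a client or a conversion script only has to walk this structure, which is what makes tools on top of the specification possible.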

We have been using NelmioApiDocBundle with our application for a while now. This bundle reads annotations on the controllers and combines them with the Symfony routing information to produce API documentation in the Swagger 1 format. Support for Swagger version 2, however, was not available in NelmioApiDocBundle at the time of this blog post. We would have stayed with NelmioApiDocBundle, as it worked well for us, but we did not want to invest the time to refactor that bundle to Swagger 2.

Continue reading about Symfony: A Tool to Convert NelmioApiDocBundle to Swagger PHP

Predicting how long the böögg is going to burn this year with a bit of eyeballing and machine learning.

So apparently there is the tradition of the böögg in Zürich. It is a little snowman made out of straw that you put on top of a pole, stuff with explosives and then light up. Eventually the explosives inside the head of the snowman will catch fire and then blow up with a big bang. Tradition has it that if the böögg explodes after a short time, there will be a lot of summer days; if it takes longer, we will have more rainy days. It reminds me a bit of Groundhog Day. If you want to know more about the böögg, you should check out the Wikipedia page https://de.wikipedia.org/wiki/Sechseläuten.

Now people have started to bet on how long it will take for the böögg to explode this year. There is even a website that lets you bet on it, and you can win something. My first instinct was to enter a random number (13 min 06 seconds), but then I thought – isn’t there a way to predict it better than with our gut feeling? Well, it turns out there is, since we live in 2016 and have open data on all kinds of things. Using this data, what is the prediction for this year?

590 seconds – approximately 10 minutes.

We will have to wait until Monday to see whether this prediction was right, but I can show you now how I arrived at it with a bit of eyeballing and machine learning. (Actually, our dataset is so small that we would not need any of the tools that I will show you, but it’s still fun.)

Continue reading about Predicting how long the böögg is going to burn this year with a bit of eyeballing and machine learning.

The User Experience of APIs

After having read how some people can hold and transmit the terrible misconception that designing APIs has nothing to do with designing great experiences, I felt one could provide a few insights into the benefits of shaping an API around its consumers – the developers and the machines – as much as around the data.

Continue reading about The User Experience of APIs
