Why did I change my mind about open data?

Knowledge against fear and suspicion – open data is beneficial

I used to be against any kind of data sharing, until I realized that my attitude was mostly based on fear. Fear is a major impediment to anything innovative and to any kind of change. Why did I change my mind about open data? It comes down to differentiating between public and private data, and to the fact that data are edited before they are made public.

New work – new ideas

In November 2015, I started working at Liip, which brought a lot of new projects and input. The core of my work is the same, although I completely changed fields. I now stand in the middle of a flow of innovative ideas and energy, which is very motivating and helps me stay open-minded.
One of my projects last spring was coordinating Liip’s involvement in the annual opendata.ch conference. I cannot communicate about something if I don’t understand it! Otherwise I would write complete bulls**t, people would notice, and Liip would lose all credibility on the subject. In other words, I had to know what I was talking about in order to be able to talk about it.


Time for Coffee available on Android

Do you have time to grab a coffee before your next public transport connection? Time for Coffee is a project that François Terrier started with friends in 2015. We continued the work to make it available on more devices.

When the Apple Watch came out, a few Liipers had the idea of making an app for it, because having the next departures on your wrist is a perfect use case for this kind of device. The app received quite a lot of attention in Swiss newspapers and won Silver at Best of Swiss Apps in the category “Wearables & New Devices”. Since the Android world also deserved our attention, we made the app available for Android and Android Wear watches. The app can be downloaded from the Play Store.


The Data Stack – Download the most complete overview of the data centric landscape.

This blog post offers an overview and a PDF download of the data stack, that is, all the tools that might be needed for data collection, processing, storage, and analysis, up to integrated business intelligence solutions.

(Web) developers are used to stacks, the most prominent among them probably being the LAMP stack or, more recently, the MEAN stack. On the other hand, I have not heard many data scientists talk about data stacks, maybe because we think that in a lot of cases all you need to do the job is some Python, a CSV file, pandas, and scikit-learn.

But when we sat down recently with our team, I realized that we do use a myriad of different tools, frameworks, and SaaS solutions. I thought it would be useful to organize them into a meaningful data stack. I did not only include the tools we are using; I also sat down and researched others. The result is an extensive list, a.k.a. the data stack PDF. This poster will:

  • provide an overview of solutions available in the 5 layers (Sources, Processing, Storage, Analysis, Visualization)
  • offer you a way to discover new tools and
  • offer orientation in a very densely populated area


Data-Journalism and scraping skills – Report of a meet-up

On Tuesday 27 September, at Liip Lausanne, we had the pleasure of welcoming Barnaby Skinner from SonntagsZeitung and Tages-Anzeiger and Paul Ronga from Tribune de Genève for a meet-up about data journalism. You’ll find the slides and further reading here.

During the summer I came across a news item written by Barnaby Skinner about a three-month course at Columbia University in New York that he was attending with Paul Ronga (from Tribune de Genève) and Mathias Born (from Berner Zeitung). The course was mainly intended for journalists, teaching them to gather data and improve their analytic skills (for example with Python, the pandas library, and SQL, combining the three, scraping with BeautifulSoup, and using Selenium for automated scraping).

Finding the topic extremely interesting, I invited both Barnaby Skinner and Paul Ronga to Liip Lausanne to tell us more on the subject.

Why are Scraping Skills Important, Especially For Swiss (and Other European) Journalists, Researchers or App Developers?

You can find the slides Datajournalism_Presentation.

The US government, data-driven US companies, NGOs, and think tanks make a great deal of data available, at least compared to Swiss and European governments or companies. That’s why scraping skills are all the more valuable for Swiss journalists, researchers, and app developers: in so many cases the data is actually there. It’s just not structured in a way that is easily machine-readable.
Starting with the basics, we will discuss more elaborate and sophisticated scraping techniques, using examples and sharing some sample code.
By Paul Ronga and Barnaby Skinner
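
To make the "data is there, just not machine-readable" point concrete, here is a minimal sketch of turning an HTML table into structured records. It is only an illustration: the meet-up mentions BeautifulSoup and Selenium, whereas this sketch uses Python's standard-library `html.parser` so that it runs without extra installs. The class `TableScraper`, the function `scrape_table` and the sample page fragment are all invented for this example.

```python
from html.parser import HTMLParser


class TableScraper(HTMLParser):
    """Collect the text of every table cell, grouped by row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # row currently being read, or None
        self._cell = None   # text fragments of the current cell, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None  # drop any cell left open by broken markup


def scrape_table(html):
    """Return one dict per data row, keyed by the cells of the first row."""
    parser = TableScraper()
    parser.feed(html)
    parser.close()
    if not parser.rows:
        return []
    header, *body = parser.rows
    return [dict(zip(header, row)) for row in body]


# Hypothetical page fragment, made up for this example (not real data).
SAMPLE_HTML = """
<table>
  <tr><th>Item</th><th>Count</th></tr>
  <tr><td>alpha</td><td>1</td></tr>
  <tr><td>beta</td><td>2</td></tr>
</table>
"""

records = scrape_table(SAMPLE_HTML)
for record in records:
    print(record)
```

Real pages are rarely this tidy (nested tables, merged cells, content loaded by JavaScript), which is where dedicated tools such as BeautifulSoup or Selenium become useful.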


What’s your twitter mood?

The idea

  • Classify a user’s tweets as positive, negative or neutral using machine learning techniques
  • Show how the mood of your tweets changes over time

Why?

  • Fun way to experiment with Sentiment Analysis
  • Experiment with language detection

How

Gathering data

We analyzed tweets from Switzerland, England, and Brazil. We took extra care to make sure our model does well on Swiss German text.

Make awesome model in node

We created a fast custom natural language processor in node.js. Why node? It performs very well at run time when dealing with lots and lots of strings. We used unsupervised machine learning techniques to teach our model how Swiss German and English are written. Once we had a working model, we added a couple of other models using Bayesian inference to create an ensemble: https://en.wikipedia.org/wiki/Ensemble_learning
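
As a rough illustration of the ensemble idea, here is a minimal sketch in Python (not node.js, and not our actual model). It combines a small naive Bayes classifier with two rule-based models by majority vote. Everything in it is an assumption made for the example: the naive Bayes model is trained in a supervised way on four invented sentences, and `keyword_model`, `emoticon_model`, the word lists and the tie-breaking rule are hypothetical.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Multinomial naive Bayes over whitespace tokens, with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word -> count
        self.label_counts = Counter()            # label -> number of texts
        self.vocabulary = set()

    def train(self, text, label):
        words = text.lower().split()
        self.word_counts[label].update(words)
        self.label_counts[label] += 1
        self.vocabulary.update(words)

    def predict(self, text):
        words = text.lower().split()
        total_texts = sum(self.label_counts.values())
        best_label, best_score = None, -math.inf
        for label in sorted(self.label_counts):
            counts = self.word_counts[label]
            denominator = sum(counts.values()) + len(self.vocabulary)
            # log prior plus the sum of smoothed log likelihoods
            score = math.log(self.label_counts[label] / total_texts)
            for word in words:
                score += math.log((counts[word] + 1) / denominator)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


def keyword_model(text):
    """Hypothetical rule: compare hits against two tiny hand-made word lists."""
    positive = {"love", "great", "happy"}
    negative = {"hate", "awful", "sad"}
    words = text.lower().split()
    score = sum(w in positive for w in words) - sum(w in negative for w in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def emoticon_model(text):
    """Hypothetical rule: look only at two emoticons."""
    if ":)" in text:
        return "positive"
    if ":(" in text:
        return "negative"
    return "neutral"


def ensemble_predict(models, text):
    """Majority vote over the models; a tie for first place yields 'neutral'."""
    ranked = Counter(model(text) for model in models).most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return "neutral"
    return ranked[0][0]


# Toy training sentences, invented for this example.
bayes = NaiveBayes()
bayes.train("i love this great day", "positive")
bayes.train("what a happy sunny morning", "positive")
bayes.train("i hate this awful traffic", "negative")
bayes.train("such a sad rainy evening", "negative")

models = [bayes.predict, keyword_model, emoticon_model]
print(ensemble_predict(models, "sunny morning :)"))
```

The point of the vote is that each model can be weak on its own: here the keyword rule sees nothing in "sunny morning :)", but the other two models outvote it.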

Make nice front-end

Portuguese sentiment analysis

Once we got our server working, we thought about adding a better UI. We asked our User Experience specialist Laura to suggest improvements. See for yourself:

mood-detector-graph1

Problems and learnings

Language detection is needed to use the right sentiment model

Designing a model for Swiss German is especially hard: the language incorporates German with a lot of French and Italian words. The spelling of words also changes from canton to canton. Add that most people are forced to use abbreviations when writing tweets, and we get the whole picture of the challenge.

An accurate model needs a lot of data

In order to get a good result we needed to incorporate data from various people and different nationalities. The good thing is that the more you use our model the more accurate it gets.

Training data is available

One of the problems is that irony and sarcasm are hard for humans to understand, especially in short tweets. So they are also hard for a machine.

If you want to play with our results in this machine learning experiment:

https://twittersentiment.liip.ch

I would like to thank Andrey Poplavskiy for his “css love”, and Adrian Philipp for his huge contribution and encouragement towards this project.

PS.

Some of the comments we received were not so nice, but as always we are happy to receive any feedback.

twitter-mood-not-so-nice

Predicting how long the böögg is going to burn this year with a bit of eyeballing and machine learning.

So apparently there is a tradition in Zürich called the böögg. It is a little snowman made out of straw that you put on top of a pole, stuff with explosives, and then light up. Eventually the explosives inside the snowman’s head catch fire and blow up with a big bang. Tradition has it that if the böögg explodes after a short time, there will be a lot of summer days; if it takes longer, we will have more rainy days. It reminds me a bit of Groundhog Day. If you want to know more about the böögg, you should check out the Wikipedia page https://de.wikipedia.org/wiki/Sechseläuten.

Now people have started to bet on how long it will take for the böögg to explode this year. There is even a website that lets you bet on it, and you can win something. On first instinct I entered a random number (13 minutes 6 seconds), but then I thought: isn’t there a way to predict it better than with our gut feeling? Well, it turns out there is, since we live in 2016 and have open data on all kinds of things. Using this data, what is the prediction for this year?

590 seconds – approximately 10 minutes.

We will have to wait until Monday to see whether this prediction was right, but I can show you now how I got to it with a bit of eyeballing and machine learning. (Actually our dataset is so small that we wouldn’t have to use any of the tools that I will show you, but it’s still fun.)


Time for Coffee for iOS and Apple Watch

Jan Hug, Cyril Gabathuler and I worked hard in our free time over the last few weeks on an iPhone app for the great website timeforcoffee.ch, a private project started by François Terrier and his friends Serge Pfeifer, Jean-Luc Geering and Kristina Bagdonaite. It also has newly added support for the upcoming Apple Watch. As this is a project done by Liipers and non-Liipers alike, we talk about it more on medium.com, so go and read it! You can also apply for the beta and follow us on Twitter: @time4coffeeApp


Big leap forward for Opendata

This year, the second make.opendata.ch hackdays took place in Geneva and Zurich. More than 120 developers, designers and ideators, including a handful of Liipers, met to work on “public transport”, which was set as the hackdays’ main focus. One goal was to show the SBB and the public what open data sources allow and how they can be used.

The Liipers present at the hackdays got involved in some of the projects:

Zurich: The Swiss Public Transport API

Team: Colin Frei, Danilo Bargen, Dominic Lüchinger, Fabian Vogler, Roland Schilter

Following our internal Transport API hackday at the beginning of the year, some others joined us to continue working on the REST API. The goal of the project was to provide a public transport REST API that allows every interested developer to create their own applications based on public transport schedules. Basically, the API transforms the complex SBB XML response into a JSON format. Documentation and examples can be found at transport.opendata.ch.
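
To give a feel for what "transforming XML into JSON" means, here is a minimal sketch in Python. It is not the code of the Transport API, and the XML below is a made-up, much simpler structure than the real SBB response; the element names (`connections`, `connection`, `from`, `to`, `departure`) and the function names are assumptions for illustration only.

```python
import json
import xml.etree.ElementTree as ET

# Hypothetical XML, invented for this example (not the real SBB format).
SAMPLE_XML = """
<connections>
  <connection>
    <from>Station A</from>
    <to>Station B</to>
    <departure>10:02</departure>
  </connection>
  <connection>
    <from>Station A</from>
    <to>Station C</to>
    <departure>10:32</departure>
  </connection>
</connections>
"""


def element_to_dict(element):
    """Convert an element into nested dicts, lists and strings.

    Leaf elements become their text; repeated child tags become a list.
    Attributes are ignored in this sketch.
    """
    children = list(element)
    if not children:
        return (element.text or "").strip()
    result = {}
    for child in children:
        value = element_to_dict(child)
        if child.tag in result:
            # Second or later occurrence of this tag: collect into a list.
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result


def xml_to_json(xml_text):
    """Parse an XML string and return the equivalent JSON string."""
    root = ET.fromstring(xml_text.strip())
    return json.dumps({root.tag: element_to_dict(root)}, indent=2)


print(xml_to_json(SAMPLE_XML))
```

One caveat of such a generic conversion: a tag that appears only once comes out as a single object rather than a one-element list, so a real API would normally map the XML onto a fixed, documented JSON schema instead.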

During the two days, the team, including three Liipers, responded to user requests and added new features to the existing API. Besides small changes, we added a location-based station search as well as the output of the full stage details.

Today, the API is already in use by several projects, including a command line interface and a wheelchair map.

Feel free to use it, extend it, and share it. Feedback is welcome as well.

Zurich: Transport Flows visualization

Team: Benjamin Wiederkehr, Dagmar Muth, Ilya Boyandin, Joel Bez, Patrick Stählin, Patrick Zahnd, Sylke Gruhnwald, Thomas Preusse

In the second project, Patrick got involved in visualizing transport flows. Adapting the idea of the Villevivante project, the goal was to visualize the Swiss transport flows attractively and interactively.

We collected the data based on the swisstrains.ch JSON output and processed it with Python scripts. With the given information, we created some interactive graphics using the JavaScript visualization framework d3. It turned out to match our requirements perfectly and provided a wide range of features.

In the end we were able to visualize facts like sector-based train speeds and counts. We also visualized the transport hubs on a minute and hour basis.
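
The kind of processing behind such hub statistics can be sketched in a few lines of Python. This is not the hackday code: the JSON shape (a list of stops with `station` and `time` fields), the sample values and the function names are assumptions made up for this example, and the real swisstrains.ch output looks different.

```python
import json
from collections import Counter

# Hypothetical input, invented for this example (not real timetable data).
SAMPLE_JSON = """
[
  {"station": "Station A", "time": "08:05"},
  {"station": "Station A", "time": "08:40"},
  {"station": "Station B", "time": "08:15"},
  {"station": "Station A", "time": "09:10"}
]
"""


def stops_per_hour(json_text):
    """Count train stops per (station, hour of day)."""
    counts = Counter()
    for stop in json.loads(json_text):
        hour = int(stop["time"].split(":")[0])
        counts[(stop["station"], hour)] += 1
    return counts


def busiest_stations(json_text, top=3):
    """Rank stations by their total number of stops, busiest first."""
    totals = Counter(stop["station"] for stop in json.loads(json_text))
    return totals.most_common(top)


print(stops_per_hour(SAMPLE_JSON))
print(busiest_stations(SAMPLE_JSON))
```

Counts like these, exported as JSON or CSV, are the sort of input a d3 graphic can then bind to.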

An interesting statistic is the transport hub list, especially the fact that Lucerne is number three after Zurich and Berne. Also interesting is that the fastest railway line is still the “Bahn2000” between Berne and Zurich, which some of us use regularly.

The result can be found on flows.transport.opendata.ch.

Geneva: SiesteApp

Team: Andreas Kuendig, Benoît Pointet, Raphaël Halloran

The Geneva hackday crowd developed many interests that were more focused on the Geneva region, since a delegation from Geneva’s territorial information systems department (SITG) was present and provided great help and insight into the geoinformation available for the city. Topics like “bike mobility” and “multi-modality” came under heavy scrutiny, discussion and ideation.

Benoît got involved in a team that focused on a vague but nonetheless fascinating topic around individuality, emotions and comfort. He ended up working on a mobile app to help people find out where they could take a nap in Geneva, have a rest, or just breathe some fresh air in a quiet (or even dog-free) environment.

Follow the project at http://make.opendata.ch/doku.php?id=project:sieste.

Conclusion

The hackday was a success not only in the eyes of the Liip attendees, but also in those of the other participants and of the many institutional delegations visiting the hackday, such as the SBB representatives, who were impressed and willing to support the Swiss public transport API.

– Andreas Amsler, Benoît Pointet, Colin Frei, Fabian Vogler, Patrick Zahnd, Roland Schilter


First Swiss Open Data Camp 30th of September


Not only as a sponsor, but as people deeply convinced of the power of openness, we are delighted to announce the first Swiss Open Data Camp. There are already many attendees, but more are welcome. See you there!
