The Data Stack – Download the most complete overview of the data centric landscape.

This blog post offers an overview and PDF download of the data stack, thus all tools that might be needed for data collection, processing, storage, analysis and finally integrated business intelligence solutions.

(Web)-Developers are used to stacks, most prominent among them probably the LAMP Stack or more current the MEAN stack. On the other hand, I have not heard too many data scientists talking about so much about data stacks – may it because we think, that in a lot of cases all you need is some python a CSV, pandas, and scikit-learn to do the job.

But when we sat down recently with our team, I realized that we indeed use a myriad of different tools, frameworks, and SaaS solutions. I thought it would be useful to organize them in a meaningful data stack. I have not only included the tools we are using, but I sat down and started researching. It turned out into an extensive list aka. the data stack PDF. This poster will:

  • provide an overview of solutions available in the 5 layers (Sources, Processing, Storage, Analysis, Visualization)
  • offer you a way to discover new tools and
  • offer orientation in a very densely populated area

Continue reading about The Data Stack – Download the most complete overview of the data centric landscape.

Tags: , , , , , ,

Drupal SearchAPI and result grouping

In this blog post I will present how, in a recent e-Commerce project built on top of Drupal7 (the former version of the Drupal CMS), we make Drupal7, SearchAPI and Commerce play together to efficiently retrieve grouped results from Solr in SearchAPI, with no indexed data duplication.

We used the SearchAPI and the FacetAPI modules to build a search index for products, so far so good: available products and product-variations can be searched and filtered also by using a set of pre-defined facets. In a subsequent request, a new need arose from our project owner: provide a list of products where the results should include, in addition to the product details, a picture of one of the available product variations, while keep the ability to apply facets on products for the listing. Furthermore, the product variation picture displayed in the list must also match the filter applied by the user: this with the aim of not confusing users, and to provide a better user experience.

Continue reading about Drupal SearchAPI and result grouping

Tags: , ,

Search: Get past English with Solr

Implementing a great search feature for an English website is already quite a task. When you add accented characters like you have in French, things tend to get messy. What about more exotic languages like Japanese and Chinese?

When we tried to implement a search engine for a multi lingual website where we had articles in Chinese, Japanese and Korean, despite not knowing those languages at all, we quickly remarked that our search engine was performing really poorly. On some occasion it wasn’t even returning an article we specifically copied a word from.

We had to do a lot of research to understand what was happening, here is a compilation of what we found along the way in the hope you won’t have to go the same path as us!

Continue reading about Search: Get past English with Solr

Tags: , ,