Tag: datalake

Managed Big Data: DataBricks, Spark as a Service

Managed Big Data: DataBricks, Spark as a Service

The title accompanying this blog post is quite the mouth full. This blog post will explain why you should be using Spark. If a use case would make sense, then we will introduce you to the DataBricks product, which is available on Azure. Being recognised as a Leader in the Magic Quadrant, emphasizes the operational […]

Feature Store

Feature Store

Everyone who has already come in touch with data science, has already heard of features used in such models. One aspect that can become quite challenging, is reusing features in a consistent way, across several team members, projects and in environments. In this article, I will explain the most commonly used way to resolve these […]

Kimball in a data lake? Come again?

Kimball in a data lake? Come again?

Most companies are already familiar with data modelling (be it Kimball or any other modelling technique) and data warehousing with a classical ETL (Extract-Transform-Load) flow. In the age of big data, an increasing number of companies are moving towards a data lake using Spark to store massive amounts of data. However, we often see that […]