Advanced Chart Types - The Sankey Diagram

Author: Gert Kemps

Features work from: Simke Nys, Gert-Jan Schokkaert & Kurt Buhler

The first Sankey Diagram made to visualize the Thermal Efficiency of Steam Engines by Matthew H.P.R. Sankey in 1898

 

Swankey? Sandkey? Stanley?

Those are a few of the misspellings I have heard when I talk to clients about making a Sankey diagram. If you are not familiar with a Sankey diagram, chances are that you will think “that is a funny name”. True, it is a funny name, but it is a lot more than that.

The swanky Sankey man himself, Matthew H.P.R. Sankey

First things first, a bit of history. The Sankey diagram is named after an Irish captain, Matthew Henry Phineas Riall Sankey. He invented a diagram that visually explained how steam flowed through a steam engine. He expressed how much steam was processed through the size of each branch and node. By varying these sizes, he made it easy to see how the steam was processed. Since this was very easy to understand and communicate, the Sankey diagram was born.

Now that we know where the funny name comes from, we can start with the why.

Why should we use a Sankey Diagram? Because it looks cool? No! Perhaps it is better to use a pie chart instead? Wrong! Never use a pie chart! Or maybe sometimes, but that is another discussion.

We use a Sankey diagram when we want to see how things move and evolve. An example: at VisionWorks we are pretty good at what we do (Please do ask our clients, we swear we speak the truth! It is amazing, you should find out). Because of that, our unit is growing fast, and it became hard to tell how our consultants were growing in their careers. To solve this problem, our data visualisation consultants sat together and discussed how best to show it. The answer was a Sankey Diagram. Surprise!

With the Sankey diagram we were able to visualize the career paths of the consultants in our unit. How do they start? As a Kickstarter or with experience? What role do they have now? In which practice are they active? Do we have leavers in our unit? All these questions were answered with our Sankey diagram. Before we move on to the ‘how’, I have to mention that we used mock-up data. We actually do not have any leavers. Not convinced? Check out our vacancies! As said before, you should find out, it is amazing.

We have now arrived at what is, for most of you anyway, the most important part of this blog post: the how!

At VisionWorks our goal is to be ahead of change. This means that we do not stick to one technology or software package, but tailor our work to each client. A tool is just a means to an end. We deliver state-of-the-art and future-proof solutions. In the case of the Sankey diagram, we delivered 4 Sankey Diagrams in 4 different tools: Tableau, Qlik Sense, Power BI and D3.js. Why? Because we can (and also because we want to learn things in the process)! And of course because these tools and technologies are the future.

Let us show you how we did it:

 

Tableau

 

A report featuring a Sankey Diagram built in Tableau.

Using Tableau, there are multiple ways to build or use a Sankey diagram. The easiest way is to use an extension: after downloading the extension you only have to connect it to your data et voilà, you have a Sankey Diagram. Keep in mind that your data will be transferred over the web when using extensions. Because of this, we opted to build the Sankey ourselves. As always with Tableau, we had multiple options. Thanks to the vibrant online Tableau community we found two feasible ways. One was invented by Ian Baldwin: using a lot of calculations, he was able to create a Sankey diagram without any data prep. The other was invented by Jeffrey Shaffer and further updated by Olivier Catherin. It involves some data prep: we use a data model to calculate the min and max of each flow, which multiplies our dataset by 93.

We used the method described by Jeff and Olivier. It gave us the needed flexibility and we did not need to create a lot of calculations. Since our initial dataset was not very big, we had no issues when we combined our dataset with the data model.
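To give a feel for what such a data model does, here is a minimal sketch in TypeScript. It is an illustration of the general idea only, not the actual Tableau calculations: each flow is cross joined with a small helper table of t values, and a sigmoid function turns every t into a point on a smooth curve between the start and end position of the flow. The field names, the 49 steps and the range of -6 to 6 are assumptions chosen for this example, so the multiplication factor here differs from the one mentioned above.

```typescript
// One flow between two stages, e.g. "Kickstarter" -> "Consultant".
// All field names are hypothetical and only serve this illustration.
interface Flow {
  source: string;
  target: string;
  startY: number; // vertical position where the flow leaves the left stage
  endY: number;   // vertical position where the flow enters the right stage
}

interface CurvePoint {
  t: number; // helper value coming from the data model
  x: number; // horizontal position, from 0 (left stage) to 1 (right stage)
  y: number; // vertical position along the curve
}

// Logistic (sigmoid) function: near 0 for very negative t, near 1 for very positive t.
function sigmoid(t: number): number {
  return 1 / (1 + Math.exp(-t));
}

// Build the helper "data model": evenly spaced t values from -range to +range.
function buildModel(steps: number, range: number): number[] {
  const values: number[] = [];
  for (let i = 0; i < steps; i++) {
    values.push(-range + (2 * range * i) / (steps - 1));
  }
  return values;
}

// Cross join one flow with the model: one flow row becomes one curve point per t value.
function densify(flow: Flow, tValues: number[]): CurvePoint[] {
  const range = tValues[tValues.length - 1];
  return tValues.map(function (t: number): CurvePoint {
    return {
      t: t,
      x: (t + range) / (2 * range),
      y: flow.startY + (flow.endY - flow.startY) * sigmoid(t),
    };
  });
}

// Assumed example values: 49 steps between t = -6 and t = 6.
const tModel = buildModel(49, 6);
const flowCurve = densify(
  { source: "Kickstarter", target: "Consultant", startY: 10, endY: 40 },
  tModel
);
// flowCurve[24] is the midpoint of the curve: x = 0.5, y = 25.
```

Because every flow is repeated once per t value, the dataset grows by the size of the helper table. That is the trade-off of this kind of data prep, and the reason it works best on small datasets like ours.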

 

Power BI

 

Sankey Diagram made with custom visuals in Power BI.

In Power BI, you can make a Sankey by importing a custom visual from the marketplace. In your Power BI Desktop environment, go to the visualizations pane, click on the dots and select import custom visual from marketplace. Next, search for Sankey and import the visual you find. Once it is imported, you just need to do some data prep in the edit queries environment (or before you import your data into Power BI) to transform the data into the shape that the visual expects. (See this video for more information about transforming your data into the needed structure). While it is very fast to make this basic Sankey, it is unfortunately not very customizable. You can only change the order of the nodes by changing the order in the data. Nevertheless, the cross-filtering behavior that normal visuals have also applies to this visual. To make this visual more attractive and efficient, I suggest using it in combination with other visuals that exhibit cross-filtering. When you really want a Sankey chart that is very customizable (colors, report page tooltip, …), I suggest developing your own custom visual (using a combination of TypeScript and D3). You can find some tutorials about that online, and in the future we at Ordina will also contribute to that community.
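The data prep mentioned above comes down to reshaping the data into one row per link. The sketch below illustrates that shape in TypeScript, under the assumption that the visual wants a source, a destination and a weight for every link. In Power BI itself you would do this in the edit queries environment, so treat this purely as a picture of the target structure. The stage names and the mock-up rows are hypothetical.

```typescript
// One consultant's career path; the three stages are hypothetical mock-up fields.
interface CareerPath {
  start: string;    // e.g. "Kickstarter" or "Experienced"
  role: string;     // current role
  practice: string; // current practice
}

// One row in the long format: a link between two adjacent stages.
interface LinkRow {
  source: string;
  destination: string;
  weight: number; // number of consultants following this link
}

function toLinkRows(paths: CareerPath[]): LinkRow[] {
  const rows: LinkRow[] = [];

  // Add one consultant to the link source -> destination, creating the row if needed.
  function addLink(source: string, destination: string): void {
    let existing: LinkRow | undefined = undefined;
    for (const row of rows) {
      if (row.source === source && row.destination === destination) {
        existing = row;
      }
    }
    if (existing === undefined) {
      existing = { source: source, destination: destination, weight: 0 };
      rows.push(existing);
    }
    existing.weight += 1;
  }

  // Each career path contributes one link per pair of adjacent stages.
  for (const career of paths) {
    addLink(career.start, career.role);
    addLink(career.role, career.practice);
  }
  return rows;
}

const mockPaths: CareerPath[] = [
  { start: "Kickstarter", role: "Consultant", practice: "Data Visualisation" },
  { start: "Kickstarter", role: "Consultant", practice: "Data Engineering" },
  { start: "Experienced", role: "Senior Consultant", practice: "Data Visualisation" },
];
const linkRows = toLinkRows(mockPaths);
// The first row is Kickstarter -> Consultant with weight 2; the other four rows have weight 1.
```

Since the order of the rows determines the order in the visual, sorting this long table is also the place to control how the nodes are arranged.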

 

Qlik Sense

 

In Qlik Sense, the Sankey can easily be created using a well-made extension that has been included in the recent update from February 2019. This extension, written in JavaScript, allows the user to specify “nodes” and “chords” that can be added to the visualization by simply dropping dimensions (nodes) and a single measure (chords) onto the visual. The node order can be changed by moving the dimensions up and down with the simple drag-and-drop functionality of Qlik Sense. Changing colors, however, is a bit less intuitive and requires specifying the appropriate color code (including RGB values). This does allow for some further flexibility, though. Using Qlik’s Set Actions, you can create conditional colors for specific nodes or chords, such as the leavers in red or VisionWorks-related dimensions in orange. Alternatively, both nodes and chords can be colored according to measures with the appropriate script. This extension works well with Qlik’s associative discovery functionality. Interacting with the visual results in corresponding changes across the dashboard, and the general discoverability of the Sankey diagram makes it a good fit for a data exploration dashboard in Qlik.

A Qlik Sense Sankey Diagram with mock-up HR data, built with an out-of-the-box extension included in the February 2019 update.

 

D3.js

 

A Sankey built in JavaScript with D3.js

Creating a Sankey with D3.js is a different story compared to the vendor tools. With those tools, a Sankey only required a few clicks. This is not the case when using D3.js.

D3.js is a JavaScript library for visualizing data with HTML, SVG, and CSS. This implies that you will have to code the visualizations. No dragging and dropping this time, which makes it a lot more time-consuming. What you get in return for your time are endless customization options, as long as you can code it, or find someone else who has shared source code on the World Wide Web. In this case, me 😊
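To show what "coding it yourself" means, here is a small sketch of two building blocks behind a Sankey, written in plain TypeScript without any library so that it stays self-contained. It is not the code behind the chart above. In a real project a plugin such as d3-sankey would typically take care of the layout. The sketch assumes a common convention in which the size of a node is the larger of its total incoming and total outgoing flow, and it draws a link as a horizontal cubic Bézier curve in SVG path syntax. All names and values are hypothetical.

```typescript
// A link between two nodes; value is the size of the flow (here: number of consultants).
interface SankeyLink {
  source: string;
  target: string;
  value: number;
}

// Size of every node: the larger of its total incoming and total outgoing flow.
function nodeValues(links: SankeyLink[]): { [node: string]: number } {
  // Object.create(null) gives plain lookup tables without inherited keys.
  const incoming: { [node: string]: number } = Object.create(null);
  const outgoing: { [node: string]: number } = Object.create(null);
  for (const link of links) {
    outgoing[link.source] = (outgoing[link.source] || 0) + link.value;
    incoming[link.target] = (incoming[link.target] || 0) + link.value;
  }
  const values: { [node: string]: number } = Object.create(null);
  for (const node in outgoing) {
    values[node] = outgoing[node];
  }
  for (const node in incoming) {
    values[node] = Math.max(values[node] || 0, incoming[node]);
  }
  return values;
}

// SVG path for one link: a cubic Bezier curve that leaves and arrives horizontally.
function linkPath(x0: number, y0: number, x1: number, y1: number): string {
  const mid = (x0 + x1) / 2;
  return "M" + x0 + "," + y0 + "C" + mid + "," + y0 + " " + mid + "," + y1 + " " + x1 + "," + y1;
}

// Hypothetical mock-up flows.
const careerLinks: SankeyLink[] = [
  { source: "Kickstarter", target: "Consultant", value: 2 },
  { source: "Consultant", target: "Data Visualisation", value: 1 },
  { source: "Consultant", target: "Data Engineering", value: 1 },
];
const nodeSizes = nodeValues(careerLinks);       // Consultant: 2, Data Visualisation: 1, ...
const ribbonPath = linkPath(0, 10, 100, 40);     // "M0,10C50,10 50,40 100,40"
```

The resulting path string can be used as the d attribute of an SVG path element, with the stroke width set to the value of the link. Everything else (colors, labels, tooltips, interaction) is up to you, which is exactly where the extra time goes.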

To conclude, using the Sankey diagram we were able to visually show how consultants within VisionWorks were evolving in their careers. A Sankey diagram is based on branches to visualize flows. With the size of each branch, we can communicate growth or increase. Nevertheless, a Sankey diagram is not a common diagram. We only use this type of visual when it makes sense to the end user of the visualization. In fact, 95% of all visuals that are typically used on reports and dashboards are bars, lines, pies and tables. Advanced charts, like the Sankey Diagram, only take up 5%. But that 5% is often the key to making the difference between yet-another-dashboard and a dashboard that is really effective for the user.

 

Interested?

Please keep reading our blog, more advanced charts will follow! Looking for the extra and crucial 5% that makes the difference? Don’t hesitate to contact us, we’ll be happy to meet you!