Mastering Data Visualisation: Avoiding Common Dashboard Design Mistakes (Full Transcript)

Learn how to create effective dashboards by prioritising data over design. Discover key principles from Edward Tufte's book and practical tips for better visualisation.

Speaker 1: The sad truth is, I see it all the time. And to be honest, when I started building dashboards many moons ago, I was definitely guilty of this as well. What am I talking about? Well, it's when dashboard designers prioritise style over substance, or to put it another way, design over data. They try to make their dashboards look cool by incorporating unnecessary design elements, while at the same time completely ignoring the fundamental principles of data visualisation. Because after all, effective data visualisation is less art and more science. As I said, way back in 2010 when I was starting out, I used to try to impress clients by making their dashboards look as cool as possible, until someone recommended a book to me that, once I'd finished it, completely changed the way I designed dashboards from that point on. In this video, I'm going to break down the biggest takeaways I got from that book, as well as other things I've learned since, to help you better visualise your data and build better dashboards. Let's jump in.
Hello and welcome to Learn BI Online with me, Adam Finer, helping you do more with data. So the book in question is called The Visual Display of Quantitative Information. It's by Edward Tufte, who is widely considered to be one of the godfathers of modern data visualisation. An original data viz OG, if you like. It's a fantastic book, and even though it was first published in 1983, well before the advent of BI tools, what it teaches can easily be applied and adapted to modern interactive dashboards.
So right at the beginning of the book, Edward Tufte sets out eight fundamental principles of data visualisation that contribute to what he calls graphical excellence. They are as follows.
Show the data. So, basically, visually representing sets of numbers that otherwise would be hard to understand properly.
He refers to something called Anscombe's Quartet, which is a group of four datasets that have nearly identical simple descriptive statistics, yet have very different distributions and appear very different when visualised.
Induce the viewer to think about the substance rather than about methodology, graphic design, the technology of graphic production or something else. Essentially, avoid anything that will distract the viewer from understanding the visualised data. So this is what I referred to earlier when talking about making a dashboard look cool. Sometimes, in an effort to do that, you actually end up distracting the viewer from the whole point of the dashboard in the first place, the data. In a minute, I'll talk about ways to avoid this pitfall.
Number three, avoid distorting what the data have to say. Data that's visualised with the intention of telling a different story or misleading the viewer is essentially lying about the data. A common way this is done is by starting the y-axis of a graph at a value greater than zero, which exaggerates the differences between data points.
Number four, present many numbers in a small space. Essentially, the whole point of data visualisation.
Make large datasets coherent. There simply is no better way to make sense of large datasets than with visualisation. But the key word here is coherent. It's actually easier than you might think to visualise data poorly, perhaps by using an unsuitable visualisation or too many different nonsensical colours.
Encourage the eye to compare different pieces of data. Essentially, all data visualisations should be comparing data to help the viewer understand it better. A single figure or data point on its own without any context has limited usefulness. It's only when you include other data that it can be better understood through comparisons and correlations. The implication here is that the comparisons should be made as easy as possible for the viewer to see.
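
The Anscombe's Quartet point mentioned under "show the data" can be checked numerically. The sketch below is an editorial illustration, not something shown in the video: it uses the published quartet values and only the Python standard library, and the helper names `pearson` and `summarise` are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

# Anscombe's quartet: datasets I to III share the same x values.
X123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]
QUARTET = {
    "I": (X123, [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]),
    "II": (X123, [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]),
    "III": (X123, [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]),
    "IV": ([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8],
           [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]),
}


def pearson(xs, ys):
    """Pearson correlation coefficient, computed by hand to avoid version-specific helpers."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def summarise(xs, ys):
    """Return the simple descriptive statistics the quartet is known for."""
    return {
        "mean_x": mean(xs),
        "mean_y": mean(ys),
        "var_x": variance(xs),  # sample variance
        "var_y": variance(ys),
        "corr": pearson(xs, ys),
    }


SUMMARIES = {name: summarise(xs, ys) for name, (xs, ys) in QUARTET.items()}

for name, stats in SUMMARIES.items():
    print(name, {key: round(value, 2) for key, value in stats.items()})
```

The four summaries come out nearly identical, which is why the differences only become obvious once the points are plotted.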
Reveal the data at several levels of detail, from a broad overview to the fine structure. The Visual Display of Quantitative Information was first published in 1983, long before the arrival of BI tools, and even a couple of years before the first release of Excel. So data visualisation has come a long way. The arrival of dedicated BI and dashboarding tools makes revealing the data at different levels much easier to achieve through interactivity options you can include in dashboards, allowing the viewer to filter and drill down into the data to see it at different levels.
Serve a reasonably clear purpose: description, exploration, tabulation or decoration. When it comes to dashboards, I feel it's important for each one to maintain a clear objective. So in practice, you should avoid presenting data from too many different data sources on the same dashboard, unless it serves the dashboard's objective and narrative. And if you do, they should be clearly distinguishable.
If you're new to data visualisation, I'd recommend you print off these eight points and keep them close by so you can refer to them while you're building your dashboards and reports.
So how are you visualising your data wrong? What are the common mistakes? How do you avoid or correct them? Probably the best way, or at least the starting point, is to consider what Edward Tufte calls the data-to-ink ratio. So the data-to-ink ratio is the total amount of ink used to print the data in the visualisation (the lines on a time series chart, the bars on a bar chart or the figures in a table) divided by the total amount of ink used to print the whole visualisation. He also describes it as the proportion of a graphic's ink devoted to the non-redundant display of data information. There can also be redundant data ink that should be subject to one of Tufte's two erasing principles aimed at increasing the proportion of data ink.
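
The ratio described above (data ink divided by total ink) can be sketched in a few lines. This is an editorial illustration only: the ink amounts are made-up units for a hypothetical chart, the choice of which elements count as data ink is an assumption, and `data_ink_ratio` is a hypothetical helper rather than anything from the book or a BI tool.

```python
def data_ink_ratio(ink_by_element, data_elements):
    """Share of a chart's total ink that is spent on the data itself."""
    total_ink = sum(ink_by_element.values())
    if total_ink <= 0:
        raise ValueError("total ink must be positive")
    data_ink = sum(ink for name, ink in ink_by_element.items() if name in data_elements)
    return data_ink / total_ink


# Hypothetical ink amounts (arbitrary units) for one time series chart.
CHART_INK = {
    "series line": 40,
    "chart border": 15,
    "title bar": 20,
    "legend": 10,
    "gridlines": 25,
    "axis labels": 10,
}
# Assumption: only the plotted series counts as data ink here.
DATA_ELEMENTS = {"series line"}

before = data_ink_ratio(CHART_INK, DATA_ELEMENTS)

# Erase some non-data ink and measure again.
ERASED = {"chart border", "title bar", "legend"}
trimmed = {name: ink for name, ink in CHART_INK.items() if name not in ERASED}
after = data_ink_ratio(trimmed, DATA_ELEMENTS)

print(f"before: {before:.2f}, after: {after:.2f}")
```

Erasing non-data ink leaves the numerator unchanged and shrinks the denominator, so the ratio rises.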
Ink that fails to depict statistical information does not have much interest to the viewer of the graphic. In fact, sometimes such non-data ink clutters up the data. Redundant data ink depicts the same number over and over. If we look at this simple graphic, how many times can you see the same information displayed? I'll give you 5 seconds. Well, if you guessed 6 times, you'd be correct. Here they are. So, we can remove 5 out of the 6 and leave just the column colour.
Now let's look further at redundant graphic ink and see how erasing it can improve data visualisation. Here's a time series chart. Let's analyse its data-to-ink ratio. Which elements are the non-redundant data ink and which might be considered redundant?
Starting with the chart title. Do we even need it? Does the chart itself not already communicate to us what the data is that we're looking at through other elements like the axis titles? Is the big blue bar not distracting and drawing the eye away from the data? I'd say yes, in which case I'd remove it. Or at the very least remove the blue and make the title less overpowering.
Next, do we need a border around the chart? No, so we remove it.
This legend here, do we need this as well? It's actually information that appears 3 times in this chart. Once in the legend, once in the y-axis title and once in the chart title. So we should choose to keep just one of them. We'll start by removing the legend, then we have another choice to make. Either is fine, but I'm going to remove the y-axis title and keep the chart title.
While we're at it, why not just remove the x-axis title as well? What purpose does it actually serve? The chart title says we're looking at sales over time. We're using a time series chart. Do we really need to have an axis title stating that we're looking at a date when the values on the axis show that? I'd say no, but only on the proviso that there are not multiple different dates in the dataset, where we'd need to know which one we're using.
For example, order date, ship date, etc. In this case, we can remove it.
Let's stick with the x-axis for a second. Do we really need to display so many date values? Or can we reduce the number of values and therefore improve the data-to-ink ratio? How about this? This? Or even this? OK, maybe that's going too far. It might be OK if you're dealing with a shorter period of time, like the last 28 days, or just when you have fewer values in the time series. For this particular chart, I think it's helpful to see where the years start. So let's go back a step. What we can see when we reduce the number of values on the x-axis is that we also reduce the number of vertical gridlines, which is removing even more non-data ink.
I think we can do the same for the y-axis. Let's reduce the number of values to 4. That looks better.
So, some data visualisation purists might even suggest you remove gridlines altogether, like this. But personally, I think having at least some guide to see where values far apart on the series are in relation to a particular level is quite helpful. So what I would do is actually modify the gridline colours, or reduce their opacity, so that they're not so prominent, like this.
If we compare where we started to where we are now, we can see that it's a vast improvement. The data takes centre stage, and the remaining ink is there to aid the comprehension of that data.
So what else counts as redundant ink on a chart or dashboard? Well, things like drop shadows on borders, making graphs 3D, unnecessary images, and sometimes even things like borders on KPI visualisations aren't really necessary. Just ask yourself, is it a design element that's helping to present the data more clearly and effectively, or does it serve no purpose and is purely an aesthetic element?
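
The x-axis step above (fewer date values, but still showing where each year starts) can be sketched as a labelling rule. This is an editorial illustration under stated assumptions: the series is assumed to be monthly, the three-year range is made up, and `monthly_dates` and `year_start_labels` are hypothetical helpers, not functions from any charting tool.

```python
from datetime import date


def monthly_dates(start_year, start_month, count):
    """Build `count` consecutive first-of-month dates from the given start."""
    dates = []
    for i in range(count):
        months = (start_month - 1) + i
        dates.append(date(start_year + months // 12, months % 12 + 1, 1))
    return dates


def year_start_labels(dates):
    """Label only the January points with their year; leave every other tick blank."""
    return [str(d.year) if d.month == 1 else "" for d in dates]


# Hypothetical three-year monthly series.
DATES = monthly_dates(2021, 1, 36)
LABELS = year_start_labels(DATES)
VISIBLE = [label for label in LABELS if label]

print(f"{len(DATES)} points, {len(VISIBLE)} visible labels: {VISIBLE}")
```

In a charting tool the same effect usually comes from a tick interval or date-format setting; the point of the sketch is only that the data stay untouched while most of the label ink disappears.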
Something I see quite a lot when helping clients build their own dashboards is that they tend to make their charts and graphs too big. Real estate on a dashboard is at a premium, so you need to optimise the space you have. The purpose of your visualisation is to communicate the data effectively. What you should really try to do, at least when space is limited, is to reduce the size of your chart or graph until you really can't understand the data in it, then increase the size back a bit. You see, still legible. The takeaway is that it doesn't need to be big to be understood. Here, I think now that we've reduced the chart size, we can also reduce the axis label font size as well. This being said, if space permits, you're free to make your charts as big as you like.
Colour. So important to use it correctly. Although it's only discussed briefly by Tufte in The Visual Display of Quantitative Information, I'll give you my take. Personally, I like to keep things simple when it comes to colour. Don't use too many when the visualisation doesn't call for it, and especially don't use bright, gaudy colours that distract from the data. Obviously, there are loads of cases when you should or need to use different colours in the same chart, for things like heatmaps and the segments on a pie chart. You should make sure, for example, that when you are presenting the same dimension values in different charts of a report, you maintain consistent colours. Otherwise, you'll just end up confusing the viewer. Basically, when considering the use of colour, it needs to aid the graphic and not just be a simple design element for design's sake. One semi-exception to this is that you can, in certain circumstances, be clever in your use of colour by using different ones for different datasets. So, blue for sales data, red for social media data, etc. This can help the viewer to better read and understand a dashboard.
So, does Edward Tufte like pie charts? Not at all.
He actually calls them dumb. The full quote is, "A table is nearly always better than a dumb pie chart; the only worse design than a pie chart is several of them, for then the viewer is asked to compare quantities located in spatial disarray both within and between pies." He's actually referring to this kind of map visualisation, and I kind of agree. But then I ask myself, isn't this really showing data at different levels, one of the eight points he sets out at the start of the book? For me, when you're looking at something like this in a dashboard, your primary focus should be drawn to the size of the pie charts on the map that will indicate a primary metric. Then, the pie charts allow for a closer inspection of the data by a secondary dimension, i.e. at a different level.
I disagree that pie charts are completely dumb, and I'll explain my thinking. Pie charts are used to display the parts that make up a whole, 100%, represented by the 360 degrees of the circle. For me, there is no other visualisation type, other than perhaps the tree map, that achieves this. You can't visually represent how, say, a set of five values makes up 100% using the columns of a bar chart or the dots on a scatter plot, for example. Yes, you could use stacking to represent 100%, but then would you want to use a single column to display the share of a whole of these five values? It wouldn't be my choice.
The other reason I don't think that pie charts are dumb is that people intrinsically understand the concept of a pie and cutting a pie up into parts. It's instantly recognisable and understood. It just makes sense to us. So, when you're working with an at-a-glance dashboard, a pie chart gives the viewer something that's understood very quickly, without the need for much further inspection.
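
The parts-of-a-whole idea above (100% mapped onto the 360 degrees of the circle) comes down to simple arithmetic. The sketch below is an editorial illustration: the region names and values are made up, and `pie_slices` is a hypothetical helper.

```python
def pie_slices(values):
    """Map each value to (percentage of the whole, angle in degrees)."""
    if any(v < 0 for v in values.values()):
        raise ValueError("pie chart values must not be negative")
    total = sum(values.values())
    if total <= 0:
        raise ValueError("values must sum to a positive total")
    return {name: (v / total * 100, v / total * 360) for name, v in values.items()}


# Hypothetical sales by region.
SALES = {"North": 50, "South": 25, "East": 15, "West": 10}
SLICES = pie_slices(SALES)

for name, (share, angle) in SLICES.items():
    print(f"{name}: {share:.1f}% of the whole, {angle:.1f} degrees")
```

Whatever the inputs, the shares sum to 100% and the angles to 360 degrees, which is the property that makes the pie a parts-of-a-whole chart.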
Other people might argue that where the segments of a pie chart are quite close in value, and it's hard to distinguish which has the greatest share, it makes no sense to use one, and that the small differences between the values might be better expressed with something like a column chart or a simple table. And I would agree to an extent. But I also think that this is missing the point of a pie chart altogether. As I said to begin with, a pie chart is meant to display 100%, and if the parts look equal in size, this tells you that the parts are equal in size. You have effectively communicated the distribution of values making up the whole.
So, now that I've explained why I think pie charts aren't dumb, there are caveats. I always recommend that you only use a pie chart when you have no more than, say, five values to display. Otherwise, it gets too complicated to read. And, where possible, consider using a donut chart, which is essentially a pie chart with a hole in the middle, because it uses less ink.
As I said earlier, The Visual Display of Quantitative Information was written in the 80s, before business dashboards, as we know them, even existed. Which means that it doesn't cover dashboard design principles. There's actually a video on my channel where I talk about my seven dashboard design essentials that do, in fact, incorporate elements from Tufte. So if you want to check that out, here it is. But before you do, if you got value from this video, I'd really appreciate it if you could give it a thumbs up, and maybe leave a comment. What do you think of pie charts? Thanks so much for watching this video, and I'll see you soon in the next. Bye.
