In the previous post, we saw how tables can be designed by following proper data visualization guidelines. In this post, we will look at graphs.
Graphs are pretty much everywhere these days, from stock market data presentations to marketing and HR presentations. After working a lot with Tableau over the years, I came to realize that designing a beautiful dashboard is not everyone’s cup of tea and is an art in itself.
If you are a Tableau lover and enjoy designing dashboards, you should check out MakeoverMonday!
Graphs are normally used to display relationships among and between sets of quantitative values by giving them shape. Patterns revealed by graphs enable users to detect trends, similarities, differences and anomalies in a collection of data points. Large sets of data can be easily perceived and understood, something that is not possible in a table. Quantitative values can be represented in graphs using any of the following:
- Points
- Lines
- Bars
- Boxes
- Shapes with varying 2-D areas
- Shapes with varying colour intensity, like heat maps
Let’s look at the above in detail, in tabular form for easy reference:
|Component||Design guidelines|
|Points||When sets of points cannot be clearly distinguished, correct by enlarging the points or by selecting objects that are more visually distinct. When points overlap such that some are obscured, correct by enlarging the graph or reducing the points’ size, or by removing the fill colours.|
|Bars||Use horizontal bars when their categorical labels won’t fit side by side. Never use horizontal bars for time-series values. Set the width of the white space separating bars that are labelled along the axis equal to the width of the bars, plus or minus 50%. Do not include white space between bars that are differentiated by a legend. Do not overlap bars. Avoid the use of fill patterns (horizontal, vertical or diagonal lines). Use fill colours that are clearly distinct. Use fill colours that are more intense than others to highlight particular values. Use fill colours that are fairly balanced in intensity for data sets that are equal in importance. Only place borders around bars when the fill colour of the bars is not distinct against its background (in which case you can use a subtle border, for example grey) or when you wish to highlight one or more bars compared to the rest. Always start bars at a baseline of zero.|
|Lines||Distinguish lines using different hues whenever possible. Include points on lines only when values for the same point in time on different lines must be precisely compared.|
|Boxes||Follow the principles for bar design, except when box plots are connected with a line to show change through time, which might require a greater distance between boxes.|
|Combinations||Use boxes and lines for distributions through time. Use bars and lines in the form of Pareto charts for featuring the contribution of the largest portions of the whole. Use bars and points for uncluttered comparisons.|
|Trend Lines||In most cases, use moving averages rather than straight lines of best fit to show the overall nature of change through time. Only use linear trend lines (straight lines of best fit) in a scatter plot when the shape of the data is linear rather than curved.|
|Reference Lines||Use reference lines to mark meaningful thresholds and regions, especially for measures of the norm.|
|Annotations||Use text to feature and comment on values directly when doing so is important to the story.|
|Log Scales||Use log scales to reduce the visual difference between quantitative datasets with significantly different values so they can be clearly displayed together. Use log scales to compare differences in values as percentages.|
|Tick Marks||Mute tick marks in comparison to the data objects. Use tick marks with quantitative scales but not with categorical scales, except in line graphs when slightly more precision is needed. Aim for a balance between including so many tick marks that the scale looks cluttered and using so few that your readers have difficulty determining the values of data objects that fall between them. Avoid using tick marks to denote values at odd intervals.|
The table above shows the ideal dos and don’ts for each component. It can be used as a reference when designing a graph.
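To make the trend-line advice in the table a little more concrete, here is a minimal sketch of a moving average in plain Python. The function name `moving_average`, the trailing-window approach and the sample figures are my own assumptions for illustration; Tableau and other tools have their own built-in ways of doing this.

```python
def moving_average(values, window):
    """Return the trailing moving average of `values`.

    Each output value is the mean of the current value and the
    `window - 1` values before it, so the result has
    len(values) - window + 1 entries.
    """
    if window < 1 or window > len(values):
        raise ValueError("window must be between 1 and len(values)")
    return [
        sum(values[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(values))
    ]


# Hypothetical monthly figures, made up purely for illustration.
monthly_sales = [10, 20, 30, 40, 50]
smoothed = moving_average(monthly_sales, window=3)
```

Plotting `smoothed` instead of a straight line of best fit lets the line bend wherever the data bends, which is the reason the guideline prefers it for change through time.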
Graphs are also normally used to display relationships between dimensions and measures. Essentially, there are 8 types of relationships that we typically use graphs to display:
- Time Series
- Ranking
- Part-to-whole
- Deviation
- Distribution
- Correlation
- Geospatial
- Nominal Comparison
|Relationship||Points||Lines||Bars||Boxes|
|Nominal Comparison||In the form of a dot plot when you can’t use bars because the quantitative scale does not begin at zero.||Avoid||Horizontal or vertical||Avoid|
|Time Series||In the form of a dot plot, but only when values were not collected at consistent intervals of time.||Emphasis on overall pattern, categorical items on the x-axis and quantitative values on the y-axis||Emphasis on individual values, categorical items on the x-axis and quantitative values on the y-axis||Only when showing distributions as they change through time, categorical items on the x-axis and quantitative values on the y-axis|
|Ranking||In the form of a dot plot, especially when you can’t use bars because the quantitative scale does not begin at zero.||Avoid||Horizontal or vertical||Only when ranking multiple distributions, horizontal or vertical.|
|Part-to-whole||Avoid||To display how parts of a whole have changed through time||Horizontal or vertical||Avoid|
|Deviation||As a dot plot when the quantitative scale does not begin at zero.||Useful when combined with a time series||Horizontal or vertical, but always vertical when combined with a time series.|
|Single Distribution||Known as a strip plot. Emphasis on individual values.||Known as a frequency polygon. Emphasis on the overall pattern.||Known as a histogram. Emphasis on individual intervals.||Avoid|
|Multiple Distribution||Known as a strip plot. Emphasis on individual values.||Known as a frequency polygon. Limit to a few lines.||Avoid||Known as a box plot.|
|Correlation||Known as a scatter plot.||Avoid||Horizontal or vertical, in the form of a table lens||Avoid|
|Geospatial||Vary point sizes to encode values||To mark routes||Avoid||Avoid|
Once again, the above table can be used as a reference while designing graphs on a dashboard. Apart from relationships and representations, two fundamental principles of quantitative information apply exclusively to graphs:
- Maintain visual correspondence to quantity.
- Avoid 3D.
If you look closely, the graph on the left has been deliberately manipulated to make an increase in sales from $19,500,000 in July to $19,560,000 in December, which is an increase of less than one-third of 1%, look like an increase of more than 200%. The graph on the right presents the data more accurately. So, what was exaggerated in the left graph?
- The scale on the y-axis does not start at zero, thus making minor changes in sales appear extreme.
- The plot area of the graph is taller than it is wide. This dramatically increases the slope of the line.
- The line is green, which usually signals a positive impact and is therefore misleading.
- Placing the boldfaced axis label “millions” in the upper left position near the title “Sales are skyrocketing” suggests that sales are increasing by millions.
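The arithmetic behind this example is easy to check. The sketch below uses the two sales figures quoted above; the helper names and the truncated axis baseline of $19,480,000 are assumptions I made up for illustration, since the actual baseline of the manipulated graph is not stated.

```python
def percent_change(start, end):
    """Actual percentage change from `start` to `end`."""
    return (end - start) / start * 100


def apparent_percent_change(start, end, axis_baseline):
    """Percentage change in plotted height when the y-axis
    starts at `axis_baseline` instead of zero."""
    start_height = start - axis_baseline
    end_height = end - axis_baseline
    if start_height <= 0:
        raise ValueError("axis_baseline must be below the start value")
    return (end_height - start_height) / start_height * 100


july_sales = 19_500_000       # from the example above
december_sales = 19_560_000   # from the example above
assumed_baseline = 19_480_000  # hypothetical; not given in the post

actual = percent_change(july_sales, december_sales)
apparent = apparent_percent_change(july_sales, december_sales, assumed_baseline)

print(f"Actual change:   {actual:.2f}%")
print(f"Apparent change: {apparent:.0f}%")
```

With a zero baseline the two functions return the same number, which is exactly why starting the scale at zero keeps the picture honest.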
A quantity that is visually encoded in a graph should match the actual quantity that it represents.
Two specific design practices will help you honour this correspondence:
- Make the distance between tick marks on a scale line correspond to the differences in the values that they represent.
- Generally, include the value zero in your quantitative scale. When you don’t, alert your readers unless you are confident that they won’t be misled.
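The first of these two practices can be checked mechanically. Here is a rough sketch, assuming a linear scale, where `values` are the tick labels and `positions` are their hypothetical pixel offsets along the axis. The function name and the sample numbers are mine, not from any charting library.

```python
import math


def ticks_are_proportional(values, positions, rel_tol=1e-9):
    """Return True if the distances between tick positions match
    the differences between the tick values (linear scale only)."""
    if len(values) != len(positions) or len(values) < 2:
        raise ValueError("need at least two ticks, one position per value")
    value_span = values[-1] - values[0]
    position_span = positions[-1] - positions[0]
    if value_span == 0 or position_span == 0:
        raise ValueError("first and last ticks must differ")
    for value, position in zip(values, positions):
        # Where this tick should sit if the scale were truly linear.
        expected = positions[0] + (value - values[0]) / value_span * position_span
        if not math.isclose(position, expected, rel_tol=rel_tol, abs_tol=1e-9):
            return False
    return True


# Evenly spaced values drawn at evenly spaced positions: consistent.
honest = ticks_are_proportional([0, 25, 50, 75, 100], [0, 50, 100, 150, 200])

# Uneven values drawn at evenly spaced positions: misleading.
distorted = ticks_are_proportional([0, 10, 50, 100], [0, 50, 100, 150])
```

A log scale would need its values transformed first (for example with `math.log10`) before running the same check.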
The other fundamental principle that applies exclusively to graphs is avoiding 3D.
Simply put, 3D renderings of quantitative information rarely work. Don’t sacrifice effective communication for 3D fluff. There are better ways to show more information than 3D, and one of them is to use multiple related graphs in a series.
General Guidelines and Reference:
The following are some general guidelines and steps that one can follow to decide on the right design and build an effective dashboard:
1. Should the message be presented in the form of a table or a graph?
2. If a table, which kind of relationship should it display?
   - Between a single set of quantitative values and a single set of categorical subdivisions.
   - Between a single set of quantitative values and the intersection of multiple categories.
   - Between a single set of quantitative values and the intersection of hierarchical categories.
   - Among a single set of quantitative values associated with multiple categorical subdivisions.
   - Among multiple sets of quantitative values associated with the same categorical subdivisions.
3. If a graph, which kind of relationship should it display?
   - Nominal comparison
   - Time series
   - Ranking
   - Part-to-whole
   - Deviation
   - Distribution
   - Correlation
   - Geospatial
4. If a graph, which object or combination of objects for encoding the quantitative values would work best?
General Practices of Communication-Oriented design:
In general, I have noticed that well-designed dashboards follow certain design patterns that make them stand out. These patterns fall under two practices:
- Reduce the non-data ink.
  - Subtract unnecessary non-data ink.
  - De-emphasize and regularize the remaining non-data ink.
- Enhance the data ink.
  - Subtract unnecessary data ink.
  - Emphasize the remaining data ink.
There are also good methods for highlighting what is important, and here is a table of them:
For quantitatively perceived visual attributes, here is a table that can be referred to while building visually appealing dashboards and visualizations. For example, if you would like to emphasize the size of a quantitative value, you can use a bigger table or font if it is displayed in a table, or a bigger graph or shape if it is displayed in a chart, so that it stands out and is easily perceived by the user.
A similar set of principles applies to visual attributes as well:
|Attributes||Tables and text||Graphs and Objects|
|Orientation||Italics||Data points with an orientation that is different from the norm.|
|Shape||Any font that is different from the norm.||Any symbol shape that is different from the norm.|
|Enclosure||Border around or shading behind a table, rows, columns or particular values.||Border around or shading behind a graph or particular values.|
|Hue||Almost any hue that is different from the norm.||Almost any hue that is different from the norm.|
|2-D position||Any position that is out of vertical or horizontal alignment with the norm.||Any position that is out of vertical or horizontal alignment with the norm.|
What we saw above is the gist of the fundamentals of a good visualization. I hope it was helpful in some way, and that you end up using these tips and tricks in your future visualizations to dazzle your audience!