According to the definition by Stephen Few: “A dashboard is a visual display of the most important information needed to achieve one or more objectives; consolidated and arranged on a single screen so the information can be monitored at a glance.” This definition implies several requirements for designing a dashboard:
In a dashboard, there are some commonly used analyses presented through visualisation. For a procurement data portal, its dashboard may contain the following analyses:
As a dashboard uses one screen to show multiple contents at the same time, it is important to arrange these in an effective way to let the viewers focus on the information that is most important to them.
A dashboard can be divided into five different areas, i.e. central, top-left, top-right, bottom-left, and bottom-right. These areas have different levels of emphasis and viewers pay more attention to the contents in the areas with high emphasis. The levels of emphasis in different areas of a dashboard are:
There are also other aspects that should be considered when organising the information in a dashboard:
For the design of a dashboard, there are several pitfalls that should be avoided:
We now discuss what types of charts are suitable for the different analyses in a dashboard. The analyses are summarised from a number of existing public procurement data portals and the requirements. For each of them, we discuss what it is, what kind of procurement questions it can answer, and what kind of charts can be used to support the task.
The change of key variables such as the volume (money) of contracts and the number of contracts is important information. Visualising such information can help reveal trends in the data. Such visualisations can answer questions such as:
Line charts are one of the most commonly used techniques to visualise changes over time. The x-axis of a line chart represents time and the y-axis represents the analysed dependent variable. The value of the variable at each time point is presented as a dot and adjacent dots are connected with lines. Multiple lines can be shown at the same time in one line chart and can be distinguished through colours. A rule of thumb is that the number of lines in one chart should not be more than 5-7.
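As a rough sketch of the data behind such a line chart, the following groups contract records by year and produces one series per variable. The record layout, the function name `yearly_series`, and the sample figures are invented for illustration.

```python
from collections import defaultdict

# Hypothetical contract records: (award year, contract value in EUR).
contracts = [
    (2019, 120_000), (2019, 80_000),
    (2020, 200_000),
    (2021, 50_000), (2021, 150_000), (2021, 100_000),
]

def yearly_series(records):
    """Return (years, total values, contract counts), sorted by year."""
    totals = defaultdict(int)
    counts = defaultdict(int)
    for year, value in records:
        totals[year] += value
        counts[year] += 1
    years = sorted(totals)
    return years, [totals[y] for y in years], [counts[y] for y in years]

years, volume, count = yearly_series(contracts)
print(years, volume, count)
```

The years go on the x-axis and each of the two lists becomes one line. Because volume and count have different units, they would normally be drawn as two separate charts rather than on one shared y-axis.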
Composition visualisation can help people understand the components and structure of a subset of the data. For example, people may want to know:
Pie charts can be used to show parts of a whole. A pie chart is divided into separate sectors (like a sliced pie) and each sector represents a category. The angle of a sector is proportional to the value of the presented category. Different sectors are distinguished by colours or annotations. However, a pie chart becomes hard to read when it has more than 4 categories.
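The proportional-angle rule can be written down in a few lines. This is a minimal sketch; the function name `sector_angles` and the sample categories are assumptions made for illustration.

```python
def sector_angles(values):
    """Map category values to sector angles (degrees) that sum to 360."""
    total = sum(values.values())
    if total <= 0:
        raise ValueError("values must have a positive total")
    return {name: 360 * v / total for name, v in values.items()}

# Hypothetical share of contract volume by procedure type.
volume_by_procedure = {"Open": 50, "Restricted": 30, "Negotiated": 20}
angles = sector_angles(volume_by_procedure)
print(angles)
```

A category holding half of the total receives half of the circle, and so on for the remaining categories.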
Treemaps are commonly used by many portals to show compositions of contracts (number and volume of money), buyers, sellers, and so on. A treemap is a rectangular area divided into small rectangular sub-areas. The size (area) of each rectangle represents the value of the corresponding category. As a treemap has a nested structure, it can also show the hierarchical structure in the data. Compared with pie charts, treemaps can represent more categories without reducing the readability.
It is worth noting that the values of categories are encoded into area sizes. This makes it difficult for people to directly compare different components in a treemap. According to the theory of Cleveland and McGill, people are not good at comparing areas.
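To make the area encoding concrete, here is a sketch of the simplest one-level layout, which slices a rectangle into vertical strips. Treemap libraries typically use more elaborate layouts that keep rectangles closer to squares; the function name `slice_layout` and the sample data are hypothetical.

```python
def slice_layout(values, width, height):
    """Split a width x height rectangle into vertical strips.

    Returns {name: (x, y, w, h)}; each strip's area is proportional
    to its value.
    """
    total = sum(values.values())
    x = 0.0
    rects = {}
    for name, v in values.items():
        w = width * v / total
        rects[name] = (x, 0.0, w, height)
        x += w
    return rects

# Hypothetical contract volume per buyer.
volume_by_buyer = {"Ministry A": 60, "City B": 30, "Agency C": 10}
rects = slice_layout(volume_by_buyer, width=100, height=50)
areas = {name: w * h for name, (_, _, w, h) in rects.items()}
print(areas)
```

A nested hierarchy could be handled by applying the same split again inside each rectangle, alternating between vertical and horizontal strips.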
Comparison is one of the most important tasks in data analysis. Through comparison, we can distinguish between high and low, good and bad, and so on. Many questions can be answered by visualising comparisons, such as:
Bar charts are commonly used techniques for comparison between multiple categories. The x-axis represents all the categories and the y-axis represents the values of the categories. It is easy for people to compare the values of two categories in a bar chart, as their values are encoded as the heights of bars and people are good at comparing positions along a common scale.
Stacked bar charts, a variant of bar charts, can present both comparison and the composition of a whole. The bar of each category can be divided into several sub-categories. The heights of the sub-categories are proportional to their sizes in the whole category. However, the comparison between different sub-categories may be less accurate than in a standard bar chart.
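The geometry of one stacked bar can be sketched as follows. The function name `stack_segments` and the sample counts are assumptions for illustration only.

```python
from itertools import accumulate

def stack_segments(parts):
    """Return (bottom, height) for each sub-category in one stacked bar."""
    heights = list(parts.values())
    # Each segment starts where the running total of the previous ones ends.
    bottoms = [0] + list(accumulate(heights))[:-1]
    return {name: (b, h) for name, b, h in zip(parts, bottoms, heights)}

# Hypothetical contract counts for one buyer, split by contract type.
buyer_counts = {"Goods": 40, "Services": 25, "Works": 10}
segments = stack_segments(buyer_counts)
print(segments)
```

Only the first segment starts at the common baseline of 0. The others start at different heights in every bar, which is why comparing them across bars is less accurate.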
As procurement data commonly comes with geography related information, visualising dependent variables based on such information can provide insights about differences between regions (e.g. cities, counties, countries, and continents). It can help answer questions such as:
Choropleth maps are maps whose areas are shaded or coloured. The colour saturation or hue of an area encodes the value of the variable in that area. Such a map can provide an overview of how the variable is distributed geographically. However, as people’s accuracy in comparing colour saturation or hue is low, choropleth maps are not recommended for precise comparison.
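As an illustration of the saturation encoding, the sketch below assigns each region a colour of fixed hue whose saturation is scaled linearly between the lowest and highest value. The function name `saturation_colours`, the chosen hue, and the regional figures are assumptions; drawing the map itself would require geographic boundary data, which is not covered here.

```python
import colorsys

def saturation_colours(values, hue=0.6):
    """Map each region's value to a hex colour of fixed hue.

    Saturation grows linearly from 0 (lowest value) to 1 (highest value).
    """
    lo, hi = min(values.values()), max(values.values())
    span = (hi - lo) or 1  # avoid division by zero when all values match
    colours = {}
    for region, v in values.items():
        s = (v - lo) / span
        r, g, b = colorsys.hsv_to_rgb(hue, s, 1.0)
        colours[region] = "#{:02x}{:02x}{:02x}".format(
            round(r * 255), round(g * 255), round(b * 255))
    return colours

# Hypothetical contract volume (million EUR) per region.
volume_by_region = {"North": 10, "Centre": 55, "South": 100}
colours = saturation_colours(volume_by_region)
print(colours)
```

The region with the lowest value comes out white and the one with the highest value fully saturated, with the rest in between.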
Showing bar charts on a map can provide more accurate comparison than using a choropleth map. For each area, the height of the bar represents the value in that area.
Connection maps can show relationships between different areas of a map. Each area on the map is marked by a circle and the relationships (e.g. trade, transport) between different areas are presented as lines between the circles. The width of the lines can also be adjusted to show different levels of strength of relationships.
Some procurement data portals use many procurement-related indicators to show the transparency, integrity, and risks of a procurement process. Each of these indicators can only describe one aspect. Visualising them in an integrated way may help people understand the overall procurement performance of a buyer, a seller, or a region.
Radar charts are commonly used to present multiple variables related to the same entity. A radar chart has several spokes that start from the centre and divide the space into equal-sized sectors; each spoke represents a variable. The distance of a data point from the centre along its spoke is proportional to the value of the variable, and adjacent data points are connected with straight lines.
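The construction of the radar polygon can be sketched as below. The indicator names, the 0 to 1 scale, and the function name `radar_vertices` are hypothetical.

```python
import math

def radar_vertices(scores, max_score=1.0):
    """Return one (x, y) polygon vertex per indicator.

    Spokes are spaced at equal angles, starting at 12 o'clock and going
    clockwise; the distance from the centre is score / max_score.
    """
    n = len(scores)
    points = []
    for i, score in enumerate(scores.values()):
        angle = math.pi / 2 - 2 * math.pi * i / n
        r = score / max_score
        points.append((r * math.cos(angle), r * math.sin(angle)))
    return points

# Hypothetical indicator scores for one buyer, each on a 0-1 scale.
buyer_scores = {"Transparency": 0.8, "Competition": 0.5,
                "Integrity": 0.9, "Timeliness": 0.6}
polygon = radar_vertices(buyer_scores)
print(polygon)
```

Connecting the returned points in order, and closing the shape back to the first point, gives the outline drawn in the chart.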
One important requirement for a procurement audit is to reveal potential collusion and corruption. To achieve this, we need to know the relationships between buyers and sellers. Visualising such relationships can help answer questions such as:
Network graphs can present the entities of a network and the relationships between these entities. Each entity is presented as a node and the relationship between two nodes is presented as an edge. Through a network graph, we can find out who is in the centre of the network and whether there are clusters in it.
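One simple way to look for the centre of such a network is to count how many distinct partners each node has. The sketch below assumes this degree count as the measure of centrality, which is only one of several possible choices; the buyer and seller names are invented.

```python
from collections import defaultdict

# Hypothetical awarded contracts as (buyer, seller) pairs.
awards = [
    ("Buyer A", "Seller X"), ("Buyer A", "Seller Y"),
    ("Buyer B", "Seller X"), ("Buyer C", "Seller X"),
    ("Buyer C", "Seller Z"),
]

def degree(edges):
    """Count the distinct partners (neighbouring nodes) of every node."""
    neighbours = defaultdict(set)
    for buyer, seller in edges:
        neighbours[buyer].add(seller)
        neighbours[seller].add(buyer)
    return {node: len(nbrs) for node, nbrs in neighbours.items()}

degrees = degree(awards)
most_connected = max(degrees, key=degrees.get)
print(most_connected, degrees[most_connected])
```

In a drawn network graph, the node with the highest count is the one with the most edges attached to it.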
There are several stages in the life cycle of a contract in public procurement. Sellers may be interested in knowing what the upcoming bidding opportunities are and which contracts are going to end. Visualising such time-based events can help sellers prepare their future bidding strategies.
Gantt charts can visualise multiple time-based events in a juxtapositional style. The x-axis of a Gantt chart represents the timeline and the y-axis represents different events. Each event is shown as a bar. The left end and the right end of a bar represent the start and end time of the event, respectively.
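The bar geometry described above follows directly from the start and end dates. In this sketch the contract names, dates, and the function name `gantt_bars` are hypothetical.

```python
from datetime import date

# Hypothetical contracts: (name, start date, end date).
contracts = [
    ("Road maintenance", date(2024, 1, 15), date(2024, 6, 30)),
    ("IT support", date(2024, 3, 1), date(2025, 2, 28)),
    ("Catering", date(2024, 5, 1), date(2024, 10, 31)),
]

def gantt_bars(events):
    """Return (row, name, left offset in days, bar length in days)."""
    origin = min(start for _, start, _ in events)
    return [
        (row, name, (start - origin).days, (end - start).days)
        for row, (name, start, end) in enumerate(events)
    ]

bars = gantt_bars(contracts)
for bar in bars:
    print(bar)
```

Each tuple gives the row on the y-axis, the position of the bar's left end on the timeline, and the bar's length.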
Basic statistics such as average, median, maximum, and minimum are often provided in data analysis. These results are highly summarised, which means that they may not reveal the whole picture of the data. Even for similar statistical results in a procurement data set, people may want to have deeper insights such as:
Violin charts can show the distribution of numeric data. A violin chart is the combination of a box plot (i.e. a plot showing median, upper/lower quartiles, maximum/minimum, and outliers) and the probability density of the data. It can present both the important statistics of the data and the distribution.
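The two ingredients of a violin chart can be approximated with the standard library. This is a sketch under stated assumptions: the quartiles use the "inclusive" method of `statistics.quantiles`, and a plain histogram stands in for the smoothed density outline that a real violin chart would draw. The sample values and function names are invented.

```python
import statistics

def box_summary(values):
    """Return the statistics drawn by the box-plot part of a violin chart."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {"min": min(values), "q1": q1, "median": median,
            "q3": q3, "max": max(values)}

def histogram(values, bin_width):
    """Count values per bin; a crude stand-in for the density outline."""
    counts = {}
    for v in values:
        left = (v // bin_width) * bin_width  # left edge of the value's bin
        counts[left] = counts.get(left, 0) + 1
    return dict(sorted(counts.items()))

# Hypothetical contract values in thousand EUR, with one very large contract.
contract_values = [10, 12, 15, 18, 20, 22, 25, 30, 90]
summary = box_summary(contract_values)
hist = histogram(contract_values, bin_width=20)
print(summary)
print(hist)
```

The summary alone hides the gap between the bulk of the contracts and the single large one; the bin counts make that gap visible.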
Visualisation may tell better stories when combined with other components such as texts and tables. An ideal analytical report should combine these components effectively.
Column sparklines are word-sized visualisations that can be positioned anywhere in a document, a table, or a dashboard. A column sparkline uses bars to present values and can be used to show trends over time or to show comparison. As it is word-sized, it is easy to use it in a sentence to explain a complex trend. Multiple sparklines can be used in a repeated and intensive way (e.g. in a column of a table) to communicate more information than by using just numbers.
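A column sparkline can even be produced as plain text using Unicode block characters. The sketch below assumes non-negative values and scales column heights from 0 to the largest value; the function name `column_sparkline` and the monthly figures are made up for illustration.

```python
BARS = "▁▂▃▄▅▆▇█"

def column_sparkline(values):
    """Render non-negative values as a word-sized row of block columns.

    Heights are scaled from 0 to the largest value, so all columns share
    a zero baseline.
    """
    top = max(values) or 1  # all-zero series: every column at the lowest level
    return "".join(BARS[round(v / top * (len(BARS) - 1))] for v in values)

# Hypothetical number of contracts awarded in seven consecutive months.
monthly_awards = [2, 4, 8, 14, 10, 6, 2]
spark = column_sparkline(monthly_awards)
print(f"Awards peaked in month four {spark} and then fell back.")
```

Because the result is an ordinary string, it can be placed inside a sentence or repeated in a table column.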
As discussed above, comparison is an important task in data analysis. There are several charts that can be used for comparison (e.g. bar charts, treemaps, and pie charts), but the expected levels of accuracy are different.
According to the theory by Cleveland and McGill, people have different accuracy when comparing different visual elements. The ranks of accuracy (from high to low) are:
For example, bar charts help people make more accurate comparisons than pie charts, because in bar charts people compare positions, which rank higher in accuracy than the angles compared in pie charts. Similarly, bar charts are more accurate than heat maps, as comparing positions is more accurate than comparing colours. Therefore, if you want the viewers to compare values accurately, you should prefer charts with high accuracy, e.g. bar charts and stacked bar charts (comparing positions and lengths), over charts with low accuracy, e.g. pie charts and heat maps (comparing angles and colours).
Data visualisation should convey information in a concise way. Try to tell one story per chart and use small multiples, rather than putting all the data into one chart. For each chart, the number of variable dimensions depicted should not exceed the number of dimensions in the data.
A chart should make the viewers focus on the important information in the data. Unnecessary visual components may distract the viewers. Thus you should simplify your visualisation by removing distractions without affecting the message that you want to convey.
According to Tufte’s theory, the distractions (“chart junk”) that can be removed include Moiré patterns, grids, chart frames, and meaningless colours. A general rule for checking whether an element can be removed is: if the chart still tells the same story after the element is removed, then it is fine to remove it.
Sometimes a visualisation cannot tell all the stories in the data without the help of text. Thus it is necessary to use labels and annotations to provide explanations to the viewers. These explanations can be implemented in an on-demand style, for example by only showing the explanation of a visual component when the viewers interact with it.
Some visualisations (e.g. bar charts) summarise many data values into a few statistical results (e.g. average, median, and mode). Inevitably there is uncertainty in these results, and such uncertainty should be shown in the visualisations. There are different ways to visualise uncertainty. One way is to add necessary information such as sample sizes, standard deviations, standard errors, and confidence intervals into the visualisation. Another way is to visualise the distribution of the data.
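The quantities that would be drawn as an error bar can be computed as follows. This is a sketch assuming a normal approximation, where `z=1.96` gives roughly 95% coverage for large samples; for small samples a t-based interval would be wider. The function name and the sample data are hypothetical.

```python
import math
import statistics

def mean_with_interval(values, z=1.96):
    """Return (mean, standard error, (low, high)) for a sample.

    The interval is mean +/- z * SE, a normal approximation.
    """
    n = len(values)
    mean = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(n)
    return mean, se, (mean - z * se, mean + z * se)

# Hypothetical number of bids received for eight tenders.
bids_per_tender = [2, 4, 4, 4, 5, 5, 7, 9]
mean, se, (low, high) = mean_with_interval(bids_per_tender)
print(f"n={len(bids_per_tender)}, mean={mean}, SE={se:.2f}, "
      f"interval=({low:.2f}, {high:.2f})")
```

In a bar chart, the bar would end at the mean and the error bar would span from `low` to `high`, with the sample size given in a label or annotation.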
For a chart, it is important to start its y-axis from 0, if possible. Truncated y-axes starting from non-zero values may show distorted presentations of the actual effects in data. If the baseline of a y-axis is not 0, then it should be clarified to the viewers.
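A small calculation shows how much a truncated axis can exaggerate a difference. The figures and the function name `apparent_ratio` are invented for illustration.

```python
def apparent_ratio(a, b, baseline=0.0):
    """Ratio of the drawn bar heights for values a and b, given the y-axis baseline."""
    if min(a, b) <= baseline:
        raise ValueError("both values must lie above the baseline")
    return (a - baseline) / (b - baseline)

# Hypothetical average contract values (thousand EUR) in two years.
v_2022, v_2023 = 100.0, 110.0
ratio_from_zero = apparent_ratio(v_2023, v_2022)
ratio_truncated = apparent_ratio(v_2023, v_2022, baseline=90)
print(ratio_from_zero, ratio_truncated)
```

With the baseline at 0, the second bar is drawn 10% taller, matching the data. With the baseline at 90, the same 10% increase is drawn as a bar twice as tall.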
As mentioned above, comparison of colours has the lowest accuracy. Thus colours are not suitable for precise comparison. Additionally, if a chart needs to be exported and printed, the choice of colours should be printer-friendly. The use of colours should also consider colour-blind viewers (make sure to test the visualisations using colour blindness simulators), the different interpretations of certain colours in different cultures (e.g. red has negative connotations in the West, and positive connotations in the East), and the viewers’ capability of remembering colours.
One common purpose of data analysis is to find the difference between variables. If two variables are presented in a line chart as two curved lines, according to the theory by Cleveland and McGill, the comparison between them will be very inaccurate. The reason is that people tend to find the closest distance between two curves rather than to find the vertical distance between them. Therefore, to help the viewers compare variables in line charts (especially for curved lines), an extra chart that directly shows the difference should be provided.
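The series for such an extra chart is just the point-by-point vertical difference. The sample figures and the function name `difference_series` are assumptions.

```python
def difference_series(a, b):
    """Return the point-by-point vertical difference a[i] - b[i]."""
    if len(a) != len(b):
        raise ValueError("series must cover the same time points")
    return [x - y for x, y in zip(a, b)]

# Hypothetical quarterly spending (million EUR): planned versus actual.
planned = [10, 14, 20, 30]
actual = [8, 13, 15, 29]
gap = difference_series(planned, actual)
print(gap)
```

Plotting `gap` on its own axis lets the viewers read the difference directly, instead of estimating it from the distance between two curves.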
Using interaction in visualisation can help the viewers explore, navigate, and drill down into subsets of the data. According to the book by Ward et al., the classes of interaction techniques include:
Some encoding components of a visualisation may affect the message that it conveys. Things that need to be reviewed include:
Apart from being accurate, visualisation should also be aesthetically pleasing. According to the book by Ward et al., the three main aspects to be considered in terms of aesthetics are:
As a technique to present composition, pie charts are less accurate than bar charts, as the viewers have to compare angles instead of positions. It is difficult to compare categories whose values are close to each other without reading the annotations. Compared with treemaps, pie charts are less capable of presenting many categories at the same time, as they become difficult to read when the number of categories is higher than 4. Therefore, only use pie charts to show a whole with no more than 4 parts, and do not use them for comparison.
3D charts (e.g. 3D pie charts, 3D bar charts) present 3D objects on a 2D surface (a screen) through perspective. In perspective, objects close to us are perceived to be bigger than objects far from us, even if their actual sizes are the same. Thus the size of the same visual object may be interpreted differently depending on its position in a 3D chart, which leads to inaccuracy. For example, in the chart below, Item C appears to be equivalent to Item A when visualised in a 3D chart, despite being much smaller. Avoid using 3D charts.
A rainbow colourmap (red-yellow-green-blue) is sometimes used in heatmaps or choropleth maps to represent different values through different colours. However, it is misleading because it introduces artificial colour boundaries even if the actual data is continuous and smooth, which can lead to wrong conclusions.
When using double y-axes, the viewers sometimes find it difficult to figure out which data to read against which y-axis. For example, in the chart below, it is difficult to find the correct axis for each of the two lines. In addition, the intersection of the two lines is misleading. The viewers may think that it means something, but in fact it does not, because the two lines do not share the same y-axis. Remember, each visualisation should tell one story. Try to avoid using double y-axes; instead, consider separating the data into two vertically aligned charts, each with its own y-axis.
Not all information has to be visualised. If the information can be represented using a single number, then you do not need to visualise it. Conversely, in many situations, you should not pack all insights of your analysis into one chart - instead, decide on a series of charts, and arrange them in a coherent sequence.