Modern visualizations can be confusing. There are usually multiple data points encoded into a single object. In the previous post, we discussed the different types of data encoding and how we can use them to work around the limitations of our short-term memory. I would like to demonstrate how we can make the most of data in the modern era of data visualization, looking at two effective alternatives to bubble charts.
The data set below is product data for a fictitious retail company.
We can see that Product Group and Product Name are nominal values, and Sales Quantity and Sales Amount are interval values. A traditional visualization would be a combination chart with Sales Amount on the first axis and Sales Quantity on the second.
Though this looks OK, there are still a lot of data points for us to digest. Position, length, slope, and color hue are all encodings used within this visualization. But what if we could present the same information with a more modern and intuitive visualization?
Above is a packed bubble chart. This type of visualization conveys the same data points in a more direct way, without confusing the user with position and slope. The color hue indicates a nominal grouping of information, the label for the bubble indicates the nominal product name as well as interval data that does not need to be compared (such as Sales Quantity), and the area is the interval value that we want the user to compare to the other nominal data items.
Rather than asking the user to process 4 data encodings, as in the first example, we are now asking the user to process only 2: Area and Color Hue.
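Because Area is the encoding the user compares, the bubble's area (not its radius) should be proportional to the value. The following is a minimal Python sketch of that scaling, not the method used by any particular charting tool. The product names, amounts, the `bubble_radii` helper, and the `max_radius` parameter are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical sales amounts per product (illustrative values, not the post's data set).
sales_amount = {"Widget": 400.0, "Gadget": 100.0, "Gizmo": 25.0}


def bubble_radii(values, max_radius=50.0):
    """Return a radius per item so that each bubble's AREA is proportional to its value."""
    largest = max(values.values())
    # Area grows with the square of the radius, so the radius must scale
    # with the square root of the value to keep area proportional.
    return {name: max_radius * math.sqrt(v / largest) for name, v in values.items()}


radii = bubble_radii(sales_amount)
for name, r in radii.items():
    print(f"{name}: radius={r:.1f}, area={math.pi * r ** 2:.0f}")
```

If the radius were scaled linearly instead, a product with four times the sales would be drawn with sixteen times the area, which would exaggerate the difference the user is asked to compare.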
We can also encode the Sales Quantity interval value into the chart as a Color Saturation.
In this instance, we can see that color hue is less important because we distinguish between Product Groups by Position. The Color Saturation represents the total quantity sold. This visualization uses 3 encodings: Position, Area, and Color Saturation.
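One way to picture the Color Saturation encoding is as a linear mapping from quantity to saturation at a fixed hue. The sketch below uses Python's standard `colorsys` module. The quantities, the helper names `quantity_to_saturation` and `saturation_color`, the chosen hue, and the `min_sat` floor are assumptions made for illustration, and a real charting tool may use a different scale.

```python
import colorsys

# Hypothetical quantities sold per product (illustrative values only).
sales_quantity = {"Widget": 120, "Gadget": 60, "Gizmo": 0}


def quantity_to_saturation(q, low, high, min_sat=0.15):
    """Map a quantity linearly onto [min_sat, 1.0]; larger quantities are more saturated."""
    span = (high - low) or 1  # avoid dividing by zero when all quantities are equal
    return min_sat + (1.0 - min_sat) * (q - low) / span


def saturation_color(hue, sat):
    """Convert a hue/saturation pair (at full brightness) to a hex color string."""
    r, g, b = colorsys.hsv_to_rgb(hue, sat, 1.0)
    return "#{:02x}{:02x}{:02x}".format(round(r * 255), round(g * 255), round(b * 255))


low, high = min(sales_quantity.values()), max(sales_quantity.values())
colors = {
    name: saturation_color(0.6, quantity_to_saturation(q, low, high))
    for name, q in sales_quantity.items()
}
for name, color in colors.items():
    print(name, color)
```

The `min_sat` floor keeps the lowest quantity from being drawn as pure white, so every bubble stays visible against a light background.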
Another modern visualization that would be appropriate is a Treemap. These visualizations are a way of showing an additional nominal breakdown of categorical data.
We can easily discern the nominal Product Group by Color Hue and Position. The Product Names are identifiable as individual shapes (boxes). The Sales Amount interval is the size of the area. Moreover, the aggregation of the Sales Amount for a Product Group makes up the area of a larger box. We should think of a Treemap as a multi-layer pie chart. The color hue represents the first slices of the pie from the total, and then the smaller shapes represent slices of the parent slice.
With this type of visualization, Position also represents something; in our case, the yellow boxes in the first position (upper left) make up the largest segment of our 4 Product Groups. The smaller yellow box within the larger yellow segment is the largest sub-segment of data. Our brain does not need an additional encoding method to distract us from understanding this concept. We are encoding data points 4 different ways without being distracting: Position, Area, Shape, and Color Hue.
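The Treemap's nesting can be sketched as two levels of shares: each Product Group's share of the total, and each Product Name's share of its group, both ordered largest first to mirror the Position encoding. This is a simplified Python illustration that computes the proportions only and does not lay out rectangles. The rows and the `treemap_shares` helper are hypothetical.

```python
from collections import defaultdict

# Hypothetical rows: (product group, product name, sales amount). Illustrative values only.
rows = [
    ("Beverages", "Cola", 300.0),
    ("Beverages", "Juice", 100.0),
    ("Snacks", "Chips", 450.0),
    ("Snacks", "Nuts", 150.0),
]


def treemap_shares(rows):
    """Return [(group, share_of_total, [(product, share_of_group), ...])], largest first."""
    groups = defaultdict(list)
    for group, product, amount in rows:
        groups[group].append((product, amount))

    grand_total = sum(amount for _, _, amount in rows)
    result = []
    for group, items in groups.items():
        # The group's box area is the sum of its products' areas.
        group_total = sum(amount for _, amount in items)
        children = sorted(
            ((product, amount / group_total) for product, amount in items),
            key=lambda child: child[1],
            reverse=True,
        )
        result.append((group, group_total / grand_total, children))
    # Largest group first, matching the upper-left position in the chart.
    return sorted(result, key=lambda entry: entry[1], reverse=True)


shares = treemap_shares(rows)
for group, share, children in shares:
    print(f"{group}: {share:.0%} of total")
    for product, child_share in children:
        print(f"  {product}: {child_share:.0%} of {group}")
```

Reading the result top to bottom follows the same order the eye takes through the chart: the largest group first, then the largest product within it.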
If we were to encode the Sales Quantity, we could provide additional detail.
With Sales Quantity as the color saturation scale for the chart, we still have 4 encoding types. Although color hue no longer marks the nominal break between the Product Group areas, the white border still separates them and helps with positioning. This means that the Position encoding is used twice (for the position of the Product Group and Product Name), and the Area encoding is used twice (for the size of the Product Group and Product Name).
Modern data visualizations can be a blessing and a curse. We want to provide as much information to the end user as we can, without overwhelming them to the point of losing focus. Another visualization type that could represent this data is a Sunburst chart, which we will be discussing in our next post.
Want more on the topic of visualization? Check out the first two posts from this series:
The Neuroscience of Seeing Data
Data Impression: 2 Concepts to Improve Data Comprehension