How to Choose the Right Data Visualization


[Image: low-fi data visualization using a pencil, a piece of paper and a ruler. Photo: Isaac Smith | Unsplash]

Data is often easier to understand when presented visually rather than as text. So how do you choose the visual that best captures what your marketing data is trying to say?

In this post I will cover the key considerations behind a good visualization choice. 

Your Choice of Visualization Impacts the Story Your Data Will Tell

Data visualizations can capture any measured task in the customer journey. They are meant to organize observations of a dimension or metric in a graph. But the right visualization choice isn’t always immediately apparent when analysts get to work in their data solutions. Solution menus and dashboards often contain graphics representing the platforms they were meant to measure. Such options can work if you use the tool consistently.

Yet analysts often need to combine data found in one platform with other data or fold it into a calculated metric. That will change their visualization choices. They don’t lack for choices, however, as the growth of data has led to an increased number of visualization options for displaying results and incorporating real-time data.

All of this makes choosing the right visual even more complicated.

Related Article: How Data Visualization Tools Are Making Self-Service Analytics Easier

To Start Your Visualization Selection, Ask the Munzner Questions

So where do you start to choose a good graph?

In my visualization post, What Makes a Good Data Visualization, I mentioned two aspects of data to consider. You want a graph that conveys ideas from the data that are too complex to explain through words and that helps your audience quickly parse information and act on the results.

To get to that graph, ask a set of questions created by Tamara Munzner, professor of computer science at the University of British Columbia. Munzner is renowned for her extensive research in the development, evaluation, and characterization of visualization systems and techniques. She highlighted this question framework in her presentation on avoiding visualization analysis.

  1. Who are the end users? (This is the audience who needs the information.)
  2. What is being shown? 
  3. Why is the user looking at it? (Questions 2 and 3 are meant to highlight what the data is, how it is arranged and its source.)
  4. How is this being shown? (This is the key question — what kind of graph best shows the data.)

The answers to Munzner’s questions help narrow down which graphs best represent the answers visually. Your graph choice should achieve one of the following purposes:

  1. To analyze a distribution, a composition or a change.
  2. To identify patterns or trends.
  3. To reveal purpose 1 and/or purpose 2 within a subset of a given dataset.

Pick a Graph That Displays a Hierarchy in the Data   

Four categories of graphs are suitable to display hierarchies in data: composition, distribution, relationship and comparison. Composition and distribution graphs both address the structure of your dimensions or metrics as revealed by the observations, while relationship and comparison graphs highlight contrasts through patterns and trends.

Composition graphs are meant to describe the makeup of a set of observations. Visualizations in this category include pie charts, treemaps and stacked bar charts.

Distribution graphs display the range of observations, making them well suited to statistics that indicate the quality of the dimensions and metrics containing those observations. Histograms and boxplots, for example, are chosen to show statistical range.

[Figure: boxplot]
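As a rough illustration of what a boxplot summarizes, the sketch below computes the five numbers behind it (minimum, lower quartile, median, upper quartile and maximum) using Python's standard library. The session durations, function names and output filename are all hypothetical, and the plotting helper assumes Matplotlib is installed.

```python
import statistics

# Hypothetical session durations in seconds (illustrative values only).
session_seconds = [12, 15, 18, 21, 24, 30, 35, 41, 90]


def five_number_summary(values):
    """Return the five statistics a boxplot is built from."""
    ordered = sorted(values)
    # method="inclusive" keeps the quartiles within the observed
    # minimum and maximum of the data.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return {
        "min": ordered[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": ordered[-1],
    }


def save_boxplot(values, path="session_boxplot.png"):
    """Draw the same data as a boxplot image (requires Matplotlib)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file; no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.boxplot(values)
    ax.set_ylabel("Session duration (seconds)")
    fig.savefig(path)
    plt.close(fig)


summary = five_number_summary(session_seconds)
print(summary)
```

Calling `save_boxplot(session_seconds)` writes the chart to disk. With Matplotlib's default whisker length of 1.5 times the interquartile range, the 90-second session would be drawn as a separate outlier point, which is the kind of data-quality signal a distribution graph is good at exposing.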

Relationship graphs are about correlation trends among two or more dimensions or metrics. Scatterplots and bubble charts are good examples.

[Figure: scatterplot]

Comparison graphs are meant to highlight differences with respect to deviation, trends or ranking among two or more dimensions or metrics. These are often specialized variations of either relationship or composition graphs, such as regression charts, Pareto charts, bump charts and stacked column charts.

The best graph for your purpose organizes the data to answer the question “Why is the user looking at this?”

Each of these categories has multiple graph styles, more than can be covered in a single post. But in choosing a graph, you are seeking the one that best displays a hierarchy that clearly and accurately answers your questions.
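The four categories and their example charts can be kept as a simple lookup when you are narrowing down options. This is only a sketch: the category names and chart lists come from the descriptions above, while the dictionary and function names are hypothetical.

```python
# Chart categories and example chart types, as described in this post.
CHART_CATEGORIES = {
    "composition": ["pie chart", "treemap", "stacked bar chart"],
    "distribution": ["histogram", "boxplot"],
    "relationship": ["scatterplot", "bubble chart"],
    "comparison": [
        "regression chart",
        "Pareto chart",
        "bump chart",
        "stacked column chart",
    ],
}


def suggest_charts(category):
    """Return candidate chart types for a category name (case-insensitive)."""
    key = category.strip().lower()
    if key not in CHART_CATEGORIES:
        raise ValueError(f"Unknown category: {category!r}")
    return CHART_CATEGORIES[key]


print(suggest_charts("Distribution"))
```

A lookup like this does not pick the graph for you. It only shortens the list once you have answered why the user is looking at the data.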

Related Article: Stop Torturing Your Data and Other Tips to Reveal True Data Insights

Know How Your Data and Color Convey Information

A graph’s success depends in part on how much cognitive load it creates for the audience. Cognitive load refers to the amount of information a brain can process at any given time. So you want to ensure graphical elements combine to tell the clearest story with the least effort on the part of the viewer.

For example, bar charts and pie charts can equally show a composition of data, but bar charts are better at displaying unit differences. Those differences are important for showing the precision of comparison. Instead of saying there’s a 20% increase in organic traffic, for example, you need a bar chart that shows that 20% increase. With just one look, your end user can easily absorb the change.

In the chart below you can clearly see there were few vehicles with rear-wheel drive (r) compared to all-wheel drive (4) or front-wheel drive (f) vehicles.

[Figure: chart of vehicle counts by drivetrain]
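A chart like this starts with a simple count per category. The sketch below shows one way to do that in Python; the list of drivetrain codes is made up for illustration (it is not the dataset behind the chart above), the function names are hypothetical, and the plotting helper assumes Matplotlib is installed.

```python
from collections import Counter

# Hypothetical drivetrain codes, one per vehicle:
# "4" = all-wheel drive, "f" = front-wheel drive, "r" = rear-wheel drive.
drivetrains = ["f", "4", "f", "r", "4", "f", "4", "f", "4", "f"]


def count_by_category(values):
    """Count observations per category, largest count first."""
    return Counter(values).most_common()


def save_bar_chart(counts, path="drivetrain_counts.png"):
    """Draw the counts as a single-color bar chart (requires Matplotlib)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file; no display needed
    import matplotlib.pyplot as plt

    labels = [label for label, _ in counts]
    heights = [count for _, count in counts]
    fig, ax = plt.subplots()
    ax.bar(labels, heights, color="steelblue")
    ax.set_xlabel("Drivetrain")
    ax.set_ylabel("Number of vehicles")
    fig.savefig(path)
    plt.close(fig)


counts = count_by_category(drivetrains)
print(counts)
```

Sorting the bars by count before calling `save_bar_chart(counts)` is a design choice: it lets the viewer read the ranking at a glance instead of comparing bar heights one by one.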

A good visualization focuses on accuracy when indicating measurements. Heat maps can show gradient changes, but can be a poor choice for accuracy when the audience wants to understand distinct numeric differences between elements. For example, if a one- or two-degree temperature change has significance for your subject, you need to pick a graph that highlights when that difference appears.

Color is another element to consider. Sticking with a single color and using shades to indicate visual distinction lowers the cognitive load. Also consider accessibility concerns, such as color-blind users, when selecting your color scheme. A second color is acceptable for highlighting a specific dimension so it stands out against the other dimensions in a bar graph. Two colors are well suited to graphs that show two divergent extremes, such as a heatmap. You often see this in correlation charts, like the one below, to indicate the strength of correlation for observations.

[Figure: correlation chart]
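To make the two-color idea concrete, here is a minimal sketch that computes Pearson correlations for a few metrics and maps them onto a diverging red-to-blue scale. The metric names and values are invented for illustration, the function names are hypothetical, and the plotting helper assumes Matplotlib is installed.

```python
import math

# Hypothetical weekly marketing metrics (illustrative values only).
metrics = {
    "ad_spend": [100, 150, 200, 250, 300],
    "sessions": [1100, 1500, 2100, 2400, 3000],
    "bounce_rate": [0.62, 0.58, 0.55, 0.51, 0.47],
}


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlation_matrix(columns):
    """Return column names and the matrix of pairwise correlations."""
    names = list(columns)
    matrix = [[pearson(columns[a], columns[b]) for b in names] for a in names]
    return names, matrix


def save_heatmap(names, matrix, path="correlation_heatmap.png"):
    """Draw the matrix with a diverging colormap (requires Matplotlib)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file; no display needed
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    # Fixing the limits at -1 and 1 keeps zero correlation at the
    # neutral midpoint of the red-to-blue scale.
    image = ax.imshow(matrix, cmap="RdBu", vmin=-1, vmax=1)
    ax.set_xticks(range(len(names)))
    ax.set_xticklabels(names, rotation=45, ha="right")
    ax.set_yticks(range(len(names)))
    ax.set_yticklabels(names)
    fig.colorbar(image, ax=ax, label="Pearson correlation")
    fig.tight_layout()
    fig.savefig(path)
    plt.close(fig)


names, matrix = correlation_matrix(metrics)
for name, row in zip(names, matrix):
    print(name, [round(value, 2) for value in row])
```

Calling `save_heatmap(names, matrix)` writes the image. One color marks strong negative correlation and the other strong positive, so the viewer only has to learn a single two-ended scale.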

But there are limits to how many colors can be assigned in some composition graphs. Usually six to eight colors is a good ballpark for showing meaningful difference across multiple dimensions or metrics. More than that introduces too much granularity. The resulting visualization crowds graph visuals together and makes distinctions hard to see.

If you must show more than eight different dimensions with distinct colors, a treemap is a better choice. A treemap is a diagram of nested rectangles displayed as a hierarchy according to the value of the given data. The area of each rectangle corresponds to the numeric value of its data. The sizes make the scale of each datapoint clear to see, with color scales providing further distinction, all within a constrained display space.

In addition, advanced visualization platforms like Tableau and Google Data Studio have options for querying data subsets from data sources. This gives you additional color and visual choices to tell your data story.

Related Article: How to Effectively Use Google Analytics TreeMap Reports

Pick Visualizations That Fit Your Timeline or Location

The next visualization choice relates to displaying how data evolves over time. Relationship graphs usually work well here, such as line charts, which can show a comparison over time, or regression charts of data changes over a set period. But you may have to display lengthy time periods to show an important, albeit slowly evolving, trend.

This is where programming languages like R and Python can help. Libraries, which are packages of code added for functionality, offer visualization choices so the user can annotate graphs and create animations that display how data changes over time. Often the data is read into the program, then mapped into visual graphs using the library. Python users have a choice of libraries, such as Matplotlib and Seaborn, while R users have access to ggplot2, a library based on the grammar of graphics, a concept in which each graph element is added or removed as a layer to provide customization options.

The advantage of these libraries is you can build custom visuals to suit your needs, using scripts that call real-time data through an API. This allows graphs to remain up-to-date with the newest information.
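The read-then-map workflow might look roughly like this in Python. This is a sketch under stated assumptions: the CSV content, column names, function names and output filename are all hypothetical, and the plotting helper assumes Matplotlib is installed.

```python
import csv
import io

# Hypothetical monthly data; in practice this would come from a file
# or a platform export rather than an inline string.
RAW_CSV = """month,organic_sessions
2023-01,1000
2023-02,1050
2023-03,1100
2023-04,1320
2023-05,1400
"""


def read_series(text):
    """Parse CSV text into parallel lists of month labels and session counts."""
    rows = list(csv.DictReader(io.StringIO(text)))
    months = [row["month"] for row in rows]
    sessions = [int(row["organic_sessions"]) for row in rows]
    return months, sessions


def largest_increase(values):
    """Return the index of the point with the biggest rise over the previous one."""
    changes = [later - earlier for earlier, later in zip(values, values[1:])]
    return changes.index(max(changes)) + 1


def save_annotated_line(months, sessions, path="organic_sessions.png"):
    """Draw a line chart and annotate the largest monthly gain (requires Matplotlib)."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file; no display needed
    import matplotlib.pyplot as plt

    positions = list(range(len(months)))
    peak = largest_increase(sessions)
    fig, ax = plt.subplots()
    ax.plot(positions, sessions, marker="o")
    ax.set_xticks(positions)
    ax.set_xticklabels(months)
    ax.set_ylabel("Organic sessions")
    # The annotation is a separate layer placed on top of the line.
    ax.annotate(
        "Largest monthly gain",
        xy=(peak, sessions[peak]),
        xytext=(peak - 2, sessions[peak] + 50),
        arrowprops={"arrowstyle": "->"},
    )
    fig.savefig(path)
    plt.close(fig)


months, sessions = read_series(RAW_CSV)
print(months[largest_increase(sessions)])
```

Calling `save_annotated_line(months, sessions)` writes the chart to disk. Keeping the parsing step separate from the drawing step means the same plotting function can be reused when the data later comes from a different source.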

These are also useful for spatial visualizations such as geolocation graphs. Data is mapped to a location of interest, adding another consideration for displaying information. Libraries for both Python and R offer options for visual maps and graph combinations.

Ask How Frequently Graph Updates Are Needed

Does a graph need to be updated on a regular basis to monitor ongoing performance or is it needed for a one-time analysis? The answer dictates what kind of workflow works best.

Real-time graphs are usually coupled with cloud-based dashboards to manage the data and visuals. For example, in R programming, you can create a Shiny app, a simple web application that allows data, program results and graphs to appear in a shared digital environment. A Shiny app can be hosted as a dashboard that updates the visualizations instantly when data is called. Moreover, you can add HTML features like buttons and sliders to allow your audience to adjust a display without touching the data or the underlying code.

Ultimately you must outline the reporting schedule that best addresses what your audience needs from the data. Doing so will highlight the steps necessary to deliver your graphs and see what impacts decisions. Sometimes there are technical reasons for adjusting the timeline. Many times people prefer a static image or are limited to an image if the graph is for printed material. Mapping raw data to visuals raises the question of what access to data sources is needed to feed the graphs. If the graph is updated regularly, then you need an easy means of updating the data and associated annotations.

Related Article: What Is Tableau? How BI Inspires Growth

A Few Last Tips on Picking Good Visualizations

Ultimately a good visualization selection will make your analytics clear. As I mentioned in 10 Mistakes to Avoid When Rethinking Your Analytics Strategy, you want to avoid broad questions that spiral into a long, dull narrative about your data. That leads to no meaningful conclusions about your marketing efforts.

If you have a lot of critical material but know stakeholders don’t have a lot of time, you can place those visuals in an appendix so recipients can review details when it’s convenient. You can discover a few more rules of thumb in my visualization post.  

Picking good visuals to tell a story puts your marketing analysis into focus. A strong visualization will open up discussions among your audience about takeaways that move your customer experiences, and your organization, forward.

Pierre DeBois is the founder of Zimana, a small business digital analytics consultancy. He reviews data from web analytics and social media dashboard solutions, then provides recommendations and web development action that improves marketing strategy and business profitability.


