Stop Visualizing Data!

You work in a small company that has a program to help consumers manage their health. Your basic product involves a mobile app for tracking daily events and a personalized dashboard. For a monthly subscription users can also get access to coaching and other resources.

There’s a meeting with a potential investor on the calendar and you want to use data to support your story that things are going well. So, you open up Excel and start digging through the data you have.

Finding the Story

You got some nice local news coverage back in March and you signed your first partnership in June, both of which resulted in a spike of app downloads. So, you look at that.

1-downloads

Well, that’s something, but it doesn’t really communicate the excitement of the last few months. You remember that a lot of those downloads in the spring never turned into even free accounts. So, you decide to look at new accounts instead of downloads.

2-accounts

That looks more like what you were expecting. Whereas the app downloads spiked in March, the new accounts hit a peak in July. Comparing the two graphs, you become curious as to how many new accounts were linked to the news coverage and the partnership, so you draw another graph.

3-new-accounts-source

This view makes it clear that by the time the July peak hit, the effect of the news story had died. The big spike in July was just the partnership. You kind of knew this, but it’s the first time you’ve seen a picture of it, which is pretty cool.

You remember that your company has a 20% download-to-account conversion target, and you want to see how many of these months hit that. This seems like a good situation for a scatter plot:

4-scatter

Wow. Comparing against the diagonal line that represents the 20% target, you can see July and August blew it away, while March and April didn’t even come close.
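The comparison against the diagonal boils down to one ratio per month. Here is a minimal sketch in Python. The monthly counts are made-up placeholders (the actual figures are not given here), and `months_vs_target` is a hypothetical helper name.

```python
# Hypothetical monthly figures: (app downloads, new accounts). Not real data.
MONTHLY = {
    "March": (5000, 400),
    "April": (3000, 300),
    "July": (2000, 700),
    "August": (1800, 540),
}

TARGET = 0.20  # the 20% download-to-account conversion target


def months_vs_target(monthly, target=TARGET):
    """Return {month: (conversion_rate, met_target)} for each month."""
    result = {}
    for month, (downloads, accounts) in monthly.items():
        # Guard against a month with no downloads at all.
        rate = accounts / downloads if downloads else 0.0
        result[month] = (rate, rate >= target)
    return result


for month, (rate, met) in months_vs_target(MONTHLY).items():
    status = "above" if met else "below"
    print(f"{month}: {rate:.0%} ({status} target)")
```

On a scatter plot with downloads on the X axis and new accounts on the Y axis, the target is the line y = 0.2x, so any month plotted above that line beat the target.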

You note another promising detail on the spreadsheet. Not only are accounts up, but the percentage of accounts that are paid subscriptions is rising as well. This is good for revenue, which investors obviously care about.

5-percentage-paid

You wonder how many of the paid accounts come from the new partnership, so you look at that.

6-number-paid-by-source

Clearly, the partnership has been a great thing for your company. Armed with these insights you put together a nice summary in dashboard form for your investor. You add a few other interesting tidbits (you know from your market researcher that about two-thirds of your paid account holders are women) to make it visually interesting.

7-dashboard

When you walk a few of your colleagues through it you get some nice comments (this is the first time some of them have seen all this information together like this), but when you present it the following day, your potential investor squints at the wall and tries to figure out what’s going on.

Visualize Situations, Not Data

When you start by looking at the data you have and concentrate on how to draw a picture of it, it’s easy to lose track of the message. Overwhelming your audience with data is an easy trap to fall into. The person crafting a dashboard (or an article, or a presentation, or a web page) knows the content backwards and forwards and can unconsciously assume that the audience is on the same page.

A graph is a picture of a situation. The trick to creating a good one is to start by identifying a situation that your audience cares about. In some cases, you may already know what that is: your investors probably care more about revenue (and projected growth) than they do about specific conversion rates.

8-revenue

This graph describes a situation that investors will understand: Revenue is going up due to a partnership, and more partnerships and more revenue are on the way.

Often you won’t know what situations your audience cares about, even when you think you do. A clinician who is monitoring a heart failure population may not need to know about her patient’s every movement but does care if he has become less active over the past few days. A credit card customer looks at a breakdown of his purchases out of idle curiosity, but what he really wants to know is how he can maximize the frequent flyer miles he earns by using his card. A patient doesn’t understand what her deductible is, but she does want to know which insurance plan is going to cost her less over the coming year.

It’s not fair to throw data at people and expect them to decode it. Just as with any design, effective data visualization requires you to understand the situations that are significant to your audience. By starting there, you can use data to describe something they will care about.

Useful Data Visualizations

There are a lot of ways you can visualize data, and there’s no shortage of best practices out there for making your charts and graphs. Best practices only take you so far, however. I’ll be talking about how a user’s context and goals inform useful data visualizations at the UXPA’s February meeting, next Thursday, February 11. Event details are here. Also: jokes. Hope to see you there!

View Jeff’s SlideShare Presentation

Visualize Nothingness

By Jeff Harrison

It’s an exciting time to be me! If this email I got from LinkedIn is any guide, my career is about to really take off.

linkedin

Also, this email from my bank shows my rewards balance on this credit card remains at an all-time high. (I don’t know what “Earn More Mall Earnings” means but as someone who lives within a hypothetical short drive of the Mall of America I’m pretty stoked.)

rewards

To top it off, according to this visualization in ClassDojo, my kid is rocking Spanish class. The chart helps me see that all of the feedback from his teacher is positive.

class dojo

All these displays have one thing in common: underwhelming data. I do not actively promote my profile on LinkedIn [edited to add link to LinkedIn profile], and my son’s Spanish teacher never got into the habit of using ClassDojo to communicate with parents. I never signed up for the rewards program for which I receive the monthly grid of zeroes above; they just started showing up in my email a year or two ago. (The program is attached to an overdraft protection feature that Wells Fargo couldn’t figure out how to implement without issuing me a second debit card, which I routinely cut in half each time I get a new one.)

It’s easy to imagine the design reviews for these interfaces. Colorful charts! Insights! Engagement! When there’s a match between the data in these displays and what customers care about optimizing, magic happens: think of all the Fitbit users who consult their apps to monitor their steps and optimize their day for physical activity. The data contributes to a feedback loop, and more people take the stairs. However, when there’s a mismatch the displays aren’t motivating. They just feel kind of lame.

Do your user research. Get it right. And stop sending me notifications that suggest my life is somehow disappointing. Because LinkedIn and my mom would both tell you different:

allstar

The "So What": Telling Stories with Data


When we review a presentation before showing it to a client, someone often asks about the “so what.” Your findings seem reasonable, but so what? What do you want your audience to learn? What action should they take as a result?

If there are charts and graphs in the presentation, those also have to support the “so what.” You can’t just look at data and pick the optimal visualization. You’ll get better results if you first figure out the point you’re making, then design the graph as a supporting illustration.

The Storytelling with Data blog, written by Cole Nussbaumer, is not the only place to learn about good data visualization practices, but it has a more holistic view of communication than many. Yes, Cole talks about the pros and cons of bars, lines, and (shudder) pies, but she goes beyond that to discuss titles, labels, and other accompanying text, and how the shapes and the words come together to make meaning happen.

One of the reasons I like Cole’s blog is that she trades in small data. Her examples tend to feature manageable data sets that might inspire normal people to whip up a graph. So, when she announced a visualization challenge a couple of weeks ago the goal was not to inspire the kind of kinetic sculpture that big companies use to brand themselves as innovators. It was simply to improve upon a set of world population graphs published by The Economist.

While the challenge was to remake the visuals, the biggest problem with the original was the lack of a clear point. The text accompanying the graph was a laundry list of observations:

“The number of people will grow from 7.3 billion to 9.7 billion in 2050, 100m more than was estimated in the UN’s last report two years ago. More than half of this growth comes from Africa, where the population is set to double to 2.5 billion. Nigeria’s population will reach 413m, overtaking America as the world’s third most-populous country. Congo and Ethiopia will swell to more than 195m and 188m respectively, more than twice their current numbers. India will surpass China as the world’s most populous country in 2022, six years earlier than was previously forecast. China’s population will peak at 1.4 billion in 2028; India’s four decades later at 1.75 billion. Changes in fertility make long-term projections hard, but by 2100 the planet’s population will be rising past 11.2 billion. It will also be much older. The median age of 30 will rise to 36 in 2050 and 42 in 2100—the median age of Europeans today. A quarter of Europe’s people are already aged 60 or more; by 2050 deaths will outnumber births by 32m. The UN warns that only migration will prevent the region’s population from shrinking further.”

What I get from this is, there are going to be a lot more people. Okay. So what?

To me, the big story in the data was the massive projected growth of Africa, and the problems that it could spell for billions of people. Here’s what I made (click to embiggen):

swdChallenge

There were other interesting observations that could have been made, but I picked the story that seemed compelling to me and focused on that. The great thing, though, was that other people chose different points (for example, the shifting makeup of the world’s overall population, or how the projections fit with historical trends), and most of them improved on the original in some way. The variety of stories and approaches is on display in the round-up published yesterday, along with Cole’s critique.

Data can clarify, illustrate, and convince, but it doesn’t speak for itself. If you want it to support what you want to say, you have to figure out the “so what” first.

Miles to Go Before I Sleep

By Jeff Harrison

Graphs are pictures that help humans make comparisons and spot patterns in data. Using size, color, position, and other visual cues, a good graph is designed to support this analysis. Designing a good graph means going beyond simply dropping your data into a pair of axes. Making sure that the right comparisons are easy can make all the difference for your audience.

Here’s an example. My car claims to tell me how many miles I can drive before refueling:

gauge

I have 260 miles left on this tank. That’s enough to reach Des Moines—and freedom!

The number isn’t exact. To start with, it’s always a multiple of ten miles. This would still be precise enough to be useful to me if it were reliable, but the number also jumps around a bit. Sometimes it goes down by twenty miles when I’ve only driven five, and occasionally it actually goes up.

This variation makes sense when you think about it. Cars have different fuel efficiency ratings for city driving and highway travel. For most cars (mine included) the city number is smaller because every time you stop at an intersection you lose your momentum and have to burn extra gas to get up to speed again. (The city number is higher in hybrids, which use the battery for driving at lower speeds.) The projection’s uneven progress toward zero could mean that it’s based in part on my recent fuel usage. And, indeed, the manual confirms that the projection is based on fuel consumption over the last 19 miles.
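One plausible way such a projection could be computed is to multiply the fuel left in the tank by the average efficiency over a recent window of driving, then round to the nearest ten miles. This is an assumption for illustration, not the manufacturer's documented algorithm. The function name and all numbers in this Python sketch are hypothetical.

```python
def projected_range(fuel_gallons, recent_miles, recent_gallons_used, step=10):
    """Estimate remaining miles from recent fuel efficiency.

    Assumption: range = fuel left * (recent miles / recent fuel used),
    rounded to a multiple of `step`. A real car may do this differently.
    """
    if recent_gallons_used <= 0:
        raise ValueError("need a positive amount of recently used fuel")
    mpg = recent_miles / recent_gallons_used
    return int(round(fuel_gallons * mpg / step)) * step


# Same fuel in the tank, different recent driving (hypothetical numbers):
city = projected_range(10.0, recent_miles=19, recent_gallons_used=0.95)     # 20 mpg
highway = projected_range(10.0, recent_miles=19, recent_gallons_used=0.63)  # about 30 mpg
print(city, highway)
```

Under this assumption, a stretch of highway driving raises the projection even though no gas has been added, which would explain a number that occasionally goes up.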

That’s good to know, but it doesn’t tell me how far off the prediction is likely to be at any given moment. To put some numbers on it, I decided to collect some data. I reset my trip odometer to zero when I filled the gas tank. Then, every time I arrived at a destination, I recorded how many miles I had traveled, and what my projected remaining miles were. Then I dropped this data into a line graph:

remaining-miles

The initial estimate when I filled the tank that time was 430 miles. Practical considerations prevented me from driving until I ran out of gas, but it looks like I would have landed somewhere near there, or possibly a little higher. The gas light came on at 373 miles, when the projection read 60; it’s slightly off the trend because it came on between stops and I took note of the numbers at that point.

Make It Easier

As expected, the projection trend is not a straight line, but it’s kind of hard to estimate the variability with confidence. In order to do so you have to compare it against an imaginary diagonal line that represents what an ideal projection would look like.

Realizing this, I looked for a way to make it easier. Humans are better at making comparisons against a horizontal baseline than an angled one. Adding the miles already driven to each projected remaining value lets us plot projected total miles on the Y axis, which gives us that comparison:

total-miles

This graph makes it easier to judge the vertical distance from valleys to nearby peaks. The trend’s range—the vertical variation—over the left half of the graph is 60-70 miles. We don’t know what the correct final number would have been, but the spread alone is enough to convince me not to pay too much attention to this number when my tank is even half full.
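The transformation behind this graph is a per-row sum. The sketch below is in Python. Apart from the two readings quoted in the text (430 projected miles at fill-up, and 60 projected miles at 373 miles driven), the log entries are invented for illustration, and the function names are hypothetical.

```python
# (miles driven, projected miles remaining) recorded at each stop.
# Only the first and last rows come from the text; the rest are made up.
LOG = [
    (0, 430),
    (45, 370),
    (110, 370),
    (180, 250),
    (260, 180),
    (373, 60),
]


def projected_totals(log):
    """Add miles already driven to each projection to get projected total miles."""
    return [(driven, driven + remaining) for driven, remaining in log]


def spread(totals):
    """Vertical range of the projected totals: highest minus lowest."""
    values = [total for _, total in totals]
    return max(values) - min(values)


totals = projected_totals(LOG)
print(totals)
print("spread:", spread(totals))
```

A perfectly stable projection would produce the same total on every row, which is the horizontal baseline the eye compares against.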

Further Analysis

There is variation on the right side of the trend, too. These more gradual changes are much easier to detect in this version than in the first one. They suggest that the projection is still varying with my fuel consumption, but that the variation is less when the tank is emptier. This would make sense because having three gallons of gas gives you a smaller range of possible outcomes than does a full tank. However, a more gradual slope could also indicate a different driving pattern. Perhaps this last section contained trips that were more efficient than my typical errand-running but less efficient than my commute.

To shed some light on that, I looked back at my trip log and added an overlay to my trend to call out segments that were predominantly highway driving:

converging-estimate

This graph illustrates that highway driving reliably made the projection rise. In the first half of the experiment, highway trips pushed the projection to about 480 miles. Segments comprising many short errands and school drop-offs brought the projection back down. The graph also shows that I actually drove more highway miles during the second half of the tank, but that these periods of (presumed) higher fuel efficiency failed to bring the projection up to its earlier highs. The variation during the last rise is only about 30 miles. It’s difficult to measure this variation in the original graph, but in this one it’s easy.

In the end I decided to disable the projection, though I sometimes turn it back on when the gas light comes on and it’s not convenient for me to stop at a gas station right away. Another thing I learned from the manual is that the light is designed to come on when there are 2.6 gallons remaining in the tank, which tells me I do have some breathing room there. This may someday get me into trouble, but for now I’m seeing it as a win, at least until I’m stranded in the woods with promises to keep.