Donations vs. Disease

Randy Krum from Cool Infographics writes in:

In August, published The Truth about the Ice Bucket Challenge and included an infographic (“Where We Donate vs. Diseases That Kill Us”) that used proportionally sized circles as its data visualization. The problem with this design was that the circle sizes didn’t match the values shown. This is a false visualization that significantly exaggerated the differences between the amounts of money contributed to each charity and between the deaths attributed to each cause. The designer made the mistake of scaling the diameter of each circle to the data instead of the area, which sizes the circles dramatically wrong.
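As a rough illustration of the sizing mistake described above, here is a minimal Python sketch. The function names and the sample numbers are hypothetical and are not taken from the infographic or the spreadsheet; the only assumption is that the largest value is drawn with a chosen maximum radius and every other circle is scaled relative to it.

```python
import math


def radius_for_area(value, max_value, max_radius):
    """Radius that makes the circle's AREA proportional to the value."""
    return max_radius * math.sqrt(value / max_value)


def radius_for_diameter(value, max_value, max_radius):
    """Radius that makes the DIAMETER proportional to the value (the mistake)."""
    return max_radius * (value / max_value)


def drawn_area_share(radius, max_radius):
    """Area of a circle as a fraction of the largest circle's area."""
    return (radius / max_radius) ** 2


# Hypothetical values for illustration only (not the infographic's data).
values = {"Cause A": 100.0, "Cause B": 25.0, "Cause C": 4.0}
max_value = max(values.values())
max_radius = 50.0  # arbitrary drawing units

for name, value in values.items():
    good = radius_for_area(value, max_value, max_radius)
    bad = radius_for_diameter(value, max_value, max_radius)
    print(
        f"{name}: true share {value / max_value:.2%}, "
        f"area-scaled circle shows {drawn_area_share(good, max_radius):.2%}, "
        f"diameter-scaled circle shows {drawn_area_share(bad, max_radius):.2%}"
    )
```

With these made-up numbers, a value that is one quarter of the largest gets a circle covering one quarter of the largest circle's area when scaled by area, but only one sixteenth of it when scaled by diameter, because the error is squared.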

To demonstrate, I designed a corrected version of the infographic and posted it on Cool Infographics®, which you can see here side by side with the original. To stay close to the original, I made only three changes: corrected the circle sizes, eliminated the color legend, and added connecting lines to help readers make direct comparisons.


The Google Docs spreadsheet of the original data and correct circle area and diameter calculations is available here. To their credit, has also published a corrected version of the infographic in the original article.

The first step was to get the bubble chart data visualization correct. Now that we have an infographic that matches the data presented, we can step back and ask the hard questions.

  • Is a bubble chart the best way to visualize this information?
  • Is this the right data to show when comparing money raised to deaths by diseases?

From Twitter, @indented recreated the visual as a scatterplot using HighCharts to more clearly show the large differences. (The interactive is available here.)


Jon Schwabish (@jschwabish) also created a scatterplot version, but changed the data to compare individual fundraising events to National Institutes of Health funding, and then sized the bubbles by the number of deaths. (The interactive is available here.)


We (myself, @indented, and @jschwabish) had an interesting Twitter discussion about this visualization and the challenges of using the various data sources.

[Images: screenshots of the Twitter conversation, parts 1 to 3]

Looking for other options and opinions about how this data can be improved or visualized better…


Additional resources:

This Bubble Chart Is Killing Me, David Mendoza

Where Should Our Money Go?, Aneesh Karve

One of the worst infographics ever, but people don’t care?, Phil Price

Ice buckets, research and the cost of disease, Scienceogram UK

NIH Spending Versus Diseases That Kill Us, Mohammed AlQuraishi

10 thoughts on “Donations vs. Disease”

  1. I love this blog. I wish I had more time to participate!

    Did problems with using area to encode information come up in the discussions? The answer to the question about the appropriateness of bubble charts is a qualified no. If accurate visual comparison is the primary goal, area is a bad parameter to use.

    People are always going to be drawn to bubble charts. They are visually engaging and do a reasonably good job of communicating gross differences, but we are not very good at comparing areas.

    I wrote about this recently as part of an effort to talk a collaborator out of relying on area as the primary metric in a project we are working on. Alberto Cairo also touches on this issue in The Functional Art.

  2. I think you are mish-mashing the data a little bit. When looking at it, the bigger picture really should be – what is the ratio of dollars to deaths? It sounds a little morbid and could probably be wordsmithed to find the appropriate wording, but it should be something like this attached file. Good data viz shouldn’t be an eye chart, and shouldn’t take time to figure out.

    PS – The numbers have likely changed a bit since the Ice Bucket Challenge started. Last I heard (before the challenge became over-viral), it was at $100M already.

  3. I completely agree that area is a less precise visualization method, and, as in the Twitter discussion Jon captured above, I have heard many people suggest that a bubble chart might not be the best method here. Area visualizations fall roughly in the middle of the scale of Accurate Judgements included in Alberto Cairo’s book (Fig. 6-12 on p. 120).

    However, in this specific case, I think circles do the job really well. This graphic is not intended to show super-precise comparisons. Instead, it’s trying to show the vast differences between values, and I think it succeeds in that. There are large differences between money raised and deaths from breast cancer, heart disease, prostate cancer and motor neuron disease (including ALS). Differences large enough that people can easily understand the comparisons using area. Lining up the matching circles could make that comparison easier for readers instead of sorting each column in descending order.

    Having said that, I’d like to see alternative designs from people here on HelpMeViz. Do the scatterplots above work better? Would a simple bar chart be more impactful?

    • Setting aside the question as to how appropriate a bubble chart is, I wonder about the focus on area as the proper parameter with which to encode information. Given:

      – The goal of the figure is to highlight differences in the amount of money donated
      – We are not good at estimating relative differences in area of two circles
      – specifically we tend to UNDERESTIMATE differences when comparing areas

      Why use area? It deemphasizes the differences. Would the original figure have passed muster (and been effective) if the authors had stayed with diameter and explicitly stated that the height of the circles showed values in $?

  4. Even though I don’t think the fundraiser-deaths question is particularly interesting, and some of these categories are mixed (e.g., Cancer as a whole instead of Breast Cancer and Prostate Cancer separately), why not just make a simple slope chart?

    Randy got at this a bit, in his remake, but it still depended on circle size. I didn’t clean this up much at all, but you get the idea.

  5. The last one looks like a sparkline, which would indicate a time-series analysis. I don’t think it’s obvious what you are looking at.

  6. As mentioned, areas are not the best option. Also, when you are forced to color code each category for differentiation, this is usually a sign that the visualization is not optimal. Colors can also introduce perception errors due to optical effects: in the example above, yellow (AIDS) looks bigger than pink (ALS).

    IMHO, a slopegraph (line chart) is not a good idea. It should be used for time series (comparing two or more points in time) and is not very suitable for comparing two measures. Lines always suggest trends…

    So, how about a bar (“butterfly”) chart – simple, understandable, no colors, no lie factor. The only color code needed is Death = black:

  7. Actually, if we apply a scaling of, e.g., 1 death = $1,000, the message becomes quite powerful (the construction of the bar chart is also a bit more enhanced this time):

    • @Andrej @John I disagree that the slopechart looks like a sparkline. Sparklines are small line or area charts. I also disagree that it looks like a line graph. It simply connects the two points. There are lots of great examples of slopecharts out there.

      I’m not arguing it’s the best approach for these data, but I think it’s erroneous to call it a line chart or sparklines.
