Data Visualization
Charts can be more memorable, shareable, and quickly understood than written explanations. They help explore data, explain concepts, and share information effectively. Clear visuals strengthen Communication and Dashboards.
Why Visualize Data
- Visualizing data helps spot patterns, trends, and unusual data points that are hard to see in averages or summaries alone. A chart can reveal what an aggregate hides (e.g: Anscombe’s quartet).
- Diagrams can explain concepts faster than text. A few-second visual can replace a long, confused explanation.
- A good chart communicates faster than 1,000 words, but that power comes with responsibility. Misleading charts spread just as easily as accurate ones.
- Plotting data helps spot potential errors and artefacts before publishing.
Chart Type Selection
- Pick the chart type and metric that answers the exact question; rates, counts, and shares reveal different truths, so show multiple small views when one cut feels incomplete.
- Use familiar or practical units (minutes, not standard deviations) when possible. They’re easier to interpret and sense-check.
Clarity
- Keep labels horizontal and close to the data. Direct labels beat legends.
- Don’t include a legend, instead, label data series directly in the plot area (usually to the right of the most recent data point). Exception: many categories referring to many elements (e.g., maps).
- Use small multiples when too many lines overlap. Splitting into panels makes individual trends easier to follow, though it trades off direct comparison between entities.
- Sort categories logically (inherent order) or alphabetically (easier to skim).
- Data looks better naked.
- Reduce non-data-ink as much as possible without losing communicative power.
- Don’t include more precision than needed.
- Format axis labels to match the figures being measured (e.g., currency for dollars).
- Look at axis label spacing and increase intervals if crowded.
Color
- Match colors to concepts (plants → green, bad → red) so readers aren’t forced into a Stroop test.
- Use color-blind friendly palettes. About 4-5% of the population has some form of color blindness.
- Direct labeling also helps color-blind readers distinguish categories.
Axes
- Start your y-axis at zero (assuming no negative values).
- Avoid deceptive scale tricks.
- Leave breathing room on axes instead of extreme zoom.
- The lowest point shouldn’t appear to be the lowest possible value.
- Pair relative effects with absolute numbers (or prediction intervals instead of confidence intervals) to show real-world risk.
Context
- Include explanations for anomalous events directly on the graph.
- For unfamiliar chart types, guide readers with annotations. Add a mini-tutorial if needed.
- Include targets as asymptotes to help audiences see if you’re on track.
- Make the chart standalone. Add purpose, units, timeframe, and source so it can travel without losing meaning and slot into Dashboards or memos without extra explanation.
- Titles for graphs should be the conclusion or key takeaway.
- Always note the data source below the graph.
Reproducibility
- Publish provenance with the chart (data source, assumptions, and ideally a link to code) so others can verify or reuse it and keep Data Practices consistent.
- A chart with no source isn’t much better than claiming a trend was revealed in a dream.
Common Pitfalls
- Skip arrows or other glyphs that imply trends you can’t support.
- Don’t use 3D charts. They distract and make values harder to read.
- Avoid confidence intervals when showing variability. They’re often misinterpreted as ranges. Consider prediction intervals or underlying percentages instead.
- Try not to have too many data series; 5-8 is the usual limit depending on clustering.
Guiding Questions
Ask yourself when creating a visualization:
- Is my chart type meaningful for the question?
- Can I make it clearer?
- If complicated, can I guide the viewer through it?
- Does the chart work as a standalone?
- Is my chart’s presentation justifiable?
- Is my chart reproducible?
Tools
- Datawrapper. Quick interactive charts with great defaults.
- Raw Graphs. Open-source, unusual chart types.
- Observable Plot. JavaScript-based exploratory charts.
- Kepler. Geospatial visualization.