Awesome Plotly with code series (Part 1): Alternatives to bar charts

A bar chart is not always the best solution.

Oct 18, 2024

Introduction to the Series

Data visualization plays a crucial role in how stories are told and understood. I’ve always been fascinated by the sleek, annotated charts used in data journalism — those visualizations that can instantly communicate complex ideas in a way anyone can grasp.

I have also been deeply inspired by resources like Cole Nussbaumer Knaflic’s book, Storytelling with Data [1], which provides essential best practices for creating clear, impactful visualisations, emphasising a minimalistic approach that strips away unnecessary elements. Her approach is to focus on what truly matters, ensuring the data’s story shines through without distractions.

Then there’s AddTwoDigital, a digital agency working on data storytelling. They have open-sourced a series of blog posts on data visualisation best practices [2], showcasing everything from simple bar charts to more complex, mind-bending infographics. Their content is a treasure trove of inspiration, and there’s always something new to learn from their examples, which you won’t find in typical data visualisation books.

Finally, my go-to tool for creating visualisations is Plotly. It’s incredibly intuitive, from layering traces to adding interactivity. However, whilst Plotly excels at functionality, it doesn’t come with a “data journalism” template that offers polished charts right out of the box.

That’s where this series comes in — I’ll be sharing how to transform Plotly’s powerful features into sleek, professional-grade charts that meet data journalism standards.

PS 1: I have no affiliation with Cole Nussbaumer Knaflic’s book, nor with the AddTwoDigital agency. But their content is my data visualisation bible. I will lean heavily on their ideas and concepts, bringing them to life through Plotly.

PS 2: If there are chart similarities, I have permission from Adam Frost (from AddTwoDigital) to replicate them using Plotly. I believe they use other data visualisation frameworks such as DataWrapper, but nothing related to Plotly.

Beyond the bars: Finding better ways to visualise data

Bar charts are a staple of data visualisation, but there are instances where they might not be the best choice. In this post, we’ll explore two specific cases where bar charts fall short:

Comparing multiple categories — As the number of categories increases, it becomes harder to make accurate comparisons between bars, and the chart can quickly become cluttered.
Time series data — Bar charts are not ideal for showing trends over time, where line or area charts provide a clearer picture of patterns and changes.

“How to read” the blog

Each example in this blog will follow a consistent structure:

Theory: A brief introduction to the data visualization concept.
plotly.express demo. A super basic chart generated with Plotly Express, showcasing what you get out of the box.
plotly.graph_objects refinement. Tailoring the chart to achieve the desired result. There are a lot of manual refinements, but the results are amazing compared to plotly.express.
Code and links to my GitHub repository will be provided at the end of the blog.

Let’s get started!

Too many categories = comparison confusion

Bar charts are one of the most widely used chart types because they excel at comparing discrete categories of data. However, if a bar chart contains too many categories, our brains have a hard time processing all the information it holds. This hinders the chart’s ability to tell a good story.

Theory: how does the human brain process bar charts.

Scanning and comparing sizes. The human brain is exceptionally good at comparing lengths and heights. In fact, length is a pre-attentive visual attribute, which is another way of saying that our brains need almost no “brain power” to process it. Our eyes naturally scan the tops of the bars and quickly estimate their relative heights. This top-to-top comparison allows us to instantly grasp which categories have higher or lower values, without needing to read exact numbers (reading numbers is not a pre-attentive visual attribute).
Pattern recognition. The brain is wired to recognize patterns. In bar charts, it looks for recurring visual cues, like clusters of similarly sized bars or stark contrasts between heights. It also processes visual elements in sequence, moving from left to right (or top to bottom). This scanning makes it easy to mentally sort and compare categories in order, especially when they’re arranged logically (e.g., by size or time).
Contrast sensitivity. Our brains are also highly sensitive to contrast, which helps with distinguishing the differences between bars. For example, if one bar is significantly taller than the others, this visual contrast instantly draws attention, making the outlier or key data point easy to spot.

Theory: how many bars is too “many”?

The best practitioners in the industry talk about a maximum of 12 bars.

For example, Cole Nussbaumer recommends avoiding bar charts with more than 10–12 categories. Stephen Few, in his book Show Me the Numbers [3], suggests keeping the number of bars under 15 in most cases to avoid cognitive overload.

But, what happens if you do need more than these 12 bars?

A couple of options are: (1) breaking the data into multiple charts (as suggested by Cole N.) or (2) using dot plots (as suggested by Stephen F.).

In my opinion, there are cases where more than 12 bars in a single plot can still work, namely when the audience already thinks of the categories as one familiar group: for example, the 24 hours in a day, the 27 countries in the EU or, being a bit extreme, the 50 states in the US.

The plotly.express example — 27 bars in a bar chart

Let’s take the situation where you actually need to display more than 12–15 bars. Imagine you are presenting to an expert audience, who would expect all 27 European Union countries to be represented in the chart. With plotly.express you can build the chart below in very few lines.

Where do I think this plot has issues?

Given how wide the x-axis is (27 categories), my eyes always have to travel back to the left to read the y-axis value for countries such as “IE” or “PL”.
I can cluster some countries as high and low, but there are countries I can’t distinguish from each other.
Axis titles need cleaning up, removing or even rotating (the y-axis title requires me to become the exorcist child…)
There is an “Avg.” column, but it is difficult to pick out.
Title and subtitle are too small to read.

The plotly.go example(s) — an improved bar chart

Keeping the spirit of wanting to display a bar chart, we can make tweaks to the bar chart to make it more readable. Check the proposed new bar chart layout.

Why do I think this plot is better?

Title and subtitle are bigger and stand out from the rest of the chart text.
Moving the y-axis title slightly above the plot and left-aligning it with the main title makes it easier to read. In addition, we reduced clutter by eliminating the x-axis title.
The “Avg” column is clear and visible. It also helps separate the countries above and below the average. Another option would have been to eliminate this bar and simply colour the bars depending on whether they are above or below the average.
Data labels have been added to each bar, making it easy to distinguish between closely sized bars. In addition, there is no need to scan left and right looking for the y-axis values.
Visual cues like emojis for the country ISO codes can also help.
A “source” annotation with a link is added. This is always best practice so that anyone can open the detailed data/notebook/report.
An image can also help set the context. In this case, it is clear we are talking about the European Union countries. Moreover, we have used the EU flag’s blue RGB code to colour the bars.

Introducing the dot plot chart: a cleaner alternative for comparing categories

When you have multiple categories to compare, a dot plot can offer a cleaner and more efficient alternative to bar charts. Instead of using bars to represent values, dot plots use simple points along a continuous scale, reducing visual clutter and making it easier to focus on the actual data. With a dot plot, comparisons become more straightforward because the eye can quickly assess the position of each dot along the axis without being distracted by the varying lengths of bars. Here is a screenshot of a possible dot plot.

Why do I think this plot is better?

Dot plots guide the viewer’s focus directly to the data point (the dot), rather than the full length of a bar, allowing for cleaner, less cluttered comparisons.
A horizontal line representing the average, combined with coloured dots, is another way to clearly divide countries. We could have used the same approach in the bar chart, but I feel that colouring big bars could have been more distracting than colouring dots.

Tips on how to create this plot

(1) Coloured dots are represented with a scatter plot.

# One marker per country: yellow when above the average, EU blue otherwise
fig.add_trace(
    go.Scatter(
        x=df_filtered['ISO_Code_with_emoji'],
        y=df_filtered['y2023'],
        mode='markers+text',  # draw the dot and its value label
        marker=dict(
            color=['rgb(255, 204, 0)' if y > avg_value else 'rgb(0, 51, 153)'
                   for y in df_filtered['y2023']],
            size=18
        ),
        text=df_filtered['y2023'].round(1),
        textposition='top center'
    )
)

(2) To add vertical lines joining the dots with their x-axis category, you can do it by adding shapes

# One vertical grey line per country, from the x-axis up to its dot
for _, row in df_filtered.iterrows():
    fig.add_shape(
        type='line',
        x0=row['ISO_Code_with_emoji'],
        x1=row['ISO_Code_with_emoji'],
        y0=0,
        y1=row['y2023'],
        line=dict(color='lightgrey', width=2),
        layer="below",  # keep the line behind the marker
    )

(3) To add the horizontal line, instead of using shapes, use add_hline

fig.add_hline(
    y=avg_value,
    line=dict(color='grey', dash='dot', width=0.5),
    annotation_text=f'Average: {avg_value:.1f}%',
    annotation_position='top right',
    layer="below",
)

(4) Spend time on updating the layout:

Provide margin space to include annotations
Use annotations to represent the subtitle, the y-axis title just above the chart and the source. The key elements are xref="paper", yref="paper" and then tweaking the x and y values.
Hide the y-axis.

fig.update_layout(
  title=dict(
      text='The tourism trap',
      y=1, x=0,
      xanchor='left', yanchor='top',
      font=dict(family="Helvetica Neue", size=24),
  ),
  annotations=[
      # First paragraph annotation
      dict(
          text="10 countries are more heavily reliant on international tourism than the global average",
          xref="paper", yref="paper",
          x=0, y=1.18,
          showarrow=False,
          font=dict(size=18),
          align="left"
      ),
      # Second paragraph annotation
      dict(
          text="Travel and tourism share of GDP in the EU-27 and the UK in 2023",
          xref="paper", yref="paper",
          x=0, y=1.07,
          showarrow=False,
          font=dict(size=14),
          align="left"
      ),
      # Footer annotation
      dict(
          text="Source: Statista, <a href='https://www.statista.com/statistics/1228395/travel-and-tourism-share-of-gdp-in-the-eu-by-country/'>Share of travel and tourism's total contribution to GDP report</a>",
          xref="paper", yref="paper",
          x=0, y=-0.19,
          showarrow=False,
          font=dict(size=12),
          align="left"
      )
  ],
  images=[
      dict(
          source="https://upload.wikimedia.org/wikipedia/commons/b/b7/Flag_of_Europe.svg",
          xref="paper", yref="paper",
          x=1, y=1.05,
          sizex=0.2, sizey=0.2,
          xanchor="right", yanchor="bottom",
      )
  ],
  font=dict(family="Helvetica Neue"),
  yaxis=dict(title="", visible=False),
  xaxis=dict(title="",
             showline=True,
             linecolor='lightgrey',
             linewidth=3,
             type='category',
             ),
  margin=dict(t=100, pad=0),
  height=600,
  width=1000,
)

Bar charts are not a great choice for timeseries

Using bar charts for timeseries takes the number of categories to the extreme. Remember we said that 12–15 bars was a sensible upper limit? We managed to create a good bar chart for 27 categories, but that worked because the audience already thinks of the European Union as a group of 27 countries. When working with timeseries, we are dealing with dozens or hundreds of bars.

Theory: how does the human brain expect to process timeseries information.

The goal for displaying time series data is to (1) highlight trends over time, so viewers can easily see patterns like growth, decline, or seasonality, and (2) show smooth transitions between data points, ensuring that changes are clearly visualized, whether they are gradual or sudden. Bar charts, by their nature, focus on discrete comparisons rather than the continuous flow of data, which makes them less suited for time series visualisations.

Tracking changes and trends over time. The human brain is naturally wired to track movement and detect changes, which is why we expect time series data to show a continuous flow of information. Rather than focusing on individual data points, the brain looks for overall trends — whether the data is increasing, decreasing, or fluctuating. This is why line charts help mentally “connect the dots” and see how one data point leads to the next.
Pattern recognition. Just as with spotting clusters of bars, the brain excels at recognising patterns, but here it’s specifically looking for regular intervals and cyclical movements. For example, if there’s a recurring pattern in the data, such as a weekly sales spike or seasonal change, the brain quickly latches onto these cycles.
Smoothing out noise and focusing on trends. In time series data, small fluctuations (noise) can distract from the overall trend. The human brain tends to focus on the bigger picture, naturally smoothing out minor variations and highlighting significant changes. This is why continuous visualizations like line charts work better for time series: they guide the brain to focus on overall movement rather than individual data points. Bar charts, by emphasizing discrete bars, fail to smooth out this noise, making it harder for viewers to see the overarching story.

The plotly.express example — a wall of colour blinds the eye

It takes some time to adjust and detect a few valleys in the timeseries trend.

Where do I think this plot has issues?

So many bars… make it a solid block of blue colour.
No emphasis on specific periods.
Clutter where the trends fluctuate a lot (see years 1900 to 1960)

The plotly.go example(s) — an improved bar chart

As before, let’s present a possible improved bar chart.

Why do I think this plot is better?

Axis titles are cleaned up.
Data source added.
Clear distinction on real vs predicted life expectancy through the use of 2 different shades of blue.
Highlighting the big fluctuations on life expectancy at the beginning of the 1900s through annotations. These annotations even have a link to navigate to Wikipedia and learn more about the period.

Line chart — the de-facto chart for timeseries

Now, the “pro” version. Line charts are what we should generally be building for timeseries data.

Why do I think this plot is better?

First of all, the wall of blue colour is not there. This means that our eyes don’t have to actively cancel this visual noise to concentrate on other areas of the chart.
Lines help the eye avoid being overwhelmed by fluctuations. For example, check the years 1900 to 1915. With a line chart, your brain picks up a cyclical increase, but with the bar chart, that was very difficult.
I have added visual cues such as thick markers and thick lines in the years I reference in the annotations. This helps directly tie the text with the specific section of the lines. With the bar chart, you were looking at a rough area, but not able to pin-point specific years.

Tips on how to create this plot

(1) To differentiate the real vs predicted lines, add 1 trace for the main line chart and 1 trace for the predicted line chart.

# Observed values: solid line up to and including 2023
prior_2022_country_data = (df[df['country'] == country]
                           .query("year <= 2023"))

fig.add_trace(go.Scatter(
    x=prior_2022_country_data['year'],
    y=prior_2022_country_data['life_expectancy'],
    mode='lines',
    line=dict(dash='solid', color='rgb(0, 51, 153)'),
    showlegend=False,
))

# Predicted values: dotted, semi-transparent line from 2023 onwards.
# Both slices include 2023, so the two lines join without a gap.
post_2022_country_data = (df[df['country'] == country]
                          .query("year >= 2023"))
fig.add_trace(go.Scatter(
    x=post_2022_country_data['year'],
    y=post_2022_country_data['life_expectancy'],
    mode='lines',
    line=dict(dash='dot', color='rgba(0, 51, 153, 0.5)'),
    showlegend=False,
))

(2) To highlight specific years, add 1 trace per group of years you would like to highlight.

def _highlight_years(df, specific_years):
    """Overlay a thick line with ringed markers on the given years.

    Relies on the `country` and `fig` variables defined earlier.
    """
    specific_years_data = df[(df['country'] == country) & (df['year'].isin(specific_years))]
    fig.add_trace(go.Scatter(
        x=specific_years_data['year'],
        y=specific_years_data['life_expectancy'],
        mode='lines+markers',
        line=dict(width=5.5, color='rgb(0, 51, 153)'),
        marker=dict(
            size=8,
            color='white',  # white fill with a blue outline
            line=dict(color='rgb(0, 51, 153)', width=4)
        ),
        showlegend=False,
    ))

# One call (and therefore one trace) per highlighted period
_highlight_years(df, [1917, 1918, 1919, 1920, 1921, 1922])
_highlight_years(df, [1933])  # a single year shows up as a marker only
_highlight_years(df, [1941, 1942, 1943, 1944, 1945])

Summary

In summary, choosing the right visualisation for your data can significantly enhance clarity and engagement. While bar charts have their place, dot plots and line charts often offer cleaner, more intuitive alternatives for comparing categories or visualising time series.

Where can you find the code?

In my repo and the live Streamlit app:

Acknowledgements

[1] Cole Nussbaumer Knaflic. Storytelling with Data (book).
[2] AddTwoDigital agency. Not too many bars (article).
[3] Stephen Few. Show Me the Numbers (book).
