Bubble chart with R

Cedric Vidonne

Lei Chen

A bubble chart displays multi-dimensional data in a two-dimensional plot. It can be considered a variation of the scatterplot in which the dots are replaced with bubbles. Unlike a scatterplot, which encodes only two variables on the X and Y axes, a bubble chart can encode a third variable in the size of each bubble and a fourth in its colour.

More about: Bubble chart
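As a minimal sketch of that four-variable mapping in ggplot2, the snippet below uses a small made-up data frame. The column names `x_value`, `y_value`, `magnitude` and `group` are hypothetical and the values are illustrative only, not UNHCR figures. Position carries two variables, bubble size the third and colour the fourth.

```r
library(ggplot2)

# Hypothetical toy data, for illustration only (not UNHCR figures)
toy <- data.frame(
  x_value = c(1, 2, 3, 4),
  y_value = c(2, 1, 4, 3),
  magnitude = c(10, 40, 20, 30),
  group = c("a", "b", "a", "b")
)

# x and y give the position, size the third variable, colour the fourth
p <- ggplot(toy, aes(x = x_value, y = y_value)) +
  geom_point(aes(size = magnitude, colour = group), alpha = 0.6) +
  # scale_size() scales bubble area, so the smallest value is drawn
  # at size 4 and the largest at size 16
  scale_size(range = c(4, 16))

p
```

Because `scale_size()` scales area rather than radius, a value twice as large is drawn with roughly twice the bubble area, which tends to be easier to read than a doubled radius.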


Bubble chart

# Loading required packages
library(unhcrthemes)
library(tidyverse)
library(scales)
library(ggrepel)

# Loading data
df <- read_csv("https://raw.githubusercontent.com/GDS-ODSSS/unhcr-dataviz-platform/master/data/correlation/bubble.csv")

# Plot
ggplot(df, aes(
  x = refugee_number,
  y = idp_number
)) +
  geom_point(aes(size = total_number),
    color = unhcr_pal(n = 1, "pal_blue"),
    alpha = 0.6
  ) +
  scale_size(
    range = c(4, 16),
    name = "Total population",
    labels = label_number(scale_cut = cut_short_scale()),
    breaks = c(8e6, 10e6, 12e6)
  ) +
  geom_text_repel(aes(label = region),
    size = 8 / .pt
  ) +
  labs(
    title = "Comparison of refugee and IDP population by region | 2021",
    y = "Number of IDPs",
    x = "Number of refugees",
    caption = "Source: UNHCR Refugee Data Finder\n© UNHCR, The UN Refugee Agency"
  ) +
  scale_x_continuous(labels = label_number(scale_cut = cut_short_scale())) +
  scale_y_continuous(
    labels = label_number(scale_cut = cut_short_scale()),
    breaks = pretty_breaks(n = 6)
  ) +
  coord_cartesian(clip = "off") +
  theme_unhcr(
    grid = "XY",
    axis = FALSE,
    axis_title = "xy",
    legend_title = TRUE
  )

A bubble chart showing comparison of refugee and IDP population by region | 2021


Bubble chart with colours

# Loading required packages
library(unhcrthemes)
library(tidyverse)
library(scales)
library(ggrepel)

# Loading data
df <- read_csv("https://raw.githubusercontent.com/GDS-ODSSS/unhcr-dataviz-platform/master/data/correlation/bubble.csv")

# Order regions for visualization
df$region <- factor(df$region,
  levels = c(
    "East and Horn of Africa and Great Lakes",
    "Southern Africa",
    "West and Central Africa",
    "Americas",
    "Asia and the Pacific",
    "Europe",
    "Middle East and North Africa"
  )
)

# Plot
ggplot(
  df,
  aes(
    x = refugee_number,
    y = idp_number
  )
) +
  geom_point(
    aes(size = total_number, color = region),
    alpha = 0.6
  ) +
  scale_size(
    range = c(4, 16),
    name = "Total population",
    labels = label_number(scale_cut = cut_short_scale()),
    breaks = c(8e6, 10e6, 12e6)
  ) +
  geom_text_repel(aes(label = region),
    size = 8 / .pt
  ) +
  labs(
    title = "Comparison of refugee and IDP population by region | 2021",
    y = "Number of IDPs",
    x = "Number of refugees",
    caption = "Source: UNHCR Refugee Data Finder\n© UNHCR, The UN Refugee Agency"
  ) +
  scale_x_continuous(labels = label_number(scale_cut = cut_short_scale())) +
  scale_y_continuous(
    labels = label_number(scale_cut = cut_short_scale()),
    breaks = pretty_breaks(n = 6)
  ) +
  scale_color_unhcr_d(
    palette = "pal_unhcr_region",
    guide = "none"
  ) +
  coord_cartesian(clip = "off") +
  theme_unhcr(
    grid = "XY",
    axis = FALSE,
    axis_title = "xy",
    legend_title = TRUE
  ) +
  guides(size = guide_legend(override.aes = list(shape = 21)))

A bubble chart showing comparison of refugee and IDP population by region | 2021
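One detail of the region ordering step is worth checking: `factor()` silently turns any value that does not exactly match one of the supplied `levels` into `NA`, and those rows then lose their colour and label. Discrete colour scales in ggplot2 generally assign colours in level order, so the order given to `levels` also matters. The sketch below illustrates both points with a hypothetical stand-in vector (including a deliberately misspelled entry) rather than the real `df$region` column.

```r
# Hypothetical stand-in for df$region; the real column comes from bubble.csv
region <- c("Europe", "Americas", "Southern Africa", "Europa")

# Desired display order (a shortened list of levels, for brevity)
region_levels <- c("Southern Africa", "Americas", "Europe")

region_f <- factor(region, levels = region_levels)

# Values that did not match any level became NA; list them for review
unmatched <- unique(region[is.na(region_f)])
unmatched

# Levels keep the order they were given in, not alphabetical order
levels(region_f)
```

Running a check like this on the real data before plotting would surface any region name that differs from the spelling used in the `levels` vector.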

