Boxplot

The boxplot uses boxes and lines to show the distributions of one or more groups of numeric data based on a five-number summary of the data points: the upper extreme (maximum), upper quartile (Q3), median, lower quartile (Q1), and lower extreme (minimum).

More about: Boxplot
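
The five-number summary described above can be computed directly in base R. The following is a minimal sketch on hypothetical ages (not the UNHCR dataset). It also shows the 1.5 × IQR whisker rule that `geom_boxplot()` applies by default, which means the whiskers reach the true minimum and maximum only when no point is flagged as an outlier. Note that `fivenum()` uses Tukey's hinges, which may differ slightly from the quartiles ggplot2 computes.

```r
# Hypothetical ages, for illustration only (not taken from the UNHCR dataset)
age <- c(2, 5, 9, 14, 18, 23, 27, 31, 36, 44, 85)

# Five-number summary: minimum, lower hinge (Q1), median, upper hinge (Q3), maximum
five <- fivenum(age)
names(five) <- c("min", "q1", "median", "q3", "max")
print(five)

# Whisker ends and outliers under the 1.5 * IQR rule
stats <- boxplot.stats(age, coef = 1.5)
print(stats$stats)  # lower whisker, Q1, median, Q3, upper whisker
print(stats$out)    # points drawn individually as outliers
```

Here the value 85 lies more than 1.5 × IQR above the upper hinge, so the upper whisker stops at 44 and 85 would be drawn as a separate point.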

Basic boxplot

# Loading required packages
library(unhcrthemes)
library(tidyverse)

# Loading data
df <- read_csv(
  "https://raw.githubusercontent.com/GDS-ODSSS/unhcr-dataviz-platform/master/data/distribution/boxplot.csv"
)

# Plot
ggplot(df, aes(x = country, y = age)) +
  geom_boxplot(
    fill = unhcr_pal(n = 1, "pal_blue"),
    alpha = 0.3,
    color = "grey60",
    width = .5
  ) +
  labs(
    title = "Refugees age distribution by country of asylum | 2020",
    y = "Age",
    caption = "Source: UNHCR Refugee Data Finder<br>© UNHCR, The UN Refugee Agency"
  ) +
  scale_y_continuous(
    expand = expansion(c(.02, .1)),
    breaks = seq(0, 100, 20)
  ) +
  theme_unhcr(grid = "Y", axis_title = "y")

Grouped boxplot

# Loading required packages
library(unhcrthemes)
library(tidyverse)

# Loading data
df <- read_csv(
  "https://raw.githubusercontent.com/GDS-ODSSS/unhcr-dataviz-platform/master/data/distribution/boxplot.csv"
)

# Plot
ggplot(df, aes(x = country, y = age, fill = gender)) +
  geom_boxplot(
    alpha = 0.4,
    color = "grey60",
    width = .5
  ) +
  labs(
    title = "Refugees age distribution by country of asylum | 2020",
    y = "Age",
    caption = "Source: UNHCR Refugee Data Finder<br>© UNHCR, The UN Refugee Agency"
  ) +
  scale_y_continuous(
    expand = expansion(c(.02, .1)),
    breaks = seq(0, 100, 20)
  ) +
  scale_fill_unhcr_d(nmax = 4, order = c(4, 1)) +
  theme_unhcr(grid = "Y", axis_title = "y")

Histogram

A histogram displays the distribution of data over a continuous interval or specific time period. The height of each bar indicates the frequency of data points that fall within the corresponding interval (bin).

More about: Histogram - Other tutorials: Matplotlib, D3
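
To make the idea of bins and frequencies concrete, the counting behind a histogram with 5-year bins starting at 0 can be sketched in base R. The ages below are hypothetical, and the sketch assumes right-closed intervals, which is ggplot2's default as far as its documentation describes (`closed = "right"`); it is an illustration of the binning idea, not a reproduction of ggplot2's internals.

```r
# Hypothetical ages, for illustration only (not taken from the UNHCR dataset)
ages <- c(0, 3, 5, 7, 12, 12, 18, 21, 25, 33)

# Bin edges every 5 years, starting at 0 (mirrors binwidth = 5, boundary = 0)
breaks <- seq(0, 35, by = 5)

# Assign each age to a bin; intervals are closed on the right,
# and the first bin also includes its lower edge
bins <- cut(ages, breaks = breaks, right = TRUE, include.lowest = TRUE)

# Frequency per bin: these counts correspond to the bar heights
counts <- table(bins)
print(counts)
```

An age of exactly 5 falls in the first bin under this convention; with left-closed intervals it would move to the second, so the choice of boundary handling can shift counts for values that sit on a bin edge.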

# Loading required packages
library(unhcrthemes)
library(tidyverse)
library(scales)

# Loading data
df <- read_csv(
  "https://raw.githubusercontent.com/GDS-ODSSS/unhcr-dataviz-platform/master/data/distribution/histogram.csv"
)

# Plot
ggplot(df, aes(x = poc_age)) +
  geom_histogram(
    fill = unhcr_pal(n = 1, "pal_blue"),
    binwidth = 5,
    boundary = 0
  ) +
  labs(
    title = "Age distribution | 2020",
    x = "Age",
    y = "Number of people",
    caption = "Source: UNHCR Refugee Data Finder<br>© UNHCR, The UN Refugee Agency"
  ) +
  scale_x_continuous(breaks = scales::pretty_breaks(n = 10)) +
  scale_y_continuous(expand = expansion(c(0, 0.1))) +
  theme_unhcr(grid = "Y", axis = "x")

Population pyramid

A population pyramid consists of two back-to-back histograms, one for each gender (conventionally, males on the left and females on the right), with population numbers shown horizontally (x-axis) and age vertically (y-axis).

More about: Population pyramid - Other tutorials: Matplotlib, D3
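
The left/right layout comes from a simple data trick: one gender's values are negated so its bars extend left of zero, while the labels are built from the unsigned values. A minimal sketch in base R, using a hypothetical data frame whose column names (`ages`, `male`, `female`) mirror those in the plot code and whose values are assumed to be proportions of the total population:

```r
# Hypothetical shares by age group (proportions of the total population);
# values are illustrative, not taken from the UNHCR dataset
pyramid <- data.frame(
  ages   = c("0-4", "5-11", "12-17", "18-59", "60+"),
  male   = c(0.08, 0.11, 0.07, 0.21, 0.02),
  female = c(0.08, 0.10, 0.07, 0.23, 0.03)
)

# Mirror the male values so their bars extend left of zero
pyramid$x_male <- -pyramid$male
pyramid$x_female <- pyramid$female

# Build labels from the unsigned values so no minus sign is displayed
pyramid$label_male <- sprintf("%.0f%%", 100 * abs(pyramid$x_male))
pyramid$label_female <- sprintf("%.0f%%", 100 * pyramid$x_female)

print(pyramid)
```

The plot code achieves the same effect inline by mapping `x = -male` in one `geom_col()` layer and formatting the labels with `percent()`.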

# Loading required packages
library(unhcrthemes)
library(tidyverse)
library(scales)

# Loading data
df <- read_csv(
  "https://raw.githubusercontent.com/GDS-ODSSS/unhcr-dataviz-platform/master/data/distribution/population_pyramid.csv"
)

# Plot
ggplot(df, aes(y = ages)) +
  geom_col(aes(x = -male, fill = "Male"), width = .7) +
  geom_col(aes(x = female, fill = "Female"), width = .7) +
  geom_text(
    aes(
      x = -male,
      label = percent(abs(male))
    ),
    hjust = 1.25,
    size = 10 / ggplot2::.pt
  ) +
  geom_text(
    aes(
      x = female,
      label = percent(female)
    ),
    hjust = -0.25,
    size = 10 / ggplot2::.pt
  ) +
  labs(
    title = "Demographics of forcibly displaced people | 2020",
    caption = "Note: figures do not add up to 100 per cent due to rounding<br>Source: UNHCR Refugee Data Finder<br>© UNHCR, The UN Refugee Agency"
  ) +
  scale_x_continuous(expand = expansion(c(.2, .2))) +
  scale_fill_manual(
    values = setNames(
      unhcr_pal(n = 4, "pal_unhcr")[c(1, 4)],
      c("Male", "Female")
    )
  ) +
  theme_unhcr(grid = FALSE, axis_title = FALSE, axis_text = "Y") +
  theme(legend.key.size = unit(.75, "line"))

© UNHCR