Histogram with Python

Cedric Vidonne

Lei Chen


A histogram displays the distribution of data over a continuous interval or a specific time period. The height of each bar indicates how many data points fall within that interval (bin). It is a useful tool for identifying where values are concentrated and whether the dataset contains extreme values or gaps.
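To make the idea of bins concrete, here is a minimal sketch in plain Python that counts how many values fall into each of a fixed number of equal-width bins. The `bin_counts` helper and the `ages` list are hypothetical and only illustrate the principle. Plotting libraries do this step for you.

```python
def bin_counts(values, num_bins):
    """Count how many values fall into each of num_bins equal-width bins.

    Assumes the values contain at least two distinct numbers,
    so that the bin width is not zero.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_bins
    counts = [0] * num_bins
    for v in values:
        # The last bin is closed on the right, so the maximum lands in it.
        idx = min(int((v - lo) / width), num_bins - 1)
        counts[idx] += 1
    edges = [lo + i * width for i in range(num_bins + 1)]
    return counts, edges


ages = [2, 5, 7, 18, 21, 24, 25, 33, 36, 40]  # hypothetical sample ages
counts, edges = bin_counts(ages, 4)
print(counts)  # one count per bin; each count becomes a bar height
print(edges)   # num_bins + 1 boundaries between the bins
```

Each entry in `counts` corresponds to the height of one bar, and `edges` marks where each bar starts and ends on the x-axis.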



# import libraries
import matplotlib.pyplot as plt
import pandas as pd

#load data set
df = pd.read_csv('https://raw.githubusercontent.com/GDS-ODSSS/unhcr-dataviz-platform/master/data/distribution/histogram.csv')

#compute data array for plotting
x = df['poc_age']
num_bins = 25

#plot the chart
fig, ax = plt.subplots()
histo = ax.hist(x, num_bins)

# set x- and y-axis limits
ax.set_xlim(0, 100)
ax.set_ylim(0, 35)

#set chart title
ax.set_title('Age distribution | 2020')

#set axis label
ax.set_ylabel('Number of people')

#set chart source and copyright
plt.annotate('Source: UNHCR Refugee Data Finder', (0,0), (0, -25), xycoords='axes fraction', textcoords='offset points', va='top', color = '#666666', fontsize=9)
plt.annotate('©UNHCR, The UN Refugee Agency', (0,0), (0, -35), xycoords='axes fraction', textcoords='offset points', va='top', color = '#666666', fontsize=9)

# adjust chart margin and layout
fig.tight_layout()

# show chart
plt.show()
A histogram showing age distribution | 2020
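When `ax.hist` receives an integer number of bins, it splits the observed data range into that many equal-width bins, so the bin boundaries rarely fall on round numbers. As a hedged variation, the sketch below passes explicit bin edges so that each bar covers a five-year age group. The `ages` list is hypothetical sample data, not the UNHCR dataset, and the five-year width is an assumption chosen for illustration.

```python
import matplotlib.pyplot as plt

# Hypothetical sample ages, used here instead of the CSV file
ages = [1, 3, 4, 9, 12, 17, 18, 25, 31, 44, 59, 60, 72]

# Explicit bin edges: 0, 5, 10, ..., 100 (20 bins of five years each)
bin_edges = list(range(0, 101, 5))

fig, ax = plt.subplots()

# hist returns the per-bin counts, the bin edges, and the drawn bars
counts, edges, patches = ax.hist(ages, bins=bin_edges)

ax.set_xlim(0, 100)
ax.set_xlabel('Age')
ax.set_ylabel('Number of people')
fig.tight_layout()
```

Call `plt.show()` or `fig.savefig(...)` afterwards to display or export the chart. Fixed edges also keep several histograms comparable, because every chart then uses the same age groups.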
