Exporting and visualizing your data from Jira using Python
My current company uses Scrum and Jira, which means keeping track of story points, issues, epics, and more. Although Jira provides a ton of features for visualizing your data, I wanted to see if I could plot my story points along different epics. So, I decided to use Python to do some basic data analysis.
In this article, I will show you how to plot data from your Jira dashboard using Python.
The basic steps for this are:
- Export the data from Jira
- Load the data
- Clean and prepare the data
- Visualize the data
Let’s get to work!
Export the data from Jira
The steps to export your data from Jira are:
1. Go to Reports
2. Go to the sprint report
3. Open the issue navigator from the link in the right corner of the sprint dashboard
4. Click on all issues
5. Click on export
6. Export to CSV
Load the data
Now for the fun part: let’s load the data using Python. For privacy reasons, I changed the data and removed restricted information.
import pandas as pd
df = pd.read_csv("./jira_data.csv")
Clean and prepare the data
To clean up the data, we will select only the relevant columns, drop NaNs and irrelevant rows, and finally validate that the resulting dataframe looks good.
The columns we will need from the CSV are:
- Assignee: the person responsible for the task (in this case me!)
- Sprint: the name of the sprint (in this case, letters in alphabetical order)
- Custom field (Story Points): the story points for each task (this field may have a different name in your export, so look for something like story points)
df = df[["Assignee", "Sprint", "Custom field (Story Points)"]].reset_index(drop=True)
# drop nans from dataframe
df = df.dropna()
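If the story points column has a different name in your export, a small helper can locate it. This is a minimal sketch, assuming the column name contains the text "story points" in some casing. The function name `find_story_points_column` and the sample frame are made up for illustration and are not part of pandas or Jira.

```python
import pandas as pd


def find_story_points_column(columns):
    """Return the first column whose name mentions 'story points' (case-insensitive)."""
    for name in columns:
        if "story points" in name.lower():
            return name
    raise KeyError("no column containing 'story points' was found")


# Made-up frame standing in for a Jira export; a real file has many more columns.
sample = pd.DataFrame(
    {
        "Assignee": ["me", "me"],
        "Sprint": ["A", "B"],
        "Custom field (Story Points)": [3.0, 5.0],
    }
)

points_col = find_story_points_column(sample.columns)
print(points_col)
```

The returned name can then be used in place of the hard-coded "Custom field (Story Points)" string when selecting columns.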
Now I keep only the sprints named A to J and sum the story points per sprint:
# letters from A to J in upper case as a list
cols = list(map(chr, range(65, 75)))
# drop every row whose sprint name is not one of those letters
for col in df["Sprint"].unique():
    if col not in cols:
        df.drop(df[df["Sprint"] == col].index, inplace=True)
df["Sprint"].unique()
# array(['J', 'I', 'H', 'G', 'F', 'E', 'D', 'C', 'B', 'A'], dtype=object)
df_sorted_points = df.groupby(["Sprint"])["Custom field (Story Points)"].sum().reset_index()
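The same filter and aggregation can also be written without a loop. Here is a sketch on made-up data, using a boolean mask built with `Series.isin` in place of the row-by-row drop. The names `toy`, `valid_sprints`, `filtered`, and `points_per_sprint` are illustrative and do not come from the original code.

```python
import pandas as pd

# Made-up rows standing in for the cleaned Jira export.
toy = pd.DataFrame(
    {
        "Assignee": ["me"] * 5,
        "Sprint": ["B", "A", "Backlog", "A", "B"],
        "Custom field (Story Points)": [2.0, 3.0, 8.0, 5.0, 1.0],
    }
)

# Sprint names to keep: upper-case letters A to J.
valid_sprints = [chr(code) for code in range(ord("A"), ord("J") + 1)]

# Keep only rows whose sprint name is in the list (boolean mask, no loop).
filtered = toy[toy["Sprint"].isin(valid_sprints)]

# Sum the story points per sprint; groupby sorts the sprint names by default.
points_per_sprint = (
    filtered.groupby("Sprint")["Custom field (Story Points)"].sum().reset_index()
)
print(points_per_sprint)
```

Because `groupby` sorts its keys by default, the resulting rows come out in alphabetical sprint order, which is the order wanted for the timeline.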
Perfect! Now that we have our data we can start working on visualizing it!
Visualize the data
What I want from this example is simply an ordered timeline of my story points per sprint.
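import matplotlib.pyplot as plt
import seaborn as sns
# one color per sprint
colors = sns.color_palette("hls", len(df_sorted_points["Sprint"].unique()))
plt.title("Story Points per Sprint")
# assumed bar chart: the original call is cut off and only its height= argument survives
plt.bar(x=df_sorted_points["Sprint"],
        height=df_sorted_points["Custom field (Story Points)"],
        color=colors)
plt.show()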
There is a lot one could do with Jira data to improve productivity. In this case, my focus was mostly on getting information about the coherence of my sprint planning.
In the future, I might look at things like the interaction between the number of story points and time spent on tasks.