A 2-hour workshop on moving beyond "making charts" to building narratives that drive decisions.
Presenter: Rose Boudreau — AI Systems Architect at CGI & NSCC Data Analytics Student
Rigorous data hygiene. Checking bias. Questioning “obvious” correlations and digging deeper than the easy narrative.
Color theory, Gestalt principles, visual hierarchy. Reducing cognitive load so the data speaks clearly.
Headlines, not labels. Crafting narratives from data that guide the audience to conclusions with integrity.
2 hours. 4 acts. Two streams: Flourish (drag-and-drop) & Python (code). Choose your path.
The Trap — correlation vs. causation with real macro-economic data.
Dig deeper. Build scatter plots. Question every correlation. Find your own data.
Good vs. bad visualization. Color theory. Chart selection. Design principles.
The Sandpit — tell three contradictory stories from the same dataset.
We show you a chart. You’ll draw a conclusion. And you’ll be wrong. That’s the point.
As ChatGPT usage rises, dev job postings plummet. Obvious conclusion: “AI is killing jobs.” But is it?
The real driver: as interest rates climbed from near 0% to 5.3%, venture capital dried up and hiring froze. Timing ≠ causation.
ChatGPT launched in November 2022, during the same period in which rates spiked. Two things happened simultaneously, but only one caused the hiring freeze. The Scientist asks: “What’s the mechanism?”
Interest rates ↑ → Venture capital dries up → Startups freeze hiring → Big Tech follows with layoffs. ChatGPT? Coincidence, not causation.
tech_hiring_macro_trends.csv. Set Date as X-axis.
Copy and paste this into a Jupyter notebook or Python file. Requires pandas, plotly.
import pandas as pd
import plotly.graph_objects as go
from plotly.subplots import make_subplots
df = pd.read_csv("data/tech_hiring_macro_trends.csv", parse_dates=["Date"])
# THE TRAP: Dev Jobs vs ChatGPT
fig = make_subplots(specs=[[{"secondary_y": True}]])
fig.add_trace(go.Scatter(x=df["Date"], y=df["Dev_Jobs"],
name="Dev Job Postings", line=dict(color="#ef4444", width=3)), secondary_y=False)
fig.add_trace(go.Scatter(x=df["Date"], y=df["ChatGPT_Traffic"],
name="ChatGPT Interest", line=dict(color="#22c55e", width=3, dash="dash")), secondary_y=True)
fig.update_layout(title="The Trap: Did AI Kill the Entry-Level Job?",
template="plotly_dark", paper_bgcolor="#0f172a", plot_bgcolor="#0f172a",
hovermode="x unified", legend=dict(orientation="h", y=-0.15))
fig.show()
# THE TRUTH: Dev Jobs vs Interest Rate
fig2 = make_subplots(specs=[[{"secondary_y": True}]])
fig2.add_trace(go.Scatter(x=df["Date"], y=df["Dev_Jobs"],
name="Dev Job Postings", line=dict(color="#ef4444", width=3)), secondary_y=False)
fig2.add_trace(go.Scatter(x=df["Date"], y=df["Fed_Interest"],
name="Fed Interest Rate (%)", line=dict(color="#3b82f6", width=3, dash="dot")), secondary_y=True)
fig2.update_layout(title="The Truth: Interest Rates Drove the Hiring Freeze",
template="plotly_dark", paper_bgcolor="#0f172a", plot_bgcolor="#0f172a",
hovermode="x unified", legend=dict(orientation="h", y=-0.15))
fig2.show()
Data-driven narratives are not black and white. Good Data Analysts provide nuance, not clickbait.
| Date | Dev_Jobs | ChatGPT_Traffic | Fed_Interest | AI_Jobs |
|---|---|---|---|---|
| 2022-01 | 125.4 | 0 | 0.08 | 45 |
| 2022-11 | 81.5 | 12 | 3.78 | 65 |
| 2023-02 | 68.4 | 100 | 4.57 | 92 |
| 2024-01 | 48.5 | 85 | 5.33 | 150 |
Key observation: between 2022-01 and 2024-01, Dev jobs dropped 61% while AI jobs grew 233%. ChatGPT traffic AND interest rates both correlate with the decline in Dev jobs, but which one has a causal mechanism?
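The 61% and 233% figures can be checked by hand from the first and last rows of the sample table. A minimal sketch, assuming those two rows (2022-01 and 2024-01) are the endpoints being compared; `pct_change` is a helper name introduced here for illustration, not part of the workshop code.

```python
# Values copied from the 2022-01 and 2024-01 rows of the sample table.
start = {"Dev_Jobs": 125.4, "AI_Jobs": 45}
end = {"Dev_Jobs": 48.5, "AI_Jobs": 150}


def pct_change(old, new):
    """Percentage change from old to new, relative to old."""
    return (new - old) / old * 100


dev_change = pct_change(start["Dev_Jobs"], end["Dev_Jobs"])
ai_change = pct_change(start["AI_Jobs"], end["AI_Jobs"])

print(f"Dev jobs: {dev_change:+.0f}%")  # negative means a drop
print(f"AI jobs:  {ai_change:+.0f}%")   # positive means growth
```

Both changes are measured against the starting value, which is why a fall from 125.4 to 48.5 is a 61% drop rather than a 159% one.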
Visualize the dataset using any Flourish chart type of your choosing. Embed your narrative within the title and description.
Find additional data using AI or Google (e.g., VC funding data, NASDAQ index, layoff trackers) from a reputable source. Add it to alter your narrative and show a more complete picture.
import pandas as pd
import plotly.express as px
df = pd.read_csv("data/tech_hiring_macro_trends.csv", parse_dates=["Date"])
# Note: trendline="ols" requires the statsmodels package to be installed.
# Scatter 1: ChatGPT vs Dev Jobs (the misleading correlation)
df_chat = df[df["ChatGPT_Traffic"] > 0]
fig1 = px.scatter(df_chat, x="ChatGPT_Traffic", y="Dev_Jobs", trendline="ols",
title="Dev Jobs vs ChatGPT Traffic — Looks Scary, Right?",
labels={"ChatGPT_Traffic":"ChatGPT Search Interest", "Dev_Jobs":"Dev Jobs (Index)"},
color_discrete_sequence=["#ef4444"], template="plotly_dark")
fig1.update_layout(paper_bgcolor="#0f172a", plot_bgcolor="#0f172a")
fig1.show()
# Scatter 2: Interest Rate vs Dev Jobs (the real driver)
fig2 = px.scatter(df, x="Fed_Interest", y="Dev_Jobs", trendline="ols",
title="Dev Jobs vs Federal Interest Rate — The Real Driver",
labels={"Fed_Interest":"Fed Rate (%)", "Dev_Jobs":"Dev Jobs (Index)"},
color_discrete_sequence=["#3b82f6"], template="plotly_dark")
fig2.update_layout(paper_bgcolor="#0f172a", plot_bgcolor="#0f172a")
fig2.show()
import plotly.graph_objects as go
corr = df[["Dev_Jobs","ChatGPT_Traffic","Fed_Interest","AI_Jobs"]].corr()
fig = go.Figure(data=go.Heatmap(z=corr.values, x=corr.columns, y=corr.columns,
colorscale="RdBu", zmid=0, text=corr.values.round(2), texttemplate="%{text}"))
fig.update_layout(title="Correlation Matrix — Everything Correlates, But Why?",
template="plotly_dark", paper_bgcolor="#0f172a", plot_bgcolor="#0f172a",
width=600, height=500)
fig.show()
Run the scatter plots above. Which correlation looks “stronger”? Does strength of correlation prove causation?
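To put a number on "stronger", one option is the Pearson coefficient for each pair. The sketch below is an illustration only: it uses just the four sample rows from the table shown earlier rather than the full CSV, so the coefficients will not match what `df.corr()` reports on the complete dataset. `pearson` is a helper written here in plain Python so the example runs without the data file.

```python
from math import sqrt

# Only the four sample rows from the table; the full CSV has more months.
dev_jobs = [125.4, 81.5, 68.4, 48.5]
chatgpt = [0, 12, 100, 85]
fed_rate = [0.08, 3.78, 4.57, 5.33]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


r_chat = pearson(chatgpt, dev_jobs)
r_fed = pearson(fed_rate, dev_jobs)

print(f"ChatGPT traffic vs Dev jobs: r = {r_chat:.2f}")
print(f"Fed rate vs Dev jobs:        r = {r_fed:.2f}")
```

On these four rows both coefficients come out strongly negative, with the interest-rate one closer to -1. Neither number says anything about mechanism: a coefficient measures how tightly two series move together, not why.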
Use pandas_datareader or find a CSV of VC funding data to add a third variable. Does it strengthen the interest rate narrative?
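One way to test whether a third variable accounts for a correlation is a partial correlation: fit each of the two original variables against the third, then correlate what is left over. The sketch below uses SYNTHETIC data generated on the spot, not the workshop CSVs, and the names `z`, `x`, `y`, `pearson`, and `residuals` are hypothetical. By construction `z` drives both `x` and `y`, and `x` has no direct effect on `y`.

```python
import random
from math import sqrt

random.seed(42)  # fixed seed so the sketch is repeatable

# SYNTHETIC data: z is a shared driver, x and y each follow z plus noise.
n_points = 2000
z = [random.gauss(0, 1) for _ in range(n_points)]
x = [zi + random.gauss(0, 0.5) for zi in z]
y = [-zi + random.gauss(0, 0.5) for zi in z]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    var_x = sum((a - mean_x) ** 2 for a in xs)
    var_y = sum((b - mean_y) ** 2 for b in ys)
    return cov / sqrt(var_x * var_y)


def residuals(ys, xs):
    """What remains of ys after a simple least-squares fit on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
             / sum((a - mean_x) ** 2 for a in xs))
    intercept = mean_y - slope * mean_x
    return [b - (intercept + slope * a) for a, b in zip(xs, ys)]


raw_r = pearson(x, y)                                     # ignores z
partial_r = pearson(residuals(x, z), residuals(y, z))     # controls for z

print(f"raw correlation of x and y:       {raw_r:.2f}")
print(f"after removing the effect of z:   {partial_r:.2f}")
```

In this constructed example the raw correlation is strongly negative and shrinks to roughly zero once `z` is removed. Whether the real hiring data behaves the same way is exactly what the exercise asks you to find out; the synthetic result proves nothing about it.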
The right chart, the right color, the right focus. Design choices are editorial choices.
Example: A well-designed bubble chart showing GDP vs Happiness with Internet Penetration as bubble size.
global_development_2024.csv.
Choose ANY chart type in Flourish. Use global_development_2024.csv. Apply at least 3 of the 5 Artist’s Principles above. Make it beautiful AND meaningful.
import pandas as pd
import plotly.express as px
df = pd.read_csv("data/global_development_2024.csv")
fig = px.scatter(df, x="GDP_Per_Capita_USD", y="Happiness_Score_0_10",
size="Internet_Usage_Pct", color="Region", hover_name="Country",
title="Does Money Buy Happiness? (Bubble = Internet Penetration)",
labels={"GDP_Per_Capita_USD":"GDP Per Capita (USD)",
"Happiness_Score_0_10":"Happiness (0-10)"},
size_max=40,
color_discrete_map={"Europe":"#3b82f6","Asia":"#f97316",
"North America":"#e31937","South America":"#22c55e",
"Africa":"#eab308","Oceania":"#a855f7"},
template="plotly_dark")
fig.update_layout(paper_bgcolor="#0f172a", plot_bgcolor="#0f172a")
fig.show()
import plotly.graph_objects as go
df_u = df.sort_values("Unemployment_Rate", ascending=False)
colors = ["#e31937" if r > 10 else "#334155" for r in df_u["Unemployment_Rate"]]
fig = go.Figure(go.Bar(x=df_u["Country"], y=df_u["Unemployment_Rate"],
marker_color=colors, text=df_u["Unemployment_Rate"].round(1),
textposition="outside"))
fig.update_layout(
title="Unemployment Crisis: Nigeria, Afghanistan & Costa Rica Above 10%",
template="plotly_dark", paper_bgcolor="#0f172a", plot_bgcolor="#0f172a",
yaxis_title="Unemployment Rate (%)",
xaxis=dict(tickangle=-45, tickfont=dict(size=10)))
fig.show()
fig = px.choropleth(df, locations="Country", locationmode="country names",
color="CO2_Emissions_Per_Capita", hover_name="Country",
color_continuous_scale="YlOrRd",
title="CO2 Per Capita: North America Leads (Not a Good Thing)",
labels={"CO2_Emissions_Per_Capita":"CO2 (tonnes)"},
template="plotly_dark")
fig.update_layout(paper_bgcolor="#0f172a",
geo=dict(bgcolor="#0f172a", lakecolor="#0f172a"))
fig.show()
Using plotly, create a visualization from global_development_2024.csv that applies at least 3 of the 5 Artist’s Principles. Add annotations, custom colors, and a narrative title.
Same dataset. Three contradictory — yet truthful — stories. The Author chooses which truth to amplify.
Example: A grouped bar chart revealing gender pay disparities across departments — the same data, framed by the Author.
None of these stories are lies. But none tell the whole truth. A great Data Professional presents the nuance — not just the narrative that serves their agenda. Your job is to be honest about what the data shows AND what it doesn’t.
workforce_dynamics.csv to Flourish.
Pick ONE of the three personas above. Build the visualization in Flourish that tells their story. Then write a 2-sentence description that a manager would read and act on. Bonus: Can you add a disclaimer about what the data does NOT show?
import pandas as pd
import plotly.graph_objects as go
df = pd.read_csv("data/workforce_dynamics.csv")
promo = df.groupby("Work_Setup").agg(
total=("Promoted_Last_Year","count"),
promoted=("Promoted_Last_Year", lambda x: (x=="Yes").sum())
).reset_index()
promo["Rate"] = (promo["promoted"]/promo["total"]*100).round(1)
fig = go.Figure(go.Bar(x=promo["Work_Setup"], y=promo["Rate"],
marker_color=["#e31937","#f97316","#334155"],
text=[f"{v}%" for v in promo["Rate"]], textposition="outside"))
fig.update_layout(title="RTO Advocate: On-Site Workers Get Promoted More",
template="plotly_dark", paper_bgcolor="#0f172a", plot_bgcolor="#0f172a",
yaxis_title="Promotion Rate (%)", yaxis=dict(range=[0,80]))
fig.show()
import plotly.express as px
fig = px.scatter(df, x="Weekly_Hours", y="Mental_Health_Index",
color="Productivity_Score", size="Salary", hover_name="Emp_ID",
title="Union Rep: High Performers Are Burning Out",
color_continuous_scale="RdYlGn", template="plotly_dark")
fig.update_layout(paper_bgcolor="#0f172a", plot_bgcolor="#0f172a")
fig.add_shape(type="rect", x0=50, x1=66, y0=0, y1=5,
fillcolor="rgba(239,68,68,0.1)", line=dict(color="#ef4444",dash="dash"))
fig.add_annotation(x=58, y=2.5, text="BURNOUT ZONE",
font=dict(color="#ef4444",size=12), showarrow=False)
fig.show()
pay = df.groupby(["Department","Gender"])["Salary"].mean().reset_index()
fig = px.bar(pay, x="Department", y="Salary", color="Gender", barmode="group",
title="DEI Auditor: Gender Pay Gaps Persist Across Departments",
color_discrete_map={"Male":"#3b82f6","Female":"#e31937","Non-Binary":"#a855f7"},
template="plotly_dark")
fig.update_layout(paper_bgcolor="#0f172a", plot_bgcolor="#0f172a",
legend=dict(orientation="h", y=-0.15))
fig.show()
Pick ONE persona. Build the chart in Python. Then modify it to tell the opposite story from the same data. Add annotations explaining what the original chart hid. This is the Author’s superpower: showing both sides.
Everything you need to continue your data storytelling journey.
All datasets are available for download above, or directly from the repo:
data/tech_hiring_macro_trends.csv
data/global_development_2024.csv
data/workforce_dynamics.csv
# Install required packages
pip install pandas plotly jupyter statsmodels
# Start Jupyter Notebook
jupyter notebook
# Or run scripts directly
python pythonSolutions/act1_the_hook.py
python pythonSolutions/act2_the_scientist.py
python pythonSolutions/act3_the_artist.py
python pythonSolutions/act4_the_author.py
🔬 The Scientist
Question every correlation. Look for mechanisms, confounders, and hidden variables. Correlation ≠ causation.
🎨 The Artist
Choose chart types wisely. Use color with purpose. Write headlines, not labels. Reduce clutter ruthlessly.
📝 The Author
Data tells many stories. Choose yours ethically. Show what the data reveals AND what it hides.