Feature request: Diverging stacked bar #3823
Comments
This can likely be achieved in some form through the objects interface, but I am not sure what the data looks like exactly. Could you provide an example of the raw data? |
In my case, with Microsoft Forms, the data is a list of categorical strings: "Not intuitive", "slightly not intuitive", "neutral", "slightly intuitive", "intuitive". The idea behind a diverging stacked bar graph is to show responses on a spectrum while keeping them aligned on the central position. Here is an example pandas data frame that would have such raw data:
import pandas as pd
dataFrame = pd.DataFrame()
# In this case 5 categories
dataFrame["Intuitive"] = [
"Very Intuitive",
"Very Intuitive",
"Very Intuitive",
"Very Intuitive",
"Very Intuitive",
"Slightly Intuitive",
"Slightly Intuitive",
"Slightly Intuitive",
"Slightly Intuitive",
"Neutral",
"Neutral",
"Neutral",
"Slightly Unintuitive",
"Slightly Unintuitive",
"Very Unintuitive"
] |
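The raw responses above still have to be turned into per-category counts in a fixed order before anything can be stacked. A minimal sketch of that step, using only the standard library; the `ORDER` list and the `ordered_counts` helper are hypothetical names introduced here for illustration, not part of pandas or seaborn:

```python
from collections import Counter

# Assumed ordering of the scale, from most negative to most positive.
ORDER = [
    "Very Unintuitive",
    "Slightly Unintuitive",
    "Neutral",
    "Slightly Intuitive",
    "Very Intuitive",
]


def ordered_counts(responses, order=ORDER):
    """Count each response and return the counts in scale order."""
    tally = Counter(responses)
    # Categories nobody picked still get a 0 so the center stays aligned.
    return [tally.get(level, 0) for level in order]


# Same responses as in the data frame column above.
responses = (
    ["Very Intuitive"] * 5
    + ["Slightly Intuitive"] * 4
    + ["Neutral"] * 3
    + ["Slightly Unintuitive"] * 2
    + ["Very Unintuitive"]
)
counts = ordered_counts(responses)
print(counts)
```

With a pandas column, `dataFrame["Intuitive"].value_counts().reindex(ORDER, fill_value=0)` should give the same ordered counts.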
Ah, got you, so grey/Neutral is centered on 0 and the other responses are stacked on the left or right, correct? I will see if I manage to replicate that in a not too contrived way; it does seem to be a pretty specific thing though. |
That is correct. Ideally the X axis would be positive on both sides, as both sides are counts. So far I have not found an easy way to do it without significant fiddling, at least in MATLAB. Here is a micro example (it turns out to be much simpler in matplotlib than in MATLAB):
# importing package
import matplotlib.pyplot as plt
# create data
label = ["test"]
# Counted and ordered values
y = [1, 2, 3, 4, 5]
# plot bars in stack manner
plt.bar(label, y[0], bottom=-y[0]-y[1]-y[2]/2)
plt.bar(label, y[1], bottom=-y[1]-y[2]/2)
plt.bar(label, y[2], bottom=-y[2]/2)
plt.bar(label, y[3], bottom=y[2]/2)
plt.bar(label, y[4], bottom=y[2]/2+y[3])
plt.show() |
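On the point about keeping the axis positive on both sides: one possible approach is to leave the bar positions signed and only change how the tick labels are printed. A small sketch; `abs_tick_label` is a hypothetical helper name, and its `(value, pos)` signature is chosen to match what `matplotlib.ticker.FuncFormatter` passes to its callback:

```python
def abs_tick_label(value, pos=None):
    """Format a tick position as an unsigned count.

    `pos` is unused; it is only there so the function can be
    handed to matplotlib.ticker.FuncFormatter unchanged.
    """
    return f"{abs(value):g}"


print(abs_tick_label(-4.5), abs_tick_label(0), abs_tick_label(3.0))
```

For the vertical bars above, it could be attached with `plt.gca().yaxis.set_major_formatter(FuncFormatter(abs_tick_label))` before `plt.show()`; for horizontal bars, use `xaxis` instead.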
Also, sometimes you want to force the respondent not to be neutral, so the number of categories can be odd or even, which is a slight difficulty in the implementation: you don't center in the same way. With an even number of options there is no neutral choice, and the center is on the point between "slightly not intuitive" and "slightly intuitive". EDIT: Corresponding example:
# importing package
import matplotlib.pyplot as plt
# create data
label = ["test"]
# Counted and ordered values
y = [1, 2, 3, 4, 5, 6]
# plot bars in stack manner
plt.bar(label, y[0], bottom=-y[0]-y[1]-y[2])
plt.bar(label, y[1], bottom=-y[1]-y[2])
plt.bar(label, y[2], bottom=-y[2])
plt.bar(label, y[3], bottom=0)
plt.bar(label, y[4], bottom=y[3])
plt.bar(label, y[5], bottom=y[3]+y[4])
plt.show() |
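The two hand-written examples above (odd and even number of categories) follow the same rule, so the `bottom` values can be computed rather than typed out. A minimal sketch, assuming the counts are already ordered from most negative to most positive; `diverging_bottoms` is a hypothetical helper, not a matplotlib or seaborn function:

```python
from itertools import accumulate


def diverging_bottoms(counts):
    """Return the start position of each stacked segment, centered on 0.

    Odd number of categories: 0 falls in the middle of the central bar.
    Even number of categories: 0 falls on the boundary between the two
    central bars.
    """
    if not counts:
        return []
    ends = list(accumulate(counts))   # running totals
    starts = [0] + ends[:-1]          # where each segment begins
    half = len(counts) // 2
    if len(counts) % 2 == 0:
        center = starts[half]
    else:
        center = starts[half] + counts[half] / 2
    return [start - center for start in starts]


print(diverging_bottoms([1, 2, 3, 4, 5]))
print(diverging_bottoms([1, 2, 3, 4, 5, 6]))
```

Each segment can then be drawn with `plt.bar(label, count, bottom=bottom)` while looping over `zip(counts, diverging_bottoms(counts))`, which reproduces the `bottom=` expressions used in both examples.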
For more examples of such graphs, we can look at population pyramids with 2 or 4 values: https://en.wikipedia.org/wiki/Population_pyramid |
Ok, you can find the result below. I had to create a custom object, which is not officially supported by seaborn currently, so this might break in the future. But hey, it works. Of course, in a real context the class would be in another module; the second group of imports is only necessary for defining the DivergingStack class. You also need a tiny bit of pandas manipulation to get the counts in the form my implementation expects, but it is pretty manageable.
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import seaborn.objects as so
from dataclasses import dataclass
from functools import partial
from pandas import DataFrame
from seaborn._core.groupby import GroupBy
from seaborn._core.moves import Move
from seaborn._core.scales import Scale
@dataclass
class DivergingStack(Move):
    def _stack(self, df, orient, order=None):
        df = GroupBy(order).apply(df, lambda x: x)
        if df["baseline"].nunique() > 1:
            err = "Stack move cannot be used when baselines are already heterogeneous"
            raise RuntimeError(err)
        other = {"x": "y", "y": "x"}[orient]
        stacked_lengths = (df[other] - df["baseline"]).dropna().cumsum()
        offsets = stacked_lengths.shift(1).fillna(0)
        if len(df) % 2 == 0:
            middle = stacked_lengths[len(df) // 2 - 1]
        else:
            middle = (stacked_lengths[len(df) // 2 - 1] + stacked_lengths[len(df)//2]) / 2
        df[other] = stacked_lengths - middle
        df["baseline"] = df["baseline"] + offsets - middle
        return df

    def __call__(
        self, data: DataFrame, groupby: GroupBy, orient: str, scales: dict[str, Scale],
    ) -> DataFrame:
        groupers = ["col", "row", orient]
        return GroupBy(groupers).apply(data, partial(self._stack, order=groupby.order), orient)
ranks = ["Very Unintuitive","Slightly Unintuitive","Neutral","Slightly Intuitive","Very Intuitive"]
colors = ["red","indianred","grey","limegreen","green"]
df = pd.DataFrame({
"Intuitive": np.random.choice(ranks,size=500),
"category": np.random.choice(["A","B","C","D","E"],size=500),
})
grouped_df = df.groupby(["category","Intuitive"]).size().to_frame(name="count")
fig,ax = plt.subplots()
p = (
so.Plot(data=grouped_df,x="count",y="category",color="Intuitive")
.add(so.Bar(edgewidth=0),DivergingStack())
.scale(color=so.Nominal(values=colors,order=ranks))
)
p.on(ax).plot()
plt.show() |
Oh wow, that is really nice work! Here are the few things I can think of that could improve what you did
If you need help testing things out at some point or documenting I'll gladly help. |
In any case, I did the heavy lifting here; I will leave you to handle the details. |
Hi, I'm currently analysing Likert data, and it's common practice to analyse the responses using a diverging stacked bar chart.
Since there is no implementation for it in the library, I wanted to propose it as an enhancement.
Expected problems: the center has to be handled differently for an even and an odd number of bins (at least in my experience with MATLAB and its bar charts).
Here is an example of a diverging stacked bar I did for data analysis.
