For a long time I have wondered why some people would use
position_stack() for position alignment instead of the simpler version
position = "stack".
Recently, though, I learned the purpose of the former approach when I tried to add data labels to a stacked bar chart for better legibility.
Further, I decided that this knowledge is a good addition to this ggplot2-tips series, so let’s see what
position_stack() can do.
To demonstrate this, let us create a small dummy data set.
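A minimal sketch of such a data set could look as follows. The name `dummy_dat`, its columns and all values are made up for illustration; the only property that matters is that the parts of each group add up to 100%.

```r
# Hypothetical dummy data: two groups, each split into three parts
# whose shares sum to 1 (i.e. 100%)
dummy_dat <- data.frame(
  group   = rep(c("A", "B"), each = 3),
  type    = rep(c("x", "y", "z"), times = 2),
  percent = c(0.20, 0.30, 0.50, 0.40, 0.35, 0.25)
)
dummy_dat
```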
Next, take a look at the corresponding stacked bar chart.
Since we created a dataset that contains percentages, I took the liberty of formatting the y-axis labels as percentages.
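One way to draw such a chart is sketched below. The data (`dummy_dat` and its columns) are hypothetical, and `scales::label_percent()` is just one possible way to get percent labels on the y-axis, not necessarily the one used for the original plot.

```r
library(ggplot2)

# Hypothetical dummy data (names and values are assumptions for illustration)
dummy_dat <- data.frame(
  group   = rep(c("A", "B"), each = 3),
  type    = rep(c("x", "y", "z"), times = 2),
  percent = c(0.20, 0.30, 0.50, 0.40, 0.35, 0.25)
)

p <- ggplot(dummy_dat, aes(x = group, y = percent, fill = type)) +
  geom_col() +  # geom_col() stacks the bars by default
  scale_y_continuous(labels = scales::label_percent())  # show 0.5 as 50%
p
```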
I believe that this visualization could be improved by adding a text label to each part of the stacked bars so that the reader can immediately see how large each portion is.
Let’s try this by simply converting the values to strings and adding
geom_text() to the plot.
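As a rough sketch, this first attempt might look like the code below. The data are hypothetical, and formatting the labels with `scales::percent()` is my own choice; any conversion of the values to strings would do.

```r
library(ggplot2)

# Hypothetical dummy data (names and values are assumptions for illustration)
dummy_dat <- data.frame(
  group   = rep(c("A", "B"), each = 3),
  type    = rep(c("x", "y", "z"), times = 2),
  percent = c(0.20, 0.30, 0.50, 0.40, 0.35, 0.25)
)

p_naive <- ggplot(dummy_dat, aes(x = group, y = percent, fill = type)) +
  geom_col() +
  scale_y_continuous(labels = scales::label_percent()) +
  # No position argument: each label is drawn at its raw y-value
  geom_text(aes(label = scales::percent(percent, accuracy = 1)))
p_naive
```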
Clearly, this did not work as intended because
position = "identity" is the default, which is why the y-position of each label is simply determined by its value.
Now, here is where I would usually change the positioning via
position = "stack".
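A sketch of this second attempt, again with hypothetical data, could look like this. The only change compared to a plain `geom_text()` call is the `position` argument.

```r
library(ggplot2)

# Hypothetical dummy data (names and values are assumptions for illustration)
dummy_dat <- data.frame(
  group   = rep(c("A", "B"), each = 3),
  type    = rep(c("x", "y", "z"), times = 2),
  percent = c(0.20, 0.30, 0.50, 0.40, 0.35, 0.25)
)

p_stack <- ggplot(dummy_dat, aes(x = group, y = percent, fill = type)) +
  geom_col() +
  scale_y_continuous(labels = scales::label_percent()) +
  # Stack the labels the same way the bars are stacked
  geom_text(
    aes(label = scales::percent(percent, accuracy = 1)),
    position = "stack"
  )
p_stack
```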
However, the result is less than perfect: each label now sits at the top edge of its block.
Ideally, I would like the labels to appear in the middle of each colored block.
We could try to use
vjust to move the labels, but this is not a great idea because every label is moved by the same amount while the blocks have different heights.
Similarly, we could compute the block midpoints by hand and use them as a separate y-aesthetic for the labels.
Clearly, this involves a tedious additional computation that we should avoid if possible.
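To illustrate how tedious the manual route is, here is a base R sketch under a few assumptions: the data are hypothetical, the helper columns `top` and `mid` are names I made up, and the computation relies on ggplot2's default stacking order, which puts the last fill level at the bottom of each bar.

```r
library(ggplot2)

# Hypothetical dummy data (names and values are assumptions for illustration)
dummy_dat <- data.frame(
  group   = rep(c("A", "B"), each = 3),
  type    = rep(c("x", "y", "z"), times = 2),
  percent = c(0.20, 0.30, 0.50, 0.40, 0.35, 0.25)
)

# By default the last fill level ends up at the bottom of the stack,
# so accumulate the heights within each bar in reverse level order
rev_order <- order(dummy_dat$group, -as.integer(factor(dummy_dat$type)))
dummy_dat <- dummy_dat[rev_order, ]
dummy_dat$top <- ave(dummy_dat$percent, dummy_dat$group, FUN = cumsum)
dummy_dat$mid <- dummy_dat$top - dummy_dat$percent / 2  # middle of each block

p_manual <- ggplot(dummy_dat, aes(x = group, y = percent, fill = type)) +
  geom_col() +
  scale_y_continuous(labels = scales::label_percent()) +
  # Labels use the hand-computed midpoints as their own y-aesthetic
  geom_text(aes(y = mid, label = scales::percent(percent, accuracy = 1)))
p_manual
```

Note that this breaks as soon as the stacking order changes, for example when the fill variable becomes a factor with different levels.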
This is precisely where
position_stack() comes in.
position = position_stack() stacks the bars just like
position = "stack" does, but the function
position_stack() has another argument
vjust by which each label is moved relative to its own block.
Here, the possible values of
vjust range from 0 (bottom of the designated height) to 1 (top of the designated height).
Therefore, moving the labels to the middle of each bar is as easy as setting
vjust = 0.5.
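With hypothetical data, a minimal sketch of this approach could look as follows. Only the `position` argument of `geom_text()` differs from the `position = "stack"` version.

```r
library(ggplot2)

# Hypothetical dummy data (names and values are assumptions for illustration)
dummy_dat <- data.frame(
  group   = rep(c("A", "B"), each = 3),
  type    = rep(c("x", "y", "z"), times = 2),
  percent = c(0.20, 0.30, 0.50, 0.40, 0.35, 0.25)
)

p_centered <- ggplot(dummy_dat, aes(x = group, y = percent, fill = type)) +
  geom_col() +
  scale_y_continuous(labels = scales::label_percent()) +
  # vjust = 0.5 places each label halfway up its own block
  geom_text(
    aes(label = scales::percent(percent, accuracy = 1)),
    position = position_stack(vjust = 0.5)
  )
p_centered
```

No helper columns are needed here, and the labels follow the bars automatically if the stacking order changes.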
Finally, one may tweak this plot further by changing the colors and text formatting, though this is definitely a matter of taste. Personally, I like darker colors combined with a white, bold label. In this case, the result looks like this.
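One possible version of such a styled plot is sketched below. The data, the hex colors and the text size are all my own picks for illustration, so feel free to swap in whatever palette you prefer.

```r
library(ggplot2)

# Hypothetical dummy data (names and values are assumptions for illustration)
dummy_dat <- data.frame(
  group   = rep(c("A", "B"), each = 3),
  type    = rep(c("x", "y", "z"), times = 2),
  percent = c(0.20, 0.30, 0.50, 0.40, 0.35, 0.25)
)

p_styled <- ggplot(dummy_dat, aes(x = group, y = percent, fill = type)) +
  geom_col() +
  scale_y_continuous(labels = scales::label_percent()) +
  # Arbitrary dark colors so that white text stays readable
  scale_fill_manual(values = c(x = "#1d3557", y = "#457b9d", z = "#6d597a")) +
  geom_text(
    aes(label = scales::percent(percent, accuracy = 1)),
    position = position_stack(vjust = 0.5),
    color = "white",
    fontface = "bold",
    size = 5
  )
p_styled
```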
In summary, we have seen that using
position = position_stack() is a more powerful alternative to
position = "stack" that allows individual positioning.
Nevertheless, as long as the additional arguments of
position_stack() are not needed, I still find position = "stack" simpler.