A couple of weeks back, I recreated an infographic with ggplot2.
The result and the whole story are embedded in this thread on Twitter:
The fun thing about getting better at #ggplot2 is that you begin to mimick other #dataviz.
Here is a practice #rstats info graphic I created after seeing a similar infographic from @EatSmarter_de
Original graphic, making of, comments and some ressources below ⬇️🧵 pic.twitter.com/FslScy9sc7
— Albert Rapp (@rappa753) March 5, 2022
Aside from the embarrassing typo in “What you should know…”, I picked up a useful technique for what to do when I want aesthetics to vary within a geom. Sounds complicated? Let’s take a look at a couple of examples.
How do I manually set aesthetics with aes() and scale_*_identity()?
This one is the easy case when all geoms behave properly.
- the sizes were determined in the size_col column of tib
- the sizes were mapped to the size aesthetic via aes()
- the scale_size_identity() layer makes sure that the sizes are not assigned by ggplot but taken as given (identity scale layers are available for other aesthetics as well).
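A minimal sketch of the setup described in the list above. The values in tib are made up for illustration. The sketch also assumes ggplot2 3.4.0 or later, where the width of lines is controlled by the linewidth aesthetic, so scale_linewidth_identity() takes the role that scale_size_identity() plays for size.

```r
library(ggplot2)
library(tibble)

# Hypothetical data: two segments with hand-picked widths (values are assumptions)
tib <- tibble(
  x = c(0, 1), xend = c(1, 2),
  y = c(0, 1), yend = c(1, 1),
  size_col = c(1, 5)
)

p <- ggplot(tib) +
  geom_segment(
    # size_col is mapped inside aes(), so ggplot looks it up in tib
    aes(x = x, xend = xend, y = y, yend = yend, linewidth = size_col)
  ) +
  # Identity scale: use the values in size_col as they are instead of rescaling them
  scale_linewidth_identity()

p
```

Without the identity scale, ggplot would rescale the two values to its default width range instead of drawing widths of exactly 1 and 5.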
How do I manually set aesthetics without aes()?
The last example used aes() to access size_col from tib.
However, we then had to make sure that ggplot does not assign sizes based on the unique values in size_col.
Instead, the sizes were supposed to be taken as is.
This was the job of scale_size_identity().
Let’s make it work without it.
This will generate the exact same plot as before (which is why I suppressed the output).
In this case, we mapped the sizes manually by assigning a vector of sizes to the size argument of geom_segment() but outside aes().
Of course, now we cannot simply write size = size_col because geom_segment() won’t know that variable.
Before, aes() let ggplot know that we mean size_col from the data set tib.
Now, we have to pass the vector ourselves by extracting it from tib.
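A minimal sketch of this variant, again with made-up values in tib and assuming ggplot2 3.4.0 or later (hence linewidth instead of size). Extracting the column with tib$size_col is one way to do it and is an assumption here.

```r
library(ggplot2)
library(tibble)

# Hypothetical data: two segments with hand-picked widths (values are assumptions)
tib <- tibble(
  x = c(0, 1), xend = c(1, 2),
  y = c(0, 1), yend = c(1, 1),
  size_col = c(1, 5)
)

p <- ggplot(tib) +
  geom_segment(
    aes(x = x, xend = xend, y = y, yend = yend),
    # Outside aes(): hand over the vector itself, one width per row of tib.
    # No identity scale is needed because nothing is mapped through a scale.
    linewidth = tib$size_col
  )

p
```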
How do I manually set aesthetics when the previous approaches do not work?
Finally, let’s switch from geom_segment() to geom_curve().
This changes our straight lines from before to curved lines.
What’s more, I can control how strong the curvature is supposed to be via the curvature argument.
But as it is right now, both of our differently-sized curves have the same level of curvature.
Maybe, this ought to be different. Maybe, not all curves are made the same. Maybe, our visualization should reflect the diversity of all the curves out there in this gigantic world we inhabit. All curves are beautiful!
Let’s make this happen as we did before, by passing a vector of curvatures outside aes().
It seems as if geom_curve() expects the curvature argument to be a single number.
Let’s try it inside aes() instead.
Well, at least this time we can see curves.
Unfortunately, the warning lets us know that curvature is an unknown aesthetic which will be ignored.
As you can see, this results in the same curvature for both curves again.
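A small sketch of the aes() attempt, with made-up values in tib. It collects the warning that ggplot2 raises when the layer is created, so the message can be inspected rather than just scrolling by. The exact wording of the warning may differ between ggplot2 versions.

```r
library(ggplot2)
library(tibble)

# Hypothetical data: two curves with different desired curvatures (values are assumptions)
tib <- tibble(
  x = c(0, 1), xend = c(1, 2),
  y = c(0, 1), yend = c(1, 1),
  curvature = c(-0.2, 0.6)
)

warnings_seen <- character()

# Create the layer and record every warning instead of printing it
curve_layer <- withCallingHandlers(
  geom_curve(
    data = tib,
    aes(x = x, xend = xend, y = y, yend = yend, curvature = curvature)
  ),
  warning = function(w) {
    warnings_seen <<- c(warnings_seen, conditionMessage(w))
    invokeRestart("muffleWarning")
  }
)

warnings_seen
```

The layer is still created, but the mapped curvature column is dropped, so both curves fall back to the default curvature of geom_curve().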
So, it looks like we can only hope to set each curvature separately.
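A minimal sketch of setting each curvature separately, using one geom_curve() layer per row. The values in tib and the two curvature values are assumptions, and the sketch assumes ggplot2 3.4.0 or later (linewidth instead of size).

```r
library(ggplot2)
library(tibble)

# Hypothetical data: two curves with hand-picked widths (values are assumptions)
tib <- tibble(
  x = c(0, 1), xend = c(1, 2),
  y = c(0, 1), yend = c(1, 1),
  size_col = c(1, 5)
)

p <- ggplot(mapping = aes(x = x, xend = xend, y = y, yend = yend, linewidth = size_col)) +
  # One layer per curve, each with its own single curvature value
  geom_curve(data = tib[1, ], curvature = -0.2) +
  geom_curve(data = tib[2, ], curvature = 0.6) +
  scale_linewidth_identity()

p
```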
Alright, this time we got what we wanted. That’s something at least. Honestly, our “solution” is not scalable though. What if we want to draw hundreds of curves?
In fact, this is what slowed me down when I created the infographic that started this blog post. The text boxes were not vectorized, so I would have had to place each text box manually. That’s a lot of text boxes and I was having none of that.
So, here is where functional programming stepped in.
Let’s recreate what I did based on our curve example.
First, we extend tib with another curvature column.
Then, we use pmap() to create a list of curve layers.
If you have not used any functional programming before, check out my YARDS lecture notes on that topic.
Basically, what we will do is apply the geom_curve() function to each row of tib.
With ~ (in front of the function) and ..1, ..2, etc. we can then say where to stick in the values from each of tib’s columns.
Here, we have set the first column of tib (x) to the x-aesthetic within aes().
Then, we proceeded similarly for all other columns.
This resulted in a list of curve layers.
These are useless without a ggplot() call.
So, let’s complete the plot.
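The steps above can be sketched as follows. The values in tib are made up, the column order (x, xend, y, yend, size_col, curvature) is an assumption that the ..1 to ..6 placeholders rely on, and the sketch assumes ggplot2 3.4.0 or later (linewidth instead of size).

```r
library(ggplot2)
library(tibble)
library(purrr)

# Hypothetical data with one curvature value per row (values are assumptions)
tib <- tibble(
  x = c(0, 1), xend = c(1, 2),
  y = c(0, 1), yend = c(1, 1),
  size_col = c(1, 5),
  curvature = c(-0.2, 0.6)
)

# pmap() calls the lambda once per row; ..1, ..2, ... refer to the
# columns of tib in order, so ..6 is that row's curvature
curve_layers <- pmap(
  tib,
  ~ geom_curve(
    aes(x = ..1, xend = ..2, y = ..3, yend = ..4, linewidth = ..5),
    curvature = ..6
  )
)

# A list of layers can be added to ggplot() directly
p <- ggplot() +
  curve_layers +
  scale_linewidth_identity()

p
```

Each element of curve_layers is an ordinary layer, so the list behaves like adding the geom_curve() calls one by one.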
Damn, these are some nice functionally created curves. Now, let’s put our new technique to the test. Can it handle arbitrarily many curves?
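A sketch of such a stress test, assuming randomly generated coordinates, widths and curvatures. The number of curves, the seed and the value ranges are arbitrary choices for illustration, and the sketch assumes ggplot2 3.4.0 or later (linewidth instead of size).

```r
library(ggplot2)
library(tibble)
library(purrr)

set.seed(123)
n_curves <- 100  # arbitrary; any number of rows works the same way

# Hypothetical random data: one row per curve
tib_many <- tibble(
  x = runif(n_curves), xend = runif(n_curves),
  y = runif(n_curves), yend = runif(n_curves),
  size_col = runif(n_curves, min = 0.5, max = 2),
  curvature = runif(n_curves, min = -1, max = 1)
)

# One geom_curve() layer per row, each with its own curvature
many_layers <- pmap(
  tib_many,
  ~ geom_curve(
    aes(x = ..1, xend = ..2, y = ..3, yend = ..4, linewidth = ..5),
    curvature = ..6
  )
)

p <- ggplot() +
  many_layers +
  scale_linewidth_identity()

p
```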
Congratulations! We have successfully created drawings of a toddler. And the even better news is that we can draw as many curves as we want.
Surprisingly, before I started this blog post, I was not aware that you can simply add lists to ggplot() and it works.
As you will see in the Twitter thread at the top of this post, I initially thought that one had to combine the list with more functional programming like so.
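One way such a combination could look is sketched below with base R's Reduce(). This is an assumption for illustration and not necessarily the helper used in the thread or in the book. The data values are made up as well.

```r
library(ggplot2)
library(tibble)
library(purrr)

# Hypothetical data with one curvature value per row (values are assumptions)
tib <- tibble(
  x = c(0, 1), xend = c(1, 2),
  y = c(0, 1), yend = c(1, 1),
  curvature = c(-0.2, 0.6)
)

curve_layers <- pmap(
  tib,
  ~ geom_curve(
    aes(x = ..1, xend = ..2, y = ..3, yend = ..4),
    curvature = ..5
  )
)

# Fold the list into the plot one layer at a time:
# ((ggplot() + layer 1) + layer 2) + ...
p <- Reduce(`+`, curve_layers, ggplot())

p
```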
This was something I picked up from Hadley Wickham’s ggplot2 book, but it seems that we don’t need that anymore (the combine function, that is; the book is still a great resource). But I leave this here for completeness' sake. Once again, writing a blog post has taught me stuff I thought I already knew. If you want to watch me learn more stuff or want to learn more ggplot things yourself, feel free to subscribe to my RSS feed or follow me on Twitter.