Introduction to Statistics (MAT/SST 115.03 2008S)
Stacked Bar Graphs (also called Segmented Bar Graphs) are a data visualization technique that can be useful for studying two-way tables. In a stacked bar plot, we use one bar for each value of the explanatory variable (as in simple bar plots). However, the bar is segmented into multiple parts, one for each value of the response variable. In Workshop Statistics, Edition 3, stacked bar graphs are introduced in Topic 6.
In R, you make stacked bar graphs using the barplot
function. However, instead of providing barplot
with a vector to plot, you provide it with a modified frame. (How do
you modify the frame? You call
as.matrix on that
frame. Don't ask why.) The
barplot function plots each column in the table
as a bar. So, you first create your table using whatever method you
prefer, such as one of the following.
EnvironmentSpending = edit(data.frame())
EnvironmentSpending = data.frame( Liberal = c(.819, .174, .007), Moderate = c(.619, .314, .067), Conservative = c(.479, .385, .136) )
EnvironmentSpending = read.csv("/home/rebelsky/Stats115/Data/EnvironmentSpending.csv", row.names=1)
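If you'd like to see what as.matrix produces before you plot anything, here is a small sketch. It assumes the inline version of the data frame from above. The row names are an assumption borrowed from the legend labels used later in this reading, and SpendingMatrix is just a name chosen for illustration.

```r
# Build the table inline; the row labels are taken from the legend text
# used later in this reading.
EnvironmentSpending <- data.frame(
  Liberal      = c(.819, .174, .007),
  Moderate     = c(.619, .314, .067),
  Conservative = c(.479, .385, .136)
)
rownames(EnvironmentSpending) <- c("Too Little", "About Right", "Too Much")

# Convert the frame to a matrix, the form we hand to barplot.
SpendingMatrix <- as.matrix(EnvironmentSpending)

is.matrix(SpendingMatrix)   # confirm the conversion worked
dim(SpendingMatrix)         # rows (responses) by columns (one per bar)
colSums(SpendingMatrix)     # total height of each stacked bar
```

Since each column holds proportions for one political group, each column sum should come out to about 1, which is why the stacked bars all reach the same height.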
Once you've read in the data, the basic command is fairly straightforward.
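As a minimal sketch of that basic command, assuming the inline version of the data frame (the row names are an assumption taken from the legend labels used later):

```r
# Build the table inline so the example runs on its own.
EnvironmentSpending <- data.frame(
  Liberal      = c(.819, .174, .007),
  Moderate     = c(.619, .314, .067),
  Conservative = c(.479, .385, .136)
)
rownames(EnvironmentSpending) <- c("Too Little", "About Right", "Too Much")

# One stacked bar per column, one segment per row.
barplot(as.matrix(EnvironmentSpending))
```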
Of course, this is R, so there are dozens of options to make the barplot more interesting. Here are a few that might be helpful.
As in the past,
main=... puts a title on the graph.
It's usually helpful for a stacked bar plot to include a legend that explains
what each part of a stacked bar represents. The
legend=... option adds the legend. As you might expect,
we'll use a vector of strings for that legend. You can create the
vector by hand.
barplot(as.matrix(EnvironmentSpending), legend=c("Too Little", "About Right", "Too Much") )
If we've labeled the rows of our table, we can also use the vector of row names.
barplot(as.matrix(EnvironmentSpending), legend=rownames(EnvironmentSpending) )
We can even recolor the parts of the stacked bar graph using the
col=... option. Once again, we provide a vector of
strings that name colors, such as
c("green","grey","red").
You can look up possible names using
colors().
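Here is one possible way to browse the available color names. It is only a sketch: all_names and greens are names chosen for illustration, and the search for "green" is just an example pattern.

```r
# colors() returns a character vector of the color names R recognizes.
all_names <- colors()
length(all_names)   # how many names are available
head(all_names)     # the first few names

# Narrow the list to names that contain "green".
greens <- grep("green", all_names, value = TRUE)
head(greens)

# Check that the colors used in this reading are valid names.
c("green", "grey", "red") %in% all_names
```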
Putting it all together, we get the following as a relatively complete bar graph for activity 6-1.
barplot(as.matrix(EnvironmentSpending), main="Political Perspectives on Environment Spending", legend=rownames(EnvironmentSpending), col=c("green","grey","red") )
Unfortunately, the designers of R seem to have chosen a bad default place for the legend. In this particular example, it may obscure the separation between two parts of the stacked bar.
The solution is to tell R a bit more about how you expect the graph to
look. In particular, you can use
xlim= to specify the horizontal “limits” of the image and
width= to specify the width of each bar. For example,
if we have three bars, we might make each one two units wide and allow
nine units of space. The bars then fill only the left part of the plot,
which leaves room on the right for the legend.
barplot(as.matrix(EnvironmentSpending), main="Political Perspectives on Environment Spending", legend=rownames(EnvironmentSpending), col=c("green","grey","red"), xlim=c(0,9), width=2 )
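If you want to check how much room those settings leave, one approach is to save the value that barplot returns, which is the vector of bar midpoints. The sketch below assumes the inline data frame with row names taken from the legend labels. The names mids and right_edge are chosen for illustration.

```r
# Build the table inline so the example runs on its own.
EnvironmentSpending <- data.frame(
  Liberal      = c(.819, .174, .007),
  Moderate     = c(.619, .314, .067),
  Conservative = c(.479, .385, .136)
)
rownames(EnvironmentSpending) <- c("Too Little", "About Right", "Too Much")

# barplot draws the graph and also returns the midpoint of each bar.
mids <- barplot(as.matrix(EnvironmentSpending),
                main   = "Political Perspectives on Environment Spending",
                legend = rownames(EnvironmentSpending),
                col    = c("green", "grey", "red"),
                xlim   = c(0, 9),
                width  = 2)

# Each bar is 2 units wide, so its right edge is its midpoint plus 1.
right_edge <- max(mids) + 1
right_edge       # where the last bar ends
9 - right_edge   # horizontal space left over for the legend
```

If the leftover space looks too small for your legend, try a larger upper limit in xlim or a smaller width.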
You may also find it useful to place the various bars for one
value (column) side-by-side, rather than stacked on top of each other.
For that, you use the parameter
beside=T.
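A sketch of the side-by-side version, assuming the inline data frame with row names taken from the legend labels. It writes TRUE rather than T; the two mean the same thing unless someone has reassigned T.

```r
# Build the table inline so the example runs on its own.
EnvironmentSpending <- data.frame(
  Liberal      = c(.819, .174, .007),
  Moderate     = c(.619, .314, .067),
  Conservative = c(.479, .385, .136)
)
rownames(EnvironmentSpending) <- c("Too Little", "About Right", "Too Much")

# beside=TRUE draws one bar per cell, grouped by column, instead of
# stacking the segments on top of each other.
mids <- barplot(as.matrix(EnvironmentSpending),
                main   = "Political Perspectives on Environment Spending",
                legend = rownames(EnvironmentSpending),
                col    = c("green", "grey", "red"),
                beside = TRUE)
```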
It is worth trying this alternative at least once, just to see the difference.
Copyright (c) 2007-8 Samuel A. Rebelsky.
This work is licensed under a Creative Commons
Attribution-NonCommercial 2.5 License. To view a copy of this
license, visit the Creative Commons website
or send a letter to Creative Commons, 543 Howard Street, 5th Floor,
San Francisco, California, 94105, USA.