Inspired by this question, where the top answer apparently uses an unsafe/erroneous way to add colors to a legend for a scatter plot.
The top answer suggests doing this:
data<-iris
plot(data$Sepal.Length, data$Sepal.Width, col=data$Species)
legend(7,4.3,unique(data$Species),col=1:length(data$Species),pch=1)
Comments suggest using levels() instead of unique() for controlling the text and colors in the call to legend(), but are unclear on why it would help. I would need a better explanation to trust that code.
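To make the concern concrete, here is a minimal sketch of how unique() and levels() can disagree. It assumes the rows are not sorted by species; the reversed data frame `shuffled` and the helper names `by_unique`, `by_levels` and `codes_used` are made up for illustration. As far as I understand, passing a factor to col makes plot() use the factor's integer codes, which follow levels(), whereas unique() returns values in order of first appearance.

```r
# Hypothetical example: reverse the row order so species no longer appear in level order
shuffled <- iris[order(iris$Species, decreasing = TRUE), ]

# Labels in order of first appearance vs. labels in level order
by_unique <- as.character(unique(shuffled$Species))
by_levels <- levels(shuffled$Species)

# Integer code of each species; these are the color numbers that
# plot(col = shuffled$Species) uses for the points
codes_used <- tapply(as.integer(shuffled$Species), shuffled$Species, unique)

print(by_unique)
print(by_levels)
print(codes_used)

plot(shuffled$Sepal.Length, shuffled$Sepal.Width, col = shuffled$Species)
# Level order matches the integer codes, so label i is paired with color i
legend("topright", legend = by_levels, col = seq_along(by_levels), pch = 1)
```

With the reversed rows, pairing `by_unique` with colors 1, 2, 3 would label virginica with color 1 even though its points are drawn with color 3; pairing `by_levels` with the same colors keeps labels and points consistent.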
How can I write code that guarantees proper coloring?
A solution I've found is:
data <- iris
# Create a translation table that couples each species to a color
colorcode <- data.frame(
  colorsMy = c("red", "green", "blue"),
  species = levels(data$Species),
  stringsAsFactors = FALSE)
# Build a vector with one color per point by looking up each species in colorcode
iriscolors <- sapply(data$Species,
  function(x) colorcode$colorsMy[colorcode$species == x])
# Plot the scatter using the color vector built from colorcode
plot(data$Sepal.Length, data$Sepal.Width, col = iriscolors, pch = 19)
# Since iriscolors is derived from colorcode, colorcode can also drive the legend
legend("bottomright", legend = colorcode$species, fill = colorcode$colorsMy)
This code is a bit bulky, but easy to follow and explicitly constructs correct color labeling in the legend. The "trick" is to create the colorcode variable that serves as a translation table between levels of the factor (iris species in this case) and colors for the legend.
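The same translation-table idea can probably be written more compactly with a named vector. This is only a sketch: the names `species_colors` and `point_colors` are hypothetical, and the three colors are the same arbitrary choices as above. The vector is indexed by species name (via as.character()) rather than by the factor itself, so the lookup does not depend on the order of the levels.

```r
# Named vector: names are the factor levels, values are the (arbitrarily chosen) colors
species_colors <- setNames(c("red", "green", "blue"), levels(iris$Species))

# Look up each point's color by species name; as.character() avoids
# indexing by the factor's integer codes
point_colors <- species_colors[as.character(iris$Species)]

plot(iris$Sepal.Length, iris$Sepal.Width, col = point_colors, pch = 19)
# The legend is built from the same vector, so labels and colors cannot drift apart
legend("bottomright", legend = names(species_colors), fill = species_colors)
```

Because both the points and the legend read from `species_colors`, changing a color or reordering the entries only has to be done in one place.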