This is my first question on Stack Overflow, so please correct me if I am not following the proper question protocol.
I am trying to create graphs for data collected at three time points (time 1, time 2, time 3), which are indicated by X1..., X2... and X3... at the beginning of the column names. The graphs are also split by the Group column of the data frame.
I have no problem creating the graphs, but I have many variables (~170) and want to compare time 1 vs time 2, time 2 vs time 3, etc., so I am looking for a shortcut that generates these plots rather than typing each one out individually.
As indicated above, the variable names start with X1, X2 or X3 to indicate the time at which the variable was recorded, i.e. X1BCSTCAT = time 1; X2BCSTCAT = time 2; X3BCSTCAT = time 3. Here is a small sample of what my data looks like:
df <- structure(list(ID = structure(1:6, .Label = c("101","102","103","118","119","120"), class = "factor"),
Group = structure(c(1L,1L,1L,2L,2L,2L), .Label = c("C8","TC"), class = "factor"),
Wave = structure(c(1L, 2L, 3L, 4L, 1L, 2L), .Label = c("A","B","C","D"), class = "factor"),
Yr = structure(c(1L, 2L, 1L, 2L, 1L, 2L), .Label = c("3","5"), class = c("ordered", "factor")),
Age.Yr. = c(10.936,10.936, 9.311, 10.881, 10.683, 11.244),
Training..hr. = c(10.667,10.333, 10.667, 10.333, 10.333, 10.333),
X1BCSTCAT = c(-0.156,0.637,-1.133,0.637,2.189,1.229),
X1BCSTCR = c(0.484,0.192, -1.309, 0.912, 1.902, 0.484),
X1BCSTPR = c(-1.773,0.859, 0.859, 0.12, -1.111, 0.12),
X2BCSTCAT = c(1.006, -0.379,-1.902, 0.444, 2.074, 1.006),
X2BCSTCR = c(0.405, -0.457,-1.622, 1.368, 1.981, 0.168),
X2BCSTPR = c(-0.511, -0.036,2.189, -0.036, -0.894, 0.949),
X3BCSTCAT = c(1.18, -1.399,-1.399, 1.18, 1.18, 1.18),
X3BCSTCR = c(0.967, -1.622, -1.622,0.967, 0.967, 1.255),
X3BCSTPR = c(-1.282, -1.282, 1.539,1.539, 0.792, 0.792)),
row.names = c(1L, 2L, 3L, 4L, 5L,8L), class = "data.frame")
Here is some working code to create one graph using ggplot for time 1 vs time 2 data on one variable:
library(ggplot2)
# Columns are referenced by bare name inside aes(); df is supplied once via the data argument
p <- ggplot(df, aes(x = X1BCSTCAT, y = X2BCSTCAT, shape = Group, color = Group)) +
  geom_point() +
  # One linear fit per group, extended across the full x range
  geom_smooth(method = "lm", aes(fill = Group), fullrange = TRUE) +
  labs(title = "BCSTCAT", x = "Time 1", y = "Time 2") +
  scale_color_manual(name = "Group", labels = c("C8", "TC"), values = c("blue", "red")) +
  scale_shape_manual(name = "Group", labels = c("C8", "TC"), values = c(16, 17)) +
  scale_fill_manual(name = "Group", labels = c("C8", "TC"), values = c("light blue", "pink"))
What I am really after is a shortcut where R cycles through the variables, matches X1... with X2... (and so on) and creates the graphs. I assume there must be a way to plot either by matching column positions, e.g. df[,7] vs df[,10], and iterating through them, or by matching the names themselves (where the only difference between the variable names is the number that indicates time).
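To make the name-matching idea concrete, here is a rough base R sketch of how the pairs might be built. It assumes every time-point column starts with X followed by a single digit, and `time_pairs` is a hypothetical helper name of my own, not a function from any package. It uses a cut-down stand-in for my data frame (`df_small`, also hypothetical) so that it runs on its own.

```r
# Cut-down stand-in for the real data frame (values copied from the sample above).
df_small <- data.frame(
  Group     = factor(c("C8", "C8", "C8", "TC", "TC", "TC")),
  X1BCSTCAT = c(-0.156, 0.637, -1.133, 0.637, 2.189, 1.229),
  X1BCSTCR  = c(0.484, 0.192, -1.309, 0.912, 1.902, 0.484),
  X2BCSTCAT = c(1.006, -0.379, -1.902, 0.444, 2.074, 1.006),
  X2BCSTCR  = c(0.405, -0.457, -1.622, 1.368, 1.981, 0.168),
  X3BCSTCAT = c(1.18, -1.399, -1.399, 1.18, 1.18, 1.18),
  X3BCSTCR  = c(0.967, -1.622, -1.622, 0.967, 0.967, 1.255)
)

# Hypothetical helper: pair columns named X<from><stem> with X<to><stem>.
# Assumption: a single-digit time index directly follows the leading "X".
time_pairs <- function(data, from, to) {
  from_cols <- grep(paste0("^X", from), names(data), value = TRUE)
  stems     <- sub(paste0("^X", from), "", from_cols)
  to_cols   <- paste0("X", to, stems)
  keep      <- to_cols %in% names(data)  # drop stems with no partner at `to`
  data.frame(stem = stems[keep], x = from_cols[keep], y = to_cols[keep],
             stringsAsFactors = FALSE)
}

pairs_12 <- time_pairs(df_small, 1, 2)  # time 1 vs time 2
pairs_23 <- time_pairs(df_small, 2, 3)  # time 2 vs time 3
print(pairs_12)
```

Matching on names rather than positions such as df[,7] vs df[,10] seems less fragile to me, since it should keep working if columns are reordered, but I may be missing a simpler way.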
I have previously cycled through creating individual graphs using the lapply function, but have no idea where to even start with this one.
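One direction that might work, though I am not sure it is the right approach, is to wrap the plot in a function and call it with lapply once per variable stem. The sketch below is untested on my full data; `plot_time_pair` and `df_small` are hypothetical names of my own, and it assumes the `.data` pronoun that ggplot2 re-exports for looking up columns by string.

```r
library(ggplot2)

# Cut-down stand-in for the real data frame (values copied from the sample above).
df_small <- data.frame(
  Group     = factor(c("C8", "C8", "C8", "TC", "TC", "TC")),
  X1BCSTCAT = c(-0.156, 0.637, -1.133, 0.637, 2.189, 1.229),
  X1BCSTCR  = c(0.484, 0.192, -1.309, 0.912, 1.902, 0.484),
  X2BCSTCAT = c(1.006, -0.379, -1.902, 0.444, 2.074, 1.006),
  X2BCSTCR  = c(0.405, -0.457, -1.622, 1.368, 1.981, 0.168)
)

# Hypothetical helper: build one plot of X<from><stem> against X<to><stem>.
plot_time_pair <- function(data, stem, from, to) {
  x_col <- paste0("X", from, stem)
  y_col <- paste0("X", to, stem)
  ggplot(data, aes(x = .data[[x_col]], y = .data[[y_col]],
                   shape = Group, color = Group)) +
    geom_point() +
    geom_smooth(method = "lm", formula = y ~ x, aes(fill = Group), fullrange = TRUE) +
    labs(title = stem, x = paste("Time", from), y = paste("Time", to)) +
    scale_color_manual(name = "Group", labels = c("C8", "TC"), values = c("blue", "red")) +
    scale_shape_manual(name = "Group", labels = c("C8", "TC"), values = c(16, 17)) +
    scale_fill_manual(name = "Group", labels = c("C8", "TC"), values = c("light blue", "pink"))
}

# Variable stems, taken from the time 1 column names (assumes an "X1" prefix).
stems <- sub("^X1", "", grep("^X1", names(df_small), value = TRUE))

# One plot object per stem for time 1 vs time 2; the list is named by stem.
plots_12 <- lapply(stems, plot_time_pair, data = df_small, from = 1, to = 2)
names(plots_12) <- stems
```

If this is on the right track, `plots_12[["BCSTCAT"]]` would display a single plot, and the same call with `from = 2, to = 3` would presumably cover time 2 vs time 3.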