I have written a function that plots a number of lines along an axis, stacking them where they overlap. Below is the code, a sample table, and the image that it produces.
The plot is mostly exactly what I was looking for but for a few things (in order of importance):
Plotting the segments is an extremely slow process: about 1 segment every 0.5 seconds. Considering they are just lines I expected something much faster. I don't know the cause of this. I know that explicit loops can be slow in R so it might be this, or should I be plotting off-screen somehow and then pushing the plot to screen afterwards? Finding a time efficient method for plotting this kind of map is important because my tables can easily be tens of thousands of rows long.
I can't find any way of specifying the gap between y positions to be a fixed distance regardless of the number of Y positions. At the extreme, plotting just two segments produces a plot with the segments very far apart from each other.
Can anyone help me with either of these points (or indeed, anything else I could be doing better)?
(In this code reads == segments)
The function:
viewReads <- function(reads){
# sort by start
sorted <- reads[order(reads$start),];
#---
# In the first iteration we work out the y-axis
# positions that segments should be plotted on
# segments should be plotted on the next availible
# y position without merging with another segment
#---
yread <- c(); #keeps track of the x space that is used up by segments
# get x axis limits
minstart <- min(sorted$start);
maxend <- max(sorted$end);
# initialise yread
yread[1] <- minstart - 1;
ypos <- c(); #holds the y pos of the ith segment
# for each read
for (r in 1:nrow(sorted)){
read <- sorted[r,];
start <- read$start;
placed <- FALSE;
# iterate through yread to find the next availible
# y pos at this x pos (start)
y <- 1;
while(!placed){
if(yread[y] < start){
ypos[r] <- y;
yread[y] <- read$end;
placed <- TRUE;
}
# current y pos is used by another segment, increment
y <- y + 1;
# initialize another y pos if we're at the end of the list
if(y > length(yread)){
yread[y] <- minstart-1;
}
}
}
# find the maximum y pos that is used to size up the plot
maxy <- length(yread);
sorted$ypos <- ypos;
# Now we have all the information, start the plot
plot.new();
plot.window(xlim=c(minstart, maxend+((maxend-minstart)/10)), ylim=c(1,maxy));
axis(3);
#---
# This second iteration plots the segments using the found y pos and
# the start and end values
#---
for (r in 1:nrow(sorted)){
read <- sorted[r,];
# colour dependent on strand type
if(read$strand == '+'){
color = 'blue'
}else{
color = 'red'
}
#plot this segment!
segments(read$start, maxy-read$ypos, read$end, maxy-read$ypos, col=color);
}
}
Sample code:
start end strand
86 115 +
87 115 +
91 116 +
88 117 +
91 117 +
98 125 -
104 131 +
104 131 +
106 132 -
104 134 +
104 134 +
104 134 +
106 134 +
106 134 +
106 134 +
106 134 +
106 134 +
106 135 +
106 135 +
106 135 +
106 135 +
106 135 +
106 135 +
106 135 +
108 135 +
108 135 +
108 135 +
108 135 +
108 135 +
108 135 +
108 135 +
108 135 +
108 135 +
108 135 +
108 135 +
108 135 +
108 135 +
108 135 +
108 135 +
108 135 +
108 135 +
108 135 +
108 135 +
108 135 +
109 135 +
116 135 -
106 136 +
106 136 +
106 136 +
108 136 +
108 136 +
108 136 +
108 136 +
108 136 +
108 136 +
108 136 +
108 136 +
108 136 +
108 137 +
108 137 +
109 137 -
108 138 +
108 138 +
108 138 +
108 138 +
112 138 +
112 139 +
119 141 +
116 143 +
121 145 +
127 145 -
119 146 +
121 148 +
142 169 -
142 169 -
160 185 -
162 185 -
165 185 -
result: