I'm having a strange issue while looping over a large data frame to build a 3D barplot from two of its columns, with the Z axis showing the frequency. The original data frame looks like this (please excuse the excess columns):
> head(MergedBH)
                    Row.names           V1.x            V2.x V3.x  V4.x V5.x
RFL_Contig1       RFL_Contig1    RFL_Contig1 Scaffold3494078 1.00 1.000  470
RFL_Contig100   RFL_Contig100  RFL_Contig100 Scaffold2661063 0.61 0.975  236
RFL_Contig1000 RFL_Contig1000 RFL_Contig1000  Scaffold861300 0.96 0.995  451
RFL_Contig1001 RFL_Contig1001 RFL_Contig1001 Scaffold4753307 0.67 0.982  568
RFL_Contig1002 RFL_Contig1002 RFL_Contig1002  Scaffold317096 1.00 0.996 1513
RFL_Contig1003 RFL_Contig1003 RFL_Contig1003   Scaffold60619 0.90 1.000  698
                         V1.y                  V2.y V3.y  V4.y V5.y
RFL_Contig1       RFL_Contig1 ta_contig_5DS_2768763 1.00 1.000  572
RFL_Contig100   RFL_Contig100  ta_contig_4DS_482537 0.56 0.966  737
RFL_Contig1000 RFL_Contig1000 ta_contig_2AL_5829507 0.83 0.944 1573
RFL_Contig1001 RFL_Contig1001 ta_contig_7BS_3161139 1.00 0.999  910
RFL_Contig1002 RFL_Contig1002 ta_contig_3B_10401908 1.00 0.997 2681
RFL_Contig1003 RFL_Contig1003 ta_contig_2AL_6424276 0.70 1.000 1004
I want to create a 3D barplot where the x axis is $V4.x and the y axis is $V4.y. I'm not using the typical hist2d function because so much of the weight sits at the (1, 1) position, and I want to see the weight there alongside the other positions. To do this I built a 3-column matrix: columns 1 and 2 hold all pairwise combinations over the range of V4.x and V4.y respectively (0.8 to 1 in steps of 0.001), and column 3 holds the frequency. I build it with the lines below:
> for3d.mat <- matrix(ncol=3,nrow=0)
> for(i in seq(.8,1,by=.001)){for(j in seq(.8,1,by=.001)){iter.mat <- matrix(ncol=3,c(i,j,length(subset(MergedBH,MergedBH$V4.x==i & MergedBH$V4.y==j)$V4.x)));for3d.mat <- rbind(for3d.mat,iter.mat)}}
> subset(for3d.mat,for3d.mat[,1] == .975 & for3d.mat[,2] == .966)
[,1] [,2] [,3]
> for3d.mat[35350:35325,]
[,1] [,2] [,3]
[1,] 0.975 0.974 0
[2,] 0.975 0.973 0
[3,] 0.975 0.972 0
[4,] 0.975 0.971 0
[5,] 0.975 0.970 0
[6,] 0.975 0.969 0
[7,] 0.975 0.968 0
[8,] 0.975 0.967 0
[9,] 0.975 0.966 0
[10,] 0.975 0.965 0
[11,] 0.975 0.964 0
[12,] 0.975 0.963 0
[13,] 0.975 0.962 0
[14,] 0.975 0.961 0
[15,] 0.975 0.960 0
[16,] 0.975 0.959 0
[17,] 0.975 0.958 0
[18,] 0.975 0.957 0
Somehow the entry for RFL_Contig100 (0.975, 0.966) is not picked up by subset when working on the large matrix, and when I locate the corresponding row by index, its frequency is 0. But if I take that one line out of the for loop and run it on its own, it produces the correct entry:
> matrix(ncol=3,c(i,j,length(subset(MergedBH,MergedBH$V4.x==i & MergedBH$V4.y==j)$V4.x)))
[,1] [,2] [,3]
[1,] 0.975 0.966 1
Any suggestions on what the issue is? I've tried a few different ways of doing this but can't get around the subset function. Is there another way to compute the depth for each bin, so that I can use it in a 3D barplot that shows all points at once?
Thanks in advance
Update:
I get the same problem with '[': a large part of the matrix, between .92 and .98, is not being processed:
> for3d.mat <- matrix(ncol=3,nrow=0)
> for(i in seq(.8,1,by=.001)){for(j in seq(.8,1,by=.001)){iter.mat <- matrix(ncol=3,c(i,j,length(MergedBH[MergedBH$V4.x ==i & MergedBH$V4.y ==j,]$V4.x)));for3d.mat <- rbind(for3d.mat,iter.mat)}}
> for3d.mat[for3d.mat[,1] == .975 & for3d.mat[,2] == .966,]
[,1] [,2] [,3]
I am able to use '[' or subset on most of the matrix, but there is a specific range, in both the original data frame and for3d.mat, that neither subsetting method can reach. Example below:
> for3d.mat[for3d.mat[,1] == .976 & for3d.mat[,2] == .937,]
[1] 0.976 0.937 NA
> for3d.mat[for3d.mat[,1] == .975 & for3d.mat[,2] == .937,]
[,1] [,2] [,3]
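On the question of another way to get the per-bin depth, here is a minimal sketch that never uses == on decimal values: it converts each value to an integer number of thousandths and cross-tabulates. The data frame toy holds only the V4.x and V4.y values from the head(MergedBH) output above, and the names bins, x_bin, y_bin, counts and for3d.df are placeholders invented for this sketch. Whether exact comparison of doubles is actually what goes wrong in the loop is an assumption and has not been confirmed here.

```r
# Toy stand-in for MergedBH: the six V4.x / V4.y pairs shown by head(MergedBH).
toy <- data.frame(
  V4.x = c(1.000, 0.975, 0.995, 0.982, 0.996, 1.000),
  V4.y = c(1.000, 0.966, 0.944, 0.999, 0.997, 1.000)
)

# Integer bin labels in thousandths, covering 0.800 to 1.000.
bins <- 800:1000

# Round each value to its nearest thousandth and store it as a factor over
# all bins, so empty bins are kept and values outside the range become NA.
x_bin <- factor(round(toy$V4.x * 1000), levels = bins)
y_bin <- factor(round(toy$V4.y * 1000), levels = bins)

# 201 x 201 contingency table: one cell per (x, y) bin; NA entries are dropped.
counts <- table(x = x_bin, y = y_bin)

# Long format with the same three-column layout: x, y, frequency.
for3d.df <- as.data.frame(counts, stringsAsFactors = FALSE)
for3d.df$x <- as.integer(for3d.df$x) / 1000
for3d.df$y <- as.integer(for3d.df$y) / 1000
```

A single bin can then be looked up by its integer labels, for example counts["975", "966"], which sidesteps comparing 0.975 or 0.966 as doubles.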