I am running a coin-toss simulation with a loop which runs about 1 million times.
Each time I run the loop I wish to retain the table output from the rle command. Unfortunately, a simple append does not seem to be appropriate. Each run of the loop produces a slightly different amount of data, which seems to be one of the sticking points.
This code gives an idea of what I am doing:
N <- 5  # number of times to run
rlex <- NULL
# begin loop #############################
for (i in 1:N) {  # repeat N times
  x <- sample(0:1, 100000, replace = TRUE)  # 100000 fair coin tosses
  rlex <- append(rlex, rle(x))
}
table(rlex)    # doesn't work
table(rle(x))  # only the last iteration
So instead of having five separate rle results (in this simulation, 1 million in the full version), I want one merged rle table. Hope this is clear. Obviously my actual code is a bit more complex, hence any solution should be as close to what I have specified as possible.
UPDATE: The loop is an absolute requirement. No ifs or buts. Perhaps I can pull out the table(rle(x)) data and put it into a matrix. However again the stumbling block is the fact that some of the less frequent run lengths do not always turn up in each loop. Thus I guess I am looking to conditionally fill a matrix based on the run length number?
Last update before I give up: Retaining the rle$values will mean that too much data is being retained. My simulation is large-scale and I really only wish to retain the table output of the rle. Either I retain each table(rle(x)) for each loop and combine by hand (there will be thousands), or I find a programmatic way to keep the data (yes for zeroes and ones) and have one table that is formed from merging each of the individual loops as I go along.
Either this is easyish to do, as specified, or I will not be doing it. It may seem a silly idea/request, but that should be incidental to whether it can be done.
Seriously last time. Here is an animated gif showing what I expect to happen.
After each iteration of the loop data is added to the table. This is as clear as I am going to be able to communicate it.
Following up @CarlWitthoft's answer, you probably want:
since I think you don't care about the $values component (i.e. whether each run is a run of zeros or ones).
Result: one long vector of run lengths.
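A minimal sketch of this first variant, assuming the loop from the question is kept as is and only the lengths component of each rle() result is appended (the set.seed() call is an addition for reproducibility, not part of the question):

```r
set.seed(1)  # assumption: fixed seed so the run is reproducible
N <- 5
rlex <- NULL
for (i in 1:N) {
  x <- sample(0:1, 100000, replace = TRUE)
  # keep only the run lengths; drop $values (0-run vs 1-run)
  rlex <- append(rlex, rle(x)$lengths)
}
table(rlex)  # one merged table of run lengths over all iterations
```

Note that rlex grows on every iteration, so with a million iterations this stores every run length ever seen.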
But this would probably be a lot more efficient:
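A sketch of what this could look like, assuming a preallocated matrix and a fixed cap maxlen on the run lengths that get recorded. The name rlemat and the value maxlen = 50 are illustrative choices, and tabulate() silently ignores any run longer than nbins:

```r
set.seed(1)   # assumption: fixed seed for reproducibility
N <- 5
maxlen <- 50  # assumption: no run of interest is longer than this
rlemat <- matrix(0L, nrow = N, ncol = maxlen)
for (i in 1:N) {
  x <- sample(0:1, 100000, replace = TRUE)
  # tabulate() always returns maxlen counts, so run lengths that do
  # not occur in this iteration simply show up as zeros
  rlemat[i, ] <- tabulate(rle(x)$lengths, nbins = maxlen)
}
rlemat[, 1:10]  # counts of run lengths 1..10, one row per iteration
```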
Result: an N by maxlen table of run lengths from each iteration.
If you only want to save the total number of runs of each length you could try:
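A possible sketch, assuming the same hypothetical cap maxlen = 50 and a running total that is updated inside the loop. The accumulator is numeric rather than integer because, over a million iterations, the count of short runs could exceed R's integer range:

```r
set.seed(1)   # assumption: fixed seed for reproducibility
N <- 5
maxlen <- 50  # assumption: runs longer than this are ignored
rletot <- numeric(maxlen)  # double, to avoid integer overflow in long simulations
for (i in 1:N) {
  x <- sample(0:1, 100000, replace = TRUE)
  rletot <- rletot + tabulate(rle(x)$lengths, nbins = maxlen)
}
names(rletot) <- seq_len(maxlen)  # label each count with its run length
rletot[rletot > 0]
```

Only maxlen numbers are kept in memory, regardless of N.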
Result: a vector of length maxlen of the total numbers of run lengths across all iterations.
And here's my final answer:
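One way this might be written, assuming two columns of running totals, one per run type. The dimnames and the cap maxlen = 50 are illustrative assumptions:

```r
set.seed(1)   # assumption: fixed seed for reproducibility
N <- 5
maxlen <- 50  # assumption: runs longer than this are ignored
rletot <- matrix(0, nrow = maxlen, ncol = 2,
                 dimnames = list(length = seq_len(maxlen),
                                 value = c("0", "1")))
for (i in 1:N) {
  x <- sample(0:1, 100000, replace = TRUE)
  r <- rle(x)
  # tabulate the run lengths separately for runs of 0s and runs of 1s
  rletot[, "0"] <- rletot[, "0"] +
    tabulate(r$lengths[r$values == 0], nbins = maxlen)
  rletot[, "1"] <- rletot[, "1"] +
    tabulate(r$lengths[r$values == 1], nbins = maxlen)
}
rletot[1:10, ]  # totals for run lengths 1..10, split by run type
```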
Result: a maxlen by 2 table of the total numbers of run lengths across all iterations, divided by type (0-run vs 1-run).
OK, attempt number 4:
Produces:
If you want the aggregation inside the loop, do:
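A sketch of in-loop aggregation, assuming the per-iteration table() output is merged into a named running total keyed by run length, so that lengths missing from one iteration cause no trouble. The variable names total, tab, and new are hypothetical:

```r
set.seed(1)  # assumption: fixed seed for reproducibility
N <- 5
total <- numeric(0)  # named vector; names are run lengths seen so far
for (i in 1:N) {
  x <- sample(0:1, 100000, replace = TRUE)
  tab <- table(rle(x)$lengths)
  # add a zero entry for any run length not seen in earlier iterations
  new <- setdiff(names(tab), names(total))
  total[new] <- 0
  # add this iteration's counts to the matching entries
  total[names(tab)] <- total[names(tab)] + as.vector(tab)
}
total <- total[order(as.integer(names(total)))]  # sort by run length
total
```

Unlike the tabulate() variants, this needs no upper bound on the run length, at the cost of name lookups on every iteration.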
You need to read the help page for rle. Consider:
In the meantime, I strongly suggest you spend some time reading up on statistical methods. There is zero (+/- epsilon) chance that running a binomial simulation a million times will tell you anything you won't learn after a few hundred tries, unless your coin has p=1e-5 :-).
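To make the point about the rle help page concrete, here is a small illustration (on a toy vector chosen for this example) of what rle() returns: a list with a lengths component and a values component, which is why appending whole rle objects does not give something table() can digest directly.

```r
x <- c(0, 0, 1, 1, 1, 0)  # toy input, chosen for illustration
r <- rle(x)
r$lengths         # 2 3 1  (length of each run)
r$values          # 0 1 0  (value repeated in each run)
table(r$lengths)  # tabulates run lengths only, ignoring run type
inverse.rle(r)    # reconstructs the original vector x
```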