This question already has an answer here:
- Consecutive group number in R 3 answers
I am currently dealing with a car data. We recorded the speed of the car every 5 minutes, and it contains a lot of zero values. My question is, how to segment the data by a zero values and give each non-zero section a ordered number in R? Let's take a sample data as example:
sample <- data.frame(
id = 1:15,
speed = c(50,0, 0, 30, 50, 40,0, 0, 25, 30, 50, 0, 30, 50, 40))
I want to add a new column that gives each non-zero section a number (starting from 1), while a consecutive number of k zero speeds (or more) is numbered as 0. Specifically for this sample data, let's say k equals 2, then my desired result should be like this dataframe:
sample_new <- data.frame(
id = 1:15,
speed = c(50,0, 0, 0, 50, 40,0, 0, 25, 30, 50, 0, 30, 50, 40),
number = c(1, 0, 0, 0, 2, 2, 0 ,0, 3, 3, 3, 3, 3, 3, 3))
which prints as
id speed number
1 1 50 1
2 2 0 0
3 3 0 0
4 4 0 0
5 5 50 2
6 6 40 2
7 7 0 0
8 8 0 0
9 9 25 3
10 10 30 3
11 11 50 3
12 12 0 3** <- here is the difference
13 13 30 3
14 14 50 3
15 15 40 3
There are more than 1 million rows in my data, so I hope that the solution could be acceptable in speed.
The reason for setting a threshold "k" is that, some drivers just leave their GPS open even if they lock the car and go to sleep. But in other occasion where the interval is less than k, they just stopped because of the crossroad light. I want to focus on the longtime stops and just ignore the short time stops.
Hope my question makes sense to you.Thank you.