I want to select the rows of a data frame in which the string in column v3 has the same length as the string in column v4. My data frame 'df' looks like:
    v1 v2    v3          v4
1  456  .     C           T
2  462  .     C           T
3  497  .     C           T
4  499  .    GC          AC
5  499  .    GC           G
6  499  .    GC          CC
7  513  . GCACA         GCA
8  513  . GCACA     GCACACA
9  513  . GCACA       ACACA
10 513  . GCACA   GCACACACA
11 513  . GCACA GCACACACACA
12 513  . GCACA     GACCACA
13 513  . GCACA           G
14 521  .   ACN           A
15 522  .   CNN           C
The output should be:
   v1 v2    v3    v4
1 456  .     C     T
2 462  .     C     T
3 497  .     C     T
4 499  .    GC    AC
9 513  . GCACA ACACA
I have tried:
new_df = df[nchar(str_sub(df$v3))==nchar(str_sub(df$v4))]
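For reference, here is a minimal sketch of the row filter in base R. It assumes v3 and v4 hold strings; the as.character() calls are only there in case the columns were read in as factors. The attempt above has no comma inside `[ ]`, so the logical vector appears to be applied to columns rather than rows. str_sub() with no start/end returns the whole string, so it can be dropped. The data frame below is rebuilt by hand from the sample, purely for illustration.

```r
# Sample data rebuilt from the table in the question (illustrative only).
df <- data.frame(
  v1 = c(456, 462, 497, 499, 499, 499, 513, 513, 513, 513, 513, 513, 513, 521, 522),
  v2 = ".",
  v3 = c("C", "C", "C", "GC", "GC", "GC", rep("GCACA", 7), "ACN", "CNN"),
  v4 = c("T", "T", "T", "AC", "G", "CC", "GCA", "GCACACA", "ACACA",
         "GCACACACA", "GCACACACACA", "GACCACA", "G", "A", "C"),
  stringsAsFactors = FALSE
)

# Compare string lengths element-wise; as.character() guards against factors.
same_len <- nchar(as.character(df$v3)) == nchar(as.character(df$v4))

# The trailing comma makes this a row subset (all columns are kept).
new_df <- df[same_len, ]
print(new_df)
```

On the sample data this keeps rows 1, 2, 3, 4, 6 and 9. Row 6 (GC / CC) is included because both strings have length 2.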