I have decided to spend some time to learn dplyr thoroughly. I have just come across the select()
function and some of the helper functions that come with it.
By just playing around, I have failed to find any difference between the contains() and matches() helper functions.
Could someone please provide an example of how they can be used for different purposes?
Thank you,
The difference is that matches() accepts a regular expression as the pattern for matching column names, while contains() does a literal match of a substring (or of the full name). This is described in ?select_helpers.
Consider a simple example where we want to select columns whose names contain the substring 'col'.
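A minimal sketch of that literal-substring case follows. The data frame `df` and its column names are made up for illustration; they are not from the original post.

```r
library(dplyr)

# Hypothetical data frame, for illustration only
df <- data.frame(col1 = 1:3, col2 = 4:6, colour = c("a", "b", "c"), other = 7:9)

# contains() treats its argument as a literal substring of the column name
contains_cols <- df %>% select(contains("col"))
names(contains_cols)
# "col1" "col2" "colour"
```

Every column whose name includes the characters "col" is kept, regardless of what follows them.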
With contains("col"), the string 'col' is matched literally in the column names, and those columns are selected. If we change the pattern to 'col' followed by one or more digits (\\d+) and still use contains(), it fails, because it looks for the literal substring "col\\d+" in the column names, whereas matches() treats the pattern as a regex and selects the columns that match it.
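The contrast can be sketched as below, again with hypothetical data frames (`df` and `df2` are assumptions for illustration). The second part shows the reverse situation, where a regex metacharacter such as "." appears in a column name and the literal behaviour of contains() is the one you want. In current dplyr versions a helper that matches nothing returns a data frame with no columns rather than an error.

```r
library(dplyr)

# Hypothetical data frame, for illustration only
df <- data.frame(col1 = 1:3, col2 = 4:6, colname = 7:9, other = 10:12)

# contains() looks for the literal text col\d+ in the names: nothing matches
literal_cols <- names(select(df, contains("col\\d+")))
literal_cols
# character(0)

# matches() interprets the same string as a regex: 'col' followed by digits
regex_cols <- names(select(df, matches("col\\d+")))
regex_cols
# "col1" "col2"

# Reverse case: a name containing a regex metacharacter
df2 <- data.frame(a.b = 1, ab = 2)

dot_literal <- names(select(df2, contains(".")))   # literal dot only
dot_regex   <- names(select(df2, matches(".")))    # "." means any character
dot_literal
# "a.b"
dot_regex
# "a.b" "ab"
```

In short, reach for contains() when the text you are looking for should be taken as-is, and for matches() when you need pattern features such as digit classes, anchors, or alternation.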