To make my code more readable, I like to avoid reusing the names of existing objects when I create new ones. Because R is package-based and functions are first-class objects, it is easy to overwrite common functions that are not in base R: a popular package might use a short function name, and unless you know which package to load there is no way to check for it. Objects such as the built-in logicals T and F also cause trouble.
Some examples that come to mind are:
One letter
- c
- t
- T/F
- J
Two letters
- df
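As a small illustration of the kind of trouble meant here, this is a minimal sketch of masking `T`. The assignment is made inside `local()` so it does not leak into the session, and the variable name `masked` is made up for the example.

```r
# Illustration only: reassign T inside a throwaway environment.
masked <- local({
  T <- 0        # masks base::T within this block only
  isTRUE(T)     # T is now the number 0, not the logical TRUE
})
masked

# Outside the block, the built-in T is untouched.
isTRUE(T)
```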
A better solution might be to avoid using short names altogether in favor of more descriptive ones, and I generally try to do that as a matter of habit. Yet "df" for a function which manipulates a generic data.frame is plenty descriptive and a longer name adds little, so short names have their uses. In addition, for SO questions where the larger context isn't necessarily known, coming up with descriptive names is well-nigh impossible.
What other one- and two-letter variable names conflict with existing R objects? Which among those are sufficiently common that they should be avoided? If they are not in `base`, please list the package as well. The best answers will involve at least some code; please provide it if used.
Note that I am not asking whether overwriting functions that already exist is advisable. That question is already addressed on SO:
In R, what exactly is the problem with having variables with the same name as base R functions?
For visualizations of some answers here, see this question on CV:
https://stats.stackexchange.com/questions/13999/visualizing-2-letter-combinations
`apropos` is ideal for this:
With no packages loaded, this returns:
The exact contents will depend upon the search list. Try loading a few packages and re-running it if you care about conflicts with packages that you commonly use.
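The exact call is not preserved here, so the following is only a sketch of what such a search might look like. The regular expression is an assumption (names made of exactly one or two letters), and `short_names` is a name chosen for the example.

```r
# apropos() matches a regular expression against object names
# in every environment on the search list.
# Assumed pattern: exactly one or two alphabetic characters.
short_names <- apropos("^[[:alpha:]]{1,2}$")
short_names
```

What comes back depends on what is attached at the time, so results will differ between sessions.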
I loaded all the (>200) packages installed on my machine with this:
And reran the call to `apropos`, wrapping it in `unique`, since there were a few duplicates. This returned:
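One way the two steps above could be reproduced is sketched below. The helper `load_all_installed` is hypothetical, not the code originally used, and the regular expression is the same assumed one- or two-letter pattern. Attaching every installed package can be slow and noisy, so the call is left commented out.

```r
# Hypothetical helper: try to attach every installed package.
load_all_installed <- function() {
  pkgs <- rownames(installed.packages())
  # require() returns FALSE (with a warning) rather than stopping
  # when a package cannot be loaded.
  invisible(suppressWarnings(
    sapply(pkgs, require, character.only = TRUE, quietly = TRUE)
  ))
}

# load_all_installed()   # uncomment to attach everything

# A name that exists in several attached packages is listed once
# per package, so deduplicate the result.
short_names_all <- unique(apropos("^[[:alpha:]]{1,2}$"))
short_names_all
```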
You can see where they came from with
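The original snippet is missing here. One plausible way to do it, offered as an assumption rather than the code actually used, is `find`, which reports the search-list entries where a name is defined. The vector `candidates` is a made-up subset for illustration.

```r
# Hypothetical subset of short names to look up
candidates <- c("c", "t", "df")

# find() returns the search-list entries that contain each name
origins <- sapply(candidates, find, simplify = FALSE)
origins
```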
Been thinking about this more. Here's a list of one-letter object names in base R:
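The listing itself is not preserved, but a sketch along these lines should regenerate it, assuming that filtering `ls("package:base")` by name length is an acceptable stand-in for whatever was run originally. Note that it also picks up operators such as `+`, which are ordinary objects in base.

```r
# Visible names in base (ls() hides names starting with a dot)
base_names <- ls("package:base")

# Keep only names that are a single character long
one_letter <- base_names[nchar(base_names) == 1]
one_letter
```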
And one- and two-letter object names in base R:
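Again as a sketch rather than the original code, the same length filter extended to two characters gives the longer list. `short_base` is a name chosen for the example.

```r
# Names in base that are at most two characters long
base_names <- ls("package:base")
short_base <- base_names[nchar(base_names) <= 2]

length(short_base)        # how many there are
table(nchar(short_base))  # split into one- and two-character names
"if" %in% short_base      # control-flow keywords are objects in base too
```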
That's a much bigger list than I initially suspected, although I would never think of naming a variable "if", so to a certain degree it makes sense.
Still doesn't capture object names not in base, or give any sense of which functions are best avoided. I think a better answer would either use expert opinion to figure out which functions are important (e.g. using `c` is probably worse than using `qf`) or use a data mining approach on a bunch of R code to see what short-named functions get used the most.
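For the data mining idea, here is a rough sketch of how the counting might work on a single piece of code, using the parser's token table from `utils::getParseData`. The function `count_short_calls` and the sample code are made up for illustration. A real analysis would read many files and pool the counts.

```r
# Count calls to short-named functions in a piece of R source code.
count_short_calls <- function(code, max_len = 2) {
  # keep.source = TRUE is needed so that parse data is recorded
  parsed <- parse(text = code, keep.source = TRUE)
  tokens <- utils::getParseData(parsed)
  # Function calls are tagged SYMBOL_FUNCTION_CALL by the parser
  calls <- tokens$text[tokens$token == "SYMBOL_FUNCTION_CALL"]
  short <- calls[nchar(calls) <= max_len]
  sort(table(short), decreasing = TRUE)
}

# Made-up sample input
sample_code <- "x <- c(1, 2, 3)\ny <- t(x)\nz <- c(y, mean(x))"
count_short_calls(sample_code)
```

This only counts names used in call position, so a variable named `c` that is never called would not show up.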