I have two lists of IDs.
I would like to compare the two lists. In particular, I am interested in the following figures:
- How many IDs are both in list A and B
- How many IDs are in A but not in B
- How many IDs are in B but not in A
I would also love to draw a Venn diagram.
Here are some basics to try out:
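A minimal sketch of those basics, assuming two hypothetical character vectors `A` and `B` (the values are made up for illustration) and using only base R set functions:

```r
# Hypothetical example data (assumed for illustration)
A <- c("dog", "cat", "mouse")
B <- c("tiger", "lion", "cat")

# Logical vector: for each element of A, does it also occur in B?
a_in_b <- A %in% B

# IDs present in both vectors
in_both <- intersect(A, B)

# IDs present in A but not in B, and in B but not in A
only_A <- setdiff(A, B)
only_B <- setdiff(B, A)

print(in_both)
print(only_A)
print(only_B)
```

Note that `intersect` and `setdiff` treat their arguments as sets, so duplicated IDs are reported only once.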
Similarly, you could get counts simply as:
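For instance, wrapping the set functions in `length()` gives the three figures asked for. This is a sketch on hypothetical vectors `A` and `B`, and it counts unique IDs, since the set functions drop duplicates:

```r
# Hypothetical example data (assumed for illustration)
A <- c("dog", "cat", "mouse")
B <- c("tiger", "lion", "cat")

n_both   <- length(intersect(A, B))  # IDs in both A and B
n_only_A <- length(setdiff(A, B))    # IDs in A but not in B
n_only_B <- length(setdiff(B, A))    # IDs in B but not in A

print(c(both = n_both, only_A = n_only_A, only_B = n_only_B))
```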
With sqldf: slower, but well suited to data frames with mixed types:
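A sketch of how this might look, assuming the `sqldf` package is installed and using two hypothetical one-column data frames `dfA` and `dfB` (names and values are made up for illustration). It relies on the SQL set operators `INTERSECT` and `EXCEPT`, which the default SQLite backend supports:

```r
# Hypothetical single-column data frames of IDs (assumed for illustration)
dfA <- data.frame(id = c("dog", "cat", "mouse"), stringsAsFactors = FALSE)
dfB <- data.frame(id = c("tiger", "lion", "cat"), stringsAsFactors = FALSE)

# Only run the queries if sqldf is available
if (requireNamespace("sqldf", quietly = TRUE)) {
  library(sqldf)

  # IDs in both data frames
  in_both <- sqldf("SELECT id FROM dfA INTERSECT SELECT id FROM dfB")
  # IDs in dfA but not in dfB, and the reverse
  only_A <- sqldf("SELECT id FROM dfA EXCEPT SELECT id FROM dfB")
  only_B <- sqldf("SELECT id FROM dfB EXCEPT SELECT id FROM dfA")

  print(c(both = nrow(in_both), only_A = nrow(only_A), only_B = nrow(only_B)))
}
```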
Using the same example data as one of the answers above.
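As a sketch, assume the example data looks like the hypothetical vectors below, chosen so that `cat` is the second element of `A` and the third element of `B`:

```r
# Hypothetical example data (assumed for illustration)
A <- c("dog", "cat", "mouse")
B <- c("tiger", "lion", "cat")

# For each element of A, the position of its first match in B (NA if none)
pos <- match(A, B)
print(pos)
```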
The `match` function returns a vector with the location in `B` of all values in `A`. So, `cat`, the second element in `A`, is the third element in `B`. There are no other matches.
To get the matching values in `A` and `B`, you can do:
To get the non-matching values in `A` and `B`:
Further, you can use `length()` to get the total number of matching and non-matching values.
I'm usually dealing with large-ish sets, so I use a table instead of a Venn diagram:
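The steps above could be sketched as follows. The vectors are hypothetical, and the cross-table at the end is just one possible way to summarise the overlap with `table()`, not necessarily the layout the answer had in mind:

```r
# Hypothetical example data (assumed for illustration)
A <- c("dog", "cat", "mouse")
B <- c("tiger", "lion", "cat")

pos <- match(A, B)                    # position in B of each element of A

matching <- A[!is.na(pos)]            # values of A that also occur in B
non_matching_A <- A[is.na(pos)]       # values of A with no match in B
non_matching_B <- B[is.na(match(B, A))]  # values of B with no match in A

# Totals of matching and non-matching values
n_matching <- length(matching)
n_non_matching <- length(non_matching_A) + length(non_matching_B)

# Cross-table over all distinct IDs: rows = membership in A, columns = in B
ids <- union(A, B)
overlap <- table(in_A = ids %in% A, in_B = ids %in% B)
print(overlap)
```

The cell `overlap["TRUE", "TRUE"]` holds the number of IDs in both vectors, while the two off-diagonal cells hold the IDs unique to `A` and to `B`.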
Yet another way is to use `%in%` and boolean vectors of the common elements instead of `intersect` and `setdiff`. I take it you actually want to compare two vectors, not two lists: a list is an R class that may contain any type of element, while vectors always contain elements of just one type, which makes it easier to compare what is truly equal. Here the elements are transformed to character strings, as that was the most inflexible element type that was present.
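A minimal sketch of that idea, with hypothetical input (one numeric vector, one character vector) converted to character before comparing:

```r
# Hypothetical example data of different types (assumed for illustration)
A <- as.character(c(1, 2, 3, 4))
B <- as.character(c("3", "4", "five"))

# Boolean vectors marking the common elements in each vector
common_A <- A %in% B
common_B <- B %in% A

in_both <- A[common_A]    # elements found in both vectors
only_A  <- A[!common_A]   # elements of A not found in B
only_B  <- B[!common_B]   # elements of B not found in A

# Counts follow directly from the boolean vectors
print(c(both = sum(common_A), only_A = sum(!common_A), only_B = sum(!common_B)))
```

Unlike `intersect` and `setdiff`, this keeps duplicated elements, so the counts refer to positions in the vectors rather than to unique values.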
As mentioned, there are multiple options for plotting Venn diagrams in R. Here is the output using gplots.
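A sketch of how such a diagram could be produced, assuming the `gplots` package is installed and using hypothetical vectors `A` and `B`. The `venn()` function accepts a named list of vectors:

```r
# Hypothetical example data (assumed for illustration)
A <- c("dog", "cat", "mouse")
B <- c("tiger", "lion", "cat")

# Draw the diagram only if gplots is available
if (requireNamespace("gplots", quietly = TRUE)) {
  # The list names become the set labels in the plot
  gplots::venn(list(A = A, B = B))
}
```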