I have a data.frame called d. In this data.frame, some columns hold a constant value across all rows that share the same value of the first column, study.name (see below).
For example, columns ESL, ESL.1, prof, and prof.1 are constant for all rows of Shin.Ellis, also constant for all rows of Trus.Hsu, and so on.
Q: In BASE R, how can I separate such constant variables, and then condense them to one row with only one number?
My desired output is shown further below. A functional answer is appreciated.
d <- read.csv("https://raw.githubusercontent.com/izeh/m/master/irr.csv", header = TRUE)[-(2:3)]
## FIRST 8 ROWS:
#   study.name ESL prof scope type ESL.1 prof.1 scope.1 type.1
# 1 Shin.Ellis   1    2     1    1     1      2       1      1
# 2 Shin.Ellis   1    2     1    1     1      2       1      1
# 3 Shin.Ellis   1    2     1    2     1      2       1      1
# 4 Shin.Ellis   1    2     1    2     1      2       1      1
# 5 Shin.Ellis   1    2    NA   NA     1      2      NA     NA
# 6 Shin.Ellis   1    2    NA   NA     1      2      NA     NA
# 7   Trus.Hsu   2    2     2    1     2      2       1      1
# 8   Trus.Hsu   2    2    NA   NA     2      2      NA     NA
Desired output:
#   study.name ESL prof ESL.1 prof.1
# 1 Shin.Ellis   1    2     1      2
# 2   Trus.Hsu   2    2     2      2
# .          .   .    .     .      .   # AND SO ON !!!
You could try something like this, though it feels a bit clumsy. Basically, check which columns are constant within every group, keep only those columns, and then keep only the unique rows (since the remaining columns are now constant by group).
Created on 2019-10-09 by the reprex package (v0.3.0)
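The three steps described in this answer can be sketched in base R roughly as follows. This is only a sketch, not the answerer's original code: the data frame is rebuilt by hand from the eight rows printed in the question, the helper name is_const is hypothetical, and an NA is assumed to count as a distinct value (which is what excludes scope.1 and type.1 here).

```r
# Toy copy of the first 8 rows shown in the question (typed in by hand)
d <- data.frame(
  study.name = c(rep("Shin.Ellis", 6), rep("Trus.Hsu", 2)),
  ESL     = c(1, 1, 1, 1, 1, 1, 2, 2),
  prof    = c(2, 2, 2, 2, 2, 2, 2, 2),
  scope   = c(1, 1, 1, 1, NA, NA, 2, NA),
  type    = c(1, 1, 2, 2, NA, NA, 1, NA),
  ESL.1   = c(1, 1, 1, 1, 1, 1, 2, 2),
  prof.1  = c(2, 2, 2, 2, 2, 2, 2, 2),
  scope.1 = c(1, 1, 1, 1, NA, NA, 1, NA),
  type.1  = c(1, 1, 1, 1, NA, NA, 1, NA)
)

# Hypothetical helper: TRUE if x takes exactly one distinct value in every group of g
# (assumption: NA is treated as a value of its own)
is_const <- function(x, g) {
  all(tapply(x, g, function(v) length(unique(v)) == 1))
}

# Step 1: find the columns that are constant within every study
value_cols <- setdiff(names(d), "study.name")
keep <- value_cols[sapply(d[value_cols], is_const, g = d$study.name)]

# Steps 2 and 3: keep only those columns, then collapse to one row per study
out <- unique(d[c("study.name", keep)])
rownames(out) <- NULL
out
```

On this toy data, out has one row per study and the columns study.name, ESL, prof, ESL.1, and prof.1.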
If you just want to remove rows that are repeated across all columns, unique() does that in base R.
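As a small illustration of what unique() does on a data frame, here is a sketch on a made-up three-column frame (toy is hypothetical, not the question's data). It drops only rows that are identical in every column, so a column that varies within a study still leaves several rows for that study.

```r
# Hypothetical toy data: 'type' varies within Shin.Ellis, 'ESL' does not
toy <- data.frame(
  study.name = c("Shin.Ellis", "Shin.Ellis", "Shin.Ellis", "Trus.Hsu"),
  ESL        = c(1, 1, 1, 2),
  type       = c(1, 1, 2, 1)
)

# Whole-row de-duplication: row 2 duplicates row 1 and is dropped, row 3 stays
all_cols <- unique(toy)

# Restricting to the constant column first gives one row per study
const_only <- unique(toy[c("study.name", "ESL")])

nrow(all_cols)    # 3
nrow(const_only)  # 2
```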
EDIT - Thanks for the clarification @CalumYou - I think this is what OP is looking for in base R.
Maybe we need
Or, in base R, after grouping by 'study.name', get the first row of each group while specifying na.action = NULL, since the default option is na.omit, which drops any row having an NA in any of the columns.
If we want to subset the columns
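A minimal sketch of the base R idea described above, using aggregate() with a formula and na.action = NULL. The original answer's code is not shown, so this is an assumption about its shape: the data frame is a hand-typed copy of the eight rows in the question, and the list of constant columns is hard-coded from the question's desired output rather than detected. Note that taking the first row per group does not itself check that a column is constant.

```r
# Toy copy of the first 8 rows shown in the question (typed in by hand)
d <- data.frame(
  study.name = c(rep("Shin.Ellis", 6), rep("Trus.Hsu", 2)),
  ESL     = c(1, 1, 1, 1, 1, 1, 2, 2),
  prof    = c(2, 2, 2, 2, 2, 2, 2, 2),
  scope   = c(1, 1, 1, 1, NA, NA, 2, NA),
  type    = c(1, 1, 2, 2, NA, NA, 1, NA),
  ESL.1   = c(1, 1, 1, 1, 1, 1, 2, 2),
  prof.1  = c(2, 2, 2, 2, 2, 2, 2, 2),
  scope.1 = c(1, 1, 1, 1, NA, NA, 1, NA),
  type.1  = c(1, 1, 1, 1, NA, NA, 1, NA)
)

# First row of every study; na.action = NULL keeps rows that contain NA
# instead of letting the default na.omit drop them before grouping
first <- aggregate(. ~ study.name, data = d,
                   FUN = function(x) x[1], na.action = NULL)

# Subset to the columns the question lists as constant (hard-coded assumption)
const_cols <- c("ESL", "prof", "ESL.1", "prof.1")
res <- first[c("study.name", const_cols)]
res
```

If the constant columns are not known in advance, they would have to be detected first, for example by comparing the number of distinct values per group.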