I am working with a dataframe where each observation is linked to a specific ID, and I have a set of variables that define the "values" as if I had a factor variable. However, the value in the "cell" is the frequency. Here is a simplified version:
ID 1 2 3
A 2 3 2
B 1 4 1
I would like to get two vectors that expand the frequencies so that I can calculate an interpolated median for each ID. That is, I'd like something of the form:
A B
1 1
1 2
2 2
2 2
2 2
3 3
3
The `psych` package has a function, `interp.median`, that could then take each vector and return the interpolated median for each ID, which I would like to include as a new variable in the original dataframe. I checked out the `vcdExtra` package, which could maybe do this with its `expand.dft` function, but I'm not sure exactly how it would work.
Any help would be greatly appreciated!
EDIT: To refine a bit more, `interp.median` would work best if the final result were a data frame, with NAs padded at the end. That is, something of the form:
A B
1 1
1 2
2 2
2 2
2 2
3 3
3 NA
If `dat` is the dataset:
Or:
Data:
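The code for this answer is not reproduced here, so what follows is only a minimal sketch of one way to do the expansion, not necessarily what the answer used. It assumes `dat` holds the question's sample data with the value labels kept as column names (hence `check.names = FALSE`); the name `expanded` is made up for illustration.

```r
# Assumed reconstruction of the sample data from the question.
dat <- data.frame(ID = c("A", "B"),
                  `1` = c(2, 1), `2` = c(3, 4), `3` = c(2, 1),
                  check.names = FALSE)

# Split the frequency columns by ID, then repeat each value
# (taken from the column name) as many times as its frequency.
expanded <- lapply(split(dat[-1], dat$ID), function(row) {
  rep(as.numeric(names(row)), times = unlist(row))
})

expanded$A
expanded$B
```

The result is a named list with one numeric vector per ID, which is convenient because the vectors can differ in length.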
Here is one way:
Yields:
Sample data:
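This answer's code and output are not shown, so the following is a hedged sketch of one possible route rather than the answer's own method: reshape the table to long form (one row per ID and value), repeat each row by its frequency, and split by ID. The names `long`, `rows`, and `expanded` are hypothetical, and the sample data is rebuilt from the question.

```r
# Assumed reconstruction of the sample data from the question.
dat <- data.frame(ID = c("A", "B"),
                  `1` = c(2, 1), `2` = c(3, 4), `3` = c(2, 1),
                  check.names = FALSE)

# Long form: one row per (ID, value) pair with its frequency.
# unlist() walks the frequency columns one after another, so the
# ID vector cycles while each value is held for nrow(dat) rows.
long <- data.frame(
  ID    = rep(as.character(dat$ID), times = ncol(dat) - 1),
  value = rep(as.numeric(names(dat)[-1]), each = nrow(dat)),
  freq  = unlist(dat[-1], use.names = FALSE)
)

# Repeat each row `freq` times, then collect the values by ID.
rows <- long[rep(seq_len(nrow(long)), times = long$freq), ]
expanded <- split(rows$value, rows$ID)
```

The long form is more verbose than a direct `rep` per row, but it keeps the ID next to every expanded value, which can help if further grouping is needed.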
Use `apply`:
This outputs characters, but you could output numbers like this:
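The original snippets are missing here, so this is only a sketch of how an `apply`-based version might look, under the assumption that the frequencies are first pulled into a numeric matrix named `freq` (a hypothetical name) with the IDs as row names. The first call repeats the column names and so returns character vectors; the second converts the names to numbers before repeating.

```r
# Assumed reconstruction of the sample data from the question.
dat <- data.frame(ID = c("A", "B"),
                  `1` = c(2, 1), `2` = c(3, 4), `3` = c(2, 1),
                  check.names = FALSE)

# Numeric matrix of frequencies, rows labelled by ID.
freq <- as.matrix(dat[-1])
rownames(freq) <- as.character(dat$ID)

# Character output: repeat each column name by its frequency.
as_chr <- apply(freq, 1, function(x) rep(names(x), times = x))

# Numeric output: convert the column names before repeating.
as_num <- apply(freq, 1, function(x) rep(as.numeric(names(x)), times = x))
```

Note that `apply` returns a list here only because the expanded vectors differ in length (7 and 6); if every ID had the same total frequency it would return a matrix instead.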
Edit:
To address OP's new request that the vectors be put in a data.frame and padded with NAs, call this after running either of the options above:
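The padding call itself is not shown, so here is a minimal sketch of one common way to do it, assuming the expanded vectors are held in a named list (called `expanded` here purely for illustration). Assigning a longer length to a vector with `length<-` fills the new positions with `NA`.

```r
# Hypothetical expanded vectors, one per ID.
expanded <- list(A = c(1, 1, 2, 2, 2, 3, 3),
                 B = c(1, 2, 2, 2, 2, 3))

# Length of the longest vector.
n <- max(lengths(expanded))

# Extend every vector to length n (padding with NA), then bind
# the equal-length vectors into a data frame.
padded <- as.data.frame(lapply(expanded, `length<-`, n))
padded
```

Each column of `padded` can then be handed to `interp.median`, dropping the `NA` padding first if needed.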