I would like to manipulate data to do network analysis using ggnet.
The dataset is in CSV format and looks like this:
offers
{9425, 5801, 18451, 17958, 16023, 7166}
{20003, 17737, 4031, 5554}
{19764, 5553, 5554}
What I would like to do is split the array in each row and generate every pair of items within that row. So the final output should look like the result of
print list(itertools.permutations([1,2,3,4], 2)) applied per row, giving:
(9425, 5801)
(9425, 18451)
(9425, 17958)
(9425, 16023)
(9425, 7166)
(5801, 18451)
(5801, 17958)
(5801, 16023)
(5801, 7166)
...
I could use either R or Python to do this.
Any suggestions to solve this problem?
Another R solution, assuming there are more rows in your file.
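# %>% comes from magrittr (dplyr re-exports it as well)
library(magrittr)
# read in csv file as list of integers (each row in csv = 1 list element)
# [-1] drops the "offers" header line and gsub() strips the braces shown in the
# question; remove either step if your file does not have them
offers <- readLines("offers.csv")[-1] %>% gsub("[{}]", "", .) %>% strsplit(",") %>% lapply(as.integer)
# create all pairs for each element in the list (combn yields each unordered pair once)
permutation.list <- lapply(offers, function(x) t(combn(x, m = 2)))
# combine all pairs into 1 data frame
permutation.data.frame <- plyr::ldply(permutation.list, data.frame)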
Below are the results based on the sample data provided:
> permutation.list
[[1]]
[,1] [,2]
[1,] 9425 5801
[2,] 9425 18451
[3,] 9425 17958
[4,] 9425 16023
[5,] 9425 7166
[6,] 5801 18451
[7,] 5801 17958
[8,] 5801 16023
[9,] 5801 7166
[10,] 18451 17958
[11,] 18451 16023
[12,] 18451 7166
[13,] 17958 16023
[14,] 17958 7166
[15,] 16023 7166
[[2]]
[,1] [,2]
[1,] 20003 17737
[2,] 20003 4031
[3,] 20003 5554
[4,] 17737 4031
[5,] 17737 5554
[6,] 4031 5554
[[3]]
[,1] [,2]
[1,] 19764 5553
[2,] 19764 5554
[3,] 5553 5554
> permutation.data.frame
X1 X2
1 9425 5801
2 9425 18451
3 9425 17958
4 9425 16023
5 9425 7166
6 5801 18451
7 5801 17958
8 5801 16023
9 5801 7166
10 18451 17958
11 18451 16023
12 18451 7166
13 17958 16023
14 17958 7166
15 16023 7166
16 20003 17737
17 20003 4031
18 20003 5554
19 17737 4031
20 17737 5554
21 4031 5554
22 19764 5553
23 19764 5554
24 5553 5554
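For comparison, here is a minimal Python sketch of the same per-row steps. It assumes each data row looks like `{9425, 5801, ...}` as in the question, preceded by an `offers` header line. The helper names `parse_row` and `row_pairs` are made up for illustration, and the sample text is inlined rather than read from a file.

```python
import itertools

# Sample text in the shape shown in the question (header + one set per row).
SAMPLE = """offers
{9425, 5801, 18451, 17958, 16023, 7166}
{20003, 17737, 4031, 5554}
{19764, 5553, 5554}
"""

def parse_row(line):
    """Turn '{1, 2, 3}' into [1, 2, 3] (hypothetical helper)."""
    body = line.strip().strip("{}")
    return [int(tok) for tok in body.split(",") if tok.strip()]

def row_pairs(text):
    """Return every unordered pair within each row, in row order."""
    # Skip the header line and ignore blank lines.
    rows = [parse_row(l) for l in text.splitlines()[1:] if l.strip()]
    pairs = []
    for row in rows:
        # combinations keeps the input order and yields each pair once
        pairs.extend(itertools.combinations(row, 2))
    return pairs

pairs = row_pairs(SAMPLE)
for a, b in pairs[:3]:
    print(a, b)
```

To work from a file on disk instead, the file's contents can be passed in, for example `row_pairs(open("offers.csv").read())`.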
You can try this in R:
a <- c(9425, 5801, 18451, 17958, 16023, 7166)
b <- c(20003, 17737, 4031, 5554)
c <- c(19764, 5553, 5554)
# stack the pair matrices of each vector row-wise
rbind(t(combn(a,2)),
t(combn(b,2)),
t(combn(c,2)))
# or, more compactly, in a single call
t(do.call(cbind,mapply(combn,list(a,b,c),2)))
[,1] [,2]
[1,] 9425 5801
[2,] 9425 18451
[3,] 9425 17958
[4,] 9425 16023
[5,] 9425 7166
[6,] 5801 18451
[7,] 5801 17958
[8,] 5801 16023
[9,] 5801 7166
[10,] 18451 17958
[11,] 18451 16023
[12,] 18451 7166
: : :
: : :
You already have the solution for the permutations. To break up the arrays and merge them, open the CSV file, read it line by line, and append each row's items to a list.
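import itertools

# Create an empty list (named rows so it does not shadow the built-in list)
rows = []
with open("offers.csv") as csv_file:
    next(csv_file)  # skip the "offers" header line
    for line in csv_file:
        if line.strip():
            # drop the surrounding braces, then split the row on commas
            rows.append([int(x) for x in line.strip().strip("{}").split(",")])

# Merge the items of all rows into one set of unique values
merged_array = {item for row in rows for item in row}

# Use your permutations code below
print(list(itertools.permutations(merged_array, 2)))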
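One caveat when reusing the permutations call: `itertools.permutations` returns ordered pairs, so both (a, b) and (b, a) appear, whereas the expected output in the question lists each pair only once, which matches `itertools.combinations`. A small sketch of the difference, using the third row from the question as sample data:

```python
import itertools

row = [19764, 5553, 5554]

ordered = list(itertools.permutations(row, 2))
unordered = list(itertools.combinations(row, 2))

print(len(ordered))    # 6: both (a, b) and (b, a) are present
print(len(unordered))  # 3: each pair appears once
print(unordered)
```

Which of the two is appropriate depends on whether the network is meant to be directed or undirected.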