How to create edges for nodes?

Posted 2019-07-23 06:48

Question:

I need to find the degree of every protein in the input file, which looks like this:

A   B
a   b
c   d
a   c
c   b

I have used networkx to create the nodes. How do I create the edges between those nodes from my input file?

Code:

import pandas as pd
df = pd.read_csv('protein.txt',sep='\t', index_col =0)
df = df.reset_index()
df.columns = ['a', 'b']

distinct = pd.concat([df['a'], df['b']]).unique()

import networkx as nx
G=nx.Graph()

nodes= []
for i in distinct:
    node=G.add_node(1)
    nodes.append(node)

Answer 1:

From the networkx documentation: either call add_edge inside the loop, or collect the edges first and then pass them to add_edges_from:

>>> G = nx.Graph()   # or DiGraph, MultiGraph, MultiDiGraph, etc
>>> e = (1,2)
>>> G.add_edge(1, 2)           # explicit two-node form
>>> G.add_edge(*e)             # single edge as tuple of two nodes
>>> G.add_edges_from( [(1,2)] ) # add edges from iterable container

Then G.degree() gives you the degree of each node.
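Applied to the question's data, this could look like the minimal sketch below. It assumes the file has already been loaded into a DataFrame with columns A and B; the in-memory DataFrame here is only a stand-in for that file. Adding an edge also creates its endpoint nodes, so a separate add_node loop is not needed.

```python
import networkx as nx
import pandas as pd

# Stand-in for the question's input file (assumed columns: A and B).
df = pd.DataFrame({"A": ["a", "c", "a", "c"],
                   "B": ["b", "d", "c", "b"]})

G = nx.Graph()

# Each row becomes one undirected edge; missing nodes are added automatically.
G.add_edges_from(zip(df["A"], df["B"]))

# G.degree() is a view of (node, degree) pairs; dict() makes it easy to look up.
degrees = dict(G.degree())
print(degrees)
```

For this sample, protein c ends up with degree 3 because it appears in three of the four rows.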



Answer 2:

First, read_csv was used incorrectly to read the input file. The columns are separated by spaces, not tabs, so sep should be '\s+' instead of '\t'. Also, the input file has no index column, so index_col should not be set to 0.

Once the input file has been read correctly into a DataFrame, we can convert it to a networkx graph with the function from_pandas_edgelist, passing the names of the source and target columns.

import networkx as nx
import pandas as pd

df = pd.read_csv('protein.txt', sep=r'\s+')  # raw string avoids an invalid-escape warning
g = nx.from_pandas_edgelist(df, 'A', 'B')
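To get from the graph to the degree of every protein, which is what the question asks for, a sketch along these lines may help. It reads the sample from an in-memory string instead of protein.txt so that it runs on its own; the variable names sample and degree_by_protein are illustrative and not part of the original answer.

```python
import io

import networkx as nx
import pandas as pd

# In-memory stand-in for 'protein.txt' (whitespace-separated, header row "A B").
sample = io.StringIO("A   B\na   b\nc   d\na   c\nc   b\n")

# sep=r'\s+' splits on any run of whitespace; no index column is requested.
df = pd.read_csv(sample, sep=r"\s+")

# Build an undirected graph with one edge per row.
g = nx.from_pandas_edgelist(df, "A", "B")

# Map each protein to its degree and print the result in name order.
degree_by_protein = dict(g.degree())
for protein, degree in sorted(degree_by_protein.items()):
    print(protein, degree)
```

With the real file, replacing sample by the path 'protein.txt' should be the only change needed, assuming the file has the same layout as shown in the question.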