How to write Hierarchical query in PYTHON

2020-08-09 04:07发布

The given input is like:

EMPLOYEE_ID NAME     MANAGER_ID
   101         A       10
   102         B       11
   10          C       1
   11          D       1
   1           E       null


Employee     Cycle      LEVEL Path
   101         A         101/10/1
   102         B         102/11/1
   10          C         10/1
   11          D         11/1
   1           E         1

It will be great if one can solve it using python "pandas" library. I am not sure if it can be achieved using pandas or not. Other solutions are also welcomed.

标签: python pandas
3条回答
Juvenile、少年°
2楼-- · 2020-08-09 04:58

dictionary with EMPLOYEE_ID and MANAGER_ID:

dct = dict(zip(df.EMPLOYEE_ID.values, df.MANAGER_ID.values))

function to create hierarchy string

def heirarchy(id):
  boss = str(id) + '/'
  while dct[id] != 'null':
    boss += dct[id] + '/'
    id = int(dct[id])
  return boss[:-1]

apply

df['LEVEL'] = df.EMPLOYEE_ID.apply(heirarchy)

# Result

   EMPLOYEE_ID NAME MANAGER_ID     LEVEL
0          101    A         10  101/10/1
1          102    B         11  102/11/1
2           10    C          1      10/1
3           11    D          1      11/1
4            1    E       null         1
查看更多
ら.Afraid
3楼-- · 2020-08-09 04:59

You can create a dictionary mapping children to parents.

Then use pd.Series.apply to construct your path string via a while loop.

Note I assume null actually means NaN, which makes more sense for a numeric column.

child_parent_dict = df.set_index('EMPLOYEE_ID')['MANAGER_ID'].to_dict()

def get_all_parents(child):
    """Get all parents from hierarchy structure"""
    while child == child:
        child = child_parent_dict[child]
        if child == child:
            yield int(child)

def get_path(x):
    """Calculate path and construct string"""
    return '/'.join(list(map(str, [x]+list(get_all_parents(x)))))

df['Path'] = df['EMPLOYEE_ID'].apply(get_path)

print(df)

#    EMPLOYEE_ID NAME  MANAGER_ID      Path
# 0          101    A          10  101/10/1
# 1          102    B          11  102/11/1
# 2           10    C           1      10/1
# 3           11    D           1      11/1
# 4            1    E         NaN         1
查看更多
我只想做你的唯一
4楼-- · 2020-08-09 05:05

I find the method/approach used by user3483203 pretty neat and to the point; the code is simple to follow. The only thing that I'd add is instead of the function returning a '/' delimited string, I'd output a native python structure like a list. Something like this:

def get_managerial_hierarchy(employee_id, manager_list=None):
if manager_list == None:
    manager_list = []
manager_list.append(employee_id)
employee_id = parent_child[employee_id]
if employee_id != '':
    get_managerial_hierarchy(employee_id, manager_list)
return manager_list

The output will look like this:

['101', '10', '1']
['102', '11', '1']
['10', '1']
['11', '1']
['1']

You can always wrap/change the list, if necessary, to comply with your desired output. However, now you can now just as easily quantify the distance/level of managerial hierarchy (how many layers apart is the CEO from the lowest staff) by using the len() function. By the way, I used a recursive approach. To scale it out, I'd stick to an iterative solution

查看更多
登录 后发表回答