Python pandas, build a dataframe from 2 dataframes

Posted 2019-08-20 07:38

Question:

This question already has an answer here:

  • Pandas left outer join multiple dataframes on multiple columns 2 answers

I need help solving this issue.

I need to build a single dataframe from two dataframes, adding information from the second dataframe only for the rows where it is available.

EXAMPLE:

DF1:

    MATERIAL N° Description DATE DUE
0   123123300   Lightbulb X 01/05/2018
1   220466      Lightbulb Y 04/04/2018
2   220000      Lightbulb Z 07/07/2018
3   1241241     Lightbulb A 02/01/2019
4   7775447     Lightbulb B 02/01/2019

DF2:

    BG GROUP    MATERIAL N° TRANSIT TIME
0   9001        123123300   45D
1   9002        220466      30D
2   9004        220000      30D
3   9003        44124       20D
4   9000        2512222     15D
5   9002        1241241     40D

EXPECTED RESULT DF3:

    MATERIAL N° Description     DATE DUE    BG GROUP TRANSIT TIME
0   123123300   Lightbulb X     01/05/2018  9001.0  45D
1   220466      Lightbulb Y     04/04/2018  9002.0  30D
2   220000      Lightbulb Z     07/07/2018  9004.0  30D
3   1241241     Lightbulb A     02/01/2019  9002.0  40D
4   7775447     Lightbulb B     02/01/2019  NaN     NaN

I hope this example is clear enough. I need to build one big dataframe (DF3) that adds the DF2 information in adjacent columns, but only for the rows where DF2 has matching information; otherwise those positions should be left empty.

THANKS! JL

Answer 1:

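You need merge with the parameter how='left'. A left merge keeps every row of df1 and fills the df2 columns with NaN where there is no match. When no key is given, merge joins on the columns the two dataframes have in common (here MATERIAL N°):

df1.merge(df2, how='left')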

    MATERIAL N° Description DATE DUE    BG GROUP    TRANSIT TIME
0   123123300   Lightbulb X 01/05/2018  9001.0      45D
1   220466      Lightbulb Y 04/04/2018  9002.0      30D
2   220000      Lightbulb Z 07/07/2018  9004.0      30D
3   1241241     Lightbulb A 02/01/2019  9002.0      40D
4   7775447     Lightbulb B 02/01/2019  NaN         NaN
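For anyone who wants to reproduce this, here is a minimal self-contained sketch of the same left merge. It rebuilds the two example dataframes by hand and makes a few assumptions that are not stated in the question: the key column is named MATERIAL N° in both frames, the material numbers are integers, and the dates are left as plain strings. Passing on= explicitly and adding validate= are optional choices for safety, not something the original answer requires.

```python
import pandas as pd

# Example data retyped from the question (dtypes are an assumption).
df1 = pd.DataFrame({
    "MATERIAL N°": [123123300, 220466, 220000, 1241241, 7775447],
    "Description": ["Lightbulb X", "Lightbulb Y", "Lightbulb Z",
                    "Lightbulb A", "Lightbulb B"],
    "DATE DUE": ["01/05/2018", "04/04/2018", "07/07/2018",
                 "02/01/2019", "02/01/2019"],
})
df2 = pd.DataFrame({
    "BG GROUP": [9001, 9002, 9004, 9003, 9000, 9002],
    "MATERIAL N°": [123123300, 220466, 220000, 44124, 2512222, 1241241],
    "TRANSIT TIME": ["45D", "30D", "30D", "20D", "15D", "40D"],
})

# Left merge: keep every row of df1, attach df2 columns where the key matches.
# on= names the key explicitly instead of relying on shared column names.
# validate= raises an error if a material number appears more than once in df2,
# which would otherwise silently duplicate rows of df1.
df3 = df1.merge(df2, on="MATERIAL N°", how="left", validate="many_to_one")

print(df3)
```

BG GROUP comes back as a float column (9001.0 and so on) because the unmatched row holds NaN, which an ordinary integer column cannot store. Rows of df2 with no counterpart in df1 (materials 44124 and 2512222) are dropped by the left merge.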