python - Resolve Pandas data frame merge conflicts with a function?
Assume that I have two data frames that I would like to merge, but there is a conflict because their rows and columns overlap. Instead of duplicating rows, I would like to pass a function to resolve the conflict. Can this be done?
    import numpy as np
    import pandas as pd

    dates1 = pd.date_range("2014-01-01", periods=4)
    dates2 = pd.date_range("2014-01-03", periods=4)
    cols1 = list("ABCD")
    cols2 = list("CDEF")
    df1 = pd.DataFrame(np.ones([4, 4], dtype="bool"), index=dates1, columns=cols1)
    df2 = pd.DataFrame(np.zeros([4, 4], dtype="bool"), index=dates2, columns=cols2)

    In [317]: df1
    Out[317]:
                   A     B     C     D
    2014-01-01  True  True  True  True
    2014-01-02  True  True  True  True
    2014-01-03  True  True  True  True
    2014-01-04  True  True  True  True

    In [318]: df2
    Out[318]:
                    C      D      E      F
    2014-01-03  False  False  False  False
    2014-01-04  False  False  False  False
    2014-01-05  False  False  False  False
    2014-01-06  False  False  False  False
As you can see, the two data frames overlap in columns C and D, and in the rows 2014-01-03 and 2014-01-04. When I concatenate them, I get repeated rows because of this conflict:

    In [321]: pd.concat([df1, df2])
    Out[321]:
                   A     B      C      D      E      F
    2014-01-01  True  True   True   True    NaN    NaN
    2014-01-02  True  True   True   True    NaN    NaN
    2014-01-03  True  True   True   True    NaN    NaN
    2014-01-04  True  True   True   True    NaN    NaN
    2014-01-03   NaN   NaN  False  False  False  False
    2014-01-04   NaN   NaN  False  False  False  False
    2014-01-05   NaN   NaN  False  False  False  False
    2014-01-06   NaN   NaN  False  False  False  False
What I actually want is for True values to override False (or NaN) values, which I could do, for example, by passing an "or" function to resolve such duplicate conflicts. Can this be done in pandas?
The result should look like this:
                   A     B      C      D      E      F
    2014-01-01  True  True   True   True    NaN    NaN
    2014-01-02  True  True   True   True    NaN    NaN
    2014-01-03  True  True   True   True  False  False
    2014-01-04  True  True   True   True  False  False
    2014-01-05   NaN   NaN  False  False  False  False
    2014-01-06   NaN   NaN  False  False  False  False
That is, where there is no duplication, the value from whichever data frame has one is returned; where there is no data in either frame, NaN is returned; but where there is data in both frames, True overrides False (that is, a logical "or").
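The rule described above can be sketched directly, without relying on any particular merge behaviour. This is only a minimal sketch: `merge_with_or` and its inner helper `expand` are hypothetical names, not pandas functions, and the sketch assumes both inputs are plain boolean frames with unique index labels. It reindexes both frames onto the union of their rows and columns, ORs the values, and puts NaN back wherever neither frame had data.

```python
import numpy as np
import pandas as pd


def merge_with_or(left, right):
    """Hypothetical helper: union two boolean frames, OR-ing overlapping cells."""
    index = left.index.union(right.index)
    columns = left.columns.union(right.columns)

    def expand(frame):
        # Missing cells become False so the frame keeps its bool dtype.
        return frame.reindex(index=index, columns=columns, fill_value=False)

    # Logical "or" of the values on the shared grid.
    values = expand(left) | expand(right)
    # True wherever at least one frame actually supplied a value.
    present = expand(left.notna()) | expand(right.notna())

    # Keep the OR-ed value where data exists, NaN everywhere else.
    data = np.where(present.to_numpy(), values.to_numpy().astype(object), np.nan)
    return pd.DataFrame(data, index=index, columns=columns)


dates1 = pd.date_range("2014-01-01", periods=4)
dates2 = pd.date_range("2014-01-03", periods=4)
df1 = pd.DataFrame(np.ones([4, 4], dtype="bool"), index=dates1, columns=list("ABCD"))
df2 = pd.DataFrame(np.zeros([4, 4], dtype="bool"), index=dates2, columns=list("CDEF"))

result = merge_with_or(df1, df2)
print(result)
```

Because the values are OR-ed rather than taken from one side, the argument order does not matter: `merge_with_or(df2, df1)` should give the same cells.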
Instead of using concat, use merge:
    >>> pd.merge(df1, df2, on=(df1.columns & df2.columns).tolist(), how='outer', left_index=True, right_index=True)
                   A     B      C      D      E      F
    2014-01-01  True  True   True   True    NaN    NaN
    2014-01-02  True  True   True   True    NaN    NaN
    2014-01-03  True  True   True   True  False  False
    2014-01-04  True  True   True   True  False  False
    2014-01-05   NaN   NaN  False  False  False  False
    2014-01-06   NaN   NaN  False  False  False  False
on=(df1.columns & df2.columns).tolist() gives you a list of the overlapping columns (in this case, ['C', 'D']).
how='outer' takes the union of keys from both frames (SQL: full outer join).
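Depending on the pandas version, the merge call above may not run as written: later releases, as far as I am aware, reject `on` combined with `left_index`/`right_index`, and `&` on an Index no longer means set intersection. As a hedged alternative, and not the method given in the answer, the sketch below uses `Index.intersection` to list the overlapping columns and `DataFrame.combine_first` to combine the frames. Note that `combine_first` is a swapped-in technique with "left frame wins" semantics rather than a true "or"; it matches the desired output here only because df1 holds the True values.

```python
import numpy as np
import pandas as pd

dates1 = pd.date_range("2014-01-01", periods=4)
dates2 = pd.date_range("2014-01-03", periods=4)
df1 = pd.DataFrame(np.ones([4, 4], dtype="bool"), index=dates1, columns=list("ABCD"))
df2 = pd.DataFrame(np.zeros([4, 4], dtype="bool"), index=dates2, columns=list("CDEF"))

# Explicit set intersection of the column labels.
overlap = df1.columns.intersection(df2.columns).tolist()
print(overlap)

# Keep df1's value wherever df1 has one; otherwise fall back to df2.
# Rows and columns are the union of both frames, with NaN where neither has data.
combined = df1.combine_first(df2)
print(combined)
```

If df2 could hold True where df1 holds False, `combine_first` would keep df1's False, so a genuine "or" needs the values combined explicitly.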