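I have two Pandas DataFrames. The first has three columns of interest that contain the customer IDs for different products. The second contains the customer names. I want to extend the first DataFrame with new, separate columns holding the customer names. In other words, I want to map the main DataFrame against the second one and add new columns such as customer_1_name, customer_2_name, customer_3_name.

Keep in mind that a single customer can have a set of several different customer IDs.

To illustrate my problem, I have attached snippets of the two DataFrames.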

import numpy as np
import pandas as pd

# DataFrame 1
dFrame_main = pd.DataFrame({'Pageviews': [22.0,
  22.0,
  21.0,
  20.0,
  18.0,
  15.0,
  14.0,
  14.0,
  13.0,
  13.0,
  12.0,
  12.0,
  12.0,
  12.0,
  12.0,
  12.0,
  12.0,
  12.0,
  11.0,
  11.0],
 'Unique Pageviews': [8.0,
  8.0,
  16.0,
  14.0,
  14.0,
  12.0,
  13.0,
  12.0,
  13.0,
  13.0,
  8.0,
  8.0,
  5.0,
  11.0,
  12.0,
  7.0,
  9.0,
  9.0,
  5.0,
  5.0],
 'Avg. Time on Page': ['0:00:23',
  '0:00:23',
  '0:03:49',
  '0:00:31',
  '0:00:21',
  '0:00:27',
  '0:00:38',
  '0:00:15',
  '0:01:24',
  '0:00:20',
  '0:00:13',
  '0:00:13',
  '0:02:14',
  '0:00:33',
  '0:00:46',
  '0:00:14',
  '0:01:08',
  '0:01:08',
  '0:01:51',
  '0:01:51'],
 'CustomerID_1': ['465',
  '465',
  '162',
  '124',
  '920',
  '920',
  '920',
  '920',
  '920',
  '920',
  '165',
  '165',
  '166',
  '920',
  '920',
  '920',
  '162',
  '162',
  '1846',
  '118'],
 'CustomerID_2': ['702',
  '702',
  '446',
  '125',
  '470',
  '470',
  '470',
  '470',
  '470',
  '212',
  '1920',
  '1920',
  '868',
  '470',
  '470',
  '470',
  '873',
  '873',
  '862',
  '862'],
 'CustomerID_3': ['167',
  '167',
  '570',
  np.nan,
  '212',
  '212',
  '212',
  '212',
  '212',
  np.nan,
  '1670',
  '1670',
  '274',
  '212',
  '212',
  '212',
  '764',
  '764',
  '584',
  '584']})
# DataFrame 2
dFrame = pd.DataFrame({'CustomerID': [569,
  923,
  162,
  1798,
  920,
  470,
  1943,
  1798,
  162,
  124,
  1053,
  212,
  923,
  1747,
  1921,
  166,
  165,
  465,
  862,
  584],
 'CustomerNames': ['Thomas Bills',
  'Demi Boras',
  'Jerry wills',
  'Pills Wilson',
  'Jerry wills',
  'Harsh wilson',
  'Alli Pees',
  'Pills Wilson',
  'Jerry wills',
  'Pills Wilson',
  'Fedolls Feba',
  'Pills Wilson',
  'Demi Boras',
  'Harsh wilson',
  'Matt Lills',
  'Pills Wilson',
  'Twist Tells',
  'Jerry wills',
  'Matt Lills',
  'Balls tails']})

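Please note: this is only a snippet of a much larger DataFrame, meant to give you an idea of the problem I am trying to solve.

I tried the approach from "How to map one dataframe to another (python pandas)?", but unfortunately it did not help in my case.

Thanks for your time.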

1 Answer

# Build a mapper series (index = CustomerID, values = CustomerNames).
# For this to work we need to ensure that
# - the datatypes match (the IDs in dFrame_main are strings, so cast to str)
# - no CustomerID is repeated in the index

mapper = dFrame.astype(str).drop_duplicates().set_index('CustomerID')['CustomerNames']

# Apply the mapping (the value of the column will be looked up in the index of the `mapper` series)
dFrame_main['CustomerID_1_name'] = dFrame_main['CustomerID_1'].map(mapper)
dFrame_main['CustomerID_2_name'] = dFrame_main['CustomerID_2'].map(mapper)
dFrame_main['CustomerID_3_name'] = dFrame_main['CustomerID_3'].map(mapper)
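
The same lookup can be sketched on a tiny, made-up pair of frames, which makes the result easy to check by hand. The frames `names` and `main` below are hypothetical stand-ins for `dFrame` and `dFrame_main`, and the ID `'999'` is invented to show what happens when an ID has no matching name. The loop over the ID columns is only a convenience, assuming every ID column starts with `CustomerID_`.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature versions of the two frames in the question.
names = pd.DataFrame({
    "CustomerID": [162, 920, 470, 162],  # integer IDs, one row repeated
    "CustomerNames": ["Jerry wills", "Jerry wills", "Harsh wilson", "Jerry wills"],
})
main = pd.DataFrame({
    "CustomerID_1": ["162", "920"],   # string IDs
    "CustomerID_2": ["470", "999"],   # "999" has no entry in `names`
    "CustomerID_3": ["920", np.nan],  # missing ID
})

# Cast to str so the key dtype matches, drop repeated rows, index by ID.
mapper = (
    names.astype(str)
    .drop_duplicates()
    .set_index("CustomerID")["CustomerNames"]
)

# If one ID were listed with two different names, the index would still
# contain duplicates and the lookup would be ambiguous, so check up front.
assert mapper.index.is_unique

# Collect the ID columns first, then add one name column per ID column.
id_columns = [c for c in main.columns if c.startswith("CustomerID_")]
for col in id_columns:
    main[f"{col}_name"] = main[col].map(mapper)

print(main)
```

IDs that are missing or not present in the mapper come back as NaN in the new name columns, so unmatched rows can be found afterwards with `isna()`.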
answered 2020-03-22T00:33:28.977