
I have text made up of lines like the following:

{whatever}:::duplicateString:::{whatever}
{whatever}:::duplicateString:::{whatever}
....
{whatever}:::duplicateString:::{whatever}
{whatever}:::duplicateString:::{whatever}

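How can I remove the duplicate string from the text? The main idea is to remove the second word from a line if that word occurs more than once.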

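My first idea was to read the lines one at a time, split each on ":::" to get an array, and iterate over the array while adding entries to a TreeSet. Fine. But how do I glue the line back together afterwards?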

I can't recall any ready-made mechanism for this kind of task. The language doesn't matter; an approximate solution is enough.

Sample text:

Appliances:::Main
Appliances:::Main:::Appliance Warranties
Appliances:::Main:::Beer Keg Refrigerators
Appliances:::Main:::Beverage Refrigerators
Appliances:::Main:::Ceiling Fans & Accessories
Appliances:::Main:::Ceiling Fans & Accessories:::Accessories
Appliances:::Main:::Ceiling Fans & Accessories:::Accessories:::Downrod Couplers
Appliances:::Main:::Ceiling Fans & Accessories:::Accessories:::Downrods
Appliances:::Main:::Ceiling Fans & Accessories:::Accessories:::Fan Replacement Blades

Ideally, the result should look like this:

Appliances:::Main
Appliances:::Appliance Warranties
Appliances:::Beer Keg Refrigerators
Appliances:::Beverage Refrigerators
Appliances:::Ceiling Fans & Accessories
Appliances:::Ceiling Fans & Accessories:::Accessories
Appliances:::Ceiling Fans & Accessories:::Accessories:::Downrod Couplers
Appliances:::Ceiling Fans & Accessories:::Accessories:::Downrods
Appliances:::Ceiling Fans & Accessories:::Accessories:::Fan Replacement Blades

1 Answer


If duplicateString can only appear as the second word, you could do this (in Python):

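lastWord = None
with open('file.txt') as f:
  for line in f:
    # Strip the newline first; otherwise "Main\n" on a two-field line never equals "Main"
    w = line.rstrip('\n').split(':::')
    # Lines with a single field have no second word to compare
    thisWord = w[1] if len(w) > 1 else None
    # Drop the second word when it repeats the previous line's second word
    if thisWord is not None and lastWord == thisWord:
      del w[1]
    lastWord = thisWord
    print(':::'.join(w))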
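
The loop above only compares each line with the one directly before it. If the repeated word might also show up on non-adjacent lines, a set of already-seen second words (close to the TreeSet idea in the question) could be used instead. The following is only a sketch under that assumption: the function name `drop_repeated_second_field` and the `sep` parameter are made up for illustration, and the first occurrence is kept as-is so that "Appliances:::Main" survives unchanged. Rejoining is just `sep.join(fields)`.

```python
def drop_repeated_second_field(lines, sep=":::"):
    """Return the lines with the second field removed once it has been seen before."""
    seen = set()      # second fields encountered so far (assumed unique across top-level words)
    result = []
    for line in lines:
        fields = line.rstrip("\n").split(sep)
        if len(fields) > 1:
            if fields[1] in seen:
                del fields[1]          # repeated: drop it
            else:
                seen.add(fields[1])    # first occurrence: keep the line unchanged
        result.append(sep.join(fields))  # glue the line back together
    return result


sample = [
    "Appliances:::Main",
    "Appliances:::Main:::Appliance Warranties",
    "Appliances:::Main:::Ceiling Fans & Accessories:::Accessories",
]
for line in drop_repeated_second_field(sample):
    print(line)
```

If the same second word can legitimately appear under different first words, tracking the pair `(fields[0], fields[1])` in the set instead of `fields[1]` alone would probably be safer.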
answered 2012-12-13T16:00:11.790