python - How to find the character position of something specified with a <tag "XXXX">...</>? Python

Question

I am trying to get the positions of < and > when they are embedded in something like this: <tag "510270">calculate</>

I have sentences like this:

sentence1 = (
    "After six weeks and seventeen tentative approaches the only serious "
    "tender came from Daniel. He had offered a paltry #2 a week for the one-time "
    "woodman's home, sane enough in this, at least, to <tag \"510270\">calculate</> "
    "safety to the nearest new penny piece. "
)

sentence2 = (
    "After six weeks and seventeen tentative approaches the only serious "
    "tender came from Daniel. He had offered a paltry #2 a week for the one-time "
    "woodman's < home, sane enough in this, at least, to <tag \"510270\">calculate</> "
    "safety to the nearest new penny > piece. "
)

sentence3 = (
    "After six weeks and seventeen tentative approaches the only serious "
    "tender came from Daniel. He had offered a paltry #2 a week for the one-time "
    "woodman's > home, sane enough in this, at least, to <tag \"510270\">calculate</> "
    "safety to the nearest new penny < piece. "
)

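I need cfrom and incfrom to be the positions of the first and second < of the <tag "XXXX">...</>, and I need cto and incto to be the positions of the second and first > of the <tag "XXXX">...</>.

How can I do this for sentences like sentence2 and sentence3, where there are stray < and > characters outside the <tag "XXXX">...</>?

For sentence1, I can simply do this: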

cfrom, cto = 0, 0
# first "<" in the sentence
for i, c in enumerate(sentence1):
    if c == "<":
        cfrom = i
        break

# last ">" in the sentence, found by scanning the reversed string
for i, c in enumerate(reversed(sentence1)):
    if c == ">":
        cto = len(sentence1) - 1 - i
        break

incfrom, incto = 0, 0
# first ">" at or after cfrom
for i, c in enumerate(sentence1[cfrom:]):
    if c == ">":
        incfrom = cfrom + i
        break

# next "<" between incfrom and cto
for i, c in enumerate(sentence1[incfrom:cto]):
    if c == "<":
        incto = incfrom + i
        break
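To make the problem with sentence2 and sentence3 concrete, here is a minimal sketch on a shortened, made-up sentence. The name `short` and its text are illustrative only, and `str.find` / `str.rfind` stand in for the first two loops on the assumption that they locate the same first `<` and last `>`.

```python
# Hypothetical shortened input: a stray "<" before the tag and a stray ">" after it.
short = 'home < sane <tag "510270">calculate</> penny > piece'

first_lt = short.find("<")     # first "<" anywhere in the string
last_gt = short.rfind(">")     # last ">" anywhere in the string
tag_start = short.find("<tag") # where the tag really begins

# Both naive searches land on the stray characters, not on the tag.
print(first_lt, tag_start)
print(short[first_lt:last_gt + 1])
```

The printed slice covers much more than the tag, which is why the search has to look for `<tag` and `</` rather than for bare `<` and `>`.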

score 1 · Accepted Answer

If you keep track of where you are as you find the tags, like this:

def parseSentence(sentence):
    cfrom, cto, incfrom, incto = 0, 0, 0, 0
    place = ''  # to keep track of where we are

    for i, c in enumerate(sentence):
        if c == '<':
            # check for 'cfrom'
            if sentence[i : i + 4] == '<tag':
                cfrom = i
                place = 'botag'  # begin-open-tag
            # check for 'incfrom' (slicing avoids an IndexError on a trailing '<')
            elif sentence[i + 1 : i + 2] == '/' and place == 'intag':
                incfrom = i
                place = 'bctag'  # begin-close-tag
        elif c == '>':
            # check for 'cto'
            if place == 'botag':  # just after '<tag...'
                cto = i
                place = 'intag'  # now within the XML tag
            # check for 'incto'
            elif place == 'bctag':
                incto = i
                place = ''
                yield (cfrom, cto, incfrom, incto)

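This should work for all of your sentences~~, but note that it only really works properly if there is a single <tag>...</> in the sentence. If there is more than one, it returns the positions of the last <tag>...</>.~~

Edit: if you add a yield to the function, it will iterate over the positions of every <tag>...</> in the sentence when there is more than one (see above).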
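A regular expression is another way to get the same four positions for every tag in a sentence. The following is only a sketch of that alternative, not the method used in this answer. It assumes every tag has the shape `<tag "ID">content</>`, with no `>` inside the opening tag and no `</>` inside the content. The names `TAG_RE` and `find_tag_positions` are hypothetical, and the four values carry the same meaning as in the function above.

```python
import re

# Assumed tag shape: <tag "ID">content</>
TAG_RE = re.compile(r'(<tag "[^"]*">)(.*?)(</>)', re.DOTALL)

def find_tag_positions(sentence):
    """Yield (cfrom, cto, incfrom, incto) for each tag in the sentence."""
    for m in TAG_RE.finditer(sentence):
        cfrom = m.start(1)     # "<" of the opening tag
        cto = m.end(1) - 1     # ">" of the opening tag
        incfrom = m.start(3)   # "<" of the closing "</>"
        incto = m.end(3) - 1   # ">" of the closing "</>"
        yield cfrom, cto, incfrom, incto

# Made-up input with stray "<" and ">" around the tag.
s = 'a < b <tag "1">xy</> c > d'
for cfrom, cto, incfrom, incto in find_tag_positions(s):
    print(cfrom, cto, incfrom, incto, s[cto + 1:incfrom])
```

Because the pattern only matches a complete tag, stray `<` and `>` characters elsewhere in the sentence are skipped without any explicit state tracking.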

score 0 · Accepted Answer

If I understand correctly, this should work (assuming you don't change the variables i, c)

cfrom, cto = 0, 0
for i in range(len(sentence1)):
    if sentence1.startswith("<tag", i):
        cfrom = i
        break

for i in range(cfrom, len(sentence1)):  # going forward from cfrom
    if sentence1[i] == ">":
        cto = i
        break

incfrom, incto = 0, 0
# after the tag is opened, look for the start of the closing tag
for i in range(cto, len(sentence1)):
    if sentence1.startswith("</", i):
        incfrom = i
        break
# then look for the ">" that ends the closing tag
for i in range(incfrom, len(sentence1)):
    if sentence1[i] == ">":
        incto = i
        break
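The same forward search can also be written with `str.find`, which takes a start offset as its second argument. This is a minimal sketch, assuming a single tag per sentence. The function name `find_single_tag` is hypothetical, and the function returns `None` when any of the four markers is missing.

```python
def find_single_tag(sentence):
    """Return (cfrom, cto, incfrom, incto) for the first tag, or None."""
    cfrom = sentence.find("<tag")                               # start of the opening tag
    cto = sentence.find(">", cfrom) if cfrom != -1 else -1      # end of the opening tag
    incfrom = sentence.find("</", cto) if cto != -1 else -1     # start of the closing tag
    incto = sentence.find(">", incfrom) if incfrom != -1 else -1  # end of the closing tag
    if incto == -1:
        return None
    return cfrom, cto, incfrom, incto

# Made-up input with a stray ">" before the tag and a stray "<" after it.
print(find_single_tag('a > b <tag "1">xy</> c < d'))
```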
