0

I have a text file that looks like this:

node13 
    state = free 
np = 8 
properties = beta,eightcores 
ntype = cluster 
status = opsys=linux,uname=Linux node13 2.6.27.19-5-default #1 SMP 2009-02-28 04:40:21 +0100 x86_64,sessions=? 15201,nsessions=? 01,nusers=0,idletime=6837317,totmem=20506268kb,availmem=20259728kb,physmem=20506268kb,ncpus=8,loadave=0.00,gres=,netload=17130666575,se=free,jobs=,varattr=,rectime=1333639375 

node14 
    state = job-exclusive 
np = 8 
properties = beta,eightcores 
ntype = cluster

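I only want to get a node's name if that node is free. For that I need a regular expression that matches node(..) only when the following line contains state = free. Can you help me with this?

Edit

Nothing has worked so far. That may be because I am not reading from a file, but from a process: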

proc = subprocess.Popen("pbsnodes", stdout=subprocess.PIPE)
listOfFreeNodes = proc.stdout.read()

Could that somehow break the solutions? Here is the complete pbsnodes output:

node01                                                   
     state = free                                        
     np = 8                                              
     properties = alpha,eightcores                       
     ntype = cluster                                     
     status = opsys=linux,uname=Linux node01 2.6.27.19-5-01,nusers=0,idletime=861913,totmem=16432576kb,availmem=16=free,jobs=,varattr=,rectime=1333641123                  

node02                                                   
     state = free                                        
     np = 8                                              
     properties = alpha,eightcores                       
     ntype = cluster                                     
     status = opsys=linux,uname=Linux node02 2.6.27.19-5-nusers=2,idletime=5357510,totmem=16432576kb,availmem=1617ree,jobs=,varattr=,rectime=1333641107                    

node03                                                   
     state = free                                        
     np = 8                                              
     properties = alpha,eightcores                       
     ntype = cluster                                     
     status = opsys=linux,uname=Linux node03 2.6.27.19-5-s=1,idletime=8564681,totmem=16432576kb,availmem=16029924kobs=60966.hpchead.linux,varattr=,rectime=1333641119      

node04                                                   
     state = free                                        
     np = 8                                              
     properties = alpha,eightcores                       
     ntype = cluster                                     
     status = opsys=linux,uname=Linux node04 2.6.27.19-5-01,nusers=0,idletime=8564678,totmem=16432576kb,availmem=1e=free,jobs=,varattr=,rectime=1333641124                 

node05                                                   
     state = free                                        
     np = 8                                              
     properties = alpha,eightcores                       
     ntype = cluster                                     
     status = opsys=linux,uname=Linux node05 2.6.27.19-5-01,nusers=0,idletime=2072593,totmem=16432652kb,availmem=1=free,jobs=,varattr=,rectime=1333641091                  

node06                                                   
     state = free                                        
     np = 8                                              
     properties = alpha,eightcores                       
     ntype = cluster                                     
     status = opsys=linux,uname=Linux node06 2.6.27.19-5-s=1,idletime=9038,totmem=16432576kb,availmem=16200960kb,p,varattr=,rectime=1333641096                             

node07                                                   
     state = free                                        
     np = 8                                              
     properties = alpha,eightcores                       
     ntype = cluster                                     
     status = opsys=linux,uname=Linux node07 2.6.27.19-5-s=1,idletime=8564671,totmem=16432576kb,availmem=16173848kobs=,varattr=,rectime=1333641134                         

node08                                                   
     state = free                                        
     np = 8                                              
     properties = alpha,eightcores                       
     ntype = cluster                                     
     status = opsys=linux,uname=Linux node08 2.6.27.19-5- 21356,nsessions=5,nusers=1,idletime=8564604,totmem=1643219260329746,state=free,jobs=,varattr=,rectime=1333641095 

node09                                                   
     state = free                                        
     np = 8                                              
     properties = alpha,eightcores                       
     ntype = cluster                                     
     status = opsys=linux,uname=Linux node09 2.6.27.19-5-01,nusers=0,idletime=8564648,totmem=16432552kb,availmem=1e=free,jobs=,varattr=,rectime=1333641126                 

node10                                                   
     state = free                                        
     np = 8                                              
     properties = alpha,eightcores                       
     ntype = cluster                                     
     status = opsys=linux,uname=Linux node10 2.6.27.19-5-2,nsessions=5,nusers=1,idletime=6821493,totmem=16432552kb036941,state=free,jobs=,varattr=,rectime=1333641133      

node11                                                   
     state = free                                        
     np = 8                                              
     properties = alpha,eightcores                       
     ntype = cluster                                     
     status = opsys=linux,uname=Linux node11 2.6.27.19-5-01,nusers=0,idletime=8564599,totmem=16432556kb,availmem=1e=free,jobs=,varattr=,rectime=1333641120                 

node12                                                   
     state = free                                        
     np = 8                                              
     properties = alpha,eightcores                       
     ntype = cluster                                     
     status = opsys=linux,uname=Linux node12 2.6.27.19-5-01,nusers=0,idletime=8564627,totmem=16432556kb,availmem=1e=free,jobs=,varattr=,rectime=1333641121                 

node13                                                   
     state = free                                        
     np = 8                                              
     properties = beta,eightcores                        
     ntype = cluster                                     
     status = opsys=linux,uname=Linux node13 2.6.27.19-5-01,nusers=0,idletime=6839072,totmem=20506268kb,availmem=2e=free,jobs=,varattr=,rectime=1333641130                 

node14                                                   
     state = job-exclusive                               
     np = 8                                              
     properties = beta,eightcores                        
     ntype = cluster                                     
     jobs = 0/66481.hpchead.linux, 1/66481.hpchead.linux,chead.linux, 6/66481.hpchead.linux, 7/66481.hpchead.linux
     status = opsys=linux,uname=Linux node14 2.6.27.19-5-,nusers=1,idletime=8568052,totmem=24635060kb,availmem=206free,jobs=66481.hpchead.linux,varattr=,rectime=1333641132

node15                                                   
     state = job-exclusive                               
     np = 8                                              
     properties = beta,eightcores                        
     ntype = cluster                                     
     jobs = 0/66482.hpchead.linux, 1/66482.hpchead.linux,chead.linux, 6/66482.hpchead.linux, 7/66482.hpchead.linux
     status = opsys=linux,uname=Linux node15 2.6.27.19-5-,nusers=1,idletime=8567636,totmem=24635012kb,availmem=212free,jobs=66482.hpchead.linux,varattr=,rectime=1333641092

node16                                                   
     state = job-exclusive                               
     np = 8                                              
     properties = beta,eightcores                        
     ntype = cluster                                     
     jobs = 0/66481.hpchead.linux, 1/66481.hpchead.linux,chead.linux, 6/66481.hpchead.linux, 7/66481.hpchead.linux
     status = opsys=linux,uname=Linux node16 2.6.27.19-5-=1,idletime=8564418,totmem=24634928kb,availmem=20700104kbbs=66481.hpchead.linux,varattr=,rectime=1333641117       

node17                                                   
     state = job-exclusive                               
     np = 8                                              
     properties = beta,eightcores                        
     ntype = cluster                                     
     jobs = 0/66482.hpchead.linux, 1/66482.hpchead.linux,chead.linux, 6/66482.hpchead.linux, 7/66482.hpchead.linux
     status = opsys=linux,uname=Linux node17 2.6.27.19-5-s=1,idletime=6824915,totmem=24634928kb,availmem=20598068kbs=66482.hpchead.linux,varattr=,rectime=1333641113       

node21                                                   
     state = job-exclusive                               
     np = 12                                             
     properties = blade                                  
     ntype = cluster                                     
     jobs = 0/66483.hpchead.linux, 1/66483.hpchead.linux,chead.linux, 6/66483.hpchead.linux, 7/66483.hpchead.linux.hpchead.linux                                           
     status = opsys=linux,uname=Linux node21 2.6.27.19-5-,nusers=1,idletime=8569176,totmem=26790348kb,availmem=203e=free,jobs=66483.hpchead.linux,varattr=,rectime=13336411

node22                                                   
     state = job-exclusive                               
     np = 12                                             
     properties = blade                                  
     ntype = cluster                                     
     jobs = 0/66475.hpchead.linux, 1/66475.hpchead.linux,chead.linux, 6/66475.hpchead.linux, 7/66475.hpchead.linux.hpchead.linux                                           
     status = opsys=linux,uname=Linux node22 2.6.27.19-5-users=1,idletime=8569178,totmem=26790348kb,availmem=21384free,jobs=66475.hpchead.linux,varattr=,rectime=1333641118

node23                                                   
     state = job-exclusive                               
     np = 12                                             
     properties = blade
     ntype = cluster
     jobs = 0/66484.hpchead.linux, 1/66484.hpchead.linux, 2/66484.hpchead.linux, 3/66484.hpchead.linux, 4/66484.hpchead.linux, 5/66484.hpchead.linux, 6/66484.hpchead.linux, 7/66484.hpchead.linux, 8/66484.hpchead.linux, 9/66484.hpchead.linux, 10/66484.hpchead.linux, 11/66484.hpchead.linux
     status = opsys=linux,uname=Linux node23 2.6.27.19-5-default #1 SMP 2009-02-28 04:40:21 +0100 x86_64,sessions=10309 10370,nsessions=2,nusers=1,idletime=8569255,totmem=26790348kb,availmem=20165484kb,physmem=24685876kb,ncpus=12,loadave=12.01,gres=,netload=21742922098,state=free,jobs=66484.hpchead.linux,varattr=,rectime=1333641120

node24
     state = job-exclusive
     np = 12
     properties = blade
     ntype = cluster
     jobs = 0/66485.hpchead.linux, 1/66485.hpchead.linux, 2/66485.hpchead.linux, 3/66485.hpchead.linux, 4/66485.hpchead.linux, 5/66485.hpchead.linux, 6/66485.hpchead.linux, 7/66485.hpchead.linux, 8/66485.hpchead.linux, 9/66485.hpchead.linux, 10/66485.hpchead.linux, 11/66485.hpchead.linux
     status = opsys=linux,uname=Linux node24 2.6.27.19-5-default #1 SMP 2009-02-28 04:40:21 +0100 x86_64,sessions=11157 11218,nsessions=2,nusers=1,idletime=8569254,totmem=26790348kb,availmem=21489804kb,physmem=24685876kb,ncpus=12,loadave=12.05,gres=,netload=18486923435,state=free,jobs=66485.hpchead.linux,varattr=,rectime=1333641114

node25
     state = job-exclusive
     np = 12
     properties = blade
     ntype = cluster
     jobs = 0/66469.hpchead.linux, 1/66469.hpchead.linux, 2/66469.hpchead.linux, 3/66469.hpchead.linux, 4/66469.hpchead.linux, 5/66469.hpchead.linux, 6/66469.hpchead.linux, 7/66469.hpchead.linux, 8/66469.hpchead.linux, 9/66469.hpchead.linux, 10/66469.hpchead.linux, 11/66469.hpchead.linux
     status = opsys=linux,uname=Linux node25 2.6.27.19-5-default #1 SMP 2009-02-28 04:40:21 +0100 x86_64,sessions=6711 6772,nsessions=2,nusers=1,idletime=8569282,totmem=26790348kb,availmem=21082316kb,physmem=24685876kb,ncpus=12,loadave=12.00,gres=,netload=15199518313,state=free,jobs=66469.hpchead.linux,varattr=,rectime=1333641095

Edit

Thanks to everyone who answered.

4

6 Answers

4

This should return the correct node values:

r'node\d+(?=[^\n]*\n\s*state\s*=\s*free)'

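This uses a positive lookahead to peek past the end of the line without capturing anything it finds there. It matches only the node value.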

import re
# s holds the text shown in the question
l = re.findall(r'node\d+(?=[^\n]*\n\s*state\s*=\s*free)', s)
print(l)
# ['node13']

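Edit: Inspired by @hexparrot's comment, I realized there is a simpler way. The regex r'node\d+(?=\s*state\s*=\s*free)' is simpler and also works, even though it does not explicitly search for a newline (because \s includes end-of-line characters). However, it does not guarantee that state = free is found on the following line, as the OP requires: it would also match node99 state=free on a single line. So looking for \n explicitly satisfies the OP's requirement better.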
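To make the difference between the two patterns concrete, here is a small sketch. The sample string and the names STRICT and LOOSE are made up for illustration, and node99 is a hypothetical entry whose state sits on the same line as its name.

```python
import re

# Pattern from this answer: the state must be on the line after the node name.
STRICT = re.compile(r'node\d+(?=[^\n]*\n\s*state\s*=\s*free)')
# Simpler variant: \s* also spans newlines, so it does not insist on a line break.
LOOSE = re.compile(r'node\d+(?=\s*state\s*=\s*free)')

# Hypothetical input: node99 carries its state on the same line.
sample = "node13 \n    state = free \nnode99 state = free\n"

strict_hits = STRICT.findall(sample)
loose_hits = LOOSE.findall(sample)
print(strict_hits)  # ['node13']
print(loose_hits)   # ['node13', 'node99']
```

Only the first pattern rejects the same-line case, which is the behaviour the question asks for.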

answered 2012-04-05T15:30:22.697
3

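If you can rely on the generated file being consistently structured (that is, following the same format you showed), a regular expression is sometimes heavier than necessary.

So here is an approach that uses simple iteration: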

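with open('yourfile.txt', 'r') as fp:
    node_dict = {}
    node = None
    for line in fp:
        if line.startswith('node'):
            # Node names start in the first column
            node = line.strip()
            node_dict[node] = 0
        elif node is not None and line.strip().startswith('state'):
            # Test the start of the line: the status line can also contain "state="
            node_dict[node] = line.split('=', 1)[1].strip()

print(node_dict)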

This returns:

{'node13': 'free', 'node14': 'job-exclusive'}

Then it is easy to get the free nodes:

>>> print([k for k, v in node_dict.items() if v == 'free'])
['node13']
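The same loop works on any sequence of lines, so it is not tied to a file. Below is a minimal sketch, assuming the pbsnodes output has already been captured as text. The names parse_states and read_pbsnodes are invented here, and subprocess.check_output is only one possible way to obtain that text.

```python
import subprocess


def parse_states(lines):
    """Map each node name to the value of its 'state' line."""
    states = {}
    node = None
    for line in lines:
        if line.startswith('node'):
            # Node names start in the first column.
            node = line.strip()
            states[node] = None
        elif node is not None and line.strip().startswith('state'):
            # Test the start of the line: the status line may also contain "state=".
            states[node] = line.split('=', 1)[1].strip()
    return states


def read_pbsnodes():
    """Run pbsnodes and return its output as text (assumes pbsnodes is on PATH)."""
    return subprocess.check_output(['pbsnodes'], universal_newlines=True)


# Demonstration on a literal sample instead of a live cluster.
sample = """node13
     state = free
     status = opsys=linux,state=free,jobs=

node14
     state = job-exclusive
"""
states = parse_states(sample.splitlines())
free = [name for name, state in states.items() if state == 'free']
print(free)
```

On a real cluster, parse_states(read_pbsnodes().splitlines()) would be the equivalent call.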
answered 2012-04-05T15:42:11.327
2

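I would suggest parsing the text into a Python structure first and then working with that structure. Regular expressions are too complicated and too fragile for this job. Consider: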

doc = """
node13 
    state = free 
np = 8 
properties = beta,eightcores 
ntype = cluster 
status = opsys=linux,uname=Linux node13 2.6.27.19-5-default etc

node14 
    state = job-exclusive 
np = 8 
properties = beta,eightcores 
ntype = cluster
"""

data = {}
lastkey = None
for line in map(str.strip, doc.splitlines()):
    if ' = ' in line and lastkey:
        k, v = line.split(' = ', 1)
        data[lastkey][k] = v
    elif len(line):
        lastkey = line
        data[lastkey] = {}

This creates a dictionary like this:

{'node13': {'np': '8',
            'ntype': 'cluster',
            'properties': 'beta,eightcores',
            'state': 'free',
            'status': 'opsys=linux,uname=Linux node13 2.6.27.19-5-default etc'},
 'node14': {'np': '8',
            'ntype': 'cluster',
            'properties': 'beta,eightcores',
            'state': 'job-exclusive'}}

which you can work with in the normal Python way:

free_nodes = [k for k, v in data.items() if v.get('state') == 'free']
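Once the data is in a dictionary, further filters are ordinary Python. As an illustration only, the sketch below wraps the loop above in a function (parse_nodes is a name made up here) and picks the free nodes that list a given property.

```python
def parse_nodes(doc):
    """Parse 'key = value' blocks into {node_name: {key: value}}."""
    data = {}
    lastkey = None
    for line in map(str.strip, doc.splitlines()):
        if ' = ' in line and lastkey:
            k, v = line.split(' = ', 1)
            data[lastkey][k] = v
        elif line:
            # Any other non-empty line starts a new node block.
            lastkey = line
            data[lastkey] = {}
    return data


doc = """
node13
    state = free
    np = 8
    properties = beta,eightcores

node14
    state = job-exclusive
    np = 8
    properties = beta,eightcores
"""

nodes = parse_nodes(doc)
# Names of free nodes that list 'beta' among their properties.
free_beta = sorted(
    name for name, attrs in nodes.items()
    if attrs.get('state') == 'free'
    and 'beta' in attrs.get('properties', '').split(',')
)
print(free_beta)
```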
answered 2012-04-05T15:42:59.517
1

You can use the re.DOTALL flag so that . matches everything, including newlines. Here is an example:

>>> st="""
node13 
    state = free 
np = 8 
properties = beta,eightcores 
ntype = cluster 
status = opsys=linux,uname=Linux node13 2.6.27.19-5-default #1 SMP 2009-02-28 04:40:21 +0100 x86_64,sessions=? 15201,nsessions=? 01,nusers=0,idletime=6837317,totmem=20506268kb,availmem=20259728kb,physmem=20506268kb,ncpus=8,loadave=0.00,gres=,netload=17130666575,se=free,jobs=,varattr=,rectime=1333639375 

node14 
    state = job-exclusive 
np = 8 
properties = beta,eightcores 
ntype = cluster
"""

>>> import re
>>> re.findall(r"(node\d+).*?state.*?free", st, re.DOTALL)
['node13']

Note that this can also be done without a regex:

>>> stlines = st.splitlines()
>>> [stlines[i].strip() for i in range(len(stlines) - 1) if stlines[i+1].partition("=")[-1].strip() == 'free']
['node13']

Note: if you need a more robust regex, as Francis showed in his example, you can use the following:

>>> re.findall(r"(node\d+).*?state[ ]*=[ ]*free", st, re.DOTALL)
['node13']
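One caveat worth checking: with re.DOTALL, the lazy .*? may run past the end of a node's block, so a busy node that is followed by a free one can be reported by mistake. The sketch below shows this on a made-up two-node sample and compares it with a stricter pattern. That stricter pattern is an assumption about the layout (node name at the start of a line, state on the very next line), not something taken from the pbsnodes documentation.

```python
import re

# Made-up sample: a busy node followed by a free one.
text = "node14\n     state = job-exclusive\n\nnode15\n     state = free\n"

# DOTALL pattern: .*? may cross into the next node's block.
loose = re.findall(r"(node\d+).*?state[ ]*=[ ]*free", text, re.DOTALL)

# Stricter variant: node name at line start, state on the very next line.
strict = re.findall(
    r"^(node\d+)[ \t]*\n[ \t]*state[ \t]*=[ \t]*free\b", text, re.MULTILINE
)

print(loose)   # ['node14']
print(strict)  # ['node15']
```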
answered 2012-04-05T15:32:32.487
1

I agree with @thg435 that regular expressions are overkill for this job. I prefer a very simple solution:

lines = data.split('\n')
num_lines = len(lines)
# Keep each line whose following line reports a free state
[lines[i] for i in range(num_lines - 1) if 'state = free' in lines[i+1]]

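This really captures the essence of what you are trying to do: the current line (presumably the node's name) goes into the list if the next line (lines[i+1]) contains the desired text.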

answered 2012-04-05T15:52:05.770
1

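Looking backward is often easier than looking forward. So instead of thinking about grabbing the current line when the next line contains something, think of it as grabbing the previous line when the current line contains something. Framed in those terms, it is easy to conceive and implement: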

def find_free_node(doc):
    prevline = ""
    for line in doc.splitlines():
        # The node name is the line just before "state = free"
        if line.strip() == "state = free" and prevline.startswith("node"):
            return prevline.strip()
        prevline = line

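Another approach is to keep track of the node you are in rather than the previous line. This works even if the state = free line does not immediately follow the node name line.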

def find_free_node(doc):
    node = ""
    for line in doc.splitlines():
        if line.startswith("node"):
            # Remember which node block we are in
            node = line.strip()
        elif line.strip() == "state = free" and node:
            return node

To me, these are much clearer than a solution based on multi-line regexes.
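Both functions stop at the first free node. If every free node is wanted, the second variant can be turned into a generator. This is a sketch under the same format assumptions, and iter_free_nodes is a name chosen here for illustration.

```python
def iter_free_nodes(doc):
    """Yield the name of every node whose block has a 'state = free' line."""
    node = ""
    for line in doc.splitlines():
        if line.startswith("node"):
            # Remember which node block we are in.
            node = line.strip()
        elif line.strip() == "state = free" and node:
            yield node
            # Avoid reporting the same node twice.
            node = ""


doc = """node12
     state = free
     np = 8

node14
     state = job-exclusive
     np = 8

node13
     np = 8
     state = free
"""
free_nodes = list(iter_free_nodes(doc))
print(free_nodes)  # ['node12', 'node13']
```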

answered 2012-04-05T18:24:52.363