I'm new to Python and pandas, so I'd be very grateful for any help with this. My problem is as follows:

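I have a .txt file containing a set of reactions as strings (R1, R2, ...). Each reaction has compounds (A, B, C, D, ...) with their respective stoichiometric coefficients (1, 2, 3, ...), for example: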

R1: A + 2B + C <=> D

R2: A + B <=> C

How can I create a DataFrame in Python in the form of a stoichiometric matrix (compounds as rows, reactions as columns), like this:

   R1  R2
A  -1  -1
B  -2  -1
C  -1   1
D   1   0

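Note: compounds on the left-hand side of the equation should have negative stoichiometric values, and compounds on the right-hand side should have positive ones.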

Thanks =D

1 Answer

Try this:

import pandas as pd
import re  # regular expressions

def coeff_comp(s):
    # Separate stoichiometric coefficient and compound
    # raw string so that \d is not treated as a string escape sequence
    result = re.search(r'(?P<coeff>\d*)(?P<comp>.*)', s)
    coeff = result.group('coeff')
    comp = result.group('comp')
    if not coeff:
        coeff = '1'                          # coefficient=1 if it is missing
    return comp, int(coeff)

equations = ['R1: A + 2B + C <=> D', 'R2: A + B <=> C']  # some test data
reactions_dict = {}                          # results dictionary

for equation in equations:
    compounds = {}                           # dict -> compound: coeff 
    eq = equation.replace(' ', '')  
    r_id, reaction = eq.split(':')           # separate id from chem reaction
    lhs, rhs = reaction.split('<=>')         # split left and right hand side
    reagents = lhs.split('+')                # get list of reagents
    products = rhs.split('+')                # get list of products
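    for reagent in reagents:
        comp, coeff = coeff_comp(reagent)
        # negative on lhs; accumulate in case a compound is listed twice
        compounds[comp] = compounds.get(comp, 0) - coeff
    for product in products:
        comp, coeff = coeff_comp(product)
        # positive on rhs; adds to the lhs value if the compound is on both sides
        compounds[comp] = compounds.get(comp, 0) + coeff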
    reactions_dict[r_id] = compounds         

# insert dict into DataFrame, replace NaN with 0, let values be int
df = pd.DataFrame(reactions_dict).fillna(value=0).astype(int)
print(df)

The output looks like:

   R1  R2
A  -1  -1
B  -2  -1
C  -1   1
D   1   0
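
Since the question mentions a .txt file, here is a rough sketch of the same idea reading the reactions line by line from a file-like object. The names `TERM`, `parse_side` and `read_stoichiometry` are made up for this illustration, and it assumes one reaction per line in the `ID: lhs <=> rhs` format shown in the question, with integer coefficients only. An `io.StringIO` stands in for the real file here.

```python
import io
import re

import pandas as pd

# Optional integer coefficient followed by the compound name (e.g. '2B', 'A').
TERM = re.compile(r'(?P<coeff>\d*)(?P<comp>.+)')


def parse_side(side, sign):
    """Return {compound: signed coefficient} for one side of a reaction."""
    out = {}
    for term in side.split('+'):
        m = TERM.fullmatch(term)
        if m is None:
            raise ValueError(f'cannot parse term {term!r}')
        coeff = int(m.group('coeff') or 1)   # missing coefficient means 1
        comp = m.group('comp')
        out[comp] = out.get(comp, 0) + sign * coeff
    return out


def read_stoichiometry(handle):
    """Build a compounds x reactions DataFrame from an open text file."""
    reactions = {}
    for line in handle:
        line = ''.join(line.split())         # drop all whitespace, incl. newline
        if not line:
            continue                         # skip blank lines
        r_id, reaction = line.split(':', 1)
        lhs, rhs = reaction.split('<=>')
        coeffs = parse_side(lhs, -1)         # reagents are negative
        for comp, value in parse_side(rhs, +1).items():
            coeffs[comp] = coeffs.get(comp, 0) + value
        reactions[r_id] = coeffs
    # compounds missing from a reaction become 0; sort rows by compound name
    return pd.DataFrame(reactions).fillna(0).astype(int).sort_index()


# Stand-in for the .txt file from the question
sample = io.StringIO("R1: A + 2B + C <=> D\n\nR2: A + B <=> C\n")
df = read_stoichiometry(sample)
print(df)
```

With an actual file you would pass the open handle instead, for example `with open('reactions.txt') as fh: df = read_stoichiometry(fh)`, where `reactions.txt` is a placeholder for your own file name.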
Answered 2018-04-18T14:07:12.170