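I'm trying to parse a chemical formula from a string in C# (in a format such as Al2O3, O3, C, or C11H22O12). It works fine unless there is only one atom of a particular element (e.g. the oxygen atom in H2O). How can I fix that and, in addition, is there a better way to parse a chemical formula string than what I'm doing now?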
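ChemicalElement is a class representing a chemical element. It has the properties AtomicNumber (int), Name (string), and Symbol (string). ChemicalFormulaComponent is a class representing a chemical element together with an atom count (i.e. one part of a formula). It has the properties Element (ChemicalElement) and AtomCount (int).
The rest should be clear enough to understand (I hope), but if there is anything I can clarify, please let me know in a comment before you answer.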
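For anyone who wants to run the code, here is a minimal sketch of what the two classes described above might look like. The property names, the two-argument ChemicalFormulaComponent constructor, and ElementFromSymbol come from the code in this question; everything else is an assumption made for illustration, in particular the tiny lookup table and the choice to throw FormatException for an unknown symbol.

```csharp
using System;
using System.Collections.Generic;

public sealed class ChemicalElement
{
    // Assumption: only a handful of elements, for illustration.
    // A real table would list every element.
    private static readonly Dictionary<string, ChemicalElement> BySymbol =
        new Dictionary<string, ChemicalElement>(StringComparer.Ordinal)
        {
            { "H", new ChemicalElement(1, "Hydrogen", "H") },
            { "C", new ChemicalElement(6, "Carbon", "C") },
            { "O", new ChemicalElement(8, "Oxygen", "O") },
            { "Al", new ChemicalElement(13, "Aluminium", "Al") },
        };

    public ChemicalElement(int atomicNumber, string name, string symbol)
    {
        AtomicNumber = atomicNumber;
        Name = name;
        Symbol = symbol;
    }

    public int AtomicNumber { get; }
    public string Name { get; }
    public string Symbol { get; }

    // Assumption: an unknown symbol is reported as a FormatException.
    public static ChemicalElement ElementFromSymbol(string symbol)
    {
        ChemicalElement element;
        if (symbol != null && BySymbol.TryGetValue(symbol, out element))
        {
            return element;
        }
        throw new FormatException("Unknown element symbol: " + symbol);
    }
}

public sealed class ChemicalFormulaComponent
{
    public ChemicalFormulaComponent(ChemicalElement element, int atomCount)
    {
        Element = element;
        AtomCount = atomCount;
    }

    public ChemicalElement Element { get; }
    public int AtomCount { get; }
}
```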
Here is my current code:
/// <summary>
/// Parses a chemical formula from a string.
/// </summary>
/// <param name="chemicalFormula">The string to parse.</param>
/// <exception cref="FormatException">The chemical formula was in an invalid format.</exception>
public static Collection<ChemicalFormulaComponent> FormulaFromString(string chemicalFormula)
{
    Collection<ChemicalFormulaComponent> formula = new Collection<ChemicalFormulaComponent>();
    string nameBuffer = string.Empty;
    int countBuffer = 0;
    for (int i = 0; i < chemicalFormula.Length; i++)
    {
        char c = chemicalFormula[i];
        if (!char.IsLetterOrDigit(c) || !char.IsUpper(chemicalFormula, 0))
        {
            throw new FormatException("Input string was in an incorrect format.");
        }
        else if (char.IsUpper(c))
        {
            // Add the chemical element and its atom count
            if (countBuffer > 0)
            {
                formula.Add(new ChemicalFormulaComponent(ChemicalElement.ElementFromSymbol(nameBuffer), countBuffer));
                // Reset
                nameBuffer = string.Empty;
                countBuffer = 0;
            }
            nameBuffer += c;
        }
        else if (char.IsLower(c))
        {
            nameBuffer += c;
        }
        else if (char.IsDigit(c))
        {
            if (countBuffer == 0)
            {
                countBuffer = c - '0';
            }
            else
            {
                countBuffer = (countBuffer * 10) + (c - '0');
            }
        }
    }
    return formula;
}
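For what it's worth, the loop above adds a component only when an uppercase letter arrives while countBuffer > 0, so an element with no digit after it gets merged into the next symbol, and the last element in the string is never added at all. Below is a minimal sketch of one possible alternative, not a definitive implementation: it matches one element token at a time with a regular expression and assumes a count of 1 when no digits follow the symbol. FormulaParser and ParseSymbols are hypothetical names, and the method returns plain (symbol, count) pairs rather than ChemicalFormulaComponent objects so that it stands alone.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the original code.
public static class FormulaParser
{
    // One token: an uppercase letter, optional lowercase letters, optional ASCII digits.
    // \G anchors the match at the position passed to Regex.Match.
    private static readonly Regex Token = new Regex(@"\G([A-Z][a-z]*)([0-9]*)");

    public static List<(string Symbol, int Count)> ParseSymbols(string chemicalFormula)
    {
        if (string.IsNullOrEmpty(chemicalFormula))
        {
            throw new FormatException("Input string was in an incorrect format.");
        }

        var result = new List<(string Symbol, int Count)>();
        int position = 0;
        while (position < chemicalFormula.Length)
        {
            Match match = Token.Match(chemicalFormula, position);
            if (!match.Success)
            {
                // Anything that is not "Symbol[digits]" at this position is rejected.
                throw new FormatException("Input string was in an incorrect format.");
            }

            string digits = match.Groups[2].Value;
            int count = 1; // Assumption: no digits means exactly one atom.
            if (digits.Length > 0 &&
                !int.TryParse(digits, NumberStyles.None, CultureInfo.InvariantCulture, out count))
            {
                // Reached only when the digits do not fit in an int.
                throw new FormatException("Atom count is out of range.");
            }

            result.Add((match.Groups[1].Value, count));
            position += match.Length; // Always advances: the symbol is at least one character.
        }
        return result;
    }
}
```

With this sketch, FormulaParser.ParseSymbols("H2O") yields (H, 2) followed by (O, 1). Each pair could then be turned into a ChemicalFormulaComponent via ChemicalElement.ElementFromSymbol, as the original method does. It does not handle parentheses or hydrates, and a count of 0 such as "H0" is accepted as written.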