java - Regex for instances in a STEP file?

Question

I have to parse some STEP files (ISO-10303-21) that come from different CAD systems, and they are always structured differently. These are the forms that can occur:

#95=STYLED_ITEM('',(#94),#92);
#12 = CARTESIAN_POINT ( 'NONE',  ( 1.213489432997839200,
5.617300827691964000, -7.500000000000001800 ) ) ;
#263 = TEST ( 'Spaces must not be ignored here' ) ;

I thought a regex would help me, so I created this one (http://rubular.com/r/EtJ25Hfg77):

(\#\d+)\s*=\s*([A-Z_]+)\s*\(\s*(.*)*\s*\)\s*;

This gives me:

Match 1:
1: #95
2: STYLED_ITEM
3:

Match 2:
1: #12
2: CARTESIAN_POINT
3:

Match 3:
1: #263
2: TEST
3:

So the first two groups work as expected. But I also need the attributes inside the parentheses, like this:

Match 1:
1: #95
2: STYLED_ITEM
3: ''
4: (#94)
5: #92

Match 2:
1: #12
2: CARTESIAN_POINT
3: 'NONE'
4: ( 1.213489432997839200, 5.617300827691964000, -7.500000000000001800 )

Match 3:
1: #263
2: TEST
3: 'Spaces must not be ignored here'

Please help me find the right expression for the last group (currently (.*)).

score 3 · Accepted Answer

JSDAI is a free, open-source Java toolkit for working with STEP files, available under the AGPL license for non-commercial use.

http://www.jsdai.net/

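The STEPcode project is BSD-licensed, and therefore always free and open source. It generates C++ and Python APIs along with an example STEP file reader/writer, and other open-source projects such as BRL-CAD, SCView, and OpenVSP use it.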

www.stepcode.org

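OpenCasCade has a C++ API, pythonOCC a Python API, and node-occ a JavaScript API for working with data converted from STEP; all of them are free and open source as well. OCE runs on more platforms and has more bug fixes.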

https://github.com/tpaviot/oce

score 1 · Accepted Answer

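feuerball, you asked for a regex... this one captures the five groups you want.

I formatted the regex in free-spacing mode to make it easier to read. I haven't explained it in detail, but every line is commented, and I'm sure you can follow it. :)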

regexp = /(?x)   # free-spacing mode
^                # assert start of line
(\#\d+)          # captures the # and the digits into Group 1
\s*=\s*          # gets us past the equal and spaces
([A-Z_]+)        # captures the name into Group 2
\s*\(\s*'        # gets us inside the opening quote
([^']*?)'        # captures the string in Group 3
(?:              # start optional non-capturing group, let's call it A
\s*,\s*            # get over the comma and spaces
(\([^)]*?\))       # capture parens to Group 4
(?:\s*,\s*         # start optional non-capturing group, let's call it B
([^\s)]+)            # capture last string to Group 5
)?                 # end optional non-capturing group B
)?               # end optional non-capturing group A
\s*\)\s*;        # close string
/

subject.scan(regexp) {|result|
# If the regex has capturing groups, result is an array with the text matched by each group (but without the overall match)
# If the regex has no capturing groups, result is a string with the overall regex match
}
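
Since the question is about Java, here is a rough sketch of how the same pattern might be ported to `java.util.regex`. The class name `StepInstanceRegex` and the `parse` helper are made up for this illustration, and the pattern is written as concatenated string pieces with Java comments instead of free-spacing mode. It carries the same assumptions as the regex above: the first attribute must be a quoted string, and at most one parenthesized list and one trailing value may follow.

```java
import java.util.Arrays;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class StepInstanceRegex {
    // Same structure as the Ruby regex above, one piece per line.
    static final Pattern INSTANCE = Pattern.compile(
          "^(#\\d+)"               // Group 1: instance id such as #95
        + "\\s*=\\s*"              // equals sign with optional spaces
        + "([A-Z_]+)"              // Group 2: entity name
        + "\\s*\\(\\s*'"           // opening paren and opening quote
        + "([^']*?)'"              // Group 3: text between the quotes
        + "(?:\\s*,\\s*"           // optional part A: comma ...
        +   "(\\([^)]*?\\))"       // Group 4: a parenthesized list
        +   "(?:\\s*,\\s*"         // optional part B: comma ...
        +     "([^\\s)]+)"         // Group 5: trailing value
        +   ")?"
        + ")?"
        + "\\s*\\)\\s*;",          // closing paren and semicolon
        Pattern.MULTILINE);        // let ^ match at the start of each line

    /** Returns groups 1..5 of the first match (unmatched groups are null), or null if nothing matches. */
    static String[] parse(String statement) {
        Matcher m = INSTANCE.matcher(statement);
        if (!m.find()) {
            return null;
        }
        String[] groups = new String[5];
        for (int i = 0; i < groups.length; i++) {
            groups[i] = m.group(i + 1);
        }
        return groups;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parse("#95=STYLED_ITEM('',(#94),#92);")));
        System.out.println(Arrays.toString(parse("#263 = TEST ( 'Spaces must not be ignored here' ) ;")));
    }
}
```

Unlike the output requested in the question, Group 3 here holds the string without its surrounding quotes, because the quotes sit outside the capturing group.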

score 0 · Accepted Answer

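I don't think regex is the way to go in this situation. STEP is a very common format, and parsers for it already exist. Since you're using Java, why not take a look at this: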

http://www.steptools.com/support/stdev_docs/javalib/programming.html#SEC0-5-0

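I think this is the format you're using, right?

Unless you take the whole schema into account, you're bound to run into problems with regex. And even if you do manage to account for everything, you will have essentially written a parser. Why reinvent the wheel?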
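
To illustrate the "you end up writing a parser" point, here is a minimal sketch of what the first step of such a hand-written parser could look like in Java: splitting the text between the outer parentheses into top-level attributes while respecting nested parentheses and quoted strings. The class `StepAttributeSplitter` and the method `splitTopLevel` are hypothetical names for this example. It is not a full Part 21 reader; it ignores comments and does no validation against a schema.

```java
import java.util.ArrayList;
import java.util.List;

public class StepAttributeSplitter {
    /**
     * Splits an attribute list such as "'',(#94),#92" at commas that are
     * neither inside a quoted string nor inside nested parentheses.
     */
    static List<String> splitTopLevel(String attributes) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int depth = 0;            // nesting level of parentheses
        boolean inString = false; // true while between apostrophes

        for (int i = 0; i < attributes.length(); i++) {
            char c = attributes.charAt(i);
            if (c == '\'') {
                // A doubled apostrophe toggles twice and so stays inside the string.
                inString = !inString;
            } else if (!inString) {
                if (c == '(') {
                    depth++;
                } else if (c == ')') {
                    depth--;
                } else if (c == ',' && depth == 0) {
                    // Top-level comma: finish the current attribute.
                    parts.add(current.toString().trim());
                    current.setLength(0);
                    continue;
                }
            }
            current.append(c);
        }
        parts.add(current.toString().trim());
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(splitTopLevel("'',(#94),#92"));
        System.out.println(splitTopLevel(" 'Spaces must not be ignored here' "));
    }
}
```

Even this small piece needs state (nesting depth, string mode) that a single regular expression handles poorly, which is the reason an existing STEP library is usually the safer choice.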
