java - Java delimiters for * and / when splitting a string into tokens

Question

I have a string like 22 + 4 * 3 / 4

Now I need to extract the tokens from this string. This is my one line of code:

String[] tokens  = str.split("[ +-*/]+");

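Basically my delimiter string is [+-*/] because I want to split on the symbols + - * /

Unfortunately, this conflicts with the regex meaning of * and /. I tried adding backslashes to * and /, as in [+-\*\/], but it didn't help.

How do I make Java treat * and / literally? I thought I had followed the Java documentation on Pattern: http://docs.oracle.com/javase/6/docs/api/java/util/regex/Pattern.html#sum

What am I missing here?

Thanks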

score 3 · Accepted Answer

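Actually, * and + lose their special meaning when used inside a character class (after all, quantifiers make no sense there), so we don't need to escape them. By contrast, - has a special meaning only inside a character class, and only when it sits between two characters, where it denotes a range. At the start or end of the class it has no special meaning. So, we have: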

[ +*/-]+
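
As a quick check of the pattern above, here is a minimal sketch that passes it to String.split. The class name SplitDemo and the helper splitOperands are hypothetical, made up for this illustration.

```java
import java.util.Arrays;

// Hypothetical demo class; the names are illustrative only.
class SplitDemo {
    // Splits on runs of spaces and the operators + * / -.
    // The hyphen comes last in the class, so it is a literal, not a range.
    static String[] splitOperands(String str) {
        return str.split("[ +*/-]+");
    }

    public static void main(String[] args) {
        // prints [22, 4, 3, 4]
        System.out.println(Arrays.toString(splitOperands("22 + 4 * 3 / 4")));
    }
}
```

Note that split discards the delimiters, so only the operands come back. If the operators are needed as tokens too, splitting alone is not enough.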


score 2 · Accepted Answer

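Inside a character class [...], - is a special character used to create character ranges, such as a-z. To make it a literal, you need to put it at the start of the class, [-...], at the end of the class, [...-], or simply escape it with \, which in Java must be written as "\\-". Try this: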

String[] tokens  = str.split("[ +\\-*/]+");
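
To see why the pattern in the question fails while the escaped one works, the sketch below simply tries to compile both. RangeCheck and compiles are hypothetical names used only for this example. The unescaped version should be rejected because "+-*" is read as a range whose start ('+', U+002B) comes after its end ('*', U+002A).

```java
import java.util.regex.Pattern;
import java.util.regex.PatternSyntaxException;

// Hypothetical helper class for illustration.
class RangeCheck {
    // Returns true if the regex compiles, false if Pattern rejects it.
    static boolean compiles(String regex) {
        try {
            Pattern.compile(regex);
            return true;
        } catch (PatternSyntaxException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Unescaped hyphen between '+' and '*': a backwards range.
        System.out.println(compiles("[ +-*/]+"));
        // Escaped hyphen: treated as a literal character.
        System.out.println(compiles("[ +\\-*/]+"));
    }
}
```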

score 0 · Accepted Answer

Are you trying to parse your string? My guess would be you are trying to perform lexical analysis (scanning) of an input stream.

You could hand-roll a scanner by building a strtok-style tokenizer with character lookahead/pushback.
You could use a tool like lex or flex to build a lexical scanner.
You could use a series of regular expressions and case statements for a poor man's parser.
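
As a rough illustration of the regular-expression option, here is a minimal sketch using java.util.regex.Matcher. The class name RegexTokenizer and the token pattern are assumptions made for this example; it covers only unsigned numbers and the four operators from the question.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical class for illustration; not a full lexer.
class RegexTokenizer {
    // One alternative per token kind: a number with an optional fraction,
    // or a single operator. Whitespace matches neither alternative,
    // so find() skips over it.
    private static final Pattern TOKEN =
            Pattern.compile("\\d+(?:\\.\\d+)?|[+*/-]");

    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        Matcher m = TOKEN.matcher(input);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    public static void main(String[] args) {
        // prints [22, +, 4, *, 3, /, 4]
        System.out.println(tokenize("22 + 4 * 3 / 4"));
    }
}
```

Unlike split, this keeps the operators as tokens. One caveat of the find() loop is that it also silently skips any character the pattern does not know about, so a real scanner would want to detect and report those.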

Suppose you do want to tokenize your algebraic string. You need to define a grammar and the tokens you want to recognize. You need something like BNF (Backus-Naur Form), or you could use 'railroad' syntax diagrams (personally, I prefer BNF, but some people like railroad diagrams).

Here is a start:

expression --> sexpr | nil
parenexpr  --> '(' sexpr ')'
sexpr   --> parenexpr | addexpr | thing | nil
addexpr --> mulexpr addop mulexpr | mulexpr
mulexpr --> parenexpr mulop parenexpr | parenexpr
thing   --> symbol | integer | real | scientific
integer --> { '+' | '-' }? digit+
real    --> { '+' | '-' }? digit+ { '.' digit+ }?
scientific --> { '+' | '-' }? digit+ { '.' digit+ } 'e' { '+' | '-' }? digit+
addop   --> '+' | '-'
mulop   --> '/' | '*' | '^' | '%'
relop   --> '||' | '&&' | '!'
symbol  --> { character | '_' } { character | '_' | digit }*
digit   --> [0-9]
character --> [A-Za-z]
//etc

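This means that each item on the left-hand side of the production symbol (-->) expands to one of the items on the right-hand side. Note that the definition is recursive, which gives you an idea of the kind of programming required. In any case, you will need to scan and recognize each token to collect integer, real, scientific, symbol, addop, mulop, relop, and whatever other tokens you want to extract. Along the way, you need to decide how to handle whitespace (tabs, spaces, newlines) and other undefined symbols.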
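
To make the scanning step concrete, here is a minimal hand-rolled sketch with one character of lookahead. Everything in it is an assumption for illustration: the class name ExprScanner, the "kind:text" output format, and the simplifications. Signs are always emitted as addop rather than folded into numbers, parentheses get their own paren kind, scientific notation and relop are left out, whitespace is skipped, and any other character is rejected.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical scanner for illustration; covers only part of the grammar.
class ExprScanner {
    // Scans left to right and returns tokens as "kind:text" strings.
    static List<String> scan(String src) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < src.length()) {
            char c = src.charAt(i);
            if (c == ' ' || c == '\t' || c == '\n' || c == '\r') {
                i++; // policy chosen here: skip whitespace
            } else if (isDigit(c)) {
                int start = i;
                while (i < src.length() && isDigit(src.charAt(i))) i++;
                String kind = "integer";
                // Lookahead: a '.' followed by a digit starts a fraction.
                if (i + 1 < src.length() && src.charAt(i) == '.'
                        && isDigit(src.charAt(i + 1))) {
                    i++;
                    while (i < src.length() && isDigit(src.charAt(i))) i++;
                    kind = "real";
                }
                out.add(kind + ":" + src.substring(start, i));
            } else if (isLetter(c) || c == '_') {
                int start = i;
                while (i < src.length() && (isLetter(src.charAt(i))
                        || isDigit(src.charAt(i)) || src.charAt(i) == '_')) i++;
                out.add("symbol:" + src.substring(start, i));
            } else if (c == '+' || c == '-') {
                out.add("addop:" + c);
                i++;
            } else if (c == '*' || c == '/' || c == '^' || c == '%') {
                out.add("mulop:" + c);
                i++;
            } else if (c == '(' || c == ')') {
                out.add("paren:" + c);
                i++;
            } else {
                // policy chosen here: reject undefined symbols
                throw new IllegalArgumentException(
                        "Unexpected character '" + c + "' at index " + i);
            }
        }
        return out;
    }

    private static boolean isDigit(char c) {
        return c >= '0' && c <= '9';
    }

    private static boolean isLetter(char c) {
        return (c >= 'A' && c <= 'Z') || (c >= 'a' && c <= 'z');
    }
}
```

For example, ExprScanner.scan("22 + 4 * 3 / 4") yields integer:22, addop:+, integer:4, mulop:*, integer:3, mulop:/, integer:4. A parser built on the grammar would then consume this token list instead of the raw characters.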
