groovy - Groovy script for StreamSets to parse a string of about 1500 characters

Question

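This is for StreamSets; I am trying to write a Groovy script. I have a string that is 1500 characters long, with no delimiters. The pattern is: the first 4 characters are a code, the next 4 characters are the length of a word, and then comes the word itself. After that the pattern repeats: 4 characters of code, 4 characters of word length, then the word. For example 22010005PHONE00010002IN00780004ROSE

When you decode it, it looks like this:

2201 - code, 0005 - length of the word, PHONE - word

0001 - code, 0002 - length of the word, IN - word

0078 - code, 0004 - length of the word, ROSE - word, and so on..

I need help with a Groovy script that builds a string from the words whose code starts with 00. So the final string would be INROSE.

I am trying to use a while loop and str:substring. Any help is much appreciated.

Thanks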

def dtx_buf = record.value['TXN_BUFFER']
def fieldid = []
def fieldlen = []
def dtx_out = []
def i = 13
def j = 0
while (i < dtx_buf.size())
{    
//   values = record.value['TXN_BUFFER']
    fieldid[j] = str.substring(values,j,4)      
    output.write(record)
}

Expected result: "INROSE"

score 4 · Accepted Answer

One way is to write an Iterator that contains the rules for parsing the input:

class Tokeniser implements Iterator {
    String buf
    String code
    String len
    String word

    // hasNext is true if there's still chars left in `buf`        
    boolean hasNext() { buf }

    Object next() {
        // Get the code and the remaining string
        (code, buf) = token(buf)

        // Get the length and the remaining string
        (len, buf) = token(buf)

        // Get the word (of the given length), and the remaining string
        (word, buf) =  token(buf, len as Integer)

        // Return a map of the code and the word
        [code: code, word: word]
    }

    // This splits the string into the first `length` chars, and the rest
    private token(String input, int length = 4) {
        [input.take(length), input.drop(length)]
    }

}

Then we can use it like this:

def result = new Tokeniser(buf: '22010005PHONE00010002IN00780004ROSE')
    .findAll { it.code.startsWith('00') }
    .word
    .join()

The result is INROSE
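
To see why the chained call produces that value, the steps can be run one at a time. The sketch below is not part of the answer's code: it hard-codes the three maps the Tokeniser would return for the sample input and applies the same findAll, property access and join steps to a plain list.

```groovy
// The three records the iterator yields for the sample input,
// written out by hand here so the snippet runs on its own
def tokens = [
    [code: '2201', word: 'PHONE'],
    [code: '0001', word: 'IN'],
    [code: '0078', word: 'ROSE'],
]

// Step 1: keep only the records whose code starts with '00'
def kept = tokens.findAll { it.code.startsWith('00') }

// Step 2: asking a list for a property collects that property
// from every element, giving ['IN', 'ROSE']
def words = kept.word

// Step 3: concatenate the words with no separator
def result = words.join()

assert result == 'INROSE'
```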

Take 2

We can try a different iterative approach without an inner class, to see if that works better in your environment:

def input = '22010005PHONE00010002IN00780004ROSE'
def pos = 0
def words = []

while (pos < input.length() - 8) {
    def code = input.substring(pos, pos + 4)
    def len = input.substring(pos + 4, pos + 8) as Integer
    def word = input.substring(pos + 8, pos + 8 + len)
    if (code.startsWith('00')) {
        words << word
    }
    pos += 8 + len
}

def result = words.join()
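
If the buffer might be truncated, or might carry a fixed-size header before the first code, the same loop can be wrapped in a method that checks each record before reading it. This is only a sketch under assumptions: the name extractWords and its prefix and start parameters are made up for illustration, and throwing IllegalArgumentException on malformed input is one possible policy, not something the question specifies.

```groovy
// Hypothetical helper: joins the words whose code starts with `prefix`.
// `start` skips a fixed-size header, if the buffer has one (an assumption).
String extractWords(String input, String prefix = '00', int start = 0) {
    def words = new StringBuilder()
    int pos = start
    while (pos < input.length()) {
        // Every record needs 8 characters up front: 4 for the code, 4 for the length
        if (pos + 8 > input.length()) {
            throw new IllegalArgumentException("Truncated record at offset ${pos}")
        }
        String code = input.substring(pos, pos + 4)
        String lenField = input.substring(pos + 4, pos + 8)
        // Accept exactly four decimal digits, so negative or blank lengths are rejected
        if (!(lenField ==~ /\d{4}/)) {
            throw new IllegalArgumentException("Bad length '${lenField}' at offset ${pos + 4}")
        }
        int end = pos + 8 + lenField.toInteger()
        if (end > input.length()) {
            throw new IllegalArgumentException("Word at offset ${pos + 8} runs past the end of the input")
        }
        if (code.startsWith(prefix)) {
            words << input.substring(pos + 8, end)
        }
        pos = end
    }
    words.toString()
}

assert extractWords('22010005PHONE00010002IN00780004ROSE') == 'INROSE'
// Skipping a 3-character header that is not part of the record stream
assert extractWords('HDR22010005PHONE00010002IN', '00', 3) == 'IN'
```

In a StreamSets script this might be called as extractWords(record.value['TXN_BUFFER']) for each record; where the result is stored, and whether a malformed buffer should go to the error stream instead, depends on the pipeline.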
