
How should I use nom to parse a quoted string similar to Rust's raw strings? I want to parse the following:

"A standard string"
#"A string containing ["] a quote"#
##"A string containing ["#] a quote and hash "##

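How would I do this, requiring the same number of '#' symbols at the start and the end, while allowing #'ed strings to contain unescaped quotes and hashes?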


1 Answer


This would be my approach (using nom-5.1.1):

extern crate nom;

use nom::{
  IResult,
  multi::{count, fold_many0, many_till},
  bytes::complete::{tag, take},
  sequence::pair
};

fn quoted_str(input: &str) -> IResult<&str, &str> {

  // Count number of leading #
  let (remaining, hash_count) = fold_many0(tag("#"), 0, |acc, _| acc + 1)(input)?;

  // Match "
  let (remaining, _) = tag("\"")(remaining)?;

  // Take until closing " plus # (repeated hash_count times)
  let closing = pair(tag("\""), count(tag("#"), hash_count));
  let (remaining, (inner, _)) = many_till(take(1u32), closing)(remaining)?;

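  // Extract inner range. `take(1u32)` yields one char per element, so sum
  // the byte lengths instead of counting elements; counting elements would
  // slice at the wrong byte offset for multi-byte characters.
  let offset = hash_count + 1;
  let length: usize = inner.iter().map(|s| s.len()).sum();

  Ok((remaining, &input[offset .. offset + length]))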
}

#[test]
fn run_test() {
  assert_eq!(quoted_str("\"ABC\""), Ok(("", "ABC")));
  assert_eq!(quoted_str("#\"ABC\"#"), Ok(("", "ABC")));
  assert_eq!(quoted_str("##\"ABC\"##"), Ok(("", "ABC")));
  assert_eq!(quoted_str("###\"ABC\"###"), Ok(("", "ABC")));

  assert_eq!(quoted_str("#\"ABC\"XYZ\"#"), Ok(("", "ABC\"XYZ")));
  assert_eq!(quoted_str("#\"ABC\"#XYZ\"#"), Ok(("XYZ\"#", "ABC")));
  assert_eq!(quoted_str("#\"ABC\"##XYZ\"#"), Ok(("#XYZ\"#", "ABC")));

  assert_eq!(quoted_str("##\"ABC\"XYZ\"##"), Ok(("", "ABC\"XYZ")));
  assert_eq!(quoted_str("##\"ABC\"#XYZ\"##"), Ok(("", "ABC\"#XYZ")));
  assert_eq!(quoted_str("##\"ABC\"##XYZ\"##"), Ok(("XYZ\"##", "ABC")));
  assert_eq!(quoted_str("##\"ABC\"###XYZ\"##"), Ok(("#XYZ\"##", "ABC")));

  assert_eq!(quoted_str("\"ABC\"XYZ"), Ok(("XYZ", "ABC")));
  assert_eq!(quoted_str("#\"ABC\"#XYZ"), Ok(("XYZ", "ABC")));
  assert_eq!(quoted_str("##\"ABC\"##XYZ"), Ok(("XYZ", "ABC")));
}

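If performance matters to you, the implicit vector allocation in many_till could be avoided by writing a fold_many_till function based on the code of fold_many0 and many_till. nom does not currently seem to provide such a function.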
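As a rough illustration of the allocation-free idea, here is a minimal sketch in plain std Rust rather than a nom combinator, so that it stays self-contained. `raw_quoted` is a hypothetical helper name, not part of nom, and it returns `Option` instead of `IResult`. It assumes the same matching rule as the parser above: the string ends at the first `"` followed by at least `hash_count` hashes.

```rust
/// Hypothetical helper (not a nom API): parses `#*"..."#*` and returns
/// `(remaining, inner)` by slicing the input, without building a Vec.
fn raw_quoted(input: &str) -> Option<(&str, &str)> {
    // Count the leading '#' characters ('#' is a single byte in UTF-8,
    // so the count is also a valid byte offset).
    let hash_count = input.bytes().take_while(|&b| b == b'#').count();

    // The opening quote must directly follow the hashes.
    let body = input[hash_count..].strip_prefix('"')?;

    // Look at every '"' and accept the first one that is followed by
    // at least `hash_count` hashes.
    for (idx, _) in body.match_indices('"') {
        let after_quote = &body[idx + 1..];
        let hashes = after_quote.bytes().take_while(|&b| b == b'#').count();
        if hashes >= hash_count {
            return Some((&after_quote[hash_count..], &body[..idx]));
        }
    }

    // No matching closing delimiter was found.
    None
}
```

Because the result is sliced by byte offsets taken from `match_indices`, multi-byte characters inside the string are handled without extra bookkeeping. Wrapping this in a function returning `IResult` would be one way to use it alongside other nom parsers.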

answered 2020-05-24T16:59:58.303