我建议查看 POSIX 字符类的 Regexp 文档。有几个可能适合您的需求。我建议[:graph:]
作为起点,然后根据需要缩小范围。
从文档:
/[[:alnum:]]/ - Alphabetic and numeric character
/[[:alpha:]]/ - Alphabetic character
/[[:blank:]]/ - Space or tab
/[[:cntrl:]]/ - Control character
/[[:digit:]]/ - Digit
/[[:graph:]]/ - Non-blank character (excludes spaces, control characters, and similar)
/[[:lower:]]/ - Lowercase alphabetical character
/[[:print:]]/ - Like [:graph:], but includes the space character
/[[:punct:]]/ - Punctuation character
/[[:space:]]/ - Whitespace character ([:blank:], newline, carriage return, etc.)
/[[:upper:]]/ - Uppercase alphabetical
/[[:xdigit:]]/ - Digit allowed in a hexadecimal number (i.e., 0-9a-fA-F)
Ruby 还支持以下非 POSIX 字符类:
/[[:word:]]/ - A character in one of the following Unicode general categories Letter, Mark, Number, Connector_Punctuation
出于您的目的,例如:
/\s(#[[:graph:]]+)/
将捕获您的两个示例字符串。前面的 Rubular 链接有示例。