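Does C# have built-in support for parsing page-number strings? By page numbers I mean the format you might enter in a print dialog, a mix of comma- and dash-delimited values.
Something like this:
1,3,5-10,12
A really good solution would give me back some kind of list of all the page numbers the string represents. For the example above, it would be nice to get back a list like this:
1,3,5,6,7,8,9,10,12
I just want to avoid rolling my own if there's an easy way to do it.
Should be simple: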
// requires: using System.Collections.Generic; using System.Linq;
// usage: ParsePageNumbers("1,3,5-10,12")
static IEnumerable<int> ParsePageNumbers(string input)
{
    foreach (string s in input.Split(','))
    {
        // try and get the number
        int num;
        if (int.TryParse(s, out num))
        {
            yield return num;
            continue; // skip the rest
        }

        // otherwise we might have a range
        // split on the range delimiter
        string[] subs = s.Split('-');
        int start, end;

        // now see if we can parse a start and end
        // (entries that cannot be parsed are silently skipped)
        if (subs.Length > 1 &&
            int.TryParse(subs[0], out start) &&
            int.TryParse(subs[1], out end) &&
            end >= start)
        {
            // create a range between the two values
            int rangeLength = end - start + 1;
            foreach (int i in Enumerable.Range(start, rangeLength))
            {
                yield return i;
            }
        }
    }
}
Edit: thanks for the fix ;-)
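There's no built-in way to do this, but it would be trivial using String.Split.
Simply split on ',' and you have a series of strings that each represent either a page number or a range. Iterate over that series and do a String.Split on '-'. If the split yields a single element, it is a plain page number, so add it to your list of pages. If it yields two elements, take the left and right sides of the '-' as the bounds and use a simple for loop to add every page number in that range to your final list.
It would only take 5 minutes to write, then maybe another 10 to add some sanity checks that throw errors when the user tries to enter invalid data (like "1-2-3" or something).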
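The steps above can be sketched roughly as follows. This is only an illustration of the split-and-loop idea, not a built-in API: the names `PageRangeSketch`, `Parse`, and `ParseNumber`, and the choice of `FormatException` for invalid input, are assumptions made for the example.

```csharp
using System;
using System.Collections.Generic;

public static class PageRangeSketch
{
    // Hypothetical helper: expands "1,3,5-10,12" into individual page numbers.
    public static List<int> Parse(string input)
    {
        var pages = new List<int>();
        foreach (string item in input.Split(','))
        {
            string[] bounds = item.Split('-');
            if (bounds.Length == 1)
            {
                // No '-' present: a plain page number.
                pages.Add(ParseNumber(bounds[0], item));
            }
            else if (bounds.Length == 2)
            {
                int first = ParseNumber(bounds[0], item);
                int last = ParseNumber(bounds[1], item);
                if (first > last)
                    throw new FormatException("Range '" + item + "' is reversed.");
                for (int page = first; page <= last; page++)
                    pages.Add(page);
            }
            else
            {
                // More than one '-', e.g. "1-2-3".
                throw new FormatException("Range '" + item + "' is not valid.");
            }
        }
        return pages;
    }

    // Parses one bound; 'item' is only used to build the error message.
    private static int ParseNumber(string text, string item)
    {
        int value;
        if (!int.TryParse(text.Trim(), out value))
            throw new FormatException("'" + item + "' is not a valid page entry.");
        return value;
    }
}
```

With this sketch, `PageRangeSketch.Parse("1,3,5-10,12")` returns 1, 3, 5, 6, 7, 8, 9, 10, 12, while `"1-2-3"` throws a `FormatException`.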
Keith's approach seems nice. I put together a more naive approach using lists. This one has error checking, so hopefully it catches most problems:
public List<int> parsePageNumbers(string input) {
    if (string.IsNullOrEmpty(input))
        throw new InvalidOperationException("Input string is empty.");

    var pageNos = input.Split(',');
    var ret = new List<int>();
    foreach (string pageString in pageNos) {
        if (pageString.Contains("-")) {
            parsePageRange(ret, pageString);
        } else {
            ret.Add(parsePageNumber(pageString));
        }
    }

    ret.Sort();
    return ret.Distinct().ToList();
}

private int parsePageNumber(string pageString) {
    int ret;
    if (!int.TryParse(pageString, out ret)) {
        throw new InvalidOperationException(
            string.Format("Page number '{0}' is not valid.", pageString));
    }
    return ret;
}

private void parsePageRange(List<int> pageNumbers, string pageNo) {
    var pageRange = pageNo.Split('-');
    if (pageRange.Length != 2)
        throw new InvalidOperationException(
            string.Format("Page range '{0}' is not valid.", pageNo));

    int startPage = parsePageNumber(pageRange[0]),
        endPage = parsePageNumber(pageRange[1]);
    if (startPage > endPage) {
        throw new InvalidOperationException(
            string.Format("Page number {0} is greater than page number {1}" +
                " in page range '{2}'", startPage, endPage, pageNo));
    }

    pageNumbers.AddRange(Enumerable.Range(startPage, endPage - startPage + 1));
}
Below is some code I just put together. It accepts input in a format like this: 1-2,5abcd,6,7,20-15,,,,,,
It is easy to extend to other formats.
private int[] ParseRange(string ranges)
{
    string[] groups = ranges.Split(',');
    return groups.SelectMany(t => GetRangeNumbers(t)).ToArray();
}

private int[] GetRangeNumbers(string range)
{
    //string justNumbers = new String(text.Where(Char.IsDigit).ToArray());
    int[] RangeNums = range
        .Split('-')
        .Select(t => new String(t.Where(Char.IsDigit).ToArray())) // Digits Only
        .Where(t => !string.IsNullOrWhiteSpace(t)) // Only if has a value
        .Select(t => int.Parse(t)) // digit to int
        .ToArray();

    return RangeNums.Length.Equals(2)
        ? Enumerable.Range(RangeNums.Min(), (RangeNums.Max() + 1) - RangeNums.Min()).ToArray()
        : RangeNums;
}
Here's something I made for a similar problem.
It handles the following types of ranges:
1       single number
1-5     range
-5      range from (firstpage) up to 5
5-      range from 5 up to (lastpage)
..      can use .. instead of -
; ,     semicolon, comma, and space can all be used as separators
It does not check for duplicate values, so the set 1,5,-10 will produce the sequence 1, 5, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10.
public class RangeParser
{
    public static IEnumerable<Int32> Parse(String s, Int32 firstPage, Int32 lastPage)
    {
        String[] parts = s.Split(' ', ';', ',');
        Regex reRange = new Regex(@"^\s*((?<from>\d+)|(?<from>\d+)(?<sep>(-|\.\.))(?<to>\d+)|(?<sep>(-|\.\.))(?<to>\d+)|(?<from>\d+)(?<sep>(-|\.\.)))\s*$");
        foreach (String part in parts)
        {
            Match maRange = reRange.Match(part);
            if (maRange.Success)
            {
                Group gFrom = maRange.Groups["from"];
                Group gTo = maRange.Groups["to"];
                Group gSep = maRange.Groups["sep"];

                if (gSep.Success)
                {
                    // A separator was found: missing bounds default to the first/last page.
                    Int32 from = firstPage;
                    Int32 to = lastPage;
                    if (gFrom.Success)
                        from = Int32.Parse(gFrom.Value);
                    if (gTo.Success)
                        to = Int32.Parse(gTo.Value);
                    for (Int32 page = from; page <= to; page++)
                        yield return page;
                }
                else
                    yield return Int32.Parse(gFrom.Value);
            }
        }
    }
}
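Since the parser above does not remove duplicates, one option is to post-process its output with LINQ. The sketch below is only an illustration: `DedupSketch` and `Normalize` are hypothetical names, and the sample array simply stands in for the sequence that 1,5,-10 is described as producing.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class DedupSketch
{
    // Hypothetical helper: removes repeated pages and sorts them ascending.
    public static List<int> Normalize(IEnumerable<int> pages)
    {
        return pages.Distinct().OrderBy(p => p).ToList();
    }
}
```

For example, `DedupSketch.Normalize(new[] { 1, 5, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 })` returns the pages 1 through 10 once each. Whether sorting is wanted depends on whether the order the user typed should be preserved; dropping the `OrderBy` call keeps first-occurrence order.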
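You can't be sure until you have test cases. In my case I preferred whitespace-delimited input rather than comma-delimited. That makes the parsing a little more complex.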
[Fact]
public void ShouldBeAbleToParseRanges()
{
    RangeParser.Parse( "1" ).Should().BeEquivalentTo( 1 );
    RangeParser.Parse( "-1..2" ).Should().BeEquivalentTo( -1,0,1,2 );
    RangeParser.Parse( "-1..2 " ).Should().BeEquivalentTo( -1,0,1,2 );
    RangeParser.Parse( "-1..2 5" ).Should().BeEquivalentTo( -1,0,1,2,5 );
    RangeParser.Parse( " -1 .. 2 5" ).Should().BeEquivalentTo( -1,0,1,2,5 );
}
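Note that Keith's answer (or a small variation of it) will fail the last test, where there is whitespace between the range tokens. Handling that requires a tokenizer and a proper parser with lookahead.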
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.RegularExpressions;

namespace Utils
{
    public class RangeParser
    {
        public class RangeToken
        {
            public string Name;
            public string Value;
        }

        public static IEnumerable<RangeToken> Tokenize(string v)
        {
            // A number is an optional minus sign followed by digits
            // (so a lone "0" is also a valid token); ".." marks a range.
            var pattern =
                @"(?<number>-?[0-9]+)|" +
                @"(?<range>\.\.)";
            var regex = new Regex( pattern );
            var matches = regex.Matches( v );
            foreach (Match match in matches)
            {
                var numberGroup = match.Groups["number"];
                if (numberGroup.Success)
                {
                    yield return new RangeToken {Name = "number", Value = numberGroup.Value};
                    continue;
                }
                var rangeGroup = match.Groups["range"];
                if (rangeGroup.Success)
                {
                    yield return new RangeToken {Name = "range", Value = rangeGroup.Value};
                }
            }
        }

        // Start:   no pending number
        // Unknown: one number read; it may be a single value or the start of a range
        // InRange: a number followed by ".." has been read; waiting for the end value
        public enum State { Start, Unknown, InRange }

        public static IEnumerable<int> Parse(string v)
        {
            var tokens = Tokenize( v );
            var state = State.Start;
            var number = 0;
            foreach (var token in tokens)
            {
                switch (token.Name)
                {
                    case "number":
                        var nextNumber = int.Parse( token.Value );
                        switch (state)
                        {
                            case State.Start:
                                number = nextNumber;
                                state = State.Unknown;
                                break;
                            case State.Unknown:
                                // The pending number was a single value; emit it.
                                yield return number;
                                number = nextNumber;
                                break;
                            case State.InRange:
                                int rangeLength = nextNumber - number + 1;
                                foreach (int i in Enumerable.Range( number, rangeLength ))
                                {
                                    yield return i;
                                }
                                state = State.Start;
                                break;
                            default:
                                throw new ArgumentOutOfRangeException();
                        }
                        break;
                    case "range":
                        switch (state)
                        {
                            case State.Start:
                                throw new ArgumentOutOfRangeException();
                            case State.Unknown:
                                state = State.InRange;
                                break;
                            case State.InRange:
                                throw new ArgumentOutOfRangeException();
                            default:
                                throw new ArgumentOutOfRangeException();
                        }
                        break;
                    default:
                        throw new ArgumentOutOfRangeException( nameof( token ) );
                }
            }

            // Flush whatever is still pending once the input ends.
            switch (state)
            {
                case State.Start:
                    break;
                case State.Unknown:
                    yield return number;
                    break;
                case State.InRange:
                    // Dangling ".." with no end value: the pending number is dropped.
                    break;
                default:
                    throw new ArgumentOutOfRangeException();
            }
        }
    }
}
A one-line approach with Split and Linq:
string input = "1,3,5-10,12";
IEnumerable<int> result = input.Split(',').SelectMany(x => x.Contains('-') ? Enumerable.Range(int.Parse(x.Split('-')[0]), int.Parse(x.Split('-')[1]) - int.Parse(x.Split('-')[0]) + 1) : new int[] { int.Parse(x) });
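The one-liner above calls `x.Split('-')` three times for every range. A variant that splits each item only once could look like the following sketch; `OneLinerSketch` and `Expand` are hypothetical names, and like the original it assumes well-formed input and does no validation.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class OneLinerSketch
{
    // Hypothetical helper: each comma-separated item becomes an int array of
    // one element (single page) or two elements (range bounds), which is then
    // expanded with Enumerable.Range.
    public static IEnumerable<int> Expand(string input)
    {
        return input.Split(',')
            .Select(item => item.Split('-').Select(s => int.Parse(s)).ToArray())
            .SelectMany(b => Enumerable.Range(b[0], b[b.Length - 1] - b[0] + 1));
    }
}
```

A single page such as "3" yields a one-element array, so the range length works out to 1 and no special case is needed. `OneLinerSketch.Expand("1,3,5-10,12")` produces 1, 3, 5, 6, 7, 8, 9, 10, 12.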
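Here's a slightly modified version of lassevk's code that handles the string.Split operation inside the Regex match. It's written as an extension method, and you can easily deal with the duplicates problem using LINQ's Distinct() extension.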
/// <summary>
/// Parses a string representing a range of values into a sequence of integers.
/// </summary>
/// <param name="s">String to parse</param>
/// <param name="minValue">Minimum value for open range specifier</param>
/// <param name="maxValue">Maximum value for open range specifier</param>
/// <returns>An enumerable sequence of integers</returns>
/// <remarks>
/// The range is specified as a string in the following forms or combination thereof:
/// 5           single value
/// 1,2,3,4,5   sequence of values
/// 1-5         closed range
/// -5          open range (converted to a sequence from minValue to 5)
/// 1-          open range (converted to a sequence from 1 to maxValue)
///
/// The value delimiter can be either ',' or ';' and the range separator can be
/// either '-' or ':'. Whitespace is permitted at any point in the input.
///
/// Any elements of the sequence that contain non-digit, non-whitespace, or non-separator
/// characters or that are empty are ignored and not returned in the output sequence.
/// </remarks>
public static IEnumerable<int> ParseRange2(this string s, int minValue, int maxValue) {
    const string pattern = @"(?:^|(?<=[,;]))                  # match must begin with start of string or delim, where delim is , or ;
                             \s*(                                # leading whitespace
                             (?<from>\d*)\s*(?:-|:)\s*(?<to>\d+) # capture 'from <sep> to' or '<sep> to', where <sep> is - or :
                             |                                   # or
                             (?<from>\d+)\s*(?:-|:)\s*(?<to>\d*) # capture 'from <sep> to' or 'from <sep>', where <sep> is - or :
                             |                                   # or
                             (?<num>\d+)                         # capture lone number
                             )\s*                                # trailing whitespace
                             (?:(?=[,;])|$)                      # match must end with end of string or delim, where delim is , or ;";

    Regex regx = new Regex(pattern, RegexOptions.IgnorePatternWhitespace | RegexOptions.Compiled);

    foreach (Match m in regx.Matches(s)) {
        Group gpNum = m.Groups["num"];
        if (gpNum.Success) {
            yield return int.Parse(gpNum.Value);
        } else {
            Group gpFrom = m.Groups["from"];
            Group gpTo = m.Groups["to"];
            if (gpFrom.Success || gpTo.Success) {
                // An empty bound means an open range: fall back to minValue/maxValue.
                int from = (gpFrom.Success && gpFrom.Value.Length > 0 ? int.Parse(gpFrom.Value) : minValue);
                int to = (gpTo.Success && gpTo.Value.Length > 0 ? int.Parse(gpTo.Value) : maxValue);

                for (int i = from; i <= to; i++) {
                    yield return i;
                }
            }
        }
    }
}
The answer I came up with:
static IEnumerable<string> ParseRange(string str)
{
    var numbers = str.Split(',');
    foreach (var n in numbers)
    {
        if (!n.Contains("-"))
            yield return n;
        else
        {
            // Text before the first '-' is the start, text after the last '-' is the end.
            string startStr = String.Join("", n.TakeWhile(c => c != '-'));
            int startInt = Int32.Parse(startStr);

            string endStr = String.Join("", n.Reverse().TakeWhile(c => c != '-').Reverse());
            int endInt = Int32.Parse(endStr);

            var range = Enumerable.Range(startInt, endInt - startInt + 1)
                                  .Select(num => num.ToString());
            foreach (var s in range)
                yield return s;
        }
    }
}
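Regex is less efficient than the following code. For simple splitting like this, plain string methods generally carry less overhead than regular expressions, so prefer them wherever they are sufficient.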
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Text.RegularExpressions;

namespace ConsoleApplication1
{
    class Program
    {
        static void Main(string[] args)
        {
            string[] inputs = {
                "001-005/015",
                "009/015"
            };

            foreach (string input in inputs)
            {
                List<int> numbers = new List<int>();
                // Entries are separated by '/' in this input format.
                string[] strNums = input.Split(new char[] { '/' }, StringSplitOptions.RemoveEmptyEntries);
                foreach (string strNum in strNums)
                {
                    if (strNum.Contains("-"))
                    {
                        int startNum = int.Parse(strNum.Substring(0, strNum.IndexOf("-")));
                        int endNum = int.Parse(strNum.Substring(strNum.IndexOf("-") + 1));
                        for (int i = startNum; i <= endNum; i++)
                        {
                            numbers.Add(i);
                        }
                    }
                    else
                        numbers.Add(int.Parse(strNum));
                }
                Console.WriteLine(string.Join(",", numbers.Select(x => x.ToString())));
            }
            Console.ReadLine();
        }
    }
}
My solution:
It auto-completes open entries: 1,-3,5-,8 (Nmax=9) => 1,3,5,6,7,8,9,8
public static List<int> pageRangeToList(string pageRg, int Nmax = 0)
{
    List<int> ls = new List<int>();
    int lb, ub, i;
    foreach (string ss in pageRg.Split(','))
    {
        if (int.TryParse(ss, out lb))
        {
            // A plain number; "-3" parses as -3 and is treated as page 3.
            ls.Add(Math.Abs(lb));
        }
        else
        {
            var subls = ss.Split('-').ToList();
            lb = (int.TryParse(subls[0], out i)) ? i : 0;
            ub = (int.TryParse(subls[1], out i)) ? i : Nmax;
            ub = ub > 0 ? ub : lb; // if ub=0, take 1 value of lb
            for (i = 0; i <= Math.Abs(ub - lb); i++)
                ls.Add(lb < ub ? i + lb : lb - i);
        }
    }
    Nmax = Nmax > 0 ? Nmax : ls.Max(); // real Nmax
    return ls.Where(s => s > 0 && s <= Nmax).ToList();
}