38

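Is there a way in .NET to check whether a string intended to be used as a path contains invalid characters? I know I could iterate over each character in Path.InvalidPathChars to see whether my string contains one, but I'd prefer a simple, perhaps more formal, solution.

Is there one?

I've found I still get an exception if I only check against Get

Update:

I've found that GetInvalidPathChars does not cover every invalid path character. GetInvalidFileNameChars has 5 more, including '?', which is the one I ran into. I'm going to switch to that, and I'll report back if it, too, proves to be inadequate.

Update 2:

GetInvalidFileNameChars is definitely not what I want. It contains ':', which any absolute path is going to contain ("C:\whatever"). I think I'll just have to use GetInvalidPathChars after all and add '?' plus any other characters that cause me problems as they come up. Better solutions welcome.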

4

9 Answers

48

InvalidPathChars is deprecated. Use GetInvalidPathChars() instead:

    public static bool FilePathHasInvalidChars(string path)
    {

        return (!string.IsNullOrEmpty(path) && path.IndexOfAny(System.IO.Path.GetInvalidPathChars()) >= 0);
    }

Edit: Slightly longer, but handles path vs. file invalid characters in one function:

    // WARNING: Not tested
    public static bool FilePathHasInvalidChars(string path)
    {
        bool ret = false;
        if(!string.IsNullOrEmpty(path))
        {
            try
            {
                // Careful!
                //    Path.GetDirectoryName("C:\Directory\SubDirectory")
                //    returns "C:\Directory", which may not be what you want in
                //    this case. You may need to explicitly add a trailing \
                //    if path is a directory and not a file path. As written, 
                //    this function just assumes path is a file path.
                string fileName = System.IO.Path.GetFileName(path);
                string fileDirectory = System.IO.Path.GetDirectoryName(path);

                // We don't need to do anything else:
                // if we got here without throwing an
                // exception, then the path does not
                // contain invalid characters.
            }
            catch (ArgumentException)
            {
                // Path functions will throw this
                // if path contains invalid chars.
                ret = true;
            }
        }
        return ret;
    }
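
The exception-based version above depends on the Path functions actually throwing. As far as I know, starting with .NET Core 2.1 the Path APIs no longer validate characters and so no longer throw ArgumentException here, which would make that function always return false on newer runtimes. A minimal sketch of an explicit check that does not rely on exceptions, assuming the last segment is a file name (the class name `PathSplitCheck` is made up for illustration):

```csharp
using System;
using System.IO;

public static class PathSplitCheck
{
    // Checks the directory part against GetInvalidPathChars() and the
    // last segment against GetInvalidFileNameChars(), without relying
    // on the Path methods throwing.
    public static bool FilePathHasInvalidChars(string path)
    {
        if (string.IsNullOrEmpty(path))
            return false;

        // Find the last separator; everything after it is treated as the file name.
        int lastSep = path.LastIndexOfAny(
            new[] { Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar });

        string directory = lastSep >= 0 ? path.Substring(0, lastSep + 1) : string.Empty;
        string fileName = path.Substring(lastSep + 1);

        return directory.IndexOfAny(Path.GetInvalidPathChars()) >= 0
            || fileName.IndexOfAny(Path.GetInvalidFileNameChars()) >= 0;
    }
}
```

Both arrays are platform-dependent and not guaranteed to be complete, so treat the result as a first filter only. A drive-relative path with no separator, such as "C:file.txt", would be flagged on Windows because ':' is an invalid file name character.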
answered 2010-03-12T21:13:35.260
9

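Be careful when relying on Path.GetInvalidFileNameChars; it may not be as reliable as you think. Note the following remark in the MSDN documentation for Path.GetInvalidFileNameChars:

The array returned from this method is not guaranteed to contain the complete set of characters that are invalid in file and directory names. The full set of invalid characters can vary by file system. For example, on Windows-based desktop platforms, invalid path characters might include ASCII/Unicode characters 1 through 31, as well as quote ("), less than (<), greater than (>), pipe (|), backspace (\b), null (\0) and tab (\t).

The Path.GetInvalidPathChars method is no better. It carries exactly the same remark.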
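
Given that remark, one option is to keep an explicit deny-list of your own on top of the framework's array, much as the question's second update suggests. A minimal sketch, assuming the extra characters below are the ones that matter for your file system (the class name `PathCharCheck` and the `ExtraInvalidChars` list are illustrative choices, not part of the framework):

```csharp
using System;
using System.IO;
using System.Linq;

public static class PathCharCheck
{
    // Assumption: characters to reject in addition to whatever the
    // framework reports. Adjust this list for your target file system.
    private static readonly char[] ExtraInvalidChars = { '?', '*', '"', '<', '>', '|' };

    // Framework-reported characters plus the extras, computed once.
    private static readonly char[] AllInvalidChars =
        Path.GetInvalidPathChars().Concat(ExtraInvalidChars).Distinct().ToArray();

    public static bool HasInvalidPathChars(string path)
    {
        return !string.IsNullOrEmpty(path) && path.IndexOfAny(AllInvalidChars) >= 0;
    }
}
```

Because ':' and the separators are deliberately left off the list, an absolute path such as "C:\whatever" still passes.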

answered 2011-11-16T13:34:24.107
5

As of .NET 4.7.2, Path.GetInvalidFileNameChars() reports the following 41 "bad" characters.

0x0000 0 '\0' | 0x000d 13 '\r' | 0x001b 27 '\u001b'
0x0001 1 '\u0001' | 0x000e 14 '\u000e' | 0x001c 28 '\u001c'
0x0002 2 '\u0002' | 0x000f 15 '\u000f' | 0x001d 29 '\u001d'
0x0003 3 '\u0003' | 0x0010 16 '\u0010' | 0x001e 30 '\u001e'
0x0004 4 '\u0004' | 0x0011 17 '\u0011' | 0x001f 31 '\u001f'
0x0005 5 '\u0005' | 0x0012 18 '\u0012' | 0x0022 34 '"'
0x0006 6 '\u0006' | 0x0013 19 '\u0013' | 0x002a 42 '*'
0x0007 7 '\a' | 0x0014 20 '\u0014' | 0x002f 47 '/'
0x0008 8 '\b' | 0x0015 21 '\u0015' | 0x003a 58 ':'
0x0009 9 '\t' | 0x0016 22 '\u0016' | 0x003c 60 '<'
0x000a 10 '\n' | 0x0017 23 '\u0017' | 0x003e 62 '>'
0x000b 11 '\v' | 0x0018 24 '\u0018' | 0x003f 63 '?'
0x000c 12 '\f' | 0x0019 25 '\u0019' | 0x005c 92 '\\'
                        | 0x001a 26 '\u001a' | 0x007c 124 '|'

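As another poster noted, this is a proper superset of the set of characters returned by Path.GetInvalidPathChars().

The following function detects exactly the set of 41 characters shown above: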

public static bool IsInvalidFileNameChar(Char c) => c < 64U ?
        (1UL << c & 0xD4008404FFFFFFFFUL) != 0 :
        c == '\\' || c == '|';
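
To see where the constant comes from: bit n of the mask is set when the character with code n is invalid, so bits 0 to 31 cover the control characters and the remaining bits cover the punctuation below 64. A small sketch that rebuilds the mask from the table above (the `MaskDemo` class and `BuildMask` method are hypothetical names used only for this illustration):

```csharp
using System;

public static class MaskDemo
{
    // Rebuilds the 64-bit mask: one bit per invalid character code below 64.
    public static ulong BuildMask()
    {
        ulong mask = 0;

        // Control characters 0x00 through 0x1F.
        for (int c = 0; c < 32; c++)
            mask |= 1UL << c;

        // Punctuation from the table with codes below 64: " * / : < > ?
        foreach (char c in "\"*/:<>?")
            mask |= 1UL << c;

        return mask;
    }
}
```

The two remaining characters, '\\' (92) and '|' (124), have codes of 64 or above and do not fit in a 64-bit mask, which is why the function tests them separately.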
answered 2018-09-04T00:39:47.593
4

I ended up borrowing and combining a few internal .NET implementations to come up with a performant method:

/// <summary>Determines if the path contains invalid characters.</summary>
/// <remarks>This method is intended to prevent ArgumentException's from being thrown when creating a new FileInfo on a file path with invalid characters.</remarks>
/// <param name="filePath">File path.</param>
/// <returns>True if file path contains invalid characters.</returns>
private static bool ContainsInvalidPathCharacters(string filePath)
{
    for (var i = 0; i < filePath.Length; i++)
    {
        int c = filePath[i];

        if (c == '\"' || c == '<' || c == '>' || c == '|' || c == '*' || c == '?' || c < 32)
            return true;
    }

    return false;
}

I then use it like this, but also wrap it in a try/catch block to be safe:

if ( !string.IsNullOrWhiteSpace(path) && !ContainsInvalidPathCharacters(path))
{
    FileInfo fileInfo = null;

    try
    {
        fileInfo = new FileInfo(path);
    }
    catch (ArgumentException)
    {            
    }

    ...
}
answered 2015-12-08T05:55:40.713
2

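It may be too late for you, but it may help somebody else. I faced the same issue and needed a reliable way to sanitize a path.

Here is what I ended up using, in 3 steps:

Step 1: Custom cleaning.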

public static string RemoveSpecialCharactersUsingCustomMethod(this string expression, bool removeSpecialLettersHavingASign = true)
{
    var newCharacterWithSpace = " ";
    var newCharacter = "";

    // Return carriage handling
    // ASCII LINE-FEED character (LF),
    expression = expression.Replace("\n", newCharacterWithSpace);
    // ASCII CARRIAGE-RETURN character (CR) 
    expression = expression.Replace("\r", newCharacterWithSpace);

    // less than : used to redirect input, allowed in Unix filenames, see Note 1
    expression = expression.Replace(@"<", newCharacter);
    // greater than : used to redirect output, allowed in Unix filenames, see Note 1
    expression = expression.Replace(@">", newCharacter);
    // colon: used to determine the mount point / drive on Windows; 
    // used to determine the virtual device or physical device such as a drive on AmigaOS, RT-11 and VMS; 
    // used as a pathname separator in classic Mac OS. Doubled after a name on VMS, 
    // indicates the DECnet nodename (equivalent to a NetBIOS (Windows networking) hostname preceded by "\\".). 
    // Colon is also used in Windows to separate an alternative data stream from the main file.
    expression = expression.Replace(@":", newCharacter);
    // quote : used to mark beginning and end of filenames containing spaces in Windows, see Note 1
    expression = expression.Replace(@"""", newCharacter);
    // slash : used as a path name component separator in Unix-like, Windows, and Amiga systems. 
    // (The MS-DOS command.com shell would consume it as a switch character, but Windows itself always accepts it as a separator.[16][vague])
    expression = expression.Replace(@"/", newCharacter);
    // backslash : Also used as a path name component separator in MS-DOS, OS/2 and Windows (where there are few differences between slash and backslash); allowed in Unix filenames, see Note 1
    expression = expression.Replace(@"\", newCharacter);
    // vertical bar or pipe : designates software pipelining in Unix and Windows; allowed in Unix filenames, see Note 1
    expression = expression.Replace(@"|", newCharacter);
    // question mark : used as a wildcard in Unix, Windows and AmigaOS; marks a single character. Allowed in Unix filenames, see Note 1
    expression = expression.Replace(@"?", newCharacter);
    expression = expression.Replace(@"!", newCharacter);
    // asterisk or star : used as a wildcard in Unix, MS-DOS, RT-11, VMS and Windows. Marks any sequence of characters 
    // (Unix, Windows, later versions of MS-DOS) or any sequence of characters in either the basename or extension 
    // (thus "*.*" in early versions of MS-DOS means "all files". Allowed in Unix filenames, see note 1
    expression = expression.Replace(@"*", newCharacter);
    // percent : used as a wildcard in RT-11; marks a single character.
    expression = expression.Replace(@"%", newCharacter);
    // period or dot : allowed but the last occurrence will be interpreted to be the extension separator in VMS, MS-DOS and Windows. 
    // In other OSes, usually considered as part of the filename, and more than one period (full stop) may be allowed. 
    // In Unix, a leading period means the file or folder is normally hidden.
    expression = expression.Replace(@".", newCharacter);
    // space : allowed (apart MS-DOS) but the space is also used as a parameter separator in command line applications. 
    // This can be solved by quoting, but typing quotes around the name every time is inconvenient.
    //expression = expression.Replace(@"%", " ");
    expression = expression.Replace(@"  ", newCharacter);

    if (removeSpecialLettersHavingASign)
    {
        // Because then issues to zip
        // More at : http://www.thesauruslex.com/typo/eng/enghtml.htm
        expression = expression.Replace(@"ê", "e");
        expression = expression.Replace(@"ë", "e");
        expression = expression.Replace(@"ï", "i");
        expression = expression.Replace(@"œ", "oe");
    }

    return expression;
}

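Step 2: Check for any invalid characters not yet removed.

As an extra verification step, I use the Path.GetInvalidPathChars() method posted above to detect any potentially invalid characters that have not yet been removed.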

public static bool ContainsAnyInvalidCharacters(this string path)
{
    return (!string.IsNullOrEmpty(path) && path.IndexOfAny(Path.GetInvalidPathChars()) >= 0);
}

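Step 3: Clean any special characters detected in Step 2.

Finally, I use this method as the last step to clean anything left over (from How to remove illegal characters from path and filenames?):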

public static string RemoveSpecialCharactersUsingFrameworkMethod(this string path)
{
    return Path.GetInvalidFileNameChars().Aggregate(path, (current, c) => current.Replace(c.ToString(), string.Empty));
}

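I log any invalid character that was not cleaned in the first step. I chose to go that way so I can improve my custom method as soon as a "leak" is detected. I can't rely on Path.GetInvalidFileNameChars() alone because of the following statement quoted above (from MSDN):

"The array returned from this method is not guaranteed to contain the complete set of characters that are invalid in file and directory names."

It may not be the ideal solution, but given the context of my application and the level of reliability required, this is the best solution I found.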
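
The three steps can be chained into a single call. The following is only a compact sketch of that flow, not the code above: `FileNameSanitizer`, `Sanitize` and the short `CustomChars` list are hypothetical stand-ins, and the custom list is far smaller than the one in step 1.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class FileNameSanitizer
{
    // Assumption: a reduced stand-in for the custom replacements of step 1.
    private static readonly char[] CustomChars =
        { '<', '>', ':', '"', '/', '\\', '|', '?', '*' };

    public static string Sanitize(string name, out List<char> leaked)
    {
        // Step 1: custom cleaning.
        string cleaned = new string(
            name.Where(c => Array.IndexOf(CustomChars, c) < 0).ToArray());

        // Step 2: detect what the custom step missed, so it can be logged.
        char[] invalidPathChars = Path.GetInvalidPathChars();
        leaked = cleaned
            .Where(c => Array.IndexOf(invalidPathChars, c) >= 0)
            .Distinct()
            .ToList();

        // Step 3: remove whatever the framework still considers invalid.
        char[] invalidNameChars = Path.GetInvalidFileNameChars();
        return new string(
            cleaned.Where(c => Array.IndexOf(invalidNameChars, c) < 0).ToArray());
    }
}
```

The caller can log the contents of `leaked` and extend the custom list whenever it is non-empty.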

answered 2015-11-09T12:18:04.880
1

I recommend using a HashSet for this to increase efficiency:

private static HashSet<char> _invalidCharacters = new HashSet<char>(Path.GetInvalidPathChars());

Then you can simply check that the string isn't null or empty and that it doesn't contain any invalid characters:

public static bool IsPathValid(string filePath)
{
    return !string.IsNullOrEmpty(filePath) && !filePath.Any(pc => _invalidCharacters.Contains(pc));
}


answered 2019-11-26T04:34:58.310
0

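FYI, the framework has internal methods that do this, but unfortunately they are marked internal.

For reference, here are the relevant bits, which are similar to the accepted answer here.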

internal static bool HasIllegalCharacters(string path, bool checkAdditional = false) => (AppContextSwitches.UseLegacyPathHandling || !PathInternal.IsDevice(path)) && PathInternal.AnyPathHasIllegalCharacters(path, checkAdditional);

    internal static bool AnyPathHasIllegalCharacters(string path, bool checkAdditional = false)
    {
      if (path.IndexOfAny(PathInternal.InvalidPathChars) >= 0)
        return true;
      return checkAdditional && PathInternal.AnyPathHasWildCardCharacters(path);
    }

    internal static bool HasWildCardCharacters(string path)
    {
      int startIndex = AppContextSwitches.UseLegacyPathHandling ? 0 : (PathInternal.IsDevice(path) ? "\\\\?\\".Length : 0);
      return PathInternal.AnyPathHasWildCardCharacters(path, startIndex);
    }

    internal static bool AnyPathHasWildCardCharacters(string path, int startIndex = 0)
    {
      for (int index = startIndex; index < path.Length; ++index)
      {
        switch (path[index])
        {
          case '*':
          case '?':    
            return true;
          default:
            continue;
        }
      }
      return false;
    }
answered 2020-11-04T22:23:18.880
0

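I'm also too late. But if the task is to validate whether the user entered something valid as a path, there is a combined solution for paths.

Path.GetInvalidFileNameChars() returns the list of characters that are illegal in a file name, but a directory follows the same rules as a file, except for the separators (which we can get from the system) and the root specifier (C:, which we can exclude from the search). Yes, Path.GetInvalidFileNameChars() does not return the complete set, but it is better than trying to find all of them manually.

So: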

private static bool CheckInvalidPath(string targetDir)
{
  string root;
  try
  {
    root = Path.GetPathRoot(targetDir);
  }
  catch
  {
    // the path is definitely invalid if it has crashed
    return false;
  }

  // of course it is better to cache it as it creates
  // new array on each call
  char[] chars = Path.GetInvalidFileNameChars();

  // ignore root
  for (int i = root.Length; i < targetDir.Length; i++)
  {
    char c = targetDir[i];

    // separators are allowed
    if (c == Path.DirectorySeparatorChar || c == Path.AltDirectorySeparatorChar)
      continue;

    // check for illegal chars
    for (int j = 0; j < chars.Length; j++)
      if (c == chars[j])
        return false;
  }

  return true;
}

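I've found that methods like Path.GetFileName won't crash for paths like C:\* (which is completely invalid), so even exception-based checking is not enough. The only thing that makes Path.GetPathRoot crash is an invalid root (like CC:\someDir). Everything else has to be checked manually.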

answered 2018-07-04T15:01:11.463
0

Simple, and as correct as it can be given the MS documentation:

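// Requires: using System.IO; using System.Linq;
bool IsPathValid(String path)
{
    // Fetch the array once; GetInvalidFileNameChars() creates a new array on each call.
    char[] invalidChars = Path.GetInvalidFileNameChars();
    for (int i = 0; i < path.Length; ++i)
        if (invalidChars.Contains(path[i]))
            return false;
    return true;
}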
answered 2020-07-08T00:35:41.247