38

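Is there a restricted character pattern in Azure Table Storage RowKeys? I have not been able to find any documentation despite several searches. However, in some performance tests I am seeing behavior that suggests there is one.

I am seeing odd behavior with RowKeys made up of random characters (the test driver does block the restricted characters (/ \ # ?) and also blocks single quotes from appearing in the RowKey). The result is a RowKey that inserts into the table just fine but cannot be queried (the result is InvalidInput). For example: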

RowKey: 9}5O0J=5Z,4,D,{!IKPE,~M]%54+9G0ZQ&G34!G+

Attempting to query by this RowKey (equality) results in an error (in our application, in Azure Storage Explorer, and in Cloud Storage Studio 2). I looked at the request sent via Fiddler:

GET /foo()?$filter=RowKey%20eq%20'9%7D5O0J=5Z,4,D,%7B!IKPE,~M%5D%54+9G0ZQ&G34!G+' HTTP/1.1

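The %54 in the RowKey does not appear to be escaped in the filter: the literal % is sent as-is rather than as %25. Interestingly, I get similar behavior for batch requests to table storage that use a URI containing this RowKey in the batch XML. I have also seen similar behavior with RowKeys containing embedded double quotes, though I have not isolated that pattern yet.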
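For comparison, here is a minimal sketch of what a fully escaped filter value could look like. It assumes the query is built by hand and that the server percent-decodes the query string once; `FilterEncoding` and `BuildRowKeyFilter` are hypothetical names, not part of any storage SDK.

```csharp
using System;

public static class FilterEncoding
{
    // Builds "RowKey eq '<key>'" and percent-encodes the whole expression
    // so it can be appended after "$filter=" in a request URI.
    public static string BuildRowKeyFilter(string rowKey)
    {
        // OData string literals escape a single quote by doubling it.
        string literal = rowKey.Replace("'", "''");
        string filter = "RowKey eq '" + literal + "'";

        // Uri.EscapeDataString encodes '%', '+', '&', '=' and spaces,
        // so a literal "%54" in the key is sent as "%2554".
        return Uri.EscapeDataString(filter);
    }
}
```

The request line would then be built as `"/foo()?$filter=" + FilterEncoding.BuildRowKeyFilter(rowKey)`. Whether the clients mentioned above accept this form is not something this sketch verifies.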

Has anyone run into this behavior? I can easily restrict other characters from appearing in RowKeys, but I would really like to know the "rules".


5 Answers

60

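The following characters are not allowed in the PartitionKey and RowKey fields:

  • The forward slash (/) character
  • The backslash (\) character
  • The number sign (#) character
  • The question mark (?) character

Further reading: Azure Docs > Understanding the Table Service Data Model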

answered 2012-07-17T02:47:50.977
28

// Requires: using System.Text.RegularExpressions;
public static readonly Regex DisallowedCharsInTableKeys = new Regex(@"[\\#%+/?\u0000-\u001F\u007F-\u009F]");

Detection of Invalid Table Partition and Row Keys:

bool invalidKey = DisallowedCharsInTableKeys.IsMatch(tableKey);

Sanitizing the Invalid Partition or Row Key:

string sanitizedKey = DisallowedCharsInTableKeys.Replace(tableKey, disallowedCharReplacement);

At this stage you may also want to prefix the sanitized key (Partition Key or Row Key) with the hash of the original key to avoid false collisions of different invalid keys having the same sanitized value.

Do not use string.GetHashCode(), though: it can produce different hash codes for the same string across processes or runtime versions, so it must not be used to identify uniqueness and must not be persisted.

I use SHA256 (https://msdn.microsoft.com/en-us/library/s02tk69a(v=vs.110).aspx) to create a byte-array hash of the invalid key, convert the byte array to a hex string, and prefix the sanitized table key with it.
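The detection, sanitizing, and hash-prefix steps described above can be put together roughly as follows. This is only a sketch under stated assumptions: the `TableKeySanitizer` class, the `"_"` replacement string, and the `"-"` separator between hash and key are illustrative choices, not something the linked documentation prescribes.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;
using System.Text.RegularExpressions;

public static class TableKeySanitizer
{
    // Same character class as the pattern shown earlier in this answer.
    private static readonly Regex DisallowedCharsInTableKeys =
        new Regex(@"[\\#%+/?\u0000-\u001F\u007F-\u009F]");

    // Returns valid keys unchanged. Invalid keys have each disallowed
    // character replaced and are prefixed with the SHA-256 hex digest of
    // the ORIGINAL key, so different invalid keys do not collide.
    public static string Sanitize(string tableKey, string replacement = "_")
    {
        if (!DisallowedCharsInTableKeys.IsMatch(tableKey))
        {
            return tableKey;
        }

        string sanitized = DisallowedCharsInTableKeys.Replace(tableKey, replacement);

        using (SHA256 sha = SHA256.Create())
        {
            byte[] hash = sha.ComputeHash(Encoding.UTF8.GetBytes(tableKey));
            // BitConverter yields "AB-CD-..."; strip the dashes to get plain hex.
            string hex = BitConverter.ToString(hash).Replace("-", "");
            return hex + "-" + sanitized; // "-" separator is an assumption
        }
    }
}
```

For example, `TableKeySanitizer.Sanitize("a/b")` and `TableKeySanitizer.Sanitize("a?b")` both sanitize to `a_b` but get different hash prefixes, which is the collision the prefix is meant to avoid. The same function has to be applied when looking a key up, since the original key cannot be recovered from the sanitized one.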

Also see related MSDN Documentation: https://msdn.microsoft.com/en-us/library/azure/dd179338.aspx

Related Section from the link: Characters Disallowed in Key Fields

The following characters are not allowed in values for the PartitionKey and RowKey properties:

  • The forward slash (/) character
  • The backslash (\) character
  • The number sign (#) character
  • The question mark (?) character
  • Control characters from U+0000 to U+001F, including:
      • The horizontal tab (\t) character
      • The linefeed (\n) character
      • The carriage return (\r) character
  • Control characters from U+007F to U+009F

Note that, in addition to the characters listed in the MSDN article, I also added the % char to the pattern because I have seen several people mention it as problematic. I guess some of this also depends on the language and the tech you are using to access the table storage.

If you detect additional problematic chars in your case, you can add them to the regex pattern; nothing else needs to change.

answered 2016-06-10T13:30:23.173
10

I just found out (the hard way) that the '+' sign is allowed in a PartitionKey, but a PartitionKey containing it cannot be queried.

answered 2014-12-12T09:18:36.697
8

I found that, in addition to the characters listed in Igorek's answer, these also cause problems (e.g. inserts fail):

  • |
  • []
  • {}
  • <>
  • $^&

Tested with the Azure Node.js SDK.

answered 2012-12-06T12:37:22.950
2

I use this function to convert keys:

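// Requires: using System.Web;
// URL-encodes the key before it is used as a PartitionKey or RowKey.
private static string EncodeKey(string key)
{
    return HttpUtility.UrlEncode(key);
}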

Of course, this has to be done for both insertion and retrieval.
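A rough sketch of the encode/decode round trip follows. It swaps in `System.Net.WebUtility` for `HttpUtility` so the example does not depend on System.Web; `KeyCodec` and `DecodeKey` are hypothetical names added for illustration.

```csharp
using System;
using System.Net;

public static class KeyCodec
{
    // Encode before every insert AND before every lookup, so the stored
    // key and the queried key always match.
    public static string EncodeKey(string key)
    {
        return WebUtility.UrlEncode(key);
    }

    // Decode when reading a key back from an entity to recover the original.
    public static string DecodeKey(string storedKey)
    {
        return WebUtility.UrlDecode(storedKey);
    }
}
```

Note that the encoded key itself contains '%' (and '+' for spaces), which other answers in this thread report as problematic in queries, so it is worth testing this against your own client before relying on it.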

answered 2021-04-01T12:27:51.217