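I've always used 320, based on your latter calculation. Allowing more* doesn't cost you anything, unless people abuse it and stuff junk in there. Allowing less can cost you: if a user has a legitimately longer email address, you'll have a frustrated user, and you'll have to go back and update the schema, code, parameters, etc. In the system I used to work with (an email service provider), the longest email address I came across naturally was about 120 characters - and it was clear they were just making a long email address for grins.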
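*Not strictly true: memory grant estimates are based on the assumption that variable-width columns are half-populated, so a wider column storing the same data can lead to vastly different performance characteristics for certain queries.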
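To put rough numbers on that footnote, here is a minimal sketch in Python of the half-populated assumption. The function name is hypothetical, and the arithmetic ignores row overhead and everything else that feeds into a real estimate; it only shows that the declared width, not the stored data, drives the assumed size.

```python
def assumed_bytes_per_row(declared_max_bytes: int) -> float:
    """Size assumed for one variable-width column value.

    Assumption taken from the footnote: the column is treated as half full,
    no matter how long the stored values really are.
    """
    return declared_max_bytes / 2


def assumed_bytes_for_rows(declared_max_bytes: int, row_count: int) -> float:
    """Assumed total for row_count rows of that single column."""
    return assumed_bytes_per_row(declared_max_bytes) * row_count
```

Under this simplification, the same one million addresses are assumed to need 160 bytes each in a 320-byte column but only 60 bytes each in a 120-byte column, even if no stored address is longer than 30 characters.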
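I've debated whether NVARCHAR is necessary for email addresses. I've yet to come across an email address with Unicode characters. I know the standard supports them, but so many existing systems do not that it would be pretty frustrating if that were your email address.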
And while it's true that NVARCHAR costs double the space, with SQL Server 2008 R2 you can benefit from Unicode compression, which basically treats all non-Unicode characters in an NVARCHAR column as ASCII, so you get those extra bytes back. Of course compression is only available in Enterprise+...
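The "double the space" point can be illustrated outside the database. This is a hedged sketch in Python, not a model of SQL Server's storage engine: it assumes NVARCHAR data is stored as 2-byte UTF-16 code units and VARCHAR data as one byte per character for plain ASCII text, and the helper names are made up for illustration.

```python
def nvarchar_data_bytes(value: str) -> int:
    """Bytes for the character data alone, assuming uncompressed UTF-16 storage."""
    return len(value.encode("utf-16-le"))


def varchar_data_bytes(value: str) -> int:
    """Bytes for the character data alone, assuming one byte per ASCII character.

    Raises UnicodeEncodeError if the value contains non-ASCII characters.
    """
    return len(value.encode("ascii"))
```

For an all-ASCII address the first figure is exactly twice the second; the compression described above is what brings the NVARCHAR cost back toward the single-byte figure.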
Another way to reduce space requirements is to use a central lookup table for all observed domain names, and store LocalPart and DomainID with the user, and store each unique domain name only once. Yes this makes for more cumbersome programming, but if you have 80,000 hotmail.com addresses, the cost is 80,000 x 4 bytes instead of 80,000 x 11 bytes (or less with compression). If storage or I/O is your bottleneck, and not CPU, this is definitely an option worth investigating.
I wrote about this here: