How to implement a single-byte string?
The application uses a large list of words.
The words come from SQL and are varchar (single byte).
Each word also has an Int32 ID.
We download the words into a
Dictionary<Int32,string>
for performance.
The problem is that the Dictionary gets so large that we get an out-of-memory exception.
We end up splitting up the data.
The app hits the list so much that hitting SQL for each request is not an option.
The database is already very active.
Dynamically paging into and out of the Dictionary is not an option: it is bound to a ListView and, with virtualization, works great.
Words are only loaded at night, so the user just needs a static list.
They use the words to search and process other data, but they don't process the words themselves.
Since the data is char, I thought I could just implement a single-byte word:
using System;
using System.Linq;
using System.Text;

public class StringByte1252 : Object, IComparable, IComparable<StringByte1252>
{
    // Shared encoder for the single-byte Windows-1252 code page.
    private static readonly Encoding win1252 = Encoding.GetEncoding("Windows-1252");

    public Int32 ID { get; private set; }
    public byte[] Bytes { get; private set; }
    // Decodes on every access; nothing is cached, to save memory.
    public string Value { get { return win1252.GetString(Bytes); } }
    public Int32 Length { get { return Bytes.Length; } }

    public int CompareTo(object obj)
    {
        if (obj == null)
        {
            return 1;
        }
        StringByte1252 other = obj as StringByte1252;
        if (other == null)
        {
            throw new ArgumentException("A StringByte1252 object is required for comparison.", "obj");
        }
        return this.CompareTo(other);
    }

    public int CompareTo(StringByte1252 other)
    {
        if (object.ReferenceEquals(other, null))
        {
            return 1;
        }
        return string.Compare(this.Value, other.Value, StringComparison.OrdinalIgnoreCase);
    }

    public override bool Equals(Object obj)
    {
        // Check for null and compare run-time types.
        StringByte1252 item = obj as StringByte1252;
        if (item == null) return false;
        // Compare ID and byte contents. The original (this.Bytes == item.Bytes)
        // compared array references only. Including ID keeps Equals consistent
        // with GetHashCode.
        return this.ID == item.ID && this.Bytes.SequenceEqual(item.Bytes);
    }

    public override int GetHashCode() { return ID; }

    public StringByte1252(Int32 id, byte[] bytes) { ID = id; Bytes = bytes; }
}
The above works, but it is NOT more memory efficient than
Dictionary<Int32,string>
The Dictionary with Int16-based characters actually uses slightly less memory.
Where did I go wrong?
Does a byte array take more space than the sum of its bytes?
Is there a way to achieve a single-byte string?
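For anyone who wants to reproduce the comparison, here is a minimal sketch of how the two layouts might be measured. Everything in it is an assumption made for illustration: `WordBytes` is a hypothetical stand-in for StringByte1252, `MemoryComparison`, `Measure` and `Count` are made-up names, and the words are generated rather than loaded from SQL. One thing worth keeping in mind when reading the numbers is that every array and every wrapper instance is its own heap object with an object header (and, for arrays, a length field), so a word stored as wrapper plus byte[] costs more than the raw byte count.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical stand-in for StringByte1252: an ID plus a byte array.
public sealed class WordBytes
{
    public int ID;
    public byte[] Bytes;
}

public static class MemoryComparison
{
    // Assumed sample size; the real list comes from the database.
    public const int Count = 200000;

    // Approximate managed bytes retained by whatever build() allocates.
    public static long Measure<T>(Func<T> build, out T result)
    {
        long before = GC.GetTotalMemory(true);
        result = build();
        long after = GC.GetTotalMemory(true);
        return after - before;
    }

    public static Dictionary<int, string> BuildStrings()
    {
        var d = new Dictionary<int, string>(Count);
        for (int i = 0; i < Count; i++)
        {
            d[i] = "word" + i; // made-up short words
        }
        return d;
    }

    public static Dictionary<int, WordBytes> BuildWrappedBytes()
    {
        var d = new Dictionary<int, WordBytes>(Count);
        for (int i = 0; i < Count; i++)
        {
            // ASCII suffices for the made-up words; the question uses Windows-1252.
            d[i] = new WordBytes { ID = i, Bytes = Encoding.ASCII.GetBytes("word" + i) };
        }
        return d;
    }

    public static void Main()
    {
        Dictionary<int, string> strings;
        Dictionary<int, WordBytes> wrapped;
        long stringBytes = Measure<Dictionary<int, string>>(BuildStrings, out strings);
        long wrappedBytes = Measure<Dictionary<int, WordBytes>>(BuildWrappedBytes, out wrapped);
        Console.WriteLine("Dictionary<int,string>:    {0:N0} bytes", stringBytes);
        Console.WriteLine("Dictionary<int,WordBytes>: {0:N0} bytes", wrappedBytes);
        // Keep both dictionaries reachable until after the measurements.
        GC.KeepAlive(strings);
        GC.KeepAlive(wrapped);
    }
}
```

GC.GetTotalMemory(true) forces a collection before reading the heap size, so the differences are only approximate, and the result will depend on the runtime, the process bitness, and the real word lengths.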