You could always create your own parser. What I use is:
`var ANSI = (Encoding) Encoding.GetEncoding(1252).Clone();
ANSI.EncoderFallback = new EncoderReplacementFallback(string.Empty);`
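The first line clones the Windows-1252 encoding. The clone is needed because the instance returned by `Encoding.GetEncoding` is read-only, so its fallback cannot be changed directly. (The database I deal with works with Windows-1252; you'd probably want to use UTF-8 or ASCII.) The second line sets the encoder fallback so that, when characters are encoded, any character with no equivalent in the target encoding is replaced with an empty string, i.e. dropped.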
After this you'd preferably filter out all control characters (excluding tabs, spaces, line feeds and carriage returns, depending on what you need).
Below is my personal encoding-parser which I set up to correct data entering our database.
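
    private string RetainOnlyPrintableCharacters(char c)
    {
        // Even if the character comes from a different code page altogether,
        // it is kept as long as it has an equivalent in 1252. With the empty
        // fallback set, a character without one encodes to zero bytes.
        var ansiBytes = _ansiEncoding.GetBytes(new[] { c });

        // Keep the character only if its 1252 byte is in the printable list.
        // Contains needs "using System.Linq;" when the list is an array.
        if (ansiBytes.Length > 0 && _printableCharacters.Contains(ansiBytes[0]))
        {
            return _ansiEncoding.GetString(ansiBytes);
        }

        return string.Empty;
    }
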
`_ansiEncoding` is the cloned encoding from the first snippet (`var ANSI = (Encoding) Encoding.GetEncoding(1252).Clone();`) with the fallback value set.
If `ansiBytes` is not empty, an encoding is available for the character being passed in. Its byte is then compared with a list of all the printable characters; if it is in that list, it is an acceptable character and is returned. Otherwise the method returns an empty string.