I have a CSV file named data_export_20130206-F.csv. Some of its fields contain unescaped double quotes ("), which makes the file very messy to parse.
The file looks roughly like this (but with more fields):
"stuff","zipcode"
"<?xml version="1.0" encoding="utf-8" ?>","90210"
I want to "escape" the quotes that are within the fields so that it looks like this (note: the quotes within the XML have been doubled):
"stuff","zipcode"
"<?xml version=""1.0"" encoding=""utf-8"" ?>","90210"
I tried running this:
cat data_export_20130206-F.csv| sed -E 's@([^,])(\")([^,])@\1""\3@g'
Unfortunately, it adds an additional double quote at the end of each line, making the document invalid:
"stuff","zipcode""
"<?xml version=""1.0"" encoding=""utf-8"" ?>","90210""
How do I double the quotes inside CSV fields without adding a trailing double quote to each line?
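
A possible explanation, offered as a guess rather than something confirmed above: if the export uses Windows (CRLF) line endings, the final quote on each line is followed by a carriage return, which matches `[^,]`, so the expression treats that closing quote as an inner one. The sketch below assumes that is the cause and strips carriage returns before doubling the quotes. The function name `escape_inner_quotes` is hypothetical, and the approach still assumes that no field contains a quote directly next to a comma.

```shell
# Hypothetical helper: reads CSV on stdin, writes the escaped CSV to stdout.
# Assumptions: lines end in CRLF, and an inner quote never sits next to a comma.
# Limitation: two inner quotes separated by a single character (a"b"c) are
# not both doubled, because the first match consumes the character between them.
escape_inner_quotes() {
  # tr removes every carriage return (including any inside fields);
  # sed then doubles each quote that has a non-comma character on both sides.
  tr -d '\r' | sed -E 's/([^,])"([^,])/\1""\2/g'
}
```

It could be used as `escape_inner_quotes < data_export_20130206-F.csv > escaped.csv`, where `escaped.csv` is just an example output name. Running `head -n 1 data_export_20130206-F.csv | od -c` is one way to check whether the lines really end in `\r \n`.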