
I have a CSV file representing bank transactions, which looks like this:

"Date","Description","Original Description","Amount","Transaction Type","Category","Account Name","Labels","Notes"

"10/18/2012","Amazon","AMAZON.COM","27.60","debit","Shopping","CHASE COLLEGE","",""

"10/19/2012","Virgin America","VIRGIN AMER","155.90","debit","Air Travel","CREDIT CARD","",""

"10/20/2012","Airport Express","AIR EXP","16.00","credit","Credit Card Payment","CREDIT CARD","",""

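I'm trying to transform it so that the values in column 4 are signed (+/-) depending on the value of column 5. If column 5 says "debit", the value in column 4 should be negative ("-"); if it says "credit", the value in column 4 should be positive ("+").

So the output would look like this: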

"Date","Description","Original Description","Amount","Transaction Type","Category","Account Name","Labels","Notes"

"10/18/2012","Amazon","AMAZON.COM","-27.60","debit","Shopping","CHASE COLLEGE","",""

"10/19/2012","Virgin America","VIRGIN AMER","-155.90","debit","Air Travel","CREDIT CARD","",""

"10/20/2012","Airport Express","AIR EXP","16.00","credit","Credit Card Payment","CREDIT CARD","",""

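What is the best way to do this? I considered writing a MATLAB program with an if statement that reads the file, but I was wondering whether there is a simple way to do this from the terminal, for example with AWK or a regex in Vim.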


1 Answer


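Technically you could do this with a regular expression (or at least a pair of them), but awk is a better fit. In general awk does not handle quoted fields well, but since every field here is quoted, we can work around that by using the whole three-character sequence "," (quote, comma, quote) as the field separator. The first and last fields each keep a stray quote, which does not matter because we never use them.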

awk 'BEGIN{FS=OFS="\",\""}$5=="credit"{$4="+"$4}$5=="debit"{$4="-"$4}1' file.csv
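To see why the quoted separator works, it can help to print how awk splits a single row. This is only a minimal sketch, assuming one row copied from the sample data in the question; the shell variables `row` and `fields` are names introduced here for illustration.

```shell
# One sample row from the question, held in a shell variable for illustration.
row='"10/18/2012","Amazon","AMAZON.COM","27.60","debit","Shopping","CHASE COLLEGE","",""'

# Split on the three-character sequence "," and print each field with its index.
fields=$(printf '%s\n' "$row" |
    awk 'BEGIN { FS = "\",\"" } { for (i = 1; i <= NF; i++) print i ": " $i }')

printf '%s\n' "$fields"
```

Field 1 comes out with its leading quote still attached and the last field is a lone quote, while fields 4 and 5 are clean, so they can be compared and modified directly.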

Explanation

awk '
BEGIN {
    # Set the input and output field separators to ",", *with* the quotes.
    FS=OFS="\",\"" 
}

# On every line where field 5 is "credit" ...
$5 == "credit" { 
    # ... Prepend "+" to the fourth field.
    $4="+"$4 
}

# On every line where the fifth field is "debit" ...
$5 == "debit" { 
    # ... Prepend "-" to the fourth field.
    $4="-"$4
}

# Print the line
1
' file.csv
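The sample output in the question leaves credit amounts unsigned and only negates debits. If that is the desired result, one possible variant is to drop the credit rule. The sketch below assumes the same all-quoted layout; the variables `sample` and `signed` are placeholders introduced here so the example runs without touching any real file.

```shell
# Sample rows from the question, held in a shell variable (placeholder data).
sample=$(cat <<'EOF'
"Date","Description","Original Description","Amount","Transaction Type","Category","Account Name","Labels","Notes"
"10/18/2012","Amazon","AMAZON.COM","27.60","debit","Shopping","CHASE COLLEGE","",""
"10/19/2012","Virgin America","VIRGIN AMER","155.90","debit","Air Travel","CREDIT CARD","",""
"10/20/2012","Airport Express","AIR EXP","16.00","credit","Credit Card Payment","CREDIT CARD","",""
EOF
)

# Same separator trick; only rows whose fifth field is "debit" get a "-" prefix.
# The header and credit rows match no rule and are printed unchanged.
signed=$(printf '%s\n' "$sample" |
    awk 'BEGIN { FS = OFS = "\",\"" } $5 == "debit" { $4 = "-" $4 } 1')

printf '%s\n' "$signed"
```

To run it against a real file, pass the file name to awk instead of piping the variable in, and redirect the output to a new file rather than overwriting the input.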
Answered 2013-10-31T04:20:30.207