
I have a dataset which includes a text field and a date field. Sample in .csv format below:

ID,Text,Date,BP,Person
1,This is Text!,6/24/2013,120,Bob
2,I don't like Text.,6/24/2013,114,Bob
3,Text files are stupid.,6/24/2013,310,Genny
4,"The Cezanne, for 500 please.",6/24/2013,45,Glenn
5,I enhanced my coffee with Kahlua,6/25/2013,105,Genny
6,And something else here.,6/24/2013,200,Bob

I want to remove any record where the Text field does not match "[Tt]ext" and the Date is 6/24/2013. So records 4 and 6 would be dropped, while everything else remains.

I've tried subsetting the frame like so:

newframe <- frame[!which(grep('[Tt]ext', frame$Text) &
                         frame$Date == '6/24/2013'), ] 

But that got me nowhere.


2 Answers


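Please provide your data in a reproducible form. Using grepl, this should work. The call below selects the records that meet your condition; wrap the condition in !( ) to drop them instead: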

frame[with(frame,
   !grepl('[Tt]ext', Text) & Date == '6/24/2013'),]
  ID                         Text      Date  BP Person
4  4 The Cezanne, for 500 please. 6/24/2013  45  Glenn
6  6     And something else here. 6/24/2013 200    Bob
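
To keep the remaining records rather than the matching ones, the same condition can be negated as a whole. The following is a minimal sketch, assuming the sample data from the question is rebuilt by hand with data.frame(); the variable name drop is only illustrative. It also shows why grepl is used here: grep returns the positions of the matches, whereas grepl returns one TRUE/FALSE per row, which is what & needs to combine it element by element with the date test.

```r
# Sample data from the question, rebuilt by hand (assumption: Date is a plain string).
frame <- data.frame(
  ID = 1:6,
  Text = c("This is Text!", "I don't like Text.", "Text files are stupid.",
           "The Cezanne, for 500 please.", "I enhanced my coffee with Kahlua",
           "And something else here."),
  Date = c("6/24/2013", "6/24/2013", "6/24/2013",
           "6/24/2013", "6/25/2013", "6/24/2013"),
  BP = c(120, 114, 310, 45, 105, 200),
  Person = c("Bob", "Bob", "Genny", "Glenn", "Genny", "Bob"),
  stringsAsFactors = FALSE
)

# One logical value per row: TRUE when the row has no match AND is dated 6/24/2013.
drop <- !grepl("[Tt]ext", frame$Text) & frame$Date == "6/24/2013"

# Keep every row that is not flagged.
newframe <- frame[!drop, ]
newframe$ID  # 1 2 3 5
```

Row 5 is kept even though its text has no match, because its date is 6/25/2013.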
answered 2013-07-08T14:15:29.640

Here is what my final code looks like, using the grepl command:

newframe <- frame[!(!grepl('[Tt]ext', frame$Plain.Text) & frame$Date == '6/24/2013'), ]

This gave me the original data frame, minus the records dated 6/24/2013 that do not contain "[Tt]ext".

answered 2013-07-08T18:27:31.187