
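I have a string variable1 with values such as "asdfsad What do you do", "qwer What is your name", "Zebra"

and a string variable2 with values "asdfsad", "qwer", "Animal".

I want to remove the first word from the string in variable1 if it equals variable2. So far, the only thing I could come up with is replacing each word separately:

variable1=tranwrd(variable1, "asdfsad", ""); and so on. But I have a lot of words to replace.

Thanks a lot for your help.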

4 Answers

2

How about something like this:

data sample;
  length variable1 variable2 $100;
  variable1= "asdfsad What do you do"; variable2 = "asdfsad"; output;
  variable1= "qwer What is your name"; variable2 = "qwer";    output;
  variable1= "Zebra"                 ; variable2 = "Animal";  output;
run;

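data fixed;
  length first_word $100;

  set sample;

  /* first word of variable1, using the default SCAN delimiters */
  first_word = scan(variable1,1);
  if first_word eq variable2 then do;
    /* keep everything after the matched word (the delimiter stays) */
    start_pos = length(first_word) + 1;
    variable1 = substr(variable1,start_pos);
  end;
run;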

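This works for matching the whole first word. It leaves the space or other punctuation that followed the word in the remaining text, but you should be able to change that easily if you want.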
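As a rough illustration of that last point, here is a minimal sketch that also drops the leftover leading blank. It reuses the `sample` data set from above; the output data set name `fixed_trimmed` is made up for this example. It assumes variable1 has no leading blanks and that the values are shorter than the declared length of 100. Only blanks are removed, so any punctuation that followed the word would still remain.

```sas
data fixed_trimmed;                /* hypothetical output data set name */
  length first_word $100;

  set sample;

  first_word = scan(variable1, 1);
  if first_word eq variable2 then do;
    /* substr() drops the matched word; left() shifts the remainder so the leading blank goes away */
    variable1 = left(substr(variable1, length(first_word) + 1));
  end;

  drop first_word;
run;
```

`left()` is used rather than `strip()` here because the result is assigned back to a fixed-length character variable, so trailing blanks are padded back in either way.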

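If your problem is matching character by character rather than on the whole first word, that would be a very different problem, and I'd suggest you post a new question.

answered 2012-09-05T01:46:46.047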
0

If you are happy with the results of tranwrd, you can use it here as well. You just need to watch out for whitespace:

variable1 = strip(tranwrd(variable1, strip(variable2), ''));
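One thing to keep in mind with this approach is that TRANWRD replaces every occurrence of the target, not only one at the start of the string. The sketch below uses a made-up value in which the word appears twice to show the effect; the variable name `everywhere` is just for illustration.

```sas
data _null_;
  length variable1 variable2 $100;
  variable1 = "qwer What is qwer";   /* made-up value: the word occurs twice */
  variable2 = "qwer";

  /* both occurrences of "qwer" are replaced, not just the first word */
  everywhere = strip(tranwrd(variable1, strip(variable2), ''));
  put everywhere=;
run;
```

If only a leading word should be removed, guarding the assignment with a check on `scan(variable1,1)` is one way to restrict it.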
answered 2012-09-05T04:13:38.483
0
if scan(variable1,1)=variable2 then
  variable1=substr(variable1,index(variable1," "));
answered 2012-09-05T10:21:24.400
0

This may not be efficient or feasible for thousands of words, but you can use a Perl regular expression (e.g. s/search/replacement/) via prxchange:

/* words to match delimited by "|" */
%let words = asdfsad|qwer|Animal|foo|bar|horse;

/* example data */
data example;
  infile datalines dlm=',' dsd;
  input string: $256.;
datalines;
asdfsad What do you do
qwer What is your name
Zebra
food is in the fridge
foo    A horse entered a bar
;
run;

/* cleaned data */
data example_clean;
  set example;

  /*
    regular expression is:
      - created once on first row (_n_ = 1)
      - cached (retain regex)
      - dropped at the end (drop regex).
  */
  if _n_ = 1 then do;
    retain regex;
    drop regex;
    regex = prxparse("s/^(&words)\s+//");
  end;

  string = prxchange(regex, 1, string);  /* apply the regex (once) */
run;

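In the regular expression passed to prxparse, the ^ symbol anchors the match to the start of the string, | separates the alternative words, and \s+ requires at least one whitespace character after the word, so that "food" is not matched by "foo".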
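If the words live in a data set rather than being typed by hand, the `words` macro variable could be built with PROC SQL. This is only a sketch under assumptions: the `word_list` data set and its `word` column are hypothetical, the words are assumed to contain no regex metacharacters (those would need escaping), and the combined list has to fit within the macro variable length limit of roughly 64K characters.

```sas
/* hypothetical lookup table holding one word per row */
data word_list;
  length word $100;
  input word $;
datalines;
asdfsad
qwer
Animal
;
run;

/* concatenate the distinct words into one macro variable, delimited by "|" */
proc sql noprint;
  select distinct strip(word)
    into :words separated by '|'
    from word_list;
quit;

/* write the resulting list to the log for a quick check */
%put NOTE: words=&words;
```

The resulting `&words` can then be used in the prxparse call in place of the hand-written `%let`.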

answered 2012-09-09T08:20:22.553