I have a table like this:

 |Num |  Label
-----------------------
1|1   |  a thing
2|2   |  another thing
3|3   |  something else
4|4   |  whatever

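I want to replace the values of my Label column with something more generic, for example label One for the first two rows, label Two for the next two, and so on: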

 |Num |  Label
-----------------------
1|1   |  label One
2|2   |  label One
3|3   |  label Two
4|4   |  label Two

How can I do this with PROC FORMAT? I would like to know whether I can key on the row number or on another column such as Num.

I need to do something like this:

proc format;
value label_f
low-2 = "label One"
3-high = "label Two"
;
run;

But I want the ranges to refer to the row number or to the value of the Num column.

2 Answers

Gatsby:

It sounds like you want to format NUM instead of LABEL.

Where you want to use the 'generic' representation defined by your format, simply place a FORMAT statement in the Proc being used:

PROC PRINT data=have;
  format num label_f.;
RUN;

If you want both num and generic, you will need to add a new column to the data for use during processing. This can be done with a view:

data have_view / view=have_view;
  set have;
  num_replicate1 = num;
  attrib num_replicate1 format=label_f. label='Generic';

  num_replacement = put (num,label_f.);
  attrib num_replacement label='Generic';   %* no format because the value is the formatted value of the original num;
run;

PROC PRINT data=have_view;
  var num num_replicate1 num_replacement;
RUN;

If you want the 'generic' representation of the NUM column to be used in by-processing as a grouping variable, you have several scenarios:

  • you know a priori that the generic representation is by-group clustered
    • use a view and process with BY or BY ... NOTSORTED if clusters are not in sort order
  • force ordering for use with by-group processing
    • use an ordered SQL view containing the replicate and process with BY
    • add a replicate variable to the data set, sort by the formatted value and process with BY
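
The first scenario above can be sketched as follows. This is a minimal example, assuming the label_f format and the four-row have data set from the question; procedures form BY groups from the formatted value of the BY variable, so no replicate column is needed for this simple case.

```sas
proc format;
  value label_f
    low-2  = "label One"
    3-high = "label Two";
run;

/* Sample data from the question (assumed layout) */
data have;
  infile datalines dlm='|';
  length num 8 label $20;
  input num label $;
  datalines;
1|a thing
2|another thing
3|something else
4|whatever
;

/* BY groups follow the formatted value of num, so rows 1-2 and rows 3-4 */
/* are each reported as one group. NOTSORTED only requires that equal    */
/* formatted values are adjacent, not that they are in sort order.       */
proc print data=have noobs;
  by num notsorted;
  format num label_f.;
  var label;
run;
```

In a DATA step the BY statement uses unformatted values by default; the GROUPFORMAT option on BY asks for the same formatted-value grouping there.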

A direct backmap from label to num to generic is possible only if the label is known to be unique, or you know a priori that the transformation backmap-num + num-map is unique.

Proc FORMAT also has a special value construct [format] that can be used to map different ranges of values according to different formatting rules. The other range can also map to a different format that itself has an other range that maps to yet another different format. The SAS format engine will log an error if you happen to define a recursive loop using this advanced kind of format mapping.
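
A small sketch of the [format] construct, assuming you want the two generic labels for 1 through 4 and the ordinary numeric rendering for everything else. The DEFAULT=12 option is my own addition so the nested BEST12. output is not truncated.

```sas
proc format;
  value label_f (default=12)
    1-2   = "label One"
    3-4   = "label Two"
    other = [best12.];   /* any other value is handed to BEST12. */
run;

/* Write the mapping for a few values to the log */
data _null_;
  do num = 1 to 6;
    generic = put(num, label_f.);
    put num= generic=;
  end;
run;
```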

propaedeutics

One of my favorite Dorfman words.

Format does not replace underlying values. A format is a map from the underlying data value to a rendered representation. The map can be 1:1 or many:1. The MultiLabel Format (MLF) feature of the format system can even perform 1:many and many:many mappings in MLF-enabled procedures (such as MEANS, SUMMARY, and TABULATE).
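
A hedged sketch of a 1:many mapping with a multilabel format. The format name label_mlf and the extra "all rows" range are made up for illustration; the data is the four-row table from the question.

```sas
proc format;
  value label_mlf (multilabel)
    1-2 = "label One"
    3-4 = "label Two"
    1-4 = "all rows";    /* overlapping range: each value also counts here */
run;

data have;
  infile datalines dlm='|';
  length num 8 label $20;
  input num label $;
  datalines;
1|a thing
2|another thing
3|something else
4|whatever
;

/* The MLF option tells TABULATE to honor every label a value maps to */
proc tabulate data=have;
  class num / mlf;
  format num label_mlf.;
  table num, n;
run;
```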

To replace an underlying value with its formatted version you need to use the PUT, PUTC or PUTN functions. The PUT functions always output a character value.

  • character ⇒ PUT ⇒ character [ FILE / PUT ]
  • numeric ⇒ PUT ⇒ character [ FILE / PUT ]

There is no guarantee that a mapped value will map to the same value; it depends on the format.
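
Applied to the question, replacing the Label column could look like the sketch below. It assumes the label_f format and the have data set from the question, and that label is a character column long enough to hold the new text.

```sas
proc format;
  value label_f
    low-2  = "label One"
    3-high = "label Two";
run;

data have;
  infile datalines dlm='|';
  length num 8 label $20;
  input num label $;
  datalines;
1|a thing
2|another thing
3|something else
4|whatever
;

data want;
  set have;
  /* PUT returns the formatted text, which overwrites the stored label. */
  /* Use _n_ in place of num to key on the row number instead.          */
  label = put(num, label_f.);
run;
```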

INFORMATs are similar to FORMATs; however, the type of the target value depends on the informat type.

  • character ⇒ INPUT ⇒ character [ INFILE / INPUT ]
  • numeric ⇒ INPUT ⇒ character
  • character ⇒ INPUT ⇒ numeric [ INFILE / INPUT ]
  • numeric ⇒ INPUT ⇒ numeric
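
As an illustration of the character to numeric case, here is a sketch with a custom informat. The name label_in and the group codes 1 and 2 are hypothetical and not part of the question.

```sas
proc format;
  invalue label_in
    "label One" = 1
    "label Two" = 2
    other       = .;    /* unknown text becomes a missing value */
run;

/* INPUT with a numeric informat turns the text back into a number */
data _null_;
  length generic $20;
  do generic = "label One", "label Two", "label Three";
    group = input(generic, label_in.);
    put generic= group=;
  end;
run;
```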

Custom formats are created with Proc FORMAT. The construction of a format is specified by either the VALUE statement or the CNTLIN= option. CNTLIN= lets you create formats directly from data and avoids really large VALUE statements that are hand-entered or code-generated (via, say, a macro).
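
A minimal CNTLIN= sketch, assuming the "two rows per label" rule from the question and an arbitrary upper bound of 6. The control data set name fmt_ctrl is made up; FMTNAME, TYPE, START and LABEL are the standard control variables.

```sas
/* Build one control row per value of num */
data fmt_ctrl;
  length fmtname $32 type $1 label $20;
  fmtname = 'label_f';
  type    = 'N';                       /* numeric format */
  do start = 1 to 6;
    label = catx(' ', 'label', propcase(put(ceil(start/2), words.)));
    output;
  end;
run;

/* Create label_f from the control data set */
proc format cntlin=fmt_ctrl;
run;

/* Check the resulting mapping in the log */
data _null_;
  do num = 1 to 6;
    generic = put(num, label_f.);
    put num= generic=;
  end;
run;
```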

Data-centric 'formatting' performs the mapping through a left join. This is prevalent in SQL databases. Left joins in SAS can be done through SQL, DATA step MERGE with BY, and FORMAT application. 1:1 left joins can also be done via a hash object or SET with POINT=.
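
The SQL flavor of that left join could be sketched like this. The lookup table and its generic column are hypothetical; the have data set is the one from the question.

```sas
data have;
  infile datalines dlm='|';
  length num 8 label $20;
  input num label $;
  datalines;
1|a thing
2|another thing
3|something else
4|whatever
;

/* Hypothetical lookup table: one row per num with its generic label */
data lookup;
  infile datalines dlm='|';
  length num 8 generic $20;
  input num generic $;
  datalines;
1|label One
2|label One
3|label Two
4|label Two
;

proc sql;
  create table want as
  select h.num,
         coalesce(l.generic, h.label) as label   /* keep the old label when no match */
  from have as h
       left join lookup as l
       on h.num = l.num
  order by h.num;
quit;
```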

answered 2017-11-14T18:36:18.077

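You can use the words. format to do what you describe. You can swap out num for _N_ in the ceil function below to use the observation number instead of the value of num (if they are not always equal):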

data have;
length num 8 label $20;
infile datalines dlm='|';
input num label $;
datalines;
1|a thing
2|another thing
3|something else
4|whatever
5|whatever else
6|so many things
;
run;

data want;
set have;
label=catx(' ','label',propcase(put(ceil(num/2),words.)));
run;
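
For completeness, a sketch of the _N_ variant mentioned above. It only differs in the argument to ceil; the output name want_n is my own and the sample data is shortened to four rows.

```sas
data have;
  infile datalines dlm='|';
  length num 8 label $20;
  input num label $;
  datalines;
1|a thing
2|another thing
3|something else
4|whatever
;

data want_n;
  set have;
  /* _n_ counts DATA step iterations, which is the row number here */
  /* because the step reads one observation per iteration.         */
  label = catx(' ', 'label', propcase(put(ceil(_n_/2), words.)));
run;
```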

That said, this answer may be a bit too specific to your example and may not carry over to your real context.

answered 2017-11-14T16:08:30.900