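I have a table of regions with data. For certain operations I want to exclude the top and bottom 1% of regions, because they contain extreme outliers.
As I see it, the way forward is: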
SORT CASES BY theVariableIwantToAnalyse (A).
* Case number after sorting, stored in a variable called id.
NUMERIC id (F12.0).
COMPUTE id = $CASENUM.
* COMPUTE works row by row, so the highest id is added to every case with AGGREGATE.
AGGREGATE /OUTFILE=* MODE=ADDVARIABLES /BREAK= /idmax = MAX(id).
* 1% of the highest value for id.
COMPUTE id1perc = idmax / 100.
* Keep the middle 98% of cases for the next procedure only.
TEMPORARY.
SELECT IF (id > id1perc AND id <= idmax - id1perc).
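A possible alternative to the case-number bookkeeping above, offered as a sketch rather than a definitive solution: RANK can write each case's fractional rank (rank divided by the number of valid cases) into a new variable, so the file does not have to be re-sorted. The name fracRank and the histogram are assumptions made for illustration; the cut-offs 0.01 and 0.99 mirror the 1% rule described above.

```spss
* Fractional rank of each region on the analysis variable, between 0 and 1.
* fracRank is a hypothetical name, and cases with missing values get no rank.
RANK VARIABLES=theVariableIwantToAnalyse (A)
  /RFRACTION INTO fracRank
  /TIES=MEAN
  /PRINT=NO.
* Restrict only the next procedure to the middle 98% of regions.
TEMPORARY.
SELECT IF (fracRank > 0.01 AND fracRank <= 0.99).
* Example procedure, to be replaced with whatever chart is needed.
GRAPH /HISTOGRAM=theVariableIwantToAnalyse.
```

Because TEMPORARY applies only to the procedure that follows, the full set of regions is still available afterwards. With TIES=MEAN, tied values share one rank, so a group of identical extreme values is kept or dropped together.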
Then I plot the charts, etc. After that I need to:
SORT CASES BY theNextVariableIwantToAnalyse (A) .
* Repopulate id with the new case order.
COMPUTE id = $CASENUM.
EXECUTE.
ETC ...
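Since the same trim-then-plot steps repeat for each variable, one way to avoid copying them might be an SPSS macro. The following is a rough sketch under assumptions: the macro name !trimPlot, the helper variable fracRank, and the histogram are all made up for illustration, and the fractional rank from RANK stands in for the case-number arithmetic.

```spss
* Hypothetical macro that trims the bottom and top 1% of one variable for one chart.
* The helper variable fracRank is deleted at the end so the next call can recreate it.
DEFINE !trimPlot (var = !TOKENS(1))
RANK VARIABLES=!var (A)
  /RFRACTION INTO fracRank
  /PRINT=NO.
TEMPORARY.
SELECT IF (fracRank > 0.01 AND fracRank <= 0.99).
GRAPH /HISTOGRAM=!var.
DELETE VARIABLES fracRank.
!ENDDEFINE.

* One call per variable to analyse.
!trimPlot var = theVariableIwantToAnalyse.
!trimPlot var = theNextVariableIwantToAnalyse.
```

Each call ranks the full file again, so the 1% cut-offs are always taken from all regions, not from a subset already trimmed on another variable.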