
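I would like to add an additional target ("outputState") to my PMML regression model.

  • outputState = 0: no missing/invalid input values (-> no imputation in the regression model)
  • outputState = 1: at least one missing/invalid input value (-> imputation in the regression model)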
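
To make the intended behaviour concrete, here is a minimal plain-Python sketch of the two outputs I expect. It is not a PMML evaluator; the names `score` and `is_missing_or_invalid` are hypothetical, and the constants (valid range [-1, 1], replacement value 0, intercept 2, coefficients 1/2/3) are simply copied from the example PMML further down.

```python
import math

# Values copied from the example PMML; treat them as illustration only.
COEFFICIENTS = {"inputA": 1.0, "inputB": 2.0, "inputC": 3.0}
INTERCEPT = 2.0
VALID_RANGE = (-1.0, 1.0)   # Interval closure="closedClosed"
REPLACEMENT = 0.0           # missingValueReplacement="0"


def is_missing_or_invalid(value):
    """Return True if the value is missing ("NA", None, NaN) or outside the valid range."""
    if value is None or value == "NA":
        return True
    if isinstance(value, float) and math.isnan(value):
        return True
    low, high = VALID_RANGE
    return not (low <= value <= high)


def score(row):
    """Return (outputState, outputResult) for a dict of input values."""
    output_state = 0
    output_result = INTERCEPT
    for name, coefficient in COEFFICIENTS.items():
        value = row.get(name)
        if is_missing_or_invalid(value):
            output_state = 1      # at least one input had to be imputed
            value = REPLACEMENT
        output_result += coefficient * value
    return output_state, output_result
```

For instance, `score({"inputA": 0.5, "inputB": -1.0, "inputC": 1.0})` should give state 0, while a row with `"inputA": "NA"` or an out-of-range `"inputC": 5.0` should give state 1 together with the imputed regression result.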

I tried using multiple models, but I don't know how to handle multiple models/targets/outputs correctly.

Example (explanation below):

    <?xml version="1.0" encoding="UTF-8" standalone="yes"?>
    <PMML xmlns="http://www.dmg.org/PMML-4_3" xmlns:data="http://jpmml.org/jpmml-model/InlineTable" version="4.3">
    <Header>
      <Application name="JPMML-R" version="1.3.14"/>
      <Timestamp>2020-01-07T15:56:07Z</Timestamp>
    </Header>
    <DataDictionary>
      <DataField name="outputState" optype="categorical" dataType="integer"/>
      <DataField name="outputResult" optype="continuous" dataType="double"/>
      <DataField name="inputA" optype="continuous" dataType="double">
        <Interval closure="closedClosed" leftMargin="-1" rightMargin="1"/>
        <Value property="missing" value="NA"/>
      </DataField>
      <DataField name="inputB" optype="continuous" dataType="double">
        <Interval closure="closedClosed" leftMargin="-1" rightMargin="1"/>
        <Value property="missing" value="NA"/>
      </DataField>
      <DataField name="inputC" optype="continuous" dataType="double">
        <Interval closure="closedClosed" leftMargin="-1" rightMargin="1"/>
        <Value property="missing" value="NA"/>
      </DataField>
    </DataDictionary>
    <TransformationDictionary/>
    <MiningModel functionName="mixed">
    <MiningSchema>
      <MiningField name="outputState" usageType="target"/>
      <MiningField name="outputResult" usageType="target"/>
      <MiningField name="inputA"/>
      <MiningField name="inputB"/>
      <MiningField name="inputC"/>
    </MiningSchema>
    <Output>
      <OutputField name="outputState" optype="categorical" dataType="integer" targetField="outputState"/>
      <OutputField name="outputResult" optype="continuous" dataType="double" targetField="outputResult"/>
    </Output>
    <Segmentation multipleModelMethod="selectAll">
      <Segment id="1">
        <True/>
        <TreeModel modelName="TEST" functionName="classification" noTrueChildStrategy="returnLastPrediction">
          <MiningSchema>
            <MiningField name="outputState" usageType="target"/>
            <MiningField name="inputA" invalidValueTreatment="asMissing"/>
            <MiningField name="inputB" invalidValueTreatment="asMissing"/>
            <MiningField name="inputC" invalidValueTreatment="asMissing"/>
          </MiningSchema>
          <Node score="0">
            <True/>
            <Node score="1">
              <CompoundPredicate booleanOperator="or">
                <SimplePredicate field="inputA" operator="isMissing"/>
                <SimplePredicate field="inputB" operator="isMissing"/>
                <SimplePredicate field="inputC" operator="isMissing"/>
              </CompoundPredicate>
            </Node>
          </Node>
        </TreeModel>
      </Segment>
      <Segment id="2">
        <True/>
        <RegressionModel functionName="regression">
          <MiningSchema>
            <MiningField name="outputResult" usageType="target"/>
            <MiningField name="inputA" missingValueReplacement="0" missingValueTreatment="asMean" invalidValueTreatment="asMissing"/>
            <MiningField name="inputB" missingValueReplacement="0" missingValueTreatment="asMean" invalidValueTreatment="asMissing"/>
            <MiningField name="inputC" missingValueReplacement="0" missingValueTreatment="asMean" invalidValueTreatment="asMissing"/>
          </MiningSchema>
          <RegressionTable intercept="2">
            <NumericPredictor name="inputA" coefficient="1"/>
            <NumericPredictor name="inputB" coefficient="2"/>
            <NumericPredictor name="inputC" coefficient="3"/>
          </RegressionTable>
        </RegressionModel>
      </Segment>
    </Segmentation>
    </MiningModel>
    </PMML>

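Explanation:

  1. DataDictionary (each input has an Interval with leftMargin/rightMargin that defines its valid range)
  2. MiningModel (functionName="mixed" seems to be wrong? Segmentation multipleModelMethod="selectAll" too?):
    • Output definition (also seems wrong, because of the different targets?)
    • Simple classification tree model (detects missing/imputed values) -> target: outputState
    • Simple regression model -> target: outputResult

Does anyone have an idea or a better suggestion?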
