
TensorFlow already has a way to create features across columns, tf.feature_column.crossed_column, but it is intended mostly for categorical data. What about numeric data?

For example, suppose there are already two columns:

age = tf.feature_column.numeric_column("age")
education_num = tf.feature_column.numeric_column("education_num")

If I want to create a third and a fourth feature column based on age and education_num, like this:

my_feature = age * education_num
my_another_feature = age * age

How can this be done?


1 Answer


You can declare a custom numeric column and add the corresponding values to the DataFrame inside your input function:

# Existing features
age = tf.feature_column.numeric_column("age")
education_num = tf.feature_column.numeric_column("education_num")
# Declare a custom column just like other columns
my_feature = tf.feature_column.numeric_column("my_feature")

...
# Add to the list of features
feature_columns = { ... age, education_num, my_feature, ... }

...
def input_fn():
  df_data = pd.read_csv("input.csv")
  df_data = df_data.dropna(how="any", axis=0)
  # Manually update the dataframe
  df_data["my_feature"] = df_data["age"] * df_data["education_num"]

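  # `labels` is not defined in this snippet; it is assumed to be a pandas
  # Series of targets aligned with df_data's index after dropna.
  return tf.estimator.inputs.pandas_input_fn(x=df_data,
                                             y=labels,
                                             batch_size=100,
                                             num_epochs=10,
                                             shuffle=True)  # set explicitly; some TF 1.x releases reject the default None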

...
model.train(input_fn=input_fn())
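
The snippet above only builds the first product feature. As a minimal sketch of the same idea covering both features from the question, the DataFrame step could look like the following. The helper name add_derived_features and the sample values are made up for illustration and stand in for input.csv; only pandas is used here, so it can be tried without TensorFlow.

```python
import pandas as pd


def add_derived_features(df):
    """Return a copy of df with the two product features from the question.

    Hypothetical helper: the name is not part of TensorFlow or pandas.
    """
    out = df.copy()
    # Element-wise product of two numeric columns.
    out["my_feature"] = out["age"] * out["education_num"]
    # Squared term, i.e. age * age.
    out["my_another_feature"] = out["age"] * out["age"]
    return out


# Tiny in-memory frame standing in for input.csv (values are invented).
df_data = pd.DataFrame({"age": [25, 40], "education_num": [10, 13]})
df_data = add_derived_features(df_data)
print(df_data)
```

Each derived column would then need its own declaration, for example tf.feature_column.numeric_column("my_another_feature"), added to the feature column list in the same way as my_feature.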
Answered 2017-10-24T13:17:06.227