
My goal is to tune over possible network architectures that satisfy the following criteria:

  1. Layer 1 can have any number of hidden units from this list: [32, 64, 128, 256, 512]

Then, the number of hidden units to explore for each remaining layer should always depend on the particular choice made for the layer above it; specifically:

  1. Layer 2 can have the same number of units as layer 1, or half as many.
  2. Layer 3 can have the same number of units as layer 2, or half as many.
  3. Layer 4 can have the same number of units as layer 3, or half as many.

As I am currently implementing this, the hp.Choice options for layers 2, 3, and 4 are never updated once they are first established.

For example, suppose that on the tuner's first pass, num_layers = 4, meaning all four layers will be created. If, say, layer 1 chooses 256 hidden units, the options become:

Layer 2 --> [128, 256]

Layer 3 --> [64, 128]

Layer 4 --> [32, 64]

Layers 2, 3, and 4 stay stuck on these choices in every subsequent iteration, rather than updating to adapt to future choices for layer 1.

This means that in future iterations, when the number of hidden units in layer 1 changes, the options for layers 2, 3, and 4 no longer satisfy the intended goal: exploring a space in which each subsequent layer can contain the same number of hidden units as the previous layer, or half as many.

import tensorflow as tf
from tensorflow.keras import layers


def build_and_tune_model(hp, train_ds, normalize_features, ohe_features, max_tokens, passthrough_features):
    
    all_inputs, encoded_features = get_all_preprocessing_layers(train_ds,
                                                            normalize_features=normalize_features,
                                                            ohe_features=ohe_features,
                                                            max_tokens=max_tokens,
                                                            passthrough=passthrough_features)

    
    
    # Possible values for the number of hidden units in layer 1.
    # Defining here because we will always have at least 1 layer.
    layer_1_hidden_units = hp.Choice('layer1_hidden_units', values=[32, 64, 128, 256, 512])

    # Possible number of layers to include
    num_layers = hp.Choice('num_layers', values=[1, 2, 3, 4])
    
    print("================= starting new round =====================")
    print(f"Layer 1 hidden units = {hp.get('layer1_hidden_units')}")
    print(f"Num layers is {hp.get('num_layers')}")
    
    
    all_features = layers.concatenate(encoded_features)
    
    x = layers.Dense(layer_1_hidden_units,
                     activation="relu")(all_features)

    
    if hp.get('num_layers') >= 2:
        
        with hp.conditional_scope("num_layers", [2, 3, 4]):
            
            # Layer 2 hidden units can either be half the layer 1 hidden units or the same.
            layer_2_hidden_units = hp.Choice('layer2_hidden_units', values=[(int(hp.get('layer1_hidden_units') / 2)),
                                                                            hp.get('layer1_hidden_units')])

            
            print("\n==========================================================")
            print(f"In layer 2")
            print(f"num_layers param = {hp.get('num_layers')}")
            print(f"layer_1_hidden_units = {hp.get('layer1_hidden_units')}")
            print(f"layer_2_hidden_units = {hp.get('layer2_hidden_units')}")
            print("==============================================================\n")

            x = layers.Dense(layer_2_hidden_units,
                             activation="relu")(x)

    if hp.get('num_layers') >= 3:
        
        with hp.conditional_scope("num_layers", [3, 4]):
        
            # Layer 3 hidden units can either be half the layer 2 hidden units or the same.
            layer_3_hidden_units = hp.Choice('layer3_hidden_units', values=[(int(hp.get('layer2_hidden_units') / 2)),
                                                                            hp.get('layer2_hidden_units')])


            print("\n==========================================================")
            print(f"In layer 3")
            print(f"num_layers param = {hp.get('num_layers')}")
            print(f"layer_1_hidden_units = {hp.get('layer1_hidden_units')}")
            print(f"layer_2_hidden_units = {hp.get('layer2_hidden_units')}")
            print(f"layer_3_hidden_units = {hp.get('layer3_hidden_units')}")
            print("==============================================================\n")

            x = layers.Dense(layer_3_hidden_units,
                             activation="relu")(x)

    if hp.get('num_layers') >= 4:
        
        with hp.conditional_scope("num_layers", [4]):
        
            # Layer 4 hidden units can either be half the layer 3 hidden units or the same.
            # Extra stipulation applied here, layer 4 hidden units can never be less than 8.
            layer_4_hidden_units = hp.Choice('layer4_hidden_units', values=[max(int(hp.get('layer3_hidden_units') / 2), 8),
                                                                            hp.get('layer3_hidden_units')])


            print("\n==========================================================")
            print(f"In layer 4")
            print(f"num_layers param = {hp.get('num_layers')}")
            print(f"layer_1_hidden_units = {hp.get('layer1_hidden_units')}")
            print(f"layer_2_hidden_units = {hp.get('layer2_hidden_units')}")
            print(f"layer_3_hidden_units = {hp.get('layer3_hidden_units')}")
            print(f"layer_4_hidden_units = {hp.get('layer4_hidden_units')}")
            print("==============================================================\n")

            x = layers.Dense(layer_4_hidden_units,
                             activation="relu")(x)

    
    output = layers.Dense(1, activation='sigmoid')(x)
    
    model = tf.keras.Model(all_inputs, output)
    
    model.compile(optimizer=tf.keras.optimizers.Adam(),
                  loss='binary_crossentropy',
                  metrics=['accuracy'])
    
    print(">>>>>>>>>>>>>>>>>>>>>>>>>>>> End of round <<<<<<<<<<<<<<<<<<<<<<<<<<<<<")
    
    return model

Does anyone know the correct way to tell Keras Tuner to explore all possible options for each layer's hidden units, where the space to explore satisfies the criteria that every layer after the first may have the same number of hidden units as the previous layer or half as many, and the first layer may have a number of hidden units from the list [32, 64, 128, 256, 512]?


1 Answer


To address this, we first need to understand how hyperparameters and their values are selected. Before control reaches our application, Keras Tuner selects all active hyperparameters from the hyperparameter space. An active hyperparameter is one whose associated conditions are satisfied (note: by default, a hyperparameter has no conditions assigned). Keras Tuner then generates a random value from the list of values associated with each active hyperparameter. In other words, hyperparameter selection and value generation are already finished before control reaches our application; inside our application we merely extract the already-generated values. That is why you always see that a hyperparameter is never updated once it is first established.

In your case, consider a scenario where, in the first trial, the tuner generates 256 as the unit count for the first layer. The code below will then create a hyperparameter 'layer2_hidden_units' for the second layer with the possible value set [128, 256]:

layer_2_hidden_units = hp.Choice('layer2_hidden_units', values=[(int(hp.get('layer1_hidden_units') / 2)),  hp.get('layer1_hidden_units')])

In the second trial, before control reaches your application, the tuner has already picked a value from the list [128, 256], say 128. The value of the hyperparameter 'layer2_hidden_units' will therefore be 128, and inside your application you merely extract the already-generated value.
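This register-once behavior can be illustrated with a tiny stand-in for the tuner's registry. This is a deliberately simplified toy for illustration only, not KerasTuner's real internals; `FakeHyperParameters` is invented here:

```python
class FakeHyperParameters:
    """Toy model of a register-once registry: each named hyperparameter
    keeps the value list it was first registered with."""

    def __init__(self):
        self._space = {}   # name -> list of allowed values
        self._values = {}  # name -> currently chosen value

    def Choice(self, name, values):
        if name not in self._space:
            # The first call defines the search space for this name.
            self._space[name] = list(values)
            self._values[name] = values[0]  # default: first value
        # Later calls ignore the new `values` argument entirely.
        return self._values[name]


hp = FakeHyperParameters()
hp.Choice('layer2_hidden_units', [128, 256])  # first trial: layer 1 chose 256
hp.Choice('layer2_hidden_units', [32, 64])    # later trial: layer 1 chose 64
print(hp._space['layer2_hidden_units'])       # [128, 256] -- still the original space
```

The second call's `[32, 64]` never makes it into the registry, which is exactly the freezing the question observes.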

The solution to your problem is to generate the hyperparameters dynamically, like this:

hidden_units = hp.Choice('units_layer_' + str(layer_index), values=[(int(hp.get('layer1_hidden_units') / 2)), hp.get('layer1_hidden_units')])

# where 
# hp.get('layer1_hidden_units') = 256 and layer_index = 2
# or hp.get('layer1_hidden_units') = 128 and layer_index = 1
# and so on...

Now take the scenario we already discussed, where Keras Tuner picks 256 as the unit count for the first layer in the first trial. For that same trial, the code above lets Keras Tuner set up the hyperparameters of the remaining layers as hidden_units_layer_2 = [128, 256], hidden_units_layer_1 = [64, 128], hidden_units_layer_0 = [32, 64].
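That halving sequence can be generated directly. `remaining_layer_spaces` below is a hypothetical helper written just to show the intended search spaces, not part of the answer's code:

```python
def remaining_layer_spaces(first_layer_units, num_hidden):
    """Each deeper layer may have the same number of units as the layer
    above it, or half as many, so its option pair is [upper // 2, upper]."""
    spaces = []
    upper = first_layer_units
    for _ in range(num_hidden):
        spaces.append([upper // 2, upper])
        upper //= 2
    return spaces


print(remaining_layer_spaces(256, 3))
# [[128, 256], [64, 128], [32, 64]]
```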

But now we face a second challenge: in subsequent trials, all of these hyperparameters will always be activated even though some of them are no longer needed. For example, if in the second trial the selected unit count for the first layer is 64, hidden_units_layer_2 = [128, 256] and hidden_units_layer_1 = [64, 128] will also be activated. This means we now need to disable them by placing them inside a conditional scope, as below:

with hp.conditional_scope(parent_units_name, parent_units_value):
   hidden_units = hp.Choice(child_units_name, values=child_units_value)
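Conceptually, a conditional scope attaches a (parent, allowed values) condition to the child hyperparameter, and the child is only active, and only sampled, when the parent's current value is in the allowed list. A minimal sketch of that activation rule (illustrative only; `is_active` is not a KerasTuner API):

```python
def is_active(conditions, current_values):
    """A hyperparameter is active when every (parent, allowed_values)
    condition attached to it is satisfied by the current trial values."""
    return all(current_values.get(parent) in allowed
               for parent, allowed in conditions)


# The child only matters when the first layer picked 256 or 512.
conditions = [('first_layer_units', [256, 512])]

print(is_active(conditions, {'first_layer_units': 256}))  # True
print(is_active(conditions, {'first_layer_units': 64}))   # False
```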

The final code looks like this:

# List possible units
possible_units = [32, 64, 128, 256, 512]

possible_layer_units = []
for index, item in enumerate(possible_units[:-1]):
    possible_layer_units.append([item, possible_units[index + 1]])

# possible_layer_units = [[32, 64], [64, 128], [128, 256], [256, 512]] 
# where list index represent layer number 
# and list element represent list of unit possibilities for each layer

first_layer_units = hp.Choice('first_layer_units', values=possible_units)

# Then add first layer
all_features = layers.concatenate(encoded_features)  
x = layers.Dense(first_layer_units, activation="relu")(all_features)

# Get the number of hidden layers based on first layer unit count
hidden_layer_count = possible_units.index(first_layer_units)
if hidden_layer_count > 0:
    iter_count = 0
    for hidden_layer_index in range(hidden_layer_count - 1, -1, -1):
        if iter_count == 0:
            # Collect HP 'units' details for the second layer
            # Suppose first_layer_units = 512, then
            # HP example: <units_layer_43=[256, 512] condition={first_layer_units:[256, 512]}>
            # where for units_layer_43, 4 indicates there will be total 5 layers and 3 indicates 'units' for the layer 4th from last
            # we are using total hidden layer count in HP name to avoid an issue while getting the unit count value.
            parent_units_name = 'first_layer_units'
            parent_units_value = possible_layer_units[hidden_layer_index]
            child_units_name = 'units_layer_' + str(hidden_layer_count) + str(hidden_layer_index)
            child_units_value = parent_units_value
        else:
            # Collect HP 'units' details for the next layers
            # Suppose units_layer_43 = 256, then
            # HP example: <units_layer_42=[128, 256] condition={units_layer_43:[256, 512]}>
            parent_units_name = 'units_layer_' + str(hidden_layer_count) + str(hidden_layer_index + 1)
            parent_units_value = possible_layer_units[hidden_layer_index + 1]
            child_units_name = 'units_layer_' + str(hidden_layer_count) + str(hidden_layer_index)
            child_units_value = possible_layer_units[hidden_layer_index]

        # Add and Activate child HP under parent HP using conditional scope
        with hp.conditional_scope(parent_units_name, parent_units_value):
            hidden_units = hp.Choice(child_units_name, values=child_units_value)
            
        # Add remaining NN layers one by one
        x = layers.Dense(hidden_units, activation="relu")(x)

        iter_count += 1

As a result, those hyperparameters are activated only when their associated conditions are satisfied. So in our case, if in the second trial the selected unit count for the first layer is 64, the hyperparameters 'units_layer_2' and 'units_layer_1' will be disabled by their conditional scopes, and only the hyperparameter 'units_layer_0' will remain active.
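To see what the loop above actually registers, here is a dry run of the same name-and-condition bookkeeping in plain Python (no KerasTuner; `planned_hyperparameters` is a helper invented for this walkthrough) for a trial where first_layer_units = 256:

```python
possible_units = [32, 64, 128, 256, 512]
possible_layer_units = [[u, possible_units[i + 1]]
                        for i, u in enumerate(possible_units[:-1])]


def planned_hyperparameters(first_layer_units):
    """Replicate the loop's bookkeeping: for each hidden layer, record
    (child name, child values, parent name, parent condition values)."""
    plan = []
    hidden_layer_count = possible_units.index(first_layer_units)
    for iter_count, idx in enumerate(range(hidden_layer_count - 1, -1, -1)):
        if iter_count == 0:
            parent_name = 'first_layer_units'
            parent_values = possible_layer_units[idx]
        else:
            parent_name = 'units_layer_%d%d' % (hidden_layer_count, idx + 1)
            parent_values = possible_layer_units[idx + 1]
        child_name = 'units_layer_%d%d' % (hidden_layer_count, idx)
        child_values = possible_layer_units[idx]
        plan.append((child_name, child_values, parent_name, parent_values))
    return plan


for row in planned_hyperparameters(256):
    print(row)
# ('units_layer_32', [128, 256], 'first_layer_units', [128, 256])
# ('units_layer_31', [64, 128], 'units_layer_32', [128, 256])
# ('units_layer_30', [32, 64], 'units_layer_31', [64, 128])
```

With first_layer_units = 32, `hidden_layer_count` is 0 and no child hyperparameters are registered at all, matching the single-layer case.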

Answered 2021-12-17T16:15:11.043