
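I am trying to make my code reproducible. I have already added np.random.seed(...) and random.seed(...), and I am not using PyTorch or TF at the moment, so there is no scheduler or searcher that could introduce any randomness. The set of configurations generated by the code below should always be the same across multiple calls. However, that is not the case.

Can anyone help?

Thanks!

Here is the code: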

import ray
from ray import tune
import random
import numpy as np

def training_function(config, data_init):
    print('CONFIG:', config)
    tune.report(end_of_training=1, acc=0, f=0)

if __name__ == '__main__':
    ray.init(num_cpus=12)
    tune_config = {'sentence_classification': False, 
              'norm_word_emb': tune.choice(['True', 'False']), 
              'use_crf': tune.choice(['True', 'False']), 
              'use_char': tune.choice(['True', 'False']), 
              'word_seq_feature': tune.choice(['CNN', 'LSTM', 'GRU']), 
              'char_seq_feature': tune.choice(['CNN', 'LSTM', 'GRU']), 
              'seed_num': 1267}
    data = {'a': 1}
    tune_seed = tune_config['seed_num']
    random.seed(tune_seed)
    np.random.seed(tune_seed)
    n_samples = 15
    exp_name = 'experiment_name'
    analysis = tune.run(
        tune.with_parameters(training_function, data_init={'data': data}),
        name=exp_name,
        metric="f",
        mode="max",
        queue_trials=True,
        config=tune_config,
        num_samples=n_samples,
        resources_per_trial={"cpu": 1},
        checkpoint_at_end=True,
        max_failures=0,
    )

2 Answers


The function-level API is not reproducible (ray v1.1.0, subject to change).

Wait, but why?

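  1. tune.run creates an Experiment object and passes your function to it.
  2. Experiment registers the function as a trainable by calling register_trainable.
  3. register_trainable uses wrap_function.
  4. wrap_function creates a class-level API (a Ray Actor) by inheriting from the FunctionRunner class.
  5. FunctionRunner does not give any callback access to the setup method.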
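
The chain above can be pictured with a toy model. This is only a sketch under stated assumptions: ToyFunctionRunner and toy_wrap_function are hypothetical names invented for illustration and are not Ray's actual classes or code. The point it tries to show is that once a function is wrapped into a class, setup belongs to the wrapper, so the function body has no hook into it.

```python
class ToyFunctionRunner:
    """Toy stand-in for a class-based trainable (hypothetical, not Ray's code)."""

    def setup(self, config):
        # Setup is owned by the wrapper; the user function is not called here.
        self.config = config

    def step(self):
        # The user function only runs when a step is requested.
        return self._user_function(self.config)


def toy_wrap_function(user_function):
    """Return a subclass whose step() calls user_function(config)."""

    class Wrapped(ToyFunctionRunner):
        def _user_function(self, config):
            return user_function(config)

    return Wrapped


def training_function(config):
    # The function can read the seed from config, but cannot customize setup().
    return {"f": 0, "seen_seed": config["seed_num"]}


runner = toy_wrap_function(training_function)()
runner.setup({"seed_num": 1267})
result = runner.step()
print(result)
```

In this toy, anything that must happen during initialization has to live in the class itself, which is the motivation for writing a custom Trainable.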

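To oversimplify how an Actor works: it is distributed among the workers and then initialized in a different process using its setup method. That is why it is crucial to pass the seed and implement the initialization logic in your custom Trainable, as described in the answer. Seeding is needed because tune.choice is just a wrapper around random/np.random functions. You can observe this in tune/sample.py.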
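
As a rough illustration of why seeding matters for a sampler built on the global random module, here is a minimal sketch. It does not use Ray: sample_configs and the list-based search space are hypothetical stand-ins, and only the standard-library random module is used (np.random would need its own seed in the same way).

```python
import random


def sample_configs(space, num_samples, seed):
    """Draw num_samples configs from space after seeding the global RNG.

    space maps a name to either a fixed value or a list of candidates.
    This is a hypothetical stand-in for a search space, not Ray's API.
    """
    random.seed(seed)
    configs = []
    for _ in range(num_samples):
        config = {}
        for key, value in space.items():
            # Lists are treated as categorical choices; anything else is fixed.
            config[key] = random.choice(value) if isinstance(value, list) else value
        configs.append(config)
    return configs


space = {
    "use_crf": ["True", "False"],
    "word_seq_feature": ["CNN", "LSTM", "GRU"],
    "seed_num": 1267,
}
first = sample_configs(space, 15, seed=1267)
second = sample_configs(space, 15, seed=1267)
print(first == second)
```

The two lists match only because the seed is set in the same process that draws the samples; seeding one process does nothing for a generator living in another.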

See the example:


import ray
from ray import tune
import random
import numpy as np

class Tunable(tune.Trainable):
    def setup(self, config):
        self.config = config
        self.seed = config['seed_num']
        random.seed(self.seed)
        np.random.seed(self.seed)
    
    def step(self):
        print('CONFIG:', self.config)
        return {tune.result.DONE: 'done', 'acc': 0, 'f': 0}

if __name__ == '__main__':
    ray.init(num_cpus=12)
    tune_config = {'sentence_classification': False, 
              'norm_word_emb': tune.choice(['True', 'False']), 
              'use_crf': tune.choice(['True', 'False']), 
              'use_char': tune.choice(['True', 'False']), 
              'word_seq_feature': tune.choice(['CNN', 'LSTM', 'GRU']), 
              'char_seq_feature': tune.choice(['CNN', 'LSTM', 'GRU']), 
              'seed_num': 1267}
    data = {'a': 1}
    tune_seed = tune_config['seed_num']
    n_samples = 15
    exp_name = 'experiment_name'
    analysis = tune.run(
        Tunable,
        name=exp_name,
        metric="f",
        mode="max",
        queue_trials=True,
        config=tune_config,
        num_samples=n_samples,
        resources_per_trial={"cpu": 1},
        checkpoint_at_end=False,
        max_failures=0,
    )
answered 2021-01-08T07:38:54.200

I am seeing the seeding work as expected. I ran this script:

import ray
from ray import tune
import numpy as np
import random


def training_function(config, data_init):
    print('CONFIG:', config)
    tune.report(end_of_training=1, acc=0, f=0)

if __name__ == '__main__':
    # ray.init(num_cpus=12)
    tune_config = {'sentence_classification': False, 
              'norm_word_emb': tune.choice(['True', 'False']), 
              'use_crf': tune.choice(['True', 'False']), 
              'use_char': tune.choice(['True', 'False']), 
              'word_seq_feature': tune.choice(['CNN', 'LSTM', 'GRU']), 
              'char_seq_feature': tune.choice(['CNN', 'LSTM', 'GRU']), 
              'seed': 1267}
    data = {'a': 1}
    tune_seed = tune_config['seed']
    random.seed(tune_seed)
    np.random.seed(tune_seed)
    n_samples = 15
    analysis = tune.run(
        tune.with_parameters(training_function, data_init={'data': data}),
        #name=exp_name,
        metric="f",
        mode="max",
        queue_trials=True,
        config=tune_config,
        num_samples=n_samples,
        resources_per_trial={"cpu": 1},
        verbose=2,
        max_failures=0,
    )

I ran it once:

Resources requested: 0/16 CPUs, 0/0 GPUs, 0.0/27.0 GiB heap, 0.0/9.28 GiB objects
Current best trial: 84b84_00014 with f=0 and parameters={'sentence_classification': False, 'norm_word_emb': 'False', 'use_crf': 'True', 'use_char': 'False', 'word_seq_feature': 'LSTM', 'char_seq_feature': 'GRU', 'seed': 1267}
Number of trials: 15/15 (15 TERMINATED)
+--------------------+------------+-------+--------------------+-----------------+------------+-----------+--------------------+--------+------------------+-------------------+-------+-----+
| Trial name         | status     | loc   | char_seq_feature   | norm_word_emb   | use_char   | use_crf   | word_seq_feature   |   iter |   total time (s) |   end_of_training |   acc |   f |
|--------------------+------------+-------+--------------------+-----------------+------------+-----------+--------------------+--------+------------------+-------------------+-------+-----|
| _inner_84b84_00000 | TERMINATED |       | LSTM               | True            | False      | False     | LSTM               |      1 |       0.00149202 |                 1 |     0 |   0 |
| _inner_84b84_00001 | TERMINATED |       | CNN                | False           | True       | False     | CNN                |      1 |       0.0014801  |                 1 |     0 |   0 |
| _inner_84b84_00002 | TERMINATED |       | GRU                | False           | False      | True      | GRU                |      1 |       0.00152397 |                 1 |     0 |   0 |
| _inner_84b84_00003 | TERMINATED |       | GRU                | False           | False      | False     | GRU                |      1 |       0.00165081 |                 1 |     0 |   0 |
| _inner_84b84_00004 | TERMINATED |       | CNN                | False           | False      | False     | CNN                |      1 |       0.00173998 |                 1 |     0 |   0 |
| _inner_84b84_00005 | TERMINATED |       | LSTM               | True            | True       | True      | CNN                |      1 |       0.00219083 |                 1 |     0 |   0 |
| _inner_84b84_00006 | TERMINATED |       | GRU                | True            | False      | False     | LSTM               |      1 |       0.00192428 |                 1 |     0 |   0 |
| _inner_84b84_00007 | TERMINATED |       | LSTM               | True            | False      | False     | CNN                |      1 |       0.00208902 |                 1 |     0 |   0 |
| _inner_84b84_00008 | TERMINATED |       | LSTM               | True            | True       | True      | GRU                |      1 |       0.00146484 |                 1 |     0 |   0 |
| _inner_84b84_00009 | TERMINATED |       | CNN                | False           | False      | True      | CNN                |      1 |       0.00152087 |                 1 |     0 |   0 |
| _inner_84b84_00010 | TERMINATED |       | LSTM               | False           | True       | False     | CNN                |      1 |       0.00124121 |                 1 |     0 |   0 |
| _inner_84b84_00011 | TERMINATED |       | LSTM               | True            | True       | True      | CNN                |      1 |       0.00124812 |                 1 |     0 |   0 |
| _inner_84b84_00012 | TERMINATED |       | LSTM               | True            | True       | True      | LSTM               |      1 |       0.00133514 |                 1 |     0 |   0 |
| _inner_84b84_00013 | TERMINATED |       | LSTM               | True            | False      | True      | CNN                |      1 |       0.00142407 |                 1 |     0 |   0 |
| _inner_84b84_00014 | TERMINATED |       | GRU                | False           | False      | True      | LSTM               |      1 |       0.00120211 |                 1 |     0 |   0 |
+--------------------+------------+-------+--------------------+-----------------+------------+-----------+--------------------+--------+------------------+-------------------+-------+-----+

And then a subsequent run:

Current best trial: 84b84_00014 with f=0 and parameters={'sentence_classification': False, 'norm_word_emb': 'False', 'use_crf': 'True', 'use_char': 'False', 'word_seq_feature': 'LSTM', 'char_seq_feature': 'GRU', 'seed': 1267}
Result logdir: /Users/rliaw/ray_results/_inner_2021-01-07_10-45-31
Number of trials: 15/15 (15 TERMINATED)
+--------------------+------------+-------+--------------------+-----------------+------------+-----------+--------------------+--------+------------------+-------------------+-------+-----+
| Trial name         | status     | loc   | char_seq_feature   | norm_word_emb   | use_char   | use_crf   | word_seq_feature   |   iter |   total time (s) |   end_of_training |   acc |   f |
|--------------------+------------+-------+--------------------+-----------------+------------+-----------+--------------------+--------+------------------+-------------------+-------+-----|
| _inner_84b84_00000 | TERMINATED |       | LSTM               | True            | False      | False     | LSTM               |      1 |       0.00149202 |                 1 |     0 |   0 |
| _inner_84b84_00001 | TERMINATED |       | CNN                | False           | True       | False     | CNN                |      1 |       0.0014801  |                 1 |     0 |   0 |
| _inner_84b84_00002 | TERMINATED |       | GRU                | False           | False      | True      | GRU                |      1 |       0.00152397 |                 1 |     0 |   0 |
| _inner_84b84_00003 | TERMINATED |       | GRU                | False           | False      | False     | GRU                |      1 |       0.00165081 |                 1 |     0 |   0 |
| _inner_84b84_00004 | TERMINATED |       | CNN                | False           | False      | False     | CNN                |      1 |       0.00173998 |                 1 |     0 |   0 |
| _inner_84b84_00005 | TERMINATED |       | LSTM               | True            | True       | True      | CNN                |      1 |       0.00219083 |                 1 |     0 |   0 |
| _inner_84b84_00006 | TERMINATED |       | GRU                | True            | False      | False     | LSTM               |      1 |       0.00192428 |                 1 |     0 |   0 |
| _inner_84b84_00007 | TERMINATED |       | LSTM               | True            | False      | False     | CNN                |      1 |       0.00208902 |                 1 |     0 |   0 |
| _inner_84b84_00008 | TERMINATED |       | LSTM               | True            | True       | True      | GRU                |      1 |       0.00146484 |                 1 |     0 |   0 |
| _inner_84b84_00009 | TERMINATED |       | CNN                | False           | False      | True      | CNN                |      1 |       0.00152087 |                 1 |     0 |   0 |
| _inner_84b84_00010 | TERMINATED |       | LSTM               | False           | True       | False     | CNN                |      1 |       0.00124121 |                 1 |     0 |   0 |
| _inner_84b84_00011 | TERMINATED |       | LSTM               | True            | True       | True      | CNN                |      1 |       0.00124812 |                 1 |     0 |   0 |
| _inner_84b84_00012 | TERMINATED |       | LSTM               | True            | True       | True      | LSTM               |      1 |       0.00133514 |                 1 |     0 |   0 |
| _inner_84b84_00013 | TERMINATED |       | LSTM               | True            | False      | True      | CNN                |      1 |       0.00142407 |                 1 |     0 |   0 |
| _inner_84b84_00014 | TERMINATED |       | GRU                | False           | False      | True      | LSTM               |      1 |       0.00120211 |                 1 |     0 |   0 |
+--------------------+------------+-------+--------------------+-----------------+------------+-----------+--------------------+--------+------------------+-------------------+-------+-----+

Note that the trials and their configurations are exactly the same (in the same order).
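
Rather than comparing the tables by eye, the check can be automated. The sketch below assumes the per-trial configs of each run have already been collected into ordered lists of dicts; first_mismatch is a hypothetical helper, and how the configs are gathered from Tune is left out. The sample data is the first two rows of the table above.

```python
def first_mismatch(run_a, run_b):
    """Return the index of the first differing config, or None if runs match.

    A difference in length counts as a mismatch at the shorter length.
    """
    for index, (config_a, config_b) in enumerate(zip(run_a, run_b)):
        if config_a != config_b:
            return index
    if len(run_a) != len(run_b):
        return min(len(run_a), len(run_b))
    return None


# First two trials from the table above, written out by hand.
run_1 = [
    {"char_seq_feature": "LSTM", "norm_word_emb": "True", "use_char": "False",
     "use_crf": "False", "word_seq_feature": "LSTM"},
    {"char_seq_feature": "CNN", "norm_word_emb": "False", "use_char": "True",
     "use_crf": "False", "word_seq_feature": "CNN"},
]
run_2 = [dict(config) for config in run_1]  # an identical second run

same = first_mismatch(run_1, run_2)
# Reversing the order of the trials should be reported as a mismatch.
swapped = first_mismatch(run_1, list(reversed(run_2)))
print(same, swapped)
```

Because the comparison is positional, it also catches the case where the same configs appear in a different order.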

answered 2021-01-07T18:50:09.290