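I am using Ray Tune with wandb and want to stop the experiment under certain conditions:

  • Stop all experiments if any trial raises an exception (so I can fix the code and resume)
  • Stop if my score reaches -999
  • Stop if the variable varcannotbezero is 0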

None of the following attempts produced the expected behavior:

  • stop={"score":-999,"varcannotbezero":0}
  • max_failures=0
  • Defining a Stopper class did not work either:
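import time

from ray.tune import Stopper


class RayStopper(Stopper):
    def __init__(self):
        self._start = time.time()
        # self._deadline = 300
        # Initialize so stop_all() is safe before the first result arrives.
        self.score = None
        self.varcannotbezero = None

    def __call__(self, trial_id, result):
        # Record the latest metrics; never stop an individual trial here.
        self.score = result["score"]
        self.varcannotbezero = result["varcannotbezero"]
        return False

    def stop_all(self):
        # Stop the whole experiment once either condition is met.
        return self.score == -999 or self.varcannotbezero == 0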

Ray Tune just keeps running:

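    # Imports assume the Ray 1.x module layout; these paths moved in Ray 2.x.
    import ray
    from ray import tune
    from ray.tune.integration.wandb import WandbLoggerCallback
    from ray.tune.schedulers import AsyncHyperBandScheduler
    from ray.tune.suggest import ConcurrencyLimiter
    from ray.tune.suggest.hyperopt import HyperOptSearch

    wandb_project = "ABC"
    wandb_api_key = "KEY"
    ray.init(configure_logging=False)

    # Seed HyperOpt with known-good parameters when they are available.
    if current_best_params is None:
        algo = HyperOptSearch()
    else:
        algo = HyperOptSearch(
            points_to_evaluate=current_best_params,
            n_initial_points=n_initial_points,
        )
    # Run one trial at a time.
    algo = ConcurrencyLimiter(algo, max_concurrent=1)

    scheduler = AsyncHyperBandScheduler()
    analysis = tune.run(
        tune_obj,
        name="Name",
        resources_per_trial={"cpu": 1},
        search_alg=algo,
        scheduler=scheduler,
        metric="score",
        mode="max",
        num_samples=10,
        stop={"score": -999, "varcannotbezero": 0},
        max_failures=0,
        config=config,
        callbacks=[
            WandbLoggerCallback(
                project=wandb_project,
                entity="mycompany",
                api_key=wandb_api_key,
                log_config=True,
            )
        ],
        local_dir=local_dir,
        resume="AUTO",
        verbose=0,
    )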
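
For context on the dict form of `stop` used above: as far as I understand Ray Tune, each entry is a per-trial threshold, and a trial is stopped once the reported metric is greater than or equal to the given value. It is not an equality test and it does not end the whole experiment. The sketch below mimics that comparison in plain Python; `dict_stop_triggered` is a hypothetical helper written for illustration, not a Ray API.

```python
def dict_stop_triggered(stop, result):
    """Hypothetical helper: mimic a per-metric `>=` threshold check."""
    return any(result[name] >= threshold for name, threshold in stop.items())


stop = {"score": -999, "varcannotbezero": 0}

# An ordinary, healthy result already satisfies both thresholds.
healthy = {"score": 0.8, "varcannotbezero": 5}
print(dict_stop_triggered(stop, healthy))

# Only values strictly below every threshold fail to trigger.
below = {"score": -1000, "varcannotbezero": -1}
print(dict_stop_triggered(stop, below))
```

Under that assumption, a condition such as "score equals -999" or "variable equals 0" has to be expressed with a function or a Stopper subclass rather than a dict.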

1 Answer

I found a solution that stops the experiment with a custom Stopper class. However, once the experiment stops, I have not found a way to resume it :(

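import time

from ray.tune import Stopper


class RayStopper(Stopper):
    def __init__(self):
        self._start = time.time()
        self.scoretostop = 0

    def __call__(self, trial_id, result):
        # Record the flag from the latest result; never stop a single trial here.
        self.scoretostop = result["scoretostop"]
        return False

    def stop_all(self):
        secs = int(time.time())
        runtime = secs - self._start
        # Print a status block whenever the wall-clock second is a multiple of 20.
        if secs % 20 == 0:
            print("-----------------RayStopper--------------")
            print(f"runtime={runtime}")
            print(f"scoretostop={self.scoretostop}")
        # Stop the whole experiment once the latest result has scoretostop == 1.
        return self.scoretostop == 1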
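
One caveat with a stopper that stores only the most recent result: a later result can overwrite the value that should have triggered the stop before `stop_all` is checked. A minimal sketch of a latching variant, using the conditions from the question, might look like the following. `LatchingStopper` and its `bad_score` parameter are hypothetical names, and the fallback `Stopper` class is only a stand-in so the example runs without Ray installed.

```python
try:
    from ray.tune import Stopper
except ImportError:
    # Stand-in base class (assumption, for illustration only) so the
    # sketch can run without Ray installed.
    class Stopper:
        def __call__(self, trial_id, result):
            raise NotImplementedError

        def stop_all(self):
            raise NotImplementedError


class LatchingStopper(Stopper):
    """Hypothetical stopper that remembers a stop condition once it is seen."""

    def __init__(self, bad_score=-999):
        self._bad_score = bad_score
        self._should_stop = False

    def __call__(self, trial_id, result):
        # Latch the flag so a later, healthy result cannot clear it.
        if (result.get("score") == self._bad_score
                or result.get("varcannotbezero") == 0):
            self._should_stop = True
        # Once latched, every trial that reports is told to stop.
        return self._should_stop

    def stop_all(self):
        # Ask Tune to end the whole experiment once the flag is set.
        return self._should_stop


stopper = LatchingStopper()
print(stopper("trial_a", {"score": 0.4, "varcannotbezero": 3}))
print(stopper("trial_b", {"score": -999, "varcannotbezero": 3}))
print(stopper("trial_c", {"score": 0.9, "varcannotbezero": 2}))
print(stopper.stop_all())
```

With Ray installed, an instance would be passed as `stop=LatchingStopper()` to `tune.run`. For the "stop everything when a trial raises" requirement, `tune.run` also accepts a `fail_fast` argument, which may be worth trying; this sketch does not address resuming after the stop.
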
Answered 2022-02-17T13:31:25.427