
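I have set up a learning environment in which a ball learns to roll straight toward a target and collide with it. The target's position is random, but the radius within which it spawns is controlled by a public variable on the game controller, so the difficulty can be adjusted.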

public float distanceToAgent = 1f;

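My intent is to increase this distance using a curriculum defined in the YAML config. However, when I try the following, the target spawn distance does not appear to increase according to the curriculum.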

What am I doing wrong? I feel I am missing something in my script that ties the variable's value to the curriculum config.

behaviors:
  RollerBall:
    trainer_type: ppo
    hyperparameters:
      batch_size: 10
      buffer_size: 100
      learning_rate: 3.0e-4
      beta: 5.0e-4
      epsilon: 0.2
      lambd: 0.99
      num_epoch: 3
      learning_rate_schedule: linear
    network_settings:
      normalize: false
      hidden_units: 128
      num_layers: 2
    reward_signals:
      extrinsic:
        gamma: 0.99
        strength: 1.0
    max_steps: 900000
    time_horizon: 64
    summary_freq: 10000

environment_parameters:
  distanceToAgent:
    curriculum:
        - name: FirstLesson
          completion_criteria: 
            measure: progress
            behavior: RollerBall
            signal_smoothing: true
            min_lesson_length: 100
            threshold: 0.1
          value: 1.0

        - name: SecondLesson
          completion_criteria:
            measure: progress
            behavior: RollerBall
            signal_smoothing: true
            min_lesson_length: 1000
            threshold: 0.3
          value: 5.0

        - name: ThirdLesson
          completion_criteria:
            measure: progress
            behavior: RollerBall
            signal_smoothing: true
            min_lesson_length: 1000
            threshold: 0.5
          value: 10.0

        - name: FourthLesson
          value: 15.0

1 Answer


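I think the config looks fine.

But in your script you also need to read the value from the environment parameters: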

// Requires: using Unity.MLAgents;
EnvironmentParameters m_ResetParams;

public override void Initialize()
{
    m_ResetParams = Academy.Instance.EnvironmentParameters;
}

// Then read the value, for example in OnEpisodeBegin():
distanceToAgent = m_ResetParams.GetWithDefault("distanceToAgent", 1.0f);
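
To show how these pieces might fit together, here is a minimal sketch of an agent script. The class name `RollerAgent`, the `target` field, and the spawning logic are assumptions made for illustration and are not taken from the question. Only `Academy.Instance.EnvironmentParameters` and `GetWithDefault` come from the snippet above. The key passed to `GetWithDefault` has to match the name under `environment_parameters` in the YAML file.

```csharp
using Unity.MLAgents;
using UnityEngine;

// Hypothetical agent class; only the curriculum-related parts are shown.
public class RollerAgent : Agent
{
    // Hypothetical reference to the target object, assigned in the Inspector.
    public Transform target;

    // Spawn radius; overwritten at the start of each episode by the current lesson value.
    public float distanceToAgent = 1f;

    EnvironmentParameters m_ResetParams;

    public override void Initialize()
    {
        // Cache the parameter channel that the trainer writes lesson values to.
        m_ResetParams = Academy.Instance.EnvironmentParameters;
    }

    public override void OnEpisodeBegin()
    {
        // Falls back to 1.0f when no trainer is connected or the key is not set.
        distanceToAgent = m_ResetParams.GetWithDefault("distanceToAgent", 1.0f);

        // Assumed spawning logic: put the target on a circle of that radius around the agent.
        Vector2 dir = Random.insideUnitCircle.normalized;
        target.localPosition = transform.localPosition
            + new Vector3(dir.x, 0f, dir.y) * distanceToAgent;
    }
}
```

If the distance lives on a separate game controller, as in the question, the same `GetWithDefault` call could be made there instead. Note also that with `measure: progress` the lesson should only change once the ratio of training steps to `max_steps` passes the threshold, so with the config above the first increase would be expected around step 90,000, not immediately.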
Answered 2021-03-24T15:03:20.650