Chapter 9: Listing 9.21 #30

Open
cptanalatriste opened this issue May 13, 2021 · 2 comments

@cptanalatriste

I noticed that when calling team_step(), we pass the same parameter vector, params[0], for both teams:

        acts_1, act_means1, qvals1, obs_small_1, ids_1 = \
            team_step(team1,params[0],acts_1,layers) #B
        env.set_action(team1, acts_1.detach().numpy().astype(np.int32)) #C

        acts_2, act_means2, qvals2, obs_small_2, ids_2 = \
            team_step(team2,params[0],acts_2,layers)
        env.set_action(team2, acts_2.detach().numpy().astype(np.int32))

Shouldn't it be params[0] for team 1 and params[1] for team 2? That's the behaviour shown later when calling train:

            loss1 = train(batch_size,replay1,params[0],layers=layers,J=N1)
            loss2 = train(batch_size,replay2,params[1],layers=layers,J=N1)
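
A minimal sketch of the proposed indexing, written so it can be checked in isolation. `team_step_stub` is a hypothetical stand-in for the book's team_step(), and the parameter vectors and team handles are placeholders; it only shows which vector each team would receive, not the actual MAgent behaviour.

```python
# Hypothetical stand-in for the book's team_step(): it returns the parameter
# vector it was given, so the per-team indexing can be inspected directly.
def team_step_stub(team, param, acts, layers):
    return {"team": team, "param": param}

params = [[0.1, 0.2], [0.3, 0.4]]  # placeholder vectors, one per team
team1, team2 = 0, 1                # placeholder team handles
layers = None                      # unused by the stub

# Proposed change: each team selects actions with its own parameter vector.
step_1 = team_step_stub(team1, params[0], None, layers)
step_2 = team_step_stub(team2, params[1], None, layers)

print(step_1["param"] is params[0], step_2["param"] is params[1])
```

With this indexing, the vector used to act matches the vector later passed to train() for the same team.
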
@Riad123321

I think it's wrong as well.

@123David-yang

I think it's wrong too.
To check whether it is really a bug or just my own misunderstanding, I made two modifications and compared each with the original script provided by the author.

First, I disabled the training of team 2, keeping the rest of the code the same, and got the same result as the original script. This probably means that it is wrong to use the same params[0] to select the actions passed to set_action for both teams.

Second, I used different parameters for the two teams: params[0] for team 1 and params[1] for team 2. The result was that both teams moved close to the walls to avoid fighting each other. So it may not be a good idea to use the same strategy for different teams, even when they are given different initial parameters.

In conclusion, it should be params[0] for team 1 and params[1] for team 2. Even with this adjustment, though, it is still not a good example for demonstrating a multi-agent battle.
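
The first experiment above can be illustrated with a toy model. This is only a sketch under stated assumptions: `act` and `toy_train` are hypothetical placeholders for the book's team_step() and train(), the numbers are arbitrary, and team 1's training is left out to isolate the effect of team 2's.

```python
def act(param, obs):
    # Toy linear policy: the "action" is just the dot product of param and obs.
    return sum(p * o for p, o in zip(param, obs))

def toy_train(param, lr=0.5):
    # Placeholder for train(): returns a shifted copy of the parameter vector.
    return [p + lr for p in param]

def run(team2_index, train_team2):
    params = [[1.0, -1.0], [0.5, 0.5]]  # arbitrary initial vectors
    obs = [2.0, 1.0]                    # arbitrary observation
    if train_team2:
        params[1] = toy_train(params[1])
    # Team 2 acts with whichever vector team2_index selects.
    return act(params[team2_index], obs)

# Indexing as in the listing: team 2 acts with params[0].
listing_trained = run(0, True)
listing_frozen = run(0, False)

# Proposed indexing: team 2 acts with params[1].
fixed_trained = run(1, True)
fixed_frozen = run(1, False)

print(listing_trained == listing_frozen)  # training params[1] has no effect
print(fixed_trained == fixed_frozen)      # training params[1] changes the action
```

In the toy, updating params[1] cannot influence team 2's actions while they are computed from params[0], which is consistent with seeing no difference after disabling team 2's training.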
