Question on freezing target network #15

Open
hashbangCoder opened this issue Apr 20, 2016 · 8 comments

Comments

@hashbangCoder

Hi @yenchenlin1994, love your implementation!
I went through your code and I can't seem to find where you've frozen the target network.
Unless I'm missing something in my excess-caffeine-induced brain fade, you continue to update the target every batch?
Wouldn't that hurt your convergence rate badly?
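
For readers new to the term, here is a minimal sketch of what "freezing the target network" usually means in DQN: the Bellman targets are computed from a copy of the parameters that is only refreshed every so many steps, instead of from the weights being trained. It is pure Python and not taken from this repo; `TargetSync`, `td_target`, `period`, and the toy parameter dict are all hypothetical names chosen for illustration.

```python
import copy


class TargetSync:
    """Holds a frozen copy of the online parameters, refreshed every `period` steps."""

    def __init__(self, online_params, period):
        self.period = period
        self.step = 0
        # Deep copy so later in-place updates to the online parameters do not leak in.
        self.target_params = copy.deepcopy(online_params)

    def after_update(self, online_params):
        """Call once per training step, after the online parameters were updated.

        Returns True on the steps where the frozen copy was refreshed.
        """
        self.step += 1
        if self.step % self.period == 0:
            self.target_params = copy.deepcopy(online_params)
            return True
        return False


def td_target(reward, next_q_values, terminal, gamma=0.99):
    """Bellman target y = r + gamma * max_a' Q_target(s', a').

    `next_q_values` are assumed to come from the frozen copy, not the online network.
    """
    if terminal:
        return reward
    return reward + gamma * max(next_q_values)


# Toy run: the "gradient step" is just an increment of one weight.
online = {"w": [0.0, 0.0]}
sync = TargetSync(online, period=3)
for _ in range(5):
    online["w"][0] += 1.0  # stand-in for one optimizer update
    sync.after_update(online)
```

After the five toy updates, the frozen copy still holds the weights from step 3 while the online weights have moved on. Updating the target every batch corresponds to `period=1` in this sketch.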

@yenchenlin
Owner

Hello,
Yeah, you are right.
Actually, I have a reimplemented version.
Will submit it soon!

@hashbangCoder
Author

Hi again,
I'm trying to reproduce the results in Keras. I have trained for 400,000 steps, and the bird is still unable to cross the first pipe consistently. My loss is low, though (0.2), and the Q-values are in the range [0, 8]. How long did it take before yours actually started working, i.e. crossing the first pipe consistently?

@yenchenlin
Owner

I can't remember the exact number of iterations, but it was no more than ~1,000,000 steps.

@xiahouzuoxin

I still cannot find where the target network is frozen in the current version of the code. Does it really have no effect?

@zsy372901

@hashbangCoder
I ran into the same problem: the silly bird stays at the top of the screen. Did you fix it?

@weijinsong

I also couldn't find the code that freezes the target network. But thanks for your code; it has been helpful to me.

@initial-h

I wrote a version based on this repo with a frozen target network: FlappyBird_DQN_with_target_network

@patrick-llgc

patrick-llgc commented Jan 29, 2019

Here is another repo with a target network: https://github.com/patrick-12sigma/DRL_FlappyBird

I made the target network an option. You can turn it on and off and experiment to see how much it affects the convergence of training.
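
As a rough illustration of what such a switch can look like (a sketch only, not the code from the linked repo), the flag simply decides which set of parameters produces the next-state Q-values used for bootstrapping. The names `linear_q`, `next_state_values`, and `use_target_network` are hypothetical, and the toy Q-function stands in for a real network.

```python
def linear_q(params, state):
    """Toy Q-function with one weight per action: Q(s, a) = w_a * s."""
    return [w * state for w in params]


def next_state_values(online_params, target_params, next_state, use_target_network=True):
    """Return Q(s', .) from the frozen copy if the option is on, else from the online weights."""
    params = target_params if use_target_network else online_params
    return linear_q(params, next_state)


online = [1.0, 3.0]  # weights currently being trained
frozen = [1.0, 2.0]  # older snapshot kept as the target network

with_target = next_state_values(online, frozen, 2.0, use_target_network=True)
without_target = next_state_values(online, frozen, 2.0, use_target_network=False)
```

With the flag off, the bootstrap values track every change to the online weights, which is the "update the target every batch" behaviour discussed at the top of this thread.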

I refactored the network into a class and added some logging functionality to track the training process. I also borrowed the human play function from @initial-h. Thanks!
