
pyTorch #19

Open

dylanthomas opened this issue Mar 16, 2017 · 6 comments
@dylanthomas

Is there any plan on the horizon to port this code to PyTorch?

@ifrosio
Collaborator

ifrosio commented Mar 16, 2017

We are not planning to implement it for now, but some people have indeed suggested that PyTorch may be faster than TF. It would be great if someone could implement GA3C in PyTorch following our guidelines.

@etienne87

I did a quick trial in one of my branches. Actually, TF is almost twice as fast, probably because the naive way I vectorized the loss involves a lot of function calls. The same issue arises in the Chainer version. The loss takes almost as much time to compute as the CNN itself. I think it could run faster if implemented as a dedicated layer.

@ppwwyyxx

Just FYI, my friend was able to reproduce both the speed and performance of my a3c implementation with his pytorch code.
It batches data differently from GA3C, but the overall structure is similar.

@etienne87

Interesting, @ppwwyyxx!
My naive implementation gives something like this:

results.txt

I am not sure whether the problem is in the batching, or rather in the explicit calls and the many computation steps for the loss.

        # p: policy logits, v: value predictions for the whole rollout
        p, v = self.model.forward_multistep(x_, c, h)
        probs = F.softmax(p)
        probs = probs.clamp(min=Config.LOG_EPSILON)  # clamp away from zero so the log stays finite
                                                     # (relu(probs - eps) would give log(0) = -inf)
        log_probs = torch.log(probs)
        adv = rewards - v                            # advantage estimate
        adv = torch.masked_select(adv, mask)         # drop the padded steps
        log_probs_a = torch.masked_select(log_probs, a)  # we cannot use gather because of variable-length input
        piloss = -torch.sum(log_probs_a * Variable(adv.data), 0)  # adv.data detaches adv so the policy term
                                                                  # does not backprop into the value head
        entropy = torch.sum(torch.sum(log_probs * probs, 1), 0) * self.beta  # negative entropy, weighted by beta
        vloss = torch.sum(adv.pow(2), 0) / 2         # squared-error value loss
        loss = piloss + entropy + vloss

Does anyone know how to do this more quickly in PyTorch?
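A minimal sketch of a more fused version, assuming fixed-length (padded) rollouts so that a single `log_softmax` + `gather` and a step mask can replace the separate `log`, `relu`, and `masked_select` calls. The function name `a3c_loss`, the tensor shapes, and the `beta` default are assumptions for illustration, not part of the thread's code:

```python
import torch
import torch.nn.functional as F

def a3c_loss(logits, values, rewards, actions, mask, beta=0.01):
    """Hypothetical fused A3C loss for padded rollouts.

    logits:  (T, A) policy logits, values/rewards/mask: (T,), actions: (T,) long.
    mask is 1.0 for valid steps, 0.0 for padding.
    """
    # log_softmax is numerically stable, so no epsilon clamp is needed
    log_probs = F.log_softmax(logits, dim=1)
    probs = log_probs.exp()

    adv = rewards - values                    # advantage estimate

    # log-probability of the taken action, one gather call instead of masked_select
    log_probs_a = log_probs.gather(1, actions.unsqueeze(1)).squeeze(1)

    # multiply by the mask instead of selecting, so shapes stay fixed;
    # detach adv so the policy term does not backprop into the value head
    piloss = -(log_probs_a * adv.detach() * mask).sum()
    entropy = (probs * log_probs).sum(dim=1).mul(mask).sum() * beta  # negative entropy
    vloss = 0.5 * (adv * mask).pow(2).sum()
    return piloss + entropy + vloss

# toy usage with uniform logits and all steps valid
logits = torch.zeros(4, 3, requires_grad=True)
loss = a3c_loss(logits,
                values=torch.zeros(4),
                rewards=torch.ones(4),
                actions=torch.tensor([0, 1, 2, 0]),
                mask=torch.ones(4))
loss.backward()
```

Keeping the shapes fixed and masking instead of selecting lets the whole loss run as a handful of kernel launches, which is where the naive per-call version above loses time.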

@dylanthomas
Author

@ppwwyyxx Is there a public Git repo for your friend's PyTorch implementation?

@ppwwyyxx

Unfortunately, no.
