Added array-like min and max actions #29

beelerchris · 2020-09-02T17:35:33Z

Added a minimum action variable, used it as the lower bound for action clipping, and updated the range calculations that depend on it.
Changed the minimum and maximum action variables from single floats to array-like values. torch.clamp doesn't support tensor min and max inputs, so torch.min and torch.max are used for clipping instead.

The changes have been tested on Pendulum-v0 and some custom environments with array-like action spaces to ensure clipping and minimum action are handled properly.
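
As a rough illustration of the clipping described above, here is a minimal sketch of element-wise clamping with tensor bounds built from torch.max and torch.min. The helper name `clamp_to_bounds` and the example values are assumptions for illustration only and are not taken from this PR's code.

```python
import torch


def clamp_to_bounds(action, min_action, max_action):
    # Hypothetical helper: element-wise clip with tensor bounds.
    # torch.max enforces the lower bound, torch.min the upper bound.
    # The bounds broadcast against a batch of actions.
    return torch.min(torch.max(action, min_action), max_action)


# Example bounds for a 2-dimensional action space (made-up values).
min_action = torch.tensor([-2.0, 0.0])
max_action = torch.tensor([2.0, 1.0])

# A batch of two actions, each with one component out of range.
action = torch.tensor([[-3.0, 0.5], [1.0, 4.0]])
clipped = clamp_to_bounds(action, min_action, max_action)
print(clipped)
```

Each action dimension is clipped against its own bound, which a single scalar min/max cannot express.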

wangjunyi9999 · 2022-09-08T11:03:09Z

TD3.py



 	def forward(self, state):
 		a = F.relu(self.l1(state))
 		a = F.relu(self.l2(a))
-		return self.max_action * torch.tanh(self.l3(a))
+		return (self.max_action - self.min_action) * ((torch.tanh(self.l3(a)) + 1) / 2) + self.min_action


Hi, may I ask why you made this change? I tested your code, and compared with the original version the performance is worse and the output actions look very strange.
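
For reference, the two output mappings in the diff above can be compared with a small sketch. It uses plain Python floats and `math.tanh` in place of the network's final layer output; the function names `scale_original` and `scale_with_bounds` are hypothetical. Under the assumption that min_action equals -max_action, the new expression reduces algebraically to the original one; with asymmetric bounds the two differ. This only illustrates the arithmetic and does not by itself explain the performance difference reported here.

```python
import math


def scale_original(z, max_action):
    # Original mapping: tanh output in (-1, 1) scaled to (-max_action, max_action).
    return max_action * math.tanh(z)


def scale_with_bounds(z, min_action, max_action):
    # Mapping from the diff: shift tanh to (0, 1), then stretch to (min_action, max_action).
    return (max_action - min_action) * ((math.tanh(z) + 1) / 2) + min_action


# Symmetric bounds: both mappings agree up to floating-point rounding.
for z in (-2.0, -0.3, 0.0, 0.7, 3.0):
    a = scale_original(z, 2.0)
    b = scale_with_bounds(z, -2.0, 2.0)
    print(z, a, b)

# Asymmetric bounds: a pre-activation of 0 maps to the midpoint of the range.
midpoint = scale_with_bounds(0.0, 0.0, 1.0)
print(midpoint)
```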

Added array-like min and max actions

8da3322

beelerchris mentioned this pull request Sep 2, 2020

Minimum and Array-like Actions #28

Closed

wangjunyi9999 reviewed Sep 8, 2022
