Solution: ValueError: object too deep for desired array #5

Yizheng-Sun · 2022-11-24T13:24:25Z

The function softmax_sample should be changed to:

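import numpy

def softmax_sample(distribution, temperature: float):
  # distribution is a sequence of (visit count, action index) pairs.
  if temperature == 0:
    temperature = 1  # avoid dividing by zero in the exponent below
  distribution = numpy.array(distribution, dtype=float)
  # Apply the temperature to the visit counts (column 0) only,
  # so the action indices in column 1 are left unchanged.
  visits = distribution[:,0]**(1/temperature)
  p_sum = visits.sum()
  sample_temp = visits/p_sum
  # Draw one one-hot sample, then look up the action index of the chosen row.
  action = distribution[int(numpy.argmax(numpy.random.multinomial(1, sample_temp, 1)))][1]
  return 0, int(action)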

This is needed because distribution is a 2D array in which every row holds two values, like this:
[[ 0. 0.]
[ 4. 1.]
[ 1. 2.]
[ 0. 3.]
[ 0. 4.]
[ 0. 5.]
[ 0. 6.]
[ 0. 7.]
[ 0. 8.]
[ 0. 9.]
[ 0. 10.]
[ 0. 11.]........
The first value in each row is the visit count and the second value is the action index.
p_sum must be computed from the visit counts only, so we use distribution[:,0].
When choosing the action we must return the second value of the sampled row, so we use
distribution[int(numpy.argmax(numpy.random.multinomial(1, sample_temp, 1)))][1]
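As a rough illustration of the indexing described above, here is a minimal sketch using a small made-up array in the same (visit count, action index) layout. The values and the variable names visit_counts, actions, and probabilities are assumptions for the example, not taken from the repository. The point is that numpy.random.multinomial wants a 1D probability vector, which is presumably why passing the full 2D array raised the error.

```python
import numpy

# Hypothetical data: each row is (visit count, action index).
distribution = numpy.array([[0., 0.],
                            [4., 1.],
                            [1., 2.],
                            [0., 3.]])

visit_counts = distribution[:, 0]  # column 0: visit counts, shape (4,)
actions = distribution[:, 1]       # column 1: action indices, shape (4,)

# Normalise the 1D visit counts into probabilities.
probabilities = visit_counts / visit_counts.sum()

# One draw with a single trial gives a one-hot array of shape (1, 4).
one_hot = numpy.random.multinomial(1, probabilities, 1)

# argmax finds the sampled row; the action index is read from column 1.
row = int(numpy.argmax(one_hot))
action = int(actions[row])
print(probabilities, action)
```

With these counts only actions 1 and 2 can be drawn, since rows with zero visits get zero probability.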
