PPO issue #3

Open
AI-HM opened this issue Oct 8, 2023 · 4 comments

AI-HM commented Oct 8, 2023

Hello, and thank you for your work!

When I run the PPO algorithm, line 337 of PPO/model.py, "state = t.from_numpy(state).to(self.device)", raises an error:
TypeError: expected np.ndarray (got tuple)

What is causing this? Thank you.

Phoenix-Shen (Owner) commented Oct 8, 2023

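Root cause -> Gym version issue:

This happens because, after the gym version update, executing

state = env.step(action)

no longer returns the state itself but a tuple, which makes the following line fail:

state = t.from_numpy(state).to(self.device)

Solutions:

  1. Create a fresh environment directly from the environment.txt I provided, which pins the gym version.
  2. Modify the code:
    state = state[0]
    state = t.from_numpy(state).to(self.device)
    That should work; I have run into this problem myself.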
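
As a hedged illustration of solution 2, the sketch below wraps the reset result so the same code works on both older and newer Gym releases. The helper name `unwrap_reset` is hypothetical and not part of this repository or of Gym. It assumes that newer Gym versions (0.26 and later) return an `(observation, info)` pair from `env.reset()`, with `info` being a dict.

```python
def unwrap_reset(result):
    """Return only the observation from the value produced by env.reset().

    Hypothetical helper, not part of Gym or of this repository.
    Assumption: newer Gym returns (observation, info) with info a dict,
    while older Gym returns the observation alone.
    """
    is_new_api = (
        isinstance(result, tuple)
        and len(result) == 2
        and isinstance(result[1], dict)
    )
    # New API: drop the info dict. Old API: pass the observation through.
    return result[0] if is_new_api else result


# Intended use inside the training loop (env and t come from the project):
#   state = unwrap_reset(env.reset())
#   state = t.from_numpy(state).to(self.device)
```

The check on the second element being a dict is only a heuristic; an old-style environment whose observation is itself a two-element tuple ending in a dict would be misdetected.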

Sorry about this. I haven't been working on RL lately and have moved on to federated learning, so this repository has not been updated.

AI-HM (Author) commented Oct 9, 2023

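Thank you for your answer!
I tried both of the solutions you suggested.
Solution 1 fails with: AttributeError: 'EntryPoints' object has no attribute 'get' (I replaced the whole environment according to requirements.txt)
Solution 2 fails at the line next_state, reward, done, _ = env.step(action.item()) with: ValueError: too many values to unpack (expected 4)

Thanks again for your reply!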

Phoenix-Shen (Owner) commented

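I'm not sure why the first solution fails; probably some library is misbehaving, which is a hassle to track down.

The second solution fails for this reason:

Since the Gym library changed, the return values of env.reset() and env.step() each gained an extra element. So this line should also be changed to:

# Add one more ", _" to receive 5 values
next_state, reward, done, _, _ = env.step(action.item())

This avoids the "too many values to unpack" error: env.step() now returns 5 values, but the original code unpacked them into 4 variables, hence the error.

Because of the Gym update, every piece of code that calls env.step() or env.reset() should be modified:

  • state = env.reset() becomes state, _ = env.reset()
  • next_state, reward, done, _ = env.step(action.item()) becomes next_state, reward, done, _, _ = env.step(action.item())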
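
For readers who want one code path for both Gym versions, here is a minimal sketch of the same fix as a small adapter. The name `unwrap_step` is hypothetical. It assumes that newer Gym (0.26 and later) returns `(observation, reward, terminated, truncated, info)` from `env.step()`, while older versions return `(observation, reward, done, info)`. Unlike the one-line fix above, which discards the fourth value, this sketch treats an episode as done when it was either terminated or truncated; whether that is the right choice for PPO here is a judgment call.

```python
def unwrap_step(result):
    """Normalize the value returned by env.step() to (state, reward, done, info).

    Hypothetical helper, not part of Gym or of this repository.
    Assumption: a 5-tuple is the newer Gym layout
    (observation, reward, terminated, truncated, info) and a
    4-tuple is the older (observation, reward, done, info).
    """
    if len(result) == 5:
        next_state, reward, terminated, truncated, info = result
        # Fold both episode-ending flags into a single done flag.
        return next_state, reward, terminated or truncated, info
    next_state, reward, done, info = result
    return next_state, reward, done, info


# Intended use inside the training loop (env and action come from the project):
#   next_state, reward, done, _ = unwrap_step(env.step(action.item()))
```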

AI-HM (Author) commented Oct 9, 2023

Thanks! It runs now.
