Hey @Matyyas. Glad to hear you've found the repo useful. To be honest, it's been so long since I touched this repository that I can't recall exactly what my thinking was there. But it's likely that I was not aware of this difference, and had I been, I probably would have implemented it differently 🙂 Nice to hear you caught this though! Let me know what the difference is if you end up trying out both ways.
Hi @hartikainen,
Thank you for the super cool repo 👍
I have one question regarding the Sarsa agent implementation. In the official pseudo-algorithm of Sarsa(λ) (slide 29), the Q-values and the eligibility traces are updated at each step for every state-action pair of the environment.
If I understood your code correctly, it seems to me that you only update the state-action pair of the current step.
Did you write your implementation this way knowing about this difference?
Thanks a lot @hartikainen
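For reference, the difference raised above can be sketched roughly as follows. This is a minimal illustration of backward-view Sarsa(λ) with accumulating traces, not the repository's actual code: the function names, the dict-based `Q` and `E` tables, the `update_all` flag, and the toy two-step episode are all assumptions made for the example. With `update_all=True`, every pair with a non-zero trace is updated at each step; with `update_all=False`, only the current pair is touched.

```python
from collections import defaultdict


def sarsa_lambda_step(Q, E, s, a, r, s_next, a_next,
                      alpha=0.5, gamma=1.0, lam=0.5, update_all=True):
    """One Sarsa(lambda) step. Q and E are dicts keyed by (state, action).

    All names and default values are illustrative assumptions.
    """
    # TD error for the transition (s, a) -> (s_next, a_next).
    delta = r + gamma * Q[(s_next, a_next)] - Q[(s, a)]
    # Accumulating trace for the pair just visited.
    E[(s, a)] += 1.0
    if update_all:
        # Pseudocode variant: update every pair. Pairs that were never
        # visited have a zero trace, so looping over E's keys is equivalent.
        for key in list(E):
            Q[key] += alpha * delta * E[key]
            E[key] *= gamma * lam
    else:
        # Hypothetical "current pair only" variant.
        Q[(s, a)] += alpha * delta * E[(s, a)]
        E[(s, a)] *= gamma * lam
    return delta


def run_episode(update_all):
    """Toy episode: (s0, a0) -> reward 0 -> (s1, a1) -> reward 1 -> terminal."""
    Q, E = defaultdict(float), defaultdict(float)
    sarsa_lambda_step(Q, E, "s0", "a0", 0.0, "s1", "a1", update_all=update_all)
    sarsa_lambda_step(Q, E, "s1", "a1", 1.0, "T", None, update_all=update_all)
    return Q


q_all = run_episode(update_all=True)
q_cur = run_episode(update_all=False)
print(q_all[("s0", "a0")], q_cur[("s0", "a0")])
```

In this toy episode, the all-pairs variant propagates the final reward back to the earlier pair `("s0", "a0")` through its decayed trace, while the current-pair variant leaves that value unchanged, so it behaves much like one-step Sarsa.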