Start Date

13-5-2021 2:10 PM

End Date

13-5-2021 2:30 PM

Document Type

Full Paper

Keywords

Reinforcement learning (RL), proximal policy optimization (PPO), advantage actor-critic (A2C), MiniGrid navigation, neural networks

Description

This paper presents extended experiments on agent-environment interactions for two recent reinforcement learning (RL) algorithms, Proximal Policy Optimization (PPO) and Advantage Actor-Critic (A2C). In addition to evaluating accumulated rewards in the MiniGrid environment, this work also assesses the learned value tables and other quantities. The experiment platform is extended with mazes of arbitrary shape, a customized training scheme, and results averaged over different initial seeds. The key RL formulas and hyperparameters are discussed explicitly in the context of the MiniGrid settings. Based on the obtained results, the conclusion summarizes the strengths and weaknesses of neural-network implementations of RL methods. This is an undergraduate project from summer 2020.
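
For illustration, the sketch below shows one way such a PPO/A2C experiment on MiniGrid could be set up. It assumes the gym-minigrid environment suite and the Stable-Baselines3 implementations of PPO and A2C; the environment name, hyperparameters, and seed loop are illustrative placeholders and not necessarily the configuration used in the paper.

```python
# Illustrative sketch only: assumes gym-minigrid and stable-baselines3,
# not the paper's actual codebase or hyperparameters.
import gym
import gym_minigrid  # noqa: F401  (registers the MiniGrid-* environments)
from gym_minigrid.wrappers import FlatObsWrapper
from stable_baselines3 import PPO, A2C


def train(algo_cls, env_id="MiniGrid-Empty-8x8-v0", seed=0, timesteps=100_000):
    # Flatten the dict observation so an MLP policy can consume it.
    env = FlatObsWrapper(gym.make(env_id))
    model = algo_cls(
        "MlpPolicy",
        env,
        learning_rate=3e-4,  # placeholder hyperparameters
        gamma=0.99,
        seed=seed,
        verbose=0,
    )
    model.learn(total_timesteps=timesteps)
    return model


# Run both algorithms and average results over several initial seeds,
# mirroring the seed-averaging described in the abstract.
for algo in (PPO, A2C):
    for seed in (0, 1, 2):
        train(algo, seed=seed)
```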

DOI

https://doi.org/10.5038/LZTZ6050


Experimental Evaluation of Proximal Policy Optimization and Advantage Actor-Critic RL Algorithms using MiniGrid Environment
