Conservative Q-Learning for Offline Reinforcement Learning
We develop a conservative Q-learning (CQL) algorithm, such that the expected value of a policy under the learned Q-function lower-bounds its true value. A lower bound on the Q-value prevents the over-estimation that is common in offline RL settings due to out-of-distribution (OOD) actions and function approximation error. We start by focusing on the policy evaluation step of CQL, which can be used by itself as an off-policy evaluation procedure, or integrated into a complete offline RL algorithm, as sketched below.
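To make this evaluation step concrete, the following is a minimal tabular sketch of one conservative policy-evaluation update: in addition to the usual Bellman-error term, the Q-values of actions drawn from the policy being evaluated are pushed down, while the Q-values of state-action pairs that actually appear in the dataset are pushed up, so that OOD actions cannot inherit erroneously high values. The function name conservative_evaluation_step, the gradient-based tabular update, and the choice of penalty distribution (set here to the evaluation policy itself) are illustrative assumptions, not the paper's exact formulation.

import numpy as np

def conservative_evaluation_step(Q, dataset, policy, alpha=1.0, gamma=0.99, lr=0.1):
    """One gradient step on a conservative policy-evaluation objective (illustrative sketch).

    Q       : (n_states, n_actions) array of current Q-value estimates.
    dataset : list of (s, a, r, s_next) transitions collected by a behavior policy.
    policy  : (n_states, n_actions) array, pi(a|s) of the policy being evaluated.
    alpha   : weight of the conservative penalty.
    """
    grad = np.zeros_like(Q, dtype=float)
    Q_target = Q.copy()  # fixed target Q^k used for the Bellman backup

    for (s, a, r, s_next) in dataset:
        # Bellman-error term: 1/2 * (Q(s,a) - (r + gamma * E_{a'~pi}[Q^k(s',a')]))^2
        backup = r + gamma * np.dot(policy[s_next], Q_target[s_next])
        grad[s, a] += Q[s, a] - backup

        # Conservative penalty: push down Q under the evaluation policy at states in the data,
        # and push up Q at the (s, a) pairs that actually appear in the dataset.
        grad[s] += alpha * policy[s]   # gradient of alpha * E_{a~pi(.|s)}[Q(s,a)]
        grad[s, a] -= alpha            # gradient of -alpha * Q(s,a) for the data term

    # Average the gradient over the dataset and take one descent step.
    return Q - lr * grad / max(len(dataset), 1)

Repeating this update to convergence (with the target periodically refreshed) yields a Q-function whose expected value under the evaluation policy is biased downward relative to the ordinary Bellman fixed point, which is the intended conservative behavior.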