Blog posts

2023

Conservative Q-Learning for Offline Reinforcement Learning

We develop a conservative Q-learning (CQL) algorithm such that the expected value of a policy under the learned Q-function lower-bounds its true value. A lower bound on the Q-value prevents the overestimation that is common in offline RL settings, where out-of-distribution (OOD) actions and function approximation error inflate value estimates. We start by focusing on the policy evaluation step in CQL, which can be used by itself as an off-policy evaluation procedure or integrated into a complete offline RL algorithm.
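To make the policy evaluation step concrete, here is a minimal sketch of a conservative evaluation objective for discrete actions. It assumes the common formulation in which Q-values are pushed down under an action distribution `policy_probs` and pushed up on the actions actually present in the dataset, alongside a standard squared Bellman error. The function name, argument names, and the weight `alpha` are illustrative assumptions, not taken from the post.

```python
import numpy as np

def cql_evaluation_loss(q_values, data_actions, td_targets, policy_probs, alpha=1.0):
    """Illustrative conservative policy-evaluation loss (hypothetical helper).

    q_values:     (batch, n_actions) Q(s, a) for each dataset state
    data_actions: (batch,) integer actions taken in the dataset
    td_targets:   (batch,) Bellman backup targets, treated as constants
    policy_probs: (batch, n_actions) action distribution to be conservative against
    alpha:        weight on the conservative term (assumed hyperparameter)
    """
    rows = np.arange(q_values.shape[0])
    # Q-values of the actions actually observed in the dataset.
    q_data = q_values[rows, data_actions]
    # Expected Q-value under the evaluated action distribution.
    q_policy = np.sum(policy_probs * q_values, axis=1)
    # Minimizing this gap lowers Q under the policy and raises it on dataset actions.
    conservative_gap = np.mean(q_policy - q_data)
    # Standard squared Bellman error on dataset transitions.
    bellman_error = 0.5 * np.mean((q_data - td_targets) ** 2)
    return alpha * conservative_gap + bellman_error
```

Minimizing this loss with respect to the Q-function parameters trades off fitting the Bellman targets against keeping values low on actions the dataset does not support; a larger `alpha` gives a more conservative estimate.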

2020

Building an Academic Website

If you’re an academic, you need a website. Obviously I agree, since you’re reading this on mine, so if you don’t have one, you should get one. Most universities these days provide a free option, usually powered by WordPress (both WashU and UNC use WordPress for their respective offerings). While these sites are quick to set up and come with the prestige of a .edu URL, they have several drawbacks that have been written about extensively.