ETH Zurich & UC Berkeley Method Automates Deep Reward-Learning by Simulating the Past | Synced

Source: Synced | AI Technology & Industry Review

A research team from ETH Zurich and UC Berkeley proposes Deep Reward Learning by Simulating the Past (Deep RLSP), an algorithm that represents the reward directly as a linear combination of features learned through self-supervised representation learning. It also enables an agent to simulate human actions backwards in time to infer what the humans must have done.
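
To make the "linear combination of features" idea concrete, here is a minimal Python sketch. It is not the authors' implementation: the encoder `make_feature_fn` is a hypothetical stand-in (a fixed random projection followed by `tanh`) for features that would in practice come from self-supervised representation learning, and the weight vector `theta` is chosen by hand rather than inferred by simulating the past.

```python
import math
import random


def make_feature_fn(state_dim, feature_dim, seed=0):
    """Hypothetical stand-in for a learned encoder phi(s).

    Assumption: a fixed random projection followed by tanh replaces the
    self-supervised feature network described in the article.
    """
    rng = random.Random(seed)
    weights = [
        [rng.gauss(0.0, 1.0) for _ in range(state_dim)]
        for _ in range(feature_dim)
    ]

    def phi(state):
        # One tanh-squashed dot product per feature dimension.
        return [
            math.tanh(sum(w * s for w, s in zip(row, state)))
            for row in weights
        ]

    return phi


def linear_reward(theta, features):
    """Reward as a linear combination of features: r(s) = theta . phi(s)."""
    return sum(t * f for t, f in zip(theta, features))


phi = make_feature_fn(state_dim=4, feature_dim=3)
theta = [1.0, -0.5, 0.25]          # illustrative weights, not learned
state = [0.1, 0.0, -0.2, 0.3]      # illustrative state vector
features = phi(state)
reward = linear_reward(theta, features)
print(f"features={features}, reward={reward:.4f}")
```

Because the reward is linear in `theta`, learning a reward in this setup reduces to estimating one weight vector over a fixed feature space; how Deep RLSP estimates it is outside the scope of this sketch.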