# Actor-critic algorithm

> The actor-critic algorithm (AC) is a family of reinforcement learning (RL) algorithms that combine policy-based RL algorithms such as policy gradient methods, and value-based RL algorithms such as value iteration, Q-learning, SARSA, and TD learning. An AC algorithm consists of two main components: an &#8220;actor&#8221; that determines which actions to take according to a policy [&hellip;]

The **actor-critic algorithm** (AC) is a family of reinforcement learning (RL) algorithms that combine policy-based RL algorithms such as policy gradient methods, and value-based RL algorithms such as value iteration, Q-learning, SARSA, and TD learning.

An AC algorithm consists of two main components: an “**actor**” that determines which actions to take according to a policy function, and a “**critic**” that evaluates those actions according to a value function. Some AC algorithms are on-policy, some are off-policy. Some apply to either continuous or discrete action spaces. Some work in both cases.

## Overview

The actor-critic methods can be understood as an improvement over pure policy gradient methods like REINFORCE via introducing a baseline.

### Actor

The **actor** uses a policy function


        π
        (
        a


…

*Source: [Wikipedia](https://en.wikipedia.org/wiki/Actor-critic_algorithm)*

---

## Metadata

- **URL:** https://wpsearchai.com/actor-critic-algorithm/
- **Published:** 2026-01-28T18:45:48+00:00
- **Modified:** 2026-01-28T18:45:48+00:00
- **Author:** admin
- **Categories:** Artificial intelligence