module moplayground.envs.generic.mobase
class MultiObjectiveBase
Base environment that emits a vector-valued reward.
Extends minimal_mjx.envs.generic.base.SwappableBase so concrete environments can be constructed against either NumPy or JAX backends. Per-step reward components are bucketed into objective groups defined by env_params.reward.optimization.objectives (one entry per output dimension of the reward vector) and an optional set of shared_objectives that are added to every dimension.
Args:
xml_path: Path to the MuJoCo XML model.
env_params: ConfigDict with at least reward.weights, reward.optimization.objectives (list of lists of reward keys, one list per objective dimension), and reward.optimization.shared_objectives (list of reward keys added to every objective).
backend: 'jnp' for JAX (training), 'np' for NumPy (eval).
num_free: Number of free joints in the model. Forwarded to SwappableBase.
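A minimal sketch of the expected env_params layout, assuming hypothetical reward keys ('forward', 'energy', 'upright'); only the structure mirrors the fields documented above:

```python
import ml_collections

# Hypothetical reward keys; only the structure follows the documented fields.
env_params = ml_collections.ConfigDict()
env_params.reward = ml_collections.ConfigDict()
env_params.reward.weights = {'forward': 1.0, 'energy': 0.1, 'upright': 0.5}
env_params.reward.optimization = ml_collections.ConfigDict()
# One list of reward keys per output dimension of the reward vector.
env_params.reward.optimization.objectives = [['forward'], ['energy']]
# Reward keys added to every objective dimension.
env_params.reward.optimization.shared_objectives = ['upright']
```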
method MultiObjectiveBase.__init__
__init__(
xml_path: pathlib.Path,
env_params: ml_collections.config_dict.config_dict.ConfigDict,
backend: str = 'jnp',
num_free: int = 3
)
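An illustrative construction sketch using the env_params from the snippet above; the XML path is a placeholder, and concrete setups would typically subclass MultiObjectiveBase rather than instantiate it directly:

```python
from pathlib import Path

from moplayground.envs.generic.mobase import MultiObjectiveBase

# 'assets/robot.xml' is a placeholder path; env_params is the ConfigDict
# sketched above.
train_env = MultiObjectiveBase(
    xml_path=Path('assets/robot.xml'),
    env_params=env_params,
    backend='jnp',  # JAX backend for training
)
eval_env = MultiObjectiveBase(
    xml_path=Path('assets/robot.xml'),
    env_params=env_params,
    backend='np',   # NumPy backend for eval
)
```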
property MultiObjectiveBase.action_size
Required action size for the environment.
property MultiObjectiveBase.dt
Control timestep for the environment.
property MultiObjectiveBase.mj_model
The underlying MuJoCo model (mujoco.MjModel).
property MultiObjectiveBase.mjx_model
The MJX model used by the JAX backend.
property MultiObjectiveBase.n_substeps
Number of sim steps per control step.
property MultiObjectiveBase.observation_size
Observation size for the environment.
property MultiObjectiveBase.sim_dt
Simulation timestep for the environment.
property MultiObjectiveBase.unwrapped
The innermost environment, with any wrappers removed.
property MultiObjectiveBase.xml_path
Path to the MuJoCo XML model.
method MultiObjectiveBase.get_reward_and_metrics
get_reward_and_metrics(
rewards: dict[str, jax.Array],
metrics: dict
) → tuple[jax.Array, dict[str, jax.Array]]
Combine per-key rewards into a vector reward plus updated metrics.
Each entry of self.objectives maps to one component of the returned reward vector, computed as a weighted sum of the listed per-key rewards. Shared objectives are then added to every component.
Args:
rewards: Mapping from reward key to scalar reward for the current step.
metrics: Existing metrics dict to extend.
Returns:
(reward, metrics) where reward has shape (len(self.objectives),) and metrics is the updated metrics dict.
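A sketch of the combination logic described above. self.objectives is named in this reference; self.weights and self.shared_objectives are inferred from the Args section, and the metric naming is an assumption:

```python
import jax.numpy as jnp

def get_reward_and_metrics(self, rewards, metrics):
    # One weighted sum of per-key rewards per objective dimension.
    components = [
        sum(self.weights[k] * rewards[k] for k in group)
        for group in self.objectives
    ]
    # Shared objectives contribute to every dimension (weighting assumed).
    shared = sum(self.weights[k] * rewards[k] for k in self.shared_objectives)
    reward = jnp.stack([c + shared for c in components])
    # Record per-key rewards in metrics; exact metric naming is an assumption.
    metrics = {**metrics, **rewards}
    return reward, metrics
```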
class Multi2SingleObjective
Wrap a multi-objective env to expose a scalar reward.
Replaces the vector reward from the wrapped environment with the inner product reward · weighting, so the wrapped env can be plugged into standard single-objective PPO. All other attributes/methods are delegated to the underlying env via __getattr__.
Args:
env: A MultiObjectiveBase (or compatible) environment whose reset/step return states with vector rewards.
weighting: Per-objective weights; length must match the env's reward dimension.
method Multi2SingleObjective.__init__
__init__(env, weighting)
method Multi2SingleObjective.reset
reset(rng)
Reset the wrapped environment and scalarize its vector reward.
method Multi2SingleObjective.step
step(state, action)
Step the wrapped environment and scalarize its vector reward via the inner product with weighting.
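A hedged usage sketch, assuming a concrete MultiObjectiveBase subclass named MyMOEnv (a placeholder) with a two-dimensional reward vector and a brax-style state:

```python
import jax
import jax.numpy as jnp

from moplayground.envs.generic.mobase import Multi2SingleObjective

# MyMOEnv is a placeholder for a concrete MultiObjectiveBase subclass
# whose reward vector has two dimensions.
env = Multi2SingleObjective(MyMOEnv(), weighting=jnp.array([0.7, 0.3]))

state = env.reset(jax.random.PRNGKey(0))             # reward scalarized on reset
state = env.step(state, jnp.zeros(env.action_size))  # action_size delegated via __getattr__
# Assuming a brax-style State, state.reward is now the scalar inner product
# reward · weighting, ready for standard single-objective PPO.
```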