Environments
Multi Cartpole
Ver. 1.3.6 (2025-08-03)
This module provides an environment with multivariate state and action spaces based on the OpenAI Gym environment ‘CartPole-v1’.
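As a rough illustration of what "multivariate state and action spaces" means here, the sketch below joins several independent sub-environments into one by concatenating their state vectors and feeding each sub-environment one component of a joint action. `ToyCart` and `ToyMultiCart` are hypothetical stand-ins with toy dynamics. They are not MLPro or Gymnasium API and do not reproduce the CartPole physics. Only the four-value sub-state layout and the parameter name `p_num_envs` mirror the real environment.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ToyCart:
    """Hypothetical stand-in for one CartPole sub-environment (not real physics)."""
    state: List[float] = field(default_factory=lambda: [0.0, 0.0, 0.0, 0.0])

    def step(self, action: float) -> List[float]:
        # Toy dynamics: the action nudges the cart velocity, the position follows.
        self.state[1] += 0.1 * action
        self.state[0] += self.state[1]
        return self.state


class ToyMultiCart:
    """Joins p_num_envs sub-environments into one multivariate environment."""

    def __init__(self, p_num_envs: int = 2):
        self.envs = [ToyCart() for _ in range(p_num_envs)]

    def collect_substates(self) -> List[float]:
        # Joint state = concatenation of all sub-states (4 values per cart).
        return [value for env in self.envs for value in env.state]

    def step(self, actions: List[float]) -> List[float]:
        # Joint action = one action component per sub-environment.
        if len(actions) != len(self.envs):
            raise ValueError("expected one action per sub-environment")
        for env, action in zip(self.envs, actions):
            env.step(action)
        return self.collect_substates()


multi = ToyMultiCart(p_num_envs=3)
joint_state = multi.step([1.0, 0.0, -1.0])
print(len(joint_state))  # 12 state dimensions: 3 carts x 4 values each
```

The dimensionality of both spaces therefore grows linearly with the number of sub-environments, which is what makes the setup useful for testing multi-agent or multivariate policies.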
- class mlpro_int_gymnasium.envs.multicartpole.MultiCartPole(p_num_envs=2, p_reward_type=0, p_visualize: bool = True, p_logging=True)
Bases: Environment
This environment provides multivariate state and action spaces by duplicating the OpenAI Gym environment ‘CartPole-v1’. The number of internal CartPole sub-environments can be parameterized.
- C_NAME = 'MultiCartPole'
- C_LATENCY = datetime.timedelta(seconds=1)
- C_INFINITY = np.float32(3.4028235e+38)
- C_PLOT_ACTIVE: bool = True
- _init_cartpoles()
- static load(p_path, p_filename)
Static method to load an object of the current class from a file using pickle/dill. While the given file is unpickled, the standard method __setstate__() is called. It is implemented specifically to call the MLPro custom method _complete_state(), which allows the unpickled object to be completed from further externally stored data.
- Parameters:
p_path (str) – Path where the file to be loaded is located
p_filename (str = None) – File name (if None an internal filename will be used)
- Returns:
Object of the given class that was unpickled from the given file.
- Return type:
Object
- _save(p_path, p_filename) bool
- switch_logging(p_logging)
- static setup_spaces()
Static template method to set up and return state and action space of environment.
- Returns:
state_space (MSpace) – State space object
action_space (MSpace) – Action space object
- _setup_spaces()
- collect_substates() State
- get_cycle_limit()
Returns limit of cycles per training episode.
- _reset(p_seed=None)
Custom method to reset the system to an initial/defined state. Use method _set_status() to set the state.
- Parameters:
p_seed (int) – Seed parameter for an internal random generator
- simulate_reaction(p_state: State, p_action: Action) State
Simulates a state transition based on a state and an action. The simulation step itself is carried out either by an internal custom implementation in method _simulate_reaction() or by an embedded external function.
- Parameters:
p_state (State) – Current state.
p_action (Action) – Action.
- Returns:
Subsequent state after transition
- Return type:
State
- _compute_reward(p_state_old: State, p_state_new: State) Reward
Custom reward method. See method compute_reward() for further details.
- compute_success(p_state: State) bool
Assesses whether the given state is a ‘success’ state. The assessment is carried out either by a custom implementation in method _compute_success() or by an embedded external function.
- Parameters:
p_state (State) – State to be assessed.
- Returns:
success – True, if the given state is a ‘success’ state. False otherwise.
- Return type:
bool
- compute_broken(p_state: State) bool
Assesses whether the given state is a ‘broken’ state. The assessment is carried out either by a custom implementation in method _compute_broken() or by an embedded external function.
- Parameters:
p_state (State) – State to be assessed.
- Returns:
broken – True, if the given state is a ‘broken’ state. False otherwise.
- Return type:
bool
- init_plot(p_figure: Figure = None, p_plot_settings: list = Ellipsis, p_plot_depth: int = 0, p_detail_level: int = 0, p_step_rate: int = 0, **p_kwargs)
Initializes the plot functionalities of the class.
- Parameters:
p_figure (matplotlib.figure.Figure, optional) – Optional Matplotlib host figure in which the plot is embedded. The default is None.
p_plot_settings (PlotSettings) – Optional plot settings. If None, the default view is plotted (see attribute C_PLOT_DEFAULT_VIEW).
- update_plot(**p_kwargs)
Updates the plot.
- Parameters:
**p_kwargs – Implementation-specific plot data and/or parameters.
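The load() description above relies on Python's standard unpickling hook. The following minimal sketch shows that mechanism using only the standard `pickle` module: `__setstate__()` restores the pickled attributes and then calls a completion hook. The class `Persistent`, its attributes, and the body of `_complete_state()` are hypothetical. Only the method names `__setstate__()` and `_complete_state()` come from the text, and MLPro's actual implementation may differ.

```python
import os
import pickle
import tempfile


class Persistent:
    """Hypothetical class; only the hook names mirror the description above."""

    def __init__(self, name: str):
        self.name = name
        self.runtime_handle = object()  # stands in for a resource that is not pickled

    def __getstate__(self):
        # Leave out the part of the object that should not go into the file.
        state = self.__dict__.copy()
        del state["runtime_handle"]
        return state

    def __setstate__(self, state):
        # Called by pickle during unpickling: restore attributes, then complete.
        self.__dict__.update(state)
        self._complete_state()

    def _complete_state(self):
        # Rebuild whatever was not stored in the pickle file.
        self.runtime_handle = object()

    def save(self, p_path: str, p_filename: str) -> bool:
        with open(os.path.join(p_path, p_filename), "wb") as f:
            pickle.dump(self, f)
        return True

    @staticmethod
    def load(p_path: str, p_filename: str):
        with open(os.path.join(p_path, p_filename), "rb") as f:
            return pickle.load(f)


with tempfile.TemporaryDirectory() as tmp:
    Persistent("demo").save(tmp, "demo.pkl")
    restored = Persistent.load(tmp, "demo.pkl")

print(restored.name)
```

Splitting the work this way keeps the pickle file small and lets the object re-create wrapped resources, such as embedded sub-environments, after loading.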
Rubiks Cube 2x2x2
Ver. 1.0.3 (2026-03-10)
This module provides a standardized MLPro environment for a 2x2x2 Rubik’s Cube based on the rubiks-cube-gym package (https://github.com/DoubleGremlin181/RubiksCubeGym).
The environment wraps the registered Gymnasium environment ‘rubiks-cube-222-v0’ using WrEnvGYM2MLPro and exposes it as a single-agent MLPro RL environment.
Observation space: Discrete(3674160) – index into the full state dictionary
Action space: Discrete(3) – moves F=0, R=1, U=2
Reward: defined by the reward strategy. Default: -1 per step, +100 on solve.
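The observation space size of 3674160 matches the known number of reachable 2x2x2 positions when one corner is held fixed, which is the case when only F, R and U moves are used. The other seven corners can be permuted in 7! ways, and six of them can be twisted freely in 3 ways each, with the twist of the seventh determined by the rest. A quick check of the arithmetic:

```python
import math

corner_permutations = math.factorial(7)  # 5040 arrangements of the 7 free corners
corner_twists = 3 ** 6                   # 729 orientations; the last twist is forced
num_states = corner_permutations * corner_twists

print(num_states)  # 3674160
```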
- class mlpro_int_gymnasium.envs.rubikscube2x2x2.RubiksCube222(p_shaped_reward: bool = False, p_visualize: bool = True, p_logging=True)
Bases: Environment
MLPro RL environment for the 2x2x2 Rubik’s Cube.
Wraps the Gymnasium environment ‘rubiks-cube-222-v0’ from the rubiks-cube-gym package using WrEnvGYM2MLPro.
- C_NAME = 'RubiksCube222'
- C_LATENCY = datetime.timedelta(seconds=1)
- C_PLOT_ACTIVE: bool = True
- _init_env()
Instantiates the gym env, optionally wraps it with ShapedRewardCubeWrapper, then wraps the result with WrEnvGYM2MLPro. Called once during __init__ and again after loading from file (see load()). Guarded against premature calls before spaces are initialised.
- static load(p_path, p_filename)
Static method to load an object of the current class from a file using pickle/dill. While the given file is unpickled, the standard method __setstate__() is called. It is implemented specifically to call the MLPro custom method _complete_state(), which allows the unpickled object to be completed from further externally stored data.
- Parameters:
p_path (str) – Path where the file to be loaded is located
p_filename (str = None) – File name (if None an internal filename will be used)
- Returns:
Object of the given class that was unpickled from the given file.
- Return type:
Object
- _save(p_path, p_filename) bool
- switch_logging(p_logging)
- static setup_spaces()
Static template method to set up and return state and action space of environment.
- Returns:
state_space (MSpace) – State space object
action_space (MSpace) – Action space object
- _setup_spaces()
Defines the MLPro state and action spaces:
State : one integer dimension, index in [0, 3674159]
Action : one integer dimension, index in [0, 2] (F=0, R=1, U=2)
- _reset(p_seed=None)
Custom method to reset the system to an initial/defined state. Use method _set_status() to set the state.
- Parameters:
p_seed (int) – Seed parameter for an internal random generator
- simulate_reaction(p_state: State, p_action: Action) State
Clips the continuous action value produced by SB3 (Stable-Baselines3) to a valid discrete index {0, 1, 2} before passing it to the wrapped gym environment.
- _compute_reward(p_state_old: State, p_state_new: State) Reward
Default reward: +100 on solve, -1 on every other step. When p_shaped_reward=True, ShapedRewardCubeWrapper overrides the gym env’s reward signal before it reaches this method.
- compute_success(p_state: State) bool
Assesses whether the given state is a ‘success’ state. The assessment is carried out either by a custom implementation in method _compute_success() or by an embedded external function.
- Parameters:
p_state (State) – State to be assessed.
- Returns:
success – True, if the given state is a ‘success’ state. False otherwise.
- Return type:
bool
- compute_broken(p_state: State) bool
Assesses whether the given state is a ‘broken’ state. The assessment is carried out either by a custom implementation in method _compute_broken() or by an embedded external function.
- Parameters:
p_state (State) – State to be assessed.
- Returns:
broken – True, if the given state is a ‘broken’ state. False otherwise.
- Return type:
bool
- init_plot(p_figure=None, p_plot_settings=Ellipsis, p_plot_depth: int = 0, p_detail_level: int = 0, p_step_rate: int = 0, **p_kwargs)
Initializes the plot functionalities of the class.
- Parameters:
p_figure (matplotlib.figure.Figure, optional) – Optional Matplotlib host figure in which the plot is embedded. The default is None.
p_plot_settings (PlotSettings) – Optional plot settings. If None, the default view is plotted (see attribute C_PLOT_DEFAULT_VIEW).
- update_plot(**p_kwargs)
Updates the plot.
- Parameters:
**p_kwargs – Implementation-specific plot data and/or parameters.
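The action clipping in simulate_reaction() and the default reward in _compute_reward() can be sketched in plain Python. This is an illustration under assumptions, not the module's code: the helper names `to_discrete_action` and `default_reward` are hypothetical, and rounding to the nearest integer before clipping is an assumption, since the text only states that the value is clipped to {0, 1, 2}.

```python
def to_discrete_action(p_value: float, p_num_actions: int = 3) -> int:
    """Map a continuous action value to a valid index in [0, p_num_actions - 1]."""
    index = int(round(p_value))  # assumption: round to the nearest integer first
    return max(0, min(p_num_actions - 1, index))


def default_reward(p_solved: bool) -> float:
    """Default reward described above: +100 on solve, -1 on every other step."""
    return 100.0 if p_solved else -1.0


MOVES = ("F", "R", "U")  # action indices 0, 1, 2

for raw in (-0.7, 0.2, 1.4, 5.0):
    action = to_discrete_action(raw)
    print(raw, "->", action, MOVES[action])
```

Clipping keeps out-of-range outputs of a continuous policy from reaching the wrapped environment, which only accepts the three discrete moves.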