eve.app package

Subpackages

Submodules

eve.app.algo module

eve.app.buffers module

class eve.app.buffers.RolloutBufferSamples(observations, actions, old_values, old_log_prob, advantages, returns)

Bases: tuple

Create new instance of RolloutBufferSamples(observations, actions, old_values, old_log_prob, advantages, returns)

property observations

Alias for field number 0

property actions

Alias for field number 1

property old_values

Alias for field number 2

property old_log_prob

Alias for field number 3

property advantages

Alias for field number 4

property returns

Alias for field number 5

class eve.app.buffers.ReplayBufferSamples(observations, actions, next_observations, dones, rewards)

Bases: tuple

Create new instance of ReplayBufferSamples(observations, actions, next_observations, dones, rewards)

property observations

Alias for field number 0

property actions

Alias for field number 1

property next_observations

Alias for field number 2

property dones

Alias for field number 3

property rewards

Alias for field number 4

class eve.app.buffers.RolloutReturn(episode_reward, episode_timesteps, n_episodes, continue_training)

Bases: tuple

Create new instance of RolloutReturn(episode_reward, episode_timesteps, n_episodes, continue_training)

property episode_reward

Alias for field number 0

property episode_timesteps

Alias for field number 1

property n_episodes

Alias for field number 2

property continue_training

Alias for field number 3

eve.app.buffers.get_action_dim(action_space: eve.app.space.EveSpace) int

Get the dimension of the action space.

Parameters

action_space

Returns

The dimension of the action space.

eve.app.buffers.get_obs_shape(observation_space: eve.app.space.EveSpace) Tuple[int, ...]

Get the shape of the observation (useful for the buffers).

Parameters

observation_space

Returns

The shape of the observation.

class eve.app.buffers.BaseBuffer(buffer_size: int, observation_space: eve.app.space.EveSpace, action_space: eve.app.space.EveSpace, device: Union[torch.device, str] = 'cpu', n_envs: int = 1, sample_episode: bool = False)

Bases: abc.ABC

Base class that represents a buffer (rollout or replay).

Parameters
  • buffer_size – Max number of elements in the buffer

  • observation_space – Observation space

  • action_space – Action space

  • device – PyTorch device to which the values will be converted

  • n_envs – Number of parallel environments

  • sample_episode – If False, we will sample the observations in a random-states format, and will return batch_size states. If True, we will sample the observations in a random-episode format, and will return batch_size episodes. NOTE: if True, all episodes should have the same length, or the batch size should be 1; otherwise, we cannot stack episodes of different lengths.

static swap_and_flatten(arr: numpy.ndarray) numpy.ndarray

Swap and then flatten axes 0 (buffer_size) and 1 (n_envs) to convert shape from [n_steps, n_envs, …] (where … is the shape of the features) to [n_steps * n_envs, …] (which maintains the order)

Parameters

arr

Returns

The swapped and flattened array.
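As a sketch, the operation corresponds to the following NumPy idiom (the shape values below are illustrative, not part of the API):

import numpy as np

n_steps, n_envs, feature_shape = 5, 4, (3,)
arr = np.random.randn(n_steps, n_envs, *feature_shape)

# Swap the buffer and env axes, then merge them into a single batch axis.
flat = arr.swapaxes(0, 1).reshape(n_steps * n_envs, *feature_shape)
assert flat.shape == (20, 3)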

size() int
Returns

The current size of the buffer

add(*args, **kwargs) None

Add elements to the buffer.

extend(*args, **kwargs) None

Add a new batch of transitions to the buffer

reset() None

Reset the buffer.

sample(batch_size: int, env: Optional[VecNormalize] = None)
Parameters
  • batch_size – Number of elements to sample

  • env – associated VecEnv to normalize the observations/rewards when sampling

Returns

If sampling episodes, returns a list of BufferSamples whose length is the episode length; otherwise, returns a single BufferSamples.

to_torch(array: numpy.ndarray, copy: bool = True) torch.Tensor

Convert a numpy array to a PyTorch tensor. Note: it copies the data by default

Parameters
  • array

  • copy – Whether or not to copy the data (may be useful to avoid changing things by reference)

Returns

The converted PyTorch tensor.
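For instance, the copy flag plausibly distinguishes between these two PyTorch calls (a sketch; the exact internals are not documented here):

import numpy as np
import torch

arr = np.ones((4, 3), dtype=np.float32)
t_copy = torch.tensor(arr, device='cpu')     # copy=True: the tensor owns its memory
t_view = torch.as_tensor(arr, device='cpu')  # copy=False: may share memory with arr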

class eve.app.buffers.ReplayBuffer(buffer_size: int, observation_space: eve.app.space.EveSpace, action_space: eve.app.space.EveSpace, device: Union[torch.device, str] = 'cpu', n_envs: int = 1, sample_episode: bool = False)

Bases: eve.app.buffers.BaseBuffer

Replay buffer used in off-policy algorithms like SAC/TD3.

Parameters
  • buffer_size – Max number of elements in the buffer

  • observation_space – Observation space

  • action_space – Action space

  • device

  • n_envs – Number of parallel environments

add(obs: numpy.ndarray, next_obs: numpy.ndarray, action: numpy.ndarray, reward: numpy.ndarray, done: numpy.ndarray) None
class eve.app.buffers.RolloutBuffer(buffer_size: int, observation_space: eve.app.space.EveSpace, action_space: eve.app.space.EveSpace, device: Union[torch.device, str] = 'cpu', gae_lambda: float = 1, gamma: float = 0.99, n_envs: int = 1, sample_episode: bool = False)

Bases: eve.app.buffers.BaseBuffer

Rollout buffer used in on-policy algorithms like A2C/PPO. It corresponds to buffer_size transitions collected using the current policy. This experience will be discarded after the policy update. In order to use PPO objective, we also store the current value of each state and the log probability of each taken action.

The term rollout here refers to the model-free notion and should not be used with the concept of rollout used in model-based RL or planning. Hence, it is only involved in policy and value function training but not action selection.

Parameters
  • buffer_size – Max number of elements in the buffer

  • observation_space – Observation space

  • action_space – Action space

  • device

  • gae_lambda – Factor for trade-off of bias vs variance for the Generalized Advantage Estimator. Equivalent to classic advantage when set to 1.

  • gamma – Discount factor

  • n_envs – Number of parallel environments

reset() None
compute_returns_and_advantage(last_values: torch.Tensor, dones: numpy.ndarray) None

Post-processing step: compute the returns (sum of discounted rewards) and GAE advantage. Adapted from Stable-Baselines PPO2.

Uses Generalized Advantage Estimation (https://arxiv.org/abs/1506.02438) to compute the advantage. To obtain the vanilla advantage (A(s) = R - V(s)), where R is the discounted reward with value bootstrap, set gae_lambda=1.0 during initialization.

Parameters
  • last_values

  • dones
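A minimal sketch of the GAE recursion this method performs, assuming 1-D NumPy arrays rewards, values, dones of equal length plus the bootstrap pair last_value/last_done (the function and variable names are illustrative, not the class internals):

import numpy as np

def gae(rewards, values, dones, last_value, last_done,
        gamma=0.99, gae_lambda=0.95):
    n = len(rewards)
    advantages = np.zeros(n)
    last_gae = 0.0
    for t in reversed(range(n)):
        if t == n - 1:
            next_non_terminal = 1.0 - last_done
            next_value = last_value
        else:
            next_non_terminal = 1.0 - dones[t + 1]
            next_value = values[t + 1]
        # TD error, then the exponentially weighted recursion.
        delta = rewards[t] + gamma * next_value * next_non_terminal - values[t]
        last_gae = delta + gamma * gae_lambda * next_non_terminal * last_gae
        advantages[t] = last_gae
    returns = advantages + values  # returns = GAE advantage + value baseline
    return advantages, returns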

add(obs: numpy.ndarray, action: numpy.ndarray, reward: numpy.ndarray, done: numpy.ndarray, value: torch.Tensor, log_prob: torch.Tensor) None
Parameters
  • obs – Observation

  • action – Action

  • reward

  • done – End of episode signal.

  • value – estimated value of the current state following the current policy.

  • log_prob – log probability of the action following the current policy.

eve.app.callbacks module

eve.app.callbacks.sync_envs_normalization(env: EveEnv, eval_env: EveEnv) None

Sync eval env and train env when using VecNormalize

Parameters
  • env

  • eval_env

eve.app.callbacks.evaluate_policy(model: algo.BaseAlgorithm, env: EveEnv, n_eval_episodes: int = 10, deterministic: bool = True, callback: Optional[Callable[[Dict[str, Any], Dict[str, Any]], None]] = None, reward_threshold: Optional[float] = None, return_episode_rewards: bool = False, warn: bool = True) Union[Tuple[float, float], Tuple[List[float], List[int]]]

Runs policy for n_eval_episodes episodes and returns average reward. This is made to work only with one env.

Note

If environment has not been wrapped with Monitor wrapper, reward and episode lengths are counted as it appears with env.step calls. If the environment contains wrappers that modify rewards or episode lengths (e.g. reward scaling, early episode reset), these will affect the evaluation results as well. You can avoid this by wrapping environment with Monitor wrapper before anything else.

Parameters
  • model – The RL agent you want to evaluate.

  • env – The environment. In the case of a VecEnv this must contain only one environment.

  • n_eval_episodes – Number of episodes to evaluate the agent

  • deterministic – Whether to use deterministic or stochastic actions

  • callback – callback function to do additional checks, called after each step. Gets locals() and globals() passed as parameters.

  • reward_threshold – Minimum expected reward per episode, this will raise an error if the performance is not met

  • return_episode_rewards – If True, a list of rewards and episode lengths per episode will be returned instead of the mean.

  • warn – If True (default), warns user about lack of a Monitor wrapper in the evaluation environment.

Returns

Mean reward per episode, std of reward per episode. Returns ([float], [int]) when return_episode_rewards is True, first list containing per-episode rewards and second containing per-episode lengths (in number of steps).
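Typical usage might look like the following; model and env stand for an already-trained algorithm and a Monitor-wrapped environment:

mean_reward, std_reward = evaluate_policy(model, env, n_eval_episodes=10,
                                          deterministic=True)
print(f"mean_reward={mean_reward:.2f} +/- {std_reward:.2f}")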

class eve.app.callbacks.BaseCallback(verbose: int = 0)

Bases: abc.ABC

Base class for callbacks.

Parameters

verbose

init_callback(model: algo.BaseAlgorithm) None

Initialize the callback by saving references to the RL model and the training environment for convenience.

on_training_start(locals_: Dict[str, Any], globals_: Dict[str, Any]) None
on_rollout_start() None
on_step() bool

This method will be called by the model after each call to env.step().

For a child callback (of an EventCallback), this will be called when the event is triggered.

Returns

If the callback returns False, training is aborted early.

on_training_end() None
on_rollout_end() None
update_locals(locals_: Dict[str, Any]) None

Update the references to the local variables.

Parameters

locals_ – the local variables during rollout collection

update_child_locals(locals_: Dict[str, Any]) None

Update the references to the local variables on sub callbacks.

Parameters

locals_ – the local variables during rollout collection

class eve.app.callbacks.EventCallback(callback: Optional[eve.app.callbacks.BaseCallback] = None, verbose: int = 0)

Bases: eve.app.callbacks.BaseCallback

Base class for triggering callback on event.

Parameters
  • callback – Callback that will be called when an event is triggered.

  • verbose

init_callback(model: algo.BaseAlgorithm) None
update_child_locals(locals_: Dict[str, Any]) None

Update the references to the local variables.

Parameters

locals_ – the local variables during rollout collection

class eve.app.callbacks.CallbackList(callbacks: List[eve.app.callbacks.BaseCallback])

Bases: eve.app.callbacks.BaseCallback

Class for chaining callbacks.

Parameters

callbacks – A list of callbacks that will be called sequentially.

update_child_locals(locals_: Dict[str, Any]) None

Update the references to the local variables.

Parameters

locals_ – the local variables during rollout collection

class eve.app.callbacks.CheckpointCallback(save_freq: int, save_path: str, name_prefix: str = 'rl_model', verbose: int = 0)

Bases: eve.app.callbacks.BaseCallback

Callback for saving a model every save_freq steps

Parameters
  • save_freq

  • save_path – Path to the folder where the model will be saved.

  • name_prefix – Common prefix to the saved models

  • verbose

class eve.app.callbacks.ConvertCallback(callback: Callable[[Dict[str, Any], Dict[str, Any]], bool], verbose: int = 0)

Bases: eve.app.callbacks.BaseCallback

Convert functional callback (old-style) to object.

Parameters
  • callback

  • verbose

class eve.app.callbacks.EvalCallback(eval_env: EveEnv, callback_on_new_best: Optional[eve.app.callbacks.BaseCallback] = None, n_eval_episodes: int = 5, eval_freq: int = 10000, log_path: str = None, best_model_save_path: str = None, deterministic: bool = True, verbose: int = 1, warn: bool = True)

Bases: eve.app.callbacks.EventCallback

Callback for evaluating an agent.

Parameters
  • eval_env – The environment used for initialization

  • callback_on_new_best – Callback to trigger when there is a new best model according to the mean_reward

  • n_eval_episodes – The number of episodes to test the agent

  • eval_freq – Evaluate the agent every eval_freq calls of the callback.

  • log_path – Path to a folder where the evaluations (evaluations.npz) will be saved. It will be updated at each evaluation.

  • best_model_save_path – Path to a folder where the best model according to performance on the eval env will be saved.

  • deterministic – Whether the evaluation should use stochastic or deterministic actions.

  • verbose

  • warn – Passed to evaluate_policy (warns if eval_env has not been wrapped with a Monitor wrapper)

update_child_locals(locals_: Dict[str, Any]) None

Update the references to the local variables.

Parameters

locals_ – the local variables during rollout collection

class eve.app.callbacks.StopTrainingOnRewardThreshold(reward_threshold: float, verbose: int = 0)

Bases: eve.app.callbacks.BaseCallback

Stop the training once a threshold in episodic reward has been reached (i.e. when the model is good enough).

It must be used with the EvalCallback.

Parameters
  • reward_threshold – Minimum expected reward per episode to stop training.

  • verbose
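For example, combined with EvalCallback as required (the threshold value and eval_env are illustrative):

stop_callback = StopTrainingOnRewardThreshold(reward_threshold=200.0, verbose=1)
eval_callback = EvalCallback(eval_env, callback_on_new_best=stop_callback,
                             eval_freq=1000)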

class eve.app.callbacks.EveryNTimesteps(n_steps: int, callback: eve.app.callbacks.BaseCallback)

Bases: eve.app.callbacks.EventCallback

Trigger a callback every n_steps timesteps

Parameters
  • n_steps – Number of timesteps between two triggers.

  • callback – Callback that will be called when the event is triggered.
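A sketch of wiring it up with a CheckpointCallback, assuming a BaseAlgorithm-style model whose learn() method accepts a callback (that method is an assumption here, not documented in this section):

checkpoint = CheckpointCallback(save_freq=1, save_path='./checkpoints/')
event_callback = EveryNTimesteps(n_steps=500, callback=checkpoint)
model.learn(total_timesteps=100000, callback=event_callback)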

class eve.app.callbacks.StopTrainingOnMaxEpisodes(max_episodes: int, verbose: int = 0)

Bases: eve.app.callbacks.BaseCallback

Stop the training once a maximum number of episodes are played.

For multiple environments, this presumes that the desired behavior is for the agent to train on each env for max_episodes, i.e. for max_episodes * n_envs episodes in total.

Parameters
  • max_episodes – Maximum number of episodes after which training stops.

  • verbose – Select whether to print information about when training ended by reaching max_episodes

class eve.app.callbacks.TrialEvalCallback(eval_env: eve.app.env.VecEnv, trial: optuna.trial._trial.Trial, n_eval_episodes: int = 5, eval_freq: int = 10000, deterministic: bool = True, verbose: int = 0)

Bases: eve.app.callbacks.EvalCallback

Callback used for evaluating and reporting a trial.

class eve.app.callbacks.SaveVecNormalizeCallback(save_freq: int, save_path: str, name_prefix: Optional[str] = None, verbose: int = 0)

Bases: eve.app.callbacks.BaseCallback

Callback for saving a VecNormalize wrapper every save_freq steps.

Parameters
  • save_freq (int) –

  • save_path (str) – Path to the folder where VecNormalize will be saved, as vecnormalize.pkl

  • name_prefix (str) – Common prefix to the saved VecNormalize, if None (default) only one file will be kept.

eve.app.env module

class eve.app.env.EveEnv

Bases: object

The main OpenAI Gym-style environment class. It encapsulates an environment with arbitrary behind-the-scenes dynamics. An environment can be partially or fully observed.

The main API methods that users of this class need to know are:

step, reset, render, close, and seed

And set the following attributes:

  • action_space: The Space object corresponding to valid actions

  • observation_space: The Space object corresponding to valid observations

  • reward_range: A tuple corresponding to the min and max possible rewards

Note: a default reward range set to [-inf,+inf] already exists. Set it if you want a narrower range.

The methods are accessed publicly as “step”, “reset”, etc…

metadata = {'render.modes': []}
reward_range = (-inf, inf)
spec = None
action_space = None
observation_space = None
step(action)

Run one timestep of the environment’s dynamics. When end of episode is reached, you are responsible for calling reset() to reset this environment’s state.

Accepts an action and returns a tuple (observation, reward, done, info).

Parameters

action (object) – an action provided by the agent

Returns

  • observation (object): agent's observation of the current environment

  • reward (float): amount of reward returned after previous action

  • done (bool): whether the episode has ended, in which case further step() calls will return undefined results

  • info (dict): contains auxiliary diagnostic information (helpful for debugging, and sometimes learning)
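A minimal interaction loop built from these methods (env stands for any concrete EveEnv):

obs = env.reset()
done = False
while not done:
    action = env.action_space.sample()  # replace with a trained policy
    obs, reward, done, info = env.step(action)
env.close()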

reset()

Resets the environment to an initial state and returns an initial observation.

Note that this function should not reset the environment’s random number generator(s); random variables in the environment’s state should be sampled independently between multiple calls to reset(). In other words, each call of reset() should yield an environment suitable for a new episode, independent of previous episodes.

Returns

the initial observation.

Return type

observation (object)

render(mode='human')

Renders the environment.

The set of supported modes varies per environment. (And some environments do not support rendering at all.) By convention, if mode is:

  • human: render to the current display or terminal and return nothing. Usually for human consumption.

  • rgb_array: Return an numpy.ndarray with shape (x, y, 3), representing RGB values for an x-by-y pixel image, suitable for turning into a video.

  • ansi: Return a string (str) or StringIO.StringIO containing a terminal-style text representation. The text can include newlines and ANSI escape sequences (e.g. for colors).

Note

Make sure that your class's metadata 'render.modes' key includes the list of supported modes. It's recommended to call super() in implementations to use the functionality of this method.

Parameters

mode (str) – the mode to render with

Example:

class MyEnv(EveEnv):
    metadata = {'render.modes': ['human', 'rgb_array']}

    def render(self, mode='human'):
        if mode == 'rgb_array':
            return np.array(...)  # return RGB frame suitable for video
        elif mode == 'human':
            ...  # pop up a window and render
        else:
            super(MyEnv, self).render(mode=mode)  # just raise an exception

close()

Override close in your subclass to perform any necessary cleanup.

Environments will automatically close() themselves when garbage collected or when the program exits.

seed(seed=None)

Sets the seed for this env’s random number generator(s).

Note

Some environments use multiple pseudorandom number generators. We want to capture all such seeds used in order to ensure that there aren’t accidental correlations between multiple generators.

Returns

Returns the list of seeds used in this env's random number generators. The first value in the list should be the "main" seed, or the value which a reproducer should pass to 'seed'. Often, the main seed equals the provided 'seed', but this won't be true if seed=None, for example.

Return type

list<bigint>

property unwrapped

Completely unwrap this env.

Returns

The base non-wrapped EveEnv instance

Return type

EveEnv

class eve.app.env.GoalEnv

Bases: eve.app.env.EveEnv

A goal-based environment. It functions just as any regular OpenAI environment but it imposes a required structure on the observation_space. More concretely, the observation space is required to contain at least three elements, namely observation, desired_goal, and achieved_goal. Here, desired_goal specifies the goal that the agent should attempt to achieve. achieved_goal is the goal that the agent has currently achieved instead. observation contains the actual observations of the environment as per usual.

reset()
compute_reward(achieved_goal, desired_goal, info)

Compute the step reward. This externalizes the reward function and makes it dependent on a desired goal and the one that was achieved. If you wish to include additional rewards that are independent of the goal, you can include the necessary values to derive it in ‘info’ and compute it accordingly.

Parameters
  • achieved_goal (object) – the goal that was achieved during execution

  • desired_goal (object) – the desired goal that we asked the agent to attempt to achieve

  • info (dict) – an info dictionary with additional information

Returns

The reward that corresponds to the provided achieved goal w.r.t. the desired goal. Note that the following should always hold true:

ob, reward, done, info = env.step()
assert reward == env.compute_reward(ob['achieved_goal'], ob['desired_goal'], info)

Return type

float

class eve.app.env.Wrapper(env)

Bases: eve.app.env.EveEnv

Wraps the environment to allow a modular transformation.

This class is the base class for all wrappers. A subclass can override some methods to change the behavior of the original environment without touching the original code.

Note

Don’t forget to call super().__init__(env) if the subclass overrides __init__().

property spec
classmethod class_name()
step(action)
reset(**kwargs)
render(mode='human', **kwargs)
close()
seed(seed=None)
compute_reward(achieved_goal, desired_goal, info)
property unwrapped
class eve.app.env.ObservationWrapper(env)

Bases: eve.app.env.Wrapper

reset(**kwargs)
step(action)
observation(observation)
class eve.app.env.RewardWrapper(env)

Bases: eve.app.env.Wrapper

reset(**kwargs)
step(action)
reward(reward)
class eve.app.env.ActionWrapper(env)

Bases: eve.app.env.Wrapper

reset(**kwargs)
step(action)
action(action)
reverse_action(action)
class eve.app.env.FlattenObservation(env)

Bases: eve.app.env.ObservationWrapper

Observation wrapper that flattens the observation.

observation(observation)
class eve.app.env.VecEnv(num_envs: int, observation_space: eve.app.space.EveSpace, action_space: eve.app.space.EveSpace)

Bases: abc.ABC

An abstract asynchronous, vectorized environment.

Parameters
  • num_envs – the number of environments

  • observation_space – the observation space

  • action_space – the action space

abstract reset() Union[numpy.ndarray, Dict[str, numpy.ndarray], Tuple[numpy.ndarray, ...]]

Reset all the environments and return an array of observations, or a tuple of observation arrays.

If step_async is still doing work, that work will be cancelled and step_wait() should not be called until step_async() is invoked again.

Returns

observation

abstract step_async(actions: numpy.ndarray) None

Tell all the environments to start taking a step with the given actions. Call step_wait() to get the results of the step.

You should not call this if a step_async run is already pending.

abstract step_wait() Tuple[Union[numpy.ndarray, Dict[str, numpy.ndarray], Tuple[numpy.ndarray, ...]], numpy.ndarray, numpy.ndarray, List[Dict]]

Wait for the step taken with step_async().

Returns

observation, reward, done, information

abstract close() None

Clean up the environment’s resources.

abstract get_attr(attr_name: str, indices: Union[None, int, Iterable[int]] = None) List[Any]

Return attribute from vectorized environment.

Parameters
  • attr_name – The name of the attribute whose value to return

  • indices – Indices of envs to get attribute from

Returns

List of values of ‘attr_name’ in all environments

abstract set_attr(attr_name: str, value: Any, indices: Union[None, int, Iterable[int]] = None) None

Set attribute inside vectorized environments.

Parameters
  • attr_name – The name of attribute to assign new value

  • value – Value to assign to attr_name

  • indices – Indices of envs to assign value

Returns

abstract env_method(method_name: str, *method_args, indices: Union[None, int, Iterable[int]] = None, **method_kwargs) List[Any]

Call instance methods of vectorized environments.

Parameters
  • method_name – The name of the environment method to invoke.

  • indices – Indices of envs whose method to call

  • method_args – Any positional arguments to provide in the call

  • method_kwargs – Any keyword arguments to provide in the call

Returns

List of items returned by the environment’s method call

abstract env_is_wrapped(wrapper_class: Type[eve.app.env.Wrapper], indices: Union[None, int, Iterable[int]] = None) List[bool]

Check if environments are wrapped with a given wrapper.

Parameters
  • wrapper_class – Wrapper class to look for

  • indices – Indices of envs to check

Returns

True if the env is wrapped, False otherwise, for each env queried.

step(actions: numpy.ndarray) Tuple[Union[numpy.ndarray, Dict[str, numpy.ndarray], Tuple[numpy.ndarray, ...]], numpy.ndarray, numpy.ndarray, List[Dict]]

Step the environments with the given actions

Parameters

actions – the actions

Returns

observation, reward, done, information
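Conceptually, step() composes the two asynchronous calls documented above; a plausible default implementation is simply:

def step(self, actions):
    # Dispatch the actions, then block until all sub-environments report back.
    self.step_async(actions)
    return self.step_wait()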

abstract seed(seed: Optional[int] = None) List[Union[None, int]]

Sets the random seeds for all environments, based on a given seed. Each individual environment will still get its own seed, by incrementing the given seed.

Parameters

seed – The random seed. May be None for completely random seeding.

Returns

Returns a list containing the seeds for each individual env. Note that all list elements may be None, if the env does not return anything when being seeded.

property unwrapped: eve.app.env.VecEnv
getattr_depth_check(name: str, already_found: bool) Optional[str]

Check if an attribute reference is being hidden in a recursive call to __getattr__

Parameters
  • name – name of attribute to check for

  • already_found – whether this attribute has already been found in a wrapper

Returns

name of module whose attribute is being shadowed, if any.

class eve.app.env.VecEnvWrapper(venv: eve.app.env.VecEnv, observation_space: Optional[eve.app.space.EveSpace] = None, action_space: Optional[eve.app.space.EveSpace] = None)

Bases: eve.app.env.VecEnv

Vectorized environment base class

Parameters
  • venv – the vectorized environment to wrap

  • observation_space – the observation space (can be None to load from venv)

  • action_space – the action space (can be None to load from venv)

step_async(actions: numpy.ndarray) None
abstract reset() Union[numpy.ndarray, Dict[str, numpy.ndarray], Tuple[numpy.ndarray, ...]]
abstract step_wait() Tuple[Union[numpy.ndarray, Dict[str, numpy.ndarray], Tuple[numpy.ndarray, ...]], numpy.ndarray, numpy.ndarray, List[Dict]]
seed(seed: Optional[int] = None) List[Union[None, int]]
close() None
get_attr(attr_name: str, indices: Union[None, int, Iterable[int]] = None) List[Any]
set_attr(attr_name: str, value: Any, indices: Union[None, int, Iterable[int]] = None) None
env_method(method_name: str, *method_args, indices: Union[None, int, Iterable[int]] = None, **method_kwargs) List[Any]
env_is_wrapped(wrapper_class: Type[eve.app.env.Wrapper], indices: Union[None, int, Iterable[int]] = None) List[bool]
getattr_recursive(name: str) Any

Recursively check wrappers to find attribute.

Parameters

name – name of attribute to look for

Returns

attribute

getattr_depth_check(name: str, already_found: bool) str

See base class.

Returns

name of module whose attribute is being shadowed, if any.

eve.app.env.copy_obs_dict(obs: Dict[str, numpy.ndarray]) Dict[str, numpy.ndarray]

Deep-copy a dict of numpy arrays.

Parameters

obs – a dict of numpy arrays.

Returns

a dict of copied numpy arrays.

eve.app.env.dict_to_obs(space_: eve.app.space.EveSpace, obs_dict: Dict[Any, numpy.ndarray]) Union[numpy.ndarray, Dict[str, numpy.ndarray], Tuple[numpy.ndarray, ...]]

Convert an internal representation raw_obs into the appropriate type specified by space.

Parameters
  • space – an observation space.

  • obs_dict – a dict of numpy arrays.

Returns

returns an observation of the same type as space. If space is Dict, function is identity; if space is Tuple, converts dict to Tuple; otherwise, space is unstructured and returns the value raw_obs[None].

eve.app.env.obs_space_info(obs_space: eve.app.space.EveSpace) Tuple[List[str], Dict[Any, Tuple[int, ...]], Dict[Any, numpy.dtype]]

Get dict-structured information about an eve.app.EveSpace.

Dict spaces are represented directly by their dict of subspaces. Tuple spaces are converted into a dict with keys indexing into the tuple. Unstructured spaces are represented by {None: obs_space}.

Parameters

obs_space – an observation space

Returns

A tuple (keys, shapes, dtypes): keys: a list of dict keys. shapes: a dict mapping keys to shapes. dtypes: a dict mapping keys to dtypes.

class eve.app.env.ObsDictWrapper(venv: eve.app.env.VecEnv)

Bases: eve.app.env.VecEnvWrapper

Wrapper for a VecEnv which overrides the observation space for Hindsight Experience Replay to support dict observations.

Parameters

venv – The vectorized environment to wrap.

reset()
step_wait()
static convert_dict(observation_dict: Dict[str, numpy.ndarray], observation_key: str = 'observation', goal_key: str = 'desired_goal') numpy.ndarray

Concatenate observation and (desired) goal of observation dict.

Parameters
  • observation_dict – Dictionary with observation.

  • observation_key – Key of observation in dictionary.

  • goal_key – Key of (desired) goal in dictionary.

Returns

Concatenated observation.

class eve.app.env.CloudpickleWrapper(var: Any)

Bases: object

Uses cloudpickle to serialize contents (otherwise multiprocessing tries to use pickle)

Parameters

var – the variable you wish to wrap for pickling with cloudpickle

class eve.app.env.DummyVecEnv(env_fns: List[Callable[[], eve.app.env.EveEnv]])

Bases: eve.app.env.VecEnv

Creates a simple vectorized wrapper for multiple environments, calling each environment in sequence on the current Python process. This is useful for computationally simple environments such as CartPole-v1, where the overhead of multiprocessing or multithreading outweighs the environment computation time. It can also be used for RL methods that require a vectorized environment while you only want a single environment to train with.

Parameters

env_fns – a list of functions that return environments to vectorize
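For example, vectorizing four copies of a custom environment (MyEnv is a placeholder for any EveEnv subclass):

import numpy as np

env = DummyVecEnv([lambda: MyEnv() for _ in range(4)])
obs = env.reset()  # stacked observations, one row per sub-environment
actions = np.stack([env.action_space.sample() for _ in range(env.num_envs)])
obs, rewards, dones, infos = env.step(actions)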

step_async(actions: numpy.ndarray) None
step_wait() Tuple[Union[numpy.ndarray, Dict[str, numpy.ndarray], Tuple[numpy.ndarray, ...]], numpy.ndarray, numpy.ndarray, List[Dict]]
seed(seed: Optional[int] = None) List[Union[None, int]]
reset() Union[numpy.ndarray, Dict[str, numpy.ndarray], Tuple[numpy.ndarray, ...]]
close() None
get_attr(attr_name: str, indices: Union[None, int, Iterable[int]] = None) List[Any]

Return attribute from vectorized environment (see base class).

set_attr(attr_name: str, value: Any, indices: Union[None, int, Iterable[int]] = None) None

Set attribute inside vectorized environments (see base class).

env_method(method_name: str, *method_args, indices: Union[None, int, Iterable[int]] = None, **method_kwargs) List[Any]

Call instance methods of vectorized environments.

env_is_wrapped(wrapper_class: Type[eve.app.env.Wrapper], indices: Union[None, int, Iterable[int]] = None) List[bool]

Check if worker environments are wrapped with a given wrapper

class eve.app.env.SubprocVecEnv(env_fns: List[Callable[[], eve.app.env.EveEnv]], start_method: Optional[str] = None)

Bases: eve.app.env.VecEnv

Creates a multiprocess vectorized wrapper for multiple environments, distributing each environment to its own process, allowing significant speed up when the environment is computationally complex.

For performance reasons, if your environment is not IO bound, the number of environments should not exceed the number of logical cores on your CPU.

Warning

Only ‘forkserver’ and ‘spawn’ start methods are thread-safe, which is important when TensorFlow sessions or other non thread-safe libraries are used in the parent (see issue #217). However, compared to ‘fork’ they incur a small start-up cost and have restrictions on global variables. With those methods, users must wrap the code in an if __name__ == "__main__": block. For more information, see the multiprocessing documentation.

Parameters
  • env_fns – Environments to run in subprocesses

  • start_method – method used to start the subprocesses. Must be one of the methods returned by multiprocessing.get_all_start_methods(). Defaults to ‘forkserver’ on available platforms, and ‘spawn’ otherwise.
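Because of the spawn/forkserver requirement, construction should be guarded (make_my_env is a placeholder factory returning an EveEnv):

if __name__ == "__main__":
    env = SubprocVecEnv([make_my_env for _ in range(8)], start_method='spawn')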

step_async(actions: numpy.ndarray) None
step_wait() Tuple[Union[numpy.ndarray, Dict[str, numpy.ndarray], Tuple[numpy.ndarray, ...]], numpy.ndarray, numpy.ndarray, List[Dict]]
seed(seed: Optional[int] = None) List[Union[None, int]]
reset() Union[numpy.ndarray, Dict[str, numpy.ndarray], Tuple[numpy.ndarray, ...]]
close() None
get_attr(attr_name: str, indices: Union[None, int, Iterable[int]] = None) List[Any]

Return attribute from vectorized environment (see base class).

set_attr(attr_name: str, value: Any, indices: Union[None, int, Iterable[int]] = None) None

Set attribute inside vectorized environments (see base class).

env_method(method_name: str, *method_args, indices: Union[None, int, Iterable[int]] = None, **method_kwargs) List[Any]

Call instance methods of vectorized environments.

env_is_wrapped(wrapper_class: Type[eve.app.env.Wrapper], indices: Union[None, int, Iterable[int]] = None) List[bool]

Check if worker environments are wrapped with a given wrapper

class eve.app.env.RunningMeanStd(epsilon: float = 0.0001, shape: Tuple[int, ...] = ())

Bases: object

Calculates the running mean and std of a data stream https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Parallel_algorithm

Parameters
  • epsilon – helps with arithmetic issues

  • shape – the shape of the data stream’s output

update(arr: numpy.ndarray) None
update_from_moments(batch_mean: numpy.ndarray, batch_var: numpy.ndarray, batch_count: int) None
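A sketch of the moment combination that update_from_moments most likely performs, following the parallel algorithm referenced above (the function and variable names are illustrative):

def combine_moments(mean, var, count, batch_mean, batch_var, batch_count):
    # Chan et al. parallel update: merge (mean, var, count) with a new batch.
    delta = batch_mean - mean
    total = count + batch_count
    new_mean = mean + delta * batch_count / total
    m2 = (var * count + batch_var * batch_count
          + delta ** 2 * count * batch_count / total)
    return new_mean, m2 / total, total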
eve.app.env.check_for_correct_spaces(env: eve.app.env.EveEnv, observation_space: eve.app.space.EveSpace, action_space: eve.app.space.EveSpace) None

Checks that the environment has the same spaces as the provided ones. Used by BaseAlgorithm to check if spaces match after loading the model with a given env. Checked parameters: observation_space, action_space.

Parameters
  • env – Environment to check for valid spaces

  • observation_space – Observation space to check against

  • action_space – Action space to check against

class eve.app.env.VecNormalize(venv: eve.app.env.VecEnv, training: bool = True, norm_obs: bool = True, norm_reward: bool = True, clip_obs: float = 10.0, clip_reward: float = 10.0, gamma: float = 0.99, epsilon: float = 1e-08)

Bases: eve.app.env.VecEnvWrapper

A moving-average, normalizing wrapper for vectorized environments, with support for saving and loading the moving-average statistics.

Parameters
  • venv – the vectorized environment to wrap

  • training – Whether or not to update the moving average

  • norm_obs – Whether to normalize observation or not (default: True)

  • norm_reward – Whether to normalize rewards or not (default: True)

  • clip_obs – Max absolute value for observation

  • clip_reward – Max absolute value for discounted reward

  • gamma – discount factor

  • epsilon – To avoid division by zero

set_venv(venv: eve.app.env.VecEnv) None

Sets the vector environment to wrap to venv.

Also sets attributes derived from this such as num_envs.

Parameters

venv

step_wait() Tuple[Union[numpy.ndarray, Dict[str, numpy.ndarray], Tuple[numpy.ndarray, ...]], numpy.ndarray, numpy.ndarray, List[Dict]]

Apply the sequence of actions to the sequence of environments: actions -> (observations, rewards, news)

where 'news' is a boolean vector indicating whether each element is new.

normalize_obs(obs: Union[numpy.ndarray, Dict[str, numpy.ndarray]]) Union[numpy.ndarray, Dict[str, numpy.ndarray]]

Normalize observations using this VecNormalize’s observations statistics. Calling this method does not update statistics.

normalize_reward(reward: numpy.ndarray) numpy.ndarray

Normalize rewards using this VecNormalize’s rewards statistics. Calling this method does not update statistics.
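A sketch of the arithmetic these two methods apply, assuming running mean/variance statistics as maintained by RunningMeanStd (all names below are illustrative):

import numpy as np

def normalize_obs(obs, obs_mean, obs_var, clip_obs=10.0, epsilon=1e-8):
    # Standardize with the running statistics, then clip.
    return np.clip((obs - obs_mean) / np.sqrt(obs_var + epsilon),
                   -clip_obs, clip_obs)

def normalize_reward(reward, return_var, clip_reward=10.0, epsilon=1e-8):
    # Rewards are scaled by the std of the discounted return, not centered.
    return np.clip(reward / np.sqrt(return_var + epsilon),
                   -clip_reward, clip_reward)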

unnormalize_obs(obs: Union[numpy.ndarray, Dict[str, numpy.ndarray]]) Union[numpy.ndarray, Dict[str, numpy.ndarray]]
unnormalize_reward(reward: numpy.ndarray) numpy.ndarray
get_original_obs() Union[numpy.ndarray, Dict[str, numpy.ndarray]]

Returns an unnormalized version of the observations from the most recent step or reset.

get_original_reward() numpy.ndarray

Returns an unnormalized version of the rewards from the most recent step.

reset() Union[numpy.ndarray, Dict[str, numpy.ndarray]]

Reset all environments.

Returns

The first observation of the episode.

static load(load_path: str, venv: eve.app.env.VecEnv) eve.app.env.VecNormalize

Loads a saved VecNormalize object.

Parameters
  • load_path – the path to load from.

  • venv – the VecEnv to wrap.

Returns

The loaded VecNormalize object.

save(save_path: str) None

Save current VecNormalize object with all running statistics and settings (e.g. clip_obs)

Parameters

save_path – The path to save to

class eve.app.env.Monitor(env: eve.app.env.EveEnv, filename: Optional[str] = None, allow_early_resets: bool = True, reset_keywords: Tuple[str, ...] = (), info_keywords: Tuple[str, ...] = ())

Bases: eve.app.env.Wrapper

A monitor wrapper for Gym-style environments, used to record the episode reward, length, time, and other data.

Parameters
  • env – The environment

  • filename – the location to save a log file, can be None for no log

  • allow_early_resets – allows the reset of the environment before it is done

  • reset_keywords – extra keywords for the reset call, if extra parameters are needed at reset

  • info_keywords – extra information to log, from the information return of env.step()

EXT = 'monitor.csv'
reset(**kwargs) Union[Tuple, Dict[str, Any], numpy.ndarray, int]

Calls the environment reset. Can only be called if the environment is over, or if allow_early_resets is True

Parameters

kwargs – Extra keywords saved for the next episode; only if defined by reset_keywords

Returns

the first observation of the environment

step(action: Union[numpy.ndarray, int]) Tuple[Union[Tuple, Dict[str, Any], numpy.ndarray, int], float, bool, Dict]

Step the environment with the given action

Parameters

action – the action

Returns

observation, reward, done, information

close() None

Closes the environment

get_total_steps() int

Returns the total number of timesteps

Returns

get_episode_rewards() List[float]

Returns the rewards of all the episodes

Returns

get_episode_lengths() List[int]

Returns the number of timesteps of all the episodes

Returns

get_episode_times() List[float]

Returns the runtime in seconds of all the episodes

Returns

exception eve.app.env.LoadMonitorResultsError

Bases: Exception

Raised when loading the monitor log fails.

eve.app.env.get_monitor_files(path: str) List[str]

get all the monitor files in the given path

Parameters

path – the logging folder

Returns

the log files

eve.app.env.load_results(path: str) pandas.core.frame.DataFrame

Load all Monitor logs from a given directory path matching *monitor.csv

Parameters

path – the directory path containing the log file(s)

Returns

the logged data

eve.app.env.rolling_window(array: numpy.ndarray, window: int) numpy.ndarray

Apply a rolling window to a np.ndarray

Parameters
  • array – the input Array

  • window – length of the rolling window

Returns

rolling window on the input array
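One plausible implementation uses NumPy stride tricks to build the window view without copying (a sketch matching the documented semantics):

import numpy as np

def rolling_window(array, window):
    # View of shape (..., len - window + 1, window); no data is copied.
    shape = array.shape[:-1] + (array.shape[-1] - window + 1, window)
    strides = array.strides + (array.strides[-1],)
    return np.lib.stride_tricks.as_strided(array, shape=shape, strides=strides)

rolling_window(np.arange(5), 3)  # [[0 1 2], [1 2 3], [2 3 4]]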

eve.app.env.window_func(var_1: numpy.ndarray, var_2: numpy.ndarray, window: int, func: Callable) Tuple[numpy.ndarray, numpy.ndarray]

Apply a function to the rolling window of 2 arrays

Parameters
  • var_1 – variable 1

  • var_2 – variable 2

  • window – length of the rolling window

  • func – function to apply on the rolling window on variable 2 (such as np.mean)

Returns

the rolling output with applied function

eve.app.env.ts2xy(data_frame: pandas.core.frame.DataFrame, x_axis: str) Tuple[numpy.ndarray, numpy.ndarray]

Decompose a data frame variable into x and y arrays

Parameters
  • data_frame – the input data

  • x_axis – the axis for the x and y output (can be X_TIMESTEPS=’timesteps’, X_EPISODES=’episodes’ or X_WALLTIME=’walltime_hrs’)

Returns

the x and y output

eve.app.env.plot_curves(xy_list: List[Tuple[numpy.ndarray, numpy.ndarray]], x_axis: str, title: str, figsize: Tuple[int, int] = (8, 2)) None

plot the curves

Parameters
  • xy_list – the x and y coordinates to plot

  • x_axis – the axis for the x and y output (can be X_TIMESTEPS=’timesteps’, X_EPISODES=’episodes’ or X_WALLTIME=’walltime_hrs’)

  • title – the title of the plot

  • figsize – Size of the figure (width, height)

eve.app.env.plot_results(dirs: List[str], num_timesteps: Optional[int], x_axis: str, task_name: str, figsize: Tuple[int, int] = (8, 2)) None

Plot the results using csv files from Monitor wrapper.

Parameters
  • dirs – the save location of the results to plot

  • num_timesteps – only plot the points below this value

  • x_axis – the axis for the x and y output (can be X_TIMESTEPS=’timesteps’, X_EPISODES=’episodes’ or X_WALLTIME=’walltime_hrs’)

  • task_name – the title of the task to plot

  • figsize – Size of the figure (width, height)

eve.app.env.unwrap_vec_wrapper(env: Union[eve.app.env.EveEnv, eve.app.env.VecEnv], vec_wrapper_class: Type[eve.app.env.VecEnvWrapper]) Optional[eve.app.env.VecEnvWrapper]

Retrieve a VecEnvWrapper object by recursively searching.

Parameters
  • env

  • vec_wrapper_class

Returns

The VecEnvWrapper object if the env is wrapped with vec_wrapper_class, None otherwise.

eve.app.env.unwrap_vec_normalize(env: Union[eve.app.env.EveEnv, eve.app.env.VecEnv]) Optional[eve.app.env.VecNormalize]
Retrieve a VecNormalize wrapper by recursively searching.

Parameters

env

Returns

The VecNormalize wrapper if present, None otherwise.

eve.app.env.is_vecenv_wrapped(env: Union[eve.app.env.EveEnv, eve.app.env.VecEnv], vec_wrapper_class: Type[eve.app.env.VecEnvWrapper]) bool

Check if an environment is already wrapped by a given VecEnvWrapper.

Parameters
  • env

  • vec_wrapper_class

Returns

True if the environment has been wrapped with vec_wrapper_class.

eve.app.env.unwrap_wrapper(env: eve.app.env.EveEnv, wrapper_class: Type[eve.app.env.Wrapper]) Optional[eve.app.env.Wrapper]

Retrieve a Wrapper object by recursively searching.

Parameters
  • env – Environment to unwrap

  • wrapper_class – Wrapper to look for

Returns

The environment unwrapped up to wrapper_class, if it has been wrapped with it

eve.app.env.is_wrapped(env: Type[eve.app.env.EveEnv], wrapper_class: Type[eve.app.env.Wrapper]) bool

Check if a given environment has been wrapped with a given wrapper.

Parameters
  • env – Environment to check

  • wrapper_class – Wrapper class to look for

Returns

True if environment has been wrapped with wrapper_class.

eve.app.env.get_wrapper_class(hyperparams: Dict[str, Any]) Optional[Callable[[eve.app.env.EveEnv], eve.app.env.EveEnv]]

Get one or more environment wrapper classes specified as the hyperparameter “env_wrapper”, e.g. env_wrapper: _minigrid.wrappers.FlatObsWrapper

for multiple, specify a list:

env_wrapper:
  • utils.wrappers.PlotActionWrapper

  • utils.wrappers.TimeFeatureWrapper

Parameters

hyperparams

Returns

Maybe a callable to wrap the environment with one or multiple Wrappers

eve.app.env.make_vec_env(env_id: Union[str, Type[eve.app.env.EveEnv]], n_envs: int = 1, seed: Optional[int] = None, start_index: int = 0, monitor_dir: Optional[str] = None, wrapper_class: Optional[Callable[[eve.app.env.EveEnv], eve.app.env.EveEnv]] = None, env_kwargs: Optional[Dict[str, Any]] = None, vec_env_cls: Optional[Type[Union[eve.app.env.DummyVecEnv, eve.app.env.SubprocVecEnv]]] = None, vec_env_kwargs: Optional[Dict[str, Any]] = None, monitor_kwargs: Optional[Dict[str, Any]] = None) eve.app.env.VecEnv

Create a wrapped, monitored VecEnv. By default it uses a DummyVecEnv which is usually faster than a SubprocVecEnv.

Parameters
  • env_id – the environment ID or the environment class

  • n_envs – the number of environments you wish to have in parallel

  • seed – the initial seed for the random number generator

  • start_index – start rank index

  • monitor_dir – Path to a folder where the monitor files will be saved. If None, no file will be written, however, the env will still be wrapped in a Monitor wrapper to provide additional information about training.

  • wrapper_class – Additional wrapper to use on the environment. This can also be a function with single argument that wraps the environment in many things.

  • env_kwargs – Optional keyword argument to pass to the env constructor

  • vec_env_cls – A custom VecEnv class constructor. Default: None.

  • vec_env_kwargs – Keyword arguments to pass to the VecEnv class constructor.

  • monitor_kwargs – Keyword arguments to pass to the Monitor class constructor.

Returns

The wrapped environment
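For example (the environment ID and paths are illustrative):

env = make_vec_env('CartPole-v1', n_envs=4, seed=0,
                   monitor_dir='./monitor_logs')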

eve.app.env.create_test_env(env_id: str, n_envs: int = 1, stats_path: Optional[str] = None, seed: int = 0, log_dir: Optional[str] = None, should_render: bool = True, hyperparams: Optional[Dict[str, Any]] = None, env_kwargs: Optional[Dict[str, Any]] = None) eve.app.env.VecEnv

Create environment for testing a trained agent

Parameters
  • env_id

  • n_envs – number of processes

  • stats_path – path to folder containing saved running averages

  • seed – Seed for random number generator

  • log_dir – Where to log rewards

  • should_render – For Pybullet env, display the GUI

  • hyperparams – Additional hyperparams (ex: n_stack)

  • env_kwargs – Optional keyword argument to pass to the env constructor

eve.app.exp_manager module

eve.app.hyperparams_opt module

eve.app.logger module

exception eve.app.logger.FormatUnsupportedError(unsupported_formats: Sequence[str], value_description: str)

Bases: NotImplementedError

class eve.app.logger.KVWriter

Bases: object

Key Value writer

write(key_values: Dict[str, Any], key_excluded: Dict[str, Union[str, Tuple[str, ...]]], step: int = 0) None

Write a dictionary to file

Parameters
  • key_values

  • key_excluded

  • step

close() None

Close owned resources

class eve.app.logger.SeqWriter

Bases: object

sequence writer

write_sequence(sequence: List) None

Write a sequence (array) to file

Parameters

sequence

class eve.app.logger.HumanOutputFormat(filename_or_file: Union[str, TextIO])

Bases: eve.app.logger.KVWriter, eve.app.logger.SeqWriter

log to a file, in a human readable format

Parameters

filename_or_file – the file to write the log to

write(key_values: Dict, key_excluded: Dict, step: int = 0) None
write_sequence(sequence: List) None
close() None

closes the file

eve.app.logger.filter_excluded_keys(key_values: Dict[str, Any], key_excluded: Dict[str, Union[str, Tuple[str, ...]]], _format: str) Dict[str, Any]

Filters the keys specified by key_exclude for the specified format

Parameters
  • key_values – log dictionary to be filtered

  • key_excluded – keys to be excluded per format

  • _format – format for which this filter is run

Returns

dict without the excluded keys

class eve.app.logger.JSONOutputFormat(filename: str)

Bases: eve.app.logger.KVWriter

log to a file, in the JSON format

Parameters

filename – the file to write the log to

write(key_values: Dict[str, Any], key_excluded: Dict[str, Union[str, Tuple[str, ...]]], step: int = 0) None
close() None

closes the file

class eve.app.logger.CSVOutputFormat(filename: str)

Bases: eve.app.logger.KVWriter

log to a file, in a CSV format

Parameters

filename – the file to write the log to

write(key_values: Dict[str, Any], key_excluded: Dict[str, Union[str, Tuple[str, ...]]], step: int = 0) None
close() None

closes the file

class eve.app.logger.TensorBoardOutputFormat(folder: str)

Bases: eve.app.logger.KVWriter

Dumps key/value pairs into TensorBoard’s numeric format.

Parameters

folder – the folder to write the log to

write(key_values: Dict[str, Any], key_excluded: Dict[str, Union[str, Tuple[str, ...]]], step: int = 0) None
close() None

closes the file

eve.app.logger.make_output_format(_format: str, log_dir: str, log_suffix: str = '') eve.app.logger.KVWriter

return a logger for the requested format

Parameters
  • _format – the requested format to log to (‘stdout’, ‘log’, ‘json’ or ‘csv’ or ‘tensorboard’)

  • log_dir – the logging directory

  • log_suffix – the suffix for the log file

Returns

the logger

class eve.app.logger.Logger(folder: Optional[str], output_formats: List[eve.app.logger.KVWriter])

Bases: object

the logger class

Parameters
  • folder – the logging location

  • output_formats – the list of output format

DEFAULT = <eve.app.logger.Logger object>
CURRENT = <eve.app.logger.Logger object>
record(key: str, value: Any, exclude: Optional[Union[str, Tuple[str, ...]]] = None) None

Log a value of some diagnostic. Call this once for each diagnostic quantity, each iteration. If called many times, the last value will be used.

Parameters
  • key – save to log this key

  • value – save to log this value

  • exclude – outputs to be excluded

record_mean(key: str, value: Any, exclude: Optional[Union[str, Tuple[str, ...]]] = None) None

The same as record(), but if called many times, values will be averaged.

Parameters
  • key – save to log this key

  • value – save to log this value

  • exclude – outputs to be excluded

dump(step: int = 0) None

Write all of the diagnostics from the current iteration

log(*args, level: int = 20) None

Write the sequence of args, with no separators, to the console and output files (if you’ve configured an output file).

level: int. (see logger.py docs) If the global logger level is higher than the level argument here, don't print to stdout.

Parameters
  • args – log the arguments

  • level – the logging level (can be DEBUG=10, INFO=20, WARN=30, ERROR=40, DISABLED=50)

set_level(level: int) None

Set logging threshold on current logger.

Parameters

level – the logging level (can be DEBUG=10, INFO=20, WARN=30, ERROR=40, DISABLED=50)

get_dir() str

Get directory that log files are being written to. Will be None if there is no output directory (i.e., if you didn't call start)

Returns

the logging directory

close() None

closes the file

eve.app.logger.configure(folder: Optional[str] = None, format_strings: Optional[List[str]] = None) None

configure the current logger

Parameters
  • folder – the save location (if None, $SB3_LOGDIR, if still None, tempdir/baselines-[date & time])

  • format_strings – the output logging format (if None, $SB3_LOG_FORMAT, if still None, [‘stdout’, ‘log’, ‘csv’])
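A typical configure/record/dump cycle built from the functions in this module (the folder and keys are illustrative):

from eve.app import logger

logger.configure(folder='/tmp/eve_logs', format_strings=['stdout', 'csv'])
logger.record('train/episode_reward', 12.5)
logger.record_mean('train/loss', 0.3)
logger.dump(step=1)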

eve.app.logger.reset() None

reset the current logger

class eve.app.logger.ScopedConfigure(folder: Optional[str] = None, format_strings: Optional[List[str]] = None)

Bases: object

Class for using context manager while logging

usage:

>>> with ScopedConfigure(folder=None, format_strings=None):
...     {code}
Parameters
  • folder – the logging folder

  • format_strings – the list of output logging format

eve.app.logger.record(key: str, value: Any, exclude: Optional[Union[str, Tuple[str, ...]]] = None) None

Log a value of some diagnostic. Call this once for each diagnostic quantity, each iteration. If called many times, the last value will be used.

Parameters
  • key – save to log this key

  • value – save to log this value

  • exclude – outputs to be excluded

eve.app.logger.record_mean(key: str, value: Union[int, float], exclude: Optional[Union[str, Tuple[str, ...]]] = None) None

The same as record(), but if called many times, values will be averaged.

Parameters
  • key – save to log this key

  • value – save to log this value

  • exclude – outputs to be excluded

eve.app.logger.record_dict(key_values: Dict[str, Any]) None

Log a dictionary of key-value pairs.

Parameters

key_values – the list of keys and values to save to log

eve.app.logger.dump(step: int = 0) None

Write all of the diagnostics from the current iteration

eve.app.logger.get_log_dict() Dict

Get the key-value logs

Returns

the logged values

eve.app.logger.log(*args, level: int = 20) None

Write the sequence of args, with no separators, to the console and output files (if you’ve configured an output file).

level: int. (see logger.py docs) If the global logger level is higher than the level argument here, don't print to stdout.

Parameters
  • args – log the arguments

  • level – the logging level (can be DEBUG=10, INFO=20, WARN=30, ERROR=40, DISABLED=50)

eve.app.logger.debug(*args) None

Write the sequence of args, with no separators, to the console and output files (if you’ve configured an output file). Using the DEBUG level.

Parameters

args – log the arguments

eve.app.logger.info(*args) None

Write the sequence of args, with no separators, to the console and output files (if you’ve configured an output file). Using the INFO level.

Parameters

args – log the arguments

eve.app.logger.warn(*args) None

Write the sequence of args, with no separators, to the console and output files (if you’ve configured an output file). Using the WARN level.

Parameters

args – log the arguments

eve.app.logger.error(*args) None

Write the sequence of args, with no separators, to the console and output files (if you’ve configured an output file). Using the ERROR level.

Parameters

args – log the arguments

eve.app.logger.set_level(level: int) None

Set logging threshold on current logger.

Parameters

level – the logging level (can be DEBUG=10, INFO=20, WARN=30, ERROR=40, DISABLED=50)

eve.app.logger.get_level() int

Get logging threshold on current logger.

Returns

The logging level (can be DEBUG=10, INFO=20, WARN=30, ERROR=40, DISABLED=50)

eve.app.logger.get_dir() str

Get directory that log files are being written to. Will be None if there is no output directory (i.e., if you didn't call start)

Returns

the logging directory

eve.app.logger.record_tabular(key: str, value: Any, exclude: Optional[Union[str, Tuple[str, ...]]] = None) None

Log a value of some diagnostic. Call this once for each diagnostic quantity, each iteration. If called many times, the last value will be used.

Parameters
  • key – save to log this key

  • value – save to log this value

  • exclude – outputs to be excluded

eve.app.logger.dump_tabular(step: int = 0) None

Write all of the diagnostics from the current iteration

eve.app.logger.read_json(filename: str) pandas.core.frame.DataFrame

read a json file using pandas

Parameters

filename – the file path to read

Returns

the data in the json

eve.app.logger.read_csv(filename: str) pandas.core.frame.DataFrame

read a csv file using pandas

Parameters

filename – the file path to read

Returns

the data in the csv

eve.app.model module

eve.app.policies module

eve.app.space module

eve.app.space.np_random(seed=None)
eve.app.space.hash_seed(seed=None, max_bytes=8)

Any given evaluation is likely to have many PRNGs active at once. (Most commonly, because the environment is running in multiple processes.) There's literature indicating that linear correlations between seeds of multiple PRNGs can correlate the outputs:

http://blogs.unity3d.com/2015/01/07/a-primer-on-repeatable-random-numbers/ http://stackoverflow.com/questions/1554958/how-different-do-random-seeds-need-to-be http://dl.acm.org/citation.cfm?id=1276928

Thus, for sanity we hash the seeds before using them. (This scheme is likely not crypto-strength, but it should be good enough to get rid of simple correlations.)

Parameters
  • seed (Optional[int]) – None seeds from an operating system specific randomness source.

  • max_bytes – Maximum number of bytes to use in the hashed seed.

eve.app.space.create_seed(a=None, max_bytes=8)

Create a strong random seed. Otherwise, Python 2 would seed using the system time, which might be non-robust especially in the presence of concurrency.

Parameters
  • a (Optional[int, str]) – None seeds from an operating system specific randomness source.

  • max_bytes – Maximum number of bytes to use in the seed.

class eve.app.space.EveSpace(shape=None, dtype=None)

Bases: object

Defines the observation and action spaces, so you can write generic code that applies to any Env. For example, you can choose a random action.

WARNING - Custom observation & action spaces can inherit from the Space class. However, most use-cases should be covered by the existing space classes (e.g. Box, Discrete, etc…), and container classes (Tuple & Dict). Note that parametrized probability distributions (through the sample() method), and batching functions (in eve.vector.VectorEnv), are only well-defined for instances of spaces provided in eve by default. Moreover, some implementations of Reinforcement Learning algorithms might not handle custom spaces properly. Use custom spaces with care.

property np_random

Lazily seed the rng since this is expensive and only needed if sampling from this space.

sample()

Randomly sample an element of this space. Can be uniform or non-uniform sampling based on boundedness of space.

seed(seed=None)

Seed the PRNG of this space.

contains(x)

Return boolean specifying if x is a valid member of this space

to_jsonable(sample_n)

Convert a batch of samples from this space to a JSONable data type.

from_jsonable(sample_n)

Convert a JSONable data type to a batch of samples from this space.

class eve.app.space.EveBox(low, high, shape=None, max_neurons: typing.Optional[int] = None, max_states: typing.Optional[int] = None, dtype=<class 'numpy.float32'>)

Bases: eve.app.space.EveSpace

A (possibly unbounded) box in R^n. Specifically, a Box represents the Cartesian product of n closed intervals. Each interval has the form of one of [a, b], (-oo, b], [a, oo), or (-oo, oo).

There are two common use cases:

  • Identical bound for each dimension:
    >>> EveBox(low=-1.0, high=2.0, shape=(3, 4), dtype=np.float32)
    EveBox(3, 4)

  • Independent bound for each dimension:
    >>> EveBox(low=np.array([-1.0, -2.0]), high=np.array([2.0, 4.0]), dtype=np.float32)
    EveBox(2,)
    
is_bounded(manner='both')
sample()

Generates a single random sample inside of the Box.

In creating a sample of the box, each coordinate is sampled according to the form of the interval:

  • [a, b] : uniform distribution

  • [a, oo) : shifted exponential distribution

  • (-oo, b] : shifted negative exponential distribution

  • (-oo, oo) : normal distribution

contains(x)
to_jsonable(sample_n)
from_jsonable(sample_n)
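
A short sketch of is_bounded() on a partially bounded box (array bounds follow the second use case above; using np.inf to mark an unbounded side is an assumption carried over from the gym-style Box API):

    import numpy as np
    from eve.app.space import EveBox

    # First dimension is the closed interval [0, 1]; second is [0, oo).
    box = EveBox(low=np.array([0.0, 0.0]),
                 high=np.array([1.0, np.inf]),
                 dtype=np.float32)
    box.is_bounded('below')  # True: both dimensions are bounded below
    box.is_bounded('both')   # False: the second dimension has no upper bound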
class eve.app.space.EveDict(spaces=None, **spaces_kwargs)

Bases: eve.app.space.EveSpace

A dictionary of simpler spaces.

Example usage: self.observation_space = EveDict({"position": EveDiscrete(2), "velocity": EveDiscrete(3)})

Example usage [nested]:

>>> self.nested_observation_space = EveDict({
...     'sensors': EveDict({
...         'position': EveBox(low=-100, high=100, shape=(3,)),
...         'velocity': EveBox(low=-1, high=1, shape=(3,)),
...         'front_cam': EveTuple((
...             EveBox(low=0, high=1, shape=(10, 10, 3)),
...             EveBox(low=0, high=1, shape=(10, 10, 3))
...         )),
...         'rear_cam': EveBox(low=0, high=1, shape=(10, 10, 3)),
...     }),
...     'ext_controller': EveMultiDiscrete((5, 2, 2)),
...     'inner_state': EveDict({
...         'charge': EveDiscrete(100),
...         'system_checks': EveMultiBinary(10),
...         'job_status': EveDict({
...             'task': EveDiscrete(5),
...             'progress': EveBox(low=0, high=100, shape=()),
...         })
...     })
... })
seed(seed=None)
sample()
contains(x)
to_jsonable(sample_n)
from_jsonable(sample_n)
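
Sampling a dict space yields one sample per sub-space under the same keys; a small sketch:

    from eve.app.space import EveBox, EveDict, EveDiscrete

    space = EveDict({"position": EveDiscrete(2),
                     "velocity": EveBox(low=-1, high=1, shape=(3,))})
    obs = space.sample()  # e.g. {'position': 1, 'velocity': array([...], dtype=float32)}
    assert space.contains(obs)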
class eve.app.space.EveDiscrete(n, max_neurons: Optional[int] = None, max_states: Optional[int] = None)

Bases: eve.app.space.EveSpace

A discrete space in \(\{0, 1, \dots, n-1\}\).

Example:

>>> EveDiscrete(2)
sample()
contains(x)
class eve.app.space.EveMultiBinary(n, max_neurons: Optional[int] = None, max_states: Optional[int] = None)

Bases: eve.app.space.EveSpace

An n-shaped binary space.

The argument to EveMultiBinary defines n, which can be a single integer or a list of integers.

Example usage:

>>> self.observation_space = EveMultiBinary(5)
>>> self.observation_space.sample()
array([0, 1, 0, 1, 0], dtype=int8)
>>> self.observation_space = EveMultiBinary([3, 2])
>>> self.observation_space.sample()
array([[0, 0],
       [0, 1],
       [1, 1]], dtype=int8)

sample()
contains(x)
to_jsonable(sample_n)
from_jsonable(sample_n)
class eve.app.space.EveMultiDiscrete(nvec, max_neurons: Optional[int] = None, max_states: Optional[int] = None)

Bases: eve.app.space.EveSpace

  • The multi-discrete action space consists of a series of discrete action spaces with a different number of actions in each

  • It is useful to represent game controllers or keyboards where each key can be represented as a discrete action space

  • It is parametrized by passing an array of positive integers specifying the number of actions for each discrete action space

Note: Some environment wrappers assume a value of 0 always represents the NOOP action.

e.g. Nintendo Game Controller - Can be conceptualized as 3 discrete action spaces:

  1. Arrow Keys: Discrete 5 - NOOP[0], UP[1], RIGHT[2], DOWN[3], LEFT[4] - params: min: 0, max: 4

  2. Button A: Discrete 2 - NOOP[0], Pressed[1] - params: min: 0, max: 1

  3. Button B: Discrete 2 - NOOP[0], Pressed[1] - params: min: 0, max: 1

  • Can be initialized as

    MultiDiscrete([ 5, 2, 2 ])

nvec: vector of counts of each categorical variable

sample()
contains(x)
to_jsonable(sample_n)
from_jsonable(sample_n)
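
For the game-controller example above, each sample holds one integer per sub-space; a sketch:

    from eve.app.space import EveMultiDiscrete

    controller = EveMultiDiscrete([5, 2, 2])  # arrow keys, button A, button B
    action = controller.sample()              # e.g. array([3, 0, 1])
    assert controller.contains(action)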
class eve.app.space.EveTuple(spaces)

Bases: eve.app.space.EveSpace

A tuple (i.e., product) of simpler spaces

Example usage: self.observation_space = EveTuple((EveDiscrete(2), EveDiscrete(3)))

seed(seed=None)
sample()
contains(x)
to_jsonable(sample_n)
from_jsonable(sample_n)
eve.app.space.flatdim(space)

Return the number of dimensions a flattened equivalent of this space would have.

Accepts a space and returns an integer. Raises NotImplementedError if the space is not defined in eve.app.space.

eve.app.space.flatten(space, x)

Flatten a data point from a space.

This is useful when e.g. points from spaces must be passed to a neural network, which only understands flat arrays of floats.

Accepts a space and a point from that space. Always returns a 1D array. Raises NotImplementedError if the space is not defined in eve.app.space.

eve.app.space.unflatten(space, x)

Unflatten a data point from a space.

This reverses the transformation applied by flatten(). You must ensure that the space argument is the same as for the flatten() call.

Accepts a space and a flattened point. Returns a point with a structure that matches the space. Raises NotImplementedError if the space is not defined in eve.app.space.

eve.app.space.flatten_space(space)

Flatten a space into a single Box.

This is equivalent to flatten(), but operates on the space itself. The result is always a Box with flat boundaries. The box has exactly flatdim(space) dimensions. Flattening a sample of the original space has the same effect as taking a sample of the flattened space.

Raises NotImplementedError if the space is not defined in eve.app.space.

Example:

>>> box = EveBox(0.0, 1.0, shape=(3, 4, 5))
>>> box
EveBox(3, 4, 5)
>>> flatten_space(box)
EveBox(60,)
>>> flatten(box, box.sample()) in flatten_space(box)
True

Example that flattens a discrete space:

>>> discrete = EveDiscrete(5)
>>> flatten_space(discrete)
EveBox(5,)
>>> flatten(discrete, discrete.sample()) in flatten_space(discrete)
True

Example that recursively flattens a dict:

>>> space = EveDict({"position": EveDiscrete(2),
...                  "velocity": EveBox(0, 1, shape=(2, 2))})
>>> flatten_space(space)
EveBox(6,)
>>> flatten(space, space.sample()) in flatten_space(space)
True
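
A round-trip sketch tying flatdim(), flatten(), and unflatten() together:

    import numpy as np
    from eve.app.space import EveBox, flatdim, flatten, unflatten

    space = EveBox(0.0, 1.0, shape=(3, 4))
    x = space.sample()
    flat = flatten(space, x)         # 1D array; len(flat) == flatdim(space) == 12
    x_back = unflatten(space, flat)  # same structure as x
    assert np.allclose(x, x_back)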

eve.app.trainer module

eve.app.upgrader module

eve.app.utils module

eve.app.utils.set_random_seed(seed: int, using_cuda: bool = False) None

Seed the different random generators.

Parameters
  • seed

  • using_cuda

eve.app.utils.explained_variance(y_pred: numpy.ndarray, y_true: numpy.ndarray) numpy.ndarray

Computes the fraction of variance that y_pred explains about y_true. Returns 1 - Var[y_true - y_pred] / Var[y_true].

Interpretation:
  • ev = 0 => might as well have predicted zero

  • ev = 1 => perfect prediction

  • ev < 0 => worse than just predicting zero

Parameters
  • y_pred – the prediction

  • y_true – the expected value

Returns

explained variance of y_pred and y_true
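
The formula above is straightforward to express in NumPy; a reference sketch (our helper name, not the library function):

    import numpy as np

    def explained_variance_ref(y_pred: np.ndarray, y_true: np.ndarray) -> float:
        """1 - Var[y_true - y_pred] / Var[y_true]; NaN when Var[y_true] == 0."""
        var_y = np.var(y_true)
        return np.nan if var_y == 0 else 1.0 - np.var(y_true - y_pred) / var_y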

eve.app.utils.update_learning_rate(optimizer: torch.optim.optimizer.Optimizer, learning_rate: float) None

Update the learning rate for a given optimizer. Useful when doing linear schedule.

Parameters
  • optimizer

  • learning_rate
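
A usage sketch with a toy model (updating every param group's lr is the conventional PyTorch approach, which we assume this helper follows):

    import torch
    from eve.app.utils import update_learning_rate

    model = torch.nn.Linear(4, 2)
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

    # E.g. anneal the learning rate halfway through training.
    progress_remaining = 0.5
    update_learning_rate(optimizer, 1e-3 * progress_remaining)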

eve.app.utils.get_schedule_fn(value_schedule: Union[Callable[[float], float], float, int]) Callable[[float], float]

Transform (if needed) a learning rate or clip range (for PPO) into a callable.

Parameters

value_schedule

Returns

eve.app.utils.get_linear_fn(start: float, end: float, end_fraction: float) Callable[[float], float]

Create a function that interpolates linearly between start and end between progress_remaining = 1 and progress_remaining = end_fraction. This is used in DQN for linearly annealing the exploration fraction (epsilon for the epsilon-greedy strategy).

Parameters
  • start – value to start with if progress_remaining = 1

  • end – value to end with if progress_remaining = 0

  • end_fraction – fraction of progress_remaining where end is reached, e.g. 0.1 means end is reached after 10% of the complete training process

Returns
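
The returned schedule maps progress_remaining in [0, 1] to a value that starts at start (progress_remaining = 1), reaches end once 1 - progress_remaining exceeds end_fraction, and holds it from there. A minimal sketch of that behavior (ours, not necessarily the library's exact source):

    from typing import Callable

    def linear_fn_sketch(start: float, end: float,
                         end_fraction: float) -> Callable[[float], float]:
        def func(progress_remaining: float) -> float:
            if (1 - progress_remaining) > end_fraction:
                return end  # end value reached; hold it for the rest of training
            return start + (1 - progress_remaining) * (end - start) / end_fraction
        return func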

eve.app.utils.linear_schedule(initial_value: Union[float, str]) Callable[[float], float]

Linear learning rate schedule.

Parameters

initial_value – (float or str)

eve.app.utils.constant_fn(val: float) Callable[[float], float]

Create a function that returns a constant. It is useful for learning rate schedules (to avoid code duplication).

Parameters

val

Returns

eve.app.utils.get_device(device: Union[torch.device, str] = 'auto') torch.device

Retrieve PyTorch device. It checks that the requested device is available first. For now, it supports only CPU and CUDA. By default, it tries to use the GPU.

Parameters

device – One of 'auto', 'cuda', 'cpu'

Returns

eve.app.utils.get_latest_run_id(log_path: Optional[str] = None, log_name: str = '') int

Returns the latest run number for the given log name and log path, by finding the greatest number in the directories.

Returns

latest run number

eve.app.utils.safe_mean(arr: Union[numpy.ndarray, list]) numpy.ndarray

Compute the mean of an array if there is at least one element; for an empty array, return NaN. It is used for logging only.

Parameters

arr

Returns

eve.app.utils.zip_strict(*iterables: Iterable) Iterable

zip() function but enforces that the iterables are of equal length, raising ValueError if they are not. Code inspired by a Stack Overflow answer to question #32954486.

Parameters

*iterables – iterables to zip()
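
One common way to implement this check is a sentinel with itertools.zip_longest; a sketch of that approach:

    from itertools import zip_longest
    from typing import Iterable

    def zip_strict_sketch(*iterables: Iterable) -> Iterable:
        sentinel = object()  # unique marker that cannot appear in user data
        for combo in zip_longest(*iterables, fillvalue=sentinel):
            if sentinel in combo:
                raise ValueError("Iterables have different lengths")
            yield combo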

eve.app.utils.polyak_update(params: Iterable[torch.nn.parameter.Parameter], target_params: Iterable[torch.nn.parameter.Parameter], tau: float) None

Perform a Polyak average update on target_params using params: the target parameters are slowly updated towards the main parameters. tau, the soft update coefficient, controls the interpolation: tau=1 corresponds to copying the parameters to the target ones, whereas nothing happens when tau=0. The Polyak update is done in place, with no_grad, and therefore does not create intermediate tensors or a computation graph, reducing memory cost and improving performance. We scale the target params by 1 - tau (in place), add the new weights scaled by tau, and store the result of the sum in the target params (in place). See https://github.com/DLR-RM/stable-baselines3/issues/93

Parameters
  • params – parameters to use to update the target params

  • target_params – parameters to update

  • tau – the soft update coefficient (“Polyak update”, between 0 and 1)
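
The in-place rule described above amounts to target <- (1 - tau) * target + tau * param for each parameter pair; a sketch under that reading:

    import torch

    def polyak_update_sketch(params, target_params, tau: float) -> None:
        with torch.no_grad():  # no computation graph, no intermediate tensors
            for param, target in zip(params, target_params):
                target.mul_(1 - tau)           # scale target by 1 - tau, in place
                target.add_(param, alpha=tau)  # add tau * param, in place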

eve.app.utils.recursive_getattr(obj: Any, attr: str, *args) Any

Recursive version of getattr taken from https://stackoverflow.com/questions/31174295

Example:

>>> MyObject.sub_object = SubObject(name='test')
>>> recursive_getattr(MyObject, 'sub_object.name')
'test'

Parameters
  • obj

  • attr – Attribute to retrieve

Returns

The attribute

eve.app.utils.recursive_setattr(obj: Any, attr: str, val: Any) None

Recursive version of setattr taken from https://stackoverflow.com/questions/31174295

Example:

>>> MyObject.sub_object = SubObject(name='test')
>>> recursive_setattr(MyObject, 'sub_object.name', 'hello')

Parameters
  • obj

  • attr – Attribute to set

  • val – New value of the attribute
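
The Stack Overflow recipe both functions cite reduces over the dotted attribute path; a sketch:

    import functools

    def recursive_getattr_sketch(obj, attr, *args):
        def _getattr(o, name):
            return getattr(o, name, *args)
        return functools.reduce(_getattr, attr.split("."), obj)

    def recursive_setattr_sketch(obj, attr, val):
        pre, _, post = attr.rpartition(".")
        target = recursive_getattr_sketch(obj, pre) if pre else obj
        setattr(target, post, val)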

eve.app.utils.is_json_serializable(item: Any) bool

Test if an object is serializable into JSON

Parameters

item – The object to be tested for JSON serialization.

Returns

True if the object is JSON serializable, False otherwise.

eve.app.utils.data_to_json(data: Dict[str, Any]) str

Turn data (class parameters) into a JSON string for storing

Parameters

data – Dictionary of class parameters to be stored. Items that are not JSON serializable will be pickled with Cloudpickle and stored as bytearray in the JSON file

Returns

JSON string of the data serialized.

eve.app.utils.json_to_data(json_string: str, custom_objects: Optional[Dict[str, Any]] = None) Dict[str, Any]

Turn JSON serialization of class-parameters back into dictionary.

Parameters
  • json_string – JSON serialization of the class-parameters that should be loaded.

  • custom_objects – Dictionary of objects to replace upon loading. If a variable is present in this dictionary as a key, it will not be deserialized and the corresponding item will be used instead. Similar to custom_objects in keras.models.load_model. Useful when you have an object in the file that cannot be deserialized.

Returns

Loaded class parameters.
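
A round-trip sketch with data_to_json(); values that are not JSON serializable survive via the cloudpickle fallback described above:

    from eve.app.utils import data_to_json, json_to_data

    data = {"gamma": 0.99, "net_arch": [64, 64]}
    restored = json_to_data(data_to_json(data))
    assert restored["gamma"] == 0.99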

eve.app.utils.open_path(path: Union[str, pathlib.Path, io.BufferedIOBase], mode: str, verbose: int = 0, suffix: Optional[str] = None)

Opens a path for reading or writing with a preferred suffix and emits debug information. If the provided path is a derivative of io.BufferedIOBase, it ensures that the file matches the provided mode: if the mode is read ("r", "read"), it checks that the path is readable; if the mode is write ("w", "write"), it checks that the file is writable.

If the provided path is a string or a pathlib.Path, it ensures that it exists. If the mode is "read", it checks that the path exists; if it does not and a suffix is provided, it attempts to read path.suffix instead. If the mode is "write" and the path does not exist, it creates all the parent folders. If the path points to a folder, it changes the path to path_2. If the path already exists and verbose == 2, it raises a warning.

Parameters
  • path – the path to open. If path is a str or pathlib.Path and mode is "w", single dispatch ensures that the path actually exists. If path is an io.BufferedIOBase, the path is assumed to exist.

  • mode – how to open the file. “w”|”write” for writing, “r”|”read” for reading.

  • verbose – Verbosity level, 0 means only warnings, 2 means debug information.

  • suffix – The preferred suffix. If mode is “w” then the opened file has the suffix. If mode is “r” then we attempt to open the path. If an error is raised and the suffix is not None, we attempt to open the path with the suffix.

Returns

eve.app.utils.open_path_str(path: str, mode: str, verbose: int = 0, suffix: Optional[str] = None) io.BufferedIOBase

Open a path given by a string. If writing to the path, the function ensures that the path exists.

Parameters
  • path – the path to open. If mode is “w” then it ensures that the path exists by creating the necessary folders and renaming path if it points to a folder.

  • mode – how to open the file. “w” for writing, “r” for reading.

  • verbose – Verbosity level, 0 means only warnings, 2 means debug information.

  • suffix – The preferred suffix. If mode is “w” then the opened file has the suffix. If mode is “r” then we attempt to open the path. If an error is raised and the suffix is not None, we attempt to open the path with the suffix.

Returns

eve.app.utils.open_path_pathlib(path: pathlib.Path, mode: str, verbose: int = 0, suffix: Optional[str] = None) io.BufferedIOBase

Open a path given by a pathlib.Path. If writing to the path, the function ensures that the path exists.

Parameters
  • path – the path to check. If mode is “w” then it ensures that the path exists by creating the necessary folders and renaming path if it points to a folder.

  • mode – how to open the file. “w” for writing, “r” for reading.

  • verbose – Verbosity level, 0 means only warnings, 2 means debug information.

  • suffix – The preferred suffix. If mode is “w” then the opened file has the suffix. If mode is “r” then we attempt to open the path. If an error is raised and the suffix is not None, we attempt to open the path with the suffix.

Returns

eve.app.utils.save_to_zip_file(save_path: Union[str, pathlib.Path, io.BufferedIOBase], data: Optional[Dict[str, Any]] = None, params: Optional[Dict[str, Any]] = None, pytorch_variables: Optional[Dict[str, Any]] = None, verbose: int = 0) None

Save model data to a zip archive.

Parameters
  • save_path – Where to store the model. If save_path is a str or pathlib.Path, it ensures that the path actually exists.

  • data – Class parameters being stored (non-PyTorch variables)

  • params – Model parameters being stored expected to contain an entry for every state_dict with its name and the state_dict.

  • pytorch_variables – Other PyTorch variables expected to contain name and value of the variable.

  • verbose – Verbosity level, 0 means only warnings, 2 means debug information

eve.app.utils.save_to_pkl(path: Union[str, pathlib.Path, io.BufferedIOBase], obj: Any, verbose: int = 0) None

Save an object to path creating the necessary folders along the way. If the path exists and is a directory, it will raise a warning and rename the path. If a suffix is provided in the path, it will use that suffix, otherwise, it will use ‘.pkl’.

Parameters
  • path – the path to open. If path is a str or pathlib.Path, single dispatch ensures that the path actually exists. If path is an io.BufferedIOBase, the path is assumed to exist.

  • obj – The object to save.

  • verbose – Verbosity level, 0 means only warnings, 2 means debug information.

eve.app.utils.load_from_pkl(path: Union[str, pathlib.Path, io.BufferedIOBase], verbose: int = 0) Any

Load an object from the path. If a suffix is provided in the path, it will use that suffix. If the path does not exist, it will attempt to load using the .pkl suffix.

Parameters
  • path – the path to open. If path is a str or pathlib.Path, single dispatch ensures that the path actually exists. If path is an io.BufferedIOBase, the path is assumed to exist.

  • verbose – Verbosity level, 0 means only warnings, 2 means debug information.

eve.app.utils.load_from_zip_file(load_path: Union[str, pathlib.Path, io.BufferedIOBase], load_data: bool = True, device: Union[torch.device, str] = 'auto', verbose: int = 0) Tuple[Optional[Dict[str, Any]], Optional[Dict[str, torch.Tensor]], Optional[Dict[str, torch.Tensor]]]

Load model data from a .zip archive

Parameters
  • load_path – Where to load the model from

  • load_data – Whether we should load and return data (class parameters). Mainly used by ‘load_parameters’ to only load model parameters (weights)

  • device – Device on which the code should run.

Returns

Class parameters, model state_dicts (aka “params”, dict of state_dict) and dict of pytorch variables
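
A save/load round-trip sketch (the archive name and the one-layer "policy" are placeholders):

    import torch
    from eve.app.utils import load_from_zip_file, save_to_zip_file

    policy = torch.nn.Linear(4, 2)
    save_to_zip_file("model.zip",
                     data={"gamma": 0.99},
                     params={"policy": policy.state_dict()})
    data, params, pytorch_variables = load_from_zip_file("model.zip")
    policy.load_state_dict(params["policy"])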

eve.app.utils.get_trained_models(log_folder: str) Dict[str, Tuple[str, str]]
Parameters

log_folder – (str) Root log folder

Returns

(Dict[str, Tuple[str, str]]) Dict representing the trained agent

eve.app.utils.get_saved_hyperparams(stats_path: str, norm_reward: bool = False, test_mode: bool = False) Tuple[Dict[str, Any], str]
Parameters
  • stats_path

  • norm_reward

  • test_mode

Module contents