eve.app package
Subpackages
Submodules
eve.app.algo module
eve.app.buffers module
- class eve.app.buffers.RolloutBufferSamples(observations, actions, old_values, old_log_prob, advantages, returns)
Bases:
tuple
Create new instance of RolloutBufferSamples(observations, actions, old_values, old_log_prob, advantages, returns)
- property observations
Alias for field number 0
- property actions
Alias for field number 1
- property old_values
Alias for field number 2
- property old_log_prob
Alias for field number 3
- property advantages
Alias for field number 4
- property returns
Alias for field number 5
- class eve.app.buffers.ReplayBufferSamples(observations, actions, next_observations, dones, rewards)
Bases:
tuple
Create new instance of ReplayBufferSamples(observations, actions, next_observations, dones, rewards)
- property observations
Alias for field number 0
- property actions
Alias for field number 1
- property next_observations
Alias for field number 2
- property dones
Alias for field number 3
- property rewards
Alias for field number 4
- class eve.app.buffers.RolloutReturn(episode_reward, episode_timesteps, n_episodes, continue_training)
Bases:
tuple
Create new instance of RolloutReturn(episode_reward, episode_timesteps, n_episodes, continue_training)
- property episode_reward
Alias for field number 0
- property episode_timesteps
Alias for field number 1
- property n_episodes
Alias for field number 2
- property continue_training
Alias for field number 3
- eve.app.buffers.get_action_dim(action_space: eve.app.space.EveSpace) int
Get the dimension of the action space.
- Parameters
action_space –
- Returns
- eve.app.buffers.get_obs_shape(observation_space: eve.app.space.EveSpace) Tuple[int, ...]
Get the shape of the observation (useful for the buffers).
- Parameters
observation_space –
- Returns
- class eve.app.buffers.BaseBuffer(buffer_size: int, observation_space: eve.app.space.EveSpace, action_space: eve.app.space.EveSpace, device: Union[torch.device, str] = 'cpu', n_envs: int = 1, sample_episode: bool = False)
Bases:
abc.ABC
Base class that represents a buffer (rollout or replay)
- Parameters
buffer_size – Max number of elements in the buffer
observation_space – Observation space
action_space – Action space
device – PyTorch device to which the values will be converted
n_envs – Number of parallel environments
sample_episode – If False, we will sample the observations in a random-states format and return batch_size states. If True, we will sample the observations in a random-episode format and return batch_size episodes. NOTE: if True, all episodes must have the same length, or the batch size must be 1; otherwise we cannot stack episodes of different lengths.
- static swap_and_flatten(arr: numpy.ndarray) numpy.ndarray
Swap and then flatten axes 0 (buffer_size) and 1 (n_envs) to convert shape from [n_steps, n_envs, …] (where … is the shape of the features) to [n_steps * n_envs, …] (which maintains the order)
- Parameters
arr –
- Returns
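For illustration, a minimal NumPy sketch of what this axis swap and flatten does (the shapes here are arbitrary placeholders):

    import numpy as np

    # Fictitious rollout data: 4 steps from 2 parallel envs, 3 features each.
    arr = np.zeros((4, 2, 3))

    # [n_steps, n_envs, ...] -> [n_steps * n_envs, ...], keeping the per-env
    # ordering after the swap, as swap_and_flatten does.
    flat = arr.swapaxes(0, 1).reshape(4 * 2, 3)
    print(flat.shape)  # (8, 3)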
- size() int
- Returns
The current size of the buffer
- add(*args, **kwargs) None
Add elements to the buffer.
- extend(*args, **kwargs) None
Add a new batch of transitions to the buffer
- reset() None
Reset the buffer.
- sample(batch_size: int, env: Optional[VecNormalize] = None)
- Parameters
batch_size – Number of elements to sample
env – associated VecEnv to normalize the observations/rewards when sampling
- Returns
If sampling episodes, returns a list (of episode length) containing BufferSamples; otherwise, returns BufferSamples.
- to_torch(array: numpy.ndarray, copy: bool = True) torch.Tensor
Convert a numpy array to a PyTorch tensor. Note: it copies the data by default
- Parameters
array –
copy – Whether or not to copy the data (may be useful to avoid changing things by reference)
- Returns
- class eve.app.buffers.ReplayBuffer(buffer_size: int, observation_space: eve.app.space.EveSpace, action_space: eve.app.space.EveSpace, device: Union[torch.device, str] = 'cpu', n_envs: int = 1, sample_episode: bool = False)
Bases:
eve.app.buffers.BaseBuffer
Replay buffer used in off-policy algorithms like SAC/TD3.
- Parameters
buffer_size – Max number of elements in the buffer
observation_space – Observation space
action_space – Action space
device –
n_envs – Number of parallel environments
- add(obs: numpy.ndarray, next_obs: numpy.ndarray, action: numpy.ndarray, reward: numpy.ndarray, done: numpy.ndarray) None
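A hedged usage sketch of the replay buffer, built only from the signatures documented above; the exact batch layout expected by add() (here a leading n_envs dimension) and the EveBox arguments are assumptions:

    import numpy as np
    from eve.app.buffers import ReplayBuffer
    from eve.app.space import EveBox

    # Hypothetical 4-D observation and 2-D action spaces.
    obs_space = EveBox(low=-1.0, high=1.0, shape=(4,))
    act_space = EveBox(low=-1.0, high=1.0, shape=(2,))

    buffer = ReplayBuffer(buffer_size=1000, observation_space=obs_space,
                          action_space=act_space, device="cpu", n_envs=1)

    # One transition per step: (obs, next_obs, action, reward, done),
    # each with a leading n_envs dimension (assumed layout).
    obs = np.zeros((1, 4), dtype=np.float32)
    next_obs = np.ones((1, 4), dtype=np.float32)
    action = np.zeros((1, 2), dtype=np.float32)
    buffer.add(obs, next_obs, action, np.array([0.0]), np.array([False]))

    # Sample a mini-batch of ReplayBufferSamples (tensors on the buffer's device).
    batch = buffer.sample(batch_size=1)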
- class eve.app.buffers.RolloutBuffer(buffer_size: int, observation_space: eve.app.space.EveSpace, action_space: eve.app.space.EveSpace, device: Union[torch.device, str] = 'cpu', gae_lambda: float = 1, gamma: float = 0.99, n_envs: int = 1, sample_episode: bool = False)
Bases:
eve.app.buffers.BaseBuffer
Rollout buffer used in on-policy algorithms like A2C/PPO. It corresponds to
buffer_size
transitions collected using the current policy. This experience will be discarded after the policy update. In order to use the PPO objective, we also store the current value of each state and the log probability of each taken action.
The term rollout here refers to the model-free notion and should not be confused with the rollout used in model-based RL or planning. Hence, it is only involved in policy and value function training but not action selection.
- Parameters
buffer_size – Max number of elements in the buffer
observation_space – Observation space
action_space – Action space
device –
gae_lambda – Factor for the trade-off of bias vs variance for the Generalized Advantage Estimator. Equivalent to the classic advantage when set to 1.
gamma – Discount factor
n_envs – Number of parallel environments
- reset() None
- compute_returns_and_advantage(last_values: torch.Tensor, dones: numpy.ndarray) None
Post-processing step: compute the returns (sum of discounted rewards) and GAE advantage. Adapted from Stable-Baselines PPO2.
Uses Generalized Advantage Estimation (https://arxiv.org/abs/1506.02438) to compute the advantage. To obtain vanilla advantage (A(s) = R - V(s)) where R is the discounted reward with value bootstrap, set gae_lambda=1.0 during initialization.
- Parameters
last_values –
dones –
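For reference, a single-environment NumPy sketch of the GAE recursion this method implements; the actual buffer works on torch tensors with an extra n_envs dimension, and its terminal handling may differ in detail:

    import numpy as np

    def compute_gae(rewards, values, dones, last_value, last_done,
                    gamma=0.99, gae_lambda=0.95):
        """GAE for one env: delta_t = r_t + gamma * V(s_{t+1}) * (1 - done) - V(s_t),
        A_t = delta_t + gamma * lambda * (1 - done) * A_{t+1}."""
        n_steps = len(rewards)
        advantages = np.zeros(n_steps)
        last_gae = 0.0
        for t in reversed(range(n_steps)):
            if t == n_steps - 1:
                next_non_terminal = 1.0 - last_done   # bootstrap after the rollout
                next_value = last_value
            else:
                next_non_terminal = 1.0 - dones[t + 1]
                next_value = values[t + 1]
            delta = rewards[t] + gamma * next_value * next_non_terminal - values[t]
            last_gae = delta + gamma * gae_lambda * next_non_terminal * last_gae
            advantages[t] = last_gae
        returns = advantages + values
        return advantages, returns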
- add(obs: numpy.ndarray, action: numpy.ndarray, reward: numpy.ndarray, done: numpy.ndarray, value: torch.Tensor, log_prob: torch.Tensor) None
- Parameters
obs – Observation
action – Action
reward –
done – End of episode signal.
value – estimated value of the current state following the current policy.
log_prob – log probability of the action following the current policy.
eve.app.callbacks module
- eve.app.callbacks.sync_envs_normalization(env: EveEnv, eval_env: EveEnv) None
Sync eval env and train env when using VecNormalize
- Parameters
env –
eval_env –
- eve.app.callbacks.evaluate_policy(model: algo.BaseAlgorithm, env: EveEnv, n_eval_episodes: int = 10, deterministic: bool = True, callback: Optional[Callable[[Dict[str, Any], Dict[str, Any]], None]] = None, reward_threshold: Optional[float] = None, return_episode_rewards: bool = False, warn: bool = True) Union[Tuple[float, float], Tuple[List[float], List[int]]]
Runs policy for
n_eval_episodes
episodes and returns the average reward. This is made to work only with one env.
Note
If the environment has not been wrapped with a Monitor wrapper, rewards and episode lengths are counted as they appear with env.step calls. If the environment contains wrappers that modify rewards or episode lengths (e.g. reward scaling, early episode reset), these will affect the evaluation results as well. You can avoid this by wrapping the environment with a Monitor wrapper before anything else.
- Parameters
model – The RL agent you want to evaluate.
env – The environment. In the case of a VecEnv this must contain only one environment.
n_eval_episodes – Number of episodes to evaluate the agent
deterministic – Whether to use deterministic or stochastic actions
callback – callback function to do additional checks, called after each step. Gets locals() and globals() passed as parameters.
reward_threshold – Minimum expected reward per episode; this will raise an error if the performance is not met
return_episode_rewards – If True, a list of rewards and episode lengths per episode will be returned instead of the mean.
warn – If True (default), warns user about lack of a Monitor wrapper in the evaluation environment.
- Returns
Mean reward per episode, std of reward per episode. Returns ([float], [int]) when
return_episode_rewards
is True, first list containing per-episode rewards and second containing per-episode lengths (in number of steps).
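A hedged usage sketch; `model` stands for any trained eve.app.algo.BaseAlgorithm and `eval_env` for a Monitor-wrapped environment built elsewhere (both are placeholders):

    from eve.app.callbacks import evaluate_policy

    # Mean/std of the episodic reward over 10 deterministic evaluation episodes.
    mean_reward, std_reward = evaluate_policy(model, eval_env,
                                              n_eval_episodes=10,
                                              deterministic=True)
    print(f"mean reward: {mean_reward:.2f} +/- {std_reward:.2f}")

    # Per-episode rewards and lengths instead of the aggregated statistics.
    episode_rewards, episode_lengths = evaluate_policy(
        model, eval_env, n_eval_episodes=10, return_episode_rewards=True)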
- class eve.app.callbacks.BaseCallback(verbose: int = 0)
Bases:
abc.ABC
Base class for callback.
- Parameters
verbose –
- init_callback(model: algo.BaseAlgorithm) None
Initialize the callback by saving references to the RL model and the training environment for convenience.
- on_training_start(locals_: Dict[str, Any], globals_: Dict[str, Any]) None
- on_rollout_start() None
- on_step() bool
This method will be called by the model after each call to
env.step()
. For a child callback (of an EventCallback), this will be called when the event is triggered.
- Returns
If the callback returns False, training is aborted early.
- on_training_end() None
- on_rollout_end() None
- update_locals(locals_: Dict[str, Any]) None
Update the references to the local variables.
- Parameters
locals – the local variables during rollout collection
- update_child_locals(locals_: Dict[str, Any]) None
Update the references to the local variables on sub callbacks.
- Parameters
locals – the local variables during rollout collection
- class eve.app.callbacks.EventCallback(callback: Optional[eve.app.callbacks.BaseCallback] = None, verbose: int = 0)
Bases:
eve.app.callbacks.BaseCallback
Base class for triggering callback on event.
- Parameters
callback – Callback that will be called when an event is triggered.
verbose –
- init_callback(model: algo.BaseAlgorithm) None
- update_child_locals(locals_: Dict[str, Any]) None
Update the references to the local variables.
- Parameters
locals – the local variables during rollout collection
- class eve.app.callbacks.CallbackList(callbacks: List[eve.app.callbacks.BaseCallback])
Bases:
eve.app.callbacks.BaseCallback
Class for chaining callbacks.
- Parameters
callbacks – A list of callbacks that will be called sequentially.
- update_child_locals(locals_: Dict[str, Any]) None
Update the references to the local variables.
- Parameters
locals – the local variables during rollout collection
- class eve.app.callbacks.CheckpointCallback(save_freq: int, save_path: str, name_prefix: str = 'rl_model', verbose: int = 0)
Bases:
eve.app.callbacks.BaseCallback
Callback for saving a model every
save_freq
steps.
- Parameters
save_freq –
save_path – Path to the folder where the model will be saved.
name_prefix – Common prefix to the saved models
verbose –
- class eve.app.callbacks.ConvertCallback(callback: Callable[[Dict[str, Any], Dict[str, Any]], bool], verbose: int = 0)
Bases:
eve.app.callbacks.BaseCallback
Convert functional callback (old-style) to object.
- Parameters
callback –
verbose –
- class eve.app.callbacks.EvalCallback(eval_env: EveEnv, callback_on_new_best: Optional[eve.app.callbacks.BaseCallback] = None, n_eval_episodes: int = 5, eval_freq: int = 10000, log_path: str = None, best_model_save_path: str = None, deterministic: bool = True, verbose: int = 1, warn: bool = True)
Bases:
eve.app.callbacks.EventCallback
Callback for evaluating an agent.
- Parameters
eval_env – The environment used for initialization
callback_on_new_best – Callback to trigger when there is a new best model according to the
mean_reward
n_eval_episodes – The number of episodes to test the agent
eval_freq – Evaluate the agent every eval_freq calls of the callback.
log_path – Path to a folder where the evaluations (
evaluations.npz
) will be saved. It will be updated at each evaluation.
best_model_save_path – Path to a folder where the best model according to performance on the eval env will be saved.
deterministic – Whether the evaluation should use stochastic or deterministic actions.
verbose –
warn – Passed to
evaluate_policy
(warns if eval_env
has not been wrapped with a Monitor wrapper)
- update_child_locals(locals_: Dict[str, Any]) None
Update the references to the local variables.
- Parameters
locals – the local variables during rollout collection
- class eve.app.callbacks.StopTrainingOnRewardThreshold(reward_threshold: float, verbose: int = 0)
Bases:
eve.app.callbacks.BaseCallback
Stop the training once a threshold in episodic reward has been reached (i.e. when the model is good enough).
It must be used with the EvalCallback.
- Parameters
reward_threshold – Minimum expected reward per episode to stop training.
verbose –
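A hedged sketch chaining the callbacks documented above; `eval_env` is assumed to exist, and the commented learn() call is an assumed training entry point not documented in this section:

    from eve.app.callbacks import (CallbackList, CheckpointCallback,
                                   EvalCallback, StopTrainingOnRewardThreshold)

    # Save a checkpoint every 10k steps.
    checkpoint_cb = CheckpointCallback(save_freq=10_000, save_path="./checkpoints/")

    # Evaluate every 5k steps and stop once the mean reward crosses a threshold.
    stop_cb = StopTrainingOnRewardThreshold(reward_threshold=200.0, verbose=1)
    eval_cb = EvalCallback(eval_env, callback_on_new_best=stop_cb,
                           n_eval_episodes=5, eval_freq=5_000,
                           best_model_save_path="./best_model/")

    callbacks = CallbackList([checkpoint_cb, eval_cb])
    # model.learn(total_timesteps=100_000, callback=callbacks)  # assumed API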
- class eve.app.callbacks.EveryNTimesteps(n_steps: int, callback: eve.app.callbacks.BaseCallback)
Bases:
eve.app.callbacks.EventCallback
Trigger a callback every
n_steps
timesteps.
- Parameters
n_steps – Number of timesteps between two triggers.
callback – Callback that will be called when the event is triggered.
- class eve.app.callbacks.StopTrainingOnMaxEpisodes(max_episodes: int, verbose: int = 0)
Bases:
eve.app.callbacks.BaseCallback
Stop the training once a maximum number of episodes are played.
For multiple environments, this presumes that the desired behavior is that the agent trains on each env for max_episodes, and in total for max_episodes * n_envs episodes.
- Parameters
max_episodes – Maximum number of episodes to stop training.
verbose – Select whether to print information about when training ended by reaching
max_episodes
- class eve.app.callbacks.TrialEvalCallback(eval_env: eve.app.env.VecEnv, trial: optuna.trial._trial.Trial, n_eval_episodes: int = 5, eval_freq: int = 10000, deterministic: bool = True, verbose: int = 0)
Bases:
eve.app.callbacks.EvalCallback
Callback used for evaluating and reporting a trial.
- class eve.app.callbacks.SaveVecNormalizeCallback(save_freq: int, save_path: str, name_prefix: Optional[str] = None, verbose: int = 0)
Bases:
eve.app.callbacks.BaseCallback
Callback for saving a VecNormalize wrapper every
save_freq
steps.
- Parameters
save_freq (int) –
save_path (str) – Path to the folder where
VecNormalize
will be saved, as vecnormalize.pkl
name_prefix (str) – Common prefix to the saved
VecNormalize
, if None (default) only one file will be kept.
eve.app.env module
- class eve.app.env.EveEnv
Bases:
object
The main OpenAI Gym-style environment class. It encapsulates an environment with arbitrary behind-the-scenes dynamics. An environment can be partially or fully observed.
The main API methods that users of this class need to know are:
step, reset, render, close, seed
And set the following attributes:
action_space: The Space object corresponding to valid actions
observation_space: The Space object corresponding to valid observations
reward_range: A tuple corresponding to the min and max possible rewards
Note: a default reward range set to [-inf,+inf] already exists. Set it if you want a narrower range.
The methods are accessed publicly as “step”, “reset”, etc…
- metadata = {'render.modes': []}
- reward_range = (-inf, inf)
- spec = None
- action_space = None
- observation_space = None
- step(action)
Run one timestep of the environment’s dynamics. When end of episode is reached, you are responsible for calling reset() to reset this environment’s state.
Accepts an action and returns a tuple (observation, reward, done, info).
- Parameters
action (object) – an action provided by the agent
- Returns
observation (object): agent’s observation of the current environment
reward (float): amount of reward returned after previous action
done (bool): whether the episode has ended, in which case further step() calls will return undefined results
info (dict): contains auxiliary diagnostic information (helpful for debugging, and sometimes learning)
- Return type
observation (object)
- reset()
Resets the environment to an initial state and returns an initial observation.
Note that this function should not reset the environment’s random number generator(s); random variables in the environment’s state should be sampled independently between multiple calls to reset(). In other words, each call of reset() should yield an environment suitable for a new episode, independent of previous episodes.
- Returns
the initial observation.
- Return type
observation (object)
- render(mode='human')
Renders the environment.
The set of supported modes varies per environment. (And some environments do not support rendering at all.) By convention, if mode is:
human: render to the current display or terminal and return nothing. Usually for human consumption.
rgb_array: Return a numpy.ndarray with shape (x, y, 3), representing RGB values for an x-by-y pixel image, suitable for turning into a video.
ansi: Return a string (str) or StringIO.StringIO containing a terminal-style text representation. The text can include newlines and ANSI escape sequences (e.g. for colors).
Note
- Make sure that your class’s metadata ‘render.modes’ key includes
the list of supported modes. It’s recommended to call super() in implementations to use the functionality of this method.
- Parameters
mode (str) – the mode to render with
Example:

    class MyEnv(EveEnv):
        metadata = {'render.modes': ['human', 'rgb_array']}

        def render(self, mode='human'):
            if mode == 'rgb_array':
                return np.array(...)  # return RGB frame suitable for video
            elif mode == 'human':
                ...  # pop up a window and render
            else:
                super(MyEnv, self).render(mode=mode)  # just raise an exception
- close()
Override close in your subclass to perform any necessary cleanup.
Environments will automatically close() themselves when garbage collected or when the program exits.
- seed(seed=None)
Sets the seed for this env’s random number generator(s).
Note
Some environments use multiple pseudorandom number generators. We want to capture all such seeds used in order to ensure that there aren’t accidental correlations between multiple generators.
- Returns
the list of seeds used in this env’s random number generators. The first value in the list should be the “main” seed, or the value which a reproducer should pass to ‘seed’. Often, the main seed equals the provided ‘seed’, but this won’t be true if seed=None, for example.
- Return type
list<bigint>
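A minimal sketch of a custom environment implementing this interface; the space constructors follow the EveBox/EveDiscrete signatures documented under eve.app.space, and the counter dynamics are purely illustrative:

    import numpy as np
    from eve.app.env import EveEnv
    from eve.app.space import EveBox, EveDiscrete

    class CounterEnv(EveEnv):
        """Toy environment: the state is a counter; action 1 increments it, action 0 resets it."""

        def __init__(self):
            self.observation_space = EveBox(low=0.0, high=10.0, shape=(1,))
            self.action_space = EveDiscrete(2)
            self._state = 0.0

        def reset(self):
            self._state = 0.0
            return np.array([self._state], dtype=np.float32)

        def step(self, action):
            self._state = self._state + 1.0 if action == 1 else 0.0
            obs = np.array([self._state], dtype=np.float32)
            reward = float(self._state)
            done = self._state >= 10.0
            return obs, reward, done, {}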
- class eve.app.env.GoalEnv
Bases:
eve.app.env.EveEnv
A goal-based environment. It functions just as any regular OpenAI environment but it imposes a required structure on the observation_space. More concretely, the observation space is required to contain at least three elements, namely observation, desired_goal, and achieved_goal. Here, desired_goal specifies the goal that the agent should attempt to achieve. achieved_goal is the goal that the agent has currently achieved instead. observation contains the actual observations of the environment as per usual.
- reset()
- compute_reward(achieved_goal, desired_goal, info)
Compute the step reward. This externalizes the reward function and makes it dependent on a desired goal and the one that was achieved. If you wish to include additional rewards that are independent of the goal, you can include the necessary values to derive it in ‘info’ and compute it accordingly.
- Parameters
achieved_goal (object) – the goal that was achieved during execution
desired_goal (object) – the desired goal that we asked the agent to attempt to achieve
info (dict) – an info dictionary with additional information
- Returns
The reward that corresponds to the provided achieved goal w.r.t. the desired goal. Note that the following should always hold true:
ob, reward, done, info = env.step()
assert reward == env.compute_reward(ob['achieved_goal'], ob['goal'], info)
- Return type
float
- class eve.app.env.Wrapper(env)
Bases:
eve.app.env.EveEnv
Wraps the environment to allow a modular transformation.
This class is the base class for all wrappers. The subclass could override some methods to change the behavior of the original environment without touching the original code.
Note
Don’t forget to call
super().__init__(env)
if the subclass overrides __init__().
- property spec
- classmethod class_name()
- step(action)
- reset(**kwargs)
- render(mode='human', **kwargs)
- close()
- seed(seed=None)
- compute_reward(achieved_goal, desired_goal, info)
- property unwrapped
- class eve.app.env.ObservationWrapper(env)
Bases:
eve.app.env.Wrapper
- reset(**kwargs)
- step(action)
- observation(observation)
- class eve.app.env.RewardWrapper(env)
Bases:
eve.app.env.Wrapper
- reset(**kwargs)
- step(action)
- reward(reward)
- class eve.app.env.ActionWrapper(env)
Bases:
eve.app.env.Wrapper
- reset(**kwargs)
- step(action)
- action(action)
- reverse_action(action)
- class eve.app.env.FlattenObservation(env)
Bases:
eve.app.env.ObservationWrapper
Observation wrapper that flattens the observation.
- observation(observation)
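A hedged example of a custom reward wrapper relying only on the reward() hook documented above; the scaling factor is arbitrary:

    from eve.app.env import RewardWrapper

    class ScaledReward(RewardWrapper):
        """Multiply every reward by a constant factor."""

        def __init__(self, env, scale=0.1):
            super().__init__(env)
            self.scale = scale

        def reward(self, reward):
            return reward * self.scale

    # wrapped = ScaledReward(CounterEnv(), scale=0.01)  # CounterEnv: toy env sketched earlier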
- class eve.app.env.VecEnv(num_envs: int, observation_space: eve.app.space.EveSpace, action_space: eve.app.space.EveSpace)
Bases:
abc.ABC
An abstract asynchronous, vectorized environment.
- Parameters
num_envs – the number of environments
observation_space – the observation space
action_space – the action space
- abstract reset() Union[numpy.ndarray, Dict[str, numpy.ndarray], Tuple[numpy.ndarray, ...]]
Reset all the environments and return an array of observations, or a tuple of observation arrays.
If step_async is still doing work, that work will be cancelled and step_wait() should not be called until step_async() is invoked again.
- Returns
observation
- abstract step_async(actions: numpy.ndarray) None
Tell all the environments to start taking a step with the given actions. Call step_wait() to get the results of the step.
You should not call this if a step_async run is already pending.
- abstract step_wait() Tuple[Union[numpy.ndarray, Dict[str, numpy.ndarray], Tuple[numpy.ndarray, ...]], numpy.ndarray, numpy.ndarray, List[Dict]]
Wait for the step taken with step_async().
- Returns
observation, reward, done, information
- abstract close() None
Clean up the environment’s resources.
- abstract get_attr(attr_name: str, indices: Union[None, int, Iterable[int]] = None) List[Any]
Return attribute from vectorized environment.
- Parameters
attr_name – The name of the attribute whose value to return
indices – Indices of envs to get attribute from
- Returns
List of values of ‘attr_name’ in all environments
- abstract set_attr(attr_name: str, value: Any, indices: Union[None, int, Iterable[int]] = None) None
Set attribute inside vectorized environments.
- Parameters
attr_name – The name of attribute to assign new value
value – Value to assign to attr_name
indices – Indices of envs to assign value
- Returns
- abstract env_method(method_name: str, *method_args, indices: Union[None, int, Iterable[int]] = None, **method_kwargs) List[Any]
Call instance methods of vectorized environments.
- Parameters
method_name – The name of the environment method to invoke.
indices – Indices of envs whose method to call
method_args – Any positional arguments to provide in the call
method_kwargs – Any keyword arguments to provide in the call
- Returns
List of items returned by the environment’s method call
- abstract env_is_wrapped(wrapper_class: Type[eve.app.env.Wrapper], indices: Union[None, int, Iterable[int]] = None) List[bool]
Check if environments are wrapped with a given wrapper.
- Parameters
method_name – The name of the environment method to invoke.
indices – Indices of envs whose method to call
method_args – Any positional arguments to provide in the call
method_kwargs – Any keyword arguments to provide in the call
- Returns
True if the env is wrapped, False otherwise, for each env queried.
- step(actions: numpy.ndarray) Tuple[Union[numpy.ndarray, Dict[str, numpy.ndarray], Tuple[numpy.ndarray, ...]], numpy.ndarray, numpy.ndarray, List[Dict]]
Step the environments with the given action
- Parameters
actions – the action
- Returns
observation, reward, done, information
- abstract seed(seed: Optional[int] = None) List[Union[None, int]]
Sets the random seeds for all environments, based on a given seed. Each individual environment will still get its own seed, by incrementing the given seed.
- Parameters
seed – The random seed. May be None for completely random seeding.
- Returns
Returns a list containing the seeds for each individual env. Note that all list elements may be None, if the env does not return anything when being seeded.
- property unwrapped: eve.app.env.VecEnv
- getattr_depth_check(name: str, already_found: bool) Optional[str]
Check if an attribute reference is being hidden in a recursive call to __getattr__
- Parameters
name – name of attribute to check for
already_found – whether this attribute has already been found in a wrapper
- Returns
name of module whose attribute is being shadowed, if any.
- class eve.app.env.VecEnvWrapper(venv: eve.app.env.VecEnv, observation_space: Optional[eve.app.space.EveSpace] = None, action_space: Optional[eve.app.space.EveSpace] = None)
Bases:
eve.app.env.VecEnv
Vectorized environment base class
- Parameters
venv – the vectorized environment to wrap
observation_space – the observation space (can be None to load from venv)
action_space – the action space (can be None to load from venv)
- step_async(actions: numpy.ndarray) None
- abstract reset() Union[numpy.ndarray, Dict[str, numpy.ndarray], Tuple[numpy.ndarray, ...]]
- abstract step_wait() Tuple[Union[numpy.ndarray, Dict[str, numpy.ndarray], Tuple[numpy.ndarray, ...]], numpy.ndarray, numpy.ndarray, List[Dict]]
- seed(seed: Optional[int] = None) List[Union[None, int]]
- close() None
- get_attr(attr_name: str, indices: Union[None, int, Iterable[int]] = None) List[Any]
- set_attr(attr_name: str, value: Any, indices: Union[None, int, Iterable[int]] = None) None
- env_method(method_name: str, *method_args, indices: Union[None, int, Iterable[int]] = None, **method_kwargs) List[Any]
- env_is_wrapped(wrapper_class: Type[eve.app.env.Wrapper], indices: Union[None, int, Iterable[int]] = None) List[bool]
- getattr_recursive(name: str) Any
Recursively check wrappers to find attribute.
- Parameters
name – name of attribute to look for
- Returns
attribute
- getattr_depth_check(name: str, already_found: bool) str
See base class.
- Returns
name of module whose attribute is being shadowed, if any.
- eve.app.env.copy_obs_dict(obs: Dict[str, numpy.ndarray]) Dict[str, numpy.ndarray]
Deep-copy a dict of numpy arrays.
- Parameters
obs – a dict of numpy arrays.
- Returns
a dict of copied numpy arrays.
- eve.app.env.dict_to_obs(space_: eve.app.space.EveSpace, obs_dict: Dict[Any, numpy.ndarray]) Union[numpy.ndarray, Dict[str, numpy.ndarray], Tuple[numpy.ndarray, ...]]
Convert an internal representation raw_obs into the appropriate type specified by space.
- Parameters
space – an observation space.
obs_dict – a dict of numpy arrays.
- Returns
returns an observation of the same type as space. If space is Dict, function is identity; if space is Tuple, converts dict to Tuple; otherwise, space is unstructured and returns the value raw_obs[None].
- eve.app.env.obs_space_info(obs_space: eve.app.space.EveSpace) Tuple[List[str], Dict[Any, Tuple[int, ...]], Dict[Any, numpy.dtype]]
Get dict-structured information about a eve.app.EveSpace.
Dict spaces are represented directly by their dict of subspaces. Tuple spaces are converted into a dict with keys indexing into the tuple. Unstructured spaces are represented by {None: obs_space}.
- Parameters
obs_space – an observation space
- Returns
A tuple (keys, shapes, dtypes): keys: a list of dict keys. shapes: a dict mapping keys to shapes. dtypes: a dict mapping keys to dtypes.
- class eve.app.env.ObsDictWrapper(venv: eve.app.env.VecEnv)
Bases:
eve.app.env.VecEnvWrapper
Wrapper for a VecEnv which overrides the observation space for Hindsight Experience Replay to support dict observations.
- Parameters
env – The vectorized environment to wrap.
- reset()
- step_wait()
- static convert_dict(observation_dict: Dict[str, numpy.ndarray], observation_key: str = 'observation', goal_key: str = 'desired_goal') numpy.ndarray
Concatenate observation and (desired) goal of observation dict.
- Parameters
observation_dict – Dictionary with observation.
observation_key – Key of observation in dictionary.
goal_key – Key of (desired) goal in dictionary.
- Returns
Concatenated observation.
- class eve.app.env.CloudpickleWrapper(var: Any)
Bases:
object
Uses cloudpickle to serialize contents (otherwise multiprocessing tries to use pickle)
- Parameters
var – the variable you wish to wrap for pickling with cloudpickle
- class eve.app.env.DummyVecEnv(env_fns: List[Callable[[], eve.app.env.EveEnv]])
Bases:
eve.app.env.VecEnv
Creates a simple vectorized wrapper for multiple environments, calling each environment in sequence on the current Python process. This is useful for computationally simple environments such as cartpole-v1, as the overhead of multiprocessing or multithreading outweighs the environment computation time. This can also be used for RL methods that require a vectorized environment but with which you want a single environment to train.
- Parameters
env_fns – a list of functions that return environments to vectorize
- step_async(actions: numpy.ndarray) None
- step_wait() Tuple[Union[numpy.ndarray, Dict[str, numpy.ndarray], Tuple[numpy.ndarray, ...]], numpy.ndarray, numpy.ndarray, List[Dict]]
- seed(seed: Optional[int] = None) List[Union[None, int]]
- reset() Union[numpy.ndarray, Dict[str, numpy.ndarray], Tuple[numpy.ndarray, ...]]
- close() None
- get_attr(attr_name: str, indices: Union[None, int, Iterable[int]] = None) List[Any]
Return attribute from vectorized environment (see base class).
- set_attr(attr_name: str, value: Any, indices: Union[None, int, Iterable[int]] = None) None
Set attribute inside vectorized environments (see base class).
- env_method(method_name: str, *method_args, indices: Union[None, int, Iterable[int]] = None, **method_kwargs) List[Any]
Call instance methods of vectorized environments.
- env_is_wrapped(wrapper_class: Type[eve.app.env.Wrapper], indices: Union[None, int, Iterable[int]] = None) List[bool]
Check if worker environments are wrapped with a given wrapper
- class eve.app.env.SubprocVecEnv(env_fns: List[Callable[[], eve.app.env.EveEnv]], start_method: Optional[str] = None)
Bases:
eve.app.env.VecEnv
Creates a multiprocess vectorized wrapper for multiple environments, distributing each environment to its own process, allowing significant speed up when the environment is computationally complex.
For performance reasons, if your environment is not IO bound, the number of environments should not exceed the number of logical cores on your CPU.
Warning
Only ‘forkserver’ and ‘spawn’ start methods are thread-safe, which is important when TensorFlow sessions or other non thread-safe libraries are used in the parent (see issue #217). However, compared to ‘fork’ they incur a small start-up cost and have restrictions on global variables. With those methods, users must wrap the code in an
if __name__ == "__main__":
block. For more information, see the multiprocessing documentation.
- Parameters
env_fns – Environments to run in subprocesses
start_method – method used to start the subprocesses. Must be one of the methods returned by multiprocessing.get_all_start_methods(). Defaults to ‘forkserver’ on available platforms, and ‘spawn’ otherwise.
- step_async(actions: numpy.ndarray) None
- step_wait() Tuple[Union[numpy.ndarray, Dict[str, numpy.ndarray], Tuple[numpy.ndarray, ...]], numpy.ndarray, numpy.ndarray, List[Dict]]
- seed(seed: Optional[int] = None) List[Union[None, int]]
- reset() Union[numpy.ndarray, Dict[str, numpy.ndarray], Tuple[numpy.ndarray, ...]]
- close() None
- get_attr(attr_name: str, indices: Union[None, int, Iterable[int]] = None) List[Any]
Return attribute from vectorized environment (see base class).
- set_attr(attr_name: str, value: Any, indices: Union[None, int, Iterable[int]] = None) None
Set attribute inside vectorized environments (see base class).
- env_method(method_name: str, *method_args, indices: Union[None, int, Iterable[int]] = None, **method_kwargs) List[Any]
Call instance methods of vectorized environments.
- env_is_wrapped(wrapper_class: Type[eve.app.env.Wrapper], indices: Union[None, int, Iterable[int]] = None) List[bool]
Check if worker environments are wrapped with a given wrapper
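A hedged sketch of the __main__ guard mentioned in the warning; make_env is a placeholder factory (here reusing the toy CounterEnv sketched earlier), and the batched action layout is an assumption:

    import numpy as np
    from eve.app.env import SubprocVecEnv

    def make_env():
        # Any callable returning a fresh environment instance per subprocess.
        return CounterEnv()  # toy env from the EveEnv sketch above (placeholder)

    if __name__ == "__main__":
        # 'forkserver'/'spawn' start methods require a guarded entry point.
        vec_env = SubprocVecEnv([make_env for _ in range(4)], start_method="spawn")
        obs = vec_env.reset()
        actions = np.array([vec_env.action_space.sample()
                            for _ in range(vec_env.num_envs)])
        obs, rewards, dones, infos = vec_env.step(actions)
        vec_env.close()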
- class eve.app.env.RunningMeanStd(epsilon: float = 0.0001, shape: Tuple[int, ...] = ())
Bases:
object
Calculates the running mean and std of a data stream https://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#Parallel_algorithm
- Parameters
epsilon – helps with arithmetic issues
shape – the shape of the data stream’s output
- update(arr: numpy.ndarray) None
- update_from_moments(batch_mean: numpy.ndarray, batch_var: numpy.ndarray, batch_count: int) None
- eve.app.env.check_for_correct_spaces(env: eve.app.env.EveEnv, observation_space: eve.app.space.EveSpace, action_space: eve.app.space.EveSpace) None
Checks that the environment has same spaces as provided ones. Used by BaseAlgorithm to check if spaces match after loading the model with given env. Checked parameters: - observation_space - action_space
- Parameters
env – Environment to check for valid spaces
observation_space – Observation space to check against
action_space – Action space to check against
- class eve.app.env.VecNormalize(venv: eve.app.env.VecEnv, training: bool = True, norm_obs: bool = True, norm_reward: bool = True, clip_obs: float = 10.0, clip_reward: float = 10.0, gamma: float = 0.99, epsilon: float = 1e-08)
Bases:
eve.app.env.VecEnvWrapper
A moving average, normalizing wrapper for vectorized environments, with support for saving/loading the moving averages.
- Parameters
venv – the vectorized environment to wrap
training – Whether to update or not the moving average
norm_obs – Whether to normalize observation or not (default: True)
norm_reward – Whether to normalize rewards or not (default: True)
clip_obs – Max absolute value for observation
clip_reward – Max absolute value for discounted reward
gamma – discount factor
epsilon – To avoid division by zero
- set_venv(venv: eve.app.env.VecEnv) None
Sets the vector environment to wrap to venv.
Also sets attributes derived from this such as num_env.
- Parameters
venv –
- step_wait() Tuple[Union[numpy.ndarray, Dict[str, numpy.ndarray], Tuple[numpy.ndarray, ...]], numpy.ndarray, numpy.ndarray, List[Dict]]
Apply a sequence of actions to the sequence of environments: actions -> (observations, rewards, news)
where ‘news’ is a boolean vector indicating whether each element is new.
- normalize_obs(obs: Union[numpy.ndarray, Dict[str, numpy.ndarray]]) Union[numpy.ndarray, Dict[str, numpy.ndarray]]
Normalize observations using this VecNormalize’s observations statistics. Calling this method does not update statistics.
- normalize_reward(reward: numpy.ndarray) numpy.ndarray
Normalize rewards using this VecNormalize’s rewards statistics. Calling this method does not update statistics.
- unnormalize_obs(obs: Union[numpy.ndarray, Dict[str, numpy.ndarray]]) Union[numpy.ndarray, Dict[str, numpy.ndarray]]
- unnormalize_reward(reward: numpy.ndarray) numpy.ndarray
- get_original_obs() Union[numpy.ndarray, Dict[str, numpy.ndarray]]
Returns an unnormalized version of the observations from the most recent step or reset.
- get_original_reward() numpy.ndarray
Returns an unnormalized version of the rewards from the most recent step.
- reset() Union[numpy.ndarray, Dict[str, numpy.ndarray]]
Reset all environments.
- Returns
first observation of the episode
- static load(load_path: str, venv: eve.app.env.VecEnv) eve.app.env.VecNormalize
Loads a saved VecNormalize object.
- Parameters
load_path – the path to load from.
venv – the VecEnv to wrap.
- Returns
- save(save_path: str) None
Save current VecNormalize object with all running statistics and settings (e.g. clip_obs)
- Parameters
save_path – The path to save to
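A hedged sketch of saving and restoring normalization statistics around training and evaluation; make_env is the placeholder factory from the sketch above, and the training/norm_reward attribute assignments mirror the constructor arguments (an assumption):

    from eve.app.env import DummyVecEnv, VecNormalize

    train_env = VecNormalize(DummyVecEnv([make_env]),
                             norm_obs=True, norm_reward=True)
    # ... training updates the running statistics here ...
    train_env.save("vecnormalize.pkl")

    # Reload the statistics onto a fresh env for evaluation and freeze them.
    eval_env = VecNormalize.load("vecnormalize.pkl", DummyVecEnv([make_env]))
    eval_env.training = False     # assumed attribute mirroring the constructor argument
    eval_env.norm_reward = False  # rewards are typically left unnormalized at eval time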
- class eve.app.env.Monitor(env: eve.app.env.EveEnv, filename: Optional[str] = None, allow_early_resets: bool = True, reset_keywords: Tuple[str, ...] = (), info_keywords: Tuple[str, ...] = ())
Bases:
eve.app.env.Wrapper
A monitor wrapper for Gym environments; it is used to record the episode reward, length, time and other data.
- Parameters
env – The environment
filename – the location to save a log file, can be None for no log
allow_early_resets – allows the reset of the environment before it is done
reset_keywords – extra keywords for the reset call, if extra parameters are needed at reset
info_keywords – extra information to log, from the information return of env.step()
- EXT = 'monitor.csv'
- reset(**kwargs) Union[Tuple, Dict[str, Any], numpy.ndarray, int]
Calls the environment reset. Can only be called if the environment is over, or if allow_early_resets is True
- Parameters
kwargs – Extra keywords saved for the next episode, only if defined by reset_keywords
- Returns
the first observation of the environment
- step(action: Union[numpy.ndarray, int]) Tuple[Union[Tuple, Dict[str, Any], numpy.ndarray, int], float, bool, Dict]
Step the environment with the given action
- Parameters
action – the action
- Returns
observation, reward, done, information
- close() None
Closes the environment
- get_total_steps() int
Returns the total number of timesteps
- Returns
- get_episode_rewards() List[float]
Returns the rewards of all the episodes
- Returns
- get_episode_lengths() List[int]
Returns the number of timesteps of all the episodes
- Returns
- get_episode_times() List[float]
Returns the runtime in seconds of all the episodes
- Returns
- exception eve.app.env.LoadMonitorResultsError
Bases:
Exception
Raised when loading the monitor log fails.
- eve.app.env.get_monitor_files(path: str) List[str]
get all the monitor files in the given path
- Parameters
path – the logging folder
- Returns
the log files
- eve.app.env.load_results(path: str) pandas.core.frame.DataFrame
Load all Monitor logs from a given directory path matching
*monitor.csv
- Parameters
path – the directory path containing the log file(s)
- Returns
the logged data
- eve.app.env.rolling_window(array: numpy.ndarray, window: int) numpy.ndarray
Apply a rolling window to a np.ndarray
- Parameters
array – the input Array
window – length of the rolling window
- Returns
rolling window on the input array
- eve.app.env.window_func(var_1: numpy.ndarray, var_2: numpy.ndarray, window: int, func: Callable) Tuple[numpy.ndarray, numpy.ndarray]
Apply a function to the rolling window of 2 arrays
- Parameters
var_1 – variable 1
var_2 – variable 2
window – length of the rolling window
func – function to apply on the rolling window on variable 2 (such as np.mean)
- Returns
the rolling output with applied function
- eve.app.env.ts2xy(data_frame: pandas.core.frame.DataFrame, x_axis: str) Tuple[numpy.ndarray, numpy.ndarray]
Decompose a data frame variable to xs and ys
- Parameters
data_frame – the input data
x_axis – the axis for the x and y output (can be X_TIMESTEPS=’timesteps’, X_EPISODES=’episodes’ or X_WALLTIME=’walltime_hrs’)
- Returns
the x and y output
- eve.app.env.plot_curves(xy_list: List[Tuple[numpy.ndarray, numpy.ndarray]], x_axis: str, title: str, figsize: Tuple[int, int] = (8, 2)) None
plot the curves
- Parameters
xy_list – the x and y coordinates to plot
x_axis – the axis for the x and y output (can be X_TIMESTEPS=’timesteps’, X_EPISODES=’episodes’ or X_WALLTIME=’walltime_hrs’)
title – the title of the plot
figsize – Size of the figure (width, height)
- eve.app.env.plot_results(dirs: List[str], num_timesteps: Optional[int], x_axis: str, task_name: str, figsize: Tuple[int, int] = (8, 2)) None
Plot the results using csv files from
Monitor
wrapper.
- Parameters
dirs – the save location of the results to plot
num_timesteps – only plot the points below this value
x_axis – the axis for the x and y output (can be X_TIMESTEPS=’timesteps’, X_EPISODES=’episodes’ or X_WALLTIME=’walltime_hrs’)
task_name – the title of the task to plot
figsize – Size of the figure (width, height)
- eve.app.env.unwrap_vec_wrapper(env: Union[eve.app.env.EveEnv, eve.app.env.VecEnv], vec_wrapper_class: Type[eve.app.env.VecEnvWrapper]) Optional[eve.app.env.VecEnvWrapper]
Retrieve a
VecEnvWrapper
object by recursively searching.
- Parameters
env –
vec_wrapper_class –
- Returns
- eve.app.env.unwrap_vec_normalize(env: Union[eve.app.env.EveEnv, eve.app.env.VecEnv]) Optional[eve.app.env.VecNormalize]
- Parameters
env –
- Returns
- eve.app.env.is_vecenv_wrapped(env: Union[eve.app.env.EveEnv, eve.app.env.VecEnv], vec_wrapper_class: Type[eve.app.env.VecEnvWrapper]) bool
Check if an environment is already wrapped by a given
VecEnvWrapper
.
- Parameters
env –
vec_wrapper_class –
- Returns
- eve.app.env.unwrap_wrapper(env: eve.app.env.EveEnv, wrapper_class: Type[eve.app.env.Wrapper]) Optional[eve.app.env.Wrapper]
Retrieve a Wrapper object by recursively searching.
- Parameters
env – Environment to unwrap
wrapper_class – Wrapper to look for
- Returns
Environment unwrapped till
wrapper_class
if it has been wrapped with it
- eve.app.env.is_wrapped(env: Type[eve.app.env.EveEnv], wrapper_class: Type[eve.app.env.Wrapper]) bool
Check if a given environment has been wrapped with a given wrapper.
- Parameters
env – Environment to check
wrapper_class – Wrapper class to look for
- Returns
True if environment has been wrapped with
wrapper_class
.
- eve.app.env.get_wrapper_class(hyperparams: Dict[str, Any]) Optional[Callable[[eve.app.env.EveEnv], eve.app.env.EveEnv]]
Get one or more environment wrapper class specified as a hyper parameter “env_wrapper”. e.g. env_wrapper: _minigrid.wrappers.FlatObsWrapper
for multiple, specify a list:
- env_wrapper:
utils.wrappers.PlotActionWrapper
utils.wrappers.TimeFeatureWrapper
- Parameters
hyperparams –
- Returns
maybe a callable to wrap the environment with one or multiple Wrapper
- eve.app.env.make_vec_env(env_id: Union[str, Type[eve.app.env.EveEnv]], n_envs: int = 1, seed: Optional[int] = None, start_index: int = 0, monitor_dir: Optional[str] = None, wrapper_class: Optional[Callable[[eve.app.env.EveEnv], eve.app.env.EveEnv]] = None, env_kwargs: Optional[Dict[str, Any]] = None, vec_env_cls: Optional[Type[Union[eve.app.env.DummyVecEnv, eve.app.env.SubprocVecEnv]]] = None, vec_env_kwargs: Optional[Dict[str, Any]] = None, monitor_kwargs: Optional[Dict[str, Any]] = None) eve.app.env.VecEnv
Create a wrapped, monitored
VecEnv
. By default it uses a DummyVecEnv which is usually faster than a SubprocVecEnv.
- Parameters
env_id – the environment ID or the environment class
n_envs – the number of environments you wish to have in parallel
seed – the initial seed for the random number generator
start_index – start rank index
monitor_dir – Path to a folder where the monitor files will be saved. If None, no file will be written, however, the env will still be wrapped in a Monitor wrapper to provide additional information about training.
wrapper_class – Additional wrapper to use on the environment. This can also be a function with single argument that wraps the environment in many things.
env_kwargs – Optional keyword argument to pass to the env constructor
vec_env_cls – A custom
VecEnv
class constructor. Default: None.
vec_env_kwargs – Keyword arguments to pass to the VecEnv class constructor.
monitor_kwargs – Keyword arguments to pass to the
Monitor
class constructor.
- Returns
The wrapped environment
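A hedged usage sketch; the environment ID is a placeholder and the defaults follow the parameter list above:

    from eve.app.env import SubprocVecEnv, make_vec_env

    # Four monitored environments running sequentially in one process
    # (a DummyVecEnv is used by default).
    vec_env = make_vec_env("CartPole-v1", n_envs=4, seed=0, monitor_dir="./logs/")

    # The same environments, each in its own subprocess.
    vec_env = make_vec_env("CartPole-v1", n_envs=4, vec_env_cls=SubprocVecEnv)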
- eve.app.env.create_test_env(env_id: str, n_envs: int = 1, stats_path: Optional[str] = None, seed: int = 0, log_dir: Optional[str] = None, should_render: bool = True, hyperparams: Optional[Dict[str, Any]] = None, env_kwargs: Optional[Dict[str, Any]] = None) eve.app.env.VecEnv
Create environment for testing a trained agent
- Parameters
env_id –
n_envs – number of processes
stats_path – path to folder containing saved running averages
seed – Seed for random number generator
log_dir – Where to log rewards
should_render – For Pybullet env, display the GUI
hyperparams – Additional hyperparams (ex: n_stack)
env_kwargs – Optional keyword argument to pass to the env constructor
eve.app.exp_manager module
eve.app.hyperparams_opt module
eve.app.logger module
- exception eve.app.logger.FormatUnsupportedError(unsupported_formats: Sequence[str], value_description: str)
Bases:
NotImplementedError
- class eve.app.logger.KVWriter
Bases:
object
Key Value writer
- write(key_values: Dict[str, Any], key_excluded: Dict[str, Union[str, Tuple[str, ...]]], step: int = 0) None
Write a dictionary to file
- Parameters
key_values –
key_excluded –
step –
- close() None
Close owned resources
- class eve.app.logger.SeqWriter
Bases:
object
sequence writer
- write_sequence(sequence: List) None
Write a sequence (array) to file
- Parameters
sequence –
- class eve.app.logger.HumanOutputFormat(filename_or_file: Union[str, TextIO])
Bases:
eve.app.logger.KVWriter
,eve.app.logger.SeqWriter
log to a file, in a human readable format
- Parameters
filename_or_file – the file to write the log to
- write(key_values: Dict, key_excluded: Dict, step: int = 0) None
- write_sequence(sequence: List) None
- close() None
closes the file
- eve.app.logger.filter_excluded_keys(key_values: Dict[str, Any], key_excluded: Dict[str, Union[str, Tuple[str, ...]]], _format: str) Dict[str, Any]
Filters the keys specified by key_excluded for the specified format.
- Parameters
key_values – log dictionary to be filtered
key_excluded – keys to be excluded per format
_format – format for which this filter is run
- Returns
dict without the excluded keys
- class eve.app.logger.JSONOutputFormat(filename: str)
Bases:
eve.app.logger.KVWriter
log to a file, in the JSON format
- Parameters
filename – the file to write the log to
- write(key_values: Dict[str, Any], key_excluded: Dict[str, Union[str, Tuple[str, ...]]], step: int = 0) None
- close() None
closes the file
- class eve.app.logger.CSVOutputFormat(filename: str)
Bases:
eve.app.logger.KVWriter
log to a file, in a CSV format
- Parameters
filename – the file to write the log to
- write(key_values: Dict[str, Any], key_excluded: Dict[str, Union[str, Tuple[str, ...]]], step: int = 0) None
- close() None
closes the file
- class eve.app.logger.TensorBoardOutputFormat(folder: str)
Bases:
eve.app.logger.KVWriter
Dumps key/value pairs into TensorBoard’s numeric format.
- Parameters
folder – the folder to write the log to
- write(key_values: Dict[str, Any], key_excluded: Dict[str, Union[str, Tuple[str, ...]]], step: int = 0) None
- close() None
closes the file
- eve.app.logger.make_output_format(_format: str, log_dir: str, log_suffix: str = '') eve.app.logger.KVWriter
return a logger for the requested format
- Parameters
_format – the requested format to log to (‘stdout’, ‘log’, ‘json’ or ‘csv’ or ‘tensorboard’)
log_dir – the logging directory
log_suffix – the suffix for the log file
- Returns
the logger
- class eve.app.logger.Logger(folder: Optional[str], output_formats: List[eve.app.logger.KVWriter])
Bases:
object
the logger class
- Parameters
folder – the logging location
output_formats – the list of output format
- DEFAULT = <eve.app.logger.Logger object>
- CURRENT = <eve.app.logger.Logger object>
- record(key: str, value: Any, exclude: Optional[Union[str, Tuple[str, ...]]] = None) None
Log a value of some diagnostic. Call this once for each diagnostic quantity, each iteration. If called many times, the last value will be used.
- Parameters
key – save to log this key
value – save to log this value
exclude – outputs to be excluded
- record_mean(key: str, value: Any, exclude: Optional[Union[str, Tuple[str, ...]]] = None) None
The same as record(), but if called many times, values will be averaged.
- Parameters
key – save to log this key
value – save to log this value
exclude – outputs to be excluded
- dump(step: int = 0) None
Write all of the diagnostics from the current iteration
- log(*args, level: int = 20) None
Write the sequence of args, with no separators, to the console and output files (if you’ve configured an output file).
level: int. (see logger.py docs) If the global logger level is higher than the level argument here, don’t print to stdout.
- Parameters
args – log the arguments
level – the logging level (can be DEBUG=10, INFO=20, WARN=30, ERROR=40, DISABLED=50)
- set_level(level: int) None
Set logging threshold on current logger.
- Parameters
level – the logging level (can be DEBUG=10, INFO=20, WARN=30, ERROR=40, DISABLED=50)
- get_dir() str
Get directory that log files are being written to. Will be None if there is no output directory (i.e., if you didn’t call start).
- Returns
the logging directory
- close() None
closes the file
- eve.app.logger.configure(folder: Optional[str] = None, format_strings: Optional[List[str]] = None) None
configure the current logger
- Parameters
folder – the save location (if None, $SB3_LOGDIR, if still None, tempdir/baselines-[date & time])
format_strings – the output logging format (if None, $SB3_LOG_FORMAT, if still None, [‘stdout’, ‘log’, ‘csv’])
- eve.app.logger.reset() None
reset the current logger
- class eve.app.logger.ScopedConfigure(folder: Optional[str] = None, format_strings: Optional[List[str]] = None)
Bases:
object
Class for using context manager while logging
usage:
>>> with ScopedConfigure(folder=None, format_strings=None):
>>>     {code}
- Parameters
folder – the logging folder
format_strings – the list of output logging format
- eve.app.logger.record(key: str, value: Any, exclude: Optional[Union[str, Tuple[str, ...]]] = None) None
Log a value of some diagnostic. Call this once for each diagnostic quantity, each iteration. If called many times, the last value will be used.
- Parameters
key – save to log this key
value – save to log this value
exclude – outputs to be excluded
- eve.app.logger.record_mean(key: str, value: Union[int, float], exclude: Optional[Union[str, Tuple[str, ...]]] = None) None
The same as record(), but if called many times, values will be averaged.
- Parameters
key – save to log this key
value – save to log this value
exclude – outputs to be excluded
- eve.app.logger.record_dict(key_values: Dict[str, Any]) None
Log a dictionary of key-value pairs.
- Parameters
key_values – the list of keys and values to save to log
- eve.app.logger.dump(step: int = 0) None
Write all of the diagnostics from the current iteration
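A hedged sketch of the module-level logging helpers documented above; the folder and keys are placeholders:

    from eve.app import logger

    logger.configure(folder="./logs/", format_strings=["stdout", "csv"])

    for step in range(3):
        logger.record("train/step", step)                    # last value per key is kept
        logger.record_mean("train/loss", 1.0 / (step + 1))   # values are averaged
        logger.dump(step=step)                               # write the current iteration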
- eve.app.logger.get_log_dict() Dict
get the key values logs
- Returns
the logged values
- eve.app.logger.log(*args, level: int = 20) None
Write the sequence of args, with no separators, to the console and output files (if you’ve configured an output file).
level: int. (see logger.py docs) If the global logger level is higher than the level argument here, don’t print to stdout.
- Parameters
args – log the arguments
level – the logging level (can be DEBUG=10, INFO=20, WARN=30, ERROR=40, DISABLED=50)
- eve.app.logger.debug(*args) None
Write the sequence of args, with no separators, to the console and output files (if you’ve configured an output file). Using the DEBUG level.
- Parameters
args – log the arguments
- eve.app.logger.info(*args) None
Write the sequence of args, with no separators, to the console and output files (if you’ve configured an output file). Using the INFO level.
- Parameters
args – log the arguments
- eve.app.logger.warn(*args) None
Write the sequence of args, with no separators, to the console and output files (if you’ve configured an output file). Using the WARN level.
- Parameters
args – log the arguments
- eve.app.logger.error(*args) None
Write the sequence of args, with no separators, to the console and output files (if you’ve configured an output file). Using the ERROR level.
- Parameters
args – log the arguments
- eve.app.logger.set_level(level: int) None
Set logging threshold on current logger.
- Parameters
level – the logging level (can be DEBUG=10, INFO=20, WARN=30, ERROR=40, DISABLED=50)
- eve.app.logger.get_level() int
Get logging threshold on current logger.
- Returns
the logging level (can be DEBUG=10, INFO=20, WARN=30, ERROR=40, DISABLED=50)
- eve.app.logger.get_dir() str
Get directory that log files are being written to. Will be None if there is no output directory (i.e., if you didn’t call start).
- Returns
the logging directory
- eve.app.logger.record_tabular(key: str, value: Any, exclude: Optional[Union[str, Tuple[str, ...]]] = None) None
Log a value of some diagnostic. Call this once for each diagnostic quantity, each iteration. If called many times, the last value will be used.
- Parameters
key – save to log this key
value – save to log this value
exclude – outputs to be excluded
- eve.app.logger.dump_tabular(step: int = 0) None
Write all of the diagnostics from the current iteration
- eve.app.logger.read_json(filename: str) pandas.core.frame.DataFrame
read a json file using pandas
- Parameters
filename – the file path to read
- Returns
the data in the json
- eve.app.logger.read_csv(filename: str) pandas.core.frame.DataFrame
read a csv file using pandas
- Parameters
filename – the file path to read
- Returns
the data in the csv
eve.app.model module
eve.app.policies module
eve.app.space module
- eve.app.space.np_random(seed=None)
- eve.app.space.hash_seed(seed=None, max_bytes=8)
Any given evaluation is likely to have many PRNGs active at once. (Most commonly, because the environment is running in multiple processes.) There’s literature indicating that having linear correlations between seeds of multiple PRNGs can correlate the outputs:
http://blogs.unity3d.com/2015/01/07/a-primer-on-repeatable-random-numbers/ http://stackoverflow.com/questions/1554958/how-different-do-random-seeds-need-to-be http://dl.acm.org/citation.cfm?id=1276928
Thus, for sanity we hash the seeds before using them. (This scheme is likely not crypto-strength, but it should be good enough to get rid of simple correlations.)
- Parameters
seed (Optional[int]) – None seeds from an operating system specific randomness source.
max_bytes – Maximum number of bytes to use in the hashed seed.
- eve.app.space.create_seed(a=None, max_bytes=8)
Create a strong random seed. Otherwise, Python 2 would seed using the system time, which might be non-robust especially in the presence of concurrency.
- Parameters
a (Optional[int, str]) – None seeds from an operating system specific randomness source.
max_bytes – Maximum number of bytes to use in the seed.
- class eve.app.space.EveSpace(shape=None, dtype=None)
Bases:
object
Defines the observation and action spaces, so you can write generic code that applies to any Env. For example, you can choose a random action.
WARNING - Custom observation & action spaces can inherit from the Space class. However, most use-cases should be covered by the existing space classes (e.g. Box, Discrete, etc…), and container classes (Tuple & Dict). Note that parametrized probability distributions (through the sample() method), and batching functions (in eve.vector.VectorEnv), are only well-defined for instances of spaces provided in eve by default. Moreover, some implementations of Reinforcement Learning algorithms might not handle custom spaces properly. Use custom spaces with care.
- property np_random
Lazily seed the rng since this is expensive and only needed if sampling from this space.
- sample()
Randomly sample an element of this space. Can be uniform or non-uniform sampling based on boundedness of space.
- seed(seed=None)
Seed the PRNG of this space.
- contains(x)
Return boolean specifying if x is a valid member of this space
- to_jsonable(sample_n)
Convert a batch of samples from this space to a JSONable data type.
- from_jsonable(sample_n)
Convert a JSONable data type to a batch of samples from this space.
- class eve.app.space.EveBox(low, high, shape=None, max_neurons: typing.Optional[int] = None, max_states: typing.Optional[int] = None, dtype=<class 'numpy.float32'>)
Bases:
eve.app.space.EveSpace
A (possibly unbounded) box in R^n. Specifically, a Box represents the Cartesian product of n closed intervals. Each interval has the form of one of [a, b], (-oo, b], [a, oo), or (-oo, oo).
There are two common use cases:
- Identical bound for each dimension::
>>> Box(low=-1.0, high=2.0, shape=(3, 4), dtype=np.float32)
Box(3, 4)
- Independent bound for each dimension::
>>> Box(low=np.array([-1.0, -2.0]), high=np.array([2.0, 4.0]), dtype=np.float32)
Box(2,)
- is_bounded(manner='both')
- sample()
Generates a single random sample inside of the Box.
In creating a sample of the box, each coordinate is sampled according to the form of the interval:
[a, b] : uniform distribution
[a, oo) : shifted exponential distribution
(-oo, b] : shifted negative exponential distribution
(-oo, oo) : normal distribution
- contains(x)
- to_jsonable(sample_n)
- from_jsonable(sample_n)
- class eve.app.space.EveDict(spaces=None, **spaces_kwargs)
Bases:
eve.app.space.EveSpace
A dictionary of simpler spaces.
Example usage: self.observation_space = spaces.Dict({“position”: spaces.Discrete(2), “velocity”: spaces.Discrete(3)})
Example usage [nested]:
>>> self.nested_observation_space = spaces.Dict({
>>>     'sensors': spaces.Dict({
>>>         'position': spaces.Box(low=-100, high=100, shape=(3,)),
>>>         'velocity': spaces.Box(low=-1, high=1, shape=(3,)),
>>>         'front_cam': spaces.Tuple((
>>>             spaces.Box(low=0, high=1, shape=(10, 10, 3)),
>>>             spaces.Box(low=0, high=1, shape=(10, 10, 3))
>>>         )),
>>>         'rear_cam': spaces.Box(low=0, high=1, shape=(10, 10, 3)),
>>>     }),
>>>     'ext_controller': spaces.MultiDiscrete((5, 2, 2)),
>>>     'inner_state': spaces.Dict({
>>>         'charge': spaces.Discrete(100),
>>>         'system_checks': spaces.MultiBinary(10),
>>>         'job_status': spaces.Dict({
>>>             'task': spaces.Discrete(5),
>>>             'progress': spaces.Box(low=0, high=100, shape=()),
>>>         })
>>>     })
>>> })
- seed(seed=None)
- sample()
- contains(x)
- to_jsonable(sample_n)
- from_jsonable(sample_n)
- class eve.app.space.EveDiscrete(n, max_neurons: Optional[int] = None, max_states: Optional[int] = None)
Bases:
eve.app.space.EveSpace
A discrete space in {0, 1, ..., n-1}.
Example:
>>> EveDiscrete(2)
- sample()
- contains(x)
- class eve.app.space.EveMultiBinary(n, max_neurons: Optional[int] = None, max_states: Optional[int] = None)
Bases:
eve.app.space.EveSpace
An n-shape binary space.
The argument to MultiBinary defines n, which could be a number or a list of numbers.
Example Usage:
>>> self.observation_space = spaces.MultiBinary(5)
>>> self.observation_space.sample()
array([0, 1, 0, 1, 0], dtype=int8)
>>> self.observation_space = spaces.MultiBinary([3, 2])
>>> self.observation_space.sample()
array([[0, 0],
       [0, 1],
       [1, 1]], dtype=int8)
- sample()
- contains(x)
- to_jsonable(sample_n)
- from_jsonable(sample_n)
- class eve.app.space.EveMultiDiscrete(nvec, max_neurons: Optional[int] = None, max_states: Optional[int] = None)
Bases:
eve.app.space.EveSpace
The multi-discrete action space consists of a series of discrete action spaces with a different number of actions in each.
It is useful for representing game controllers or keyboards, where each key can be represented as a discrete action space.
It is parametrized by passing an array of positive integers specifying the number of actions for each discrete action space.
Note: Some environment wrappers assume a value of 0 always represents the NOOP action.
e.g. Nintendo Game Controller - Can be conceptualized as 3 discrete action spaces:
Arrow Keys: Discrete 5 - NOOP[0], UP[1], RIGHT[2], DOWN[3], LEFT[4] - params: min: 0, max: 4
Button A: Discrete 2 - NOOP[0], Pressed[1] - params: min: 0, max: 1
Button B: Discrete 2 - NOOP[0], Pressed[1] - params: min: 0, max: 1
Can be initialized as
MultiDiscrete([ 5, 2, 2 ])
nvec: vector of counts of each categorical variable
- sample()
- contains(x)
- to_jsonable(sample_n)
- from_jsonable(sample_n)
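As a rough sketch of what sampling from MultiDiscrete([5, 2, 2]) means (one independent categorical draw per sub-space; illustrative only, not the library code):
import numpy as np

nvec = np.array([5, 2, 2])   # arrow keys, button A, button B
rng = np.random.default_rng(0)
action = (rng.random(nvec.shape) * nvec).astype(np.int64)
# action[i] is an integer in [0, nvec[i]); e.g. index 0 selects among the 5 arrow-key actions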
- class eve.app.space.EveTuple(spaces)
Bases:
eve.app.space.EveSpace
A tuple (i.e., product) of simpler spaces
Example usage: self.observation_space = spaces.Tuple((spaces.Discrete(2), spaces.Discrete(3)))
- seed(seed=None)
- sample()
- contains(x)
- to_jsonable(sample_n)
- from_jsonable(sample_n)
- eve.app.space.flatdim(space)
Return the number of dimensions a flattened equivalent of this space would have.
Accepts a space and returns an integer. Raises NotImplementedError if the space is not defined in gym.spaces.
- eve.app.space.flatten(space, x)
Flatten a data point from a space.
This is useful when e.g. points from spaces must be passed to a neural network, which only understands flat arrays of floats.
Accepts a space and a point from that space. Always returns a 1D array. Raises NotImplementedError if the space is not defined in gym.spaces.
- eve.app.space.unflatten(space, x)
Unflatten a data point from a space.
This reverses the transformation applied by flatten(). You must ensure that the space argument is the same as for the flatten() call.
Accepts a space and a flattened point. Returns a point with a structure that matches the space. Raises NotImplementedError if the space is not defined in gym.spaces.
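A short round-trip sketch using flatten() and unflatten() (assuming the eve spaces are importable from eve.app.space and mirror the gym behaviour documented here):
from eve.app.space import EveBox, EveDict, EveDiscrete, flatdim, flatten, unflatten

space = EveDict({"position": EveDiscrete(2),
                 "velocity": EveBox(low=0, high=1, shape=(2, 2))})
point = space.sample()
flat = flatten(space, point)        # always a 1D array of length flatdim(space)
assert flat.shape == (flatdim(space),)
restored = unflatten(space, flat)   # same structure as the original point
assert space.contains(restored)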
- eve.app.space.flatten_space(space)
Flatten a space into a single Box.
This is equivalent to flatten(), but operates on the space itself. The result is always a Box with flat boundaries. The box has exactly flatdim(space) dimensions. Flattening a sample of the original space has the same effect as taking a sample of the flattened space. Raises NotImplementedError if the space is not defined in gym.spaces.
Example:
>>> box = Box(0.0, 1.0, shape=(3, 4, 5))
>>> box
Box(3, 4, 5)
>>> flatten_space(box)
Box(60,)
>>> flatten(box, box.sample()) in flatten_space(box)
True
Example that flattens a discrete space:
>>> discrete = Discrete(5)
>>> flatten_space(discrete)
Box(5,)
>>> flatten(discrete, discrete.sample()) in flatten_space(discrete)
True
Example that recursively flattens a dict:
>>> space = Dict({"position": Discrete(2),
...               "velocity": Box(0, 1, shape=(2, 2))})
>>> flatten_space(space)
Box(6,)
>>> flatten(space, space.sample()) in flatten_space(space)
True
eve.app.trainer module
eve.app.upgrader module
eve.app.utils module
- eve.app.utils.set_random_seed(seed: int, using_cuda: bool = False) None
Seed the different random generators.
- Parameters
seed –
using_cuda –
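A hedged sketch of what such a seeding helper typically does (this follows the common stable-baselines3 pattern; the actual eve implementation may differ):
import random

import numpy as np
import torch


def set_random_seed(seed: int, using_cuda: bool = False) -> None:
    # Seed Python, NumPy and PyTorch from the same value.
    random.seed(seed)
    np.random.seed(seed)
    torch.manual_seed(seed)
    if using_cuda:
        # Make cuDNN deterministic (may reduce performance).
        torch.backends.cudnn.deterministic = True
        torch.backends.cudnn.benchmark = False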
- eve.app.utils.explained_variance(y_pred: numpy.ndarray, y_true: numpy.ndarray) numpy.ndarray
Computes the fraction of variance that y_pred explains about y_true. Returns 1 - Var[y_true - y_pred] / Var[y_true].
- Interpretation
ev = 0 => might as well have predicted zero
ev = 1 => perfect prediction
ev < 0 => worse than just predicting zero
- Parameters
y_pred – the prediction
y_true – the expected value
- Returns
explained variance of ypred and y
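The documented formula translates directly into NumPy; a minimal sketch:
import numpy as np

def explained_variance(y_pred: np.ndarray, y_true: np.ndarray) -> float:
    # 1 - Var[y_true - y_pred] / Var[y_true]; NaN when Var[y_true] == 0
    var_y = np.var(y_true)
    return np.nan if var_y == 0 else float(1 - np.var(y_true - y_pred) / var_y)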
- eve.app.utils.update_learning_rate(optimizer: torch.optim.optimizer.Optimizer, learning_rate: float) None
Update the learning rate for a given optimizer. Useful when doing linear schedule.
- Parameters
optimizer –
learning_rate –
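Updating the learning rate of a PyTorch optimizer is typically done through its param_groups; a minimal sketch:
import torch

def update_learning_rate(optimizer: torch.optim.Optimizer, learning_rate: float) -> None:
    # Every parameter group carries its own "lr" entry.
    for param_group in optimizer.param_groups:
        param_group["lr"] = learning_rate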
- eve.app.utils.get_schedule_fn(value_schedule: Union[Callable[[float], float], float, int]) Callable[[float], float]
Transform (if needed) learning rate and clip range (for PPO) to callable.
- Parameters
value_schedule –
- Returns
- eve.app.utils.get_linear_fn(start: float, end: float, end_fraction: float) Callable[[float], float]
Create a function that interpolates linearly between start and end between progress_remaining = 1 and progress_remaining = end_fraction. This is used in DQN for linearly annealing the exploration fraction (epsilon for the epsilon-greedy strategy).
- Parameters
start – value to start with if progress_remaining = 1
end – value to end with if progress_remaining = 0
end_fraction – fraction of progress_remaining where end is reached, e.g. 0.1 means end is reached after 10% of the complete training process.
- Returns
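A sketch of the interpolation described above, assuming progress_remaining decreases from 1 to 0 over training:
from typing import Callable

def get_linear_fn(start: float, end: float, end_fraction: float) -> Callable[[float], float]:
    def schedule(progress_remaining: float) -> float:
        progress = 1.0 - progress_remaining   # fraction of training completed
        if progress > end_fraction:
            return end                        # hold the end value after end_fraction of training
        return start + progress * (end - start) / end_fraction
    return schedule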
- eve.app.utils.linear_schedule(initial_value: Union[float, str]) Callable[[float], float]
Linear learning rate schedule.
- Parameters
initial_value – (float or str)
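A minimal sketch of a linear schedule driven by progress_remaining (1 at the start of training, 0 at the end); names and behaviour are assumptions in the usual stable-baselines3 style:
from typing import Callable, Union

def linear_schedule(initial_value: Union[float, str]) -> Callable[[float], float]:
    if isinstance(initial_value, str):
        initial_value = float(initial_value)

    def schedule(progress_remaining: float) -> float:
        # Decays linearly from initial_value to 0 as training progresses.
        return progress_remaining * initial_value
    return schedule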
- eve.app.utils.constant_fn(val: float) Callable[[float], float]
Create a function that returns a constant It is useful for learning rate schedule (to avoid code duplication)
- Parameters
val –
- Returns
- eve.app.utils.get_device(device: Union[torch.device, str] = 'auto') torch.device
Retrieve PyTorch device. It checks that the requested device is available first. For now, it supports only cpu and cuda. By default, it tries to use the gpu.
- Parameters
device – One of 'auto', 'cuda', 'cpu'
- Returns
- eve.app.utils.get_latest_run_id(log_path: Optional[str] = None, log_name: str = '') int
Returns the latest run number for the given log name and log path, by finding the greatest number in the directories.
- Returns
latest run number
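A hedged sketch of how such a helper can scan the log directory; the naming convention <log_name>_<run_id> is an assumption for illustration:
import glob
import os

def get_latest_run_id(log_path: str = "", log_name: str = "") -> int:
    max_run_id = 0
    # Assumes runs are stored as <log_path>/<log_name>_<id>; adjust to the real convention.
    for path in glob.glob(os.path.join(log_path, f"{log_name}_[0-9]*")):
        suffix = os.path.basename(path).split("_")[-1]
        if suffix.isdigit():
            max_run_id = max(max_run_id, int(suffix))
    return max_run_id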
- eve.app.utils.safe_mean(arr: Union[numpy.ndarray, list]) numpy.ndarray
Compute the mean of an array if there is at least one element. For an empty array, return NaN. It is used for logging only.
- Parameters
arr –
- Returns
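The documented behaviour is essentially a one-liner; a minimal sketch:
import numpy as np

def safe_mean(arr) -> float:
    # Return NaN instead of raising a warning when the array is empty.
    return np.nan if len(arr) == 0 else float(np.mean(arr))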
- eve.app.utils.zip_strict(*iterables: Iterable) Iterable
zip() function, but enforces that iterables are of equal length. Raises ValueError if the iterables are not of equal length. Code inspired by the Stack Overflow answer to question #32954486.
- Parameters
*iterables – iterables to zip()
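A sketch of the sentinel trick commonly used to make zip() length-strict (an illustration of the idea, not necessarily the library's exact code):
from itertools import zip_longest
from typing import Iterable

def zip_strict(*iterables: Iterable) -> Iterable:
    sentinel = object()
    for values in zip_longest(*iterables, fillvalue=sentinel):
        # The sentinel only shows up once a shorter iterable has been exhausted.
        if any(value is sentinel for value in values):
            raise ValueError("Iterables have different lengths")
        yield values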
- eve.app.utils.polyak_update(params: Iterable[torch.nn.parameter.Parameter], target_params: Iterable[torch.nn.parameter.Parameter], tau: float) None
Perform a Polyak average update on target_params using params: target parameters are slowly updated towards the main parameters. The soft update coefficient tau controls the interpolation: tau=1 corresponds to copying the parameters to the target ones, whereas nothing happens when tau=0. The Polyak update is done in place, with no_grad, and therefore does not create intermediate tensors or a computation graph, reducing memory cost and improving performance. We scale the target params by 1-tau (in place), add the new weights scaled by tau, and store the result of the sum in the target params (in place). See https://github.com/DLR-RM/stable-baselines3/issues/93
- Parameters
params – parameters to use to update the target params
target_params – parameters to update
tau – the soft update coefficient (“Polyak update”, between 0 and 1)
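The in-place update described above can be sketched as follows (an illustration of the formula, not necessarily the library's exact code):
import torch

def polyak_update(params, target_params, tau: float) -> None:
    with torch.no_grad():
        for param, target_param in zip(params, target_params):
            # target <- (1 - tau) * target + tau * param, computed in place
            target_param.data.mul_(1 - tau)
            target_param.data.add_(param.data, alpha=tau)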
- eve.app.utils.recursive_getattr(obj: Any, attr: str, *args) Any
Recursive version of getattr taken from https://stackoverflow.com/questions/31174295
Example:
>>> MyObject.sub_object = SubObject(name='test')
>>> recursive_getattr(MyObject, 'sub_object.name')  # return test
- Parameters
obj –
attr – Attribute to retrieve
- Returns
The attribute
- eve.app.utils.recursive_setattr(obj: Any, attr: str, val: Any) None
Recursive version of setattr taken from https://stackoverflow.com/questions/31174295
Example:
>>> MyObject.sub_object = SubObject(name='test')
>>> recursive_setattr(MyObject, 'sub_object.name', 'hello')
- Parameters
obj –
attr – Attribute to set
val – New value of the attribute
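Both helpers follow the usual functools.reduce pattern from the referenced Stack Overflow answer; a minimal sketch:
import functools
from typing import Any

def recursive_getattr(obj: Any, attr: str, *args) -> Any:
    # Walk a dotted attribute path, e.g. "sub_object.name".
    def _getattr(obj, attr):
        return getattr(obj, attr, *args)
    return functools.reduce(_getattr, attr.split("."), obj)

def recursive_setattr(obj: Any, attr: str, val: Any) -> None:
    # Resolve everything up to the last dot, then set the final attribute.
    pre, _, post = attr.rpartition(".")
    parent = recursive_getattr(obj, pre) if pre else obj
    setattr(parent, post, val)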
- eve.app.utils.is_json_serializable(item: Any) bool
Test if an object is serializable into JSON
- Parameters
item – The object to be tested for JSON serialization.
- Returns
True if object is JSON serializable, false otherwise.
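A minimal sketch of the usual try/except check for JSON serializability:
import json
from typing import Any

def is_json_serializable(item: Any) -> bool:
    try:
        json.dumps(item)
        return True
    except (TypeError, OverflowError):
        return False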
- eve.app.utils.data_to_json(data: Dict[str, Any]) str
Turn data (class parameters) into a JSON string for storing
- Parameters
data – Dictionary of class parameters to be stored. Items that are not JSON serializable will be pickled with Cloudpickle and stored as bytearray in the JSON file
- Returns
JSON string of the data serialized.
- eve.app.utils.json_to_data(json_string: str, custom_objects: Optional[Dict[str, Any]] = None) Dict[str, Any]
Turn JSON serialization of class-parameters back into dictionary.
- Parameters
json_string – JSON serialization of the class-parameters that should be loaded.
custom_objects – Dictionary of objects to replace upon loading. If a variable is present in this dictionary as a key, it will not be deserialized and the corresponding item will be used instead. Similar to custom_objects in keras.models.load_model. Useful when you have an object in file that can not be deserialized.
- Returns
Loaded class parameters.
- eve.app.utils.open_path(path: Union[str, pathlib.Path, io.BufferedIOBase], mode: str, verbose: int = 0, suffix: Optional[str] = None)
Opens a path for reading or writing with a preferred suffix and raises debug information. If the provided path is a derivative of io.BufferedIOBase it ensures that the file matches the provided mode, i.e. If the mode is read (“r”, “read”) it checks that the path is readable. If the mode is write (“w”, “write”) it checks that the file is writable.
If the provided path is a string or a pathlib.Path, it ensures that it exists. If the mode is “read” it checks that it exists, if it doesn’t exist it attempts to read path.suffix if a suffix is provided. If the mode is “write” and the path does not exist, it creates all the parent folders. If the path points to a folder, it changes the path to path_2. If the path already exists and verbose == 2, it raises a warning.
- Parameters
path – the path to open. if save_path is a str or pathlib.Path and mode is “w”, single dispatch ensures that the path actually exists. If path is a io.BufferedIOBase the path exists.
mode – how to open the file. “w”|”write” for writing, “r”|”read” for reading.
verbose – Verbosity level, 0 means only warnings, 2 means debug information.
suffix – The preferred suffix. If mode is “w” then the opened file has the suffix. If mode is “r” then we attempt to open the path. If an error is raised and the suffix is not None, we attempt to open the path with the suffix.
- Returns
- eve.app.utils.open_path_str(path: str, mode: str, verbose: int = 0, suffix: Optional[str] = None) io.BufferedIOBase
Open a path given by a string. If writing to the path, the function ensures that the path exists.
- Parameters
path – the path to open. If mode is “w” then it ensures that the path exists by creating the necessary folders and renaming path if it points to a folder.
mode – how to open the file. “w” for writing, “r” for reading.
verbose – Verbosity level, 0 means only warnings, 2 means debug information.
suffix – The preferred suffix. If mode is “w” then the opened file has the suffix. If mode is “r” then we attempt to open the path. If an error is raised and the suffix is not None, we attempt to open the path with the suffix.
- Returns
- eve.app.utils.open_path_pathlib(path: pathlib.Path, mode: str, verbose: int = 0, suffix: Optional[str] = None) io.BufferedIOBase
Open a path given by a pathlib.Path object. If writing to the path, the function ensures that the path exists.
- Parameters
path – the path to check. If mode is “w” then it ensures that the path exists by creating the necessary folders and renaming path if it points to a folder.
mode – how to open the file. “w” for writing, “r” for reading.
verbose – Verbosity level, 0 means only warnings, 2 means debug information.
suffix – The preferred suffix. If mode is “w” then the opened file has the suffix. If mode is “r” then we attempt to open the path. If an error is raised and the suffix is not None, we attempt to open the path with the suffix.
- Returns
- eve.app.utils.save_to_zip_file(save_path: Union[str, pathlib.Path, io.BufferedIOBase], data: Optional[Dict[str, Any]] = None, params: Optional[Dict[str, Any]] = None, pytorch_variables: Optional[Dict[str, Any]] = None, verbose: int = 0) None
Save model data to a zip archive.
- Parameters
save_path – Where to store the model. if save_path is a str or pathlib.Path ensures that the path actually exists.
data – Class parameters being stored (non-PyTorch variables)
params – Model parameters being stored expected to contain an entry for every state_dict with its name and the state_dict.
pytorch_variables – Other PyTorch variables expected to contain name and value of the variable.
verbose – Verbosity level, 0 means only warnings, 2 means debug information
- eve.app.utils.save_to_pkl(path: Union[str, pathlib.Path, io.BufferedIOBase], obj: Any, verbose: int = 0) None
Save an object to path creating the necessary folders along the way. If the path exists and is a directory, it will raise a warning and rename the path. If a suffix is provided in the path, it will use that suffix, otherwise, it will use ‘.pkl’.
- Parameters
path – the path to open. if save_path is a str or pathlib.Path and mode is “w”, single dispatch ensures that the path actually exists. If path is a io.BufferedIOBase the path exists.
obj – The object to save.
verbose – Verbosity level, 0 means only warnings, 2 means debug information.
- eve.app.utils.load_from_pkl(path: Union[str, pathlib.Path, io.BufferedIOBase], verbose: int = 0) Any
Load an object from the path. If a suffix is provided in the path, it will use that suffix. If the path does not exist, it will attempt to load using the .pkl suffix.
- Parameters
path – the path to open. if save_path is a str or pathlib.Path and mode is “w”, single dispatch ensures that the path actually exists. If path is a io.BufferedIOBase the path exists.
verbose – Verbosity level, 0 means only warnings, 2 means debug information.
- eve.app.utils.load_from_zip_file(load_path: Union[str, pathlib.Path, io.BufferedIOBase], load_data: bool = True, device: Union[torch.device, str] = 'auto', verbose: int = 0) Tuple[Optional[Dict[str, Any]], Optional[Dict[str, torch.Tensor]], Optional[Dict[str, torch.Tensor]]]
Load model data from a .zip archive
- Parameters
load_path – Where to load the model from
load_data – Whether we should load and return data (class parameters). Mainly used by ‘load_parameters’ to only load model parameters (weights)
device – Device on which the code should run.
- Returns
Class parameters, model state_dicts (aka “params”, dict of state_dict) and dict of pytorch variables
- eve.app.utils.get_trained_models(log_folder: str) Dict[str, Tuple[str, str]]
- Parameters
log_folder – (str) Root log folder
- Returns
(Dict[str, Tuple[str, str]]) Dict representing the trained agent
- eve.app.utils.get_saved_hyperparams(stats_path: str, norm_reward: bool = False, test_mode: bool = False) Tuple[Dict[str, Any], str]
- Parameters
stats_path –
norm_reward –
test_mode –