Utilities

class fuel.utils.Subset(list_or_slice, original_num_examples)

Bases: object

A description of a subset of examples.

Parameters:

list_or_slice (`list` or `slice`) – List of positive integer indices or slice that describes which examples are part of the subset.
original_num_examples (int) – Number of examples in the dataset this subset belongs to.

is_list

Whether the Subset is a list-based subset (as opposed to a slice-based subset).

Type:	bool

num_examples

Number of examples the Subset spans.

Type:	int

original_num_examples

Number of examples in the dataset this subset is part of.

Type:	int

classmethod empty_subset(original_num_examples)

Construct an empty Subset.

Parameters:	original_num_examples (int) – Number of examples in the dataset this subset is part of.

get_list_representation(): Returns this subset’s representation as a list of indices.

index_within_subset(indexable, subset_request, sort_indices=False)

Index an indexable object within the context of this subset.

Parameters:

indexable (indexable object) – The object to index through.
subset_request (list or slice) – List of positive integer indices or slice that constitutes the request within the context of this subset. This request will be translated to a request on the indexable object.
sort_indices (bool, optional) – If the request is a list of indices, index in sorted order and then restore the result to the order of the original request. Defaults to False.
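
The translation of a request made within a subset into a request on the underlying indexable can be sketched in plain Python. This is only an illustration for a list-based subset: `translate_request_sketch` and its arguments are hypothetical names, not part of Fuel, and the real method may handle slices and edge cases differently.

```python
def translate_request_sketch(subset_indices, request):
    """Map a request expressed relative to a subset onto the original indexable.

    `subset_indices` lists which positions of the original indexable belong
    to the subset (hypothetical representation, assumed for this sketch).
    """
    if isinstance(request, slice):
        # Slicing the subset's index list yields the original positions.
        return subset_indices[request]
    # A list request picks subset positions one by one.
    return [subset_indices[i] for i in request]


data = ['a', 'b', 'c', 'd', 'e', 'f']
subset_indices = [1, 3, 5]          # the subset spans 'b', 'd' and 'f'

translated = translate_request_sketch(subset_indices, [2, 0])
picked = [data[i] for i in translated]   # ['f', 'b']
```

Here position 2 of the subset refers to position 5 of the data, so the request `[2, 0]` becomes `[5, 1]`.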

is_empty: Whether this subset is empty.

is_list: Whether this subset is list-based (as opposed to slice-based).

num_examples: The number of examples this subset spans.

static slice_to_numerical_args(slice_, num_examples)

Translate a slice’s attributes into numerical attributes.

Parameters:

slice_ (`slice`) – Slice for which numerical attributes are wanted.
num_examples (int) – Number of examples in the indexable that is to be sliced through. This determines the numerical value for the stop attribute in case it’s None.
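
As a rough illustration of what translating a slice into numerical attributes can look like, the sketch below fills in each attribute that is None. The function name, the defaults of 0 and 1 for start and step, and the `(start, stop, step)` return shape are assumptions made for this example; only the use of `num_examples` for a missing stop is stated above.

```python
def slice_to_numerical_args_sketch(slice_, num_examples):
    """Replace None attributes of a slice with concrete numbers (sketch)."""
    # Assumed defaults: start at 0 and step by 1 when unspecified.
    start = slice_.start if slice_.start is not None else 0
    # A missing stop means "up to the end of the indexable".
    stop = slice_.stop if slice_.stop is not None else num_examples
    step = slice_.step if slice_.step is not None else 1
    return start, stop, step


full = slice_to_numerical_args_sketch(slice(None), 10)            # (0, 10, 1)
strided = slice_to_numerical_args_sketch(slice(2, None, 3), 10)   # (2, 10, 3)
```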

static sorted_fancy_indexing(indexable, request)

Safe fancy indexing.

Some objects, such as h5py datasets, only support list indexing if the list is sorted.

This static method adds support for unsorted list indexing by sorting the requested indices, accessing the corresponding elements and re-shuffling the result.

Parameters:

request (list of int) – Unsorted list of example indices.
indexable (any fancy-indexable object) – Indexable we’d like to do unsorted fancy indexing on.
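
The sort, access and re-shuffle steps described above can be sketched without any third-party dependency. This is a minimal illustration using a plain list in place of an h5py dataset; `sorted_fancy_indexing_sketch` is a hypothetical name and not Fuel's implementation.

```python
def sorted_fancy_indexing_sketch(indexable, request):
    """Index with an unsorted list by sorting first, then undoing the sort."""
    # order[p] is the position in `request` of the p-th smallest index.
    order = sorted(range(len(request)), key=lambda k: request[k])
    sorted_request = [request[k] for k in order]
    # Access elements in ascending index order, as a sorted-only backend needs.
    sorted_result = [indexable[i] for i in sorted_request]
    # Put every element back at the position it had in the original request.
    result = [None] * len(request)
    for position, k in enumerate(order):
        result[k] = sorted_result[position]
    return result


data = [10, 11, 12, 13, 14]
picked = sorted_fancy_indexing_sketch(data, [3, 0, 2])   # [13, 10, 12]
```

The caller sees the elements in the order requested, even though the underlying access happened in sorted order.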

classmethod subset_of(subset, list_or_slice)

Construct a Subset that is a subset of another Subset.

Parameters:

subset (`Subset`) – Subset to take the subset of.
list_or_slice (`list` or `slice`) – List of positive integer indices or slice that describes which examples are part of the subset’s subset.

fuel.utils.do_not_pickle_attributes(*lazy_properties)

Decorator to assign non-picklable properties.

Used to assign properties which will not be pickled on some class. This decorator creates a series of properties whose values won’t be serialized; instead, their values will be reloaded (e.g. from disk) by the load() function after deserializing the object.

The decorator can be used to avoid the serialization of bulky attributes. Another possible use is for attributes which cannot be pickled at all. In this case the user should construct the attribute themselves in load().

Parameters:	lazy_properties (strings*) – The names of the attributes that are lazy.

Notes

The pickling behavior of the dataset is only overridden if the dataset does not have a __getstate__ method implemented.

Examples

To make sure that attributes are not serialized with the dataset and are lazily reloaded after deserialization by load() in the wrapped class, use the decorator with the names of the attributes as arguments.

>>> from fuel.datasets import Dataset
>>> from fuel.utils import do_not_pickle_attributes
>>> @do_not_pickle_attributes('features', 'targets')
... class TestDataset(Dataset):
...     def load(self):
...         self.features = range(10 ** 6)
...         self.targets = range(10 ** 6)[::-1]
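
To make the mechanism more concrete, here is a self-contained sketch of how such a decorator could work: each named attribute becomes a property backed by a private attribute, the getter calls load() on first access, and a `__getstate__` that drops the private attributes is installed only when the class does not define its own. The names `lazy_attributes_sketch` and `ToyDataset` are hypothetical, and this is not Fuel's actual code.

```python
def lazy_attributes_sketch(*names):
    """Hypothetical stand-in for the decorator; not Fuel's implementation."""
    def make_property(name):
        private = '_' + name

        def getter(self):
            # Load on first access; load() is expected to set the attribute.
            if private not in self.__dict__:
                self.load()
            return self.__dict__[private]

        def setter(self, value):
            self.__dict__[private] = value

        return property(getter, setter)

    def wrap(cls):
        for name in names:
            setattr(cls, name, make_property(name))
        # Only add __getstate__ if no user-defined class in the hierarchy has one.
        user_defined = any('__getstate__' in vars(k) for k in cls.__mro__[:-1])
        if not user_defined:
            def __getstate__(self):
                state = self.__dict__.copy()
                for name in names:
                    state.pop('_' + name, None)   # leave lazy values out
                return state
            cls.__getstate__ = __getstate__
        return cls
    return wrap


@lazy_attributes_sketch('features')
class ToyDataset:
    def __init__(self):
        self.name = 'toy'

    def load(self):
        self.features = list(range(5))


dataset = ToyDataset()
first = dataset.features[0]        # triggers load()
state = dataset.__getstate__()     # {'name': 'toy'}
```

The state handed to the pickler contains `name` but not the loaded features, which would be rebuilt by load() on the next access after deserialization.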

fuel.utils.find_in_data_path(filename)

Searches for a file within Fuel’s data path.

This function loops over all paths defined in Fuel’s data path and returns the first path in which the file is found.

Parameters:	filename (str) – Name of the file to find.
Returns:	file_path – Path to the first file matching filename found in Fuel’s data path.
Return type:	str
Raises:	`IOError` – If the file doesn’t appear in Fuel’s data path.
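
The search loop described above can be sketched as follows. In this illustration the directories are passed in explicitly through a hypothetical `data_paths` argument, whereas the real function reads them from Fuel's configuration; the function name is also an assumption.

```python
import os
import tempfile


def find_in_paths_sketch(filename, data_paths):
    """Return the first path under `data_paths` that contains `filename`."""
    for path in data_paths:
        candidate = os.path.join(path, filename)
        if os.path.isfile(candidate):
            return candidate
    # Mirror the documented behaviour: fail loudly when nothing matches.
    raise IOError("{} not found in the data path".format(filename))


# Usage: only the second directory contains the file.
first_dir = tempfile.mkdtemp()
second_dir = tempfile.mkdtemp()
with open(os.path.join(second_dir, 'toy.hdf5'), 'w') as handle:
    handle.write('placeholder')

found = find_in_paths_sketch('toy.hdf5', [first_dir, second_dir])
```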

fuel.utils.iterable_fancy_indexing(iterable, request)

fuel.utils.lazy_property_factory(lazy_property): Create properties that perform lazy loading of attributes.

fuel.utils.remember_cwd(*args, **kwds)

Caching

This module makes it possible to keep a local cache of a dataset, or of part of one. It is meant for the case where multiple jobs read the same dataset from ${FUEL_DATA_PATH}, which can place a heavy load on the network. A local copy of any required file can be made in ${FUEL_LOCAL_DATA_PATH} and shared by multiple processes, instead of each process acquiring its own copy over the network. Whenever a folder or a dataset copy is created locally, it is granted the same access as it has under ${FUEL_LOCAL_DATA_PATH}; the default copy operation guarantees this.

fuel.utils.cache.cache_file(filename)

Caches a file locally if possible.

If caching was successful, or if the file was previously successfully cached, this method returns the path to the local copy of the file. If not, it returns the path to the original file.

Parameters:	filename (str) – Remote file to cache locally.
Returns:	output – Updated (if needed) filename to use to access the remote file.
Return type:	str
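
The fallback behaviour described above (return the local copy when caching works, the original path otherwise) can be sketched as below. This is a simplified illustration only: the function name and the explicit `remote_dir` and `local_dir` arguments are hypothetical, standing in for ${FUEL_DATA_PATH} and ${FUEL_LOCAL_DATA_PATH}, and anything the real function does to coordinate concurrent processes is left out.

```python
import os
import shutil
import tempfile


def cache_file_sketch(filename, remote_dir, local_dir):
    """Return a local copy of `filename` if possible, else `filename` itself."""
    # Files outside the remote directory are not cached.
    if not filename.startswith(os.path.join(remote_dir, '')):
        return filename
    local_name = os.path.join(local_dir, os.path.relpath(filename, remote_dir))
    if os.path.exists(local_name):
        return local_name              # previously cached
    try:
        os.makedirs(os.path.dirname(local_name), exist_ok=True)
        shutil.copy(filename, local_name)   # also copies permission bits
    except OSError:
        return filename                # caching failed; use the original
    return local_name


# Usage with temporary directories standing in for the two data paths.
remote_dir = tempfile.mkdtemp()
local_dir = tempfile.mkdtemp()
remote_file = os.path.join(remote_dir, 'toy.hdf5')
with open(remote_file, 'w') as handle:
    handle.write('placeholder')

cached = cache_file_sketch(remote_file, remote_dir, local_dir)
```

Callers can use the returned path unconditionally, since it always points at a readable copy of the file.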

fuel.utils.cache.copy_from_server_to_local(dataset_remote_dir, dataset_local_dir, remote_fname, local_fname)

Copies a remote file locally.

Parameters:

remote_fname (str) – Remote file to copy.
local_fname (str) – Path and name of the local copy to be made of the remote file.