Utilities
-
class fuel.utils.Subset(list_or_slice, original_num_examples)
Bases: object
A description of a subset of examples.
Parameters: -
classmethod empty_subset(original_num_examples)
Construct an empty Subset.
Parameters: original_num_examples (int) – Number of examples in the dataset this subset is part of.
-
index_within_subset(indexable, subset_request, sort_indices=False)
Index an indexable object within the context of this subset.
Parameters:
- indexable (indexable object) – The object to index through.
- subset_request (list or slice) – List of positive integer indices or slice that constitutes the request within the context of this subset. This request will be translated to a request on the indexable object.
- sort_indices (bool, optional) – If the request is a list of indices, indexes in sorted order and reshuffles the result in the original order. Defaults to False.
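The translation of a request "within the context of this subset" into a request on the underlying object can be pictured with a small sketch. This is not Fuel's code: the function name `translate_request_sketch` is hypothetical, and it assumes only that a subset is either a list of original indices or a slice over `range(original_num_examples)`.

```python
def translate_request_sketch(subset, request, original_num_examples):
    """Map a request expressed in subset positions to original example indices."""
    # Materialise the subset as a sequence of original indices.
    if isinstance(subset, slice):
        members = range(original_num_examples)[subset]
    else:
        members = list(subset)
    # Apply the request to that sequence.
    if isinstance(request, slice):
        return list(members[request])
    return [members[i] for i in request]


slice_in_slice = translate_request_sketch(slice(10, 20), slice(2, 5), 100)
list_in_list = translate_request_sketch([5, 2, 9], [0, 2], 100)
list_in_slice = translate_request_sketch(slice(10, 20), [0, 9], 100)
```

Here `slice_in_slice` holds the original indices 12, 13 and 14: positions 2 to 4 of a subset that starts at example 10.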
-
is_empty
Whether this subset is empty.
-
is_list
Whether this subset is list-based (as opposed to slice-based).
-
num_examples
The number of examples this subset spans.
-
static slice_to_numerical_args(slice_, num_examples)
Translate a slice’s attributes into numerical attributes.
Parameters:
-
static sorted_fancy_indexing(indexable, request)
Safe fancy indexing.
Some objects, such as h5py datasets, only support list indexing if the list is sorted.
This static method adds support for unsorted list indexing by sorting the requested indices, accessing the corresponding elements and re-shuffling the result.
Parameters:
- request (list of int) – Unsorted list of example indices.
- indexable (any fancy-indexable object) – Indexable we’d like to do unsorted fancy indexing on.
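The sort, access and re-shuffle steps described above can be sketched in plain Python. Everything here is illustrative rather than Fuel's implementation: `SortedOnlyStore` is a hypothetical stand-in for a container that rejects unsorted index lists, and `sorted_fancy_indexing_sketch` is a hypothetical name.

```python
class SortedOnlyStore:
    """Hypothetical container that only accepts index lists in increasing order."""

    def __init__(self, values):
        self._values = list(values)

    def __getitem__(self, indices):
        if list(indices) != sorted(indices):
            raise TypeError("indices must be in increasing order")
        return [self._values[i] for i in indices]


def sorted_fancy_indexing_sketch(indexable, request):
    """Sort the request, index once, then restore the requested order."""
    # Positions of the request's elements, listed in sorted order of their values.
    order = sorted(range(len(request)), key=request.__getitem__)
    sorted_result = indexable[[request[i] for i in order]]
    # sorted_result[k] belongs at output position order[k].
    result = [None] * len(request)
    for k, position in enumerate(order):
        result[position] = sorted_result[k]
    return result


store = SortedOnlyStore(v * 10 for v in range(10))
restored = sorted_fancy_indexing_sketch(store, [3, 0, 2])
```

Indexing `store` directly with `[3, 0, 2]` raises `TypeError`, while the helper returns the elements in the order they were requested.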
-
fuel.utils.do_not_pickle_attributes(*lazy_properties)
Decorator to assign non-picklable properties.
Used to assign properties which will not be pickled on some class. This decorator creates a series of properties whose values won’t be serialized; instead, their values will be reloaded (e.g. from disk) by the load() function after deserializing the object.
The decorator can be used to avoid the serialization of bulky attributes. Another possible use is for attributes which cannot be pickled at all. In this case the user should construct the attribute themselves in load().
Parameters: *lazy_properties (strings) – The names of the attributes that are lazy.
Notes
The pickling behavior of the dataset is only overridden if the dataset does not have a __getstate__ method implemented.
Examples
To make sure that attributes are not serialized with the dataset, and are lazily reloaded after deserialization by the load() method of the wrapped class, use the decorator with the names of the attributes as arguments.
>>> from fuel.datasets import Dataset
>>> from fuel.utils import do_not_pickle_attributes
>>> @do_not_pickle_attributes('features', 'targets')
... class TestDataset(Dataset):
...     def load(self):
...         self.features = range(10 ** 6)
...         self.targets = range(10 ** 6)[::-1]
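To make the mechanism concrete, here is a minimal standalone sketch of how a decorator of this kind could work. It is an assumption-laden illustration, not Fuel's implementation: the names `lazy_property_sketch`, `skip_when_pickling` and `ToyDataset` are hypothetical, lazy values are assumed to live under a leading-underscore name, and, unlike the behavior described in the notes, the sketch always installs its own `__getstate__`.

```python
def lazy_property_sketch(name):
    """Build a property that calls self.load() the first time `name` is read."""
    private = "_" + name

    def getter(self):
        if private not in self.__dict__:
            self.load()  # load() is expected to assign the attribute
        return self.__dict__[private]

    def setter(self, value):
        self.__dict__[private] = value

    return property(getter, setter)


def skip_when_pickling(*names):
    """Class decorator that leaves the named attributes out of the pickled state."""
    def wrap(cls):
        for name in names:
            setattr(cls, name, lazy_property_sketch(name))

        def __getstate__(self):
            state = self.__dict__.copy()
            for name in names:
                state.pop("_" + name, None)
            return state

        cls.__getstate__ = __getstate__
        return cls

    return wrap


@skip_when_pickling("features")
class ToyDataset:
    load_calls = 0

    def __init__(self):
        self.name = "toy"

    def load(self):
        type(self).load_calls += 1
        self.features = list(range(5))


original = ToyDataset()
first_read = original.features        # triggers the first load()
state = original.__getstate__()       # what pickle would serialize

# Roughly what unpickling does when no __setstate__ is defined.
clone = ToyDataset.__new__(ToyDataset)
clone.__dict__.update(state)
second_read = clone.features          # triggers load() again on the clone
```

The serialized state keeps `name` but not the bulky `features`, which the clone rebuilds on first access.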
-
fuel.utils.find_in_data_path(filename)
Searches for a file within Fuel’s data path.
This function loops over all paths defined in Fuel’s data path and returns the first path in which the file is found.
Parameters: filename (str) – Name of the file to find.
Returns: file_path – Path to the first file matching filename found in Fuel’s data path.
Return type: str
Raises: IOError – If the file doesn’t appear in Fuel’s data path.
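The lookup described above amounts to a first-match search over a list of directories. The sketch below assumes the data path is already available as a Python list; the function name `find_in_paths_sketch` and the `data_paths` argument are hypothetical and not part of Fuel's API.

```python
import os
import tempfile


def find_in_paths_sketch(filename, data_paths):
    """Return the first existing `filename` under any directory in `data_paths`."""
    for directory in data_paths:
        candidate = os.path.join(directory, filename)
        if os.path.isfile(candidate):
            return candidate
    raise IOError("{} not found in data path".format(filename))


with tempfile.TemporaryDirectory() as first, tempfile.TemporaryDirectory() as second:
    target = os.path.join(second, "toy.hdf5")
    open(target, "w").close()
    # `first` is searched before `second`, but only `second` holds the file.
    found = find_in_paths_sketch("toy.hdf5", [first, second])
    found_matches = (found == target)
```

If the file is present in several directories, the directory listed first wins, which mirrors the "first path in which the file is found" wording.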
-
fuel.utils.lazy_property_factory(lazy_property)
Create properties that perform lazy loading of attributes.
Caching
This module can make a local cache of a dataset, or of part of it. It is meant for the case where multiple jobs read the same dataset from ${FUEL_DATA_PATH}, which can put a heavy load on the network. With it, a local copy of any required file can be made in ${FUEL_LOCAL_DATA_PATH} and shared by multiple processes simultaneously, instead of each process fetching its own copy over the network. Whenever a folder or a dataset copy is created locally, it is granted the same access as it has under ${FUEL_LOCAL_DATA_PATH}. This is guaranteed by the default copy.
-
fuel.utils.cache.cache_file(filename)
Caches a file locally if possible.
If caching was successful, or if the file was previously successfully cached, this method returns the path to the local copy of the file. If not, it returns the path to the original file.
Parameters: filename (str) – Remote file to cache locally.
Returns: output – Updated (if needed) filename to use to access the remote file.
Return type: str
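The fall-back behavior (use the local copy when caching works, the original path otherwise) can be illustrated with a simplified sketch. It is not Fuel's implementation: `cache_file_sketch` is a hypothetical name, `remote_root` and `local_root` are explicit arguments standing in for ${FUEL_DATA_PATH} and ${FUEL_LOCAL_DATA_PATH}, and nothing here coordinates concurrent processes.

```python
import os
import shutil
import tempfile


def cache_file_sketch(filename, remote_root, local_root):
    """Return a local copy of `filename` if one can be made, else `filename`."""
    if not local_root:
        return filename
    filename = os.path.abspath(filename)
    remote_root = os.path.abspath(remote_root)
    # Only files that live under the remote root are candidates for caching.
    if os.path.commonpath([filename, remote_root]) != remote_root:
        return filename
    local_name = os.path.join(local_root, os.path.relpath(filename, remote_root))
    if os.path.exists(local_name):
        return local_name  # previously cached
    try:
        os.makedirs(os.path.dirname(local_name), exist_ok=True)
        shutil.copy(filename, local_name)  # copies contents and permission bits
    except OSError:
        return filename  # caching failed: fall back to the original file
    return local_name


with tempfile.TemporaryDirectory() as remote, tempfile.TemporaryDirectory() as local:
    source = os.path.join(remote, "mnist", "train.bin")
    os.makedirs(os.path.dirname(source))
    with open(source, "wb") as handle:
        handle.write(b"toy data")
    cached = cache_file_sketch(source, remote, local)
    with open(cached, "rb") as handle:
        cached_bytes = handle.read()
    cached_is_local = os.path.abspath(cached).startswith(os.path.abspath(local))
    # Without a local root, the original path comes back unchanged.
    uncached = cache_file_sketch(source, remote, None)
    uncached_is_source = (uncached == source)
```

Callers can use the returned path without knowing whether caching succeeded, which is the contract the description above states.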