Iteration schemes

class fuel.schemes.BatchScheme(examples, batch_size)[source]

Bases: fuel.schemes.IterationScheme

Iteration schemes that return slices or indices for batches.

For datasets where the number of examples is known and easily accessible (as is the case for most datasets small enough to be kept in memory, such as MNIST), we can provide slices or lists of indices to the dataset.

Parameters:
  • examples (int or list) – Defines which examples from the dataset are iterated. If a list, its items are the indices of the examples. If an integer, that many examples are used from the beginning of the dataset, i.e. it is interpreted as range(examples).
  • batch_size (int) – The request iterator will return slices or lists of indices in batches of size batch_size until the end of examples is reached. Note that this means the last batch returned could be smaller than batch_size. To ensure all batches are of equal size, make sure that examples (if an integer) or len(examples) (if a list) is a multiple of batch_size.
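The batching rule described by these two parameters can be sketched in plain Python. This is an illustration of the documented behaviour only, not Fuel's implementation; the helper name batch_requests is hypothetical.

```python
def batch_requests(examples, batch_size):
    """Yield lists of indices, each at most batch_size long (hypothetical helper)."""
    # An integer is interpreted as range(examples), as described above.
    if isinstance(examples, int):
        indices = list(range(examples))
    else:
        indices = list(examples)
    for start in range(0, len(indices), batch_size):
        yield indices[start:start + batch_size]


# 10 examples in batches of 4: the last batch holds the 2 leftover indices.
batches = list(batch_requests(10, 4))
print(batches)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]

# An explicit list of indices restricts iteration to those examples.
subset_batches = list(batch_requests([3, 1, 4, 1, 5], 2))
print(subset_batches)  # [[3, 1], [4, 1], [5]]
```

Choosing a batch_size that divides the number of examples (for instance 5 here) would make every batch the same size.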
requests_examples = False

class fuel.schemes.BatchSizeScheme[source]

Bases: fuel.schemes.IterationScheme

Iteration scheme that returns batch sizes.

For infinite datasets it doesn’t make sense to provide indices to examples, but the number of samples per batch can still be given. Hence BatchSizeScheme is the base class for iteration schemes that only provide the number of examples that should be in a batch.

requests_examples = False

class fuel.schemes.ConcatenatedScheme(schemes)[source]

Bases: fuel.schemes.IterationScheme

Build an iterator by concatenating several schemes’ iterators.

Useful for iterating through different subsets of data in a specific order.

Parameters:
  • schemes (list) – A list of IterationSchemes, whose request iterators are to be concatenated in the order given.

Notes

All schemes being concatenated must produce the same type of requests (batches or examples).
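As a rough illustration of concatenation and of the same-request-type rule, the sketch below chains the request iterators of two stand-in schemes. ListScheme and concatenated_requests are hypothetical names introduced for this example, and raising ValueError on mixed request types is an assumption, not a statement about Fuel's behaviour.

```python
from itertools import chain


class ListScheme:
    """Hypothetical stand-in for an iteration scheme (not a Fuel class)."""

    requests_examples = False  # this stand-in produces batch requests

    def __init__(self, requests):
        self.requests = requests

    def get_request_iterator(self):
        return iter(self.requests)


def concatenated_requests(schemes):
    """Chain the request iterators of several schemes, in the order given."""
    kinds = {scheme.requests_examples for scheme in schemes}
    if len(kinds) > 1:
        # Assumed error handling: mixing batch and example requests is rejected.
        raise ValueError("all schemes must produce the same type of requests")
    return chain.from_iterable(s.get_request_iterator() for s in schemes)


first = ListScheme([[0, 1], [2, 3]])
second = ListScheme([[8, 9]])
combined = list(concatenated_requests([first, second]))
print(combined)  # [[0, 1], [2, 3], [8, 9]]
```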

get_request_iterator()[source]

Returns an iterator type.

requests_examples

class fuel.schemes.ConstantScheme(batch_size, num_examples=None, times=None)[source]

Bases: fuel.schemes.BatchSizeScheme

Constant batch size iterator.

This subset iterator simply returns the same constant batch size a given number of times (or indefinitely, if no limit is given).

Parameters:
  • batch_size (int) – The size of the batch to return.
  • num_examples (int, optional) – If given, the request iterator will return batch_size until the sum reaches num_examples. Note that this means the last batch size returned could be smaller than batch_size. To ensure all batches are of equal size, pass times equal to num_examples / batch_size instead.
  • times (int, optional) – The number of times to return batch_size.

get_request_iterator()[source]

Returns an iterator type.
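The three modes described for ConstantScheme (a fixed number of times, a total number of examples, or no limit) can be sketched with itertools. The function constant_batch_sizes is a hypothetical stand-in written from the parameter descriptions above, not Fuel's code.

```python
from itertools import chain, islice, repeat


def constant_batch_sizes(batch_size, num_examples=None, times=None):
    """Return an iterator of batch sizes (hypothetical sketch)."""
    if times is not None:
        # Exactly `times` requests, all of size batch_size.
        return repeat(batch_size, times)
    if num_examples is not None:
        # Full batches first, then one smaller batch for any remainder.
        full, remainder = divmod(num_examples, batch_size)
        tail = [remainder] if remainder else []
        return chain(repeat(batch_size, full), tail)
    # No limit given: the same batch size forever.
    return repeat(batch_size)


sizes_by_total = list(constant_batch_sizes(32, num_examples=100))
print(sizes_by_total)  # [32, 32, 32, 4]

sizes_by_times = list(constant_batch_sizes(32, times=3))
print(sizes_by_times)  # [32, 32, 32]

# The unlimited iterator never ends, so only a slice of it is taken here.
first_five = list(islice(constant_batch_sizes(32), 5))
print(first_five)  # [32, 32, 32, 32, 32]
```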

class fuel.schemes.IndexScheme(examples)[source]

Bases: fuel.schemes.IterationScheme

Iteration schemes that return single indices.

This is for datasets that support indexing (as with BatchScheme), but where we want to return single examples instead of batches.

requests_examples = True

class fuel.schemes.IterationScheme[source]

Bases: object

An iteration scheme.

Iteration schemes provide dataset-agnostic ways of iterating over data, such as sequential batches or shuffled batches, for datasets that choose to support them.

requests_examples

Whether requests produced by this scheme correspond to single examples (as opposed to batches).

Type: bool

Notes

Iteration schemes implement the get_request_iterator() method, which returns an iterator type (e.g. a generator or a class which implements the iterator protocol).

Stochastic iteration schemes should generally not be shared between different data streams, because it would make experiments harder to reproduce.

get_request_iterator()[source]

Returns an iterator type.
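To make the get_request_iterator() contract concrete, here is a minimal sketch of a scheme-like class. EveryOtherExampleScheme is a hypothetical name, and the class deliberately does not subclass Fuel's IterationScheme so that the example stays self-contained.

```python
class EveryOtherExampleScheme:
    """Hypothetical scheme that requests every second example (illustration only)."""

    requests_examples = True  # each request is a single example index

    def __init__(self, num_examples):
        self.num_examples = num_examples

    def get_request_iterator(self):
        # A fresh iterator is built on every call, so the scheme can be
        # iterated over more than once.
        return iter(range(0, self.num_examples, 2))


scheme = EveryOtherExampleScheme(7)
first_pass = list(scheme.get_request_iterator())
second_pass = list(scheme.get_request_iterator())
print(first_pass)  # [0, 2, 4, 6]
```

Because this scheme is deterministic, both passes produce the same requests; a stochastic scheme would carry random state between calls, which is why the note above advises against sharing one between data streams.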

class fuel.schemes.SequentialExampleScheme(examples)[source]

Bases: fuel.schemes.IndexScheme

Sequential examples iterator.

Returns examples in order.

get_request_iterator()[source]

Returns an iterator type.

class fuel.schemes.SequentialScheme(examples, batch_size)[source]

Bases: fuel.schemes.BatchScheme

Sequential batches iterator.

Iterate over all the examples in a dataset of fixed size sequentially in batches of a given size.

Notes

The batch size isn’t enforced, so the last batch could be smaller.

get_request_iterator()[source]

Returns an iterator type.

class fuel.schemes.ShuffledExampleScheme(*args, **kwargs)[source]

Bases: fuel.schemes.IndexScheme

Shuffled examples iterator.

Returns examples in random order.

get_request_iterator()[source]

Returns an iterator type.

class fuel.schemes.ShuffledScheme(*args, **kwargs)[source]

Bases: fuel.schemes.BatchScheme

Shuffled batches iterator.

Iterate over all the examples in a dataset of fixed size in shuffled batches.

Parameters:
  • sorted_indices (bool, optional) – If True, enforce that indices within a batch are ordered. Defaults to False.

Notes

The batch size isn’t enforced, so the last batch could be smaller.

Shuffling the batches requires creating a shuffled list of indices in memory. This can be memory-intensive for very large numbers of examples (e.g. on the order of tens of millions).

get_request_iterator()[source]

Returns an iterator type.
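The shuffling behaviour and the effect of sorted_indices can be sketched as follows. This uses Python's standard random module and a hypothetical seed parameter for reproducibility; both are assumptions made for the example and do not describe how Fuel draws its random numbers.

```python
import random


def shuffled_batches(num_examples, batch_size, sorted_indices=False, seed=None):
    """Yield shuffled index batches (hypothetical sketch, not Fuel's code)."""
    rng = random.Random(seed)
    # The whole index list is materialised in memory before shuffling,
    # which is the cost mentioned in the notes above.
    indices = list(range(num_examples))
    rng.shuffle(indices)
    for start in range(0, num_examples, batch_size):
        batch = indices[start:start + batch_size]
        # Optionally order the indices inside each batch.
        yield sorted(batch) if sorted_indices else batch


shuffled = list(shuffled_batches(10, 4, sorted_indices=True, seed=0))
for batch in shuffled:
    print(batch)
```

Every example appears exactly once across the batches, the final batch holds the two leftover indices, and with sorted_indices=True each batch is internally ordered even though the assignment of examples to batches is random.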

fuel.schemes.cross_validation(scheme_class, num_examples, num_folds, strict=True, **kwargs)[source]

Return pairs of schemes to be used for cross-validation.

Parameters:
  • scheme_class (subclass of IndexScheme or BatchScheme) – The type of the returned schemes. The constructor is called with an iterator and **kwargs as arguments.
  • num_examples (int) – The number of examples in the data stream.
  • num_folds (int) – The number of folds to return.
  • strict (bool, optional) – If True, enforce that num_examples is divisible by num_folds, so that all validation sets have the same size. If False, the size of the validation set is returned along with the iteration schemes. Defaults to True.

Yields:

fold (tuple) – The generator yields num_folds tuples. The first two elements of each tuple are the training and validation iteration schemes. If strict is set to False, each tuple has a third element giving the size of the validation set.
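The shape of the yielded tuples can be illustrated with a sketch that produces plain index lists instead of scheme objects. The function cross_validation_indices is hypothetical; splitting into contiguous folds and raising ValueError in strict mode are assumptions made for the example.

```python
def cross_validation_indices(num_examples, num_folds, strict=True):
    """Yield (train, valid) index lists per fold (hypothetical sketch)."""
    if strict and num_examples % num_folds != 0:
        # Assumed error handling for the strict case.
        raise ValueError("num_examples must be divisible by num_folds")
    for fold in range(num_folds):
        # Assumption: each validation fold is a contiguous block of indices.
        begin = num_examples * fold // num_folds
        end = num_examples * (fold + 1) // num_folds
        valid = list(range(begin, end))
        train = list(range(0, begin)) + list(range(end, num_examples))
        if strict:
            yield train, valid
        else:
            # Non-strict mode also reports the validation set size.
            yield train, valid, end - begin


strict_folds = list(cross_validation_indices(6, 3))
print(strict_folds[1])  # ([0, 1, 4, 5], [2, 3])

loose_folds = list(cross_validation_indices(7, 3, strict=False))
print([size for _, _, size in loose_folds])  # [2, 2, 3]
```

In the real function, each index collection would be passed to scheme_class together with **kwargs to build the training and validation schemes.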