Converters
Base classes
- fuel.converters.base.check_exists(required_files)
Decorator that checks if required files exist before running.
Parameters: required_files (list of str) – A list of strings indicating the filenames of regular files (not directories) that should be found in the input directory (which is the first argument to the wrapped function).
Returns: wrapper – A function that takes a function and returns a wrapped function. The function returned by wrapper will include input file existence verification.
Return type: function
Notes
Assumes that the directory in which to find the input files is provided as the first argument, with the argument name directory.
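The behaviour described above can be illustrated with a small stand-in decorator. This is a minimal sketch under assumptions, not Fuel's implementation: the names check_exists_sketch and convert_demo are hypothetical, and raising IOError for missing files is an assumed choice.

```python
import os
from functools import wraps


def check_exists_sketch(required_files):
    """Illustrative stand-in for the decorator; not Fuel's own code."""
    def wrapper(func):
        @wraps(func)
        def wrapped(directory, *args, **kwargs):
            # Collect every required name that is not a regular file.
            missing = [name for name in required_files
                       if not os.path.isfile(os.path.join(directory, name))]
            if missing:
                # Assumed error type; the real decorator may differ.
                raise IOError("Missing required files: " + ", ".join(missing))
            return func(directory, *args, **kwargs)
        return wrapped
    return wrapper


@check_exists_sketch(required_files=["iris.data"])
def convert_demo(directory):
    # Hypothetical converter body; only reached when iris.data exists.
    return (os.path.join(directory, "iris.data"),)
```

Because the check reads the first positional argument, a converter wrapped this way has to take the input directory first, which matches the note above.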
- fuel.converters.base.fill_hdf5_file(h5file, data)
Fills an HDF5 file in an H5PYDataset-compatible manner.
Parameters:
- h5file (h5py.File) – File handle for an HDF5 file.
- data (tuple of tuple) – One element per split/source pair. Each element consists of a tuple of (split_name, source_name, data_array, comment), where
  - ‘split_name’ is a string identifier for the split name
  - ‘source_name’ is a string identifier for the source name
  - ‘data_array’ is a numpy.ndarray containing the data for this split/source pair
  - ‘comment’ is a comment string for the split/source pair
  The ‘comment’ element can optionally be omitted.
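As an illustration of the data argument described above, the following sketch builds such a tuple from placeholder arrays. The split names, source names, shapes, and comment text are made up for the example; only the (split_name, source_name, data_array, comment) layout comes from the description.

```python
import numpy

# Placeholder arrays; shapes and dtypes are arbitrary for illustration.
train_features = numpy.zeros((6, 4), dtype="float32")
train_targets = numpy.zeros((6, 1), dtype="uint8")
test_features = numpy.zeros((2, 4), dtype="float32")
test_targets = numpy.zeros((2, 1), dtype="uint8")

# One element per split/source pair; the comment is optional.
data = (
    ("train", "features", train_features),
    ("train", "targets", train_targets),
    ("test", "features", test_features, "held-out examples"),
    ("test", "targets", test_targets),
)

# Unpack each entry the way a consumer of this layout might.
summary = []
for entry in data:
    split_name, source_name, data_array = entry[:3]
    comment = entry[3] if len(entry) > 3 else ""
    summary.append((split_name, source_name, data_array.shape, comment))

# With Fuel and h5py installed, the tuple would then be passed as
# fill_hdf5_file(h5file, data), where h5file is an open h5py.File.
```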
Adult
- fuel.converters.adult.convert_adult(directory, output_directory, output_filename='adult.hdf5')
Convert the Adult dataset to HDF5.
Converts the Adult dataset to an HDF5 dataset compatible with fuel.datasets.Adult. The converted dataset is saved as ‘adult.hdf5’. This method assumes the existence of the files adult.data and adult.test.
Returns: output_paths – Single-element tuple containing the path to the converted dataset.
Return type: tuple of str
- fuel.converters.adult.convert_to_one_hot(y)
Converts y into a one-hot representation.
Parameters: y (list) – A list containing continuous integer values.
Returns: one_hot – A numpy.ndarray object, which is the one-hot representation of y.
Return type: numpy.ndarray
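A one-hot encoding of this kind can be sketched in a few lines of NumPy. The function below is a hypothetical illustration rather than Fuel's code; it assumes the labels form a contiguous integer range and shifts them so that the smallest label maps to column 0.

```python
import numpy


def one_hot_sketch(y):
    """Illustrative one-hot encoder; not Fuel's implementation."""
    y = numpy.asarray(y, dtype=int)
    lowest = int(y.min())
    num_classes = int(y.max()) - lowest + 1
    one_hot = numpy.zeros((len(y), num_classes))
    # Row i gets a single 1 in the column matching its (shifted) label.
    one_hot[numpy.arange(len(y)), y - lowest] = 1
    return one_hot


encoded = one_hot_sketch([0, 2, 1])
```

Each row of encoded contains exactly one nonzero entry, so the row sums are all 1.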
CalTech 101 Silhouettes
- fuel.converters.caltech101_silhouettes.convert_silhouettes(size, directory, output_directory, output_filename=None)
Convert the CalTech 101 Silhouettes Datasets.
- fuel.converters.caltech101_silhouettes.fill_subparser(subparser)
Sets up a subparser to convert CalTech101 Silhouettes Database files.
Parameters: subparser (argparse.ArgumentParser) – Subparser handling the caltech101_silhouettes command.
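The fill_subparser functions on this page all follow the same argparse pattern: they receive a subparser and attach command-specific arguments to it. The sketch below shows that pattern with the standard library only. The function name fill_subparser_sketch, the program name, and the exact arguments and flags are assumptions chosen to mirror the convert_silhouettes signature; they are not taken from Fuel.

```python
import argparse


def fill_subparser_sketch(subparser):
    """Attach hypothetical arguments for a silhouettes-style converter."""
    # Positional argument mirroring the `size` parameter above.
    subparser.add_argument("size", type=int, help="image size to convert")
    # Assumed optional flags; the real command may name these differently.
    subparser.add_argument("--directory", default=".",
                           help="directory holding the input files")
    subparser.add_argument("--output-filename", default=None,
                           help="name of the converted file")


parser = argparse.ArgumentParser(prog="convert-sketch")
subparsers = parser.add_subparsers(dest="command")
fill_subparser_sketch(subparsers.add_parser("caltech101_silhouettes"))

args = parser.parse_args(
    ["caltech101_silhouettes", "28", "--output-filename", "demo.hdf5"])
```

Keeping one such function per dataset lets a single command-line entry point register every converter without knowing its arguments in advance.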
Binarized MNIST
- fuel.converters.binarized_mnist.convert_binarized_mnist(directory, *args, **kwargs)
Converts the binarized MNIST dataset to HDF5.
Converts the binarized MNIST dataset used in R. Salakhutdinov’s DBN paper [DBN] to an HDF5 dataset compatible with fuel.datasets.BinarizedMNIST. The converted dataset is saved as ‘binarized_mnist.hdf5’.
This method assumes the existence of the files binarized_mnist_{train,valid,test}.amat, which are accessible through Hugo Larochelle’s website [HUGO].
[DBN] Ruslan Salakhutdinov and Iain Murray, On the Quantitative Analysis of Deep Belief Networks, Proceedings of the 25th international conference on Machine learning, 2008, pp. 872-879.
Returns: output_paths – Single-element tuple containing the path to the converted dataset.
Return type: tuple of str
- fuel.converters.binarized_mnist.fill_subparser(subparser)
Sets up a subparser to convert the binarized MNIST dataset files.
Parameters: subparser (argparse.ArgumentParser) – Subparser handling the binarized_mnist command.
CIFAR100
- fuel.converters.cifar100.convert_cifar100(directory, *args, **kwargs)
Converts the CIFAR-100 dataset to HDF5.
Converts the CIFAR-100 dataset to an HDF5 dataset compatible with fuel.datasets.CIFAR100. The converted dataset is saved as ‘cifar100.hdf5’.
This method assumes the existence of the following file:
- cifar-100-python.tar.gz
Returns: output_paths – Single-element tuple containing the path to the converted dataset.
Return type: tuple of str
- fuel.converters.cifar100.fill_subparser(subparser)
Sets up a subparser to convert the CIFAR100 dataset files.
Parameters: subparser (argparse.ArgumentParser) – Subparser handling the cifar100 command.
CIFAR10
- fuel.converters.cifar10.convert_cifar10(directory, *args, **kwargs)
Converts the CIFAR-10 dataset to HDF5.
Converts the CIFAR-10 dataset to an HDF5 dataset compatible with fuel.datasets.CIFAR10. The converted dataset is saved as ‘cifar10.hdf5’.
It assumes the existence of the following file:
- cifar-10-python.tar.gz
Returns: output_paths – Single-element tuple containing the path to the converted dataset.
Return type: tuple of str
- fuel.converters.cifar10.fill_subparser(subparser)
Sets up a subparser to convert the CIFAR10 dataset files.
Parameters: subparser (argparse.ArgumentParser) – Subparser handling the cifar10 command.
IRIS
- fuel.converters.iris.convert_iris(directory, output_directory, output_filename='iris.hdf5')
Convert the Iris dataset to HDF5.
Converts the Iris dataset to an HDF5 dataset compatible with fuel.datasets.Iris. The converted dataset is saved as ‘iris.hdf5’. This method assumes the existence of the file iris.data.
Returns: output_paths – Single-element tuple containing the path to the converted dataset.
Return type: tuple of str
- fuel.converters.iris.fill_subparser(subparser)
Sets up a subparser to convert the Iris dataset file.
Parameters: subparser (argparse.ArgumentParser) – Subparser handling the iris command.
MNIST
- fuel.converters.mnist.convert_mnist(directory, *args, **kwargs)
Converts the MNIST dataset to HDF5.
Converts the MNIST dataset to an HDF5 dataset compatible with fuel.datasets.MNIST. The converted dataset is saved as ‘mnist.hdf5’.
It assumes the existence of the following files:
- train-images-idx3-ubyte.gz
- train-labels-idx1-ubyte.gz
- t10k-images-idx3-ubyte.gz
- t10k-labels-idx1-ubyte.gz
Parameters:
- directory (str) – Directory in which input files reside.
- output_directory (str) – Directory in which to save the converted dataset.
- output_filename (str, optional) – Name of the saved dataset. Defaults to None, in which case a name based on dtype will be used.
- dtype (str, optional) – Either ‘float32’, ‘float64’, or ‘bool’. Defaults to None, in which case images will be returned in their original unsigned byte format.
Returns: output_paths – Single-element tuple containing the path to the converted dataset.
Return type: tuple of str
- fuel.converters.mnist.fill_subparser(subparser)
Sets up a subparser to convert the MNIST dataset files.
Parameters: subparser (argparse.ArgumentParser) – Subparser handling the mnist command.
- fuel.converters.mnist.read_mnist_images(filename, dtype=None)
Read MNIST images from the original ubyte file format.
Parameters:
- filename (str) – Filename/path from which to read images.
- dtype ('float32', 'float64', or 'bool') – If unspecified, images will be returned in their original unsigned byte format.
Returns: images – An image array, with individual examples indexed along the first axis and the image dimensions along the last two axes.
Return type: ndarray, shape (n_images, 1, n_rows, n_cols)
Notes
If the dtype provided was Boolean, the resulting array will be Boolean with True if the corresponding pixel had a value greater than or equal to 128, False otherwise.
If the dtype provided was a float dtype, the values will be mapped to the unit interval [0, 1], with pixel values that were 255 in the original unsigned byte representation equal to 1.0.
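The two conversion rules in these notes can be sketched with NumPy on a tiny synthetic array. The helper name convert_dtype_sketch is hypothetical and the code is an illustration of the stated rules, not Fuel's implementation.

```python
import numpy


def convert_dtype_sketch(images, dtype=None):
    """Apply the dtype rules described in the notes to a uint8 array."""
    if dtype is None:
        # Keep the original unsigned byte representation.
        return images
    dtype = numpy.dtype(dtype)
    if dtype.kind == "b":
        # Boolean: True where the pixel value is at least 128.
        return images >= 128
    if dtype.kind == "f":
        # Float: map [0, 255] onto the unit interval [0, 1].
        return images.astype(dtype) / dtype.type(255)
    raise ValueError("dtype must be 'float32', 'float64', or 'bool'")


# Synthetic batch with shape (n_images, 1, n_rows, n_cols) = (1, 1, 2, 2).
raw = numpy.array([[[[0, 127], [128, 255]]]], dtype="uint8")
as_bool = convert_dtype_sketch(raw, "bool")
as_float = convert_dtype_sketch(raw, "float32")
```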
- fuel.converters.mnist.read_mnist_labels(filename)
Read MNIST labels from the original ubyte file format.
Parameters: filename (str) – Filename/path from which to read labels.
Returns: labels – A one-dimensional unsigned byte array containing the labels as integers.
Return type: ndarray, shape (nlabels, 1)
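For readers unfamiliar with the ubyte label format, the sketch below parses a gzip-compressed IDX label file: an 8-byte big-endian header (magic number 2049, then the label count) followed by one byte per label. The function name read_labels_sketch and the demo file are hypothetical. The sketch is written from the published IDX layout rather than from Fuel's source, and it returns the documented (nlabels, 1) shape.

```python
import gzip
import os
import struct
import tempfile

import numpy

LABEL_MAGIC = 2049  # magic number of IDX label files (0x00000801)


def read_labels_sketch(filename):
    """Illustrative IDX label reader; not Fuel's implementation."""
    with gzip.open(filename, "rb") as f:
        # Header: two big-endian 32-bit integers (magic, label count).
        magic, count = struct.unpack(">ii", f.read(8))
        if magic != LABEL_MAGIC:
            raise ValueError("not an IDX label file")
        labels = numpy.frombuffer(f.read(), dtype="uint8")
    return labels.reshape(count, 1)


# Write a tiny three-label file to demonstrate the round trip.
path = os.path.join(tempfile.mkdtemp(), "demo-labels-idx1-ubyte.gz")
with gzip.open(path, "wb") as f:
    f.write(struct.pack(">ii", LABEL_MAGIC, 3) + bytes([7, 2, 1]))

labels = read_labels_sketch(path)
```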
SVHN
- fuel.converters.svhn.convert_svhn(which_format, directory, output_directory, output_filename=None)
Converts the SVHN dataset to HDF5.
Converts the SVHN dataset [SVHN] to an HDF5 dataset compatible with fuel.datasets.SVHN. The converted dataset is saved as ‘svhn_format_1.hdf5’ or ‘svhn_format_2.hdf5’, depending on the which_format argument.
[SVHN] Yuval Netzer, Tao Wang, Adam Coates, Alessandro Bissacco, Bo Wu, Andrew Y. Ng. Reading Digits in Natural Images with Unsupervised Feature Learning, NIPS Workshop on Deep Learning and Unsupervised Feature Learning, 2011.
Parameters:
- which_format (int) – Either 1 or 2. Determines which format (format 1: full numbers or format 2: cropped digits) to convert.
- directory (str) – Directory in which input files reside.
- output_directory (str) – Directory in which to save the converted dataset.
- output_filename (str, optional) – Name of the saved dataset. Defaults to ‘svhn_format_1.hdf5’ or ‘svhn_format_2.hdf5’, depending on which_format.
Returns: output_paths – Single-element tuple containing the path to the converted dataset.
Return type: tuple of str
- fuel.converters.svhn.convert_svhn_format_1(directory, *args, **kwargs)
Converts the SVHN dataset (format 1) to HDF5.
This method assumes the existence of the files {train,test,extra}.tar.gz, which are accessible through the official website [SVHNSITE].
[SVHNSITE] http://ufldl.stanford.edu/housenumbers/
Returns: output_paths – Single-element tuple containing the path to the converted dataset.
Return type: tuple of str
- fuel.converters.svhn.convert_svhn_format_2(directory, *args, **kwargs)
Converts the SVHN dataset (format 2) to HDF5.
This method assumes the existence of the files {train,test,extra}_32x32.mat, which are accessible through the official website [SVHNSITE].
Returns: output_paths – Single-element tuple containing the path to the converted dataset.
Return type: tuple of str
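The relationship between which_format, the expected input files, and the default output name can be summarised in a short helper. The function svhn_plan_sketch is hypothetical and only restates the file names given in this section. It does not perform any conversion, and raising ValueError for other format values is an assumption.

```python
def svhn_plan_sketch(which_format, output_filename=None):
    """Return (required input files, output filename) for an SVHN format."""
    if which_format not in (1, 2):
        # Assumed error handling; only formats 1 and 2 are documented.
        raise ValueError("which_format must be 1 or 2")
    # Format 1 uses tarballs of full numbers, format 2 cropped-digit .mat files.
    suffix = ".tar.gz" if which_format == 1 else "_32x32.mat"
    required = [split + suffix for split in ("train", "test", "extra")]
    if output_filename is None:
        output_filename = "svhn_format_{}.hdf5".format(which_format)
    return required, output_filename


format_1_plan = svhn_plan_sketch(1)
format_2_plan = svhn_plan_sketch(2, output_filename="digits.hdf5")
```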
- fuel.converters.svhn.fill_subparser(subparser)
Sets up a subparser to convert the SVHN dataset files.
Parameters: subparser (argparse.ArgumentParser) – Subparser handling the svhn command.