Applying Transforms to Data
Each Transform can be applied to data in three different modes: infer, eval and train.
To process single items in inference mode, use infer_apply():
>>> my_classifier.infer_apply('cat.jpg')
"cat"
Under the hood, a batch dimension is automatically added before the forward stage and removed before the postprocess stage.
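Conceptually, this is the same as unsqueezing a batch dimension before the forward pass and squeezing it out again afterwards; a minimal sketch in plain PyTorch (the shape is illustrative):
>>> import torch
>>> item = torch.randn(3, 224, 224)     # a single preprocessed item
>>> batch = item.unsqueeze(0)           # added before the forward stage
>>> batch.shape
torch.Size([1, 3, 224, 224])
>>> batch.squeeze(0).shape              # removed before the postprocess stage
torch.Size([3, 224, 224])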
infer_apply() automatically disables gradients and sends tensors to a GPU if one is set for the Pipeline (see Devices).
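For example, assuming the Pipeline's device helper is pd_to (the name is an assumption here; Devices covers the details), a GPU run could look like this:
my_classifier.pd_to('cuda')               # move the Pipeline to the GPU
my_classifier.infer_apply('cat.jpg')      # no gradients, tensors handled on the GPU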
To process multiple items in eval mode, use eval_apply(). eval_apply() expects an iterable input and returns an output generator:
>>> list(my_classifier.eval_apply(['cat.jpg', 'dog.jpg', 'airplane.jpg']))
["cat", "dog", "airplane"]
Internally, eval_apply() creates a PyTorch DataLoader for the preprocessing part. All arguments available for the PyTorch DataLoader can be passed to eval_apply(); for example, preprocessing can run with multiple workers and a specific batch size via the num_workers and batch_size arguments.
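For instance, to preprocess with four worker processes and batches of two (the values are illustrative):
>>> outputs = my_classifier.eval_apply(
...     ['cat.jpg', 'dog.jpg', 'airplane.jpg'],
...     batch_size=2,
...     num_workers=4,
... )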
eval_apply(), like infer_apply(), automatically disables gradients and can send tensors to a GPU (see Devices).
To process multiple items in train mode, use train_apply(). It expects an iterable input (below, a placeholder train_data) and returns an output generator:
for batch in my_classifier.pd_forward.train_apply(train_data):
    ...  # do a training update
train_apply() also uses a PyTorch DataLoader, to which arguments can be passed; it handles devices and, naturally, does not disable gradients.
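Put together, a training loop might look like the following sketch. The optimizer, learning rate, batch size, loss_fn and train_data are placeholders, and pd_parameters() is assumed to expose the Pipeline's trainable parameters:
import torch

optimizer = torch.optim.Adam(my_classifier.pd_parameters(), lr=1e-4)  # placeholder choice
for batch in my_classifier.pd_forward.train_apply(train_data, batch_size=16):
    loss = loss_fn(batch)   # loss_fn is a placeholder
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()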
The outputs of eval_apply() and train_apply() are _GeneratorWithLength objects: generators that support len(). This allows adding progress bars, for instance using tqdm (data below stands for any iterable of inputs):
from tqdm import tqdm
[...]
for batch in tqdm(my_classifier.eval_apply(data)):
    ...  # loop through the batches, showing a progress bar
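Since the generator has a length, the number of outputs is also known before iterating; a small sketch:
outputs = my_classifier.eval_apply(data)
print(len(outputs))  # works because the generator supports len()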
Read the next section to learn how to save and load transforms.