Applying Transforms to Data
Each Transform can be applied to data in three different modes: infer, eval and train.
To process single items in inference mode, use infer_apply():
>>> my_classifier.infer_apply('cat.jpg')
"cat"
Under the hood, a batch dimension is automatically added before the forward stage and removed before the postprocess stage.
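Conceptually, this is the same as unsqueezing a batch dimension before the forward pass and squeezing it out again afterwards; a minimal sketch in plain PyTorch (the shape is illustrative):
>>> import torch
>>> item = torch.randn(3, 224, 224)     # a single preprocessed item
>>> batch = item.unsqueeze(0)           # added before the forward stage
>>> batch.shape
torch.Size([1, 3, 224, 224])
>>> batch.squeeze(0).shape              # removed before the postprocess stage
torch.Size([3, 224, 224])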
infer_apply() automatically disables gradients and sends tensors to a GPU if one is set for the Pipeline (see Devices).
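For example, assuming the Pipeline's device helper is pd_to (the name is an assumption here; Devices covers the details), a GPU run could look like this:
my_classifier.pd_to('cuda')               # move the Pipeline to the GPU
my_classifier.infer_apply('cat.jpg')      # no gradients, tensors handled on the GPU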
To process multiple items in eval mode, use eval_apply(). eval_apply() expects an iterable input and returns an output generator:
>>> list(my_classifier.eval_apply(['cat.jpg', 'dog.jpg', 'airplane.jpg']))
["cat", "dog", "airplane"]
Internally, eval_apply() creates a PyTorch DataLoader for the preprocessing part. All arguments available for the PyTorch DataLoader can be passed to eval_apply(); for example, preprocessing can run with multiple workers and a specific batch size via the num_workers and batch_size arguments.
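For instance, to preprocess with four worker processes and batches of two (the values are illustrative):
>>> outputs = my_classifier.eval_apply(
...     ['cat.jpg', 'dog.jpg', 'airplane.jpg'],
...     batch_size=2,
...     num_workers=4,
... )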
eval_apply(), like infer_apply(), automatically disables gradients and can send tensors to a GPU (see Devices).
To process multiple items in train mode, use train_apply(). It expects an iterable input (below, a placeholder train_data) and returns an output generator:
for batch in my_classifier.pd_forward.train_apply(train_data):
    ...  # do a training update
train_apply() also uses a PyTorch DataLoader, to which arguments can be passed; it handles devices and, naturally, does not disable gradients.
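Put together, a training loop might look like the following sketch. The optimizer, learning rate, batch size, loss_fn and train_data are placeholders, and pd_parameters() is assumed to expose the Pipeline's trainable parameters:
import torch

optimizer = torch.optim.Adam(my_classifier.pd_parameters(), lr=1e-4)  # placeholder choice
for batch in my_classifier.pd_forward.train_apply(train_data, batch_size=16):
    loss = loss_fn(batch)   # loss_fn is a placeholder
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()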
The outputs of eval_apply() and train_apply() are _GeneratorWithLength objects: generators that support len(). This allows adding progress bars, for instance using tqdm (data below stands for any iterable of inputs):
from tqdm import tqdm
[...]
for batch in tqdm(my_classifier.eval_apply(data)):
    ...  # loop through the batches, showing a progress bar
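Since the generator has a length, the number of outputs is also known before iterating; a small sketch:
outputs = my_classifier.eval_apply(data)
print(len(outputs))  # works because the generator supports len()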
Read the next section to learn how to save and load transforms.