Stages: Preprocess, Forward and Postprocess
Each Pipeline has a preprocess, forward and postprocess part. We call those parts stages.
As the names suggest, the different stages are responsible for processing data in the different parts of the deep learning workflow:
preprocess stands for pre-processing - for example: loading, reshaping and augmenting data
forward corresponds to the model’s “forward” part - what happens in a PyTorch module, usually on the GPU
postprocess stands for post-processing - for example converting the output of a model to a readable format
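To define stages, use the special Transforms padl.batch and padl.unbatch in a composed Pipeline. Everything up to and including batch forms the preprocess stage, everything from unbatch onward forms the postprocess stage, and whatever lies between the two is the forward stage: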
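from PIL import Image
from padl import transform, batch, unbatch
from torchvision import transforms, models

# Wrap the torchvision modules so that their contents can be used as Transforms.
transforms = transform(transforms)
models = transform(models)

@transform
def load_image(path):
    # Open the image file at the given path.
    return Image.open(path)

@transform
def classify(x):
    # [...] lookup the most likely class
    return predicted_class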
my_classifier = (
    load_image                 # preprocessing ...
    >> transforms.ToTensor()   #
    >> batch                   # ... stage
    >> models.resnet18()       # forward
    >> unbatch                 # postprocessing ...
    >> classify                # ... stage
)
The different stages of a Pipeline can be accessed via .pd_preprocess, .pd_forward and .pd_postprocess:
>>> my_classifier.pd_preprocess
load_image >> transforms.ToTensor() >> batch
>>> my_classifier.pd_forward
models.resnet18()
>>> my_classifier.pd_postprocess
unbatch >> classify
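The split shown in this output can be mimicked in a few lines of plain Python. The following is only an illustrative sketch, not PADL's implementation: `split_stages` is a hypothetical helper, and the steps are represented as plain strings rather than Transforms. It assumes that `batch` closes the preprocess stage and `unbatch` opens the postprocess stage, as the output above suggests.

```python
# Hypothetical helper, NOT part of PADL: mimics how the batch / unbatch
# markers divide a flat list of step names into three stages.
def split_stages(steps):
    """Return (preprocess, forward, postprocess) lists of step names."""
    batch_at = steps.index("batch")
    unbatch_at = steps.index("unbatch")
    if unbatch_at < batch_at:
        raise ValueError("'unbatch' must come after 'batch'")
    preprocess = steps[: batch_at + 1]           # up to and including batch
    forward = steps[batch_at + 1 : unbatch_at]   # strictly between the markers
    postprocess = steps[unbatch_at:]             # unbatch and everything after
    return preprocess, forward, postprocess


steps = ["load_image", "transforms.ToTensor()", "batch",
         "models.resnet18()", "unbatch", "classify"]
preprocess, forward, postprocess = split_stages(steps)
print(" >> ".join(preprocess))
print(" >> ".join(forward))
print(" >> ".join(postprocess))
```

The three printed lines reproduce the stage layout of the example Pipeline; the real library works on Transform objects rather than on strings.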
The Transforms in the preprocess and postprocess stages process elements one by one, whereas the Transforms in the forward stage process whole batches.
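To make the difference between per-element and per-batch processing concrete, here is a conceptual sketch in plain Python. It does not use PADL or PyTorch: the functions `preprocess`, `forward`, `postprocess` and `run` are hypothetical stand-ins, and the "model" simply sums numbers. The list `batch_sizes_seen` records how many elements each forward call received.

```python
# Conceptual sketch only: plain Python standing in for a Pipeline's stages.
batch_sizes_seen = []


def preprocess(item):
    """Runs on ONE element: turn a digit string into a list of numbers."""
    return [float(ch) for ch in item]


def forward(batch_of_items):
    """Runs on a whole BATCH: sum each row (a stand-in for a model)."""
    batch_sizes_seen.append(len(batch_of_items))
    return [sum(row) for row in batch_of_items]


def postprocess(output):
    """Runs on ONE element again: format a single output value."""
    return f"total={output:.1f}"


def run(items, batch_size):
    results = []
    for start in range(0, len(items), batch_size):
        chunk = items[start:start + batch_size]
        batched = [preprocess(item) for item in chunk]        # one by one, then "batch"
        outputs = forward(batched)                            # one call per batch
        results.extend(postprocess(out) for out in outputs)   # "unbatch", one by one
    return results


results = run(["123", "456", "789"], batch_size=2)
print(results)
print(batch_sizes_seen)
```

With three inputs and a batch size of 2, `preprocess` and `postprocess` each run three times, while `forward` runs only twice: once on a batch of two elements and once on the remaining one.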
Continue in the next section to learn how to apply transforms to data for inference, evaluation and training.