PyTorch Mental Map
PyTorch code usually separates into two big zones: model specification, which defines the computation, and training setup, which measures error and updates trainable weights.
Big Picture
- Wrappers: organize layers
- Layers: transform data
- Activations: add nonlinearity
- Normalization: stabilizes training
- Dropout: regularizes
- Losses: measure error
- Optimizers: update weights
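Normalization is listed above, so here is a rough sketch of where it commonly sits inside a block. It uses nn.LayerNorm, which is only one option (nn.BatchNorm1d is another). The layer sizes, the ordering of the block, and the random input batch are illustrative assumptions, not a prescribed architecture.

```python
import torch
from torch import nn

torch.manual_seed(0)  # fixed seed so the sketch is repeatable

# Hypothetical block: layer -> normalization -> activation -> dropout
block = nn.Sequential(
    nn.Linear(16, 32),   # layer: transforms data
    nn.LayerNorm(32),    # normalization: rescales each sample's 32 features
    nn.ReLU(),           # activation: adds nonlinearity
    nn.Dropout(0.1),     # dropout: regularizes (active only in training mode)
)

x = torch.randn(4, 16)   # illustrative batch: 4 samples, 16 features
out = block(x)
out_shape = tuple(out.shape)

# Inspect the normalization step alone: each row ends up with mean ~0, std ~1
normed = block[1](block[0](x))
row_means = normed.mean(dim=1)
row_stds = normed.std(dim=1, unbiased=False)
```

Indexing into the wrapper (block[0], block[1]) runs the submodules one at a time, which is a convenient way to check what each stage does to the data.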
Tiny Examples
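import torch
from torch import nn

model = nn.Sequential(              # wrapper: organizes layers
    nn.Linear(16, 32),              # layer: transforms data
    nn.ReLU(),                      # activation: adds nonlinearity
    nn.Dropout(0.1),                # dropout: regularizes
    nn.Linear(32, 1),               # layer: final transform
)

# Illustrative inputs (assumed shapes): 8 samples, 16 features, binary targets
x = torch.randn(8, 16)
labels = torch.randint(0, 2, (8, 1)).float()
logits = model(x)                   # forward pass: raw scores, shape (8, 1)

loss_fn = nn.BCEWithLogitsLoss()
loss = loss_fn(logits, labels)      # loss: measures error

optimizer = torch.optim.AdamW(
    model.parameters(),
    lr=1e-4,
)
loss.backward()                     # compute gradients so step() has something to apply
optimizer.step()                    # optimizer: updates weights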
- Wrapper: nn.Sequential chains modules in order.
- Layers: nn.Linear changes vector dimensions (here 16 to 32, then 32 to 1).
- Activation: nn.ReLU lets the model learn nonlinear patterns.
- Dropout: nn.Dropout randomly zeroes values during training and does nothing in evaluation mode.
- Loss: loss_fn(logits, labels) computes how wrong the predictions are.
- Optimizer: optimizer.step() applies the weight update.
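Two of the points above are easy to check directly. The sketch below shows that dropout behaves differently in training and evaluation mode, and that nn.BCEWithLogitsLoss expects raw scores and matches a sigmoid followed by nn.BCELoss. The tensors are made-up values chosen only for illustration.

```python
import torch
from torch import nn

torch.manual_seed(0)

drop = nn.Dropout(0.5)
x = torch.ones(1000)

drop.train()                 # training mode: elements are zeroed at random
train_out = drop(x)          # survivors are scaled by 1 / (1 - p) = 2.0
drop.eval()                  # evaluation mode: dropout passes input through
eval_out = drop(x)

# BCEWithLogitsLoss takes raw scores, not probabilities
logits = torch.tensor([[2.0], [-1.0], [0.5]])   # illustrative values
labels = torch.tensor([[1.0], [0.0], [1.0]])    # illustrative targets
fused = nn.BCEWithLogitsLoss()(logits, labels)
manual = nn.BCELoss()(torch.sigmoid(logits), labels)
```

The fused loss is generally preferred because it is more numerically stable than applying the sigmoid separately, which is why the model's last layer outputs logits rather than probabilities.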
Per-Epoch Training Loop
for epoch in range(num_epochs):
    model.train()                      # put model in training mode
    total_loss = 0.0                   # running loss sum for this epoch
    for xb, yb in loader:              # get one mini-batch
        optimizer.zero_grad()          # clear old gradients
        logits = model(xb)             # forward pass
        loss = loss_fn(logits, yb)     # measure error
        loss.backward()                # compute gradients
        optimizer.step()               # update trainable weights
        total_loss += loss.item() * len(xb)  # batch mean loss times batch size
- Epoch: one full pass over the training data.
- Batch: a small chunk of samples from the loader.
- Forward: model(xb) runs the architecture.
- Loss: loss_fn(...) turns predictions into an error value.
- Backward: loss.backward() computes gradients.
- Step: optimizer.step() updates trainable weights.
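To see the loop run end to end, here is a minimal self-contained sketch. The synthetic dataset, the batch size, the number of epochs, and the learning rate of 1e-2 (larger than the 1e-4 used earlier, so that a short demo makes visible progress) are all assumptions made for illustration. It also adds an evaluation step with model.eval() and torch.no_grad(), which the loop above does not cover.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Synthetic binary task (assumption): label is 1 when the features sum to > 0
X = torch.randn(256, 16)
y = (X.sum(dim=1, keepdim=True) > 0).float()
loader = DataLoader(TensorDataset(X, y), batch_size=32, shuffle=True)

model = nn.Sequential(
    nn.Linear(16, 32), nn.ReLU(), nn.Dropout(0.1), nn.Linear(32, 1)
)
loss_fn = nn.BCEWithLogitsLoss()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-2)  # demo value

num_epochs = 5
epoch_losses = []
for epoch in range(num_epochs):
    model.train()                              # enable dropout
    total_loss = 0.0
    for xb, yb in loader:
        optimizer.zero_grad()                  # clear old gradients
        logits = model(xb)                     # forward pass
        loss = loss_fn(logits, yb)             # measure error
        loss.backward()                        # compute gradients
        optimizer.step()                       # update trainable weights
        total_loss += loss.item() * len(xb)    # weight batch mean by batch size
    epoch_losses.append(total_loss / len(X))   # average loss per sample

# Evaluation: disable dropout and gradient tracking
model.eval()
with torch.no_grad():
    preds = (model(X) > 0).float()             # logit > 0 means probability > 0.5
    accuracy = (preds == y).float().mean().item()
```

Multiplying loss.item() by len(xb) before summing keeps the epoch average correct even when the last batch is smaller than the others.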