This is an experimental feature, so the API may change in the future.
Keepsake works with any machine learning framework, but it includes a callback that makes it easier to use with PyTorch Lightning.
## KeepsakeCallback

`KeepsakeCallback` behaves like PyTorch Lightning's `ModelCheckpoint` callback, but in addition to exporting a model at the end of each epoch, it also calls `keepsake.init()` at the start of training to create an experiment, and `experiment.checkpoint()` after saving the model at `on_validation_end`. If no validation is defined, the checkpoint is saved at `on_epoch_end` instead. All metrics that have been logged during training with `self.log()` are saved to the Keepsake checkpoint.

Here is a simple example:
```python
import torch
from torch.nn import functional as F
import pytorch_lightning as pl
from torch.utils.data import DataLoader, random_split, Subset
from torchvision.datasets import MNIST
from torchvision import transforms

from keepsake.pl_callback import KeepsakeCallback


class MyModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        self.layer_1 = torch.nn.Linear(28 * 28, 128)
        self.layer_2 = torch.nn.Linear(128, 10)
        self.batch_size = 8

    def forward(self, x):
        batch_size = x.size()[0]
        x = x.view(batch_size, -1)
        x = F.relu(self.layer_1(x))
        x = self.layer_2(x)
        return F.log_softmax(x, dim=1)

    def prepare_data(self):
        # download only
        MNIST(
            "/tmp/keepsake-test-mnist",
            train=True,
            download=True,
            transform=transforms.ToTensor(),
        )

    def setup(self, stage):
        # transform
        transform = transforms.Compose([transforms.ToTensor()])
        mnist_train = MNIST(
            "/tmp/keepsake-test-mnist", train=True, download=False, transform=transform
        )
        mnist_train = Subset(mnist_train, range(100))

        # train/val split
        mnist_train, mnist_val = random_split(mnist_train, [80, 20])

        # assign to use in dataloaders
        self.train_dataset = mnist_train
        self.val_dataset = mnist_val

    def train_dataloader(self):
        return DataLoader(self.train_dataset, batch_size=self.batch_size)

    def training_step(self, batch, batch_idx):
        x, y = batch
        logits = self(x)
        loss = F.nll_loss(logits, y)
        # Log the loss (not the input tensor) so that KeepsakeCallback
        # can save it to each checkpoint.
        self.log("train_loss", loss, on_step=True, on_epoch=True, logger=False)
        return loss

    def configure_optimizers(self):
        return torch.optim.Adam(self.parameters(), lr=1e-3)


dense_size = 784
learning_rate = 0.1

model = MyModel()
trainer = pl.Trainer(
    # Disable Lightning's default ModelCheckpoint; KeepsakeCallback
    # handles saving instead.
    checkpoint_callback=False,
    callbacks=[
        KeepsakeCallback(
            params={
                "dense_size": dense_size,
                "learning_rate": learning_rate,
            },
            primary_metric=("train_loss", "minimize"),
            period=5,
        )
    ],
    max_epochs=100,
)
trainer.fit(model)
```
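The `train_loss` metric logged with `self.log()` above is what `KeepsakeCallback` saves to each checkpoint, and `primary_metric` tells Keepsake which logged metric to rank checkpoints by.

If you are not using PyTorch Lightning, you can record the same information with Keepsake's Python API directly. The following is a rough, unofficial sketch of what the callback does at the start of training and at the end of each epoch; `train_epoch()` is a placeholder for your own training loop, not part of Keepsake:

```python
import torch
import keepsake


def train_epoch(model):
    # Placeholder for a real training loop; assumed to return the epoch's loss.
    return 0.0


def train(learning_rate=0.1, num_epochs=100):
    # What KeepsakeCallback does when training starts:
    # create an experiment and record the hyperparameters.
    experiment = keepsake.init(
        path=".",
        params={"learning_rate": learning_rate, "num_epochs": num_epochs},
    )

    model = torch.nn.Linear(28 * 28, 10)
    for epoch in range(num_epochs):
        train_loss = train_epoch(model)

        # Roughly what the callback does at the end of each epoch:
        # export the model, then record a checkpoint that points at it,
        # along with the metrics logged so far.
        torch.save(model.state_dict(), "model.pth")
        experiment.checkpoint(
            path="model.pth",
            step=epoch,
            metrics={"train_loss": train_loss},
            primary_metric=("train_loss", "minimize"),
        )


if __name__ == "__main__":
    train()
```

Either way, the recorded experiments can then be inspected from the command line with `keepsake ls` and `keepsake show`.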
The `KeepsakeCallback` class takes the following arguments, all optional (a configured sketch follows the list):
- `filepath`: The path where the exported model is saved. This path is also saved by `experiment.checkpoint()` at the end of each epoch. If it is `None`, the model is not saved and the callback just gathers code and metrics. Default: `model.hdf5`
- `params`: A dictionary of hyperparameters that will be recorded to the experiment at the start of training.
- `primary_metric`: A tuple in the format `(metric_name, goal)`, where `goal` is either `minimize` or `maximize`. For example, `("mean_absolute_error", "minimize")`.
- `period`: The model is saved at the end of every `period` epochs. Default: `1`
- `save_weights_only`: If `True`, only the model's weights are saved; otherwise the full model is saved. Default: `False`
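For reference, here is a minimal sketch (not taken from the Keepsake docs) that sets every argument explicitly; the `params` values are illustrative only:

```python
from keepsake.pl_callback import KeepsakeCallback

# Every documented argument spelled out; the values passed to params
# are arbitrary examples.
callback = KeepsakeCallback(
    filepath="model.hdf5",  # where the exported model is written
    params={"learning_rate": 1e-3, "num_epochs": 100},  # recorded at start of training
    primary_metric=("train_loss", "minimize"),  # logged metric to rank checkpoints by
    period=1,  # save at the end of every epoch
    save_weights_only=False,  # export the full model, not just its weights
)
```

Pass it to `pl.Trainer(callbacks=[callback], checkpoint_callback=False)` as in the example above, so that Lightning's default `ModelCheckpoint` does not also run.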