- Difficulty level: easy
- Time needed to learn: 10 minutes or less
- Key points:
  - zapped files have only their signatures
SoS keeps track of all intermediate files and reruns a step only if any of its tracked files have been removed or changed. However, it is often desirable to remove large, non-essential intermediate files to reduce the disk space used by completed workflows, while still allowing the workflow to be re-executed without these files. SoS provides the command
sos remove files --zap
to zap the specified files, or, for example,
sos remove . --size +5G --zap
to zap all files larger than 5G. This command removes the specified files but keeps, for each one, a special {file}.zapped file with essential information (e.g. md5 signature and size). SoS considers a file to exist when its .zapped file is present and will regenerate the file only if the actual file is needed by a later step.
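The bookkeeping described above can be illustrated with a minimal Python sketch. This is not how SoS implements zapping: the function names `zap` and `counts_as_existing` are hypothetical, and the JSON layout of the placeholder is an assumption made for illustration, so the real .zapped format may differ.

```python
import hashlib
import json
import os
import tempfile


def zap(path):
    """Replace `path` with a `<path>.zapped` placeholder (hypothetical helper).

    The placeholder records the md5 signature and size of the original file.
    The JSON layout is an assumption, not the actual SoS format.
    """
    with open(path, "rb") as f:
        digest = hashlib.md5(f.read()).hexdigest()
    info = {"md5": digest, "size": os.path.getsize(path)}
    with open(path + ".zapped", "w") as f:
        json.dump(info, f)
    # Remove the real file only after the placeholder has been written.
    os.remove(path)
    return info


def counts_as_existing(path):
    """Treat a target as present if the file or its placeholder exists."""
    return os.path.exists(path) or os.path.exists(path + ".zapped")


if __name__ == "__main__":
    workdir = tempfile.mkdtemp()
    target = os.path.join(workdir, "result.txt")
    with open(target, "w") as f:
        f.write("intermediate data\n")
    print(zap(target))
    print(os.path.exists(target), counts_as_existing(target))
```

In this sketch the signature is taken before the file is deleted, so a later check can still compare a regenerated file against the recorded md5 and size.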
For example, let us execute a workflow with outputs temp/result.txt and temp/size.txt.
Then let us zap the intermediate file temp/result.txt.
As you can see, temp/result.txt is replaced with temp/result.txt.zapped. Now if you rerun the workflow