cache-clean [-h] [-s] [-S] [-m NN -M NN] [-E N] [-D debug_level]
[-f space_command] [ -c <arex_config_file> | <dir1> [<dir2> [...]] ]
cache-clean is a tool for administrators of ARC server installations to safely remove A-REX cache data and to provide an overview of the contents of the cache. It is used by the A-REX to automatically manage cache contents.
There are two modes of operation - printing statistics and deleting files. If -s is used, then statistics are printed on each cache. If -m and -M are used then files in each cache are deleted if the space used by the cache on the file system is more than that given by -M, in the order of least recently accessed, until the space used by the cache is equal to what is specified by -m. If -E is used, then all files accessed less recently than the given time are deleted. -E can be used in combination with -m and -M but deleting files using -E is carried out first. If after this the cache used space is still more than that given by -M then cleaning according to those options is performed.
If the cache is on a file system shared with other data then -S should be specified so that the space used by the cache is calculated. Otherwise all the used space on the file system is assumed to be for the cache. Using -S is slower so should only be used when the cache is shared.
By default the "df" command is used to determine total and (if -S is not specified) used space. If this command is not supported on the cache file system then -f can be used to specify an alternate command. The output of this command must be "total_bytes used_bytes", and so the command would normally be a small script around the file system space information tool. The cache directory is passed as the last argument to this command.
Cache directories are given by dir1, dir2.. or taken from the config file specified by -c or the ARC_CONFIG environment variable.
-h - print short help
-s - print cache statistics, without deleting anything. The output displays for each cache the number of deletable (and locked) files, the total size of these files, the percentage usage of the file system in which the cache is stored, and a histogram of access times of the files in the cache.
-S - Calculate the size of the cache instead of taking used space on the file system. This should only be used when the cache file system is shared with other data.
-M - the maximum used space (as % of the file system) at which to start cleaning
-m - the minimum used space (as % of the file system) at which to stop cleaning
-E - files accessed less recently than the given time period will be deleted. Example values of this option are 1800, 90s, 24h, 30d. The default when no suffix is given is seconds.
-f - alternative command to "df" for obtaining the file system total and used space. The output of this command must be "total_bytes used_bytes". The cache directory is passed as the last argument to this command.
-D - debug level. Possible values are FATAL, ERROR, WARNING, INFO, VERBOSE or DEBUG. Default level is INFO.
-c - path to an A-REX config file, xml or ini format
This tool is run periodically by the A-REX to keep the size of each cache within the limits specified in the configuration file. Therefore cleaning should not be performed manually, unless the cache size needs to be reduced temporarily. For performance reasons it may however be desirable to run cache-clean independently on the machine hosting the cache file system, if this is different from the A-REX host. The most useful function for administrators is to give an overview of the contents of the cache, using the -s option.
Within each cache directory specified in the configuration file, there is a subdirectory for data (data/) and one for per-job hard links (joblinks/). See the A-REX Administration Guide for more details. cache-clean should only operate on the data subdirectory, therefore when giving dir arguments they should be the top level cache directory. cache-clean will then automatically only look at files within the data directory.
cache-clean -m20 -M30 -E30d -D VERBOSE -c /etc/arc.conf
Cache directories are taken from the configuration file /etc/arc.conf and all cache files accessed more than 30 days ago are deleted. Then if the used space in a cache is above 30%, data is deleted until the used space reaches 20%. Verbose debug output is enabled so information is output on each file that is deleted.
APACHE LICENSE Version 2.0
ARC software is developed by the NorduGrid Collaboration (http://www.nordugrid.org), please consult the AUTHORS file distributed with ARC. Please report bugs and feature requests to http://bugzilla.nordugrid.org