cmsSync
How to use the cmsSync software to make sure that the contents of dCache at your site are consistent with PhEDEx
The cmsSync software allows you to determine the difference between what is on disk at your site and what PhEDEx has registered at your site.
Installation
- Install the Nebraska YUM repository as documented here.
- Install the cmsSync RPM with the following command:
yum install cmsSync
Running the client
The client does the following things:
- Gets a list of all blocks registered for your SE.
- Retrieves your site's TFC.
- Builds a list of file names from these blocks (long process).
- Spiders a given base directory (long process).
- Compares the contents of dCache and the list of files which should be at your site.
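The final comparison step can be pictured as two set differences. The following is only a minimal sketch of that idea, not the cmsSync implementation; the function name `compare_namespaces` and the sample LFNs are made up for illustration.

```python
def compare_namespaces(registered_lfns, on_disk_lfns):
    """Return (missing, extra) as sorted lists of LFNs.

    missing: registered for the site but not found on disk
    extra:   found on disk but not registered for the site
    """
    registered = set(registered_lfns)
    on_disk = set(on_disk_lfns)
    missing = sorted(registered - on_disk)
    extra = sorted(on_disk - registered)
    return missing, extra


# Hypothetical sample data, for illustration only.
registered = ["/store/data/a.root", "/store/data/b.root"]
on_disk = ["/store/data/b.root", "/store/data/c.root"]

missing, extra = compare_namespaces(registered, on_disk)
print("missing:", missing)
print("extra:", extra)
```

Using sets keeps the comparison fast even for large file lists, which matters because building the two lists is already the slow part of the run.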
The command line usage goes like this:
cmsSync --se sename.unl.edu /pnfs/path/to/cms/store
This takes around 10 minutes to run at Nebraska.
Known Bug
There is a known bug in some versions of dCacheNebraska/CherryPy where a file fails on the line "import cherrypy._cpengine". Remove this line from the referenced file; doing so will not adversely affect the running of the cmsSync script (or any other dCacheNebraska application). This is fixed in later versions of dCacheNebraska.
Output
The client writes several files (as documented in the output of the utility). They are:
- not_lfns.txt: A list of all files in the base directory which are not in the CMS namespace at all.
- user_lfns.txt: A list of all user files at your site.
- registered_lfns.txt: All LFNs which should be at your site.
- blocks.txt: All blocks which should be at your site.
- missing_lfns.txt: A list of all files which are registered to be at your site, but are not in dCache.
- extra_lfns.txt: A list of all files which are at your site, but not registered in PhEDEx.
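For a quick overview of a run, the number of entries in each output file can be counted. The sketch below assumes that each file holds one entry per line and that the files are in the directory passed to `summarize` (the current directory by default). Neither assumption is confirmed by the text above, and the helper names are hypothetical.

```python
import os

# Output file names as listed above.
OUTPUT_FILES = [
    "not_lfns.txt",
    "user_lfns.txt",
    "registered_lfns.txt",
    "blocks.txt",
    "missing_lfns.txt",
    "extra_lfns.txt",
]


def count_entries(path):
    """Count non-empty lines, assuming one entry per line."""
    with open(path) as handle:
        return sum(1 for line in handle if line.strip())


def summarize(output_dir="."):
    """Map each output file name to its entry count, or None if absent."""
    summary = {}
    for name in OUTPUT_FILES:
        path = os.path.join(output_dir, name)
        summary[name] = count_entries(path) if os.path.exists(path) else None
    return summary


for name, count in summarize().items():
    print(name, "not found" if count is None else count)
```

A large count in missing_lfns.txt or extra_lfns.txt is a hint to look at those files first.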
What to do with the output
- The missing_lfns.txt file can be attached to a Savannah ticket. Ask the dataops folks to remove the replicas at your site, but not the subscriptions. This means any datasets your site is still subscribed to will be re-downloaded.
- The extra_lfns.txt file should be examined carefully. Delete only files that meet all of these conditions:
- Have not been recently transferred (the synchronization is not immediate, so files that were in transit when you ran the script might be falsely marked as extra).
- Are not unmerged files.
- Are not load test files.
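The checks on extra_lfns.txt can be partly automated. The sketch below only lists deletion candidates and never deletes anything. It rests on several assumptions that are not stated above and should be adjusted for your site: unmerged files live under `/store/unmerged/`, load test files live under `/store/PhEDEx_LoadTest07/`, a file counts as "recently transferred" if it was modified within the last 7 days, and a PFN can be formed by prepending a fixed prefix to the LFN. All function and parameter names are hypothetical.

```python
import os
import time

# Assumed LFN prefixes; verify them against your site's namespace.
UNMERGED_PREFIX = "/store/unmerged/"
LOADTEST_PREFIX = "/store/PhEDEx_LoadTest07/"


def is_deletion_candidate(lfn, mtime, now, min_age_days=7):
    """True if the LFN passes all three checks from the list above."""
    if lfn.startswith(UNMERGED_PREFIX):
        return False  # unmerged file, keep
    if lfn.startswith(LOADTEST_PREFIX):
        return False  # load test file, keep
    age_days = (now - mtime) / 86400.0
    return age_days >= min_age_days  # keep anything modified recently


def deletion_candidates(extra_path, pfn_prefix, min_age_days=7):
    """Read extra_lfns.txt (assumed one LFN per line) and return candidates.

    pfn_prefix is a simplified stand-in for the site's LFN-to-PFN mapping.
    """
    now = time.time()
    candidates = []
    with open(extra_path) as handle:
        for line in handle:
            lfn = line.strip()
            if not lfn:
                continue
            try:
                mtime = os.stat(pfn_prefix + lfn).st_mtime
            except OSError:
                continue  # age cannot be checked, so do not propose the file
            if is_deletion_candidate(lfn, mtime, now, min_age_days):
                candidates.append(lfn)
    return candidates
```

Review the resulting list by hand before removing anything; the age threshold in particular is a guess, not a documented value.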