Xfiles file tree synchronization and cross-validation

Contents

What it is
Screenshot
Requirements
License
Download
Preface to installation and running
Binary installation
Installation from Source
Setting up to run
Usage
Scripting
FAQ: How does Xfiles differ from Rsync?
Appendix: Version log
Appendix: Sample xfiles.py script
Appendix: Link/Alias detection
Appendix: Troubleshooting
Appendix: Installation from Source using native link detection

What it is

Xfiles is an interactive utility for comparing and merging one file tree with another over a network. It supports freeform work on several machines (no need to keep track of which files were changed on which machine). Xfiles can also be used as a cross-validating disk<->disk backup strategy: portions of a disk may go bad at any time, with no simple indication of which files were affected, so cross-validating against a second disk before backup helps make sure you aren't backing up bad data.

A client/server program (GUI on the client) traverses a file tree and reports any files that are missing on the server machine, missing on the client machine, or different. For each such file, the file size/sizes and modification date(s) are shown, and a comparison (using Unix diff) can be obtained. For files that are missing from one tree, `similarly named' files in that tree are reported. Inconsistent files can then be copied in either direction or deleted on either machine.
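The three-way report described above can be illustrated with a short Python sketch. This is not the Xfiles implementation (which is written in Java); the manifest dictionaries, the function names classify and similar_names, and the use of Python's difflib module to find similarly named files are all assumptions made for the example.

```python
import difflib


def classify(client, server):
    """Compare two manifests that map relative path -> checksum.

    Returns (missing_on_server, missing_on_client, different),
    each as a sorted list of paths.
    """
    client_paths = set(client)
    server_paths = set(server)
    missing_on_server = sorted(client_paths - server_paths)
    missing_on_client = sorted(server_paths - client_paths)
    different = sorted(p for p in client_paths & server_paths
                       if client[p] != server[p])
    return missing_on_server, missing_on_client, different


def similar_names(path, other_tree_paths, limit=3):
    """Suggest paths in the other tree whose names resemble `path`."""
    return difflib.get_close_matches(path, list(other_tree_paths), n=limit)


# Hypothetical manifests; the checksum strings are placeholders.
client = {"src/Main.java": "a1", "README": "b2", "notes.txt": "c3"}
server = {"src/Main.java": "a1", "README": "ff", "note.txt": "c3"}

missing_on_server, missing_on_client, different = classify(client, server)
suggestions = similar_names("notes.txt", server)
```

Here notes.txt exists only on the client, note.txt only on the server, and README differs; the suggestion list is what an interactive tool could show next to a missing file so the user can spot a rename.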

The file trees do not need to be accessible via NFS. File checksums are computed in parallel, so largely similar trees can be compared over a slow network link. The client and server processes can also be run on the same machine.
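The reason checksums help on a slow link is that each side only needs to send a short digest per file rather than the file contents. A minimal Python sketch of building such a per-tree manifest follows. It makes several assumptions: this document does not say which checksum Xfiles uses (SHA-256 is chosen here), the names file_checksum and tree_manifest are hypothetical, and the thread pool is only a local stand-in for the client and server working at the same time.

```python
import hashlib
import os
from concurrent.futures import ThreadPoolExecutor


def file_checksum(path, chunk_size=65536):
    """Return the hex SHA-256 digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tree_manifest(root, workers=4):
    """Map each file's path (relative to root) to its checksum."""
    relative_paths = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            relative_paths.append(os.path.relpath(full, root))
    # Checksum several files at once; results come back in input order.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        sums = pool.map(lambda rel: file_checksum(os.path.join(root, rel)),
                        relative_paths)
        return dict(zip(relative_paths, sums))
```

Two manifests built this way, one per machine, are all that has to cross the network before differences can be listed.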

See the topic 'How does Xfiles differ from Rsync' below for more information.

Screenshot

Requirements

Xfiles 1.4 requires Java JDK 1.2 or above. To see if you have Java installed, type java -version, which should report something like

> java -version
java version "1.3.0"

There are various free implementations of JDK 1.2 and above. www.javalobby.org has links to some implementations. Sun's Java versions are at java.sun.com under "Products and APIs". www.blackdown.org has versions that are better tuned to Linux, though their improvements are also integrated into recent Sun versions. IBM also has a free JDK 1.3 implementation with an alternate high-performance virtual machine.

License

Download

Read the LICENSE.txt file first. Xfiles is released under the GPL, but it optionally uses jpython for scripting; jpython has its own open source license, see LICENSE.txt.

Download the latest xfilesBinary or xfilesSource or archive.

Optionally download jpython.jar if you want to do scripting (see the Scripting section below). [Note: in Netscape, download this by right-clicking on the link, or by shift-clicking with button 1. Simply clicking on the link causes the file to be read into the Netscape window, and sometimes causes the browser to hang if you have Java turned on.] jpython.jar is also needed to recompile the source, though small changes to the source will let it recompile without it. jpython.jar is not needed if you do not want to script.

Download the nativeFile archive only if you want native Unix link detection (see discussion at end). You probably do not need this.

Source and binary archives for older versions

Preface to installation and running

Java tends to be poorly integrated with the underlying operating system. If you're new to Java, take care to read the directions below -- there are not yet conventions for where and how Java programs (or Java itself) should be installed, so the installation outlined below is not as streamlined as a Linux rpm or similar. The individual steps are easy, however.

If you don't already have Java you'll need to select an install location such as /usr/local/java; any directory will work. Similarly, the Xfiles program can live anywhere. If you follow the installation below you will need to launch the program from the directory where it resides (this does not restrict its function). The changes needed in the shell files to make it run from any directory are straightforward.

The program and this documentation refer to client and server machines and directories. These are interchangeable -- the server merely refers to the machine that the server process is running on (see below).

Binary Installation

For both client and server machines, do the following steps:

Installation from Source

(The following instructions are for Unix/Linux)

Setting up to run

Usage

When the GUI comes up, select a directory and press the start button.

The client displays the client file tree at startup to allow you to select a sub-tree of the specified root if desired. If you select nothing, the entire tree will be compared when you hit the start button.

To save you time Xfiles first scans the whole (sub)tree before reporting any differences (this may take a while); all differences are then reported consecutively.

After synchronizing one directory, you can select another in the GUI and press start again. Currently, however, the GUI file tree does not update to reflect deletions made in earlier runs (see the To do section).

The following are optional command-line arguments:

Xfiles writes a file XFILES.LOG listing the selected actions.

Scripting

File selection and interaction with a revision control system such as RCS can be handled by scripting using jpython. To enable scripting, download the file jpython.jar and place it in the client and server launch directories. Then create a file xfiles.py, which also must be copied into both the client and server launch directories.

xfiles.py can define the following functions:

It is not necessary to define all of these functions. However, if a function is defined it should be correct -- if the function call generates an error, Xfiles will quit. A sample xfiles.py file is included with the distribution and listed at the end of this document.
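As an illustration of defining only one of the functions, a minimal xfiles.py could contain nothing but a path filter. The sketch below assumes, as in the sample script at the end of this document, that pathFilter receives a path string and returns 1 to visit the path or 0 to skip it; the suffix list is hypothetical. It uses only slicing and comparison, so it cannot raise an error on an empty or very short path.

```python
# Hypothetical list of suffixes to skip; adjust to taste.
SKIP_SUFFIXES = ['~', ',v', '.o', '.class']


def pathFilter(path):
    """Return 1 if this path should be visited, else 0."""
    for suffix in SKIP_SUFFIXES:
        # Slicing never raises, even when the path is shorter than the suffix.
        if path[-len(suffix):] == suffix:
            return 0
    return 1
```

Keeping the function free of operations that can fail matters here because an error inside a defined function causes Xfiles to quit.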

FAQ: How does Xfiles differ from Rsync?

Rsync works automatically, and assumes the source directory is correct. If the source disk has corruption, good files in the backup directory may be overwritten with corrupted files from the source disk. A similar scenario arises when you save a file that was accidentally edited somehow. Again, the newer file is not the desired one.

Xfiles also allows relatively unstructured work in several places, without needing to think of one location as the 'master' -- in this case each directory may have files that are obsolete, as well as ones that should be mirrored. An automated algorithm will duplicate the files that need to be deleted, resulting in extra work later. Xfiles asks you if you want to copy (and in which direction), and gives the option to delete. In the case of text files it also gives a "diff" (on Unix) to help in the decision.

In the case where both files have desirable parts, the file must be manually merged; Xfiles doesn't help here (except to identify the situation). Adding something like the Unix xdiff visual merge would be a nice addition.

In rsync's favor, rsync is much faster. (Although, if you add -c to force checksums on all files, rsync gets a lot slower).

Author

Primary author: J.P.Lewis scribblethink.org
Contributors: Peter Gadjokov, Wolfgang Lugmayr, Dan Schmitt

Please e-mail problems, successes, fixes, and fears to: zilla@computer.org
Send email with subject line XFILES to be notified of updates.

To do

Appendix: Version log

Appendix: Sample xfiles.py script

The following example script implements these functions:

Although jpython is quite compatible with regular python, this file is written using a mixture of python and java calls (for example, java string functions are used rather than the python string.split routine). This is done to limit the dependence on jpython to the one file jpython.jar.


import java
from java.io import File
import java.lang.Runtime
runtime = java.lang.Runtime.getRuntime()

# ignore files that end with these strings
skipextensions = ['RCS', ',v', '.o', '.so', '.a', '.class', '.jar']

# grab the command line arguments.  
# You could pass in the skipextensions here for example.
def setup(args):
  print 'setup got args:'
  for arg in args:
    print arg

# return 1 if xfiles should visit this path, else 0
#
def pathFilter(path):
  print 'pathFilter(%s)' % path
  if path[-1:] == '~':    # emacs backup file (slice is safe for an empty path)
    return 0
  if path == 'so_locations':
    return 0
  spath = java.lang.String(path)
  for ext in skipextensions:
    if spath.endsWith(ext):
      return 0
  return 1

# called before copying over a file
# (check out from RCS if appropriate)
#
def preCopy(path):
  name = filename(path)
  spath = filedir(path)
  spath = spath + '/RCS/'
  print 'name = %s' % name
  if exists(spath):			# RCS/ exists
    spath = spath + name + ',v'
    print 'spath = %s' % spath
    if exists(spath):			# RCS/file,v exists
      docmd('co -l -f %s' % path)


# called after copying over a file
# (check in to RCS if appropriate)
#
def postCopy(path):
  name = filename(path)
  spath = filedir(path)
  spath = spath + '/RCS/'
  print 'name = %s' % name
  if exists(spath):			# RCS/ exists
    spath = spath + name + ',v'
    print 'spath = %s' % spath
    if exists(spath):			# RCS/file,v exists
      docmd('ci -u -f -mXfiles_copy_checkin %s' % path)

# helper commands
def docmd(cmd):
  if 1:
    print cmd
  pid = runtime.exec_(cmd)
  pid.waitFor()

def filedir(path):
    result = File(path).getParent()
    if not result:
        if isabs(path):
            result = path # Must be root
        else:
            result = ""
    return result

def filename(path):
    return File(path).getName()

def exists(path):
    return File(path).exists()

def isabs(path):
    return File(path).isAbsolute()

Appendix: Link/Alias detection

This section is largely obsolete because Java link detection appears to work adequately.

Because Xfiles traverses a directory tree, it needs to be able to distinguish between "real" files and links (aliases) so as to avoid an infinite loop in the case where a link points to a directory above itself. There are two approaches to this, and you need to select which one you will use:

Xfiles will call the native function if it exists in the launch directory. Installation with the native function is a bit more work, and it does not exist for non-Unix operating systems yet.

For most purposes it will probably be fine to use the built-in code. Read the appendix Links/Aliases/Shortcuts in Java for more details on this issue.
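The loop-avoidance idea can be shown with a small Python sketch of a traversal that never follows links. It is only an illustration under stated assumptions: the function name walk_without_links is hypothetical, and Python's os.path.islink stands in for whichever link test (built-in or native) Xfiles actually uses.

```python
import os


def walk_without_links(root):
    """Yield regular files under root, never following symbolic links."""
    pending = [root]
    while pending:
        directory = pending.pop()
        for name in sorted(os.listdir(directory)):
            path = os.path.join(directory, name)
            if os.path.islink(path):
                continue  # skip links, so a link to a parent cannot cause a loop
            if os.path.isdir(path):
                pending.append(path)
            elif os.path.isfile(path):
                yield path
```

With a link inside a subdirectory pointing back at the root, this traversal still terminates, because the link is skipped rather than descended into.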

Troubleshooting Guide

Installation from Source, using native link detection

A Serbo-Croatian translation of this page by Jovana Milutinovich from Geeks Education.
