Detect directory or file changes in *nix
There are various ways to do this, but this is what I implemented. It has been working as expected so far on my RHEL 5.x boxes. I’ll use my own use case to describe it.
The Plesk web hosting control panel manages several hundred domains on one of our RHEL boxes. Domains are added and removed frequently. We need to sync every domain’s httpdocs directory to other web servers. A simple rsync job could be set up for this, but it is much more efficient if rsync runs only when something has changed, i.e. a domain was added or removed, a file was updated, and so on. In other words, instead of letting rsync detect changes, it is better for our script to detect them and then run rsync. The obvious advantage is a lighter network load, because rsync only syncs content to our servers when there are changes.
Each domain is stored in its own directory under /var/www/vhosts. So vhosts is the directory we watch; if we detect any change in it, we store the names of all its subdirectories in a text file and can then call rsync with each domain’s httpdocs as an argument.
Here is the script:
#!/bin/bash
############### detectdir.sh by Jagbir Singh #################
#
# script to detect changes in directory.
#
###############################################################

# directory to watch
DIR="/var/www/vhosts"

# store current statistics of dir
OLD=`stat -t $DIR`

while true
do
    # take a new snapshot of stats
    NEW=`stat -t $DIR`

    # compare it with old
    if [ "$NEW" != "$OLD" ]; then
        echo "changed!" ## you may want to comment this

        # take current listing of dir in a file. domains may be added or removed.
        ls $DIR --file-type | grep "\/" | sed 's/\///' > /tmp/dir.list

        # open the file on descriptor 10 so its entries can be processed
        exec 10</tmp/dir.list
        let count=0

        while read LINE <&10; do
            # currently printed on screen, can supply this as arg to rsync, discussed later
            echo $LINE/httpdocs/
            echo
            ((count++))
        done

        # take snapshot again and store it in both old and new vars
        NEW=`stat -t $DIR`
        OLD=$NEW

        # close descriptor 10
        exec 10>&-
    fi

    # i'm using 3 secs calm time, you should update this as per your environment
    sleep 3
done
I’ve skipped the rsync command here because it is already discussed in detail in another post here, which has a separate script to run rsync on demand.
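Purely as an illustration of how the generated list could be fed to rsync, here is a minimal sketch. It only prints the commands it would run. The function name print_rsync_cmds, the destination host and the -az --delete flags are my assumptions, not the setup from the other post.

```shell
#!/bin/bash
# Print one rsync command per domain named in a list file.
# Dry run only: the commands are echoed, never executed.
#   $1 = list file (one domain per line, e.g. /tmp/dir.list)
#   $2 = vhosts root (e.g. /var/www/vhosts)
#   $3 = destination host (hypothetical; use a real web server)
print_rsync_cmds() {
    local list="$1" src_root="$2" dest_host="$3"
    local domain
    while IFS= read -r domain; do
        # skip blank lines in the list
        [ -n "$domain" ] || continue
        echo "rsync -az --delete $src_root/$domain/httpdocs/ $dest_host:$src_root/$domain/httpdocs/"
    done < "$list"
}
```

Calling print_rsync_cmds /tmp/dir.list /var/www/vhosts web2 prints one line per domain; once the output looks right, the echo could be dropped so the commands actually run.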
This is cool, I need something like this right now. With some modifications, of course.
One question, how do you ensure that this is running at all times?
Also, what is the purpose of OLD=NEW at the end? Seems that just recalculating OLD would suffice since NEW is recalculated at the beginning of the WHILE loop anyway?
Yes, thanks for pointing that out; it can be removed. To ensure the script is running at all times, you can modify it a bit so that at the start it checks for itself among the running processes (e.g. ps aux | grep detectdir.sh) and starts itself if it is not already running, or does nothing otherwise. Then put it in cron to run frequently.
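One way to sketch that self-check is with a pidfile instead of grepping ps output, since a plain ps aux | grep detectdir.sh also matches the grep process itself. The pidfile path and the function names below are assumptions for illustration only.

```shell
#!/bin/bash
# Hypothetical pidfile location; adjust for your system.
PIDFILE="${PIDFILE:-/tmp/detectdir.pid}"

# Return 0 if the PID recorded in $PIDFILE belongs to a live process
# that we are allowed to signal, non-zero otherwise.
is_running() {
    [ -f "$PIDFILE" ] || return 1
    local pid
    pid=$(cat "$PIDFILE")
    [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null
}

# Start the given command in the background and record its PID,
# unless a live instance is already recorded.
ensure_running() {
    if is_running; then
        echo "already running"
    else
        "$@" &
        echo $! > "$PIDFILE"
        echo "started"
    fi
}
```

A small wrapper that sources these functions and calls ensure_running /path/to/detectdir.sh could then be run from cron every minute; it does nothing while the watcher is alive and restarts it otherwise.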
Oh man, just ran into a snag… Apparently, stat doesn’t take into account activity in subdirectories, of which I have hundreds and need to sync them all. Any ideas?
Sure, instead of using
stat -t $DIR
you can make a checksum for the entire directory (recursively) like this:
find $DIR | while read f; do echo `stat -t $f`; done | sha1sum
Now, whenever something changes inside $DIR, the checksum will change. Also note that you can add the -L option to find, to also follow symlinks if you want.
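A variation on the same idea, as a sketch only: the one-liner above splits on whitespace in file names, and the terse stat output includes the access time, so merely reading a file can change the checksum. The function below (tree_fingerprint is a name I made up) hashes just name, size, mtime and ctime, and assumes GNU stat and find.

```shell
#!/bin/bash
# Print a SHA-1 fingerprint of a directory tree's metadata.
# Only name, size, mtime and ctime are hashed, so reading a file
# (which may update its access time) does not change the result.
# Assumes GNU coreutils stat (-c) and a find that supports -exec ... +.
tree_fingerprint() {
    find "$1" -exec stat -c '%n %s %Y %Z' {} + | sort | sha1sum | cut -d' ' -f1
}
```

In the watcher script this could stand in for the stat call, e.g. OLD=`tree_fingerprint $DIR`. Timestamps are taken at one-second granularity here, so two edits within the same second that leave the size unchanged may go unnoticed.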
Cheers
Here is what I usually put in my scripts:
http://journal.valeriu.me/169/recursive-filedirectory-change-detection/
Beware, though, that on very big folders (with hundreds or thousands of files) this may prove inefficient! The ideal solution would be to have control over the agent that changes those files and to simply change a variable somewhere whenever changes are applied. Unfortunately, this is rarely the case.
I added a new, much faster method of doing this on my post:
http://journal.valeriu.me/169/recursive-filedirectory-change-detection/