Cleaning up my script.

Post by Russell Conne » Thu, 26 Nov 1998 04:00:00



Hello folks,

I wrote this script with the idea that I could synchronize two directories.
After writing it, it looks a bit clunky, and I was hoping someone might make
suggestions on how to smooth out the code. My first idea is to get rid of
the if...then structure after the "cmp" statement and use a case..esac. Also
the cmp can take some time, any suggestions on whether, say, chksum might
work faster, or rewrite it in Perl? This is going to work over a NFS mount.
How about doing the checks/copy at the dir level?

Here is what I have so far:

Sorry the tabs did not come across, I know it makes it harder to read :~(

# setup some variables
master=$1
slave=$2
TDIR=$PWD
#debug echos
echo "$TDIR""/mlist"
echo $master
echo $slave
# Cd to master/slave directory so all files start with ./ and
# output to mlist/slist in the working dir
cd $master
find . -print > "$TDIR""/mlist"
cd $slave
find . -print > "$TDIR""/slist"
cd $TDIR
# read in each line of mlist to compare with each line in slist
while read x
do
# debug echo
echo $x
# use grep to get a count of all occurrences of $x in slist; use -xc to get
# a line count and use exact match
d=$(grep -cx $x slist)
# This little hack takes ./file and makes it a /<master>/file and
# /<slave>/file
x1=$master"$(echo $x | sed 's/.//')"
x2=$slave"$(echo $x  | sed 's/.//')"
# if $x does not appear in slist copy it
if [ "$d" = "0" ]
then
     echo $x1 not found, cpy it.
     copy $x1 $x2
# if it does exist cmp it to determine if the same.
     elif [ "$d" = "1" ]
     then
     echo Compare $x1 to $x2
     # the sed stuff takes relative ./file and turns it into
     # absolute /<master>/file
 # debug echo
 echo "$x1 = $x2 ?"
 # compare the files
 cmp -s $x1 $x2
  # $? is the variable holding the result of the exit code
  # so if it equals 1 (false) do a copy over them
  if [ $? = 1 ]
  then
  echo "******* $x1 and $x2 are different, cpy $x1 over $x2"
  cp $x1 $x2
  else
  # debug echo, put error handling here.
  echo SAME
  fi
 fi
done < mlist
#prototype code to do the deletes
while read x
do
echo $x
d=$(grep -cx $x mlist)
if [ "$d" = "0" ]
then echo "I would delete " $x
fi
done < slist
# dont forget to clean up the temp files mlist slist

BTW another guy said he could do it (write the prototype code) faster in C,
I got done first AND he is the professional programmer! Gotta love scripts!
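
The case..esac idea mentioned at the top of this post could look something like the sketch below. It is only an illustration: compare_and_copy is a hypothetical helper name, and it relies on cmp exiting with 0 for identical files, 1 for different files, and a value greater than 1 on error (for example when the second file is missing).

```shell
# compare_and_copy SRC DST  (hypothetical helper, not part of the original script)
# Branches on cmp's exit status with case..esac instead of nested if..then.
compare_and_copy() {
    src=$1
    dst=$2
    status=0
    # Capture the exit status explicitly so the function also works under "set -e".
    cmp -s "$src" "$dst" || status=$?
    case $status in
        0)  echo "SAME: $src" ;;
        1)  echo "DIFFERENT: copying $src over $dst"
            cp "$src" "$dst" ;;
        *)  echo "ERROR: could not compare $src and $dst" >&2
            return 2 ;;
    esac
}
```

The catch-all branch is where the "put error handling here" comment from the script would go, since an unreadable or missing file is reported separately from a plain difference.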

 
 
 

Cleaning up my script.

Post by Icarus Spar » Tue, 01 Dec 1998 04:00:00



Quote:>I wrote this script with the idea that I could synchronize two directories.
>After writing it, it looks a bit clunky, and I was hoping someone might make
>suggestions on how to smooth out the code. My first idea is to get rid of
>the if...then structure after the "cmp" statement and use a case..esac. Also
>the cmp can take some time, any suggestions on whether, say, chksum might
>work faster, or rewrite it in Perl? This is going to work over a NFS mount.

chksum and/or Perl will not make any significant difference in speed - you
still need to read all of the data and that is the slow bit!

Quote:>How about doing the checks/copy at the dir level?
>Here is what I have so far:

64 lines of script deleted

Quote:>BTW another guy said he could do it (write the prototype code) faster in C,
>I got done first AND he is the professional programmer! Gotta love scripts!

Of course you could always just write a 'distfile' file,

# Tiny 'distfile' to keep two directory trees in step
( . ) -> localhost
        install whole,remove /slave/directory;

Put it in the top level 'master' directory, and then type 'rdist'. You might
also want to look at the 'younger' option.

Of course I am not a professional programmer, but I think that my solution
is better than a 64 line script.

Icarus
P.S. Old versions of 'rdist' had security holes in them; make sure that your
version of 'rdist' is not SUID. If it is, then upgrade it.

 
 
 

Cleaning up my script.

Post by Monty Taylo » Tue, 01 Dec 1998 04:00:00


*** Apologies for the clunky quoting, I'm using Outlook ***

Just a note... rdist is not necessarily (sp) on all systems. For that
matter, neither is perl, although it is more common. If you don't have
rdist, you can also use mirror, which is a collection of perl scripts.
I agree with Russell, Perl won't be any quicker for the job. However, the
script itself ( if you were to write it from scratch, and you knew perl )
would be a tad simpler.
Do a net search for rdist and/or mirror, though, those will be your best
bet.

--
--------------------------------------------------------------
Monty Taylor
Best Consulting -- Seattle, WA




 
 
 

Cleaning up my script.

Post by Icarus Spar » Wed, 02 Dec 1998 04:00:00



Quote:>*** Apologies for the clunky quoting, I'm using Outlook ***

Apology accepted. - See below :-)

Quote:>Just a note... rdist is not necessarily (sp) on all systems. For that
>matter, neither is perl, although it is more common. If you don't have

Of course rdist is not on all systems, but it is supplied as standard on
a lot more systems than Perl currently is. It has been a standard part of
the BSD 'r' suite for many, many years. By default 'rdist' will be a lot
quicker (but slightly less accurate) than a 'cmp', as it compares only
the file size and time rather than the data. But the chance of two people
modifying a script and writing it back in the same second, with their modified
files being the same size, is pretty small. Of course you can always tell rdist
to do a full binary compare as well.
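
The trade-off described above (size and time versus a full compare) can be sketched in plain shell. This is not how rdist itself is implemented; file_size, quick_differ and full_differ are hypothetical helper names, and the sketch assumes regular files and an 'ls -ln' listing with the size in the fifth field.

```shell
# file_size FILE: print FILE's size in bytes, taken from the directory
# listing so the file data itself is not read.
file_size() {
    ls -ln "$1" | awk 'NR == 1 { print $5 }'
}

# quick_differ SRC DST: succeed if DST is missing, the sizes differ, or
# SRC is newer than DST. Cheap, but it misses same-size edits that keep
# the timestamp.
quick_differ() {
    [ -f "$2" ] || return 0
    [ "$(file_size "$1")" = "$(file_size "$2")" ] || return 0
    [ -n "$(find "$1" -newer "$2" -print)" ]
}

# full_differ SRC DST: succeed if DST is missing or the contents differ.
# A size mismatch settles the question without reading any data; only
# same-size files get the byte-by-byte cmp.
full_differ() {
    [ -f "$2" ] || return 0
    [ "$(file_size "$1")" = "$(file_size "$2")" ] || return 0
    ! cmp -s "$1" "$2"
}
```

With helpers like these, a loop could call 'quick_differ "$src" "$dst" && cp "$src" "$dst"' for the fast variant, or use full_differ where accuracy matters more than speed.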

Quote:>rdist, you can also use mirror, which is a collection of perl scripts.
>I agree with Russell, Perl won't be any quicker for the job.

              ^^^^^^^ My name is Icarus!
Icarus
 
 
 

Cleaning up my script.

Post by Donald Desrosie » Thu, 03 Dec 1998 04:00:00


: Hello folks,

Russell,

Clearly, reasonable people can (and will) disagree on
how this should be done.

If this is the whole script (less the tabs :-)),
you might want to check that $1 and $2 were given.
If you didn't enter them (or they are wrong), this
script is going to do strange things.
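
A minimal sketch of such a check, assuming both arguments must name existing directories. The function name check_dirs and the script name sync.sh in the usage message are made up for illustration.

```shell
# check_dirs MASTER SLAVE  (hypothetical helper)
# Returns 0 only if exactly two arguments were given and both are directories.
check_dirs() {
    if [ $# -ne 2 ]; then
        echo "usage: sync.sh master-dir slave-dir" >&2
        return 1
    fi
    for dir in "$1" "$2"; do
        if [ ! -d "$dir" ]; then
            echo "not a directory: $dir" >&2
            return 1
        fi
    done
    return 0
}
```

At the top of the script this would be called as 'check_dirs "$@" || exit 1' before master and slave are assigned.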

: # setup some variables
: master=$1
: slave=$2
: TDIR=$PWD

A single echo will do it (and will run faster).

: #debug echos
: echo "$TDIR""/mlist"
: echo $master
: echo $slave

You could put these into the background and do a wait. I could
argue that one either way.

: # Cd to master/slave directory so all files start with ./ and
: # output to mlist/slist in the working dir
: cd $master
: find . -print > "$TDIR""/mlist"
: cd $slave
: find . -print > "$TDIR""/slist"
: cd $TDIR

Sort mlist and slist and do a diff. Base your action
on the > or < at the beginning of the line (ignore
lines that don't have > or <). It's a boatload faster
than reading through mlist and slist.
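
The sort-and-diff suggestion above could be sketched as follows. The function name list_differences and the COPY/DELETE labels are invented for the example, mktemp is assumed to be available, and file names containing newlines are not handled.

```shell
# list_differences MASTER SLAVE  (hypothetical helper)
# Prints "COPY path" for paths found only under MASTER and
# "DELETE path" for paths found only under SLAVE.
list_differences() {
    mlist=$(mktemp) || return 1
    slist=$(mktemp) || return 1
    (cd "$1" && find . -print | sort) > "$mlist"
    (cd "$2" && find . -print | sort) > "$slist"
    # diff prefixes lines only in the first file with "<" and lines only
    # in the second file with ">"; all other diff output is ignored.
    # diff exits 1 when the lists differ, which is expected here.
    { diff "$mlist" "$slist" || true; } |
        sed -n -e 's/^< /COPY /p' -e 's/^> /DELETE /p'
    rm -f "$mlist" "$slist"
}
```

Paths present in both lists produce no output, so they would still need a cmp (or a size and time check) to find changed contents.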

[ Rest of script snipped ]

-don

 
 
 

Cleaning up my script.

Post by Russell Conne » Thu, 03 Dec 1998 04:00:00


Thanks guys!
I did not know about rdist, and had not seen it in any of my books (Essential
SA by Frisch, SCO Companion by Mohr, plus a few others).
This will most likely be the method I use. I asked in comp.unix.admin,
describing this idea, before I wrote the script and did not get an answer
other than using 1776!
I will need to do binary compares because some of the data is Progress
databases, which can remain the same size and date over a period of time
until the .BI is flushed. Not to mention the code: what if someone changes
only a single character?

Oh well, I needed the practice...



 
 
 

Cleaning up my script.

Post by Heiner Steve » Sun, 13 Dec 1998 04:00:00



> I wrote this script with the idea that I could synchronize two directories.
> After writing it, it looks a bit clunky, and I was hoping someone might make
> suggestions on how to smooth out the code. My first idea is to get rid of
> the if...then structure after the "cmp" statement and use a case..esac. Also
> the cmp can take some time, any suggestions on whether, say, chksum might
> work faster, or rewrite it in Perl? This is going to work over a NFS mount.
> How about doing the checks/copy at the dir level?

I commented some parts of the script, and present a somewhat shorter
(and faster) version at the end...

Quote:> # setup some variables
> master=$1
> slave=$2
> TDIR=$PWD
> #debug echos
> echo "$TDIR""/mlist"
> echo $master
> echo $slave
> # Cd to master/slave directory so all files start with ./ and
> # output to mlist/slist in the working dir
> cd $master
> find . -print > "$TDIR""/mlist"
> cd $slave
> find . -print > "$TDIR""/slist"

    The "master" and "slave" file lists are not strictly necessary,
    because the processing may be within a pipe.

Quote:> cd $TDIR
> # read in each line of mlist to compare with each line in slist
> while read x
> do
> # debug echo
> echo $x
> # use grep to get a count of all occurrences of $x in slist; use -xc to get
> # a line count and use exact match
> d=$(grep -cx $x slist)

    If you just want to know whether the file from the master directory exists
    within the slave directory, why don't you just use

        if [ -f "$slave/$x" ]
        then
            echo "file does exist in slave directory"
        else
            echo "file does not exist in slave directory"
        fi

    This increases the speed of the script, because a "grep" on a
    potentially large file is replaced by a single "[ -f ... ]".

Quote:> # This little hack takes ./file and makes it a /<master>/file and
> /<slave>/file
> x1=$master"$(echo $x | sed 's/.//')"
> x2=$slave"$(echo $x  | sed 's/.//')"

    A more general way to convert a relative directory name ("./something",
   "something") into an absolute name (starting with "/") is

        absdir=`(cd "$reldir"; pwd)`

Quote:> # if $x does not appear in slist copy it
> if [ "$d" = "0" ]
> then
>      echo $x1 not found, cpy it.
>      copy $x1 $x2
> # if it does exist cmp it to determine if the same.
>      elif [ "$d" = "1" ]
>      then
>      echo Compare $x1 to $x2
>      # the sed stuff takes relative ./file and turns it into
>      # absolute /<master>/file
>  # debug echo
>  echo "$x1 = $x2 ?"
>  # compare the files
>  cmp -s $x1 $x2
>   # $? is the variable holding the result of the exit code
>   # so if it equals 1 (false) do a copy over them
>   if [ $? = 1 ]
>   then
>   echo "******* $x1 and $x2 are different, cpy $x1 over $x2"
>   cp $x1 $x2
>   else
>   # debug echo, put error handling here.
>   echo SAME
>   fi
>  fi
> done < mlist
> #prototype code to do the deletes
> while read x
> do
> echo $x
> d=$(grep -cx $x mlist)
> if [ "$d" = "0" ]
> then echo "I would delete " $x
> fi
> done < slist
> # dont forget to clean up the temp files mlist slist

> BTW another guy said he could do it (write the prototype code) faster in C,
> I got done first AND he is the professional programmer! Gotta love scripts!

This is an optimized version of the script:

# The following processing assumes that these are absolute paths.
# Convert them to absolute paths (starting with "/"):
master=`(cd "$1" && pwd)`
slave=`(cd "$2" && pwd)`

# first pass: compare or copy all files from the master directory
(cd "$master"
find . -print |
    while read path
    do
        slavepath=$slave/$path
        if [ ! -r "$slavepath" ]
        then
            echo "file $path does not exist in slave directory"
            cp "$path" "$slavepath"
        elif cmp "$path" "$slavepath" >/dev/null 2>&1
        then
            echo "file $path was not changed"
        else
            echo "file $path has changed, copying..."
            cp "$path" "$slavepath"
        fi
    done)

# second pass: remove files from slave directory that no longer exist
# in master directory
(cd "$slave"
find . -print |
    while read path
    do
        [ -r "$master/$path" ] || rm -f "$path"
    done)
<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
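
One thing to keep in mind with the two passes above: 'find . -print' lists directories as well as files, and plain 'cp' and 'rm -f' do not handle directories. A possible variant that creates and removes directories explicitly is sketched below. The function name sync_tree is hypothetical, and symbolic links, special files and file names containing newlines are deliberately not handled.

```shell
# sync_tree MASTER SLAVE  (hypothetical helper)
# Makes SLAVE match MASTER: creates directories, copies new or changed
# files, then removes anything that no longer exists in MASTER.
sync_tree() {
    master=$(cd "$1" && pwd) || return 1
    slave=$(cd "$2" && pwd) || return 1

    # First pass: find lists a directory before its contents, so the
    # directory exists in SLAVE before files are copied into it.
    (cd "$master" && find . -print) |
    while IFS= read -r path
    do
        if [ -d "$master/$path" ]; then
            mkdir -p "$slave/$path"
        elif ! cmp -s "$master/$path" "$slave/$path"; then
            # cmp fails both for changed and for missing slave files
            cp "$master/$path" "$slave/$path"
        fi
    done

    # Second pass: -depth lists directory contents before the directory
    # itself, so a stale directory is already empty when rmdir runs.
    (cd "$slave" && find . -depth -print) |
    while IFS= read -r path
    do
        if [ ! -e "$master/$path" ]; then
            if [ -d "$slave/$path" ]; then
                rmdir "$slave/$path"
            else
                rm -f "$slave/$path"
            fi
        fi
    done
}
```

It would be called as 'sync_tree /master/directory /slave/directory'; both directories must already exist.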

Heiner
--
  --------------------------------------------------------------

/ Heiner's SHELLdorado: http://www.oase-shareware.org/shell  /
-------------------------------------------------------------
ZZW:q!^X^C^DYES^M^JQXexit^Mquit^M^C^C^Z^?^Qq^[xxxalles kacke