How to change Windows-compliant filenames to Unix filenames

Post by David Peterso » Sat, 24 Nov 2001 12:34:42



I'm still very green with Linux, and I'm trying to overcome a huge nuisance.
Many files I receive are from Windows users who *love* to embed funky
characters in their file names.  Is there a clever way to reformat the
filenames in batch mode?  I've tried writing a simple script using 'ls' and
'xargs' to build my set of filenames for processing, but the problem I'm
seeing is that a filename like:

This is a windows filename.gif

appears like this inside a script's for loop (which echoes the
filename):

This
is
a
windows
filename.gif

Any help would be much appreciated.

-David

 
 
 

Post by j.. » Sat, 24 Nov 2001 12:49:11



> I'm very green with Linux still and I'm trying to overcome a huge
> nuisance.  Many files I receive are from Windows users who *love* to
> embed funky characters in their file names.  Is there a clever way
> to reformat the filenames in batch mode?  I've tried writing a
> simple script using 'ls', 'xargs' to build my set of filenames for
> processing, but the problem I'm seeing is a filename like:

> This is a windows filename.gif

> appears like this inside a script's for loop (which echoes the
> filename):

> This
> is
> a
> windows
> filename.gif

You need to quote the name. For example:

f1="This is a windows filename.gif"
f2=normalName.gif

for f in "$f1" "$f2"; do
  echo "$f"
done

Double quotes will preserve it with the spaces as a single entity,
while still allowing $f1 to be expanded.

When in doubt, use double quotes like this.
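
A small sketch of the difference, assuming a POSIX shell with the default IFS. The helper name count_args is made up for illustration; it only reports how many arguments the shell actually passed:

```shell
# count_args is a hypothetical helper: it prints how many arguments it received.
count_args() {
  echo "$#"
}

f1="This is a windows filename.gif"

count_args $f1     # unquoted: the shell splits the value at each space
count_args "$f1"   # quoted: the value arrives as a single argument
```

The unquoted call sees five separate words, the quoted call sees one name.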

 
 
 

Post by Chris F.A. Johnso » Sat, 24 Nov 2001 15:17:04



> I'm very green with Linux still and I'm trying to overcome a huge nuisance.
> Many files I receive are from Windows users who *love* to embed funky
> characters in their file names.  Is there a clever way to reformat the
> filenames in batch mode?  I've tried writing a simple script using 'ls', 'xargs'
> to build my set of filenames for processing,  but the problem I'm seeing
> is a filename like:

> This is a windows filename.gif

> appears like this inside a script's for loop (which echoes the
> filename):

> This
> is
> a
> windows
> filename.gif

As I see it, you have two problems:
        1. Funky file names (read as if s/n/c/)
        2. A script that splits those names at the spaces

The solution is, of course, to remedy problem 1, then problem 2
disappears.

However, a script to remedy problem 1 runs into problem 2.

Problem 2 is not hard to overcome, but I'd recommend doing it only once,
and changing the offensi^H^Hding filenames so they don't bother you
again.

To harvest the filenames, use find. For example, to find filenames
containing a space (I know it's legal to have spaces in file names;
drinking 3 bottles of scotch every day is legal, but it, too, is not
advisable):

find "$DIR" -name '* *' [-print] ## -print is not necessary with most
                               ## modern versions of find
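
The same idea can be widened from spaces to any character outside a chosen "sane" set. This is only a sketch: the helper name list_odd_names and the particular set (letters, digits, '.', '_' and '-') are assumptions, not part of the command above.

```shell
# list_odd_names is a hypothetical helper. It prints every path under "$1"
# whose last component contains a character other than a letter, a digit,
# '.', '_' or '-'. The [!...] form is the standard shell-pattern negation.
list_odd_names() {
  find "$1" -name '*[!a-zA-Z0-9._-]*'
}

# Usage (DIR is assumed to hold the directory to scan):
#   list_odd_names "$DIR"
```

Note that -name tests only the last component of each path, so the directory part does not have to be sane for a file to pass.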

To convert the filename to a saner moniker:

newname=`echo "$FILENAME" | tr -cd 'a-zA-Z0-9.-'`

The characters in the quoted set are those acceptable (sane, not just
legal) for file names. They are all upper- and lower-case letters, the
digits 0 to 9, and '-' and '.'. The set is written without enclosing
square brackets, because tr would otherwise keep '[' and ']' as well;
the '-' goes last so that it is not read as a range. If you like, you
could include some other characters such as '~'.
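
A variation, sketched here as an assumption rather than as part of the recipe above: turn spaces into underscores first, so that word boundaries stay visible, and only then delete whatever is still outside the sane set. The function name sanitize is hypothetical.

```shell
# sanitize is a hypothetical helper. It maps each space to '_' and then
# deletes every character that is not a letter, digit, '.', '_' or '-'.
sanitize() {
  printf '%s' "$1" | tr ' ' '_' | tr -cd 'a-zA-Z0-9._-'
}

sanitize "This is a windows filename.gif"
echo    # sanitize prints no trailing newline of its own
```

The function expects a bare file name, not a path: a '/' in the argument would be deleted like any other character outside the set.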

Put it together and you have:

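DIR=dir_with_a_problem
## -depth lists a directory's contents before the directory itself, so a
## renamed directory cannot invalidate the paths still waiting in the pipe
find "$DIR" -depth -name '* *' | ## adjust to catch other weird characters
        while IFS= read -r filename
        do
            dir=`dirname "$filename"`    ## sanitize only the last component,
            base=`basename "$filename"`  ## so the slashes in the path survive
            newname=`printf '%s' "$base" | tr -cd 'a-zA-Z0-9.-'`
            mv "$filename" "$dir/$newname"
        done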

You should check for collisions between sanitized filenames, but this
gives you the basic method. Other denizens of this group will help you
plug the holes.
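
One way the collision check might look, as a rough sketch only. The function name safe_rename and its skip-and-report policy are assumptions; it also does not cope with names that end in a newline, because command substitution strips trailing newlines.

```shell
# safe_rename is a hypothetical helper. It renames "$1" to a sanitized name
# in the same directory and refuses to overwrite anything that already exists.
safe_rename() {
  dir=$(dirname "$1")
  base=$(basename "$1")
  new=$(printf '%s' "$base" | tr -cd 'a-zA-Z0-9._-')

  if [ "$new" = "$base" ]; then
    return 0                                  # name is already sane
  fi
  if [ -z "$new" ]; then
    echo "skipped (nothing left of the name): $1" >&2
    return 1
  fi
  if [ -e "$dir/$new" ]; then
    echo "skipped (would collide with $dir/$new): $1" >&2
    return 1
  fi
  mv -- "$1" "$dir/$new"
}
```

Inside the while loop, the mv line could then be replaced by a call such as safe_rename "$filename", leaving the skipped files to be dealt with by hand.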

--
    Chris F.A. Johnson                        http://cfaj.freeshell.org
    ===================================================================
    My code (if any) in this post is copyright 2001, Chris F.A. Johnson
    and may be copied under the terms of the GNU General Public License

 
 
 

Post by Bill Marcu » Sun, 25 Nov 2001 04:34:38



> I'm very green with Linux still and I'm trying to overcome a huge nuisance.
> Many files I receive are from Windows users who *love* to embed funky
> characters in their file names.

I hate to say this, but Unix allows even more funky characters in
file names than Windows does.  Any character except / or null can be
used in a file name.
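
This is easy to try out. The sketch below assumes mktemp is available and that the scratch directory sits on a native Unix file system (not a FAT or NTFS mount); the helper name try_name is made up.

```shell
# try_name is a hypothetical helper: it reports whether a file with the
# given name can be created inside a throwaway scratch directory.
try_name() {
  scratch=$(mktemp -d) || return 1
  if touch -- "$scratch/$1" 2>/dev/null; then
    echo "ok"
  else
    echo "rejected"
  fi
  rm -rf -- "$scratch"
}

try_name 'what?.txt'      # '?' is not allowed in Windows file names
try_name 'a:b*c|d'        # neither are ':', '*' and '|'
try_name 'two
lines'                    # a newline inside the name
try_name 'a/b'            # '/' always separates path components
```

On a typical Linux file system the first three names should be accepted. The last one should fail, because the name is read as file b inside a directory a that does not exist.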
 
 
 

Post by those who know me have no need of my nam » Tue, 27 Nov 2001 18:58:52



> I'm very green with Linux still and I'm trying to overcome a huge nuisance.
> Many files I receive are from Windows users who *love* to embed funky
> characters in their file names.  Is there a clever way to reformat the
> filenames in batch mode?

good shells (e.g., bash and ksh) should be able to handle filenames with
spaces without any problem.  most likely you are forgetting to quote the
variable when passing it as an argument to other functions or programs,
which causes the shell to present it as multiple arguments (due to IFS
processing), e.g., instead of ``for f in *; do echo $f; done'' use ``for f
in *; do echo "$f"; done''.

i differ from others in that i suggest that you get used to this, and start
writing your scripts so that they can deal with `strange' file names, e.g.,
instead of ``for f in *; do echo $f; done'' use ``find . -maxdepth 1 -print0
| xargs -r0 -I{} echo "{}"''.  (gnu utilities assumed; -I{} replaces the
older -i, which current gnu xargs marks as deprecated.)
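
an alternative that keeps the per-file work in the shell is find's -exec ... + form, sketched below. the helper name print_names is made up, and -maxdepth is an extension found in gnu and bsd find rather than a posix option.

```shell
# print_names is a hypothetical helper. find hands the names straight to a
# small inline sh script as arguments, so no name is ever parsed from text
# and spaces or newlines in a name cannot split it.
print_names() {
  find "$1" -maxdepth 1 -type f -exec sh -c '
    for f do
      printf "<%s>\n" "$f"
    done
  ' sh {} +
}
```

calling print_names . prints each regular file in the current directory between angle brackets, which makes embedded or trailing spaces visible.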

--
okay, have a sig then

 
 
 

Post by Brian Hile » Wed, 28 Nov 2001 10:20:31




> good shells (e.g., bash and ksh) should be able to handle filenames with
> spaces without any problem.  most likely you are forgetting to quote the
> ...

In general, yes, but since we're on the topic....

Under ksh88 _and_ ksh93 (!) the "whence" builtin (mind you, _builtin_)
cannot be made to function correctly with arguments with embedded spaces.

A surprising bug, and all the more so that it has "survived" so many revisions.

=Brian

 
 
 

Post by those who know me have no need of my nam » Wed, 28 Nov 2001 18:30:17



> Under ksh88 _and_ ksh93 (!) the "whence" builtin (mind you, _builtin_)
> cannot be made to function correctly with arguments with embedded spaces.

> A surprising bug, and all the more so that it has "survived" so many
> revisions.

indeed.

the whence in pdksh (5.2.14) has no trouble with embedded spaces.

--
okay, have a sig then

 
 
 

Post by Brian Hile » Thu, 29 Nov 2001 08:02:03



> indeed. the whence in pdksh (5.2.14) has no trouble with embedded spaces.

Completely understandable, since pdksh has a different source lineage.

However, I think the purpose in sending this comment is to trumpet
ksh's "smaller brother." ;)

I am most distressed by this particular bug, because unlike the
preponderance of other bugs, this seems *not* to be an issue with
usage, but truly and simply just a programming oversight -- how
else could it have happened, that ksh flubs passing a string
(filename) to a function?!

=Brian