Common tasks for Unix system administrators often require working with all of the files in a directory tree and selectively doing something with some of them: copying, deleting, renaming, or moving them, or simply getting a list of files matching certain characteristics.
Sometimes we want to do something from the Unix command line with files that have spaces in their names. Let's see what we can do with our friends find(1), sed(1), and xargs(1).
Find looks for entries in some directory matching its arguments, typically sending a list of them to the standard output. Sed is the Stream EDitor, and applies a series of commands to transform its input into its output. Xargs reads items from its standard input and passes them as arguments on the command line of another program.
First, let's set up our little foobox:
$ cd /tmp
$ mkdir foo
$ touch 'foo/file with spaces'
$ touch 'foo/bar'
$ touch 'foo/another file with spaces'
$ ls -1 foo
another file with spaces
bar
file with spaces
Now let's do a simple find:
/tmp $ find foo -type f
foo/file with spaces
foo/bar
foo/another file with spaces
Now let's do something with those files. Let's just list them:
/tmp $ find foo -type f | xargs ls
foo/file: No such file or directory
spaces: No such file or directory
foo/another: No such file or directory
file: No such file or directory
with: No such file or directory
spaces: No such file or directory
foo/bar
What happened? Xargs split its input on whitespace and handed each piece to ls(1) as a separate argument, so ls went looking for files named foo/file, with, spaces, and so on. We need to escape the spaces inside the names so that xargs keeps each name in one piece, but leave alone the newlines that separate one filename from the next. That's just the sort of thing sed likes to do:
/tmp $ find foo -type f | sed 's, ,\\&,g'| xargs ls -ltr
-rw-r--r-- 1 user group 0 May 11 12:12 foo/file with spaces
-rw-r--r-- 1 user group 0 May 11 12:12 foo/bar
-rw-r--r-- 1 user group 0 May 11 12:12 foo/another file with spaces
In the dorky sed command between the single quotes, the "s" means substitute: replace the text matched by the pattern between the first and second delimiters with the text between the second and third delimiters. The trailing "g" applies the substitution to every match on the line rather than just the first. I like to use commas as delimiters instead of slashes, though almost any character will do. Slashes often appear in path names, and by habitually using commas I avoid errors when I fail to escape the slashes.
The pattern, called a regular expression, in this case says to look for a space. The replacement is a backslash (written \\ so that sed treats it literally) followed by &, which stands for the text we just matched. This is sed-ese for "prepend a backslash".
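To see the substitution on its own, here is a minimal sketch that runs one sample path through the same sed command and then counts how many arguments xargs hands on. The variable names escaped and nargs are only for illustration, and the sh -c wrapper is just a convenient way to count arguments.

```shell
# Escape every space in one sample path with the same sed command.
escaped=$(printf '%s\n' 'foo/file with spaces' | sed 's, ,\\&,g')
printf '%s\n' "$escaped"
# prints: foo/file\ with\ spaces

# xargs treats a backslash-escaped space as part of the name,
# so the whole path arrives as a single argument.
nargs=$(printf '%s\n' "$escaped" | xargs sh -c 'echo "$#"' sh)
printf '%s\n' "$nargs"
# prints: 1
```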
A slightly more general approach is to wrap each filename with single quotes. The first sed expression inserts a quote at the start of each line (^) and the second appends one at the end ($). You still run into a problem with filenames which have single quotes in them, but you shouldn't put quotes in filenames:
$ find foo -type f | sed -e "s,^,'," -e "s,\$,',"
'foo/file with spaces'
'foo/bar'
'foo/another file with spaces'
$ find foo -type f | \
sed -e "s,^,'," \
-e "s,\$,'," | \
xargs ls
foo/another file with spaces
foo/file with spaces
foo/bar
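The caveat about single quotes can be seen with a small sketch. The filename below is hypothetical and is never created; the point is only that wrapping it in single quotes leaves xargs with an unbalanced quote, so the pipeline fails. The exact error message differs between xargs implementations, so it is discarded here.

```shell
# A hypothetical name containing a single quote, wrapped in single quotes with sed.
wrapped=$(printf '%s\n' "foo/it's here" | sed -e "s,^,'," -e "s,\$,',")
printf '%s\n' "$wrapped"
# prints: 'foo/it's here'

# xargs cannot pair up the quotes, so the pipeline exits with a non-zero status.
if printf '%s\n' "$wrapped" | xargs ls 2>/dev/null; then
    status=ok
else
    status=failed
fi
printf '%s\n' "$status"
# prints: failed
```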
Sharp reader Nic Ivy has noted a far simpler way to deal with spaces in filenames for find(1) and xargs(1), which also deals with other special characters like quotes and greater-than or less-than symbols:
$ find foo -type f -print0 | xargs -0 ls
foo/another file with spaces
foo/file with spaces
foo/bar
From the UnixHelp xargs(1) man page:
--null, -0
Input items are terminated by a null character instead of by
whitespace, and the quotes and backslash are not special (every
character is taken literally). Disables the end of file string,
which is treated like any other argument. Useful when input
items might contain white space, quote marks, or backslashes.
The GNU find -print0 option produces input suitable for this
mode.
Man pages courtesy UnixHelp.
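As a quick check that null termination copes with quotes as well as spaces, here is a minimal sketch. It assumes a find and xargs that support -print0 and -0, and it works in a scratch directory made with mktemp so that /tmp/foo is left alone. The names demo and count are only for illustration.

```shell
# Scratch directory holding one awkward name and one plain name.
demo=$(mktemp -d)
touch "$demo/it's a file" "$demo/plain"

# Each null-terminated name reaches the command as exactly one argument.
count=$(find "$demo" -type f -print0 | xargs -0 sh -c 'echo "$#"' sh)
printf '%s\n' "$count"
# prints: 2

rm -r "$demo"
```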