Unix bash find file directories with 2 explicit file extensions
I am trying to create a small bash script that traverses a directory containing hundreds of subdirectories. SOME of these subdirectories contain both a textfile.txt and an htmlfile.html, where the names textfile and htmlfile vary.
I only really care about the subdirectories that have both a .txt and a .html file in them; all other subdirectories can be ignored.
Then I want to list all the .html and .txt files that are in the same subdirectory.
It seems like a pretty simple problem to solve, but I'm at a loss. All I can really get is a line of code that outputs the .html or .txt files without linking them to the subdirectory they are in, and I'm pretty new to bash scripting, so I can't go any further.
Any help would be greatly appreciated.
#!/bin/bash
files="$(find ~/file/ -type f -name '*.txt' -or -name '*.html')"
for file in $files
do
echo $file
done
The following find command checks each subdirectory and, if it has both html and txt files, lists them all:
find . -type d -exec env d={} bash -c 'ls "$d"/*.html &>/dev/null && ls "$d"/*.txt &>/dev/null && ls "$d/"*.{html,txt}' \;
Explanation:
- find . -type d
  This finds the current directory and all of its subdirectories.
- -exec env d={} bash -c '...' \;
  This sets the environment variable d to the path of the found subdirectory and then runs the bash command that is enclosed in single quotes (see below).
- ls "$d"/*.html &>/dev/null && ls "$d"/*.txt &>/dev/null && ls "$d/"*.{html,txt}
  This is the bash command that is executed. It consists of three statements joined with &&. The first one checks whether there are any html files in the directory d. If so, the second statement is executed and checks whether there are any txt files. If so, the last statement is executed and lists all the html and txt files in the directory d.
This command is safe for all file and directory names that contain spaces, tabs, or other complex characters.
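The same check can also be written without parsing ls output. The following is only a sketch, assuming bash with process substitution and a find that supports -print0; the function name list_paired_dirs is hypothetical and not part of the answer above. Note that, like the ls version, the globs also match directories whose names end in .html or .txt.

```shell
#!/bin/bash
# list_paired_dirs ROOT (hypothetical helper name): for every directory under
# ROOT that contains at least one *.html and one *.txt entry, print those entries.
list_paired_dirs() {
    local root=$1 dir
    local -a html txt
    # -print0 with read -d '' keeps unusual directory names intact.
    while IFS= read -r -d '' dir; do
        # With nullglob set, a pattern that matches nothing expands to nothing.
        shopt -s nullglob
        html=("$dir"/*.html)
        txt=("$dir"/*.txt)
        shopt -u nullglob
        # Print only when both arrays are non-empty.
        if (( ${#html[@]} > 0 && ${#txt[@]} > 0 )); then
            printf '%s\n' "${html[@]}" "${txt[@]}"
        fi
    done < <(find "$root" -type d -print0)
}
```

It could be called as list_paired_dirs ~/file, using the directory from the question.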
#!/bin/bash
# A quick peek into a dir to see if there is at least one file that matches the pattern
dir_has_file() {
    local dir="$1" pattern="$2"
    [ -n "$(find "$dir" -maxdepth 1 -type f -name "$pattern" -print -quit)" ]
}
# Assumes there are no newline characters in the filenames, but will behave correctly with subdirectories that match *.html or *.txt
find "$1" -type d |
while IFS= read -r d
do
    dir_has_file "$d" '*.txt' &&
    dir_has_file "$d" '*.html' &&
    # Now print all the matching files; the parentheses make -type f apply to both patterns
    find "$d" -maxdepth 1 -type f \( -name '*.txt' -o -name '*.html' \)
done
This script takes the root directory to be searched as the first argument ($1).
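When -type f is combined with two -name tests joined by -o, as in the final find of the script, grouping matters: in find, the implicit "and" binds more tightly than -o. The sketch below illustrates the difference; the two function names are hypothetical and exist only for this comparison.

```shell
#!/bin/bash
# Hypothetical helper names, used only for this illustration.

# Ungrouped: parsed as ( -type f AND -name '*.txt' ) OR ( -name '*.html' ),
# so anything named *.html is printed, including directories.
find_ungrouped() {
    find "$1" -maxdepth 1 -type f -name '*.txt' -o -name '*.html'
}

# Grouped: -type f applies to both name patterns.
find_grouped() {
    find "$1" -maxdepth 1 -type f \( -name '*.txt' -o -name '*.html' \)
}
```

In a directory that contains a subdirectory called notes.html, find_ungrouped would print that subdirectory, while find_grouped would print regular files only.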
The test command is what you need to check for the presence of each file in each of the subdirectories:
find . -type d -exec bash -c 'if test -f "$1/$2" && test -f "$1/$3" ; then ls "$1"/*.{txt,html} ; fi' _ {} "$file1" "$file2" \;
where $file1 and $file2 are the two .txt and .html files you are looking for.
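As a sketch of how this might be wrapped for reuse, the function below takes the root directory and the two file names as arguments and passes them to sh as positional parameters, so the paths are never spliced into the command string. It avoids brace expansion, which a plain sh need not support. The function name list_if_both and the example file names are assumptions made for illustration.

```shell
#!/bin/bash
# list_if_both ROOT TXT_NAME HTML_NAME (hypothetical helper name):
# for each directory under ROOT that contains both named files,
# print every *.txt and *.html entry in that directory.
list_if_both() {
    # Inside sh -c: $1 is the directory from find, $2 and $3 are the file names.
    find "$1" -type d -exec sh -c '
        if test -f "$1/$2" && test -f "$1/$3"; then
            printf "%s\n" "$1"/*.txt "$1"/*.html
        fi
    ' sh {} "$2" "$3" \;
}
```

For example, list_if_both . report.txt index.html would list the .txt and .html files of every directory that holds both report.txt and index.html.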