Selecting random files from a folder tree
I have folders organized like this:
root/folder_1/file1_1 --up to-- file1_5693
root/folder_2/file2_1 --up to-- file2_100
root/folder_3/file3_1 --up to-- file3_600
root/folder_4/file4_1 --up to-- file4_689
I would like to select a fixed number (for example, 1000) of random files from each folder and put them all together in the output folder. For folders with fewer than 200 files, I would like to copy all the files instead.
root_2/output:
file1_350
.
.
.
file2_1 --> file2_100
.
.
.
etc
How can I do this?
I tried listing all the folder names in a directory with the dir command, but the folder numbers are not sequential. Any help?
I may be misunderstanding, but I see no reason to order the folder names, since you will copy from all of them anyway. Below is a script that copies files from every folder located directly under the root directory.
You only need to change the following four variables: ROOT_DIR, OUT_DIR, THRESHOLD_COPY and N_RANDOM_COPY.
% Define
ROOT_DIR = './'; % where the subdirectories are located
OUT_DIR = './root2'; % copy destination
THRESHOLD_COPY = 200; % folders with at most this many files are copied in full
N_RANDOM_COPY = 100; % number of files to copy from larger folders
dirList = dir(ROOT_DIR);
% drop '.' and '..' by name rather than by position
dirList = dirList(~ismember({dirList.name}, {'.', '..'}));
dirs = dirList([dirList.isdir]); % keep subdirectories only
% skip the output folder in case it sits inside ROOT_DIR
[~, outName] = fileparts(OUT_DIR);
dirs = dirs(~strcmp({dirs.name}, outName));
if ~exist(OUT_DIR, 'dir')
    mkdir(OUT_DIR); % make sure the destination exists
end
for dirIterator = transpose(dirs) % loop over a 1-by-N struct array
    subdirList = dir(fullfile(ROOT_DIR, dirIterator.name));
    subfileList = subdirList(~[subdirList.isdir]); % struct array of files only
    nFiles = numel(subfileList);
    if nFiles > THRESHOLD_COPY
        % random sample without replacement, capped at the folder size
        copyIndices = randperm(nFiles, min(N_RANDOM_COPY, nFiles));
    else
        copyIndices = 1:nFiles; % small folder: copy everything
    end
    for copyIndex = copyIndices
        copyfile(fullfile(ROOT_DIR, dirIterator.name, subfileList(copyIndex).name), ...
            fullfile(OUT_DIR, subfileList(copyIndex).name), ...
            'f');
    end
end
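If you want to check the selection rule before copying anything, it can be tried on its own without touching the file system. The following is only a minimal sketch: the file counts are taken from the question, while `fileCounts` and `picked` are names made up for this illustration and are not part of the script.

```matlab
% Dry run of the per-folder selection rule (no files are copied).
THRESHOLD_COPY = 200;               % same meaning as in the script
N_RANDOM_COPY  = 100;               % same meaning as in the script
fileCounts = [5693 100 600 689];    % hypothetical: counts listed in the question

picked = cell(1, numel(fileCounts)); % hypothetical holder for the chosen indices
for k = 1:numel(fileCounts)
    nFiles = fileCounts(k);
    if nFiles > THRESHOLD_COPY
        % sample without replacement, never asking for more than exist
        picked{k} = randperm(nFiles, min(N_RANDOM_COPY, nFiles));
    else
        % small folder: take every file
        picked{k} = 1:nFiles;
    end
    fprintf('folder_%d: %d of %d files selected\n', k, numel(picked{k}), nFiles);
end
```

The `min` guard matters if you raise N_RANDOM_COPY to the 1000 mentioned in the question: folder_3 and folder_4 hold fewer than 1000 files, and asking `randperm` for more indices than there are files raises an error.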