How can we display duplicate and missing files, found while searching a folder, in the Command Window?
I need to write a function that takes three arguments: 1. the source folder path (where all the files are located), 2. the destination folder path (where files matched against the Excel sheet should be copied), and 3. a folder name (having some specific name). The function should read the Excel file, search the parent folder for the files listed in the sheet, and copy the files whose names match to the destination folder. While doing this, I also need to display the duplicate and missing files of the parent folder in the Command Window. How can I do that?
This is the code I have tried. It copies the files correctly and displays them, but I also need to display the duplicate and missing files of the parent folder in the Command Window. The source folder has 6 subfolders whose names share a specific prefix, which marks them as folders of the same category. We search for the files in all of those subfolders. Please help: how can this be done?
[file,path] = uigetfile('*.xlsx');
myexcel=readtable(file);
mydir='source folder address';
myartfiles=dir(fullfile(mydir,'**', '*.art'));
mycopydir='Destination Folder address';
samefiles=ismember({myartfiles.name}, myexcel.Tests);
filestocopy=myartfiles(samefiles);
r={filestocopy.name};
a=unique(r);
for k=1:length(a)
origpath=fullfile(filestocopy(k).folder,filestocopy(k).name);
destpath=fullfile(mycopydir,filestocopy(k).name);
copyfile(origpath,destpath);
end
h=msgbox(sprintf('%d number of files copied',k));
Answers (1)
Arjun
on 10 Jun 2025
I see that you have already implemented most of the functionality, and now you're looking for a way to identify and display the names of duplicate and missing files.
You can use the "setdiff" function in MATLAB to identify files that are listed in your Excel sheet but missing from the source folder. To detect duplicate file names within the source folder, the "tabulate" function (part of the Statistics and Machine Learning Toolbox) provides a simple and readable way to count the occurrences of each file name and pick out those that appear more than once.
There is one more concern in the original code. We should not use "k" directly to index "filestocopy", because "k" is an index into the unique file names in "a", not into the original "filestocopy" array. When the source folder contains duplicate names, this mismatch can cause the wrong files to be copied and some matching files to be skipped.
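If the Statistics and Machine Learning Toolbox is not available, the same two checks can probably be done with base MATLAB only. The snippet below is a minimal sketch on made-up sample names (the file names are hypothetical, not taken from the question); it swaps "tabulate" for "unique" plus "accumarray" to count occurrences.

```matlab
% Hypothetical sample data standing in for the real folder listing and Excel column
allSourceNames = {'a.art', 'b.art', 'a.art', 'c.art'};  % names found in the source folder
excelNames     = {'a.art'; 'c.art'; 'd.art'};           % names listed in the Excel sheet

% Missing: listed in Excel but not present in the source folder
missingFiles = setdiff(excelNames, allSourceNames);

% Duplicates without tabulate: count how often each unique name occurs
[uniqueNames, ~, groupIdx] = unique(allSourceNames);
counts = accumarray(groupIdx(:), 1);      % counts(i) = occurrences of uniqueNames{i}
duplicateFiles = uniqueNames(counts > 1);

disp(missingFiles);
disp(duplicateFiles);
```

Here 'd.art' is reported as missing and 'a.art' as a duplicate. Note that this counts duplicates across the whole source listing; intersecting the result with the Excel names narrows it to the files of interest.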
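To make the indexing issue concrete, here is a small sketch on a hypothetical listing (the names and folders are invented for illustration). It uses the second output of "unique", which gives the position of the first occurrence of each unique name in the original array, so that position can be used to index "filestocopy" safely. This is one possible way to do the mapping, not the only one.

```matlab
% Hypothetical listing: 'a.art' exists in two subfolders
filestocopy = struct('name',   {'a.art', 'a.art', 'b.art'}, ...
                     'folder', {'sub1',  'sub2',  'sub3'});

% firstIdx(k) is the position in filestocopy of the first file named a{k}
[a, firstIdx] = unique({filestocopy.name});

for k = 1:numel(a)
    % Indexing with firstIdx(k) instead of k picks the right struct element
    src = fullfile(filestocopy(firstIdx(k)).folder, a{k});
    fprintf('%s\n', src);
end
```

With plain "k" as the index, the loop would visit elements 1 and 2 (both 'a.art') and never reach 'b.art'; with "firstIdx" it visits elements 1 and 3.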
Kindly refer to the slightly modified version of the code below:
[file, path] = uigetfile('*.xlsx');
myexcel = readtable(fullfile(path, file));
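mydir = sourceFolder; % source folder path (first function argument)
myartfiles = dir(fullfile(mydir, '**', '*.art')); % search all subfolders recursively
mycopydir = destinationFolder; % destination folder path (second function argument)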
% Extract all file names from source and Excel
allSourceNames = {myartfiles.name};
excelNames = myexcel.Tests;
% Find matching files
samefiles = ismember(allSourceNames, excelNames);
filestocopy = myartfiles(samefiles);
% Detect missing files
missingFiles = setdiff(excelNames, allSourceNames);
% Detect duplicate files
fileStats = tabulate(allSourceNames);
allDuplicates = fileStats([fileStats{:,2}] > 1, 1);
% Keep only those also listed in Excel
duplicateFiles = intersect(allDuplicates, excelNames);
% Copy found files
r = {filestocopy.name};
a = unique(r);
for k = 1:length(a)
idx = find(strcmp({filestocopy.name}, a{k}), 1); % Find first match
origpath = fullfile(filestocopy(idx).folder, filestocopy(idx).name);
destpath = fullfile(mycopydir, filestocopy(idx).name);
copyfile(origpath, destpath);
end
% Display results
fprintf('%d files copied.\n', numel(a)); % numel(a) stays valid even if nothing matched
if ~isempty(duplicateFiles)
fprintf('Duplicate files found in source folder:\n');
disp(duplicateFiles);
else
fprintf('No duplicate files found.\n');
end
if ~isempty(missingFiles)
fprintf('Missing files (listed in Excel but not found in source):\n');
disp(missingFiles);
else
fprintf('No missing files.\n');
end
h = msgbox(sprintf('%d files copied', numel(a))); % numel(a) stays valid even if nothing matched
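Since the source folder has several subfolders, it may also help to print where each duplicate lives. The following is a rough sketch, assuming a struct array with "name" and "folder" fields like the one "dir" returns; the sample entries and paths are hypothetical.

```matlab
% Hypothetical listing in the same shape that dir() returns (name and folder fields)
myartfiles = struct('name',   {'a.art', 'b.art', 'a.art'}, ...
                    'folder', {'C:\data\set1', 'C:\data\set1', 'C:\data\set2'});
duplicateFiles = {'a.art'};

for d = 1:numel(duplicateFiles)
    % Logical mask of every listing entry that carries this duplicate name
    hits = strcmp({myartfiles.name}, duplicateFiles{d});
    dupFolders = {myartfiles(hits).folder};
    fprintf('%s appears %d times:\n', duplicateFiles{d}, nnz(hits));
    fprintf('    %s\n', dupFolders{:});   % one line per folder
end
```

In the full script, the same loop could run on the real "myartfiles" and "duplicateFiles" variables after the duplicate detection step.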
You can find the documentation for the "setdiff" and "tabulate" functions at the links below:
- "setdiff": https://www.mathworks.com/help/releases/R2021a/matlab/ref/double.setdiff.html
- "tabulate": https://www.mathworks.com/help/releases/R2021a/stats/tabulate.html
Hope this helps!