Selecting specific data from a PDF
% I am trying to obtain any values between 48 and 64, along with the
% corresponding values in the column to the right.
% For example, the first line has the value 58 in the third column, and I
% would also like to obtain the 100 next to it.
% I extracted the text from the PDF first but am unsure where to go from here.
clear;
pages = 1:18;
str = extractFileText("data-01.pdf",'Pages',pages);
Answers (1)
Riya on 15 Sep 2023
Hello Nathaniel Porter,
As I understand it, you want to find the rows of a PDF file where the value in a specific column lies between 48 and 64, and retrieve the corresponding value from the column to its right.
You can follow the steps below:
% Split the extracted text into lines
lines = splitlines(str);
% Initialize the result: one row per match, [value, correspondingValue]
result = [];
% Iterate over the lines
for i = 1:numel(lines)
    thisLine = lines{i};   % current line (avoids shadowing the built-in "line")
    % Match two integer columns, then a third column between 48 and 64,
    % then capture the column to its right. A class such as [48-64] matches
    % a single character, not a numeric range, so the range is spelled out:
    % 48-49, 50-59, 60-64.
    pattern = '\d+\s+\d+\s+(4[89]|5\d|6[0-4])\s+(\d+)';
    match = regexp(thisLine, pattern, 'tokens');
    % If a match is found, extract the values
    if ~isempty(match)
        value = str2double(match{1}{1});
        correspondingValue = str2double(match{1}{2});
        % Store the values in the result
        result = [result; value, correspondingValue];
    end
end
% Display the result
disp(result);
For more information about 'regexp', you can refer to the MATLAB documentation for that function.
I hope it helps! 
