find function(s) within string
Hi
I am using eval in some code I wrote to evaluate strings that users supply in text files. This poor decision should not be discussed: I know it is not a good way to do things, but it relies on some inherited, rather cumbersome code.
I have come to a point where I would like to compile my code, which I can't do because it uses eval/feval. I thus need to find all the functions called in the text files. An expression in a text file could look like:
'mean(2+3)+fftNew(a,c,'test',2)'
What I would like to do is parse the above into:
functionCell{1} = 'mean'
functionCell{2} = 'fftNew'
This can then be passed to the compiler so that the user functions are included in the compile process.
Can anyone help me with a smart way to do this? :)
4 Comments
John D'Errico
on 26 Oct 2017
Edited: John D'Errico
on 26 Oct 2017
So, you want someone to write a parser for you, that would extract all MATLAB function names? (At least, tell you how to write that parser.) Of course, the parser needs to find function names only, but not the name of a function that happens to appear in a comment or a string meant for display. It needs to find function names inside a string that will go into an eval statement. And of course, it needs to be intelligent enough that a function name that was overloaded as a variable is not a function call. So if somewhere in the code the lines appear as...
mean = [1 2 3];
mean(2)
this is NOT a call to the function mean.m, but a simple subscript on a variable named mean.
Good luck. You will need it.
Oh, by the way, there are very good reasons not to be doing what you have chosen to do. But you seem to be aware of them and don't care. So have fun.
As far as the requirement that you CANNOT change how this is working, laying layer upon layer of crap code on top of crap code only serves to create something that will never work well or efficiently, and will be pure hell to maintain and debug.
There is a point in time where the right decision is to toss the entire mess into the bit bucket, starting from scratch, using good coding techniques from the start. Or, you can continue down the road of the unmaintainable, unworkable.
Stephen23
on 26 Oct 2017
@Kasper Bitsch Lund: why not just get the users to write actual MATLAB scripts or functions?
The poor decision should be discussed, maybe not as a criticism but to evaluate whether it wouldn't be less work to redesign the whole thing.
If you are stuck, this is what I would do if I had 30 minutes for somehow solving the problem short term:
- Post another question on the forum, cleverly avoiding mentioning the context, e.g. "Is there a MATLAB function (internal/undocumented/etc.) able to return all variable names and/or all function names when passed an M-file?"
and while waiting, build a regexp-based approach that:
- Processes the whole file content and produces a reduced content by removing constant strings and comments. (You can see in Per's thread here how just that would already be complicated to achieve if you wanted to do it well in the general case.)
- Extract all expressions on the left of the first occurrence of a single = symbol on each line (LHS).
- Maybe make the above a little more robust by managing multi-line expressions, replacing '\s+\.{3}\s+' with ' ', hoping that nobody added comments after ...
- Detect cases with [..] (multiple outputs) in the LHS and split on commas.
- Process the outcome of LHS processing and extract first expression of each part compatible with variable naming, and store in a cell array as the set of all likely variable names.
- Extract [a-zA-Z][\w]*(?=\() from reduced content as a set of all variable and function names (which doesn't capture calls to scripts and functions with no arguments by the way).
- SETDIFF this set and the set of all likely variable names.
But unless you are very familiar with regular expressions, this is going to take much more than 30 minutes, hence my first point about evaluating if redesigning the whole thing wouldn't be more efficient.
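The steps listed above can be sketched roughly as follows. This is only a minimal illustration under stated assumptions, not a robust parser: the sample text in `code` is hypothetical, the string and comment removal is naive (it breaks on the transpose operator and on apostrophes inside comments), continuation lines are not handled, and every identifier found on a left-hand side is treated as a variable.

```matlab
% Hypothetical sample standing in for the text of one user file.
code = sprintf('%s\n', ...
    'a = 1:10;             % mean(a) in a comment is ignored', ...
    'label = ''max(a)'';     % so is a call inside a string', ...
    '[m, idx] = min(a);', ...
    'b = fftNew(a, m) + a(2);');

% 1) Reduce the content: drop '...' literals first, then comments.
reduced = regexprep(code, '''[^''\n]*''', '');
reduced = regexprep(reduced, '%[^\n]*', '');

% 2) Left-hand sides: text before the first single '=' on each line
%    (lines whose first '=' belongs to ==, <=, >= or ~= are skipped).
lhs = regexp(reduced, '^([^=<>~\n]*)=(?!=)', 'tokens', 'lineanchors');
lhs = [lhs{:}];

% 3) Split multiple outputs on commas/brackets and keep the leading
%    identifier of each part as a likely variable name.
parts = regexp(strjoin(lhs, ','), '[,\[\]]', 'split');
tok = regexp(parts, '^\s*([a-zA-Z]\w*)', 'tokens', 'once');
tok = tok(~cellfun(@isempty, tok));
varNames = unique([tok{:}]);

% 4) Every identifier directly followed by '(' (variables and functions).
called = unique(regexp(reduced, '[a-zA-Z]\w*(?=\()', 'match'));

% 5) What remains after removing the likely variables are likely functions.
funcs = setdiff(called, varNames);
```

For this sample, `funcs` ends up holding `fftNew` and `min`; the indexing expression `a(2)` is discarded because `a` appears on a left-hand side. Built-ins such as `min` would still have to be separated from user functions in a later step, and none of this addresses the shadowing and overloading problems raised elsewhere in the thread.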
Steven Lord
on 26 Oct 2017
John: don't forget about overloaded methods.
Q = sin(A)
Which sin function is called by that line of code (and would need to be included in the compilation process)? Is it the built-in function for double or single precision variables, the overload for sym objects in Symbolic Math Toolbox, the overload for gpuArray in Parallel Computing Toolbox, some other overload, a user-written function, etc.?
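As a small illustration of why the text alone cannot answer this, the sketch below lists every `sin` visible on the path and shows that the choice depends on the class of `A` at run time. The value assigned to `A` is an assumption made purely for the example.

```matlab
% Assumed input for illustration; in real user code the class of A is
% only known when the line actually runs.
A = single(pi/2);
Q = sin(A);

% The class of the argument decides which overload gets dispatched.
argClass = class(A);

% Every sin implementation currently on the path (built-in, toolbox
% overloads, user files); the list depends on the installed products.
candidates = which('sin', '-all');
```

A text-based scan would report only the name `sin`; which of the entries in `candidates` must be shipped with the compiled application cannot be decided without knowing `argClass`.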
Answers (2)
the cyclist
on 26 Oct 2017
This could (obviously?!) get fairly tricky, depending on how elaborate these strings could get. But here is a fairly straightforward way that will at least work for your simple example:
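s = 'mean(2+3)+fftNew(a,c,''test'',2)';
% Start and end index of every run of letters that is immediately
% followed by an opening parenthesis (the lookahead keeps '(' out of the match)
[startFunctionName, endFunctionName] = regexp(s,'[a-zA-Z]+(?=\()');
functionCell = cell(1,numel(startFunctionName));
for ns = 1:numel(startFunctionName)
    % Cut the function name out of the string
    functionCell{ns} = s(startFunctionName(ns):endFunctionName(ns));
end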
The challenge will be defining the regular expression to precisely match your possibilities.
For example, this code assumes that the function name does not have any numeric characters -- only letters. You'll probably want to fix that up. The documentation for regexp will help.
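One possible way to lift the letters-only restriction is sketched below, using the same lookahead pattern quoted in the comments above. The extra terms `fft2(x)` and `my_fun1(mean(y))` are hypothetical additions to the original example, and the last line assumes that the `%#function` pragma of MATLAB Compiler is the mechanism used to declare functions that are only reached through eval/feval.

```matlab
% Original example extended with hypothetical names containing digits
% and underscores, plus a repeated call to mean.
s = 'mean(2+3)+fftNew(a,c,''test'',2)+fft2(x)+my_fun1(mean(y))';

% A letter followed by letters, digits or underscores, directly before '('.
names = regexp(s, '[a-zA-Z]\w*(?=\()', 'match');

% Drop duplicates while keeping the order of first appearance.
functionCell = unique(names, 'stable');

% Assumption: a pragma line like this one, placed in a compiled M-file,
% tells the compiler to include the listed functions.
pragma = ['%#function ' strjoin(functionCell, ' ')];
```

This still treats indexed variables as functions and does not skip string literals or comments, so it is a starting point rather than a general solution.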
2 Comments
Stephen23
on 26 Oct 2017
Hi again
I very much care about it :D It will require a lot of work to rewrite what I have inherited, and to be honest I don't know how to start on that. Right now, for me, the convenience of having people able to write MATLAB code in text files and having it executed on a server is a very big benefit. Thanks for your input so far. I really appreciate it.
@the cyclist I will try to test that :)
"Right now, for me, the convenience of having people able to write MATLAB code in text files and having it executed on a server is a very big benefit"
All MATLAB code can be saved in text files (known as M-files), so it is not clear what your point is. Why not just get your users to write actual MATLAB scripts or functions? It would actually make your life much easier, and would be a real "benefit".
Yair Altman
on 26 Oct 2017
There are various ways this can be done, as explained here: https://undocumentedmatlab.com/blog/function-definition-meta-info
Note: all of them are basically unsupported/undocumented. If you want a supported way, then your only option is to parse the m-files using regexps, but I would not recommend that route.
1 Comment
Note that regexp's cannot detect any magical creation of variables, or magical shadowing of function names by variables, or magical creation of variables in another workspace, or resolve which has higher priority when evaluated.... Which means that the "only option is to parse the m-files using regexps" is still not a general correct solution to the problem posed: the only general solution is to evaluate the code.