Shadowing built-in functions

Jan
Jan on 15 Oct 2013
Commented: Matt Cooper on 7 Nov 2022
A well-known problem on the forums: the name of a built-in function is used for a variable, and later, perhaps in another script, the behavior becomes unexpected:
sum = 1:10;
...
sum(1:5) % replies [1,2,3,4,5] instead of 15
If a function was used before, MLint shows a warning (at least in my Matlab version):
clear('sum');
a = sum(1:5);
sum = 1:10;
This happens for common names like: max, min, sum, i, j, line, text, input, ...
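When the unexpected behavior has already appeared, the shadow can be diagnosed and bypassed without renaming anything. The following is a minimal sketch, assuming `sum` is a true built-in in the installed release (so that `builtin` can reach it); the variable names `kind` and `total` are only illustrative.

```matlab
sum = 1:10;                     % the variable now shadows the function

% WHICH reports how the name currently resolves in this workspace.
kind = which('sum');            % 'variable' while the shadow exists

% BUILTIN calls the built-in implementation by name, ignoring the variable.
total = builtin('sum', 1:5);    % 15

clear sum                       % remove the shadowing variable again
```

This only helps once a collision is suspected; it does not prevent one.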
It seems obvious to prevent these problems by applying the general rule:
  • Never shadow built-in functions by variables!
But nobody can remember the names of all built-in functions, least of all if they belong to toolboxes not installed on the user's computer. And is the name "xbuffer" really smarter or safer than "buffer"?!
We could create a tool that compares the names of all (not dynamically created) variables with all documented toolbox functions and shows a warning for every collision. But I'm not convinced that the time required to check the heap of warning messages is worth it just to avoid the rare (but ugly) real collisions.
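A very small version of such a check can be sketched with `exist`. This is an assumption-laden illustration, not a finished tool: `nameIsTaken` and the candidate list are hypothetical, and the check only sees functions on the current path, so toolboxes that are not installed stay invisible.

```matlab
% Hypothetical helper: true if NAME resolves to a built-in function,
% a file on the path (M-file, MEX, Simulink model, P-code), or a class.
nameIsTaken = @(name) exist(name, 'builtin') == 5 ...
    || ismember(exist(name, 'file'), [2 3 4 6]) ...
    || exist(name, 'class') == 8;

% Candidate variable names to test before using them.
candidates = {'sum', 'buffer', 'xbuffer', 'myOtherVariable'};
taken = cellfun(nameIsTaken, candidates);

for k = 1:numel(candidates)
    fprintf('%-16s taken: %d\n', candidates{k}, taken(k));
end
```

The result for a name like "buffer" depends on which toolboxes are installed, which is exactly the limitation described above.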
Which strategy do you use or suggest to reduce naming collisions?
  8 Comments
Matt J
Matt J on 9 Jan 2014
Edited: Matt J on 9 Jan 2014
that the name of a built-in function is used for a variable and later, perhaps in another script, the behavior becomes unexpected:
@Jan,
I assume you really do mean scripts and not mfile functions. I've never known it to occur that a function name was shadowed by a variable name (inside an mfunction workspace) once the variable went out of scope.
Under this assumption, I don't really consider shadowing function names with variable names to be a big problem. I think it's not unreasonable to ask users to keep track of the contents of their workspace and the meanings they assign to names there. If someone uses a variable called "max", it's fine as long as they are committed to that usage for the duration of the variable's lifetime. If you can't keep track, you're writing functions that are too long. In any case, there is MLint to catch you.
To me, the bigger danger is in shadowing builtin MATLAB functions with other user-written functions. In that case, you change the behavior of all builtin mfunctions trying to call the native versions. There are now warnings to alert you to that too, but I remember a time when there wasn't.
Jan
Jan on 22 Feb 2014
@Matt J: You are right, I meant scripts and not functions. This is the reason why I avoid scripts in general and suggest this repeatedly.
MLint has no chance if the variables are created dynamically by eval or assignin. Then another serious problem occurs: If a function runs in the normal non-debug mode, the initial JIT compilation recognizes function calls, and shadowing the names by dynamically created variables has no effect. But in debug mode the code is processed and interpreted line by line, and the shadowing does influence the parsing. I haven't tested this with modern versions again, because it is cleaner to sidestep the dangerous combination of scripts, dynamically created variables, and shadowed functions by strictly avoiding all three.
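One common way to avoid dynamically created variables altogether is to keep the dynamic names as fields of a struct. This is a sketch of that alternative, not something proposed in the thread; the names `data`, `fieldName`, and `total` are illustrative.

```matlab
% The name arrives at run time, e.g. from a file header or user input.
fieldName = 'sum';

% Instead of eval([fieldName ' = 1:10;']) or assignin(...), store the
% value in a struct field. No variable called "sum" is ever created.
data = struct();
data.(fieldName) = 1:10;

% The function SUM is still reachable, and the data is addressed explicitly.
total = sum(data.(fieldName)(1:5));   % 15
```

Because the name lives inside `data`, the parser and MLint see an ordinary struct access, and the function name stays unshadowed.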


Accepted Answer

Cedric
Cedric on 15 Oct 2013
Edited: Cedric on 15 Oct 2013
I am not using packages either (yet), but I would love to have an alternate version of ADDPATH which would mount folders as packages (basically just generating a name space). It would also be interesting to be able to set MATLAB in a mode where its toolboxes must be "mounted" as packages. E.g.
janLib = addpathpack( '/cedric/matlab/libs/JanLib' ) ;
statTbx = tbxloadpack( 'Statistics' ) ;
reg_jan = janLib.regress( ... ) ;
reg_stat = statTbx.regress( ... ) ;
which -all regress
'regress' not found.

More Answers (7)

Jan
Jan on 15 Oct 2013
Some ideas:
  • clear all is a bad idea
  • Using functions instead of scripts reduces the chance that a collision concerns later code
  • I never use package folders, because I have to support Matlab versions before they have been available and because there have been several severe bugs.
  • In modern Matlab versions and under Linux the case-sensitivity reduces the chance. I used camel-case ("MyVariable") for variables used externally and my functions and sulking-camel-case ("myOtherVariable") for local variables.
  • I use a tool to check names, but ignore e.g. these names (ignoring case):
'iF', 'Info', 'Angle', 'Debug', 'Line', 'HostID', 'Box'
But due to the complexity, it is obviously not a good idea to suggest that beginners apply these rules.

John Barber
John Barber on 16 Oct 2013
I have recently started to use packages, although their utility is significantly reduced by the fact that packages do not import their own namespace. If they did, you could trivially move a toolbox or library into a package for namespace management purposes. As is, converting a directory to a package requires editing all of the m-files to add the package name to internal function calls and class references.
I think that the following features could be useful if added to MATLAB:
  1. An optional MLint warning when a variable name shadows a function or class name
  2. An optional environment setting to throw a warning when a variable is created that shadows a function or class name
Obviously, these warnings should be optional and disabled by default because they are a departure from long-established MATLAB behavior. Also, I'm not sure if (2) would affect performance by slowing down variable creation.
As a sidebar, the variable/function shadowing issue seems to have influenced some of TMW's internal design decisions: http://www.mathworks.com/matlabcentral/newsreader/view_thread/171344#439111
  1 Comment
Matt Cooper
Matt Cooper on 7 Nov 2022
This doesn't resolve all of the issues raised, but for readers who stumble upon this, as of R2020b (and I think R2019b), you can at least add an import wildcard statement at the beginning of each function. This means you can avoid adding the package prefix to every function call within each function. It is pretty trivial to do this using fscanf, strrep, and fprintf, if you want to convert an existing project to a package. It is also trivial to take it one step further and use dir to read all the package function names, and append the package prefix to each function call using the same fscanf, strrep, fprintf workflow. Unfortunately, there remain many limitations to package namespaces (such as lack of tab completion to find package functions, even after a wildcard import, although function hints do work).
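The read, replace, and write workflow mentioned above could look roughly like this. It is a minimal sketch on a throwaway file: the package name `demopkg` and the file contents are hypothetical, and `fileread` is used in place of `fscanf` for brevity.

```matlab
% Create a throwaway M-file that calls REGRESS without a package prefix.
srcFile = [tempname '.m'];
fid = fopen(srcFile, 'w');
fprintf(fid, '%s\n', 'b = regress(y, X);');
fclose(fid);

% Read the source, prefix the call with the (hypothetical) package name,
% and write the result back.
txt = fileread(srcFile);
txt = strrep(txt, 'regress(', 'demopkg.regress(');

fid = fopen(srcFile, 'w');
fprintf(fid, '%s', txt);     % '%s' keeps any % or \ in the source literal
fclose(fid);
```

A plain `strrep` also rewrites longer names that merely end in "regress(", so a real conversion would need a stricter match and a review of the changes.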



Sean de Wolski
Sean de Wolski on 15 Oct 2013
which -all xbuffer
'xbuffer' not found.
We're in the clear! I try to use this all of the time.
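The same check can be done programmatically with the function form of `which`, which returns a cell array when `-all` is given. A small sketch; the second name is deliberately made up and assumed not to exist on any path.

```matlab
% Every definition of the name on the path, one cell per hit.
hits = which('sum', '-all');

% An unused name yields an empty result, i.e. "not found".
free = which('zz_hypothetical_unused_name_42', '-all');

fprintf('sum: %d definition(s) found\n', numel(hits));
fprintf('unused name is free: %d\n', isempty(free));
```

More than one hit does not always mean trouble, since overloaded class methods are listed too.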
  6 Comments
Jan
Jan on 16 Oct 2013
Edited: Jan on 16 Oct 2013
@dpb: A collision is when a user is so surprised by the behavior that he asks in the forum. Anything else is intended behavior.
Naming collisions happen in the real world also. Then either the first character of the middle name is added ("George W. Bush") or a kind of temporal namespace is applied ("Bush senior"). In German the case of words is responsible for the distinction:
Ich habe liebe Genossen: I have dear comrades
Ich habe Liebe genossen: I enjoyed love
Then numbers are appended: "Ludwig XIV." or ".NET 2.0" or an appendix is added describing a property: "Ajax the Lesser" versus "Ajax the Greater". And finally, madness takes its toll: "Attention, a B!" So collisions are replaced by pure confusion and misunderstandings.
dpb
dpb on 16 Oct 2013

Well, given the design of Matlab I'd say all of what has been talked about in the thread is intended behavior. :)

It's just an unfortunate result of initial decisions and that Matlab has turned into something far beyond its original visions wherein it now is a lot more problematical than it was way back when.



dpb
dpb on 15 Oct 2013
I use the strategy of trying not to, then fixing it if I do... :)

Philip Borghesani
Philip Borghesani on 16 Oct 2013
Edited: Philip Borghesani on 16 Oct 2013
As others have pointed out here, avoiding shadowing is impossible, particularly because I have not yet met a programmer who is an oracle.
If a few simple rules are followed then shadowing should not be a problem.
  1. Never write scripts. I only find scripts useful for a few minutes of prototyping before turning the script into a function
  2. Keep functions short. Clean C++ programming presentations I have seen suggest 4 lines or less. I think this is a bit excessive but it is nearly impossible to shadow a function and try to use it in a short function. It would also be quite obvious.
  3. Did I say never write scripts?
It seems to me that most cases of shadowing causing confusion are with the function "sum" and novice users working from the command line. Keeping a tidy base workspace and never writing scripts is the best way around this issue.
Function naming and avoiding collisions is a bigger problem. One technique I have started employing is to use static methods on a class to group a set of related functions in a single file. This ends up looking like a package to the user but simplifies a bit on calling related functions and sharing with others. Local helper functions can be added after the classdef block for shared implementation. This sort of thing keeps the namespace much cleaner.
  3 Comments
Cedric
Cedric on 17 Oct 2013
.. but you could meet oracle programmers though, who won't predict a darn thing!
Jan
Jan on 17 Oct 2013
Rudeness is not accepted here, therefore I do not post my opinion about Oracle programmers.



W. Owen Brimijoin
W. Owen Brimijoin on 10 Jan 2014
I've always taken a stupidly simple approach to this problem: I just type what I am planning to name the variable into the command window and hit enter. If Matlab reports something like
Error using sum
Not enough input arguments.
then I don't use 'sum' as a variable name. Too quick and dirty?
  4 Comments
Bjorn Gustavsson
Bjorn Gustavsson on 10 Jan 2014
When I had my saveas.m before the suddenly appearing MATLAB saveas.m, I could not save my work to ordinary .mat-files. That Mathworks had "shipped a new version with such a horrendous bug" confused me for quite a while. When I understood what was going on, I realized that either I or Mathworks had to modify our code, and that I was likely the one who had to do it. After some further pondering I thought it was better to have MATLAB functions screw over my functions than the other way around.
Sean de Wolski
Sean de Wolski on 10 Jan 2014
The best one I saw was a shadowed assert(). The effect was similar, GUIDE would not let the files be saved.



Jos (10584)
Jos (10584) on 15 Oct 2013
Edited: Jos (10584) on 15 Oct 2013
From the release notes of ML2015a:
- New built-in function xbuffer. So check all your m-files and change the variable xbuffer into something else.
;-)
  1 Comment
Jan
Jan on 15 Oct 2013
@Jos: This stresses that the problem cannot be solved statically. Exhaustive testing with which is not perfect, but a dynamic check would find collisions as soon as R2015a is installed. But by then strread will have dropped out of view, so backward compatibility is the next source of trouble.

