MATLAB Answers

What should go in a next-generation MATLAB X?

Andrew Janke on 11 Sep 2021
Commented: J. Alex Lee about 13 hours ago
Let's say MathWorks decides to create a MATLAB X release, which takes a big one-time breaking change that abandons back-compatibility and creates a more modern MATLAB language, ditching the unfortunate stuff that's around for historical reasons. What would you like to see in it?
I'm thinking stuff like syntax and semantics tweaks, changes to function behavior and interfaces in the standard library and Toolboxes, and so on.
(The "X" is for major version 10, like in "OS X". Matlab is still on version 9.x even though we use "R20xxa" release names now.)
  5 Comments
Andrew Janke on 21 Sep 2021
Personally, I think that the Answers format is particularly well suited to this sort of discussion, because instead of a linear threading model like a regular forum, it allows people to post various suggestions as top-level Answers and to have them be voted on to indicate community interests, and to let each of those suggestions have their own discussion thread hanging off it.


Answers (28)

Andrew Janke on 11 Sep 2021
A possibly radical one:
Semicolons are no longer needed to suppress display of a statement's result. Instead, output is suppressed by default, and if you do want it displayed, you append a "!" (or something else) to the end of the statement. Semicolons are now just statement separators, and you can omit them in most places with no effect.
Maybe this should apply only to function and classdef files, and statement result display is on by default in script files, and you still suppress its display by appending a ";" there.
  19 Comments
Walter Roberson on 20 Oct 2021 at 18:55
By your logic, we might as well add every function attribute that we can possibly think of
[[NeverOnASunday]] [[HighTide]] [[NoSiciliansWhenDeathIsOnTheLine]] [[Alignment=4096]] [[SpellMyNameWithAnS]] [[IFeelOrangeToday]] [[KOI8FilesOnly]] [[NoThreads]] [[NeverGonnaGiveYouUp]] [[OddResults]] [[BaseIndex=0]] [[BirbImagesOnly]]



Jan on 16 Sep 2021
A complete list of changes for each command.
Currently we find "introduced in Rxy" already, but modifications of inputs and outputs are very useful also. Examples: When did unique introduce the 'legacy' flag? When did strncmp change the behaviour for empty strings and n=0?
  1 Comment
Andrew Janke on 16 Sep 2021
This would be useful.
No reason to wait until MATLAB X to start doing it though; MathWorks could add a per-function/class Changelog to the doco any time, I think!



Matt J on 17 Sep 2021
Edited: Matt J on 17 Sep 2021
My wish list:
(1) Colon operator produces column vectors, not row vectors:
x=1:4
x = 4×1
     1
     2
     3
     4
(2) Optimization Toolbox solvers should have only one algorithm per solver, i.e., instead of,
x1=lsqnonlin(fun,x0,lb,ub, optimoptions(@lsqnonlin,'Algorithm','levenberg-marquardt'))
x2=fminunc(fun,x0, optimoptions(@fminunc,'Algorithm','trust-region'))
we would just have
x1=lsqnonlinLevMarq(fun,x0,lb,ub)
x2=fminuncTrustReg(fun,x0)
etc...
(3) The Image Processing and the Computer Vision Toolboxes would be designed around the coordinate conventions of ndgrid() instead of meshgrid().
(4) One-dimensional array types, i.e., with ndims(X)=1.
  12 Comments
J. Alex Lee about 14 hours ago
I agree anything that returns a 1D array of something should default to returning it in a column vector, or at least pick an orientation! Some functions return columns, some return rows. Maybe an alternate approach if you don't want to break the for loop is to add a "for..in" construct that behaves like Matt proposes, just going through every element in linear index order.
The ndgrid vs meshgrid is an interesting one...wouldn't that mean for images we would want to index columns first, then rows? If I haven't got it mixed up, that seems at odds with the previous sentiment about wanting to default to columns?
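For readers following the ndgrid vs meshgrid point, here is a minimal sketch of how the two conventions differ in current MATLAB. The variable names (xv, yv, Xm, Xn, and so on) are only illustrative and do not come from the thread.

```matlab
% Same input vectors, two grid conventions.
xv = 1:3;   % 3 samples along the first coordinate
yv = 1:2;   % 2 samples along the second coordinate

[Xm, Ym] = meshgrid(xv, yv);   % first input varies along columns (dimension 2)
[Xn, Yn] = ndgrid(xv, yv);     % first input varies along rows (dimension 1)

disp(size(Xm))            % meshgrid output is numel(yv)-by-numel(xv)
disp(size(Xn))            % ndgrid output is numel(xv)-by-numel(yv)
disp(isequal(Xn, Xm.'))   % in 2-D the two results are transposes of each other
```

So with ndgrid, the first subscript of the output tracks the first input, which is the convention Matt's item (3) asks the image toolboxes to follow.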



Andrew Janke on 11 Sep 2021
Edited: Andrew Janke on 13 Sep 2021
In MATLAB X, I would like to see:
  • An object display customization API like Python's __str__ and __repr__. (`disp` isn't suitable.) (See The Dispstr API)
  • In mixed-mode arithmetic (combining floats and ints), ints widen to floats instead of narrowing to ints.
  • Integer-looking literals (like 1234) produce ints instead of doubles.
  • Both single-quoted and double-quoted string literals produce string arrays; to get char arrays you need to explicitly call char(...).
  • Every function uses string arrays instead of char vectors or cellstrs in its return values, when not determined by the type of one of the inputs.
  • Figure handle properties use string arrays instead of char vectors.
  • In string literals, backslash escapes are interpreted by the string literal itself, and not by the *printf() functions.
  • import statements have file scope, not function scope.
  • Class properties with (1,1) string validators default to string(missing) instead of the empty string "".
  • There's a date-only localdate type to complement the date + time datetime type.
  • now() and today() return datetime and localdate values, instead of double datenums.
  • For that matter, pretty much every date or time returned by a function is a datetime or localdate instead of a double datenum.
  • Maybe classes and functions in the same package are visible by default, using unqualified names, instead of requiring package qualification or an import statement. (Though this is mostly handled if import gets file scope.)
  • The "`if false or true`" parsing quirk (where the stuff after "false" is considered the first statement inside the if block) is fixed, and the whole "false or true" is considered part of the if condition.
  • File IO is done OOP style, with fopen returning a file object instead of a numeric handle.
  • UTF-8 becomes the default encoding for all external text IO on all platforms.
  • A revamped helptext system for embedding somewhat-formatted, somewhat-structured API reference documentation in source code. The existing helptext format is too simple and loosey-goosey.
  • Maybe chars should become Unicode code points instead of UTF-16 code units, and strings and chars should be stored in Python-style "flexible-width string" format. Would save memory, and make it easier to work with emoji or exotic scripts.
  • The GUI Layout Toolbox's functionality is pulled in to base Matlab, including support for relative positioning and sizing of widgets (like how Java Swing layouts work), and relative positioning layouts become the default (instead of 'normalized' or absolute-units positioning like it is now).
Things I do not want to see:
  • Multithreading.
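For context on a few of the wish-list items above, this is a small sketch of what current MATLAB does today with mixed int/double arithmetic, integer-looking literals, quoted literals, and backslash escapes. The variable names are just for illustration.

```matlab
% Mixed-mode arithmetic: the double operand is narrowed to the integer type.
a = int32(5) + 2.7;
disp(class(a))             % int32 (7.7 is rounded to the nearest integer)

% Integer-looking literals are doubles.
disp(class(1234))          % double

% Single quotes give char, double quotes give string.
disp(class('abc'))         % char
disp(class("abc"))         % string

% Backslash escapes are interpreted by sprintf, not by the literal itself.
raw    = 'a\nb';           % 4 characters: a, \, n, b
cooked = sprintf('a\nb');  % 3 characters: a, newline, b
disp([numel(raw), numel(cooked)])
```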
  27 Comments
Andrew Janke on 24 Sep 2021
I saw this in the release notes and in discussion on the Discord! Very interested! I am going to play with this over the next week or so, and then I'm sure I will have Opinions. Thanks!



Tucker Downs on 11 Sep 2021
Yes! I think in all established products it's occasionally necessary to make major pruning of older functionality for the good of the product / ecosystem. In companies I've worked for we've done this and made plenty of announcements: "Your legacy code might not work in version!!!! but we have guides on how to change it / we will support old matlab for the next X (many many) years."
For the most part it's always been well received.
I'll add
max(2,[]) should not return []
++ incrementing
maps as a more prominent base data type
expose more internal apis for making subclasses for plot objects, like custom arrows
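To illustrate the max(2,[]) item in the list above: in current MATLAB the empty operand wins. The snippet below shows that, plus one possible workaround (concatenating before reducing). The variable names are hypothetical.

```matlab
% Two-argument max with an empty operand returns empty.
r = max(2, []);
disp(isempty(r))

% One workaround: put everything in a single vector and reduce it.
candidates = [];               % may or may not be empty at run time
best = max([2, candidates]);   % the scalar 2 survives when candidates is empty
disp(best)
```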
  4 Comments
Andrew Janke on 11 Sep 2021
I dig your block scoping idea. Establishing scope like this by breaking things up into subfunctions can be a hassle.
I really would want the variables to be cleared at the end of the block, though: then it could be used for RAII/SBRM style resource management using onCleanup and object delete destructors!
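As a rough sketch of the RAII/SBRM idea with onCleanup as it works today: the callback fires when the guard object is destroyed. Here "clear guard" stands in for the end of the hypothetical block scope, and the containers.Map is only an illustrative way to observe the callback.

```matlab
% A handle object lets the cleanup callback leave a visible trace.
state = containers.Map('resourceOpen', true);

% onCleanup runs its callback when the guard object is destroyed.
guard = onCleanup(@() remove(state, 'resourceOpen'));
openBefore = isKey(state, 'resourceOpen');   % guard is still alive here

% Stand-in for "end of block": clearing the guard fires the callback.
clear guard
openAfter = isKey(state, 'resourceOpen');    % callback has already run

disp([openBefore, openAfter])
```

With block scoping that clears variables at the end of the block, the explicit clear would not be needed.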



gwoo on 29 Sep 2021 at 18:14
I don't know the technical name for it but being able to call methods, properties, or indexing without having to make a new variable first. Kind of like in Python where you can call a function that will output an array or whatever and instead of saving it to a variable first and then indexing, you can just index right off the end of the function call. I know you can do this for strings and structs, but not for cells or arrays. Also, being able to perform a series of functions on an array, the way you can now with strings.
For example:
[5, 1, 2](2) = 1
horzcat([3;2;1], [5;6;7])(3,2) = 7
  2 Comments
Michael Zauner on 21 Oct 2021 at 11:41
This would also be really helpful for anonymous functions where you can not define a variable at all.
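Until something like this exists, one possible workaround in current MATLAB is a tiny indexing helper, or the functional form subsref. The helper name "at" is made up for this sketch; subsref and substruct are standard functions.

```matlab
% Hypothetical helper: index into any expression without a temporary variable.
at = @(arr, varargin) arr(varargin{:});

v1 = at([5, 1, 2], 2);                          % second element of a literal
v2 = at(horzcat([3; 2; 1], [5; 6; 7]), 3, 2);   % row 3, column 2 of a call result

% The same idea using the built-in functional form of indexing.
v3 = subsref(magic(3), substruct('()', {2, ':'}));   % second row of magic(3)

disp(v1), disp(v2), disp(v3)
```

Because "at" is itself an anonymous function, it can be used inside other anonymous functions, which covers the case mentioned in the comment above.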



Munin on 29 Sep 2021 at 21:59
Edited: Munin on 30 Sep 2021 at 6:41
An LSP for other IDEs, better documentation of the Python engine, easier install of MEfP using some kind of shell script or dep manager, and a modern IDE UI supporting dark theme.
Also all components like Coder require support from MATLAB's licensing scheme so that they are usable in CI etc.
  7 Comments
Andrew Janke on 30 Sep 2021 at 18:02
For context: It would be nice to have CI runs test your Matlab Compiler builds, because this exercises your build & packaging scripts, and lets you test the actual compiled artifacts, but more importantly IMHO, it verifies that your new M-code code base can actually be compiled. In interactive Matlab, if there are syntax errors in your code, that raises a run-time error when you try to run that line or function or load that class, so it only matters if that code actually gets used, only affects that particular functionality, and unit testing can catch it. But in the Matlab Compiler, if there is a syntax error anywhere in a code base that is included in a compilation, it breaks the build and you cannot deploy your code at all. So errors that a programmer may have introduced and not even noticed in interactive testing can bork your entire system.
It's impractical to test this by compiling interactively, because the Matlab Compiler is so darn slow, and most of your developers probably won't have Matlab Compiler licenses anyway; only a few devs who do your builds/releases (I call these "release engineers") will have Compiler licenses, and their time is probably expensive.



Andrew Janke on 30 Sep 2021 at 23:29
Remove the length function.
Its behavior of "size along the longest dimension, picked at run time" is a little weird, most junior programmers don't expect it, and it leads to subtle bugs that can silently produce incorrect results instead of erroring out. In my 15 years of Matlab programming experience, I've seen so many people call length, and I've never seen one who actually wanted what length does instead of numel or size.
Let everyone just use numel or size instead; those work "safely".
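A quick sketch of the surprise, for anyone who hasn't run into it. The array names are just for illustration.

```matlab
A = zeros(3, 7);   % 3 rows, 7 columns
B = zeros(7, 3);   % 7 rows, 3 columns
E = zeros(0, 5);   % empty, but with 5 columns

% length picks the longest dimension at run time (and 0 for any empty array).
disp([length(A), length(B), length(E)])

% size and numel say exactly what they measure.
disp([size(A, 1), size(B, 1), size(E, 1)])   % row counts
disp([numel(A), numel(E)])                   % element counts
```

Code that uses length(A) as a row count works for B and silently gives the column count for A.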

Walter Roberson on 13 Oct 2021 at 7:58
The ability to assign a subset of fields to a struct (array) would be useful. It is common to want to change a few settings, such as in a user initialization file, or to have a function that is concerned with getting only a subset of properties from the user. There thus might be a struct of updates to be applied to an existing struct. At the moment you have to loop through the fieldnames of the update struct, setting the fields of the existing struct one by one.
The ability to concatenate or assign between structs with the same fields in different orders would be useful. We have the experience of tables to look at: tables re-order as necessary to match the first order.
  1 Comment
Andrew Janke on 18 Oct 2021 at 21:31
+1. struct and object "subset of fields" assignment or "merging" like this is such a common use case in the sort of code that I work with that any nontrivial code base typically ends up with a half dozen different custom helper functions for doing this, each with slightly different behavior.
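For reference, this is roughly what the loop-based workaround looks like today for scalar structs, along with orderfields for lining up field order before assigning into a struct array. The struct contents are made-up examples.

```matlab
% Apply a struct of updates onto a struct of defaults, field by field.
defaults = struct('tol', 1e-6, 'maxIter', 100, 'verbose', false);
updates  = struct('maxIter', 500, 'verbose', true);

merged = defaults;
names = fieldnames(updates);
for k = 1:numel(names)
    merged.(names{k}) = updates.(names{k});   % dynamic field assignment
end

% Same fields, different order: reorder b to match a before assigning.
a = struct('x', 1, 'y', 2);
b = struct('y', 20, 'x', 10);
arr = a;
arr(2) = orderfields(b, a);   % b's fields are put in a's order first

disp(merged)
disp(fieldnames(arr).')
```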



Andrew Janke on 15 Sep 2021
Oh, here's one!
  • Comments can begin with "#" in addition to "%".
This would enable Octave compatibility. But I think that might be to MathWorks's benefit: it would enable you to easily take existing Octave code and migrate your workloads to Matlab, which is the direction that MathWorks would like people to move.
Also enables use of "shebang" lines on Unix, so you could easily create executable commands as Matlab scripts.
  3 Comments
Andrew Janke on 3 Oct 2021 at 20:51
> It's easy enough to do a find/replace for # -> %
How would one do this easily, without mangling # characters that are inside strings or comments?
> I don't believe this should be done for the sake of Octave users.
It is more for the sake of Matlab users that wish to take advantage of existing Octave code.
> Octave should conform to Matlab syntax, not the other way around.
This is likely never happening here, because there's too much Octave code out there using "#" comments, and Octave coders like "#" comments.



Walter Roberson on 17 Sep 2021
Edited: Walter Roberson on 19 Sep 2021
Currently the model of MATLAB is that it always evaluates from left to right [*] finding the left-most unprocessed sub-expression and evaluating it, and then finding and evaluating the right hand side operand, and then performing the operation. The right operand is not processed until the left is evaluated, but unless the left operand results in an error, or the operation is && or || the right will always be evaluated.
[*] exception: there are some funky things with chains of ^ and .^ operators, they are not strictly left to right.
This behavior prevents there from being function forms of if/else operations -- there is no equivalent to C's ?: operation. In C, the unselected operation is not evaluated at all.
The hack work-arounds require embedding the work to be done inside an anonymous function and writing a function like
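function varargout = ifelse(expr, basepart, elsepart)
% Return basepart if expr is true, otherwise elsepart.
% Function-handle arguments are called only when they are selected.
nout = max(nargout, 1);  % always request at least one output
if expr
    if isa(basepart, 'function_handle')
        [varargout{1:nout}] = basepart();
    else
        varargout{1} = basepart;
    end
elseif isa(elsepart, 'function_handle')
    [varargout{1:nout}] = elsepart();
else
    varargout{1} = elsepart;
end
end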
and using that gets ugly... and probably messes up multiple output processing.
Piecewise(x ~= 0, 0, 1./x)
can't be done and would have to look like
Piecewise(x ~= 0, 0, @(x)1./x)
I would like to see a cleaner way of handling this -- one in which the function being called does not need to know that a delayed evaluation is being done.
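To make the delayed-evaluation workaround concrete, here is a minimal sketch using only anonymous functions. The names choose and lazyIf are hypothetical, and this is just one way to get the effect: both branches are wrapped in function handles, and only the selected handle is ever called.

```matlab
% Pick one element of a cell array of function handles.
choose = @(fns, k) fns{k};

% lazyIf calls thenFn when cond is true, otherwise elseFn. Never both.
lazyIf = @(cond, thenFn, elseFn) ...
    feval(choose({elseFn, thenFn}, logical(cond) + 1));

% The error branch is wrapped in a handle and never called when x is zero.
x = 0;
safe = lazyIf(x == 0, @() 0, ...
    @() error('demo:notEvaluated', 'unselected branch ran'));

% With a nonzero x the else branch is the one that gets evaluated.
x = 4;
recip = lazyIf(x == 0, @() 0, @() 1 ./ x);

disp([safe, recip])
```

The caller still has to write the @() wrappers by hand, which is exactly the noise that a language-level delayed evaluation would remove.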
In the Maple programming language, there are two related mechanisms available. First, there is a simple syntax to delay evaluation. This is indicated by using ' ' around the expression. For example,
Piecewise(x <> 0, 0, '1/x')
In Maple, this is not a quoted string: Maple uses double-quotes for strings. Instead it is a delayed evaluation. Each time the relevant expression is evaluated, one level of unevaluation is removed; when it is eventually evaluated in a context where there are not remaining protective uneval() levels, then the expression is evaluated.
Secondly, Maple allows procedures (that is, functions) to declare a parameter as being of type "uneval", which has the effect of adding a layer of uneval around what is passed in. For example,
Piecewise := proc(x, basepart::uneval, elsepart::uneval) #stuff; end proc;
would permit users to code
Piecewise(x <> 0, 0, 1/x)
and the 1/x will not be evaluated before being passed in to the procedure.
Some programming languages deal with these kinds of issues by using "lazy evaluation". Something like
Piecewise(x <> 0, 0, 1/x)
would not evaluate any of the parameters until such time as the code inside Piecewise asked for their value -- so if the code logic did not ask for the value of a particular parameter, it would never be evaluated.
If I understand correctly, tall arrays (created with tall()) already do some delayed evaluation, building up expressions and then internally finding ways to reduce the memory access during evaluation.
  2 Comments
Andrew Janke on 17 Sep 2021
Ooh, yeah. I'd also like lazy evaluation in some cases, especially in the context of a ?: ternary operator.
I use your ifelse function hack a lot, and it's not very satisfactory because it doesn't short-circuit.



Walter Roberson on 19 Sep 2021
This would not break backwards compatibility, but something to consider:
A lot of the time, people try to write
for x = first:increment:last
with non-integer increment. And then they want to
f(x) = value;
but of course x is non-integer so that fails.
There are standard ways of rewriting this: the common
counter = 1;
for x = first:increment:last
f(counter) = value;
counter = counter + 1;
end
or (less likely by far, but cleaner since counter is more sensible)
counter = 0;
for x = first:increment:last
counter = counter + 1;
f(counter) = value;
end
or the formal and flexible
xvals = first:increment:last;
num_x = numel(xvals);
f = zeros(1, num_x);
for xidx = 1 : num_x
x = xvals(xidx);
f(xidx) = value;
end
But... keeping those counters is a bit of a nuisance, and people get them wrong.
So I would suggest something I have seen in a couple of programming languages: that there be an accessible automatic counter. We could imagine, for example,
for x = 0:.01:2*pi
f(#x) = sin(x.^2 - pi/7);
end
where the #x translates as "the number of x values we have processed so far".
Indexing a variety of arrays with the same # would be considered valid, so you could write
for x = 0:.01:2*pi
f(#x) = sin(x.^2 - phase(#x));
end
But now we have a question that might lead to some backwards incompatibility: suppose we have
for x = 0:.01:2*pi
y = 0;
for x = 1 : .5 : 5
y = y + z.^(x-1)./gamma(x+1);
end
f(#x) = sin(x.^2 - y);
end
and the question is: in that f(#x) that is after the nested for x, should the #x refer to
  1. the last index associated with the inner x?
  2. the index after the last one associated with the inner x?
  3. the index associated with the outer x?
Consistency with existing nested for loops would say it should be the first of those, that at any point, this hypothetical #x should refer to the last for index for variable x that was encountered in the flow of execution -- just like the way that the sin(x.^2 - y) is going to use the last x value from the for x = 1 : .5 : 5 .
I would kind of like such an operator to be associated with the innermost enclosing loop so that in this example the f(#x) would be counting relative to the for x = 0:.01:2*pi loop, but I do admit that it would be confusing to have the #x refer to that loop at the same time that the x itself would be what was left over from the for x = 1 : .5 : 5 loop. Also, in a context such as
f = zeros(1,5000);
for x = 0:.01:2*pi
if x.^2 - sin(x) > 1; break; end
f(#x) = acos(x);
end
f(#x+1:end) = [];
then it would make sense for the counter to survive the loop itself, which argues for the status quo of "last value assigned" rather than "according to scope". I think the factors are in tension here.
Now, if we are going to have automatic counters with for loops it might make sense to have automatic counters associated with while loops as well:
x = 0;
while x <= 2*pi & x.^2 - sin(x) < 1
f(#???) = acos(x);
x = x + 0.01;
end
But while loops have no associated variable. So I might suggest
x = 0;
while x <= 2*pi & x.^2 - sin(x) < 1
f(#) = acos(x);
x = x + 0.01;
end
where # by itself is the counter for the innermost enclosing for or while loop. Which would then permit
for x = 0:.01:2*pi
f(#) = sin(x.^2 - phase(#));
end
which is not ambiguous. Now what about nested loops?
for x = 0:.01:2*pi
y = 0;
for x = 1 : .5 : 5
y = y + z.^(x-1)./gamma(x+1);
end
f(#) = sin(x.^2 - y);
end
The innermost enclosing for or while loop would be the outer for x loop... the one the user probably intended in such a context.
With the discussion above about what #x means after the end of a for x loop, this proposed behavior of # by itself would lead to the possibility that at that point, assigning to f(#) would be assigning according to the loop counter for the outer for x, but that assigning f(#x) would be assigning according to the loop counter for the inner for x . That is not ideal for readability, and is likely to lead to confusion.
It seems to me that in some cases, people would want a #x at that point to refer to the outer loop, but people would also sometimes want a #x to refer to the inner for x . It would also not surprise me at all if people wanted both ways at the same time. Of course, if they wanted clarity and readability, they probably should not have used nested for loops with the same variable name !!!
  4 Comments
Andrew Janke on 16 Oct 2021 at 18:51
I would end up using this a lot; I often have to restructure my for loops to get an index to go with the actual value.
What are your thoughts on Python's approach to this with the "for i, x in enumerate(xs)" sequence generator and multiple assignment for for loops? I could see Matlab doing something like this. Let's say you have some parallel arrays xs, ys, and zs, you could do:
for [i, x, y, z] = enumerate(xs, ys, zs)
and get an index plus the ith element from each of the input arrays. This might generalize to creating other loop-pass-dependent variables using "generator functions" or the like.
This doesn't help with the while case; I still like the convenience of an implicit # and don't see how to handle that using the "generator function" approach.
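For comparison, one way to approximate enumerate in current MATLAB is to loop over the columns of a two-row matrix that stacks the indices on top of the values. This is only a sketch for numeric row vectors; the names pair, i, x, and f are illustrative.

```matlab
xs = 0:0.5:2;              % non-integer values we want to loop over
f  = zeros(size(xs));      % preallocated result

% Each loop pass sees one column: [index; value].
for pair = [1:numel(xs); xs]
    i = pair(1);           % position of this element in xs
    x = pair(2);           % the element itself
    f(i) = x^2;
end

disp(f)
```

It avoids the hand-maintained counter, but it only works when everything fits in one numeric matrix, so it is not a general substitute for the proposed syntax.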



Image Analyst on 23 Sep 2021
I'd like an easier way to enter 2-D matrices interactively. The current way with inputdlg() or input() is not WYSIWYG and is very clunky and non-intuitive (do I put brackets, parentheses, commas, semicolons? No clue!) We need something like
% Pop up a modal dialog box with a 4 by 5 grid (worksheet) where users can enter values:
m = inputmatrix('Enter your values', 4, 5);
  4 Comments
Andrew Janke on 16 Oct 2021 at 18:19
Excel has a couple limitations here. It requires an Excel license and installation, doesn't work at all on Linux or Matlab Online, and is difficult to automate on Mac because you don't have COM/ActiveX automation. Not all users or execution environments are going to have that.
@Image Analyst This is the sort of thing that I or another developer could probably whip up with user M-code for current Matlab. Would this be something you'd be interested in if I wrote it up as a library? It'd be Java code, so it'd work now, but Jexit is coming so that's going to be less useful in the near future. And I don't think it'll work for Matlab Online or web-based Matlab presentation contexts, but I don't know how to do custom GUI components using the new web-tech-based Matlab GUI stuff.



Walter Roberson on 18 Oct 2021 at 17:28
Edited: Walter Roberson on 20 Oct 2021 at 19:40
If, hypothetically, a new assignment operator were created that allowed the user to manage
A = object_of_class_B
inside class B, something along the lines of
function target = assign(obj, target) %obj being the object of the class
then that could perhaps have some advantages.
But what should the semantics be ? What would the use-cases be?
  • such a thing could potentially make resource tracking easier
  • there might be reason to warn about assigning between unlike data types. For example if A were uint8 but class B carried int8 then you might want a warning about negative values being truncated
  • not sure what else...
If such an operator existed, you would need a way to distinguish the case where the target was a location that did not exist yet.
Hypothetically that could be handled by nargin < 2 or exist('target', 'var') being false.
But hypothetically perhaps there would be reasons to instead associate each name with a class such as UnassignedLocation, and then isa(target, 'UnassignedLocation')
An existing target of an assignment should definitely be made available inside such a function, so that its datatype can be examined, and resources poked around at.
There is commentary somewhere along the lines that if the target of an assignment is a class name or static method of a class, then the class cannot have influence on what the assignment means: that otherwise the statement
A = B
could change its meaning if a new class A were introduced. I think the implication of that is that there should not be an operator introduced that intercepted assignments onto a class. But possibly I have overlooked some reason why the kind of assignment operator I describe here should not be created.

Jim Svensson on 15 Sep 2021
Most important
  • Start indexing from 0
  • Redo package system
  • Improve the class system
  • Improve language a bit (like value += delta)
  13 Comments
Gregory Vernon on 3 Oct 2021 at 21:13
Just cracked open my nonlinear continuum textbook and am greeted with a glut of things like:
[equations not preserved in this copy]
that are based on 1-based indexing. Point being that switching to zero-based indexing will just cause pain for people in technical fields that are one-based. To me, seeing [symbol not preserved] means an initial stress state, while [symbol not preserved] means the maximum principal stress. Two entirely different concepts.
Going along with this exercise, instead of changing existing functionality, I'd rather see an extension to functionality -- maybe allowing specifying whether an array/vector is a one-based or zero-based array:
x = zeros(10,1); % A standard 1-based indices array
y = zeros(10,1,"startIndex", 0); % A 0-based indices array
So for something like getting an array of Legendre basis polynomials (which are 0-based indexing) instead of:
P = legendreBasisArray(3, sym("x"))
disp(P(1)) % Display the constant Legendre polynomial, which is usually referred to as P_0(x)
1
function P = legendreBasisArray(p,variate)
P = sym(zeros(p+1,1));
for n = 0 : p
P(n+1) = 1/((2^n)*factorial(n))*diff((variate^2-1)^n,variate,n);
end
P = simplify(P);
end
You'd instead do something like:
P = legendreBasisArray(3, sym("x"))
disp(P(0))
function P = legendreBasisArray(p,variate)
P = sym(zeros(p+1,1), "startIndex", 0);
for n = 0 : p
P(n) = 1/((2^n)*factorial(n))*diff((variate^2-1)^n,variate,n);
end
P = simplify(P);
end
The challenge here, though, would be that now I'd have to remember that P is zero-based and after a few years and 100k more lines of code, whoever takes over will probably be very confused. Personally, I don't think it's worth the effort and would rather just figure out once whether I need to use n+1 or n in my loops.
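A middle ground that works today, sketched here without the Symbolic Math Toolbox: keep the storage 1-based and hide the n+1 behind a small accessor. The names P and Pn are illustrative, and the polynomials are stored as coefficient vectors for polyval rather than as sym objects.

```matlab
% Legendre polynomials P_0, P_1, P_2 as polyval coefficient vectors.
P = {1, [1 0], [1.5 0 -0.5]};   % 1,  x,  (3x^2 - 1)/2

% Zero-based accessor: Pn(n) returns the coefficients of P_n.
Pn = @(n) P{n + 1};

% Every Legendre polynomial equals 1 at x = 1.
vals = [polyval(Pn(0), 1), polyval(Pn(1), 1), polyval(Pn(2), 1)];
disp(vals)
```

The off-by-one lives in exactly one place, so the calling code can use the textbook index.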



Andrew Janke on 18 Sep 2021
Convenience thing:
  • The fieldnames function returns a string row vector, not a cellstr column vector, so you can loop over struct fields with for fld = fieldnames(s) instead of for fld = string(fieldnames(s)'), which is uglier.
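To show why the transpose and conversion are needed today, here is a small sketch. The struct s and the variables visited and passes are made up for the example.

```matlab
s = struct('alpha', 1, 'beta', 2, 'gamma', 3);

% Current idiom: transpose to a row and convert to string, then loop.
visited = strings(1, 0);
for fld = string(fieldnames(s)')
    visited(end + 1) = fld;   % fld is a scalar string on each pass
end

% Without the transpose, for iterates over columns, so the loop body
% runs once with the whole 3-by-1 cellstr as the loop variable.
passes = 0;
for fld = fieldnames(s)
    passes = passes + 1;
end

disp(visited)
disp(passes)
```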
  2 Comments
J. Alex Lee about 13 hours ago
Alternatively, just return all 1D results into a column (or a row if you insist) and change how for loops work, or [maybe less desirable] have an un-oriented 1D array?



Andrew Janke on 18 Sep 2021
Parallel array iteration!
Let's say I've got some arrays in variables x, y, and z, with the same number of columns.
I'd like to be able to say this:
for (x_i, y_i, z_i) = (x, y, z)
% ... do stuff ...
end
Instead of this:
for i = 1:size(x,2)
[x_i, y_i, z_i] = deal(x(:,i), y(:,i), z(:,i));
% ... do stuff ...
end
  4 Comments
Andrew Janke on 29 Sep 2021 at 19:11
Oh, that num2cell thing is a nice trick!
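The comment being replied to isn't shown here, so this is only a guess at what a num2cell-based approach might look like: split each array into a row of column cells, stack the rows, and let for walk the columns of the resulting cell array. The sample data and the totals variable are made up.

```matlab
x = [1 2; 3 4];
y = [10 20; 30 40];
z = [100 200; 300 400];

totals = zeros(1, 0);
% num2cell(A, 1) yields one cell per column of A; stacking gives a 3-by-n cell.
for c = [num2cell(x, 1); num2cell(y, 1); num2cell(z, 1)]
    [x_i, y_i, z_i] = c{:};   % unpack the i-th column of each array
    totals(end + 1) = sum(x_i) + sum(y_i) + sum(z_i);
end

disp(totals)
```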
