Odd behavior of unstack in Matlab2025b

Question


Hello,

When calling the unstack function with @sum as the AggregationFunction for numeric values in R2025b, I get the warning below, and the empty groups get a zero (from summing a 0-by-1 input):

Warning: When a group has no rows for a given value of the indicator variable, UNSTACK calls the supplied aggregation function with an input of size 0-by-1 instead of automatically filling the value.

Review the output to ensure desired result is obtained. This warning might be removed in a future release.

However the function documentation states the following:

Missing value of the appropriate data type, such as a NaN, NaT, missing string, or undefined categorical value.

This is the behavior I used to get in previous MATLAB versions (e.g. R2019b), where empty groups produced NaNs, and it is what I would have expected to happen here, since it's what the documentation suggests.

Any thoughts on why this might be happening, or whether I can change the behavior so that empty groups give NaNs instead of having the aggregation function called on a 0-by-1 input?


Answer 1

Matt J on 20 Mar 2026 at 19:13

Edited: Matt J on 20 Mar 2026 at 19:21


The change occurred in R2020a and was documented:

https://www.mathworks.com/help/releases/R2024b/matlab/ref/table.unstack.html?overload=table%2Funstack+true&searchPort=53348#mw_8f433421-06b5-4021-8a95-e68029e75fe9

The workaround would be to stipulate an aggregation function that returns NaN for empty inputs, e.g.,

Date = [repmat(datetime('2008-04-12'),6,1);...
        repmat(datetime('2008-04-13'),5,1)];
Stock = categorical({'Stock1';'Stock2';'Stock1';'Stock2';...
                     'Stock2';'Stock2';'Stock1';'Stock2';...
                     'Stock2';'Stock1';'Stock2'}  ,{'Stock1';'Stock2';'Stock3'});
Price = [60.35;27.68;64.19;25.47;28.11;27.98;...
         63.85;27.55;26.43;65.73;25.94];
S = timetable(Date,Stock,Price);
S([8,9,11],:)=[]
S = 8×2 timetable
       Date        Stock     Price
    ___________    ______    _____

    12-Apr-2008    Stock1    60.35
    12-Apr-2008    Stock2    27.68
    12-Apr-2008    Stock1    64.19
    12-Apr-2008    Stock2    25.47
    12-Apr-2008    Stock2    28.11
    12-Apr-2008    Stock2    27.98
    13-Apr-2008    Stock1    63.85
    13-Apr-2008    Stock1    65.73
U = unstack(S,'Price','Stock', Aggregation=@(z) sum(z)/~isempty(z))
U = 2×2 timetable
       Date        Stock1    Stock2
    ___________    ______    ______

    12-Apr-2008    124.54    109.24
    13-Apr-2008    129.58       NaN


dpb 41 minutes ago

Edited: dpb 12 minutes ago


While it is default behavior and convention for sum([]) to return zero, doing so creates something out of nothing. In many cases it is crucial to know that a given bin had no data, rather than that the data in that bin was identically zero (or at least summed to zero).

I don't have a release prior to R2017b installed; in the olden days the initial developer was generally listed in the header comments and very often Cleve was the original designer/implementer. Now there's just the copyright message, so I can't see that history. I'd certainly not classify the behavior as a bug.

Of course, many common statistics do signal an empty group on their own, either because they end up dividing by zero (giving NaN) or because they return the empty result instead of zero.

Date = [repmat(datetime('2008-04-12'),6,1);...
        repmat(datetime('2008-04-13'),5,1)];
Stock = categorical({'Stock1';'Stock2';'Stock1';'Stock2';...
                     'Stock2';'Stock2';'Stock1';'Stock2';...
                     'Stock2';'Stock1';'Stock2'}  ,{'Stock1';'Stock2';'Stock3'});
Price = [60.35;27.68;64.19;25.47;28.11;27.98;...
         63.85;27.55;26.43;65.73;25.94];
S = timetable(Date,Stock,Price);
S([8,9,11],:)=[];
U = unstack(S,'Price','Stock','AggregationFunction',@std)
U = 2×2 timetable
       Date        Stock1    Stock2
    ___________    ______    ______

    12-Apr-2008    2.7153    1.2398
    13-Apr-2008    1.3294       NaN
U = unstack(S,'Price','Stock','AggregationFunction',@max)
U = 2×2 timetable
       Date        Stock1    Stock2
    ___________    ______    ______

    12-Apr-2008    64.19     28.11 
    13-Apr-2008    65.73       NaN 

In fact, it is @sum that is more or less the "odd man out" in its behavior relative to other functions. Off the top of my head, I really can't think of another that doesn't just return empty.
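For reference, here is a quick check of what a few built-ins return for the 0-by-1 input that unstack passes for an empty group. This is a minimal sketch; the variable names are illustrative only.

```matlab
% A 0-by-1 double, the shape unstack hands to the aggregation function
% when a group has no rows for a given indicator value.
z = zeros(0,1);

s = sum(z);    % additive identity: 0
p = prod(z);   % multiplicative identity: 1
m = mean(z);   % 0/0, so NaN
x = max(z);    % nothing to pick from, so the result is empty
```

Only the functions that return a finite value here (sum and prod above) can make an empty group look like real data.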

Matt J about 1 hour ago

Edited: Matt J about 1 hour ago


Here's a simple wrapper to achieve uniformly NaN-on-empty behavior:

S([8,9,11],:)=[]
S = 8×2 timetable
       Date        Stock     Price
    ___________    ______    _____

    12-Apr-2008    Stock1    60.35
    12-Apr-2008    Stock2    27.68
    12-Apr-2008    Stock1    64.19
    12-Apr-2008    Stock2    25.47
    12-Apr-2008    Stock2    28.11
    12-Apr-2008    Stock2    27.98
    13-Apr-2008    Stock1    63.85
    13-Apr-2008    Stock1    65.73
U = Unstack(S,'Price','Stock' ,AggregationFunction=@sum)
U = 2×2 timetable
       Date        Stock1    Stock2
    ___________    ______    ______

    12-Apr-2008    124.54    109.24
    13-Apr-2008    129.58       NaN
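function [varargout]=Unstack(S,vars,ivar, varargin)
  % Wrapper around unstack that returns NaN for empty groups.
  % Collect the name-value arguments into a struct.
  args=struct(varargin{:});
  if ~isfield(args,'AggregationFunction')
      args.AggregationFunction=@sum;  % same default function as unstack
  end
  
  f=args.AggregationFunction;
  % numel(z)/numel(z) is 1 for nonempty z and 0/0=NaN for empty z.
  args.AggregationFunction=@(z)f(z)*(numel(z)/numel(z));
  
  % Convert the struct back to name-value pairs and forward to unstack.
  varargin=namedargs2cell(args);
  
  [varargout{1:nargout}]=unstack(S,vars,ivar, varargin{:});
  
end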

dpb about 2 hours ago

Edited: dpb about 1 hour ago

I left out "computational" in the prior statement, presuming it would be implied/assumed. prod() is the only one I hadn't thought of that I would consider pertinent here. It would have been ideal for my prior test/illustration if I had thought of it at the time.

format bank, format compact
Date = [repmat(datetime('2008-04-12'),6,1);...
        repmat(datetime('2008-04-13'),5,1)];
Stock = categorical({'Stock1';'Stock2';'Stock1';'Stock2';...
                     'Stock2';'Stock2';'Stock1';'Stock2';...
                     'Stock2';'Stock1';'Stock2'}  ,{'Stock1';'Stock2';'Stock3'});
Price = [60.35;27.68;64.19;25.47;28.11;27.98;...
         63.85;27.55;26.43;65.73;25.94];
S = timetable(Date,Stock,Price);
S([8,9,11],:)=[];
U = unstack(S,'Price','Stock','AggregationFunction',@prod)
Warning: When a group has no rows for a given value of the indicator variable, UNSTACK calls the supplied aggregation function with an input of size 0-by-1 instead of automatically filling the value. Review the output to ensure desired result is obtained. This warning might be removed in a future release.
U = 2×2 timetable
       Date        Stock1      Stock2  
    ___________    _______    _________
    12-Apr-2008    3873.87    554502.60
    13-Apr-2008    4196.86         1.00

To me, the above is by far the more hazardous behavior, especially if Mathworks were to ever actually remove the warning and let the above result go silently. My opinion is still that no data is no data and silently returning a finite value in its place is abadidea™.
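One way to guard against this, as a minimal sketch rather than anything taken from the unstack documentation, is to append a NaN to the input only when it is empty, so that the aggregation function sees a NaN instead of nothing. The handle name nanIfEmptyProd is hypothetical.

```matlab
% Hypothetical guarded aggregation function for prod.
% NaN(double(isempty(z)),1) is 1-by-1 when z is empty and 0-by-1 otherwise,
% so nonempty inputs pass through unchanged.
nanIfEmptyProd = @(z) prod([z; NaN(double(isempty(z)),1)]);

emptyResult = nanIfEmptyProd(zeros(0,1));   % NaN instead of 1
normalResult = nanIfEmptyProd([2;3]);       % 6, same as prod([2;3])
```

The handle can be passed as the 'AggregationFunction' value. Note that the trick assumes a column-vector input and an aggregation function that propagates NaN; something like sum(z,'omitnan') would discard the appended NaN and return zero again.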

Of the informational functions, numel() and nnz() are certainly expected; any() I grok, but all() looks aberrant to me despite being documented as such, given the description "Determine if all array elements are nonzero or true". Certainly there are none of either in an empty set, so I'm not sure how it would have been determined to return true for it but false for any(). But, they didn't ask... <g>

Anyways, interesting sidebar and I'll add your Unstack to my Utilities directory of generally useful functions.

Matt J about 1 hour ago

Edited: Matt J 43 minutes ago

The doc for all doesn't say that...it says "Determine if all array elements are nonzero or true". There are no true or nonzero elements in [].

The doc could be worded more transparently, but those two things aren't really logically contradictory. If you don't believe that all elements of the empty matrix are nonzero or true, then point out which element violates this. This is known as a vacuous truth.

Another reason that all([])=true is an appropriate convention is to be consistent with De Morgan's Laws. I think you said you believe these are appropriate:

any([])=false

~[] = []

If so, then from De Morgan's Laws,

all([]) = ~any(~[]) = ~any([])=true

Finally, I would note that Matlab is not alone in adopting this convention. It is also pretty consistent with other programming languages.

Stephen23 38 minutes ago

Edited: Stephen23 31 minutes ago

Matt J is correct, these are not aberrations, these are mathematically consistent results (hint: identity element):

prod([])
ans = 1
sum([])
ans = 0

Let's look in more detail at ANY and ALL. Disregarding dimensions and the like, ALL must satisfy this equivalence:

all([A,B]) = all(A) && all(B)

In essence, this is what ALL means (for any division of some set into some arbitrary subsets A & B), so this equivalence must be true. This includes empty subsets, therefore specify B=[] and we get:

all([A,[]]) = all(A) && all([])

which can only be equivalent when ALL([])==TRUE. Similarly for ANY:

any([A,B]) = any(A) || any(B)

This includes empty subsets, therefore specify B=[] and we get:

any([A,[]]) = any(A) || any([])

which can only be equivalent when ANY([])==FALSE. De Morgan's law also constrains both values jointly:

all(A) = ~any(~A)

given ~[] = []. So the two outputs are not independent: they must be logical negations of each other! MATLAB implements the only mathematically consistent definitions of these operations.
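The equivalences above can be checked directly. The following is a minimal sketch with an arbitrary example vector A (the variable names are illustrative):

```matlab
A = [true false true];   % arbitrary nonempty example
B = [];                  % the empty subset

% Splitting a set into A and an empty B must not change the answer.
allSplitOk = all([A,B]) == (all(A) && all(B));
anySplitOk = any([A,B]) == (any(A) || any(B));

% De Morgan's law on the empty input: all([]) equals ~any(~[]).
deMorganOk = all(B) == ~any(~B);
```

All three flags come out true only because all([]) is true and any([]) is false.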


Answer 2

dpb on 20 Mar 2026 at 19:13

Edited: dpb on 20 Mar 2026 at 20:19


What was the exact syntax you used? The way I interpret unstack doc for the aggregation function result with a missing indicator is that the aggregation function must be defaulted -- specifying the @sum isn't the same as defaulting even though it is the default function. See the followup comment to @Matt J's example.

For anything other than summing, however, you would have to create a function similar to his example.


Roberto Obergon about 3 hours ago

Thank you all for the quick feedback! @dpb hit the nail on the head on my issue. I was adding 'AggregationFunction',@sum to the function call. If I leave that out, it defaults to summing and I get the desired behavior, i.e., missing groups show up as NaN instead of the function being called on a 0-by-1 input.
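A minimal sketch of the difference, using a small made-up table (the variable names and data are illustrative, and the results noted in the comments are those reported in this thread for R2020a and later):

```matlab
% Day 2 has no rows for stock B, so that group is empty.
Day   = [1;1;1;2];
Stock = categorical({'A';'B';'A';'A'});
Price = [10;20;30;40];
T = table(Day,Stock,Price);

% Default aggregation: unstack fills the empty group with NaN.
Udefault = unstack(T,'Price','Stock');

% Explicit @sum: unstack calls sum on a 0-by-1 input, which gives 0
% and triggers the warning quoted in the question.
Uexplicit = unstack(T,'Price','Stock','AggregationFunction',@sum);
```

In both tables the nonempty groups are summed identically; only the Day 2 entry for B differs.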
