Dynamically increasing property size in object

TAB on 10 Oct 2021
Hello Everyone,
I know that dynamically increasing array size (without pre-allocation) is not good for performance, and there are several discussions about it in this forum. But is this also true for a property inside a handle class?
I am trying to write a class where I can dynamically increase the size of a property. A simplified example is below:
classdef MySignal < handle
    properties (Access = public)
        Name = ''
        Source = 'DAQ11'
    end
    methods
        function obj = MySignal(name)
            obj.Name = name;
        end
    end
end
Another class will use the MySignal class:
classdef DataCollection < handle
    properties
        AquiredBy = ''
        Sig = MySignal.empty; % Create empty array of MySignal
    end
    methods % Constructor
        function obj = DataCollection(user_name)
            obj.AquiredBy = user_name; % Assign to the property, not a local variable
        end
    end
    methods
        function AddSignal(obj, name) % Function to add signal in list on demand
            obj.Sig = [obj.Sig MySignal(name)];
        end
    end
end
The DataCollection class will be used to add signals later:
data = DataCollection('user_1');
data.AddSignal('Signal_1');
%....
%.... signal search, validation, comparison etc.
data.AddSignal('Signal_2');
%....
%....
The number of signals is large, and pre-allocation is not possible.
Is this a good approach, or is there a better option available?
(Using Matlab R2019b)
Thanks
Walter Roberson on 10 Oct 2021
What assurance do you have that you will not fill up your memory?


Accepted Answer

Walter Roberson on 10 Oct 2021
classdef MySignal < handle
Okay, that's a handle
function AddSignal(obj, name) % Function to add signal in list on demand
obj.Sig = [obj.Sig MySignal(name)];
and that triggers adding additional information on to the end of an array of handles that is stored as obj.Sig
An array of handles is stored in components. One component is header information saying what the data class is (handle); another component might indicate exactly what the handle class is; another component is header information with size information; there are various flags such as indicating whether the array is sparse or complex (which of course would both be false for arrays of handles.)
And then there is a component which is a pointer to a block of memory to store the data -- where each piece of data is a single handle. Somewhere in the array of handles, there is a block of memory which looks like
pointer to handle #1
pointer to handle #2
pointer to handle #3
The exact datatype of these pointers is probably a 64-bit memory address.
When you add a new handle on the end (say, handle #4), MATLAB has to allocate a new object that is mostly a copy of the old object obj.Sig. The components describing data type and flags have to be copied. The size of the existing data storage (the pointers to the handles) is examined, a new block of memory is allocated that is the same size plus enough room to store one more handle, and the new size is written into the components of the new object. Then the existing pointer addresses are copied out of the old data block and into the new data block, which fills all except the last entry, and the address of the handle from MySignal(name) is written into that last entry. This unnamed block of component information now holds everything that needs to be stored. What is left is to replace the address of the old obj.Sig component block with the address of the new unnamed block, and then to release the old component block for deallocation.
So, to grow anything in place is the same basic process: a copy of the existing header has to be created, larger memory for the data itself has to be created, existing data has to be copied into the new location, new location then has the additional information written to the end of it, association with name is updated if appropriate, old block is released.
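As a rough illustration of those steps, here is a minimal sketch that emulates them by hand on a small numeric array. It is only a MATLAB-level analogy under the assumption that the description above is accurate; the real work happens inside MATLAB's native code, and the variable names oldData and newData are made up for this example.

```matlab
% Conceptual emulation only: MATLAB performs these steps internally.
oldData = [10 20 30];           % existing data block
n = numel(oldData);

newData = zeros(1, n + 1);      % 1. allocate a block with room for one more
newData(1:n) = oldData;         % 2. copy the existing elements across
newData(n + 1) = 40;            % 3. write the new element at the end
oldData = newData;              % 4. rebind the name; the old block can be released
clear newData
```

Every append written as oldData(end+1) = value conceptually pays for steps 1 through 4 again.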
This is not free.
There are techniques for reducing the costs, such as by allocating in chunks and keeping track of how much has actually been used; when a chunk gets full, add on a new chunk instead of just a single value.
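One possible way to sketch the chunked approach is shown below. ChunkedList, Items, Count, ChunkSize, add and used are hypothetical names invented for this example, not part of the original classes or of any MATLAB API, and the chunk size of 1000 is an arbitrary assumption. A cell array is used for storage so that unused slots can exist without constructing placeholder objects.

```matlab
classdef ChunkedList < handle
    % Hypothetical helper (not from the original post): grows its storage
    % in chunks and tracks how many slots are actually in use.
    properties (SetAccess = private)
        Items = {}          % cell storage; may have unused trailing slots
        Count = 0           % number of slots currently in use
    end
    properties
        ChunkSize = 1000    % slots added whenever storage fills up (assumed value)
    end
    methods
        function add(obj, item)
            if obj.Count == numel(obj.Items)
                % Storage is full: extend by a whole chunk, not a single slot.
                obj.Items = [obj.Items, cell(1, obj.ChunkSize)];
            end
            obj.Count = obj.Count + 1;
            obj.Items{obj.Count} = item;
        end
        function items = used(obj)
            % Return only the slots that have been filled.
            items = obj.Items(1:obj.Count);
        end
    end
end
```

In the question's setting, a call might look like list = ChunkedList; list.add(MySignal('Signal_1'));. The reallocate-and-copy step still happens, but only once per chunk rather than once per signal. Growing by a multiple of the current size (for example doubling) instead of a fixed chunk is a common variation that reduces the number of copies further.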
Still... the technique you are using, of creating vectors of handles to the actual signal data, is less expensive than storing all of the signal data into a growing array. The vector-of-handles approach only has to copy the headers and the pointers, rather than having to copy any of the content pointed to.
When you have a numeric array, it is not that the operation X(end+1) = 7; is very expensive in itself. But the first time (starting with the empty array) one numeric value needs to be copied; the next time, two numeric values need to be copied; the third time, three numeric values... so after the third time, 1+2+3 = 6 values have been copied. After the Nth time, 1+2+3+...+N values have been copied, which is N*(N+1)/2, roughly proportional to the square of N. And that is what adds up.
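To put a number on that, the following small script tallies the copies using the same counting convention as the paragraph above (the k-th append is charged k values). N = 10000 is just an example value.

```matlab
N = 10000;                      % number of appends (example value)
copied = 0;
for k = 1:N
    copied = copied + k;        % the k-th append is charged k values
end
closedForm = N * (N + 1) / 2;   % same total from the formula
fprintf('%d appends -> %d values copied\n', N, copied);
```

For N = 10000 both copied and closedForm come to 50005000, so ten thousand single-element appends account for about fifty million value copies.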
The vector-of-handles approach has the same issue: one handle, two handles, three handles, ... N'th time, N*(N+1)/2 handles have been copied to grow that large incrementally. If N is small, that probably doesn't matter. If N is tens of thousands, it definitely does -- even though this approach is a lot less expensive than copying around the signal contents themselves, management of large numbers of signals has its costs.
TAB on 10 Oct 2021
@Walter Roberson, thanks a lot for the detailed explanation.
This is what I was trying to understand.
