
Problem importing files -- MATLAB version changes

Hi all,
I just updated my license from 2014a to 2017b and I've noticed that the uiimport command is acting differently. I will post sample data for what I typically work with.
Looking through my code, you will notice that all of my variables are defined by the first-row elements in the sample data. The 2014a version would read and import first line as the variable and the columns as the variable data. Now, the import is importing the WHOLE file into the workspace instead of individual variables like I've just described.
Is there a way to make the 2017b version act like the 2014a version? Or is there a command that can generically pick out columns and assign each one to a variable? For example:
a = examplefile(:,1)
b = examplefile(:,2)
etc.
More specifically, how would I be able to call out a generic file? The file I've posted is just a sample, and I will be running other data files too.

Accepted Answer

dpb
dpb on 17 Apr 2018
Edited: dpb on 17 Apr 2018

OK, the mystery is solved; the root cause is uiimport with its "new and improved" flavor wanting to use the table class. Even if you select column vectors, it turns text columns that have many more rows than unique values into categorical variables instead of char/cellstr. The initial code was written with string comparison functions, which fail (silently, unfortunately) on categorical variables.

The short-term workaround is to convert the categorical variable to cellstr and carry on. I'll add this here to point out there really are some advantages if you allow Matlab to have its way. Consider the Details array; as you were using it, you presently do string comparisons to locate certain pieces of information, as in

Q = repmat('_1/4',[WrdVec,1]);
AQ=strcmp(Details,Q);

As previously noted, you do not need the Q array to do the comparison; comparing the array to a single string does the same job with much less memory (the length of Q in this case is over 16K). IOW,

Q = '_1/4';
AQ=strcmp(Details,Q);

is identical in result. You could even do without the temporary Q, although I can see good reason not to bury "magic strings" in the code itself, so that's not a bad factoring.

OTOH, if you let Matlab go ahead and convert the input to categorical, you can then do the same thing as

Q = '_1/4';
AQ=(Details==Q);

and eliminate the comparison function itself in favor of the equality operator.
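As a minimal sketch of the two options side by side, here is a made-up five-element sample standing in for the imported Details column (the values are illustrative, not taken from the posted file). It compares the categorical array directly with ==, and also shows the cellstr conversion mentioned above as the short-term workaround.

```matlab
% Made-up sample standing in for the imported Details column.
Details = categorical({'_1/4'; '_1/8'; '_1/4'; '_WOT'; '_Half'});
Q = '_1/4';

% Option 1: compare the categorical array directly with the == operator.
AQ = (Details == Q);

% Option 2 (short-term workaround): convert to cellstr and keep strcmp.
AQcell = strcmp(cellstr(Details), Q);
```

Both give the same logical vector, so the rest of a script that uses AQ for logical addressing should not need to change.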

Categorical variables have many other helpful uses in categorizing functionality as well that could turn out very useful besides just logical addressing. One nice feature that could be very helpful in debugging or otherwise looking at your data is

>> summary(Deteils)
   Deteils                 1 
   _0.1G                1789 
   _0.2G                 667 
   _0.3G                 819 
   _0.4G                 847 
   _0.500                  1 
   _1/4                  812 
   _1/8                  889 
   _100                   50 
   _500                  563 
   _65                    50 
   _8000                5130 
   _Alarm                100 
   _GPS receive            1 
   _Half                 566 
   _WOT                  495 
   _3/8                  678 
   _5000                  47 
   _Complete(HSTT)         1 
   <undefined>          2954 
>>

that produces a summary of the categories; easy to find a miscoded entry that way, maybe, if nothing else? (I just imported the file, didn't run the script so hadn't renamed the array you'll notice.)
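If the counts are wanted in variables rather than just on screen, something along these lines may help. It is only a sketch on a made-up sample (not the posted file); categories, countcats and isundefined are the standard categorical functions, and the "occurs only once" test is just one possible way to flag suspect entries.

```matlab
% Made-up sample; the empty entry becomes <undefined> on conversion.
Details = categorical({'_1/4'; '_1/8'; '_1/4'; ''; '_WOT'});

names  = categories(Details);         % category names, sorted
counts = countcats(Details);          % number of elements per category
nUndef = nnz(isundefined(Details));   % entries that belong to no category

% Categories that occur only once are candidates for miscoded entries.
rare = names(counts == 1);
```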

All in all, I'd suggest revisiting the subject and consider perhaps using some of the newer features instead of beating MATLAB totally into submission.

  3 Comments
dpb
dpb on 19 Apr 2018
OK, I think I can prove what the Time field is and it's what I thought it should be based on the text format--it's MM:SS.S; NOT hours so all your time scaling is bogus I'm virtually positive.
I figure this from the spreadsheet data--first two data records are
Time Distance(km) Speed(kph)
03:20.0 0 54.44
03:20.1 0.001509 54.33
km/(km/hr) --> hr * 3600 --> sec
>> 0.001509/54.4*3600
ans =
0.0999
>>
which is almost identically 0.1 sec, the difference between the two times as text assuming it is, as it looks like, MM:SS.S. The next data record works out at ~0.2 sec; also in agreement with the recorded time stamp.
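The same check can be scripted. This is a rough sketch using only the two records quoted above; parsing the stamps by hand with sscanf and using the mean of the two speeds are my choices for illustration, not something from the original script.

```matlab
% First two data records quoted above.
stamps   = {'03:20.0', '03:20.1'};
distKm   = [0, 0.001509];
speedKph = [54.44, 54.33];

% Parse "MM:SS.S" by hand into seconds.
secs = zeros(size(stamps));
for k = 1:numel(stamps)
    parts   = sscanf(stamps{k}, '%d:%f');   % [minutes; seconds]
    secs(k) = 60*parts(1) + parts(2);
end

dtStamp = diff(secs);                              % step implied by the time stamps
dtCalc  = diff(distKm) / mean(speedKph) * 3600;    % km/(km/hr) --> hr --> sec
```

If the Time field really were hours, dtStamp and dtCalc would disagree by orders of magnitude instead of both landing near 0.1 sec.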
I'm out of time tonight, but will look at the proper Matlab code to read that time string and convert to proper elapsed time; this is probably the place for a duration variable altho I've not used them much and have had a few surprises with them so will need to make sure use them correctly for the purpose.
dpb
dpb on 19 Apr 2018

Dylan may be long gone, and he didn't leave contact info in profile, but just in case...

...  
% NB:  DATEVEC() IS WRONG FUNCTION FOR THE PURPOSE Time is being imported
% as string (R2017b) and is actually MM:SS.S clock time in the sample file
% Don't know what happens if rolls over the hour; this file only covers
% about 30 minutes 03:20.0 thru 34:57.3.  I PRESUME it will then roll over
% from 59:59.9 to HH:00:00.0.  I SUSPECT STRONGLY need to fixup the input
% to augment a string length to incorporate the missing HH field.
%
% It is also not known what happens if the data were to be collected 
% over a day rollover although it is also PRESUMED then the Date field will
% also increment by a day and the Time string reset.  In this case,
% ignoring the Date in conversion to time will also cause an apparent
% time travel in the data acquisition time vector.
%
% The present script produces a negative time in the middle; the "WHY"
% has not been fully investigated, simply fixed up the parsing for the 
% existing case of Time being within one calendar day.  --dpb
%time = datevec(Time);
% Update time handling -- dpb
iShortTime=(strlength(Time)==7);            % find those without leading HH:
Time(iShortTime)="00:"+Time(iShortTime);    % "mm:ss.s" --> "00:mm:ss.s"
daytime=datetime(string(Date)+" "+Time, ... % concatenate Date/Time, convert
           'inputformat','MM/dd/yyyy HH:mm:ss.S', ...
           'Format','preserveinput');
DistCalc = hypot(diff(XPoint),diff(YPoint));
% Decimate based on size of imported vector length
% XINT, YINT, YINTMAX interpolation breakpoints vs size and max
XINT=[1 [10:13 50]*1000]; YINT=[1 2:4 4 4]; YINTMAX=5;
nDec=interp1(XINT,YINT,size(Time,1),'nearest',YINTMAX); % compute decimation
% data generic to either MPH or KPH; decimate as desired
% DOWNSAMPLE() in Signal Processing TB; if don't have necessary TB use
% 1:nDec:end for the row indexing expression; is just "syntax sugar"
time = downsample(daytime,nDec);  % dpb time fixup; keep the datetime values
RPM = downsample(Rpm,nDec);
TGTS = downsample(TGTSpeed,nDec);
ThtlImpt = downsample(Throttle,nDec);
X = downsample(XPoint,nDec);
Y = downsample(YPoint,nDec);
Details = downsample(Deteils,nDec);
Distance = downsample(Distancekm,nDec);
if flgMPH %data specific to mph data
  MPH = downsample(Speedmph,nDec);
  mtrsecmph = downsample(meterpersecondmph,nDec);
  cmdmtrsecmph = downsample(cmdmeterpersecondmph,nDec);
else % ~flgMPH --> KPH
  KPH = downsample(Speedkph,nDec);
  mtrseckph = downsample(meterpersecondkph,nDec);
  cmdmtrseckph = downsample(cmdmeterpersecondkph,nDec);
end
disp(num2str(nDec,'1:%d interval'))
%Defining a usable Time vector
% See above notes on time--replaced original entirely -- dpb
%htime = (h + m + ti(:,3));
%ftime = (h + m + ti(:,3)-htime(1,1));
htime=time-time(1);           % duration array
htime.Format='s';             % display as seconds for debug use
ftime=seconds(htime);         % return as double array of seconds
  ...

leaves the same ftime variable going forward; with that the time-based plots now look much more reasonable.
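For anyone wanting to try the time fixup and the decimation lookup in isolation, here is a stripped-down sketch. The Date value and the row count nRows are made up for illustration; the Time strings are the first two and the last stamps mentioned above. It needs a release with double-quoted strings (R2017a or later), same as the listing.

```matlab
% Made-up inputs in the same text layout as the sample file.
Date = ["04/13/2018"; "04/13/2018"; "04/13/2018"];   % hypothetical date
Time = ["03:20.0"; "03:20.1"; "34:57.3"];

% Pad "mm:ss.s" to "00:mm:ss.s", then convert Date + Time to datetime.
isShort = (strlength(Time) == 7);
Time(isShort) = "00:" + Time(isShort);
stamp = datetime(Date + " " + Time, 'InputFormat', 'MM/dd/yyyy HH:mm:ss.S');
ftime = seconds(stamp - stamp(1));    % elapsed seconds as double

% Decimation factor by nearest-breakpoint lookup, as in the listing above.
XINT = [1 [10:13 50]*1000];  YINT = [1 2:4 4 4];  YINTMAX = 5;
nRows = 16000;                        % hypothetical imported vector length
nDec  = interp1(XINT, YINT, nRows, 'nearest', YINTMAX);
idx   = 1:nDec:nRows;                 % row index to use if DOWNSAMPLE is unavailable
```

The idx vector is the plain-indexing substitute for downsample noted in the listing's comments, e.g. Rpm(idx).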


More Answers (3)

dpb
dpb on 22 Apr 2018
Edited: dpb on 22 Apr 2018

While Dylan unfortunately may not return to see it, I took a little time and recast the function to reduce the duplicated code and clean up the many orange lines the ML editor complained about in the original, a great number having to do with not preallocating arrays and augmenting them in loops. Without getting carried away with generic names, simply by reducing the branching to near-minimal for the difference between input variables in mph vs kph and by using array syntax instead of loops, I cut the file length from approx. 630 lines to 330 (including the 25-30 lines of additional comments added), or less than half in actual code.

Incorporated categorical for the variables R2017b wanted to turn into them, and included the fixup on the time parsing as noted. Results are the same. (ERRATUM: The earlier note about one variable being different was an error on my part in dropping a factor of 10 in the length of input vector for decimation, so the previous results were based on 1:4 instead of the original 1:1. Fixed in the attached script.)

If one were to further factor the code into functions to calculate the basic results and pass the appropriate data to them and return the results in arrays instead of named variables for every variation, one could probably cut the amount of code by nearly half again.

NB: I defined a variable DATALOADED to keep from having to go thru the early preambles every time while testing once the file had been imported once by setting

clearvars; DATALOADED=false; prac

or after a run that did load

DATALOADED=true; prac

that let me work piecewise through the sections without having to mess with loading the data every time if didn't need a fresh copy. You can just delete or comment out that IF...END block if don't want it.
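The guard is roughly of this shape. This is a hypothetical reconstruction for illustration only (the attached script may well differ), and rawData is a placeholder name standing in for whatever the import step produces.

```matlab
% Hypothetical reconstruction of the guard; the attached script may differ.
if ~exist('DATALOADED', 'var') || ~DATALOADED
    % In the real script this is where uiimport and the preamble would run.
    rawData = [1 2 3];        % placeholder for the imported data
    DATALOADED = true;        % skip this block on later runs
end
```

Because it relies on variables surviving in the base workspace between runs, it only works for a script, not a function.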

I tried to figure out a more user-friendly way to pause the script so don't have the refocus thingie, but none of the UI message boxes with UIWAIT() seemed to work correctly with the UIIMPORT window. There's probably some way but I'm not a GUI-type guy; that's out of my league.

Hopefully this can help some; I do think there were/are some issues in the original posted that need to be fixed but don't have any way to directly pass them on; best I can do is this.


dpb
dpb on 13 Apr 2018
As for the behavior change, I don't know about it; R2016b is the latest I've installed so far and it also automagically selected only the data to import. While it is somewhat of a pit(proverbial)a(ppendage) to have to do so, you should be able to force the previous behavior by moving the start of the data range selected for import to the line after the header line.
If it really does as you say consistently on files that were recognized properly before (and, as said, downloading your file and using R2016b gave the desired and seemingly logical behavior here as well), I'd say it qualifies as an introduced bug and is worthy of an official bug report.
I have to admit this was the first (or maybe second, I think I now recall another query here that used it) time I've ever looked at the function; I generally just directly read a file instead of needing or wanting to look at it first; if I want a dialog selector I'll use uigetfile.
You've already got the fairly extensive m-file written as is; if I were to do something similar now I'd almost certainly use the table and readtable to load the file. For that to work best you'd need to fix up the column headings to be valid Matlab variable names by eliminating the spaces (underscores are useful here) and so on. Oh...mayhaps that's what "broke" between releases; change the headings in a file so that each is a valid variable name and try uiimport again. That would mean changing things like 'Pattern Name' to 'Pattern', or to 'PatternName', which is what was done internally before by just eliminating the non-allowable characters (personally, I'll take the shorter over the longer, but that's purely a user-preference thing).
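A minimal sketch of the readtable route, assuming a small CSV whose headings are already valid variable names. The file is written to a temporary location here just so the example is self-contained; the headings Distance_km and Speed_kph are my own renaming, not the names in the posted file.

```matlab
% Write a tiny CSV with valid-name headings (illustrative data only).
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, 'Time,Distance_km,Speed_kph\n');
fprintf(fid, '03:20.0,0,54.44\n');
fprintf(fid, '03:20.1,0.001509,54.33\n');
fclose(fid);

T = readtable(fname);                  % first line becomes the variable names
delete(fname);

names = T.Properties.VariableNames;    % headings as read
speed = T.Speed_kph;                   % one column by name

% Generic access: walk the columns without knowing the names in advance.
for k = 1:width(T)
    col = T.(names{k});                % k-th column, whatever its type
end
```

How the Time column is typed on import may vary by release, so the sketch only relies on the numeric columns.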
  1 Comment
dpb
dpb on 13 Apr 2018
Edited: dpb on 13 Apr 2018
Just a couple of minor things caught my eye in your m-file...first off, "magic numbers" of limited precision for conversion factors; I'd recommend defining them as variables (unfortunately Matlab doesn't have PARAMETERs, although you can go to the extra trouble of building a class and using constant properties therein).
For example,
meterpersecondkph = Speedkph.*0.277778;
the 0.277... is really 1000/(60*60); computed as a variable it would have full machine double precision instead of fewer decimal digits than even a single carries, besides being more legible, and the cost of the one computation is minuscule.
KPH2MPS=1000/60/60; % conversion factor kph --> mps
...
Speedmps = Speedkph*KPH2MPS; % speed, mps
(Don't need the "dot" operator to multiply by a constant).
As a small bit of using Matlab-defined functions already defined for common tasks,
DistCalc=sqrt((XPoint(2:end)-XPoint(1:end-1)).^2 + (YPoint(2:end)-YPoint(1:end-1)).^2);
can be written as
DistCalc=hypot(diff(XPoint),diff(YPoint));
Takes "time in grade" and exploring the myriad of functions there are for various things to eventually become familiar with a larger repertoire...but pays dividends with time.
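To put rough numbers on both points, a quick check on made-up XPoint/YPoint values (chosen so the segment lengths come out as 5, 5 and 2) could look like this:

```matlab
% Precision lost by the six-digit literal relative to the computed factor.
KPH2MPS = 1000/(60*60);                        % conversion factor kph --> mps
relErr  = abs(0.277778 - KPH2MPS) / KPH2MPS;   % relative error of the literal

% Made-up coordinates; both expressions give the same segment lengths.
XPoint = [0; 3; 6; 6];
YPoint = [0; 4; 8; 10];
d1 = sqrt((XPoint(2:end)-XPoint(1:end-1)).^2 + (YPoint(2:end)-YPoint(1:end-1)).^2);
d2 = hypot(diff(XPoint), diff(YPoint));
```

The relative error of the literal is a bit under one part per million; small, but it costs nothing to avoid.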



Steven Lord
Steven Lord on 13 Apr 2018
So you want to read each column of data into a separate variable, which for the sample CSV file you posted would mean sixteen different variables? You can do that. Change the Output Type from "Table" to "Column vectors" in the "Imported Data" section of the Import tab of the Import Tool's toolstrip (near the middle of the tab.) But I would encourage you to consider storing your data in a table instead.
All those pieces of data are related and so if they're in one variable, a table, it's more difficult to accidentally overwrite one of those pieces of data with something unrelated. For example, two of your columns are named Date and Time. How easy would it be to accidentally define "Date = datetime('today')" or something similar?
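If the appeal of separate variables is mostly being able to grab a whole column by name, one middle ground is to convert the table to a scalar struct, which keeps everything in one variable. A small sketch with a made-up two-column table (the names mirror the ones in your code, the values are just illustrative):

```matlab
% Made-up two-column table standing in for the imported data.
T = table([0; 0.001509], [54.44; 54.33], ...
          'VariableNames', {'Distancekm', 'Speedkph'});

% One struct with one field per table variable, each holding a whole column.
S = table2struct(T, 'ToScalar', true);
speed = S.Speedkph;
```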
I see in the code you posted that you're checking for the existence of variables named Speedmph or Speedkph. In a table that's a little bit trickier to do but not that much more difficult.
>> ismember('Speedmph', Mx00CASESTUDY0001.Properties.VariableNames)
ans =
logical
0
>> ismember('Speedkph', Mx00CASESTUDY0001.Properties.VariableNames)
ans =
logical
1
You could encapsulate that in a function if you needed to perform those queries repeatedly.
function IV = isVariable(thetable, variableName)
IV = ismember(variableName, thetable.Properties.VariableNames);
Or you could use try and catch.
>> units = 'mph';
>> try
speed = Mx00CASESTUDY0001.Speedmph;
catch
units = 'kph';
speed = Mx00CASESTUDY0001.Speedkph;
end
The try / catch approach would also let you do some tricks like dynamic variable referencing. If units is 'mph' this will compare the variable speed with the variable Speedmph in the table, and if units is 'kph' it will compare speed with the variable Speedkph in the table.
>> isequal(speed, Mx00CASESTUDY0001.(['Speed' units]))
ans =
logical
1
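Here is a self-contained variant of the same idea that uses the ismember check instead of try / catch to pick the units. It is just a sketch on a made-up one-column table T rather than the imported Mx00CASESTUDY0001.

```matlab
% Made-up table that, like the sample file, only has the kph column.
T = table([54.44; 54.33], 'VariableNames', {'Speedkph'});

% Decide the units from which variable is present.
if ismember('Speedmph', T.Properties.VariableNames)
    units = 'mph';
else
    units = 'kph';
end

% Dynamic variable reference: builds the name 'Speedkph' or 'Speedmph'.
speed = T.(['Speed' units]);
```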
If I've misunderstood what you're trying to do, can you clarify what these sentences from your question meant? "The 2014a version would read and import first line as the variable and the columns as the variable data. Now, the import is importing the WHOLE file into the workspace instead of individual variables like I've just described."
  22 Comments
Dylan Mecca
Dylan Mecca on 17 Apr 2018
I don't know why I didn't think to do variable conversions. As immediate feedback, the conversion seems to be working and the code looks to be acting normally again. I should know more the more I look at these results.
dpb
dpb on 17 Apr 2018
The key thing to learn here re: debugging when something mysterious happens is to poke at the internals thoroughly and don't presume what may be wrong; prove/disprove every hypothesis with data.
Here, looking at the internal storage to be sure you have what you think you have instead of thinking it was like it was before was the key--I was guessing it was going to show char but be 2-bytes/per instead of one; hadn't thought of it getting automagically converted to a categorical.
I also hadn't ever tried that operation on a categorical so wasn't aware of that behavior; I did submit a comment to TMW on the failure in the string comparison functions to address their behavior; it is totally undocumented; guess nobody ever thought to do anything with them when they introduced the new data class, specifically.
R2017b is installing so can't do a check immediately; it may be that if you create the name to go with the category then the string comparison will work but then I doubt one saves enough memory to make it worth the effort.

