Problem importing files-- matlab version changes

Question

Dylan Mecca on 13 Apr 2018

Direct link to this question

https://uk.mathworks.com/matlabcentral/answers/394688-problem-importing-files-matlab-version-changes

Edited: dpb on 22 Apr 2018

Hi all,

I just updated my license from a 2014a to 2017b and I've noticed that the uiimport command is acting differently. I will post sample data for what I typically work with.

Looking through my code, you will notice that all of my variables are defined by the first row elements in the sample data. The 2014a version would read and import first line as the variable and the columns as the variable data. Now, the import is importing the WHOLE file into the workspace instead of individual variables like I've just described.

Is there a way to make the 2017b version act like the 2014a version? Or is there a command that can do something along the lines that can generically call out columns to assign them to a variable? for example:

examplefile(:,1) = a

examplefile(:,2) = b

etc.

More specifically, how would I be able to call out a generic file? The file I've posted is just a sample, and I will be running other data files too.


Answer 1

dpb on 17 Apr 2018

Direct link to this answer

https://uk.mathworks.com/matlabcentral/answers/394688-problem-importing-files-matlab-version-changes#answer_315670

Edited: dpb on 17 Apr 2018

OK, the mystery is solved. The root cause is uiimport with its "new and improved" flavor: it wants to use the table class, and even if you select column vectors it wants to turn text columns that have many more rows than unique values into categorical variables instead of char/cellstr. The initial code was written with string comparison functions, which fail (silently, unfortunately) on categorical variables.

The workaround in short term is to convert the categorical variable to cellstr and carry on--I'll add this here to point out there really are some advantages if you will allow Matlab to have its way--consider the Details array; as you were using it, you presently are doing string comparisons to locate certain pieces of information as

Q = repmat('_1/4',[WrdVec,1]);
AQ=strcmp(Details,Q);

As previously noted, you do not need the Q array to do the comparison; simply comparing the array to a single string does the same job with much less memory (the length of Q in this case is over 16K). IOW,

Q = '_1/4';
AQ=strcmp(Details,Q);

is identical in result or you can even do without the temporary Q altho I can see good reason to not bury "magic strings" in the code itself so that's not a bad factoring.

OTOH, if you let Matlab go ahead and convert the input to categorical, you can then do the same thing as

Q = '_1/4';
AQ=(Details==Q);

and eliminate the comparison function itself in favor of the equality operator.
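A small self-contained check of the two forms above, as a sketch only: the four-element Details array here is made up for illustration and is not the real data. It shows that strcmp on the cellstr and == on the categorical give the same logical index.

```matlab
% Made-up stand-in for the imported Details column (assumption, not real data)
Details = {'_1/4'; '_Half'; '_1/4'; '_WOT'};
Q = '_1/4';

% Original approach: string comparison on a cellstr
AQcell = strcmp(Details, Q);

% Same test after letting the column be categorical
DetailsCat = categorical(Details);
AQcat = (DetailsCat == Q);

% Both give the same logical addressing vector
disp(isequal(AQcell, AQcat))
```

Either logical vector can then be used for addressing, e.g. Details(AQcell).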

Categorical variables have many other helpful uses in categorizing functionality as well that could turn out very useful besides just logical addressing. One nice feature that could be very helpful in debugging or otherwise looking at your data is

>> summary(Deteils)
   Deteils                 1 
   _0.1G                1789 
   _0.2G                 667 
   _0.3G                 819 
   _0.4G                 847 
   _0.500                  1 
   _1/4                  812 
   _1/8                  889 
   _100                   50 
   _500                  563 
   _65                    50 
   _8000                5130 
   _Alarm                100 
   _GPS receive            1 
   _Half                 566 
   _WOT                  495 
   _3/8                  678 
   _5000                  47 
   _Complete(HSTT)         1 
   <undefined>          2954 
>>

that produces a summary of the categories; easy to find a miscoded entry that way, maybe, if nothing else? (I just imported the file, didn't run the script so hadn't renamed the array you'll notice.)

All in all, I'd suggest revisiting the subject and consider perhaps using some of the newer features instead of beating MATLAB totally into submission.

3 Comments

dpb on 19 Apr 2018

Edited: dpb on 22 Apr 2018

I noticed there being a lot of duplicated code in your m-file; thought I'd illustrate how to reduce that some...a first part for the choice between MPH vis a vis KPH as the input units can be reduced to

% conversion factors
KPH2mps=1000/60/60;                  % km/hr --> m/s
MPH2mps=5280*12*2.54/60/60/100;      % mph --> m/s
%Looking for speed units
if exist('Speedmph','var')          % will be 1/0, don't need explicit 1
  flgMPH=true;                      % indicator for which units have
  meterpersecondmph = Speedmph*MPH2mps;
  cmdmeterpersecondmph = TGTSpeed*MPH2mps;
else                                % not MPH, must be KPH
  flgMPH=false;                     % indicator KPH --> ~MPH
  meterpersecondkph = Speedkph*KPH2mps;
  cmdmeterpersecondkph = TGTSpeed*KPH2mps;
end
Pattern_Size = size(Time,1);
time = datevec(Time);
DistCalc = hypot(diff(XPoint),diff(YPoint));
% Decimate based on size of imported vector length
% XINT, YINT, YINTMAX interpolation breakpoints vs size and max
XINT=[1 [10:13 50]*1000]; YINT=[1 2:4 4 4]; YINTMAX=5;
nDec=interp1(XINT,YINT,size(Time,1),'nearest',YINTMAX); % compute decimation
% data generic to either MPH or KPH; decimate as desired
% DOWNSAMPLE() in Signal Processing TB; if don't have necessary TB use
% 1:nDec:end for the row indexing expression; is just "syntax sugar"
ti = downsample(time(:,4:6),nDec);
RPM = downsample(Rpm,nDec);
TGTS = downsample(TGTSpeed,nDec);
ThtlImpt = downsample(Throttle,nDec);
X = downsample(XPoint,nDec);
Y = downsample(YPoint,nDec);
Details = downsample(Deteils,nDec);
Distance = downsample(Distancekm,nDec);
if flgMPH %data specific to mph data
  MPH = downsample(Speedmph,nDec);
  mtrsecmph = downsample(meterpersecondmph,nDec);
  cmdmtrsecmph = downsample(cmdmeterpersecondmph,nDec);
else % ~flgMPH --> KPH
  KPH = downsample(Speedkph,nDec);
  mtrseckph = downsample(meterpersecondkph,nDec);
  cmdmtrseckph = downsample(cmdmeterpersecondkph,nDec);
end
display(num2str(nDec,'1:%d interval'))

This leaves all your variables so far intact. If I were to do this, I'd create ONE generic name for each of the outputs, independent of the units of the data contained therein, and simply store the desired output in that variable. This would eliminate almost all of the following redundant code and most of the logical branching. Such refactoring shortens the code drastically, and a change then only has to be made in one place instead of in multiple places, which is a highly error-prone process.
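One way the "one generic name" idea could look, as a minimal sketch under assumptions: the names units, speedRaw, toMPS and speedMPS are hypothetical and do not appear in the original script, and the sample speeds are made up.

```matlab
% Hypothetical inputs; in the real script these would come from the import
units    = 'kph';                 % or 'mph', however the file is labelled
speedRaw = [54.44; 54.33; 60];    % speed in the file's own units (made up)

% Conversion factors, computed rather than typed in as rounded decimals
KPH2MPS = 1000/60/60;             % km/hr --> m/s
MPH2MPS = 5280*12*2.54/100/60/60; % mph   --> m/s

% Pick the factor once; everything downstream is unit-independent
switch units
    case 'mph'
        toMPS = MPH2MPS;
    case 'kph'
        toMPS = KPH2MPS;
    otherwise
        error('Unknown speed units: %s', units);
end

speedMPS = speedRaw*toMPS;        % one generic name, whichever the input units
```

With this shape, the units branch happens in exactly one place and the rest of the script only ever refers to speedMPS.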

I did, however, come upon an anomaly that I can't understand/explain related to the Time input field; what, precisely is the data as shown? You've processed it as Hr,Min,Sec; but the input format doesn't seem to be any standard form I've seen and datevec isn't converting the decimal fraction as MM:SS.S as I would kinda' presume from the text format it is.

As is, the conversion to ftime is, at least on the copy of the script I downloaded, generating some negative times in the middle of the vector; that makes for funny-looking plots and, I suspect, some erroneous data in your plots.

I've not had time yet to decipher just why the negatives are showing up; figured I'd try to get a precise definition of what the input is supposed to mean before doing anything further. I do think calling

datevec(Time)

is very risky at best and likely not producing what you really want.

This is the plot I get with either your original or with the above simplifications; note the time axis has negative time values.

Do your plots look similar there; there's an issue methinks if so but I don't think it has anything to do with the previous issue but one of a logic error in processing the time data that's causing a wraparound problem. The place in the vector it turns negative is:

>> ftime(2665:2675)
ans =
     74262
     74292
     74316
     74340
     74370
     74394
    -11982
    -11952
    -11928
    -11898
    -11862
>>

Tell us what the interpretation of the text string for Time really is and we can sort out how to process it correctly but I'm pretty sure it isn't right in your existing code, whatever it is actually supposed to be.

dpb on 19 Apr 2018

OK, I think I can prove what the Time field is and it's what I thought it should be based on the text format--it's MM:SS.S; NOT hours so all your time scaling is bogus I'm virtually positive.

I figure this from the spreadsheet data--first two data records are

Time    Distance(km)    Speed(kph)
03:20.0         0           54.44
03:20.1     0.001509        54.33
km/(km/hr) --> hr * 3600 --> sec
>> 0.001509/54.4*3600
ans =
  0.0999
>>

which is almost identically 0.1 sec, the difference between the two times as text assuming it is, as it looks like, MM:SS.S. The next data record works out at ~0.2 sec; also in agreement with the recorded time stamp.
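The arithmetic above can be reproduced in a few lines. This is only a sketch of the consistency check: it assumes the MM:SS.S reading of the time stamps and uses the first two records quoted above; the variable names are made up for the example.

```matlab
% Parse the two time stamps as minutes:seconds (the MM:SS.S assumption)
t1 = sscanf('03:20.0', '%d:%f');          % [minutes; seconds]
t2 = sscanf('03:20.1', '%d:%f');

% Time step implied by the stamps, in seconds
dtStamp = (60*t2(1) + t2(2)) - (60*t1(1) + t1(2));

% Time step implied by distance travelled at the recorded speed:
% km / (km/hr) --> hr, times 3600 --> sec
dtTravel = 0.001509/54.44*3600;

fprintf('from stamps: %.4f s, from distance/speed: %.4f s\n', dtStamp, dtTravel);
```

If the stamps were HH:MM.M instead, the step between records would be 6 seconds rather than 0.1, which the distance and speed columns do not support.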

I'm out of time tonight, but will look at the proper Matlab code to read that time string and convert to proper elapsed time; this is probably the place for a duration variable altho I've not used them much and have had a few surprises with them so will need to make sure use them correctly for the purpose.

dpb on 19 Apr 2018

Dylan may be long gone, and he didn't leave contact info in profile, but just in case...

...  
% NB:  DATEVEC() IS WRONG FUNCTION FOR THE PURPOSE.  Time is being imported
% as string (R2017b) and is actually MM:SS.S clock time in the sample file
% Don't know what happens if rolls over the hour; this file only covers
% about 30 minutes 03:20.0 thru 34:57.3.  I PRESUME it will then roll over
% from 59:59.9 to HH:00:00.0.  I SUSPECT STRONGLY need to fixup the input
% to augment a string length to incorporate the missing HH field.
%
% It is also not known what happens if the data were to be collected 
% over a day rollover although it is also PRESUMED then the Date field will
% also increment by a day and the Time string reset.  In this case,
% ignoring the Date in conversion to time will also cause an apparent
% time travel in the data acquisition time vector.
%
% The present script produces a negative time in the middle; the "WHY"
% has not been fully investigated, simply fixed up the parsing for the 
% existing case of Time being within one calendar day.  --dpb
%time = datevec(Time);

% Update time handling -- dpb
iShortTime=(strlength(Time)==7);            % find those without leading HH:
Time(iShortTime)="00:"+Time(iShortTime);    % "mm:ss.s" --> "00:mm:ss.s"
daytime=datetime(string(Date)+" "+Time, ... % concatenate Date/Time, convert
           'inputformat','MM/dd/yyyy HH:mm:ss.S', ...
           'Format','preserveinput');
DistCalc = hypot(diff(XPoint),diff(YPoint));

% Decimate based on size of imported vector length
% XINT, YINT, YINTMAX interpolation breakpoints vs size and max
XINT=[1 [10:13 50]*1000]; YINT=[1 2:4 4 4]; YINTMAX=5;
nDec=interp1(XINT,YINT,size(Time,1),'nearest',YINTMAX); % compute decimation

% data generic to either MPH or KPH; decimate as desired
% DOWNSAMPLE() in Signal Processing TB; if don't have necessary TB use
% 1:nDec:end for the row indexing expression; is just "syntax sugar"
time = downsample(daytime,nDec);  % dpb time fixup; keep the datetime values
RPM = downsample(Rpm,nDec);
TGTS = downsample(TGTSpeed,nDec);
ThtlImpt = downsample(Throttle,nDec);
X = downsample(XPoint,nDec);
Y = downsample(YPoint,nDec);
Details = downsample(Deteils,nDec);
Distance = downsample(Distancekm,nDec);
if flgMPH %data specific to mph data
  MPH = downsample(Speedmph,nDec);
  mtrsecmph = downsample(meterpersecondmph,nDec);
  cmdmtrsecmph = downsample(cmdmeterpersecondmph,nDec);
else % ~flgMPH --> KPH
  KPH = downsample(Speedkph,nDec);
  mtrseckph = downsample(meterpersecondkph,nDec);
  cmdmtrseckph = downsample(cmdmeterpersecondkph,nDec);
end
display(num2str(nDec,'1:%d interval'))

%Defining a usable Time vector
% See above notes on time--replaced original entirely -- dpb
%htime = (h + m + ti(:,3));
%ftime = (h + m + ti(:,3)-htime(1,1));
htime=time-time(1);           % duration array
htime.Format='s';             % display as seconds for debug use
ftime=seconds(htime);         % return as double array of seconds
  ...

leaves the same ftime variable going forward; with that the time-based plots now look much more reasonable--
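For anyone without the Signal Processing Toolbox, the time fixup can be tried in isolation with plain indexing in place of downsample. A minimal sketch, assuming R2017a or later for the double-quoted strings; the Date and Time values and the nDec value are made up for illustration.

```matlab
% Made-up sample of the two imported columns (not the real file contents)
Date = ["04/13/2018"; "04/13/2018"; "04/13/2018"; "04/13/2018"];
Time = ["03:20.0"; "03:20.1"; "03:20.2"; "03:20.3"];

% Prepend the missing hours field to the short "mm:ss.s" stamps
iShortTime = (strlength(Time) == 7);
Time(iShortTime) = "00:" + Time(iShortTime);

% Combine date and time text and convert to datetime
daytime = datetime(Date + " " + Time, ...
                   'InputFormat', 'MM/dd/yyyy HH:mm:ss.S');

% Decimate with plain indexing instead of downsample()
nDec = 2;
time = daytime(1:nDec:end);

% Elapsed time from the first sample, as a double array of seconds
ftime = seconds(time - time(1));
```

Because the subtraction is done on datetime values, the elapsed time stays monotonic as long as the Date column is included, which is the point of the fixup.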

Answer 2

dpb on 22 Apr 2018

Direct link to this answer

https://uk.mathworks.com/matlabcentral/answers/394688-problem-importing-files-matlab-version-changes#answer_316477

Edited: dpb on 22 Apr 2018

prac.m

While Dylan unfortunately may not return to see it, I took a little time and recast the function to reduce the duplicated code and clean up the many orange lines the ML editor complained about in the original, a great number having to do with not preallocating arrays and augmenting them in loops. Without getting carried away with generic names, simply reducing the branching to near-minimal for the difference between input variables in mph vs kph and using array syntax instead of loops, I cut the file length from approx. 630 lines to 330 (including the 25-30 lines of additional comments added), or less than half in actual code.

Incorporated categorical for the variables R2017b wanted to turn into them as well as including the fixup on the time parsing as noted. Results are the same. ( ERRATUM The earlier note about one variable being different was an error on my part in dropping a factor of 10 in the length of input vector for decimation so the previous results were based on 1:4 instead of original 1:1. Fixed in the attached script.)

If one were to further factor the code into functions to calculate the basic results and pass the appropriate data to them and return the results in arrays instead of named variables for every variation, one could probably cut the amount of code by nearly half again.

NB: I defined a variable DATALOADED to keep from having to go thru the early preambles every time while testing once the file had been imported once by setting

clearvars; DATALOADED=false; prac

or after a run that did load

DATALOADED=true; prac

that let me work piecewise through the sections without having to mess with loading the data every time if didn't need a fresh copy. You can just delete or comment out that IF...END block if don't want it.

I tried to figure out a more user-friendly way to pause the script so don't have the refocus thingie, but none of the UI message boxes with UIWAIT() seemed to work correctly with the UIIMPORT window. There's probably some way but I'm not a GUI-type guy; that's out of my league.

Hopefully this can help some; I do think there were/are some issues in the original posted that need to be fixed but don't have any way to directly pass them on; best I can do is this.

Answer 3

dpb on 13 Apr 2018

Direct link to this answer

https://uk.mathworks.com/matlabcentral/answers/394688-problem-importing-files-matlab-version-changes#answer_314972

As for the behavior change, I don't know about it; R2016b is the latest release I've installed so far, and it also automagically selected only the data to import. While it is somewhat of a pit(proverbial)a(ppendage) to have to do so, you should be able to force the previous behavior by moving the data selected for import row to the line after the header line.

If it really does as you say consistently on files that were recognized properly before (and, as said, downloading your file and using R2016b was the desired and seems-logical behavior here as well), I'd say it qualifies as an introduced bug and is worthy of an official bug report.

I have to admit this was the first (or maybe second, I think I now recall another query here that used it) time I've ever looked at the function; I generally just directly read a file instead of needing or wanting to look at it first; if I want a dialog selector I'll use uigetfile.

You've already got the fairly extensive m-file written as is; if I were to do something similar now I'd almost certainly use the table and readtable to load the file. For that to work best you'd need to fix up the column headings to be valid Matlab variable names by eliminating the spaces (underscores are useful here) and so on. Oh...mayhaps that's what "broke" between releases; change the headings in a file so that each is a valid variable name and try uiimport again. That would be things like 'Pattern Name' to 'Pattern' or what was done previously internally to just eliminate the non-allowable characters/patterns 'PatternName' (personally, I'll take the shorter over the longer, but that's purely a user-preference thing).
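A minimal sketch of the readtable route, under assumptions: the file written here is a tiny made-up stand-in with headers that are already valid MATLAB names, and the variable names fname, T, speed and col2 are only for illustration.

```matlab
% Write a tiny made-up CSV so the example is self-contained
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, 'Time,Speedkph,Deteils\n');
fprintf(fid, '03:20.0,54.44,_1/4\n');
fprintf(fid, '03:20.1,54.33,_Half\n');
fclose(fid);

% The header row becomes the table's variable names
T = readtable(fname);
delete(fname);

names = T.Properties.VariableNames;   % column names found in the file
speed = T.Speedkph;                   % pull one column out by name
col2  = T.(names{2});                 % or generically, by position in the header
```

The last line is the generic form asked about in the question: it works for any file, because the column is looked up from whatever header names the file supplied.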

1 Comment

dpb on 13 Apr 2018

Edited: dpb on 13 Apr 2018

Just a couple minor things caught my eye in your m-file...first off, "magic numbers" of limited precision for conversion factors; I'd recommend to define them as variables (unfortunate Matlab doesn't have PARAMETERs although you can go the extra trouble by building a class and use constant therein).

For example,

meterpersecondkph = Speedkph.*0.277778;

the 0.277... is really 1000/(60*60) and if computed as a variable would have full machine double precision instead of fewer than the decimal digits of a single, besides being more legible, and the cost of the one computation is minuscule.

KPH2MPS=1000/60/60;            % conversion factor kph --> mps
...
Speedmps = Speedkph*KPH2MPS;   % speed, mps

(Don't need the "dot" operator to multiply by a constant).
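To put a number on the precision point, here is a short sketch comparing the typed-in constant with the computed one; the names rounded and relErr are made up for the example.

```matlab
KPH2MPS = 1000/60/60;        % computed conversion factor kph --> mps
rounded = 0.277778;          % the hand-typed constant from the original line

% Relative error introduced by the six-digit constant
relErr = abs(rounded - KPH2MPS)/KPH2MPS;
fprintf('relative error of the rounded constant: %.2e\n', relErr);
```

The error is small in absolute terms, but it is several orders of magnitude above double-precision roundoff and costs nothing to avoid.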

As a small bit of using Matlab-defined functions already defined for common tasks,

DistCalc=sqrt((XPoint(2:end)-XPoint(1:end-1)).^2 + (YPoint(2:end)-YPoint(1:end-1)).^2);

can be written as

DistCalc=hypot(diff(XPoint),diff(YPoint));

Takes "time in grade" and exploring the myriad of functions there are for various things to eventually become familiar with a larger repertoire...but pays dividends with time.
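A quick way to convince yourself the two DistCalc forms agree is to run both on a few points. The XPoint and YPoint values below are made up so that every segment is a 3-4-5 triangle; dLong and dShort are names used only in this sketch.

```matlab
% Made-up track points (assumption, not the real data)
XPoint = [0; 3; 3; 6];
YPoint = [0; 4; 9; 13];

% Original explicit form
dLong = sqrt((XPoint(2:end)-XPoint(1:end-1)).^2 + ...
             (YPoint(2:end)-YPoint(1:end-1)).^2);

% Same computation with diff and hypot
dShort = hypot(diff(XPoint), diff(YPoint));

% Largest difference between the two forms
disp(max(abs(dLong - dShort)))
```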

Answer 4

Steven Lord on 13 Apr 2018

Direct link to this answer

https://uk.mathworks.com/matlabcentral/answers/394688-problem-importing-files-matlab-version-changes#answer_314989

So you want to read each column of data into a separate variable, which for the sample CSV file you posted would mean sixteen different variables? You can do that. Change the Output Type from "Table" to "Column vectors" in the "Imported Data" section of the Import tab of the Import Tool's toolstrip (near the middle of the tab.) But I would encourage you to consider storing your data in a table instead.

All those pieces of data are related and so if they're in one variable, a table, it's more difficult to accidentally overwrite one of those pieces of data with something unrelated. For example, two of your columns are named Date and Time. How easy would it be to accidentally define "Date = datetime('today')" or something similar?

I see in the code you posted that you're checking for the existence of variables named Speedmph or Speedkph. In a table that's a little bit trickier to do but not that much more difficult.

>> ismember('Speedmph', Mx00CASESTUDY0001.Properties.VariableNames)
ans =
  logical
   0
>> ismember('Speedkph', Mx00CASESTUDY0001.Properties.VariableNames)
ans =
  logical
   1

You could encapsulate that in a function if you needed to perform those queries repeatedly.

function IV = isVariable(thetable, variableName)
IV = ismember(variableName, thetable.Properties.VariableNames);

Or you could use try and catch.

>> units = 'mph';
>> try
  speed = Mx00CASESTUDY0001.Speedmph;
catch
  units = 'kph';
  speed = Mx00CASESTUDY0001.Speedkph;
end

The try / catch approach would also let you do some tricks like dynamic variable referencing. If units is 'mph' this will compare the variable speed with the variable Speedmph in the table, and if units is 'kph' it will compare speed with the variable Speedkph in the table.

>> isequal(speed, Mx00CASESTUDY0001.(['Speed' units]))
ans =
  logical
   1

If I've misunderstood what you're trying to do, can you clarify what these sentences from your question meant? "The 2014a version would read and import first line as the variable and the columns as the variable data. Now, the import is importing the WHOLE file into the workspace instead of individual variables like I've just described."

22 Comments

dpb on 17 Apr 2018

Ah, but something has...the question is determining specifically, what. Since strcmp is exact and byte-oriented; I'm wondering if there's possibly an issue between releases that all of a sudden now, your data internally is 2-byte character instead of 1-byte or something such as that.

Externally the data haven't changed; granted, I'm wondering if the internal storage has between releases and trying to see if can determine that without having later than R2016b here which seems still functional.

What I was suggesting was to take one of (your choice, just should have some values that should match of course)

...
AQ = strcmp(Details,Q);
AH = strcmp(Details,Half);
...

and

savefile = 'AQfile.mat';
save(savefile, 'Details', 'Q', 'AQ');

and attach the file. Then can load an image of what your system has here and see if it looks any different than that loaded natively here.

I suppose TMW could have broken strcmp when introduced the string class but seems unlikely the issue is really in it but more likely in how it's being used or the internal data storage to me.

I think I can download the same release but that likely isn't something I'll have time to do "real soon now"...trying to see if can get somewhere more quickly to work you around your issues.

dpb on 17 Apr 2018

Edited: dpb on 17 Apr 2018

There's the problem; the Deteils data got imported as categorical instead of character/cellstr string...that's why they don't compare (but I'd've thunk that would give a type mismatch; have to 'splore the "why" of that). But, as I suspected, the problem is the internal data storage. Have to figure out how to prevent that from happening; I'm pretty sure you can manually go in at the uiimport window and select the data type by variable, but that'd be a real pit(proverbial)a(ppendage) to do all the time.

I came in to dinner from field and started the R2017b download while heating it up; not sure if can get installed and take enough time right now but

>> D=categorical(Details);
>> whos D
Name          Size            Bytes  Class          Attributes
D         16459x1             18679  categorical              
>> ix=strcmp(Details,'_1/4');  % what we already know...
>> sum(ix)
ans =
 812
>> ix=strcmp(D,'_1/4');  % try with categorical array
>> sum(ix)
ans =
   0
>>

There's your answer and indeed, even w/ R2016b (and probably since forever) a categorical is accepted by strcmp() but, not terribly surprisingly, doesn't do an exact match since their internal storage is different.

A workaround to get you where you want to go in a hurry is

Details=cellstr(Details);

There are some advantages in using categorical in the long run/big picture of your application, and there are several areas as have noted that could be made significantly more efficient, but the above should solve your immediate problem and make the code run again.

NB: Do a

whos

on all variables in the workspace and if there are others that are being imported as categorical as well that you want as character/cellstring, convert them, too.

I've got to go back to the field, but that should let you get past top dead center...

NB2: Afore I heads back out...just checked here and in R2016b uiimport only allows selecting the Deteils column (why's it misspelled, btw????) as either Text or Date; apparently later releases at some point expanded the allowable choices and they made it try to be helpful in choosing Categorical as a default to save memory; sometimes "new" isn't necessarily "improved". I'm guessing with your release you could change the type before you import instead of making the fixup in the code, but unless there's some way to fix up the default internal logic that would be an "everytime" thing, too.

Dylan Mecca on 17 Apr 2018

I don't know why I didn't think to do variable conversions. As immediate feedback, the conversion seems to be working and the code looks to be acting back to normal. I should know more the more I look at these results.

dpb on 17 Apr 2018

The key thing to learn here re: debugging when something mysterious happens is to poke at the internals thoroughly and don't presume what may be wrong; prove/disprove every hypothesis with data.

Here, looking at the internal storage to be sure you have what you think you have instead of thinking it was like it was before was the key--I was guessing it was going to show char but be 2-bytes/per instead of one; hadn't thought of it getting automagically converted to a categorical.

I also hadn't ever tried that operation on a categorical so wasn't aware of that behavior; I did submit a comment to TMW on the failure in the string comparison functions to address their behavior; it is totally undocumented; guess nobody ever thought to do anything with them when they introduced the new data class, specifically.

R2017b is installing so can't do a check immediately; it may be that if you create the name to go with the category then the string comparison will work but then I doubt one saves enough memory to make it worth the effort.
