Showing posts with label cell columns. Show all posts

Tuesday, 28 June 2016

Problem with MATLAB textscan only reading in first cell

I was having a problem with MATLAB's textscan() function: it was only reading in the data for the first cell, but it wasn't throwing any errors.

I was using the following command:

T = textscan(fid, format_spec, 'HeaderLines', 0);

Evaluating T seemed to show that the file was being read correctly, with the number of columns matching my CSV, but nearly every column was empty.

>> T
T =
  Columns 1 through 4
    {1x1 cell}    [11]    [0x1 double]    [0x1 double]
  Columns 5 through 7
    [0x1 double]    [0x1 double]    [0x1 double]
  Columns 8 through 10
    [0x1 double]    [0x1 double]    [0x1 double]

The problem was that I was reading a CSV file, and by default textscan splits fields on whitespace rather than commas. With the delimiter left unspecified, the function could not separate the fields correctly. I changed the line to the following:

T = textscan(fid, format_spec, 'Delimiter', ',', 'HeaderLines', 0);

This now read in the entire file correctly, with every column containing the expected number of rows:

>> T
T =
  Columns 1 through 2
    {81231x1 cell}    [81231x1 double]
  Columns 3 through 4
    [81231x1 double]    [81231x1 double]
  Columns 5 through 6
    [81231x1 double]    [81231x1 double]
  Columns 7 through 8
    [81231x1 double]    [81231x1 double]
  Columns 9 through 10
    [81231x1 double]    [81231x1 double]
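For reference, here is a minimal self-contained sketch of the working read. The file contents, the three-column layout and the value of format_spec are made up for illustration (my real file had ten columns), so treat it as an outline rather than the exact code I ran:

```matlab
% Write a tiny CSV to a temporary file so the sketch is self-contained.
% The rows and the three-column layout are invented for illustration.
fname = [tempname '.csv'];
fid = fopen(fname, 'w');
fprintf(fid, '2016-02-05_19-09-50,11,0.5\n');
fprintf(fid, '2016-02-05_19-10-38,12,0.7\n');
fprintf(fid, '2016-02-05_19-21-43,13,0.9\n');
fclose(fid);

% Assumed format: one string column followed by two numeric columns.
format_spec = '%s %f %f';

% Read the file with an explicit comma delimiter.
fid = fopen(fname, 'r');
T = textscan(fid, format_spec, 'Delimiter', ',', 'HeaderLines', 0);
fclose(fid);

% Count the rows in each column; they should all match the file length.
n_rows = cellfun(@numel, T);
disp(n_rows)

% Clean up the temporary file.
delete(fname);
```

Checking the row count of every column like this is a quick way to spot the empty-column symptom described above.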

Note that textscan returns a cell array, with one cell per column, so it must be indexed slightly differently from an ordinary matrix. If I evaluate T(1) with parentheses, I don't get the data from column one; I just get a 1x1 cell that wraps it:


>> T(1)
ans =
    {81231x1 cell}

Instead, I have to use the following command:

>> T{1}
ans = 
    '2016-02-05_19-09-50'
    '2016-02-05_19-10-38'
    '2016-02-05_19-21-43'
    '2016-02-05_19-22-31'
    '2016-02-05_19-23-19'
    '2016-02-05_19-24-08'
    '2016-02-05_19-26-11'
    '2016-02-05_19-26-59'
    '2016-02-05_19-27-47'
    '2016-02-05_19-28-36'

Note the curly braces around the number after T. It is also possible to only return a subset of the data by specifying the indices after the curly braces, as follows:

>> T{1}(1:3)
ans = 
    '2016-02-05_19-09-50'
    '2016-02-05_19-10-38'
    '2016-02-05_19-21-43'
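The difference between the two kinds of brackets can be sketched with a small stand-in for T. The values below are made up, and the variable names are only for illustration:

```matlab
% A small stand-in for T: one cell column of strings, one numeric column.
T = { {'2016-02-05_19-09-50'; '2016-02-05_19-10-38'; '2016-02-05_19-21-43'}, ...
      [11; 12; 13] };

wrapped   = T(1);       % parentheses: still a 1x1 cell array wrapping the column
contents  = T{1};       % curly braces: the 3x1 cell array of strings itself
first_two = T{1}(1:2);  % subset of the string column, still a cell array
one_date  = T{1}{2};    % a second pair of braces gives a single string (char)
numbers   = T{2}(2:3);  % numeric columns only need braces once

disp(class(wrapped))
disp(class(one_date))
```

In short, parentheses return a smaller cell array, while curly braces return whatever is stored inside the cell.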

Finally, we can get the data into a format we are more used to by using the cell2mat() function.

dates=cell2mat(T{1})

This converts the data into a matrix. In my case I have read in strings, so the result is a 2-D character array with one row per string. This only works because every date string has the same length. Now to view the first three dates, I can use the following command:

>> dates(1:3,:)
ans =
2016-02-05_19-09-50
2016-02-05_19-10-38
2016-02-05_19-21-43
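As a rough sketch of what the character matrix looks like, here is the same conversion on three made-up strings. The last line uses char() instead of cell2mat(), which is not what I used above but should cope with strings of different lengths by padding them with spaces:

```matlab
% Three equal-length date strings in a column cell array (stand-in for T{1}).
col = {'2016-02-05_19-09-50'; '2016-02-05_19-10-38'; '2016-02-05_19-21-43'};

% cell2mat stacks the strings into a 3x19 character matrix.
dates = cell2mat(col);

second    = dates(2, :);    % one full date: a row of the matrix
year_part = dates(:, 1:4);  % the first four characters of every date

% For strings of unequal length, char() pads the shorter rows with spaces.
padded = char({'ab'; 'abcd'});

disp(size(dates))
disp(second)
```

Each row of dates is one string, so a row index must always be paired with a column index such as the colon.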

And from here the data can be treated as normal.