Calculate similarity between data columns in the data matrix
Hello
I have an Excel file containing recording data. The first four columns are log data:
- Column 1 logs the stimulus IDs over time. Values such as 1, 2, 3, 4, etc., represent stimulus IDs, while 0 indicates no stimulus was presented.
- Column 2 logs the intervals of stimulus presentation:
- 1 = pre-stimulus interval,
- 2 = stimulus interval,
- 3 = post-stimulus interval.
For each column from column 6 to the last column, I need to calculate its correlation with column 5 during the stimulus interval only (i.e., rows where column 2 equals 2). The analysis should include only specific stimuli, such as Stimulus 1 and Stimulus 3 (column 1). I would like to save the correlation coefficient for each column. Could you help me implement this in MATLAB?
I am attaching an example data file below.
Many Thanks in Advance!
Accepted Answer
More Answers (1)
Sameer
on 5 Dec 2024
Hi @EK
From my understanding, you want to calculate the correlation between a specific column ("column 5") and other columns in your dataset during specific conditions: when a stimulus is presented (interval value 2) and for certain stimulus IDs (1 and 3).
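1. Select columns for stimulus IDs, intervals, and the column of interest (column 5).
% 'data' is assumed to be a table already loaded from the Excel file (e.g. with readtable)
dataArray = table2array(data);
stimulusID = dataArray(:, 1);   % column 1: stimulus IDs (0 = no stimulus)
interval = dataArray(:, 2);     % column 2: 1 = pre, 2 = stimulus, 3 = post
column5 = dataArray(:, 5);      % reference column for the correlations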
2. Identify rows where the stimulus interval is 2 and the stimulus ID is either 1 or 3.
filterIdx = (interval == 2) & (stimulusID == 1 | stimulusID == 3);
3. Loop through each column from column 6 to the last column and calculate the correlation with column 5 for the filtered data.
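numColumns = size(dataArray, 2);
correlationCoefficients = zeros(1, numColumns - 5);
% The reference data does not change, so extract it once outside the loop
column5Data = column5(filterIdx);
for col = 6:numColumns
    columnData = dataArray(filterIdx, col);
    % corr requires the Statistics and Machine Learning Toolbox
    correlationCoefficients(col - 5) = corr(column5Data, columnData);
end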
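The three steps above can be tried end to end on synthetic data. The sketch below is only an illustration: the matrix `dataArray`, its size, and the list `stimuliToUse` are made up for the example and are not the attached file. It also swaps in `ismember` for the chained `==` comparisons so the stimulus list is easier to change, and uses `corrcoef` pairwise so it runs without the Statistics and Machine Learning Toolbox.

```matlab
% Synthetic stand-in for the recording (assumption: real data comes from the Excel file)
rng(0);                                      % reproducible random numbers
nRows = 200;
stimulusID = repelem([1; 2; 3; 0], 50);      % hypothetical column 1: 50 rows per ID
interval = repmat([1; 2; 2; 2; 3], 40, 1);   % hypothetical column 2: pre/stim/post pattern
dataArray = [stimulusID, interval, zeros(nRows, 2), randn(nRows, 4)];
dataArray(:, 6) = 2 * dataArray(:, 5) + 1;   % perfectly correlated with column 5
dataArray(:, 7) = -dataArray(:, 5);          % perfectly anti-correlated with column 5

% Keep rows in the stimulus interval that belong to the chosen stimuli
stimuliToUse = [1 3];                        % hypothetical selection
filterIdx = (dataArray(:, 2) == 2) & ismember(dataArray(:, 1), stimuliToUse);

% Correlate column 5 with every later column on the filtered rows
reference = dataArray(filterIdx, 5);
numColumns = size(dataArray, 2);
correlationCoefficients = zeros(1, numColumns - 5);
for col = 6:numColumns
    r = corrcoef(reference, dataArray(filterIdx, col));  % 2-by-2 matrix
    correlationCoefficients(col - 5) = r(1, 2);          % off-diagonal entry
end
disp(correlationCoefficients);
```

Columns 6 and 7 are constructed as exact linear functions of column 5, so their coefficients should come out as 1 and -1 up to rounding, which gives a quick sanity check before running the same code on real data.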
Hope this helps!
3 Comments
Andrew Frane
on 5 Dec 2024
I'm guessing the OP wanted to compute the correlations for each stimulus separately, rather than pooling the Stimulus 1 and Stimulus 3 data together, though it wasn't entirely clear.
But either way, calling the corrcoef function once to obtain the full correlation matrix would be more efficient than calling the corr function repeatedly in a for-loop to compute the correlations one at a time. Also, corr requires the Statistics and Machine Learning Toolbox, whereas corrcoef is part of base MATLAB. So I'd suggest replacing your entire Step 3 with something like this, which produces the same result:
% correlation matrix for variable 5:end in filtered data matrix
correlationMatrix = corrcoef( dataArray(filterIdx, 5:end) ) ;
% extract row 1 from that correlation matrix (and omit first value, which is
% just the correlation of variable 5 with itself) so we have a vector of
% correlations between variable 5 and each subsequent respective variable
correlationCoefficients = correlationMatrix(1, 2:end) ;
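If the correlations are wanted for each stimulus separately, which is the other reading of the question mentioned above, one possible sketch is to call `corrcoef` once per stimulus and stack the resulting rows. The synthetic `dataArray` and the names `stimuliToUse` and `perStimulusCorr` are assumptions for illustration only. The data is built so that column 6 tracks column 5 for Stimulus 1 and mirrors it for Stimulus 3, a case where pooling would hide the per-stimulus relationship.

```matlab
% Synthetic data (assumption: stands in for the real recording)
rng(1);
nRows = 120;
stimulusID = repelem([1; 2; 3], 40);        % hypothetical column 1
interval = repmat([1; 2; 2; 3], 30, 1);     % hypothetical column 2
dataArray = [stimulusID, interval, zeros(nRows, 2), randn(nRows, 3)];
dataArray(:, 6) = dataArray(:, 5);          % column 6 equals column 5 ...
isStim3 = (stimulusID == 3);
dataArray(isStim3, 6) = -dataArray(isStim3, 5);   % ... except it is flipped for Stimulus 3

stimuliToUse = [1 3];
numTargets = size(dataArray, 2) - 5;        % columns 6:end
perStimulusCorr = zeros(numel(stimuliToUse), numTargets);
for k = 1:numel(stimuliToUse)
    % Rows in the stimulus interval for this one stimulus
    rows = (dataArray(:, 2) == 2) & (dataArray(:, 1) == stimuliToUse(k));
    R = corrcoef(dataArray(rows, 5:end));   % full correlation matrix, columns 5:end
    perStimulusCorr(k, :) = R(1, 2:end);    % column 5 versus each later column
end
disp(perStimulusCorr);
```

Each row of `perStimulusCorr` corresponds to one entry of `stimuliToUse`, and each column to one of the data columns from column 6 onward.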
EK
on 7 Dec 2024
Andrew Frane
on 7 Dec 2024
I edited my accepted answer so it also includes a version for Stimulus 1 and 3 pooled together.