What is the difference between observations and variables in the case of the pca() function?

I have a matrix with m rows and n columns, built from n individuals for person identification. So n is the number of persons, and m is the number of feature values per person (or the pixel intensity values of the person's image).
Now I want to calculate PCA using MATLAB's pca() function, but I am confused about observations and variables. What should I call n and m? Which one represents the observations and which one represents the variables?

Accepted Answer

Star Strider on 22 Nov 2017
From the documentation:
  • coeff = pca(X) returns the principal component coefficients, also known as loadings, for the n-by-p data matrix X. Rows of X correspond to observations and columns correspond to variables.
So each row is an observation and each column is a variable.
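Applied to the layout described in the question (one person per column), this means the matrix would need to be transposed if the persons are meant to be the observations. The following is only a minimal sketch under that assumption: the data are random, the sizes and the names D and X are made up for illustration, and pca() requires the Statistics and Machine Learning Toolbox.

```matlab
% Hypothetical data: m = 5 feature values for each of n = 20 persons,
% stored with one person per COLUMN, as described in the question.
rng(0);                      % reproducible random numbers
m = 5;  n = 20;
D = randn(m, n);             % m-by-n: rows = features, columns = persons

% pca() expects observations in rows and variables in columns,
% so transpose to n-by-m before calling it.
X = D.';                     % n-by-m: rows = persons, columns = features

[coeff, score, latent] = pca(X);

size(coeff)                  % one row per variable (feature)
size(score)                  % one row per observation (person)
```

Because n is larger than m in this sketch, coeff has one row and one column per feature, and score has one row per person.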
K M Ibrahim Khalilullah on 23 Nov 2017
Thanks. I want to use PCA to reduce the dimensionality of the data, which will then be used for classification.
Star Strider on 23 Nov 2017
My pleasure.
Principal components analysis is quite good for that. I once used it to design a filter when I did a linear discriminant analysis on short-time Fourier transforms of EEGs to classify which task the person was performing. The filter then isolated the relevant frequencies, making the classification much more efficient.
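For the dimensionality-reduction use case mentioned in this thread, one possible approach is to keep only the leading component scores and pass those to a classifier. This is a rough sketch, not the method either poster used: the data are synthetic, the 95 % variance threshold is an arbitrary assumption, and the names Xreduced, xNew and xNewReduced are hypothetical. The subtraction xNew - mu relies on implicit expansion (R2016b or later).

```matlab
% Hypothetical data: n = 50 persons (rows), p = 8 features (columns).
rng(1);
n = 50;  p = 8;
X = randn(n, 3) * randn(3, p) + 0.05 * randn(n, p);  % mostly low-dimensional

% explained(i) is the percentage of total variance captured by component i;
% mu is the 1-by-p vector of column means that pca() subtracts from X.
[coeff, score, ~, ~, explained, mu] = pca(X);

% Keep the smallest number of components that explain at least 95 %
% of the variance (the threshold is an assumption for this sketch).
k = find(cumsum(explained) >= 95, 1);

Xreduced = score(:, 1:k);    % n-by-k reduced features for a classifier

% A new observation is centered with the training mean, then projected.
xNew = randn(1, p);
xNewReduced = (xNew - mu) * coeff(:, 1:k);   % 1-by-k
```

The reduced matrix keeps one row per observation, so class labels stay aligned with the rows when the scores are handed to a classifier.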


More Answers (1)

Image Analyst on 23 Nov 2017
The person is treated as a variable and the feature value is the observation. The feature values for each person are listed in a column of the table. For example, maybe you have two people in a family and want to see if there is a relationship between the weight of the two people over time. You might be taking measurements every day (or month or whatever). So if you had 4 time points and 2 people, the array would be
w11, w21
w12, w22
w13, w23
w14, w24
where w1* are the weights of person #1 and w2* are the weights of person #2. The coefficients would be
coefficients = pca(weightsMatrix);
where weightsMatrix is the 2-D array given above.
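To make the weights example concrete, here is a small sketch with invented numbers (the weights are made up purely for illustration). Each row is one time point and each column is one person, so in this layout the time points are the observations and the two persons are the variables.

```matlab
% Hypothetical weights in kg: 4 time points (rows) by 2 persons (columns).
weightsMatrix = [70.1 82.3;
                 70.4 82.9;
                 69.8 82.0;
                 70.6 83.1];

% One column of loadings per principal component, one row per person.
coefficients = pca(weightsMatrix);

size(coefficients)           % 2-by-2 for two variables
```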
K M Ibrahim Khalilullah on 23 Nov 2017
That confuses me even more. My understanding is that the people/persons are the observations and their feature values are the variables. Please see this link: https://en.wikipedia.org/wiki/Data_matrix_(multivariate_statistics)
Image Analyst on 23 Nov 2017
Please describe the measurements you are making. If the ID of persons is your observation then you may need logistic regression instead of PCA.
