Methods to filter a feature matrix to change balance of labels (binary classification)?

I am working on a binary classification problem that is very unbalanced, e.g., 90% class 0, and 10% class 1. I am experimenting with filtering the feature matrix to increase the percentage of class 1 observations (e.g., remove all observations where feature 1 is < X and feature 10 > Y), which has shown promising results. I'm trying to find good methods of doing this, any suggestions or links to related processes would be really appreciated!
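For readers following along, the filtering described above could be sketched roughly as below. This is only an illustration on toy data: the matrix values, the column indices `colA` and `colB`, and the thresholds `threshA` and `threshB` are all made-up stand-ins for "feature 1 < X" and "feature 10 > Y", and the two conditions are assumed to be combined with a logical AND.

```matlab
% Toy data for illustration only: X is an n-by-p feature matrix,
% y is an n-by-1 vector of labels (0 or 1).
X = [0.2 3.1; 0.4 2.8; 0.1 3.5; 0.9 0.5; 1.2 0.7; ...
     0.3 2.9; 1.5 0.2; 0.2 3.3; 0.5 2.7; 1.1 3.0];
y = [0; 0; 0; 1; 0; 0; 1; 0; 0; 0];

% Hypothetical columns and thresholds standing in for the rule in the question.
colA = 1;  threshA = 0.8;   % "feature 1 < X"
colB = 2;  threshB = 2.0;   % "feature 10 > Y"

% Rows to remove: both conditions must hold (assumed interpretation).
removeMask = X(:, colA) < threshA & X(:, colB) > threshB;

% Filtered feature matrix and labels.
Xf = X(~removeMask, :);
yf = y(~removeMask);

% Class balance before and after filtering, and how much of class 1 survived.
fracBefore = mean(y == 1);
fracAfter  = mean(yf == 1);
keptClass1 = sum(yf == 1) / sum(y == 1);

fprintf('Class 1 share: %.2f before, %.2f after; class 1 rows kept: %.2f\n', ...
        fracBefore, fracAfter, keptClass1);
```

Tracking `keptClass1` alongside the new class balance makes it easier to see how many class 1 observations each candidate rule gives up.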
Due to the nature of the problem, the precision of classifying class 1 is my priority (i.e. it is okay if only 0.5% of observations are identified as class 1, as long as those classifications have good accuracy). Considering this, I don't mind losing a significant portion of class 1 observations from filtering, assuming class 1 is the resulting dominant class.
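As a point of reference, precision for class 1 can be computed directly from true and predicted labels. The sketch below uses made-up label vectors (`yTrue` and `yPred` are hypothetical, not from my data) and only base MATLAB operations.

```matlab
% Hypothetical true and predicted labels (0 or 1), for illustration only.
yTrue = [0; 0; 1; 1; 0; 1; 0; 0; 0; 1];
yPred = [0; 0; 1; 0; 0; 1; 1; 0; 0; 0];

tp = sum(yPred == 1 & yTrue == 1);   % true positives
fp = sum(yPred == 1 & yTrue == 0);   % false positives
fn = sum(yPred == 0 & yTrue == 1);   % false negatives

% Precision: of the observations predicted as class 1, the share that really are class 1.
precision = tp / (tp + fp);

% Recall: of the true class 1 observations, the share that were found.
recall = tp / (tp + fn);

fprintf('Precision = %.3f, recall = %.3f\n', precision, recall);
```

Note that `precision` evaluates to NaN when no observation is predicted as class 1, so that case may need separate handling.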

Answers (1)

Rahul on 12 Mar 2025
I understand that your binary classification problem involves highly imbalanced data.
Here are some options which can be considered in this case:
  • Functions like 'corrcoef' and 'sequentialfs' can be used to identify relevant features in the imbalanced data.
  • You can set logical thresholds on certain features and standardize the data using the 'normalize' function.
  • Undersampling the majority class, or oversampling the minority class using methods like SMOTE, can be helpful in this scenario. The following MATLAB File Exchange submissions can be used:
Refer to the following MathWorks documentation to learn more:
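As a rough illustration of the undersampling and standardization options listed above, here is a minimal sketch on toy data. The variable names and the `ratio` parameter are hypothetical, and the random selection via 'randperm' is just one simple way to undersample, not a specific File Exchange submission.

```matlab
% Toy data for illustration only: 18 majority rows (class 0), 2 minority rows (class 1).
X = [(1:20)', (20:-1:1)'];
y = [zeros(18, 1); ones(2, 1)];

idxMaj = find(y == 0);   % row indices of the majority class
idxMin = find(y == 1);   % row indices of the minority class

% Hypothetical setting: number of majority rows to keep per minority row.
ratio = 1;
nKeep = min(numel(idxMaj), ratio * numel(idxMin));

% Randomly choose nKeep majority rows without replacement.
pick = randperm(numel(idxMaj), nKeep);
keepIdx = sort([idxMin; idxMaj(pick(:))]);

% Balanced subset of the data.
Xb = X(keepIdx, :);
yb = y(keepIdx);

% Standardize each column to zero mean and unit standard deviation
% (the default behavior of 'normalize' for a matrix).
Xz = normalize(Xb);

fprintf('Rows kept: %d (class 0: %d, class 1: %d)\n', ...
        numel(yb), sum(yb == 0), sum(yb == 1));
```

Undersampling is usually applied to the training set only, so that the test set still reflects the original class balance.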
Hope this helps! Thanks.

Release

R2021a
