Dimensionality reduction is a way of converting a dataset with many features into one with fewer features while preserving as much of the original information as possible. The number of input features, variables, or columns in a dataset is called its dimensionality, and the process of reducing these features is called dimensionality reduction. In many cases a dataset contains a huge number of input features, which makes the predictive modeling task more complicated: it is very difficult to visualize such data or make reliable predictions from it, so dimensionality reduction techniques are needed.
Components of Dimensionality Reduction
There are two components of dimensionality reduction:
Feature selection: Here we try to find a subset of the original set of variables, or features, keeping only the smaller subset that is most useful for modeling the problem.
Feature extraction: This transforms the data from a high-dimensional space into a lower-dimensional space, i.e. a space with fewer dimensions, by deriving new features from the original ones.
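The two components can be contrasted in a minimal sketch. This toy example (a hypothetical dataset built with NumPy, not from the original post) selects original columns by a variance threshold, then extracts new features by projecting onto the top two principal components via SVD:

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy dataset: 100 samples, 4 features; feature 3 is nearly constant,
# and feature 2 is a noisy copy of feature 0 (redundant).
X = rng.normal(size=(100, 4))
X[:, 2] = X[:, 0] + 0.01 * rng.normal(size=100)
X[:, 3] = 0.001 * rng.normal(size=100)

# Feature selection: keep original columns whose variance exceeds a threshold.
variances = X.var(axis=0)
selected = X[:, variances > 0.01]   # drops the near-constant column

# Feature extraction: project onto the top-2 principal components (PCA).
Xc = X - X.mean(axis=0)             # center the data first
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
X_pca = Xc @ Vt[:2].T               # new derived features, not original columns

print(selected.shape)  # (100, 3)
print(X_pca.shape)     # (100, 2)
```

Note the difference: selection keeps a subset of the original columns unchanged, while extraction produces entirely new columns that are combinations of the originals.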
Advantages of Dimensionality Reduction
- By reducing the dimensions of the features, the space required to store the dataset also gets reduced.
- Less computation and training time are required when the features have fewer dimensions.
- Fewer feature dimensions make it easier to visualize the data quickly.
- It removes the redundant features (if present) by taking care of multicollinearity.
Disadvantages of Dimensionality Reduction
- Some information may be lost during dimensionality reduction.
- In the PCA dimensionality reduction technique, the number of principal components to keep is sometimes not known in advance.
#DimensionalityReduction #MachineLearning #Probyto #ProbytoAI
Subscribe and follow us for the latest news in Data Science and Machine Learning, and stay updated!