Feature selection

Feature subset selection (FSS), or feature selection for short, is a machine learning approach in which only a subset of the available features is used by a learning algorithm. FSS is necessary because it is sometimes technically impossible to include all features, or because the classes become hard to discriminate when a large number of features but only a small number of samples is available.

Filter approach

Compute a measure that indicates how well each feature distinguishes between the classes. Weight the features by this measure and choose the best n. The learning algorithm is then applied to this feature subset. Filters evaluate the intrinsic properties of the data either univariately (e.g. Euclidean distance, chi-square test) or multivariately (e.g. correlation-based filter).
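The ranking step can be illustrated with a minimal univariate sketch. The score used here (absolute difference of the class means divided by the sum of the class standard deviations) is only one possible choice and is an assumption of this example, as are the function name filter_select and the toy data.

```python
from statistics import mean, pstdev

def filter_select(X, y, n):
    """Score every feature on its own and keep the n best (hypothetical helper)."""
    scores = []
    for j in range(len(X[0])):
        col0 = [row[j] for row, label in zip(X, y) if label == 0]
        col1 = [row[j] for row, label in zip(X, y) if label == 1]
        # Class separation: distance of the class means relative to the spread.
        spread = pstdev(col0) + pstdev(col1) + 1e-12
        scores.append(abs(mean(col0) - mean(col1)) / spread)
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:n]), scores

# Toy data (assumed): feature 0 separates the classes, features 1 and 2 barely do.
X = [
    [1.0, 5.0, 0.3],
    [1.2, 4.0, 0.9],
    [0.9, 6.0, 0.1],
    [3.0, 5.5, 0.8],
    [3.2, 4.5, 0.2],
    [2.9, 5.0, 0.7],
]
y = [0, 0, 0, 1, 1, 1]

selected, scores = filter_select(X, y, n=1)
print(selected)
```

The learning algorithm never appears in this computation, which is why a filter is fast but cannot take the learner's behaviour into account.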

Advantages:

  • Quickly computable
  • Scalable
  • Intuitively interpretable

Cons:

  • Redundant features (related features receive similar weights)
  • Dependencies on the learning algorithm are ignored

Wrapper approach

Search the set of all possible feature subsets. The learning algorithm is applied to each candidate subset. The search can be either deterministic (e.g. forward selection, backward elimination) or randomized (e.g. simulated annealing, genetic algorithms).
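A deterministic wrapper search can be sketched as greedy forward selection. The learner (a nearest-centroid classifier), the evaluation (leave-one-out accuracy) and all function names are assumptions chosen to keep the example self-contained; any learning algorithm and validation scheme could take their place.

```python
def nearest_centroid_predict(train_X, train_y, x, features):
    """Classify x by the closest class centroid, using only the given features."""
    best_label, best_dist = None, float("inf")
    for label in sorted(set(train_y)):
        rows = [r for r, l in zip(train_X, train_y) if l == label]
        centroid = [sum(r[j] for r in rows) / len(rows) for j in features]
        dist = sum((x[j] - c) ** 2 for j, c in zip(features, centroid))
        if dist < best_dist:
            best_label, best_dist = label, dist
    return best_label

def loo_accuracy(X, y, features):
    """Leave-one-out accuracy of the learner on one candidate feature subset."""
    hits = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        hits += nearest_centroid_predict(train_X, train_y, X[i], features) == y[i]
    return hits / len(X)

def forward_selection(X, y):
    """Add the feature that improves accuracy most; stop when nothing improves."""
    selected, best = [], 0.0
    remaining = list(range(len(X[0])))
    while remaining:
        candidates = [(loo_accuracy(X, y, selected + [j]), j) for j in remaining]
        score, j = max(candidates, key=lambda c: c[0])
        if score <= best:
            break
        selected.append(j)
        remaining.remove(j)
        best = score
    return selected, best

# Toy data (assumed): only feature 0 separates the two classes cleanly.
X = [
    [1.0, 5.0, 0.3],
    [1.2, 4.0, 0.9],
    [0.9, 6.0, 0.1],
    [3.0, 5.5, 0.8],
    [3.2, 4.5, 0.2],
    [2.9, 5.0, 0.7],
]
y = [0, 0, 0, 1, 1, 1]

selected, best = forward_selection(X, y)
print(selected, best)
```

Every candidate subset requires retraining and re-evaluating the learner, which illustrates why wrappers are time consuming, and the greedy search can stop at a local optimum.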

Advantages:

  • Finds a feature subset that is well matched to the learning algorithm
  • Considers combinations of features, not only each feature individually
  • Removes redundant features
  • Easy to implement
  • Interacts with learning algorithm

Cons:

  • Very time consuming
  • Heuristic search carries the risk of finding only local optima
  • Risk of overfitting the data
  • Dependent on the learning algorithm

Embedded approach

The search for an optimal subset is built directly into the learning algorithm.

Advantages:

  • Better runtime performance and lower complexity
  • Dependencies between data points are modeled

Cons:

  • Selecting the subset depends strongly on the learning algorithm.

Examples:

  • Decision trees
  • Weighted naive Bayes
  • Selection of the subset using the weight vector of an SVM
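As a rough illustration of the decision tree case, the sketch below fits a one-level tree (a decision stump). The feature it splits on is chosen during training itself, so selection and learning happen in one step. The function name fit_stump and the toy data are assumptions of this example; a full decision tree would repeat the same step recursively.

```python
def fit_stump(X, y):
    """Fit a one-level decision tree; return (feature, threshold, errors)."""
    best = None
    for j in range(len(X[0])):
        values = sorted(set(row[j] for row in X))
        # Candidate thresholds lie halfway between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            # Count mistakes when predicting class 1 for values above t,
            # then also allow the opposite polarity.
            wrong = sum((row[j] > t) != (label == 1) for row, label in zip(X, y))
            errors = min(wrong, len(X) - wrong)
            if best is None or errors < best[2]:
                best = (j, t, errors)
    return best

# Toy data (assumed): feature 0 separates the two classes cleanly.
X = [
    [1.0, 5.0, 0.3],
    [1.2, 4.0, 0.9],
    [0.9, 6.0, 0.1],
    [3.0, 5.5, 0.8],
    [3.2, 4.5, 0.2],
    [2.9, 5.0, 0.7],
]
y = [0, 0, 0, 1, 1, 1]

feature, threshold, errors = fit_stump(X, y)
used_features = {feature}  # features the model never splits on are discarded
print(used_features, errors)
```

The selected subset is whatever the trained model ends up using, so a different learning algorithm may select different features from the same data.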