my-server
← Wiki

Natarajan dimension

In the theory of Probably Approximately Correct Machine Learning, the Natarajan dimension characterizes the complexity of learning a set of functions, generalizing from the Vapnik–Chervonenkis dimension for boolean functions to multi-class functions. Originally introduced as the Generalized Dimension by Natarajan, it was subsequently renamed the Natarajan Dimension by Haussler and Long.

Definition

Let be a set of functions from a set to a set . shatters a set if there exist two functions such that

  • For every .
  • For every , there exists a function such that

for all and for all .

The Natarajan dimension of H is the maximal cardinality of a set shattered by .

It is easy to see that if , the Natarajan dimension collapses to the Vapnik–Chervonenkis dimension.

Shalev-Shwartz and Ben-David present comprehensive material on multi-class learning and the Natarajan dimension, including uniform convergence and learnability. Recently, Cohen et al showed that the Natarajan dimension is the dominant term governing agnostic multi-class PAC learnability.

References