Misplaced Pages

Concept class

Article snapshot taken from Wikipedia with creative commons attribution-sharealike license. Give it a read and then ask your questions in the chat. We can research this topic together.
Computational learning theory
This article has multiple issues. Please help improve it or discuss these issues on the talk page. (Learn how and when to remove these messages)
This article provides insufficient context for those unfamiliar with the subject. Please help improve the article by providing more context for the reader. (December 2021) (Learn how and when to remove this message)
This article may need to be rewritten to comply with Misplaced Pages's quality standards, as the article should cover more aspects of concept classes in a more balanced way, so as to avoid violating WP:UNDUE. You can help. The talk page may contain suggestions. (December 2021)
(Learn how and when to remove this message)

In computational learning theory in mathematics, a concept over a domain X is a total Boolean function over X. A concept class is a class of concepts. Concept classes are a subject of computational learning theory.

Concept class terminology frequently appears in model theory associated with probably approximately correct (PAC) learning. In this setting, if one takes a set Y as a set of (classifier output) labels, and X is a set of examples, the map c : X Y {\displaystyle c:X\to Y} , i.e. from examples to classifier labels (where Y = { 0 , 1 } {\displaystyle Y=\{0,1\}} and where c is a subset of X), c is then said to be a concept. A concept class C {\displaystyle C} is then a collection of such concepts.

Given a class of concepts C, a subclass D is reachable if there exists a sample s such that D contains exactly those concepts in C that are extensions to s. Not every subclass is reachable.

Background

This section needs expansion. You can help by adding to it. (December 2021)

A sample s {\displaystyle s} is a partial function from X {\displaystyle X} to { 0 , 1 } {\displaystyle \{0,1\}} . Identifying a concept with its characteristic function mapping X {\displaystyle X} to { 0 , 1 } {\displaystyle \{0,1\}} , it is a special case of a sample.

Two samples are consistent if they agree on the intersection of their domains. A sample s {\displaystyle s'} extends another sample s {\displaystyle s} if the two are consistent and the domain of s {\displaystyle s} is contained in the domain of s {\displaystyle s'} .

Examples

Suppose that C = S + ( X ) {\displaystyle C=S^{+}(X)} . Then:

  • the subclass { { x } } {\displaystyle \{\{x\}\}} is reachable with the sample s = { ( x , 1 ) } {\displaystyle s=\{(x,1)\}} ;
  • the subclass S + ( Y ) {\displaystyle S^{+}(Y)} for Y X {\displaystyle Y\subseteq X} are reachable with a sample that maps the elements of X Y {\displaystyle X-Y} to zero;
  • the subclass S ( X ) {\displaystyle S(X)} , which consists of the singleton sets, is not reachable.

Applications

Let C {\displaystyle C} be some concept class. For any concept c C {\displaystyle c\in C} , we call this concept 1 / d {\displaystyle 1/d} -good for a positive integer d {\displaystyle d} if, for all x X {\displaystyle x\in X} , at least 1 / d {\displaystyle 1/d} of the concepts in C {\displaystyle C} agree with c {\displaystyle c} on the classification of x {\displaystyle x} . The fingerprint dimension F D ( C ) {\displaystyle FD(C)} of the entire concept class C {\displaystyle C} is the least positive integer d {\displaystyle d} such that every reachable subclass C C {\displaystyle C'\subseteq C} contains a concept that is 1 / d {\displaystyle 1/d} -good for it. This quantity can be used to bound the minimum number of equivalence queries needed to learn a class of concepts according to the following inequality: F D ( C ) 1 # E Q ( C ) F D ( C ) ln ( | C | ) {\textstyle FD(C)-1\leq \#EQ(C)\leq \lceil FD(C)\ln(|C|)\rceil } .

References

  1. Chase, H., & Freitag, J. (2018). Model Theory and Machine Learning. arXiv preprint arXiv:1801.06566.
  2. ^ Angluin, D. (2004). "Queries revisited" (PDF). Theoretical Computer Science. 313 (2): 188–191. doi:10.1016/j.tcs.2003.11.004.
Category: