
Deterministic noise

Article snapshot taken from Wikipedia under the Creative Commons Attribution-ShareAlike license.

In (supervised) machine learning, and specifically when learning from data, there are situations in which part of the data values cannot be modeled. This can happen for two reasons. The first is stochastic noise: random fluctuations or measurement errors in the data that the model does not capture. The second arises when the phenomenon being modeled (or learned) is more complex than the model can represent, so the data contain complexity that is left unmodeled. This added complexity in the data has been called deterministic noise. Although the two types of noise arise from different causes, their adverse effect on learning is similar. In both cases, overfitting occurs because the model attempts to fit the noise (the part of the data that it cannot model) at the expense of fitting the part of the data that it can model. When either type of noise is present, it is usually advisable to regularize the learning algorithm to prevent the model from overfitting the data and performing worse as a result. Regularization typically yields a model with lower variance at the expense of higher bias.
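The distinction can be made concrete with a small numerical sketch. It is illustrative only and assumes a hypothetical target function sin(πx) on [-1, 1], polynomial hypothesis sets, and helper names (`fit_poly`, `deterministic_noise_energy`, `expected_out_of_sample_error`) that do not come from the cited sources. The labels contain no stochastic noise at all. The deterministic noise is measured as the residual between the target and its best approximation within the hypothesis set. The second function estimates how well each hypothesis set generalizes when it is trained on only two noiseless points.

```python
import numpy as np

def target(x):
    # Hypothetical target function, chosen only for illustration.
    return np.sin(np.pi * x)

def fit_poly(x, y, degree):
    # Ordinary least squares over the polynomial features 1, x, ..., x^degree.
    Z = np.vander(x, degree + 1, increasing=True)
    w, *_ = np.linalg.lstsq(Z, y, rcond=None)
    return w

def predict(w, x):
    return np.vander(x, len(w), increasing=True) @ w

# Dense grid standing in for the input distribution (uniform on [-1, 1]).
grid = np.linspace(-1.0, 1.0, 2001)

def deterministic_noise_energy(degree):
    # Best approximation of the target within the hypothesis set.
    w_star = fit_poly(grid, target(grid), degree)
    # What remains is the part of the target the model cannot represent.
    residual = target(grid) - predict(w_star, grid)
    return float(np.mean(residual ** 2))

def expected_out_of_sample_error(degree, n_points=2, n_trials=2000, seed=0):
    # Average squared error over many small, noiseless training sets.
    rng = np.random.default_rng(seed)
    errors = []
    for _ in range(n_trials):
        x = rng.uniform(-1.0, 1.0, size=n_points)
        w = fit_poly(x, target(x), degree)  # labels carry no random noise
        errors.append(np.mean((predict(w, grid) - target(grid)) ** 2))
    return float(np.mean(errors))

for degree in (0, 1):
    print(
        f"degree {degree}: "
        f"deterministic noise energy = {deterministic_noise_energy(degree):.3f}, "
        f"expected out-of-sample error = {expected_out_of_sample_error(degree):.3f}"
    )
```

Under these assumptions the line (degree 1) leaves less deterministic noise than the constant (degree 0), yet it generalizes worse from two points, even though the data are noiseless. With so little data, the extra flexibility is spent chasing structure in the target that neither model can represent.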

One may also try to alleviate the effects of noise by detecting and removing noisy training examples before training the supervised learning algorithm. Several algorithms exist for identifying noisy training examples, and removing the suspected examples before training will usually improve performance.
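As a rough illustration of the filtering idea, the sketch below flags a training example as suspect when its label disagrees with the majority label of its k nearest neighbours, with the example itself left out. This is a deliberately simple stand-in and not the specific algorithms of the cited papers. The function name `flag_suspect_labels`, the toy data, and the choice k = 3 are all assumptions made for the example.

```python
import numpy as np

def flag_suspect_labels(X, y, k=3):
    """Return indices whose label disagrees with the majority of their k nearest neighbours."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    # Pairwise squared Euclidean distances between all training examples.
    diffs = X[:, None, :] - X[None, :, :]
    dist = np.sum(diffs ** 2, axis=-1)
    np.fill_diagonal(dist, np.inf)  # leave each example out of its own vote
    suspects = []
    for i in range(len(y)):
        neighbours = np.argsort(dist[i], kind="stable")[:k]
        labels, counts = np.unique(y[neighbours], return_counts=True)
        majority = labels[np.argmax(counts)]
        if majority != y[i]:
            suspects.append(i)
    return suspects

# Toy data (hypothetical): two well-separated 1-D clusters, one class each.
X = np.concatenate([np.arange(0, 10), np.arange(50, 60)]).reshape(-1, 1)
y_clean = np.array([0] * 10 + [1] * 10)

# Corrupt two labels to simulate noisy training examples.
y_noisy = y_clean.copy()
y_noisy[[3, 15]] = 1 - y_noisy[[3, 15]]

suspects = flag_suspect_labels(X, y_noisy, k=3)
print(suspects)  # [3, 15]

# Drop the suspected examples before training any downstream learner.
keep = np.setdiff1d(np.arange(len(y_noisy)), suspects)
X_filtered, y_filtered = X[keep], y_noisy[keep]
```

On real data such a filter can also discard correctly labeled examples near a class boundary, so the choice of k and of the filtering rule is a trade-off rather than a fixed recipe.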

References

  1. Yaser S. Abu-Mostafa; Malik Magdon-Ismail; Hsuan-Tien Lin (March 2012). Learning From Data. AMLBook.
  2. C.E. Brodley; M.A. Friedl (1999). "Identifying and Eliminating Mislabeled Training Instances". Journal of Artificial Intelligence Research. 11: 131–167. (http://jair.org/media/606/live-606-1803-jair.pdf Archived 2016-05-12 at the Wayback Machine)
  3. M.R. Smith; T. Martinez (2011). "Improving Classification Accuracy by Identifying and Removing Instances that Should Be Misclassified". Proceedings of International Joint Conference on Neural Networks (IJCNN 2011). pp. 2690–2697. CiteSeerX 10.1.1.221.1371. doi:10.1109/IJCNN.2011.6033571.

