Talk:Boltzmann machine: Difference between revisions

Article snapshot taken from Wikipedia with creative commons attribution-sharealike license. Give it a read and then ask your questions in the chat. We can research this topic together.
Revision as of 08:31, 26 September 2011

Training sign

I removed the minus sign from the RHS of this:

{\frac {\partial G}{\partial w_{ij}}}={\frac {1}{T}}\left[p_{ij}^{+}-p_{ij}^{-}\right]

If p+ is clamped and p− is unclamped, then we want to make the weights MORE like the clamped correlations and less like the unclamped ones, I think ... please check this! Charles Fox

You're incorrect; the minus sign is needed. —Preceding unsigned comment added by 72.137.60.77 (talk) 17:36, 5 April 2009 (UTC)
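For concreteness, here is a minimal sketch of the weight update the gradient above implies, under the usual reading that p+ is the pairwise co-activation estimated with the visible units clamped to data and p− the same statistic with the network running freely. The function name, learning rate, and the toy correlation values are all hypothetical, not from the article.

```python
import numpy as np

def weight_update(p_plus, p_minus, lr=0.1, T=1.0):
    """One gradient step on the weights.

    p_plus:  estimate of <s_i s_j> with visible units clamped to data
    p_minus: estimate of <s_i s_j> with the network running freely
    The step increases a weight where the clamped correlation exceeds
    the free-running one, and decreases it otherwise.
    """
    return (lr / T) * (p_plus - p_minus)

# Toy symmetric correlation matrices (hypothetical values):
p_plus = np.array([[0.0, 0.6], [0.6, 0.0]])
p_minus = np.array([[0.0, 0.2], [0.2, 0.0]])
dw = weight_update(p_plus, p_minus)  # dw[0, 1] = 0.1 * (0.6 - 0.2) = 0.04
```

Whether the update is written with a leading plus or minus then depends only on whether G is being minimized or maximized, which is what the sign dispute above comes down to.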

What does "marginalize" mean in the following?

"We denote the converged distribution, after we marginalize it over the visible units V, as P − (V)." There is no other instance of this word in the article. Even a technically-minded reader wouldn't understand this article if the word isn't defined anywhere. - Will
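As a possible answer to the question: "marginalize" here means summing the joint distribution over the hidden units so that only a distribution over the visible units V remains. A minimal sketch, with a hypothetical joint distribution over one visible and one hidden binary unit:

```python
# Hypothetical joint distribution P-(V, H); keys are (v, h) pairs.
joint = {(0, 0): 0.1, (0, 1): 0.2, (1, 0): 0.3, (1, 1): 0.4}

def marginalize_over_hidden(joint):
    """P-(V) = sum over hidden states h of P-(V, h)."""
    p_v = {}
    for (v, h), p in joint.items():
        p_v[v] = p_v.get(v, 0.0) + p
    return p_v

p_v = marginalize_over_hidden(joint)  # {0: 0.3, 1: 0.7}
```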

CRF

Is the Boltzmann machine the same as a Conditional Random Field? If so, that should be mentioned somewhere!

No, it isn't. A CRF can however be viewed as convexified Boltzmann machine with hand-picked features. - DaveWF 06:10, 19 April 2007 (UTC)

The threshold

What is the importance of the threshold parameter? How is it set?

Learned like any other parameter. Just have a connection wired to '+1' all the time instead of another unit. I should add this. - DaveWF 06:10, 19 April 2007 (UTC)

Can threshold be referred to as bias? Also, the link on threshold takes you to the disambiguation page, which has no articles describing threshold in this context.
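To illustrate the "connection wired to +1" point made above: yes, the threshold can be treated as a bias, learned like any other weight. A minimal sketch (the function name and numbers are hypothetical):

```python
import numpy as np

def net_input(weights, states, bias_weight):
    """Net input to a unit with the threshold folded in as a bias.

    A weight to a unit that is always +1 contributes bias_weight
    unconditionally, which is the same as subtracting a threshold
    theta = -bias_weight from the weighted sum.
    """
    return weights @ states + bias_weight * 1.0

states = np.array([1.0, 0.0, 1.0])
weights = np.array([0.5, -0.3, 0.2])
net = net_input(weights, states, bias_weight=-0.4)  # 0.5 + 0.2 - 0.4 = 0.3
```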

The Training Section

I have a problem understanding what P+(Vα) is. P+ is the distribution of the states after the values for Vα are fixed. So P+(Vα) should be 1 for those fixed values and 0 for any other values of Vα.

Also, what does α iterate over in the summation for G?
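One common reading, offered here as a sketch rather than a definitive answer: P+(Vα) is the empirical probability of visible configuration Vα in the training set (not a delta function for a single clamped pattern), and α ranges over all possible visible configurations in the sum for G. With a hypothetical three-unit training set:

```python
from collections import Counter
from itertools import product

data = [(1, 1, 0), (1, 0, 1), (1, 0, 0)]  # hypothetical training set

# Empirical distribution over all 2^3 visible configurations:
counts = Counter(data)
p_plus = {v: counts[v] / len(data) for v in product([0, 1], repeat=3)}
# Each observed pattern gets probability 1/3; unseen patterns get 0.
```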

The cost function

What is the cost function? What cost does it measure? How do we train the network if we have more than one input?

{-1,1} or {0,1} ?

In the definition of s the article claims that si is either -1 or 1. Five lines below, it says that the nodes are in state 0 or 1, which is also what I found in (admittedly older) literature on the subject. Is the {-1,1} simply wrong or am I missing something? —Preceding unsigned comment added by Drivehonor (talkcontribs) 13:56, 7 August 2007

I think either representation should work. But I'm not sure. Can anyone confirm this? —Preceding unsigned comment added by Zholyte (talkcontribs) 19:47, 10 November 2007 (UTC)



You probably just don't understand, because it doesn't matter at all. —Preceding unsigned comment added by 130.15.15.193 (talk) 23:12, 2 December 2009 (UTC)
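To make the "it doesn't matter" claim concrete: the two conventions are related by the affine map s' = 2s − 1, so any {0,1} state vector can be rewritten in {−1,1} form and back without losing information (the weights and biases can be transformed correspondingly so that the energies agree up to a constant). A minimal sketch of the mapping, with hypothetical function names:

```python
def to_pm(s01):
    """Map {0,1} states to {-1,1} states via s' = 2s - 1."""
    return [2 * s - 1 for s in s01]

def to_01(spm):
    """Inverse map: {-1,1} states back to {0,1}."""
    return [(s + 1) // 2 for s in spm]

states = [0, 1, 1, 0]
round_trip = to_01(to_pm(states))  # recovers the original states
```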

Good article, but...

WHY is it called a Boltzmann machine? Is it named after Ludwig Boltzmann? The Ludwig Boltzmann article references this one... but that can't be determinant. --Nehushtan (talk) 22:10, 12 January 2009 (UTC)

Yes, it's named for Ludwig Boltzmann. AmiDaniel (talk) 08:31, 26 September 2011 (UTC)

Question : the new phase of learning ?

Question: "Later, the weights are updated to maximize the probability of the network producing the completed data." What does this mean? Is this a new phase of learning? Does it mean that the p_{ij}^{+} are held constant in this phase, computed as a characteristic of the learning set? For example, if our data set is {{1,1,0},{1,0,1},{1,0,0}}, then p_{12}^{+} = 1/3, p_{13}^{+} = 1/3, p_{23}^{+} = 0 in all later iterations? Peter 212.76.37.154 (talk) 16:04, 28 January 2009 (UTC)
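The numbers in the question can be checked directly: with the usual definition, p_{ij}^{+} is the fraction of training patterns in which units i and j are both on. A minimal sketch over the data set given above (0-based indices, so p_{12}^{+} is `p_plus(data, 0, 1)`):

```python
data = [(1, 1, 0), (1, 0, 1), (1, 0, 0)]  # the data set from the question

def p_plus(data, i, j):
    """Fraction of training patterns in which units i and j are both on."""
    return sum(v[i] * v[j] for v in data) / len(data)

# p12+ = 1/3 (only the first pattern has units 1 and 2 both on),
# p13+ = 1/3 (only the second pattern), p23+ = 0 (no pattern).
```

So the values quoted in the question do follow from the data set; whether they are then held fixed across later iterations is a separate question about the training schedule.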