CMU 11-785 L19 Hopfield network - 优快云 Blog

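A Hopfield network is a recurrent neural network used for content-addressable storage and associative memory. Every neuron is connected to every other neuron; a neuron flips its state when the incoming signal opposes it, which lowers the energy function. After every flip the network moves toward an energy minimum, so it eventually settles on a stored pattern. However, Hopfield networks have a limited storage capacity and run into problems with non-orthogonal patterns, which can lead to incorrect memory recall.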


Hopfield Net

So far, neural networks for computation are all feedforward structures

Loopy network


Each neuron is a perceptron with +1/-1 output
- Every neuron receives input from every other neuron
- Every neuron outputs signals to every other neuron


At each time, each neuron receives a “field” $\sum_{j \neq i} w_{j i} y_{j}+b_{i}$
- If the sign of the field matches its own sign, it does not respond
- If the sign of the field opposes its own sign, it “flips” to match the sign of the field


If the sign of the field at any neuron opposes its own sign, it “flips” to match the field
- Which will change the field at other nodes
- Which may then flip… and so on…
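The response of a single neuron can be sketched in a few lines of NumPy. This is only an illustration: the helper names `local_field` and `respond`, the example weights, and the choice to leave a neuron unchanged when its field is exactly zero are assumptions, not part of the lecture.

```python
import numpy as np

def local_field(W, b, y, i):
    """Field at neuron i: sum over j != i of w_ji * y_j, plus b_i."""
    # Assumption: W[j, i] holds the weight from neuron j to neuron i.
    return W[:, i] @ y - W[i, i] * y[i] + b[i]

def respond(W, b, y, i):
    """Let neuron i respond to its field in place; return True if it flipped."""
    field = local_field(W, b, y, i)
    if field * y[i] < 0:      # sign of the field opposes the neuron's sign
        y[i] = -y[i]
        return True
    return False              # signs match (or the field is zero): no change

# Hypothetical 3-neuron network with symmetric weights and no bias.
W = np.array([[0, 1, -2],
              [1, 0, 3],
              [-2, 3, 0]])
b = np.zeros(3, dtype=int)
y = np.array([1, 1, 1])

flipped = respond(W, b, y, 0)   # field at neuron 0 is 1 - 2 = -1, so it flips
print(flipped, y)
```

After neuron 0 flips, the fields seen by neurons 1 and 2 change as well, which is what can trigger further flips.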

Flip behavior

Let $y^{-}_{i}$ be the output of the $i$ -th neuron just before it responds to the current field
Let $y_{i}^{+}$ be the output of the $i$ -th neuron just after it responds to the current field
if $y_{i}^{-}=\operatorname{sign}\left(\sum_{j \neq i} w_{j i} y_{j}+b_{i}\right)$ , then $y_{i}^{+} = y_{i}^{-}$
- If the sign of the field matches its own sign, it does not flip
- $y_{i}^{+}\left(\sum_{j \neq i} w_{j i} y_{j}+b_{i}\right)-y_{i}^{-}\left(\sum_{j \neq i} w_{j i} y_{j}+b_{i}\right)=0$
if $y_{i}^{-}\neq\operatorname{sign}\left(\sum_{j \neq i} w_{j i} y_{j}+b_{i}\right)$ , then $y_{i}^{+} = -y_{i}^{-}$
- $y_{i}^{+}\left(\sum_{j \neq i} w_{j i} y_{j}+b_{i}\right)-y_{i}^{-}\left(\sum_{j \neq i} w_{j i} y_{j}+b_{i}\right)=2 y_{i}^{+}\left(\sum_{j \neq i} w_{j i} y_{j}+b_{i}\right)$
- This term is always positive!
Every flip of a neuron is guaranteed to locally increase $y_{i}\left(\sum_{j \neq i} w_{j i} y_{j}+b_{i}\right)$

Globally

Consider the following sum across all nodes

$\begin{array}{c} D\left(y_{1}, y_{2}, \ldots, y_{N}\right)=\sum_{i} y_{i}\left(\sum_{j \neq i} w_{j i} y_{j}+b_{i}\right) \\\\ =\sum_{i, j \neq i} w_{i j} y_{i} y_{j}+\sum_{i} b_{i} y_{i} \end{array}$

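Assume $w_{ii} = 0$ and symmetric weights, $w_{ij} = w_{ji}$. Each pair $(i, j)$ should be counted once in $D$ (a sum over $j < i$ rather than $j \neq i$); with that convention the change in $D$ caused by a flip is exactly the expression given next.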
For any unit $k$ that “flips” because of the local field

$\Delta D\left(y_{k}\right)=D\left(y_{1}, \ldots, y_{k}^{+}, \ldots, y_{N}\right)-D\left(y_{1}, \ldots, y_{k}^{-}, \ldots, y_{N}\right)$

$\Delta D\left(y_{k}\right)=\left(y_{k}^{+}-y_{k}^{-}\right)\left(\sum_{j \neq k} w_{j k} y_{j}+b_{k}\right)$

This is always positive!
Every flip of a unit results in an increase in $D$
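The claim can be checked by brute force on a small network. The sketch below is an illustration under stated assumptions: the 4-neuron weights and biases are made up, the weights are symmetric with a zero diagonal, and `D` counts each pair of neurons once (so it equals $\frac{1}{2}\mathbf{y}^{T}\mathbf{W}\mathbf{y}+\mathbf{b}^{T}\mathbf{y}$).

```python
import itertools
import numpy as np

# Hypothetical symmetric weights with zero diagonal, plus integer biases.
W = np.array([[0, 2, -1, 3],
              [2, 0, 1, -2],
              [-1, 1, 0, 1],
              [3, -2, 1, 0]])
b = np.array([1, 0, -1, 2])
N = len(b)

def D(y):
    # y @ W @ y counts every pair twice, so halve it (always even here).
    return (y @ W @ y) // 2 + b @ y

gains = []  # (increase in D, field) for every possible flip
for state in itertools.product([-1, 1], repeat=N):
    y = np.array(state)
    for k in range(N):
        field = W[:, k] @ y + b[k]       # w_kk = 0, so no self term
        if field * y[k] < 0:             # unit k would flip
            y_plus = y.copy()
            y_plus[k] = -y[k]
            gains.append((int(D(y_plus) - D(y)), int(field)))

print(all(g == 2 * abs(f) for g, f in gains), min(g for g, _ in gains) > 0)
```

Every recorded gain equals twice the magnitude of the field at the flipping unit, which is why it can never be negative.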

Overall

Flipping a unit will result in an increase (non-decrease) of

$D=\sum_{i, j \neq i} w_{i j} y_{i} y_{j}+\sum_{i} b_{i} y_{i}$

$D$ is bounded

$D_{\max }=\sum_{i, j \neq i}\left|w_{i j}\right|+\sum_{i}\left|b_{i}\right|$

The minimum increment of $D$ in a flip is

$\Delta D_{\min }=\min _{i,\left\{y_{i}, i=1 \ldots N\right\}} 2\left|\sum_{j \neq i} w_{j i} y_{j}+b_{i}\right|$

Any sequence of flips must converge in a finite number of steps
- Think of this as an infinitely deep network in which the weights at every layer are identical
- Find the maximum layer!
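The finite-step argument can be turned into an explicit bound: $D$ lies between $-D_{\max}$ and $D_{\max}$, and every flip raises it by at least $\Delta D_{\min}$, so there can be at most $2 D_{\max} / \Delta D_{\min}$ flips. The sketch below checks this on a made-up 4-neuron network. The weights, the pair-counted-once form of $D_{\max}$, and the rule that a zero field never causes a flip are assumptions of this example.

```python
import itertools
import numpy as np

# Hypothetical symmetric weights with zero diagonal, plus integer biases.
W = np.array([[0, 2, -1, 3],
              [2, 0, 1, -2],
              [-1, 1, 0, 1],
              [3, -2, 1, 0]])
b = np.array([1, 0, -1, 2])
N = len(b)
states = list(itertools.product([-1, 1], repeat=N))

# Bound on |D| when each pair of neurons is counted once.
D_max = np.abs(np.triu(W, 1)).sum() + np.abs(b).sum()

# Smallest possible gain of a flip: twice the smallest nonzero |field|.
fields = [abs(W[:, i] @ np.array(s) + b[i]) for s in states for i in range(N)]
dD_min = 2 * min(f for f in fields if f != 0)

max_flips = 2 * D_max // dD_min   # D can rise from -D_max to D_max at most

def count_flips(y):
    """Sweep the neurons in order until nothing changes; return the flip count."""
    y = y.copy()
    flips, changed = 0, True
    while changed:
        changed = False
        for i in range(N):
            if (W[:, i] @ y + b[i]) * y[i] < 0:
                y[i] = -y[i]
                flips += 1
                changed = True
    return flips

worst = max(count_flips(np.array(s)) for s in states)
print(worst, "<=", max_flips)
```

The bound is usually loose; the point is only that it is finite.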

The Energy of a Hopfield Net

Define the Energy of the network as

$E=-\sum_{i, j \neq i} w_{i j} y_{i} y_{j}-\sum_{i} b_{i} y_{i}$

Just the negative of $D$
The evolution of a Hopfield network constantly decreases its energy
This is analogous to the potential energy of a spin glass (magnetic dipoles)
- The system will evolve until the energy hits a local minimum
From here on, the bias is dropped to simplify the discussion
The network will evolve until it arrives at a local minimum in the energy contour

Content-addressable memory


Each of the minima is a “stored” pattern
- If the network is initialized close to a stored pattern, it will inevitably evolve to the pattern
This is a content addressable memory
- Recall memory content from partial or corrupt values
Also called associative memory
- Evolve and recall pattern by content, not by location

Evolution


The network will evolve until it arrives at a local minimum in the energy contour
We proved that every change in the network results in a decrease in energy
So the path to an energy minimum is monotonic

For a 2-neuron net


Symmetric
- $-\frac{1}{2} \mathbf{y}^{T} \mathbf{W} \mathbf{y}=-\frac{1}{2}(-\mathbf{y})^{T} \mathbf{W}(-\mathbf{y})$
- If $\hat{y}$ is a local minimum, so is $-\hat{y}$
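For a 2-neuron net the symmetry is easy to see by tabulating the energy of all four states. The weight value 3 below is an arbitrary choice for illustration, and the bias is taken to be zero.

```python
import itertools
import numpy as np

# Hypothetical 2-neuron net: one symmetric weight, no bias.
W = np.array([[0, 3],
              [3, 0]])

def energy(W, y):
    # E = -1/2 * y^T W y
    return -0.5 * (y @ W @ y)

table = {s: energy(W, np.array(s)) for s in itertools.product([-1, 1], repeat=2)}
for state, e in table.items():
    print(state, e)
```

The states $(1, 1)$ and $(-1, -1)$ share the lowest energy, and the two mixed states share the highest, so the minima come in negated pairs.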

Computational algorithm


Very simple
Updates can be done sequentially, or all at once
Convergence is reached when the state does not change significantly any more

$E=-\sum_{i} \sum_{j>i} w_{j i} y_{j} y_{i}$
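A minimal sketch of the sequential version of this procedure is shown below. The function names `evolve` and `energy`, the `max_sweeps` cap, and the example weights are assumptions made for illustration; the bias is dropped, and a neuron with a zero field is left unchanged.

```python
import numpy as np

def evolve(W, y0, max_sweeps=100):
    """Update neurons one at a time until a full sweep changes nothing."""
    y = np.array(y0).copy()
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(y)):
            field = W[:, i] @ y          # w_ii = 0 and no bias
            if field * y[i] < 0:         # field opposes the neuron: flip it
                y[i] = -y[i]
                changed = True
        if not changed:                  # converged: no neuron wants to flip
            break
    return y

def energy(W, y):
    # Equals -sum_i sum_{j>i} w_ji y_j y_i for symmetric W with zero diagonal.
    return -0.5 * (y @ W @ y)

# Hypothetical 3-neuron network with symmetric weights.
W = np.array([[0, 1, -1],
              [1, 0, -1],
              [-1, -1, 0]])
start = np.array([1, -1, -1])
final = evolve(W, start)
print(start, energy(W, start), "->", final, energy(W, final))
```

In this run a single flip of the second neuron takes the energy from 1 to -3, after which every neuron agrees with its field.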

Issues

Store a specific pattern

A network can store multiple patterns
- Every stable point is a stored pattern
- So we could design the net to store multiple patterns
  - Remember that every stored pattern $P$ is actually two stored patterns, $P$ and $- P$
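How can a quadratic function have multiple minima? (An unconstrained convex quadratic has only one.)
- The input is constrained: each $y_i$ can only take the values $-1$ or $+1$, so the minima are taken over the corners of a hypercube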
Hebbian learning: $w_{j i}=y_{j} y_{i}$
Design a stationary pattern
- $\operatorname{sign}\left(\sum_{j \neq i} w_{j i} y_{j}\right)=y_{i} \quad \forall i$
So
- $\operatorname{sign}\left(\sum_{j \neq i} w_{j i} y_{j}\right)=\operatorname{sign}\left(\sum_{j \neq i} y_{j} y_{i} y_{j}\right)$
- $\quad=\operatorname{sign}\left(\sum_{j \neq i} y_{j}^{2} y_{i}\right)=\operatorname{sign}\left(y_{i}\right)=y_{i}$
Energy
- $\begin{aligned} E=&-\sum_{i} \sum_{j<i} w_{j i} y_{j} y_{i}=-\sum_{i} \sum_{j<i} y_{i}^{2} y_{j}^{2} \\ &=-\sum_{i} \sum_{j<i} 1=-0.5 N(N-1) \end{aligned}$
- This is the lowest possible energy value for the network
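Both statements, that the pattern is stationary and that its energy is the lowest possible, can be checked numerically for a small $N$. The 6-bit pattern below is an arbitrary example, and the diagonal of the weight matrix is set to zero to match $w_{ii}=0$.

```python
import itertools
import numpy as np

y = np.array([1, -1, -1, 1, 1, -1])   # arbitrary example pattern
N = len(y)

W = np.outer(y, y)        # Hebbian rule: w_ji = y_j * y_i
np.fill_diagonal(W, 0)    # no self-connections

def energy(W, s):
    # -sum_i sum_{j<i} w_ji s_j s_i; s @ W @ s counts each pair twice.
    return -(s @ W @ s) // 2

stationary = np.array_equal(np.sign(W @ y), y)
E_stored = energy(W, y)
E_min = min(energy(W, np.array(s)) for s in itertools.product([-1, 1], repeat=N))

print(stationary, E_stored, E_min)   # the stored energy is -0.5 * 6 * 5 = -15
```

The exhaustive minimum over all $2^{6}$ states coincides with the energy of the stored pattern (and of its negation).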


Stored pattern has lowest energy
No matter where it begins, the network evolves into the stored pattern (shown in yellow in the original figure), which has the lowest energy

How many patterns can we store?

To store more than one pattern

$w_{j i}=\sum_{\mathbf{y}_{p} \in\left\{\mathbf{y}_{p}\right\}} y_{i}^{p} y_{j}^{p}$

$\left\{\mathbf{y}_{p}\right\}$ is the set of patterns to store
Super/subscript $p$ represents the specific pattern
Hopfield: a network of $N$ neurons can store up to ~ $0.15 N$ patterns through Hebbian learning (as stated in the lecture slides)
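The multi-pattern rule and recall from a corrupted cue can be sketched as follows. The two 8-bit patterns, the helper names `hebbian_weights` and `recall`, and the sequential update order are assumptions chosen for illustration. The patterns are orthogonal and well within the capacity estimate, so this is an easy case.

```python
import numpy as np

# Two hypothetical orthogonal 8-bit patterns, one per row.
patterns = np.array([
    [1, 1, 1, 1, 1, 1, 1, 1],
    [1, 1, 1, 1, -1, -1, -1, -1],
])

def hebbian_weights(patterns):
    """w_ji = sum over patterns p of y_i^p * y_j^p, with w_ii = 0."""
    W = patterns.T @ patterns
    np.fill_diagonal(W, 0)
    return W

def recall(W, cue, max_sweeps=100):
    """Sequentially flip neurons that oppose their field until none do."""
    y = np.array(cue).copy()
    for _ in range(max_sweeps):
        changed = False
        for i in range(len(y)):
            if (W[:, i] @ y) * y[i] < 0:
                y[i] = -y[i]
                changed = True
        if not changed:
            break
    return y

W = hebbian_weights(patterns)

cue = patterns[1].copy()
cue[0] = -cue[0]                 # corrupt one bit of the second pattern
recovered = recall(W, cue)
print(cue, "->", recovered)
```

With more patterns, or patterns that overlap strongly, the same code can settle on a state that was never stored, which is the capacity issue discussed here.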

Orthogonal/ Non-orthogonal patterns

Orthogonal patterns
- Patterns are local minima (stationary and stable)
  - No other local minima exist
  - But patterns perfectly confusable for recall
Non-orthogonal patterns
- Patterns are local minima (stationary and stable)
  - No other local minima exist
    - Actual wells for patterns
  - Patterns may be perfectly recalled! (Note K > 0.14 N)
Two orthogonal 6-bit patterns
- Perfectly stationary and stable
- Several spurious “fake-memory” local minima…

Observations

Many “parasitic” patterns
- Undesired patterns that also become stable or attractors
Patterns that are non-orthogonal are easier to remember
- I.e. patterns that are closer are easier to remember than patterns that are farther!!
Seems possible to store K > 0.14N patterns
- i.e. obtain a weight matrix W such that K > 0.14N patterns are stationary
- Possible to make more than 0.14N patterns at least 1-bit stable
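Parasitic patterns can be found by exhaustive search on a small network. The sketch below stores three mutually orthogonal 8-bit patterns with the Hebbian rule and lists every state in which no neuron wants to flip. The choice of patterns and the convention that a zero field does not cause a flip are assumptions of this example, not taken from the lecture.

```python
import itertools
import numpy as np

# Three hypothetical patterns: neuron i carries the i-th sign triple (a, b, c),
# and the patterns are the a-, b- and c-components across the 8 neurons.
triples = np.array(list(itertools.product([1, -1], repeat=3)))  # shape (8, 3)
patterns = triples.T                                            # shape (3, 8)

W = patterns.T @ patterns     # Hebbian sum of outer products
np.fill_diagonal(W, 0)

def is_stable(y):
    # Stable: no neuron's field strictly opposes its own sign.
    return bool(np.all((W @ y) * y >= 0))

# Stored patterns and their negations.
stored = {tuple((s * p).tolist()) for p in patterns for s in (1, -1)}
stable = {s for s in itertools.product([-1, 1], repeat=8)
          if is_stable(np.array(s))}
parasitic = stable - stored

# The bitwise majority vote of the three patterns is a typical parasitic state.
mixture = tuple(np.sign(patterns.sum(axis=0)).tolist())
print(len(stable), len(parasitic), mixture in parasitic)
```

All three stored patterns are stable here, but so is their majority-vote mixture, which was never stored.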