CMU 11-785 L19 Hopfield network

A Hopfield network is a recurrent neural network used for content-addressable storage and associative memory. Every neuron is connected to every other neuron; when the incoming signal opposes a neuron's state, the neuron flips its state, which lowers the network's energy function. After each flip the network evolves toward an energy minimum, eventually settling on a stored pattern. However, Hopfield networks have limited storage capacity and problems with non-orthogonal patterns, which can lead to incorrect memory recall.


Hopfield Net

  • So far, the neural networks we have used for computation have all been feedforward structures

Loopy network


  • Each neuron is a perceptron with +1/-1 output
    • Every neuron receives input from every other neuron
    • Every neuron outputs signals to every other neuron


  • At each time, each neuron receives a “field” $\sum_{j \neq i} w_{ji} y_j + b_i$
    • If the sign of the field matches its own sign, it does not respond
    • If the sign of the field opposes its own sign, it “flips” to match the sign of the field


  • If the sign of the field at any neuron opposes its own sign, it “flips” to match the field
    • Which will change the field at other nodes
    • Which may then flip… and so on…
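A minimal NumPy sketch of this asynchronous update rule (illustrative, not the lecture's code; `W` is a symmetric weight matrix with $w_{ii} = 0$, `b` the bias vector, `y` the current $\pm 1$ state):

```python
import numpy as np

def update_neuron(W, b, y, i):
    """Apply the flip rule to neuron i, in place."""
    field = W[i] @ y + b[i]               # sum_{j != i} w_ji y_j + b_i  (w_ii = 0)
    y[i] = 1.0 if field >= 0 else -1.0    # no-op if y[i] already matches the field
    return y
```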

Flip behavior

  • Let $y_i^-$ be the output of the $i$-th neuron just before it responds to the current field

  • Let $y_i^+$ be the output of the $i$-th neuron just after it responds to the current field

  • If $y_i^- = \operatorname{sign}\left(\sum_{j \neq i} w_{ji} y_j + b_i\right)$, then $y_i^+ = y_i^-$

    • If the sign of the field matches its own sign, it does not flip

    • $y_i^+\left(\sum_{j \neq i} w_{ji} y_j + b_i\right) - y_i^-\left(\sum_{j \neq i} w_{ji} y_j + b_i\right) = 0$

  • If $y_i^- \neq \operatorname{sign}\left(\sum_{j \neq i} w_{ji} y_j + b_i\right)$, then $y_i^+ = -y_i^-$

    • $y_i^+\left(\sum_{j \neq i} w_{ji} y_j + b_i\right) - y_i^-\left(\sum_{j \neq i} w_{ji} y_j + b_i\right) = 2 y_i^+\left(\sum_{j \neq i} w_{ji} y_j + b_i\right)$

    • This term is always positive, since $y_i^+$ now matches the sign of the field!

  • Every flip of a neuron is guaranteed to locally increase $y_i\left(\sum_{j \neq i} w_{ji} y_j + b_i\right)$

Globally

  • Consider the following sum across all nodes

$$D(y_1, y_2, \ldots, y_N) = \sum_i y_i\left(\sum_{j \neq i} w_{ji} y_j + b_i\right) = \sum_{i, j \neq i} w_{ij} y_i y_j + \sum_i b_i y_i$$

  • Assume $w_{ii} = 0$
  • For any unit $k$ that “flips” because of the local field

$$\Delta D(y_k) = D(y_1, \ldots, y_k^+, \ldots, y_N) - D(y_1, \ldots, y_k^-, \ldots, y_N)$$

$$\Delta D(y_k) = \left(y_k^+ - y_k^-\right)\left(\sum_{j \neq k} w_{jk} y_j + b_k\right)$$

  • This is always positive!
  • Every flip of a unit results in an increase in $D$
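A quick numerical sanity check of this claim with random symmetric weights (a sketch; `D` counts each pair $\{i, j\}$ once, the same convention as the $j > i$ energy sum used later):

```python
import numpy as np

rng = np.random.default_rng(0)
N = 8
W = rng.normal(size=(N, N))
W = (W + W.T) / 2                  # symmetric weights
np.fill_diagonal(W, 0.0)           # w_ii = 0
b = rng.normal(size=N)
y = rng.choice([-1.0, 1.0], size=N)

def D(y):
    # pair term (each pair counted once) plus the bias term
    return 0.5 * y @ W @ y + b @ y

for k in range(N):
    field = W[k] @ y + b[k]
    if np.sign(field) != y[k]:      # neuron k would flip
        y_flipped = y.copy()
        y_flipped[k] = -y[k]
        assert D(y_flipped) > D(y)  # every such flip strictly increases D
```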

Overall

  • Flipping a unit will result in an increase (non-decrease) of

$$D = \sum_{i, j \neq i} w_{ij} y_i y_j + \sum_i b_i y_i$$

  • $D$ is bounded:

$$D_{\max} = \sum_{i, j \neq i} |w_{ij}| + \sum_i |b_i|$$

  • The minimum increment of $D$ in a flip is

$$\Delta D_{\min} = \min_{i,\, \{y_i,\, i = 1 \ldots N\}} 2\left|\sum_{j \neq i} w_{ji} y_j + b_i\right|$$

  • Any sequence of flips must converge in a finite number of steps
    • Think of this as an infinitely deep network in which the weights at every layer are identical
    • Find the layer beyond which the output no longer changes!
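Concretely, since every flip raises $D$ by at least $\Delta D_{\min}$ and $D$ can never exceed $D_{\max}$, the number of flips from any initial state $y^{(0)}$ is bounded (a worked bound implied by the argument above, assuming no field is ever exactly zero, so that $\Delta D_{\min} > 0$):

$$\text{number of flips} \leq \frac{D_{\max} - D(y^{(0)})}{\Delta D_{\min}}$$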

The Energy of a Hopfield Net

  • Define the Energy of the network as

$$E = -\sum_{i, j \neq i} w_{ij} y_i y_j - \sum_i b_i y_i$$

  • Just the negative of $D$

  • The evolution of a Hopfield network constantly decreases its energy

  • This is analogous to the potential energy of a spin glass (a system of magnetic dipoles)

    • The system will evolve until the energy hits a local minimum
  • We drop the bias term below for simplicity

  • The network will evolve until it arrives at a local minimum in the energy contour
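A one-function sketch of this energy (illustrative; `W` symmetric with zero diagonal, and the $\frac{1}{2}$ counts each pair once, matching the $j > i$ form given later):

```python
import numpy as np

def energy(W, y, b=None):
    """E = -sum_{i<j} w_ij y_i y_j - sum_i b_i y_i for a +-1 state vector y."""
    quad = -0.5 * y @ W @ y            # pair term, each pair counted once
    return quad if b is None else quad - b @ y
```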

Content-addressable memory


  • Each of the minima is a “stored” pattern
    • If the network is initialized close to a stored pattern, it will inevitably evolve to the pattern
  • This is a content addressable memory
    • Recall memory content from partial or corrupt values
  • Also called associative memory
    • Evolve and recall pattern by content, not by location

Evolution


  • The network will evolve until it arrives at a local minimum in the energy contour
  • We proved that every change in the network results in a decrease in energy
  • So the path to the energy minimum is monotonic
For a 2-neuron net


  • Symmetric
    • $-\frac{1}{2}\mathbf{y}^T\mathbf{W}\mathbf{y} = -\frac{1}{2}(-\mathbf{y})^T\mathbf{W}(-\mathbf{y})$
    • If $\hat{y}$ is a local minimum, so is $-\hat{y}$
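A tiny check of this symmetry (the 2-neuron weights are an arbitrary illustration):

```python
import numpy as np

W = np.array([[0.0, 1.0],
              [1.0, 0.0]])             # symmetric weights, zero diagonal
for y in ([1.0, 1.0], [1.0, -1.0], [-1.0, 1.0], [-1.0, -1.0]):
    y = np.array(y)
    # flipping every bit leaves the energy unchanged: E(y) == E(-y)
    assert np.isclose(-0.5 * y @ W @ y, -0.5 * (-y) @ W @ (-y))
```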

Computational algorithm


  • Very simple
  • Updates can be done sequentially, or all at once
  • Convergence is reached when the state no longer changes significantly

$$E = -\sum_i \sum_{j>i} w_{ji} y_j y_i$$
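Putting the pieces together, a minimal sequential-update loop might look like this (a sketch under the conventions above, not the lecture's reference implementation; here convergence is detected as a full sweep with no flips):

```python
import numpy as np

def hopfield_evolve(W, y0, b=None, max_sweeps=100):
    """Asynchronous updates until no neuron flips in a full sweep."""
    y = np.array(y0, dtype=float)
    b = np.zeros(len(y)) if b is None else b
    for _ in range(max_sweeps):
        flipped = False
        for i in range(len(y)):
            field = W[i] @ y + b[i]
            s = 1.0 if field >= 0 else -1.0
            if s != y[i]:
                y[i] = s               # flip to match the field
                flipped = True
        if not flipped:                # stable state: a local energy minimum
            break
    return y
```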

Issues

Store a specific pattern

  • A network can store multiple patterns
    • Every stable point is a stored pattern
    • So we could design the net to store multiple patterns
      • Remember that every stored pattern $P$ is actually two stored patterns, $P$ and $-P$
  • How can the quadratic function have multiple minima? (A quadratic function has a single unconstrained optimum)
    • Because the state is constrained: each $y_i \in \{-1, +1\}$, so the minima lie at corners of the hypercube
  • Hebbian learning: $w_{ji} = y_j y_i$
  • Design a stationary pattern
    • $\operatorname{sign}\left(\sum_{j \neq i} w_{ji} y_j\right) = y_i \quad \forall i$
  • So
    • $\operatorname{sign}\left(\sum_{j \neq i} w_{ji} y_j\right) = \operatorname{sign}\left(\sum_{j \neq i} y_j y_i y_j\right)$
    • $= \operatorname{sign}\left(\sum_{j \neq i} y_j^2 y_i\right) = \operatorname{sign}(y_i) = y_i$
  • Energy
    • $E = -\sum_i \sum_{j<i} w_{ji} y_j y_i = -\sum_i \sum_{j<i} y_i^2 y_j^2 = -\sum_i \sum_{j<i} 1 = -0.5\,N(N-1)$
    • This is the lowest possible energy value for the network
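A minimal check of both claims, stationarity and minimal energy, for a single stored pattern (the 6-bit pattern is an arbitrary illustration):

```python
import numpy as np

P = np.array([1.0, -1.0, 1.0, 1.0, -1.0, -1.0])   # illustrative 6-bit pattern
W = np.outer(P, P)                                # Hebbian rule: w_ji = y_j * y_i
np.fill_diagonal(W, 0.0)                          # keep w_ii = 0

# P is stationary: the field at every neuron matches the neuron's own sign
assert np.all(np.sign(W @ P) == P)

# Its energy is -0.5 * N * (N - 1), the lowest value the network can reach
N = len(P)
assert np.isclose(-0.5 * P @ W @ P, -0.5 * N * (N - 1))
```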


  • The stored pattern has the lowest energy
  • No matter where the network begins, it will evolve to the stored pattern (the lowest-energy state)

How many patterns can we store?

  • To store more than one pattern

$$w_{ji} = \sum_{\mathbf{y}_p \in \{\mathbf{y}_p\}} y_i^p y_j^p$$

  • $\{\mathbf{y}_p\}$ is the set of patterns to store
  • The superscript $p$ indexes the specific pattern
  • Hopfield: a network of $N$ neurons can store up to ~$0.15N$ patterns through Hebbian learning (as shown in the slides)
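A sketch of this multi-pattern rule (sizes are illustrative; with $K$ well below $\sim 0.15N$, each stored pattern is typically stationary, though unlucky random draws can fail):

```python
import numpy as np

def hebbian_weights(patterns):
    """w_ji = sum over stored patterns p of y_j^p * y_i^p (sum of outer products)."""
    W = sum(np.outer(p, p) for p in patterns)
    np.fill_diagonal(W, 0.0)          # keep w_ii = 0
    return W

rng = np.random.default_rng(0)
N, K = 64, 4                          # K = 4 patterns, well below ~0.15 * 64
patterns = rng.choice([-1.0, 1.0], size=(K, N))
W = hebbian_weights(patterns)

for p in patterns:                    # each stored pattern should be stationary
    print(np.all(np.sign(W @ p) == p))
```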

Orthogonal / non-orthogonal patterns

  • Orthogonal patterns

    • Patterns are local minima (stationary and stable)
      • No other local minima exist
      • But the patterns are perfectly confusable during recall
  • Non-orthogonal patterns

    • Patterns are local minima (stationary and stable)
      • No other local minima exist
        • Actual wells for patterns
      • Patterns may be perfectly recalled! (Note K > 0.14 N)
  • Two orthogonal 6-bit patterns

    • Perfectly stationary and stable
    • Several spurious “fake-memory” local minima…
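A brute-force sketch of this experiment (the orthogonal pair below is an arbitrary choice, not necessarily the lecture's): store two orthogonal 6-bit patterns and enumerate all $2^6$ states to find the stationary ones; anything beyond $\pm P_1$ and $\pm P_2$ is a spurious "fake memory":

```python
import numpy as np
from itertools import product

P1 = np.array([1.0, 1.0, 1.0, -1.0, -1.0, -1.0])
P2 = np.array([1.0, -1.0, 1.0, -1.0, 1.0, -1.0])
assert P1 @ P2 == 0                   # the pair is orthogonal

W = np.outer(P1, P1) + np.outer(P2, P2)
np.fill_diagonal(W, 0.0)

# A state is stationary if no neuron wants to flip (sign of field == state)
stable = [y for y in product([-1.0, 1.0], repeat=6)
          if np.all(np.sign(W @ np.array(y)) == np.array(y))]
print(len(stable))                    # includes +-P1 and +-P2, plus any spurious minima
```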

Observations

  • Many “parasitic” patterns

    • Undesired patterns that also become stable or attractors
  • Patterns that are non-orthogonal are easier to remember

    • I.e., patterns that are closer together are easier to remember than patterns that are farther apart!
  • It seems possible to store $K > 0.14N$ patterns

    • I.e., obtain a weight matrix $\mathbf{W}$ such that $K > 0.14N$ patterns are stationary
    • It is possible to make more than $0.14N$ patterns at least 1-bit stable