Long short-term memory (LSTM) models are a variant of RNNs, modified so that the network can capture both short-term and long-term dependencies. The purpose of the memory cells is to retain information over long time spans.

LSTMs are to RNNs what convolutional neural networks are to feed-forward networks: the variant that made them practical.

  • Forget gate
    • If element ‘j’ of the forget gate f_t is 0, then we clear element ‘j’ of the memory c_{t-1} (set it to 0).
    • If it is 1, then we keep the old value.
    • “Given the input and previous activation, are the elements in memory still relevant?”
  • Input gate
    • If element ‘j’ of the input gate i_t is 0, then we do not add any new information to the memory (no input).
    • If it is 1, then we write the candidate value to the memory (where the candidate value is itself a function of the input and the previous activation at time t).
    • “Given the input and previous activation, should I write something new to memory?”
  • Output gate
    • If element ‘j’ of the output gate o_t is 0, then we do not read that element from the memory (no output).
    • If it is 1, then we read the value from the memory.
    • “Given the input and previous activation, should I read what is in memory?”
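Putting the three gates together, here is a minimal sketch of one LSTM cell update in numpy. It assumes the standard LSTM gate equations (sigmoid gates, tanh candidate and output squashing), which are not spelled out in the notes above; the parameter names W_*, U_*, b_* are illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_cell_step(x_t, h_prev, c_prev, params):
    """One LSTM time step.

    x_t    : input vector at time t
    h_prev : previous activation (hidden state)
    c_prev : previous memory cell state
    params : dict of weight matrices W_*, U_* and bias vectors b_*
             (hypothetical names, for illustration only)
    """
    # Forget gate: are the elements in memory still relevant?
    f_t = sigmoid(params["W_f"] @ x_t + params["U_f"] @ h_prev + params["b_f"])
    # Input gate: should we write something new to memory?
    i_t = sigmoid(params["W_i"] @ x_t + params["U_i"] @ h_prev + params["b_i"])
    # Output gate: should we read what is in memory?
    o_t = sigmoid(params["W_o"] @ x_t + params["U_o"] @ h_prev + params["b_o"])
    # Candidate value, a function of the input and the previous activation
    c_tilde = np.tanh(params["W_c"] @ x_t + params["U_c"] @ h_prev + params["b_c"])

    # Forget gate scales the old memory; input gate scales the new candidate
    c_t = f_t * c_prev + i_t * c_tilde
    # Output gate decides how much of the memory is exposed as the new activation
    h_t = o_t * np.tanh(c_t)
    return h_t, c_t

# Tiny usage example with random weights (illustrative only)
rng = np.random.default_rng(0)
d_in, d_hid = 4, 3
params = {}
for g in "fioc":
    params[f"W_{g}"] = rng.standard_normal((d_hid, d_in))
    params[f"U_{g}"] = rng.standard_normal((d_hid, d_hid))
    params[f"b_{g}"] = np.zeros(d_hid)
h, c = np.zeros(d_hid), np.zeros(d_hid)
h, c = lstm_cell_step(rng.standard_normal(d_in), h, c, params)
```

Note how each gate is an element-wise multiplier between 0 and 1: 0 clears (or blocks) the corresponding memory element, 1 passes it through, matching the three bullet descriptions above.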