References: Why sequences need memory

Source material

Source curriculum (structural mirror, cited as further study):
• MIT 6.S191, "Introduction to Deep Learning", Lecture 2: "Deep Sequence Modeling"
  Instructors: Alexander Amini and Ava Amini (MIT)
  Course page: https://introtodeeplearning.com
  Code and labs: https://github.com/aamini/introtodeeplearning
  License: MIT (slides, code, and labs); videos are under the Standard YouTube License
  Required attribution: "© Alexander Amini and Ava Amini, MIT 6.S191:
    Introduction to Deep Learning, IntroToDeepLearning.com"
Clawdemy's lessons are original prose that follows the pedagogical arc of this
course. We do not reproduce or transcribe the lectures; we cite them as the
recommended companion. Course materials are used under their MIT license with
the attribution above; all rights to the original videos remain with the creators.

Watch this next

MIT 6.S191, Lecture 2: Deep Sequence Modeling by Alexander and Ava Amini. The lecture this lesson mirrors, from the course page. It covers recurrent networks with the instructors’ own diagrams and then carries on into attention and transformers (which this track splits into the next lesson). Watch the recurrent-network portion now; the attention portion pairs with lesson 3.

Going deeper

A short, durable list. Each link is a specific next step, not a generic pile.

• The Unreasonable Effectiveness of Recurrent Neural Networks, by Andrej Karpathy. The famous post that made RNNs click for a generation of learners. It trains a character-level recurrent network on Shakespeare, code, and more, and shows the surprising structure it learns. The most enjoyable way to see “carry a memory forward” actually working.
• Understanding LSTM Networks, by Christopher Olah. The clearest visual explanation of the gated-memory designs this lesson only named. If you want to see exactly how the keep/overwrite/forget gates work, start here; the diagrams are the gold standard.

Adjacent topics

Where this connects inside the track.

• What deep learning adds (lesson 1). The previous lesson set up the tour and the four problem shapes. Sequences are the first shape; this lesson is that promised first stop.
• Attention and transformers, in brief (lesson 3). Recurrence reads a sequence one step at a time and strains on long-range links. The next lesson meets a different answer that looks at all positions at once and weighs what matters. Track 5 (Transformers and LLMs) then goes deep on the architecture that idea produced.
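
For readers who want to see "one step at a time" and "strains on long-range links" in miniature, here is a toy sketch. It is not taken from the lecture or from any of the linked posts. The function name run_rnn, the scalar weights w_in and w_rec, and the two example sequences are assumptions chosen only for illustration; a real recurrent network uses learned weight matrices, not two hand-picked numbers.

```python
import math

def run_rnn(inputs, w_in=0.5, w_rec=0.5):
    """Toy scalar recurrence: h_t = tanh(w_in * x_t + w_rec * h_{t-1}).

    The weights are hand-picked for illustration, not learned.
    """
    h = 0.0          # the "memory" carried forward
    states = []
    for x in inputs:  # read the sequence one step at a time, in order
        h = math.tanh(w_in * x + w_rec * h)
        states.append(h)
    return states

# Two sequences that differ only in their first element.
seq_a = [1.0] + [0.0] * 9
seq_b = [-1.0] + [0.0] * 9

states_a = run_rnn(seq_a)
states_b = run_rnn(seq_b)

# How far apart the two memories are right after the first step,
# and how far apart they are at the end of the sequence.
gap_first = abs(states_a[0] - states_b[0])
gap_last = abs(states_a[-1] - states_b[-1])

print(f"gap after step 1:  {gap_first:.4f}")
print(f"gap after step 10: {gap_last:.4f}")
```

With these particular weights the gap shrinks at every step, so by the end of the sequence the memory barely reflects what the first element was. That fading trace is a small-scale picture of the long-range strain described above, and part of what gated designs and attention were built to address.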