== Remembering ==

* '''Context window''' – The maximum number of tokens an LLM can process in a single forward pass.
* '''Long context''' – Context windows exceeding 32k tokens, enabling processing of long documents, books, and extended conversations.
* '''KV cache''' – The key–value cache storing attention keys and values for all processed tokens; it grows linearly with context length.
* '''Lost in the Middle''' – The empirical finding that LLMs retrieve information from the middle of a long context less reliably than from its beginning or end.
* '''Needle in a Haystack (NIAH)''' – A benchmark that hides a specific fact in a long document and asks the model to retrieve it; tests effective context utilization.
* '''RULER''' – A more comprehensive long-context benchmark covering multi-hop retrieval, aggregation, and ordering tasks.
* '''RoPE (Rotary Position Embedding)''' – A position-encoding method that can be extended to sequences longer than the training length via "context extension" techniques.
* '''YaRN''' – A technique for extending RoPE-based models to longer contexts without full retraining.
* '''Ring Attention''' – A distributed attention mechanism that enables near-unlimited context by sharding the KV cache across devices.
* '''Sliding window attention''' – Restricts each token's attention to a local window; efficient, but long-range information is lost.
* '''Retrieval-augmented memory''' – Augments the model's context with relevant chunks retrieved from external memory stores.
* '''Episodic memory''' – Stores and retrieves specific past events or conversations, enabling persistent agent memory.
* '''Working memory''' – The information currently held in the context window; limited by context length.
* '''Compressive memory''' – Summarizes and compresses older context to extend effective memory beyond the raw context window.
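The KV cache entry above notes linear growth with context length. A minimal sketch of the memory arithmetic, assuming hypothetical 7B-class model dimensions and fp16 storage (2 bytes per element) — the exact figures vary by architecture:

```python
def kv_cache_bytes(num_layers: int, num_kv_heads: int, head_dim: int,
                   seq_len: int, bytes_per_elem: int = 2) -> int:
    """Approximate KV cache size: one K and one V tensor per layer,
    each of shape [seq_len, num_kv_heads, head_dim]."""
    return 2 * num_layers * num_kv_heads * head_dim * seq_len * bytes_per_elem

# Hypothetical model: 32 layers, 32 KV heads, head dimension 128, fp16.
size_32k = kv_cache_bytes(num_layers=32, num_kv_heads=32, head_dim=128,
                          seq_len=32_000)
print(f"{size_32k / 2**30:.1f} GiB")  # prints "15.6 GiB"
```

Doubling `seq_len` doubles the cache, which is why long-context serving often relies on grouped-query attention (fewer KV heads) or cache quantization to keep memory in check.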
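The sliding window attention entry can be made concrete with a boolean attention mask. A small sketch (the window size and sequence length are illustrative, not tied to any particular model):

```python
import numpy as np

def sliding_window_mask(seq_len: int, window: int) -> np.ndarray:
    """Causal attention mask restricted to a local window.
    Position i may attend to positions max(0, i - window + 1) .. i."""
    i = np.arange(seq_len)[:, None]  # query positions (rows)
    j = np.arange(seq_len)[None, :]  # key positions (columns)
    causal = j <= i                  # no attending to the future
    local = j > i - window           # no attending beyond the window
    return causal & local

mask = sliding_window_mask(seq_len=6, window=3)
# Row 5 attends only to positions 3, 4, 5; positions 0-2 are dropped,
# which is exactly the long-range information loss noted above.
```

Stacking many such layers lets information still propagate beyond the window indirectly (each layer widens the receptive field by roughly one window), at the cost of precision over long distances.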