
== <span style="color: #FFFFFF;">Remembering</span> ==
* '''Embedding''' — A dense vector of real numbers representing the meaning of a piece of data. Similar items have vectors that are close together in the embedding space.
* '''Embedding model''' — A neural network trained to produce embeddings. Examples: text-embedding-3-small (OpenAI), BGE-M3 (BAAI), all-MiniLM-L6-v2 (Sentence Transformers).
* '''Dimensionality''' — The number of values in an embedding vector. Common sizes: 384, 768, 1536, 3072. Higher dimensions can capture more nuance but require more storage and compute.
* '''Semantic similarity''' — The degree to which two items share meaning, approximated by how close their embeddings are in the embedding space.
* '''Cosine similarity''' — The most common similarity metric for embeddings; the cosine of the angle between two vectors, independent of their magnitudes. Values range from -1 (opposite directions) to 1 (same direction).
* '''Dot product''' — An alternative similarity metric; equivalent to cosine similarity when vectors are normalized to unit length.
* '''L2 distance (Euclidean)''' — The straight-line distance between two vectors; smaller values mean more similar items. For unit-length vectors it ranks results in the same order as cosine similarity.
* '''Vector database''' — A database optimized for storing embedding vectors and performing fast approximate nearest neighbor (ANN) search. Examples: Pinecone, Weaviate, Chroma, Qdrant, Milvus, pgvector.
* '''ANN (Approximate Nearest Neighbor)''' — A class of algorithms that quickly find vectors close to a query vector, trading some recall for speed compared with exact search.
* '''HNSW (Hierarchical Navigable Small World)''' — A graph-based ANN index structure, among the most widely used, offering excellent speed-recall trade-offs.
* '''Metadata filtering''' — Restricting vector search results to items matching certain criteria (e.g., only articles from 2024, only products in category "electronics").
* '''Bi-encoder''' — A model that encodes queries and documents independently into the same embedding space, enabling fast retrieval (e.g., Sentence-BERT).
* '''Cross-encoder''' — A model that takes a query-document pair as input and outputs a relevance score; more accurate than a bi-encoder but much slower, so it is typically used to rerank a small set of retrieved candidates.
* '''Chunking''' — Splitting large documents into smaller pieces before embedding, since embedding models have token limits.
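
The similarity metrics and the search terms defined above can be illustrated with a minimal, pure-Python sketch. It is not how any particular vector database works internally: the vectors are hand-written toy values rather than real model output, the <code>search</code> helper and the item layout (<code>id</code>, <code>vector</code>, <code>meta</code>) are hypothetical names chosen for illustration, and the search is exact (brute force) rather than ANN.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def norm(a):
    """Euclidean length of a vector."""
    return math.sqrt(dot(a, a))

def normalize(a):
    """Scale a vector to unit length."""
    n = norm(a)
    return [x / n for x in a]

def cosine_similarity(a, b):
    """Cosine of the angle between a and b; ignores magnitude."""
    return dot(a, b) / (norm(a) * norm(b))

def l2_distance(a, b):
    """Straight-line (Euclidean) distance between a and b."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def search(query, items, k=2, where=None):
    """Exact top-k search by cosine similarity.

    `where` is an optional predicate on an item's metadata (hypothetical
    interface); it is applied before scoring, as a simple metadata filter.
    """
    candidates = [it for it in items if where is None or where(it["meta"])]
    scored = [(cosine_similarity(query, it["vector"]), it["id"]) for it in candidates]
    scored.sort(reverse=True)  # highest similarity first
    return scored[:k]

# Toy 3-dimensional "embeddings" (assumed values, not produced by a model).
items = [
    {"id": "doc-a", "vector": [1.0, 0.0, 0.0], "meta": {"year": 2024}},
    {"id": "doc-b", "vector": [0.9, 0.1, 0.0], "meta": {"year": 2023}},
    {"id": "doc-c", "vector": [0.0, 1.0, 0.0], "meta": {"year": 2024}},
    {"id": "doc-d", "vector": [-1.0, 0.0, 0.0], "meta": {"year": 2024}},
]
query = [1.0, 0.0, 0.0]

# Unfiltered search, then the same search restricted to items from 2024.
top_all = [item_id for _, item_id in search(query, items, k=2)]
top_2024 = [item_id for _, item_id in search(query, items, k=2,
                                             where=lambda m: m["year"] == 2024)]
```

Here <code>top_all</code> holds the two closest items overall, while <code>top_2024</code> skips <code>doc-b</code> because its metadata fails the filter. Scoring every candidate is fine for a handful of vectors; an ANN index such as HNSW exists to avoid this full scan on large collections.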
</div>

<div style="background-color: #006400; color: #FFFFFF; padding: 20px; border-radius: 8px; margin-bottom: 15px;">