Few Shot Zero Shot
<div style="background-color: #4B0082; color: #FFFFFF; padding: 20px; border-radius: 8px; margin-bottom: 15px;"> {{BloomIntro}} Few-shot and zero-shot learning address one of the most fundamental challenges in AI: learning from very little data. Standard deep learning requires thousands to millions of labeled examples. Few-shot learning achieves high performance with just 1β10 examples per class. Zero-shot learning requires no task-specific examples at all β the model generalizes entirely from its pre-existing knowledge and the description of the new task. These capabilities are increasingly important as AI is applied to specialized domains where labeled data is scarce. </div> __TOC__ <div style="background-color: #000080; color: #FFFFFF; padding: 20px; border-radius: 8px; margin-bottom: 15px;"> == <span style="color: #FFFFFF;">Remembering</span> == * '''Few-shot learning''' β Learning to classify or solve tasks with very few labeled examples per class (1β10). * '''Zero-shot learning''' β Making predictions for classes or tasks never seen during training, using semantic descriptions. * '''N-way K-shot''' β A standard few-shot task specification: N classes, K labeled examples per class in the support set. * '''Zero-shot classification''' β Classifying inputs into categories not seen during training, using class descriptions or embeddings. * '''CLIP (Contrastive Language-Image Pre-training)''' β OpenAI model that enables zero-shot image classification by comparing image embeddings to text class descriptions. * '''In-context learning''' β LLMs performing few-shot tasks from examples in the context window, without weight updates. * '''Semantic embeddings''' β Dense vector representations encoding semantic meaning, enabling zero-shot similarity comparisons. * '''Class prototype''' β The average embedding of all support set examples for a class; used in Prototypical Networks for few-shot classification. * '''Attribute-based zero-shot''' β Zero-shot learning using human-defined semantic attributes to describe each class. * '''Generalized zero-shot learning''' β Testing on both seen and unseen classes simultaneously; harder than standard zero-shot. * '''Imagenet zero-shot''' β CLIP achieves 75%+ accuracy on ImageNet without seeing a single ImageNet training example. * '''Prompt-based few-shot''' β Providing 1β10 examples in the LLM prompt to demonstrate the desired task format. </div> <div style="background-color: #006400; color: #FFFFFF; padding: 20px; border-radius: 8px; margin-bottom: 15px;"> == <span style="color: #FFFFFF;">Understanding</span> == '''Zero-shot learning''' with CLIP: Train a model to align image and text representations. At inference, compute the image embedding and compare it against text embeddings of all possible class descriptions ("a photo of a cat", "a photo of a dog"). The class with the highest similarity is the prediction β without ever training on these specific classes. '''Why does zero-shot work?''' CLIP was trained on 400M image-text pairs. Through this training it has learned that images of dogs and text about dogs inhabit similar regions of embedding space. At zero-shot time, new class descriptions ("a photo of a Tibetan Mastiff") can be correctly associated with unseen images because the semantic alignment was learned during pre-training. '''In-context few-shot learning''': GPT-4 can learn to perform a new task from 3-5 examples in the prompt β no gradient updates. The model recognizes the pattern in the examples and continues it for new inputs. 
'''The few-shot learning / meta-learning connection''': Few-shot learning and meta-learning address the same problem from different angles. Meta-learning trains a model explicitly to learn from few examples (gradient-based: MAML; metric-based: Prototypical Networks). LLM in-context learning achieves similar results without explicit meta-training, as an emergent capability.

'''Retrieval-augmented zero-shot''': When semantic class descriptions aren't available, retrieve relevant documents at inference time and use them to ground predictions, extending the model's effective knowledge without fine-tuning.
</div>
<div style="background-color: #8B0000; color: #FFFFFF; padding: 20px; border-radius: 8px; margin-bottom: 15px;">
== <span style="color: #FFFFFF;">Applying</span> ==
'''CLIP zero-shot classification:'''
<syntaxhighlight lang="python">
import torch
import clip
from PIL import Image

model, preprocess = clip.load("ViT-B/32", device="cuda")

# Zero-shot classification without any task-specific training
def zero_shot_classify(image_path: str, class_names: list) -> dict:
    image = preprocess(Image.open(image_path)).unsqueeze(0).to("cuda")
    # Create text descriptions for each class
    texts = clip.tokenize([f"a photo of a {cls}" for cls in class_names]).to("cuda")

    with torch.no_grad():
        image_features = model.encode_image(image)
        text_features = model.encode_text(texts)

    # Normalize and compute cosine similarity
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)
    similarity = (100.0 * image_features @ text_features.T).softmax(dim=-1)

    return {cls: float(sim) for cls, sim in zip(class_names, similarity[0])}

# Works for any class names: zero training examples needed
results = zero_shot_classify("wildlife_photo.jpg",
                             ["lion", "elephant", "giraffe", "zebra", "cheetah", "rhinoceros"])
print(sorted(results.items(), key=lambda x: -x[1]))
</syntaxhighlight>
'''Few-shot classification with Prototypical Networks:'''
<syntaxhighlight lang="python">
import torch

def prototypical_classify(support_embeddings, support_labels, query_embeddings, n_classes):
    """
    support_embeddings: (n_classes * k_shot, D) support set embeddings
    support_labels:     (n_classes * k_shot,) integer class label for each support example
    query_embeddings:   (n_query, D) query embeddings
    Returns: predicted class index for each query
    """
    # Compute class prototypes (mean of support embeddings per class)
    prototypes = torch.stack([
        support_embeddings[support_labels == c].mean(0)
        for c in range(n_classes)
    ])  # (n_classes, D)

    # Classify queries by nearest prototype (Euclidean distance)
    dists = torch.cdist(query_embeddings, prototypes)  # (n_query, n_classes)
    return dists.argmin(dim=1)
</syntaxhighlight>
; Few-shot / zero-shot approach selection
: '''Vision, zero-shot''' – CLIP (ViT-L/14 for best quality)
: '''NLP, zero-shot''' – LLM with task description in system prompt
: '''NLP, few-shot''' – LLM with 3–10 examples in context
: '''Vision, few-shot''' – Fine-tune CLIP or DINO on the support set (a linear probe on frozen CLIP embeddings is sketched below)
: '''Structured few-shot''' – Prototypical Networks for consistent task structure
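For the vision few-shot case, a common lightweight alternative to full fine-tuning is a linear probe on frozen CLIP embeddings. The sketch below is not from the original article: it reuses the <code>model</code> and <code>preprocess</code> objects loaded in the CLIP example above, additionally assumes scikit-learn, and the support-set image paths and labels are hypothetical placeholders.
<syntaxhighlight lang="python">
# Minimal sketch: few-shot linear probe on frozen CLIP image embeddings.
# Assumptions: `model` and `preprocess` from the CLIP example above, plus scikit-learn;
# the support-set paths and integer labels below are placeholders.
import numpy as np
import torch
from PIL import Image
from sklearn.linear_model import LogisticRegression

def embed_images(paths):
    """Encode a list of image paths into normalized CLIP embeddings."""
    feats = []
    with torch.no_grad():
        for p in paths:
            img = preprocess(Image.open(p)).unsqueeze(0).to("cuda")
            f = model.encode_image(img)
            feats.append((f / f.norm(dim=-1, keepdim=True)).float().cpu().numpy()[0])
    return np.stack(feats)

# Tiny support set (placeholder paths; labels 0 = lion, 1 = zebra)
support_paths = ["lion_01.jpg", "lion_02.jpg", "zebra_01.jpg", "zebra_02.jpg"]
support_labels = [0, 0, 1, 1]

# Fit a linear classifier on frozen embeddings; CLIP itself is never updated
probe = LogisticRegression(max_iter=1000)
probe.fit(embed_images(support_paths), support_labels)

print(probe.predict(embed_images(["mystery_animal.jpg"])))
</syntaxhighlight>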
</div>
<div style="background-color: #8B4500; color: #FFFFFF; padding: 20px; border-radius: 8px; margin-bottom: 15px;">
== <span style="color: #FFFFFF;">Analyzing</span> ==
{| class="wikitable"
|+ Zero-Shot vs. Few-Shot vs. Full Supervision
! Approach !! Data Needed !! Accuracy !! Flexibility !! Deployment Cost
|-
| Zero-shot (CLIP/LLM) || 0 examples || Medium || Very high || Low (API)
|-
| Few-shot in-context || 1–10 examples || Medium-high || Very high || Low (API)
|-
| Few-shot fine-tuning || ~100 examples || High || Medium || Medium
|-
| Full supervision || 1,000–100k examples || Highest || Low (task-specific) || High
|}
'''Failure modes''': Zero-shot accuracy drops dramatically for specialized or technical domains not well represented in pre-training data. Class-name ambiguity ("bank" as financial institution vs. river bank) causes misclassification without context. In-context learning is sensitive to example order and formatting. Generalized zero-shot learning typically suffers from the "hubness problem", in which test embeddings cluster near a few seen classes.
</div>
<div style="background-color: #483D8B; color: #FFFFFF; padding: 20px; border-radius: 8px; margin-bottom: 15px;">
== <span style="color: #FFFFFF;">Evaluating</span> ==
Evaluation on standard benchmarks: '''miniImageNet''' and '''tieredImageNet''' for few-shot vision; '''FLAN''' and '''SuperGLUE''' for few-shot NLP; '''VTAB''' for transfer learning. Always evaluate on truly unseen classes (no leakage). For CLIP zero-shot: compare on ImageNet-V2 and ObjectNet (distribution-shift variants). For LLM few-shot: measure across diverse k values (0, 1, 4, 8 shots) to characterize the few-shot learning curve (a minimal k-sweep is sketched at the end of this page).
</div>
<div style="background-color: #2F4F4F; color: #FFFFFF; padding: 20px; border-radius: 8px; margin-bottom: 15px;">
== <span style="color: #FFFFFF;">Creating</span> ==
Designing a few-shot deployment pipeline:
# Start with zero-shot: use CLIP or GPT-4 with class descriptions; no data collection needed.
# If accuracy is insufficient, collect 5–10 examples per class with domain experts.
# Use Prototypical Networks or a CLIP linear probe on support-set embeddings (both sketched in the Applying section above).
# If still insufficient, collect 100+ examples per class for standard fine-tuning.
# Monitor class-level performance: some classes are harder for zero-shot than others; target annotation effort at the weak classes.
# Ongoing: as more labeled data accumulates, transition from few-shot to supervised models where it is cost-effective.
[[Category:Artificial Intelligence]]
[[Category:Machine Learning]]
[[Category:Few-Shot Learning]]
</div>
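The k-shot sweep mentioned in the Evaluating section can be scripted in a few lines. This sketch is not from the original article: it reuses the hypothetical <code>few_shot_classify()</code> helper from the Understanding section, and <code>load_labeled_pairs()</code> is a placeholder for however the held-out (text, label) pairs are loaded.
<syntaxhighlight lang="python">
# Minimal sketch: characterize the few-shot learning curve by sweeping k.
# Assumptions: few_shot_classify() from the earlier sketch; load_labeled_pairs() is
# a hypothetical loader returning a list of (text, label) pairs.
import random

def accuracy_at_k(dataset, labels, k, n_eval=50, seed=0):
    """Accuracy of k-shot in-context classification on a small held-out sample."""
    rng = random.Random(seed)
    pool = list(dataset)
    rng.shuffle(pool)
    support, eval_items = pool[:k], pool[k:k + n_eval]  # k = 0 gives the zero-shot baseline
    correct = 0
    for text, label in eval_items:
        pred = few_shot_classify(text, examples=support, labels=labels)
        correct += int(pred.lower() == label.lower())
    return correct / max(len(eval_items), 1)

labels = ["positive", "negative", "neutral"]  # placeholder label set
dataset = load_labeled_pairs()                # placeholder data source
for k in (0, 1, 4, 8):
    print(f"{k}-shot accuracy: {accuracy_at_k(dataset, labels, k):.2%}")
</syntaxhighlight>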