Large language models (LLMs) rely on prompts to perform tasks like text generation, classification, and question answering. But not all prompts are created equal. In this post, we’ll explore the key differences between soft prompts and hard prompts, how they work, and when to use each.
What Are Hard Prompts?
A hard prompt is a manually written instruction in natural language. For example:
“Classify the sentiment of this review as Positive or Negative: ‘This movie was amazing!’”
Hard prompts work well for zero-shot or few-shot learning, where you guide the LLM by providing clear instructions. However, they have limitations:
Fixed format: Cannot adapt dynamically to new tasks.
May be suboptimal: Slight wording changes can significantly impact performance.
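As a small illustration, a hard prompt is ultimately just a string you assemble by hand. The sketch below builds zero-shot and few-shot versions of the sentiment prompt above; the helper name `build_hard_prompt` and the `Review:` / `Sentiment:` layout are assumptions made for this example, not a required format.

```python
from typing import Optional, Sequence, Tuple

# Instruction text taken from the example above.
INSTRUCTION = "Classify the sentiment of this review as Positive or Negative:"


def build_hard_prompt(
    review: str,
    examples: Optional[Sequence[Tuple[str, str]]] = None,
) -> str:
    """Assemble a plain-text prompt; labeled examples make it few-shot."""
    lines = [INSTRUCTION]
    for example_review, label in examples or []:
        lines.append(f"Review: '{example_review}'")
        lines.append(f"Sentiment: {label}")
    # The review we actually want classified goes last, with the label left open.
    lines.append(f"Review: '{review}'")
    lines.append("Sentiment:")
    return "\n".join(lines)


zero_shot = build_hard_prompt("This movie was amazing!")
few_shot = build_hard_prompt(
    "This movie was amazing!",
    examples=[
        ("I loved every minute.", "Positive"),
        ("A complete waste of time.", "Negative"),
    ],
)
print(few_shot)
```

Every word in this template is a manual choice, which is exactly why small wording changes can shift the model's output.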
What Are Soft Prompts?
A soft prompt is a short sequence of trainable embedding vectors prepended to the embedded input text. Unlike hard prompts, soft prompts are not human-readable: they live in the model’s embedding space and need not correspond to any real tokens. During training, the soft prompt is optimized to elicit the desired behavior from the LLM, using examples of the task at hand, such as labeled reviews for sentiment classification.
Instead of:
“Classify the sentiment of this review as Positive or Negative: …”
We prepend a learned embedding (a matrix of shape p × d, where p is the number of prompt tokens and d is the embedding dimension). The model then learns to associate these embeddings with task-specific behavior.
Soft prompts are optimized using gradient descent, but the LLM remains frozen: the task loss is backpropagated through the model, and only the soft prompt embeddings are updated.
The benefit of soft prompts is that they can be trained on a small amount of data, making them a powerful alternative to fine-tuning large models. They are particularly useful for domain-specific tasks where hard prompts may not perform well.
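To make the mechanics concrete, here is a minimal PyTorch sketch of prepending a p × d soft prompt to the input embeddings and training it while the model stays frozen. The `TinyFrozenClassifier` is a hypothetical stand-in for a pretrained LLM (a real one would be loaded from a checkpoint), and the sizes, random data, and learning rate are illustrative assumptions only.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)


class TinyFrozenClassifier(nn.Module):
    """Hypothetical stand-in for a pretrained LLM (assumption for this sketch)."""

    def __init__(self, vocab_size: int, embedding_dim: int, num_classes: int):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embedding_dim)
        self.proj = nn.Linear(embedding_dim, embedding_dim)
        self.head = nn.Linear(embedding_dim, num_classes)

    def forward_from_embeddings(self, embeddings: torch.Tensor) -> torch.Tensor:
        # embeddings: (batch, seq_len, d) -> logits: (batch, num_classes)
        hidden = torch.tanh(self.proj(embeddings))
        return self.head(hidden.mean(dim=1))


embedding_dim = 16  # d
prompt_len = 4      # p
model = TinyFrozenClassifier(vocab_size=50, embedding_dim=embedding_dim, num_classes=2)

# Freeze every model weight; only the soft prompt will receive updates.
for param in model.parameters():
    param.requires_grad_(False)
frozen_before = [p.detach().clone() for p in model.parameters()]

# The soft prompt: a trainable matrix of shape (p, d).
soft_prompt = nn.Parameter(torch.randn(prompt_len, embedding_dim) * 0.1)
initial_prompt = soft_prompt.detach().clone()
optimizer = torch.optim.Adam([soft_prompt], lr=0.05)
loss_fn = nn.CrossEntropyLoss()

# Random token ids and labels standing in for a small task dataset.
token_ids = torch.randint(0, 50, (8, 6))  # (batch, n)
labels = torch.randint(0, 2, (8,))

for step in range(50):
    optimizer.zero_grad()
    token_embeddings = model.embed(token_ids)  # (batch, n, d)
    prompt = soft_prompt.unsqueeze(0).expand(token_ids.size(0), -1, -1)  # (batch, p, d)
    inputs = torch.cat([prompt, token_embeddings], dim=1)  # (batch, p + n, d)
    loss = loss_fn(model.forward_from_embeddings(inputs), labels)
    loss.backward()  # gradients flow through the frozen model into soft_prompt
    optimizer.step()

print("Input shape with soft prompt:", tuple(inputs.shape))
```

The optimizer is given only `soft_prompt`, so even though the loss is computed through the whole model, nothing else can change.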
Key Differences: Soft Prompts vs. Hard Prompts
| Feature | Soft Prompt | Hard Prompt |
| --- | --- | --- |
| Definition | A trainable embedding (vector) prepended to input | A manually written text prompt |
| Model Changes? | No, the LLM remains frozen | No, the LLM remains frozen |
| Optimization | Learned via gradient descent | Manually engineered |
| Format | Continuous vectors in embedding space (p × d) | Discrete text (natural language) |
| Interpretability | Hard to interpret (not human-readable) | Easy to understand |
| Performance | Usually better for domain-specific tasks | Can work well but is often suboptimal |
| Use Cases | Few-shot learning, task adaptation, fine-tuning alternative | General prompting, zero-shot inference |
Why Do Soft Prompts Work?
Soft prompts work because LLMs already contain vast amounts of general knowledge. Instead of fine-tuning the entire model, we train only a small matrix of prompt embeddings to steer the model’s behavior.
They are particularly useful when:
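Full fine-tuning is too expensive (e.g., for LLMs with billions of parameters). Note that training a soft prompt still requires gradient access to the model, so this applies to models whose weights you can load, not to API-only models such as GPT-4.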
We need task-specific adaptation without modifying model weights.
Hard prompts underperform, and we want an optimized alternative.
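A quick back-of-the-envelope calculation shows how small the trainable part is. The numbers below (20 prompt tokens, an embedding dimension of 4096, and a 7-billion-parameter frozen model) are assumptions chosen for illustration, not figures for any specific model.

```python
# Illustrative sizes (assumptions, not taken from a specific model).
prompt_tokens = 20            # p: number of soft prompt tokens
embedding_dim = 4096          # d: embedding dimension
model_params = 7_000_000_000  # parameters in the frozen LLM

# A soft prompt is a p x d matrix, so it has p * d trainable values.
soft_prompt_params = prompt_tokens * embedding_dim
fraction = soft_prompt_params / model_params

print(f"Trainable soft prompt parameters: {soft_prompt_params:,}")  # 81,920
print(f"Share of the frozen model's size: {fraction:.6%}")
```

Under these assumptions the soft prompt is a tiny fraction of the model, which is why one frozen model can serve many tasks with a separate small prompt per task.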
Example
    import torch
    import torch.nn as nn
    import torch.optim as optim
    import numpy as np

    # Define a simple model with learnable soft prompts
    class SoftPromptModel(nn.Module):
        def __init__(self, prompt_size: int, embedding_dim: int):
            super().__init__()
            self.soft_prompt = nn.Parameter(torch.randn(prompt_size, embedding_dim))

        def forward(self):
            return self.soft_prompt

    # Define parameters
    prompt_size = 5  # Number of soft prompt tokens
    embedding_dim = 10  # Embedding size per token
    num_epochs = 100  # Number of training epochs
    learning_rate = 0.01  # Learning rate

    # Initialize model
    model = SoftPromptModel(prompt_size, embedding_dim)
    optimizer = optim.Adam(model.parameters(), lr=learning_rate)
    loss_fn = nn.MSELoss()

    # Dummy target soft prompt embeddings
    target_embeddings = torch.randn(prompt_size, embedding_dim)

    # Training loop
    for epoch in range(num_epochs):
        optimizer.zero_grad()
        output = model()
        loss = loss_fn(output, target_embeddings)
        loss.backward()
        optimizer.step()
        if (epoch + 1) % 10 == 0:
            print(f"Epoch [{epoch + 1}/{num_epochs}], Loss: {loss.item():.4f}")

    # Final learned soft prompt embeddings
    print("Final Soft Prompt Embeddings:", model().detach().numpy())
Use hard prompts for general tasks where simple instructions work well.
Use soft prompts when fine-tuning is infeasible but you need better performance.
Soft prompts are better for domain-specific adaptation without modifying the LLM.
Both approaches have their place, but soft prompts provide a powerful, low-cost alternative to fine-tuning. As LLMs continue to evolve, expect soft prompting techniques to become increasingly important in real-world applications.