The Problem
Large Language Models often generate excessively verbose responses, even when concise, informative answers would be more valuable. This experiment explores a simple yet effective approach to guide models toward brevity without sacrificing information quality.
The Approach
Rather than abstract instructions like "be concise," this framework uses single-shot training: one concrete example of the desired format, embedded in the system prompt (in-context guidance, with no fine-tuning involved).
Two-Phase Methodology
Phase 1: Baseline Evaluation
Tested 14 models using a standardized product recommendation prompt (power bank selection) without any brevity instructions to establish natural response lengths.
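A minimal sketch of what such a baseline loop can look like, assuming an OpenAI-compatible router that serves all evaluated models behind one endpoint; the base URL, model identifiers, and prompt wording below are placeholders rather than the experiment's exact values:

```python
from openai import OpenAI

# Assumption: a single OpenAI-compatible endpoint fronts all 14 models.
client = OpenAI(base_url="https://example-router/v1", api_key="YOUR_KEY")

# Placeholder identifiers; the experiment covered 14 models in total.
MODELS = ["ai21/jamba-large", "mistralai/mistral-large", "openai/gpt-oss-120b"]

PROMPT = "Recommend a power bank for international travel and explain your choice."

baseline_lengths = {}
for model in MODELS:
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": PROMPT}],  # no brevity instructions
    )
    text = resp.choices[0].message.content
    baseline_lengths[model] = len(text.split())  # natural response length in words
```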
Phase 2: Single-Shot Training
Selected models received system prompts containing one optimized response example to guide future outputs toward similar brevity.
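A sketch of how such a system prompt can be assembled; the question/answer pair below is a hypothetical stand-in for the repository's optimized, model-specific examples:

```python
# Hypothetical optimized example -- the repository ships real, model-specific ones.
EXAMPLE_Q = "Which power bank should I buy for a week of travel?"
EXAMPLE_A = (
    "A 10,000 mAh model from a major brand: roughly three phone charges, "
    "under 200 g, airline-safe, and typically $20-30. Step up to 20,000 mAh "
    "only if you also charge a tablet."
)

SYSTEM_PROMPT = (
    "Answer product questions with the same length, structure, and level of "
    "detail as this example.\n\n"
    f"Q: {EXAMPLE_Q}\nA: {EXAMPLE_A}"
)

# The single-shot example rides along with every subsequent user request.
messages = [
    {"role": "system", "content": SYSTEM_PROMPT},
    {"role": "user", "content": "Recommend a power bank for daily commuting."},
]
```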
Key Findings
- 5.5x: difference between longest and shortest baseline responses
- 794 words: mean baseline response length
- 60-75%: word reduction in the optimized examples
Model Response Length Comparison
[Chart: response lengths across the 14 evaluated models]
Comprehensive Verbosity Analysis
Summary statistics and notable outliers from the baseline responses
Response Length Variation
- Longest: 1,632 words (OpenAI GPT-OSS-120B)
- Shortest: 295 words (AI21 Jamba Large)
- Standard deviation: 456 words
Most Concise Performers
- AI21 Jamba Large - 295 words
- Mistral Large - 352 words
- Meta Llama 4 Maverick - 397 words
Most Verbose Performers
- OpenAI GPT-OSS-120B - 1,632 words
- Google Gemini 2.5 Flash - 1,607 words
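These summary numbers are straightforward to recompute from the raw word counts. A short sketch, using only the five counts quoted above (the full 14-model dataset in the repository is needed to reproduce the 794-word mean and 456-word standard deviation):

```python
import statistics

# Word counts quoted above; the repository holds all 14.
word_counts = {
    "OpenAI GPT-OSS-120B": 1632,
    "Google Gemini 2.5 Flash": 1607,
    "Meta Llama 4 Maverick": 397,
    "Mistral Large": 352,
    "AI21 Jamba Large": 295,
}

lengths = list(word_counts.values())
print(f"mean: {statistics.mean(lengths):.0f} words")
print(f"std dev (sample): {statistics.stdev(lengths):.0f} words")
print(f"longest/shortest: {max(lengths) / min(lengths):.1f}x")  # 1632 / 295 ≈ 5.5x
```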
Repository Contents
- Raw Response Data: Complete baseline outputs from all tested models
- Optimized Examples: Demonstrating ideal brevity (60-75% word reduction)
- Model-Specific System Prompts: Implementing single-shot training for practical application
- Statistical Analysis: Comprehensive comparison of response lengths and patterns
Practical Applications
This approach offers several benefits for LLM deployment:
- Cost Reduction: Shorter responses mean fewer output tokens and lower API costs (see the back-of-envelope sketch after this list)
- User Experience: Concise responses are faster to read and process
- Efficiency: One example is simpler than complex prompt engineering
- Reusability: The framework can be adapted to different use cases and domains
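As a back-of-envelope illustration of the cost point above (the tokens-per-word ratio and the per-token price are assumptions for illustration, not measured values):

```python
# All constants below are assumptions; substitute your provider's real numbers.
TOKENS_PER_WORD = 1.33                 # rough English average; varies by tokenizer
PRICE_USD_PER_1M_OUTPUT_TOKENS = 10.0  # assumed price; check your provider

baseline_words = 794                   # mean baseline length from this experiment
reduction = 0.70                       # midpoint of the observed 60-75% cut

tokens_saved = baseline_words * TOKENS_PER_WORD * reduction
# tokens/response * (USD per 1M tokens) * 1M responses = USD per 1M responses
usd_saved_per_1m_responses = tokens_saved * PRICE_USD_PER_1M_OUTPUT_TOKENS
print(f"~{tokens_saved:.0f} output tokens saved per response")
print(f"~${usd_saved_per_1m_responses:,.0f} saved per million responses")
```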
Get Involved
This is an open experiment exploring effective LLM prompting techniques. The repository includes all data, prompts, and analysis for transparency and reproducibility.