13 changes: 13 additions & 0 deletions docs/release-notes/v2.2.1.md
@@ -0,0 +1,13 @@
## Release v2.2.1
### What's changed

#### Added features:
* add a multi-objective task
* more seamless and robust interfaces between the components

#### Further changes:
* refactor the tutorial
* improve robustness in handling strings & prompt objects
* fixes in block tracking and idx subsampling in CAPO

**Full Changelog**: [here](https://github.com/finitearth/promptolution/compare/2.2.0...v2.2.1)
3 changes: 2 additions & 1 deletion mkdocs.yml
@@ -47,6 +47,7 @@ nav:
- Home: index.md
- Release Notes:
- Overview: release-notes.md
- v2.2.1: release-notes/v2.2.1.md
- v2.2.0: release-notes/v2.2.0.md
- v2.1.0: release-notes/v2.1.0.md
- v2.0.1: release-notes/v2.0.1.md
@@ -74,7 +75,7 @@ nav:
- Exemplar Selectors: api/exemplar_selectors.md
- Tutorials:
- Getting Started: examples/getting_started.md
- LLM as Judge Tutorial: examples/llm_as_judge_tutorial.md
- LLM-as-a-Judge Tutorial: examples/llm_as_judge_tutorial.md
- Reward Task Tutorial: examples/reward_task_tutorial.md

markdown_extensions:
2 changes: 1 addition & 1 deletion promptolution/tasks/judge_tasks.py
@@ -60,7 +60,7 @@


class JudgeTask(BaseTask):
"""Task that evaluates a predictor using an LLM as a judge, optionally accepting a ground truth."""
"""Task that evaluates a predictor using an LLM-as-a-judge, optionally accepting a ground truth."""

def __init__(
self,
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -1,6 +1,6 @@
[tool.poetry]
name = "promptolution"
version = "2.2.0"
version = "2.2.1"
description = "A framework for prompt optimization and a zoo of prompt optimization algorithms."
authors = ["Tom Zehle, Moritz Schlager, Timo Heiß"]
readme = "README.md"
16 changes: 7 additions & 9 deletions tests/optimizers/test_capo.py
@@ -6,7 +6,6 @@

from promptolution.optimizers.capo import CAPO
from promptolution.utils.prompt import Prompt
from promptolution.utils.templates import CAPO_CROSSOVER_TEMPLATE, CAPO_MUTATION_TEMPLATE


def test_capo_initialization(mock_meta_llm, mock_predictor, initial_prompts, mock_task, mock_df):
@@ -195,18 +194,20 @@ def test_capo_crossover_prompt(mock_meta_llm, mock_predictor, initial_prompts, m
meta_llm=mock_meta_llm,
initial_prompts=initial_prompts,
df_few_shots=mock_df,
crossovers_per_iter=1, # Only perform one crossover so we can test the exact prompt
)

import random

random.seed(42)
mother = Prompt("Classify the sentiment of the text.", ["Input: I love this! Output: Positive"])
father = Prompt("Determine if the review is positive or negative.", ["Input: This is terrible. Output: Negative"])
optimizer._crossover([mother, father])

full_task_desc = mock_task.task_description + "\n" + optimizer.predictor.extraction_description

expected_meta_prompt = (
CAPO_CROSSOVER_TEMPLATE.replace("<mother>", mother.instruction)
optimizer.crossover_template.replace("<mother>", mother.instruction)
.replace("<father>", father.instruction)
.replace("<task_desc>", full_task_desc)
.strip()
)

assert str(mock_meta_llm.call_history[0]["prompts"][0]) == expected_meta_prompt
@@ -221,13 +222,10 @@ def test_capo_mutate_prompt(mock_meta_llm, mock_predictor, initial_prompts, mock
initial_prompts=initial_prompts,
df_few_shots=mock_df,
)
full_task_desc = mock_task.task_description + "\n" + optimizer.predictor.extraction_description

parent = Prompt("Classify the sentiment of the text.", ["Input: I love this! Output: Positive"])
optimizer._mutate([parent])

expected_meta_prompt = CAPO_MUTATION_TEMPLATE.replace("<instruction>", parent.instruction).replace(
"<task_desc>", full_task_desc
)
expected_meta_prompt = optimizer.mutation_template.replace("<instruction>", parent.instruction)

assert mock_meta_llm.call_history[0]["prompts"][0] == expected_meta_prompt
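The updated tests build the expected meta prompt from the optimizer's own `crossover_template` attribute instead of the module-level `CAPO_CROSSOVER_TEMPLATE` constant. A minimal sketch of the placeholder substitution these tests verify — the template string and helper below are illustrative stand-ins, not promptolution's actual template or API:

```python
# Illustrative template; the real CAPO template text differs, but the
# placeholder names (<mother>, <father>, <task_desc>) mirror the tests above.
crossover_template = (
    "Combine the following two instructions into one:\n"
    "<mother>\n<father>\nTask description: <task_desc>"
)


def build_crossover_prompt(template: str, mother: str, father: str, task_desc: str) -> str:
    """Fill the crossover template the same way the test does."""
    return (
        template.replace("<mother>", mother)
        .replace("<father>", father)
        .replace("<task_desc>", task_desc)
        .strip()
    )


prompt = build_crossover_prompt(
    crossover_template,
    "Classify the sentiment of the text.",
    "Determine if the review is positive or negative.",
    "Sentiment classification.",
)
```

Reading the template off the optimizer instance keeps the test valid even if a user passes a custom template at construction time.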
22 changes: 11 additions & 11 deletions tutorials/getting_started.ipynb
@@ -8,7 +8,7 @@
"\n",
"## Welcome to Promptolution! \n",
"\n",
"Discover a powerful tool for evolving and optimizing your LLM prompts. This notebook provides a friendly introduction to Promptolution's core functionality.\n",
"Discover a powerful tool for evolving and optimizing your LLM prompts. This notebook provides a friendly introduction to Promptolution's core functionality by showcasing how you can easily find the best prompt to solve a classification problem.\n",
"\n",
"We're excited to have you try Promptolution - let's get started!"
]
@@ -73,7 +73,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Below, we're using a subsample of the subjectivity dataset from Hugging Face as an example. When using your own dataset, simply ensure you name the input column \"x\" and the target column \"y\", and provide a brief description of your task, that will parsed to the meta-llm during optimization."
"Below, we're using a subsample of the subjectivity dataset from Hugging Face as an example. When using your own dataset, simply ensure you name the input column \"x\" and the target column \"y\", and provide a brief description of your task, which will be passed to the meta-llm during optimization."
]
},
{
@@ -104,7 +104,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"We've defined some starter prompts below, but feel free to experiment! You might also want to explore create_prompts_from_samples to automatically generate initial prompts based on your data."
"We've defined some starter prompts below, but this isn't strictly necessary, since Promptolution can also automatically generate initial prompts based on your data or the provided task description."
]
},
{
@@ -146,28 +146,28 @@
"1. vLLM backend (for efficient serving of large language models)\n",
"1. API-based LLMs (compatible with any provider following the OpenAI standard)\n",
"\n",
"For this demonstration, we'll use the DeepInfra API, but you can easily switch to other providers like Anthropic or OpenAI by simply changing the base_url and llm string in the configuration."
"For this demonstration, we'll use the DeepInfra API, but you can easily switch to other providers like Anthropic or OpenAI."
]
},
{
"cell_type": "code",
"execution_count": 15,
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"api_key = \"YOUR_API_KEY\" # Replace with your Promptolution API key"
"api_key = \"YOUR_API_KEY\" # Replace with your API key"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here's an explanation of each configuration parameter in the ExperimentConfig:\n",
"- `optimizer`: The algorithm used for prompt optimization. Currently we support \"capo\", \"evopromptga\", \"evopromptde\", and \"opro\". For this example, we use \"capo\" as it is capable of leveraging few-shot examples.\n",
"- `task_description`: A string describing the task you're optimizing prompts for. This is used to provide the meta-llm with context about your task.\n",
"Here's an explanation of the most important configuration parameters in the ExperimentConfig:\n",
"- `optimizer`: The algorithm used for prompt optimization. For this example, we use \"capo\" as it is capable of leveraging few-shot examples.\n",
"- `task_description`: A string describing the task you're optimizing prompts for.\n",
"- `prompts`: A list of initial prompt strings that will be used as the starting point for optimization.\n",
"- `n_steps`: The number of optimization steps to run. Higher values allow more exploration and refinement but require more API calls and computational resources.\n",
"- `api_url`: The API endpoint URL used to access the language model. This example uses DeepInfra's API which follows the OpenAI standard.\n",
"- `n_steps`: The number of optimization steps to run.\n",
"- `api_url`: The API endpoint URL used to access the language model.\n",
"- `llm`: The LLM to use for the experiment, as both downstream and meta LLM.\n",
"- `token`: Your API authentication token required to access the language model service."
]
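The trimmed parameter list above still maps one-to-one onto the notebook's configuration object. A minimal sketch with placeholder values — the endpoint URL, model name, and prompts below are illustrative assumptions, and in the notebook these fields are passed to promptolution's `ExperimentConfig`:

```python
# Illustrative values only: swap in your own provider URL, model, and token.
# In the notebook these fields are handed to promptolution's ExperimentConfig.
config = {
    "optimizer": "capo",  # the optimizer used in this tutorial
    "task_description": "Classify whether a sentence is subjective or objective.",
    "prompts": ["Classify the sentiment of the text."],  # starting population
    "n_steps": 10,  # more steps refine further but cost more API calls
    "api_url": "https://api.deepinfra.com/v1/openai",  # assumed OpenAI-style endpoint
    "llm": "meta-llama/Meta-Llama-3.1-8B-Instruct",  # assumed model identifier
    "token": "YOUR_API_KEY",  # replace with your API key
}
```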
26 changes: 13 additions & 13 deletions tutorials/llm_as_judge_tutorial.ipynb
@@ -4,11 +4,11 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"# Getting Started: LLM as a Judge with Promptolution\n",
"# Getting Started: LLM-as-a-Judge with Promptolution\n",
"\n",
"## Welcome to Promptolution! \n",
"\n",
"Discover a powerful tool for evolving and optimizing your LLM prompts. This notebook provides a friendly introduction to one of Promptolution's most advanced features: LLM as a Judge.\n",
"Discover a powerful tool for evolving and optimizing your LLM prompts. This notebook provides a friendly introduction to one of Promptolution's most advanced features: LLM-as-a-Judge.\n",
"\n",
"While the standard getting_started notebook shows how to optimize for classification tasks, this guide will focus on something different. We'll optimize prompts for a creative task where there's no single \"correct\" answer: *Finding an optimal argument for a statement*!"
]
@@ -26,7 +26,7 @@
"- The helpfulness of a summary?\n",
"- The persuasiveness of an essay?\n",
"\n",
"This is where LLM as a Judge comes in. Instead of relying on a pre-defined dataset of labels, we use another powerful Language Model (the \"judge\") to score the output of our prompts. The process looks like this:\n",
"This is where LLM-as-a-Judge comes in. Instead of relying on a pre-defined dataset of labels, we use another powerful Language Model (the \"judge\") to score the output of our prompts. The process looks like this:\n",
"\n",
"A candidate prompt is used to generate a response (e.g., an argument).\n",
"A \"judge\" LLM then evaluates this response based on the task provided and assigns a score.\n",
@@ -111,6 +111,13 @@
"df = pd.read_csv(\"hf://datasets/ibm-research/argument_quality_ranking_30k/dev.csv\").sample(300)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's look at what we're working with:"
]
},
{
"cell_type": "code",
"execution_count": 18,
@@ -141,13 +148,6 @@
"Our task: **Given a controversial statement, generate the strongest possible argument supporting that position.**"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's look at what we're working with:"
]
},
{
"cell_type": "markdown",
"metadata": {},
@@ -202,7 +202,7 @@
"metadata": {},
"outputs": [],
"source": [
"api_key = \"YOUR_API_KEY\" # Replace with your Promptolution API key"
"api_key = \"YOUR_API_KEY\" # Replace with your API key"
]
},
{
@@ -257,9 +257,9 @@
"With everything configured, you're ready to optimize your prompts! The run_experiment function will:\n",
"\n",
"1. Evaluate your initial prompts by generating arguments and having the judge LLM score them\n",
"1. Use evolutionary operators (mutation, crossover) to create new prompt variations from the 1. best-performing ones\n",
"1. Use evolutionary operators (mutation, crossover) to create new prompt variations from the best-performing ones\n",
"1. Test these new prompt candidates and select the fittest ones for the next generation\n",
"1. Repeat this evolutionary process for the specified number of steps, gradually improving prompt 1. quality"
"1. Repeat this evolutionary process for the specified number of steps, gradually improving prompt quality"
]
},
{
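The judge workflow this tutorial describes — generate a response with a candidate prompt, have a judge LLM score it, keep the best-scoring prompts — can be sketched with stand-in functions. Neither `generate` nor `judge_score` is promptolution's actual API; both are toy stubs for illustration:

```python
def generate(prompt: str, statement: str) -> str:
    # Stand-in for the downstream LLM producing an argument.
    return f"{prompt} {statement}"


def judge_score(response: str) -> float:
    # Stand-in for the judge LLM; a toy score based on response length.
    return min(len(response) / 100.0, 1.0)


def evaluate_prompt(prompt: str, statements: list[str]) -> float:
    """Mean judge score of a prompt's responses over the statements."""
    scores = [judge_score(generate(prompt, s)) for s in statements]
    return sum(scores) / len(scores)


candidates = ["Argue persuasively for:", "Give the strongest possible case for:"]
statements = ["We should subsidize public transport."]
best = max(candidates, key=lambda p: evaluate_prompt(p, statements))
```

The evolutionary loop then mutates and recombines the surviving prompts and repeats this evaluation for the configured number of steps.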
37 changes: 7 additions & 30 deletions tutorials/reward_task_tutorial.ipynb
@@ -50,18 +50,9 @@
},
{
"cell_type": "code",
"execution_count": 1,
"execution_count": null,
"metadata": {},
"outputs": [
{
"name": "stderr",
"output_type": "stream",
"text": [
"c:\\Users\\tzehl\\anaconda3\\envs\\d\\Lib\\site-packages\\tqdm\\auto.py:21: TqdmWarning: IProgress not found. Please update jupyter and ipywidgets. See https://ipywidgets.readthedocs.io/en/stable/user_install.html\n",
" from .autonotebook import tqdm as notebook_tqdm\n"
]
}
],
"outputs": [],
"source": [
"import pandas as pd\n",
"from promptolution.utils import ExperimentConfig\n",
@@ -147,7 +138,7 @@
"cell_type": "markdown",
"metadata": {},
"source": [
"Here are some starter prompts for JSON extraction. Feel free to experiment with your own approaches!"
"Here is a starter prompt for JSON extraction. Feel free to experiment with your own approaches!"
]
},
{
@@ -184,7 +175,7 @@
"1. vLLM backend (for efficient serving of large language models)\n",
"1. API-based LLMs (compatible with any provider following the OpenAI standard)\n",
"\n",
"For this demonstration, we'll use the DeepInfra API, but you can easily switch to other providers like Anthropic or OpenAI by simply changing the base_url and llm string in the configuration."
"For this demonstration, we'll use the DeepInfra API, but you can easily switch to other providers like Anthropic or OpenAI."
]
},
{
@@ -193,21 +184,7 @@
"metadata": {},
"outputs": [],
"source": [
"api_key = \"YOUR_API_KEY\" # Replace with your Promptolution API key"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Here's an explanation of each configuration parameter in the ExperimentConfig:\n",
"- `optimizer`: The algorithm used for prompt optimization. Currently we support \"capo\", \"evopromptga\", \"evopromptde\", and \"opro\". For this example, we use \"capo\" as it is capable of leveraging few-shot examples.\n",
"- `task_description`: A string describing the task you're optimizing prompts for. This is used to provide the meta-llm with context about your task.\n",
"- `prompts`: A list of initial prompt strings that will be used as the starting point for optimization.\n",
"- `n_steps`: The number of optimization steps to run. Higher values allow more exploration and refinement but require more API calls and computational resources.\n",
"- `api_url`: The API endpoint URL used to access the language model. This example uses DeepInfra's API which follows the OpenAI standard.\n",
"- `llm`: The LLM to use for the experiment, as both downstream and meta LLM.\n",
"- `token`: Your API authentication token required to access the language model service."
"api_key = \"YOUR_API_KEY\" # Replace with your API key"
]
},
{
@@ -447,7 +424,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "d",
"display_name": "promptolution-t4XIP6Xc-py3.12",
"language": "python",
"name": "python3"
},
@@ -461,7 +438,7 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.12.9"
"version": "3.12.3"
}
},
"nbformat": 4,
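The reward-task tutorial optimizes prompts for JSON extraction, where fitness comes from a programmatic reward rather than labels or a judge. A hedged sketch of such a reward function — this mirrors the tutorial's idea, not promptolution's actual reward-task API:

```python
import json


def json_validity_reward(output: str) -> float:
    """Toy reward: 1.0 if the model output parses as JSON, else 0.0.

    A real reward could additionally check for required keys or value types.
    """
    try:
        json.loads(output)
        return 1.0
    except json.JSONDecodeError:
        return 0.0


json_validity_reward('{"name": "Ada", "age": 36}')  # valid JSON -> 1.0
json_validity_reward("not json")                    # -> 0.0
```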