uv run pytest tests/integration/test_miprov2_descriptions.py -v -s 
====================================================================== test session starts ======================================================================
platform darwin -- Python 3.12.8, pytest-9.0.2, pluggy-1.6.0 -- /Users/davidberenstein/conductor/workspaces/dspydantic/kingston/.venv/bin/python3
cachedir: .pytest_cache
rootdir: /Users/davidberenstein/conductor/workspaces/dspydantic/kingston
configfile: pyproject.toml
plugins: anyio-4.12.0
collected 9 items                                                                                                                                               

tests/integration/test_miprov2_descriptions.py::test_medical_record_optimization[miprov2zeroshot] ╭─────────────────────────────────────────────────────────────────── DSPydantic Optimization ───────────────────────────────────────────────────────────────────╮
│ Model: MedicalRecord                                                                                                                                          │
│ Optimizer: MIPROV2ZEROSHOT                                                                                                                                    │
│ Mode: single-pass (all fields together)                                                                                                                       │
│ Examples: 5                                                                                                                                                   │
│ Fields: 5                                                                                                                                                     │
│ Threads: 1                                                                                                                                                    │
│ Optimizing: 5 field descriptions                                                                                                                              │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
               Initial Field Descriptions                
                                                         
  Field          Type   Description                      
 ─────────────────────────────────────────────────────── 
  patient_name   str    Full name of the patient         
  age            str    Patient's age in years           
  diagnosis      str    Medical condition diagnosed      
  medication     str    Prescribed medication name       
  dosage         str    Medication dosage and frequency  
                                                         

  Train: 4 examples | Val: 1 examples
FAILED
tests/integration/test_miprov2_descriptions.py::test_reservation_descriptions_not_meta_instructions[miprov2zeroshot] ╭─────────────────────────────────────────────────────────────────── DSPydantic Optimization ───────────────────────────────────────────────────────────────────╮
│ Model: Reservation                                                                                                                                            │
│ Optimizer: MIPROV2ZEROSHOT                                                                                                                                    │
│ Mode: single-pass (all fields together)                                                                                                                       │
│ Examples: 5                                                                                                                                                   │
│ Fields: 5                                                                                                                                                     │
│ Threads: 1                                                                                                                                                    │
│ Optimizing: 5 field descriptions                                                                                                                              │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
                         Initial Field Descriptions                         
                                                                            
  Field        Type                                     Description         
 ────────────────────────────────────────────────────────────────────────── 
  restaurant   str                                      Restaurant name     
  date         str                                      Date                
  time         str                                      Time                
  party_size   Literal["1", "2", "3", "4", "5", "6+"]   Number of guests    
  seating      Literal["indoor", "outdoor", "bar"]      Seating preference  
                                                                            

  Train: 4 examples | Val: 1 examples
FAILED
tests/integration/test_miprov2_descriptions.py::test_transaction_descriptions_not_meta_instructions[miprov2zeroshot] ╭─────────────────────────────────────────────────────────────────── DSPydantic Optimization ───────────────────────────────────────────────────────────────────╮
│ Model: Transaction                                                                                                                                            │
│ Optimizer: MIPROV2ZEROSHOT                                                                                                                                    │
│ Mode: single-pass (all fields together)                                                                                                                       │
│ Examples: 5                                                                                                                                                   │
│ Fields: 6                                                                                                                                                     │
│ Threads: 1                                                                                                                                                    │
│ Optimizing: 6 field descriptions                                                                                                                              │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
                                 Initial Field Descriptions                                 
                                                                                            
  Field      Type                                            Description                    
 ────────────────────────────────────────────────────────────────────────────────────────── 
  broker     str                                             Financial institution          
  amount     str                                             Transaction amount             
  security   str                                             Stock or financial instrument  
  date       str                                             Transaction date               
  status     Literal["pending", "completed", "failed"]       Status                         
  type       Literal["equity", "bond", "option", "future"]   Type                           
                                                                                            

  Train: 4 examples | Val: 1 examples
FAILED
tests/integration/test_miprov2_descriptions.py::test_skip_field_description_preserves_originals ╭─────────────────────────────────────────────────────────────────── DSPydantic Optimization ───────────────────────────────────────────────────────────────────╮
│ Model: Reservation                                                                                                                                            │
│ Optimizer: MIPROV2ZEROSHOT                                                                                                                                    │
│ Mode: single-pass (all fields together)                                                                                                                       │
│ Examples: 5                                                                                                                                                   │
│ Fields: 5                                                                                                                                                     │
│ Threads: 1                                                                                                                                                    │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
                         Initial Field Descriptions                         
                                                                            
  Field        Type                                     Description         
 ────────────────────────────────────────────────────────────────────────── 
  restaurant   str                                      Restaurant name     
  date         str                                      Date                
  time         str                                      Time                
  party_size   Literal["1", "2", "3", "4", "5", "6+"]   Number of guests    
  seating      Literal["indoor", "outdoor", "bar"]      Seating preference  
                                                                            

  Train: 4 examples | Val: 1 examples
FAILED
tests/integration/test_miprov2_descriptions.py::test_sequential_optimization_shows_improvement ╭─────────────────────────────────────────────────────────────────── DSPydantic Optimization ───────────────────────────────────────────────────────────────────╮
│ Model: MedicalRecord                                                                                                                                          │
│ Optimizer: MIPROV2ZEROSHOT                                                                                                                                    │
│ Mode: sequential (field-by-field)                                                                                                                             │
│ Examples: 5                                                                                                                                                   │
│ Fields: 5                                                                                                                                                     │
│ Threads: 1                                                                                                                                                    │
│ Optimizing: 5 field descriptions                                                                                                                              │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
               Initial Field Descriptions                
                                                         
  Field          Type   Description                      
 ─────────────────────────────────────────────────────── 
  patient_name   str    Full name of the patient         
  age            str    Patient's age in years           
  diagnosis      str    Medical condition diagnosed      
  medication     str    Prescribed medication name       
  dosage         str    Medication dosage and frequency  
                                                         

  Train: 4 examples | Val: 1 examples
  Baseline score: 100.00%

Phase 1: Optimizing field descriptions (deepest-first)...
FAILED
tests/integration/test_miprov2_descriptions.py::test_early_stopping_patience ╭─────────────────────────────────────────────────────────────────── DSPydantic Optimization ───────────────────────────────────────────────────────────────────╮
│ Model: MedicalRecord                                                                                                                                          │
│ Optimizer: MIPROV2ZEROSHOT                                                                                                                                    │
│ Mode: sequential (field-by-field)                                                                                                                             │
│ Examples: 5                                                                                                                                                   │
│ Fields: 5                                                                                                                                                     │
│ Threads: 1                                                                                                                                                    │
│ Optimizing: 5 field descriptions                                                                                                                              │
│ Early stopping: after 2 fields without improvement                                                                                                            │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
               Initial Field Descriptions                
                                                         
  Field          Type   Description                      
 ─────────────────────────────────────────────────────── 
  patient_name   str    Full name of the patient         
  age            str    Patient's age in years           
  diagnosis      str    Medical condition diagnosed      
  medication     str    Prescribed medication name       
  dosage         str    Medication dosage and frequency  
                                                         

  Train: 4 examples | Val: 1 examples
  Baseline score: 100.00%

Phase 1: Optimizing field descriptions (deepest-first)...
  [1/5] patient_name (depth 0) ... FAILED
tests/integration/test_miprov2_descriptions.py::test_auto_generate_prompts ╭─────────────────────────────────────────────────────────────────── DSPydantic Optimization ───────────────────────────────────────────────────────────────────╮
│ Model: MedicalRecord                                                                                                                                          │
│ Optimizer: MIPROV2ZEROSHOT                                                                                                                                    │
│ Mode: single-pass (all fields together)                                                                                                                       │
│ Examples: 5                                                                                                                                                   │
│ Fields: 5                                                                                                                                                     │
│ Threads: 1                                                                                                                                                    │
│ Optimizing: 5 field descriptions, system prompt, instruction prompt                                                                                           │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
               Initial Field Descriptions                
                                                         
  Field          Type   Description                      
 ─────────────────────────────────────────────────────── 
  patient_name   str    Full name of the patient         
  age            str    Patient's age in years           
  diagnosis      str    Medical condition diagnosed      
  medication     str    Prescribed medication name       
  dosage         str    Medication dosage and frequency  
                                                         

  System prompt (auto-generated): You are an expert at extracting structured MedicalRecord data from text. Be precise and faithful to the source text.
  Instruction prompt (auto-generated): Extract the following fields from the given text: patient_name, age, diagnosis, medication, dosage. Return only values that are explicitly stated or clearly implied.

  Train: 4 examples | Val: 1 examples
FAILED
tests/integration/test_miprov2_descriptions.py::test_optimizer_compatibility[miprov2zeroshot] ╭─────────────────────────────────────────────────────────────────── DSPydantic Optimization ───────────────────────────────────────────────────────────────────╮
│ Model: Reservation                                                                                                                                            │
│ Optimizer: MIPROV2ZEROSHOT                                                                                                                                    │
│ Mode: single-pass (all fields together)                                                                                                                       │
│ Examples: 5                                                                                                                                                   │
│ Fields: 5                                                                                                                                                     │
│ Threads: 1                                                                                                                                                    │
│ Optimizing: 5 field descriptions                                                                                                                              │
╰───────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────╯
                         Initial Field Descriptions                         
                                                                            
  Field        Type                                     Description         
 ────────────────────────────────────────────────────────────────────────── 
  restaurant   str                                      Restaurant name     
  date         str                                      Date                
  time         str                                      Time                
  party_size   Literal["1", "2", "3", "4", "5", "6+"]   Number of guests    
  seating      Literal["indoor", "outdoor", "bar"]      Seating preference  
                                                                            

  Train: 4 examples | Val: 1 examples

Step 1: Evaluating baseline configuration...
  Baseline score: 100.00%

Step 2: Optimizing 5 field descriptions...
2026/03/20 07:09:56 INFO dspy.teleprompt.mipro_optimizer_v2: 
RUNNING WITH THE FOLLOWING LIGHT AUTO RUN SETTINGS:
num_trials: 25
minibatch: False
num_fewshot_candidates: 6
num_instruct_candidates: 6
valset size: 1

2026/03/20 07:09:56 INFO dspy.teleprompt.mipro_optimizer_v2: 
==> STEP 1: BOOTSTRAP FEWSHOT EXAMPLES <==
2026/03/20 07:09:56 INFO dspy.teleprompt.mipro_optimizer_v2: These will be used for informing instruction proposal.

2026/03/20 07:09:56 INFO dspy.teleprompt.mipro_optimizer_v2: Bootstrapping N=6 sets of demonstrations...
Bootstrapping set 1/6
Bootstrapping set 2/6
 25%|███████████████████████████████▌                                                                                              | 1/4 [00:14<00:44, 14.73s/it]
Bootstrapped 1 full traces after 1 examples for up to 1 rounds, amounting to 1 attempts.
Bootstrapping set 3/6
 75%|██████████████████████████████████████████████████████████████████████████████████████████████▌                               | 3/4 [00:11<00:03,  3.73s/it]
Bootstrapped 3 full traces after 3 examples for up to 1 rounds, amounting to 3 attempts.
Bootstrapping set 4/6
 75%|██████████████████████████████████████████████████████████████████████████████████████████████▌                               | 3/4 [00:00<00:00, 54.79it/s]
Bootstrapped 3 full traces after 3 examples for up to 1 rounds, amounting to 3 attempts.
Bootstrapping set 5/6
 75%|██████████████████████████████████████████████████████████████████████████████████████████████▌                               | 3/4 [00:00<00:00, 53.28it/s]
Bootstrapped 3 full traces after 3 examples for up to 1 rounds, amounting to 3 attempts.
Bootstrapping set 6/6
 75%|██████████████████████████████████████████████████████████████████████████████████████████████▌                               | 3/4 [00:00<00:00, 32.14it/s]
Bootstrapped 3 full traces after 3 examples for up to 1 rounds, amounting to 3 attempts.
2026/03/20 07:10:22 INFO dspy.teleprompt.mipro_optimizer_v2: 
==> STEP 2: PROPOSE INSTRUCTION CANDIDATES <==
2026/03/20 07:10:22 INFO dspy.teleprompt.mipro_optimizer_v2: We will use the few-shot examples from the previous step, a generated dataset summary, a summary of the program code, and a randomly selected prompting tip to propose instructions.
2026/03/20 07:10:23 INFO dspy.teleprompt.mipro_optimizer_v2: 
Proposing N=6 instructions...

