Delimitation

Evaluates the proper use of delimiters in prompts provided to Large Language Models.

Purpose: This test, dubbed the “Delimitation Test”, is engineered to assess whether prompts provided to the Language Learning Model (LLM) correctly use delimiters to mark different sections of the input. Well-delimited prompts simplify the interpretation process for LLM, ensuring responses are precise and accurate.

Test Mechanism: The test employs an LLM to examine prompts for appropriate use of delimiters such as triple quotation marks, XML tags, and section titles. Each prompt is assigned a score from 1 to 10 based on its delimitation integrity. Those with scores equal to or above the preset threshold (which is 7 by default, although it can be adjusted as necessary) pass the test.

Signs of High Risk:

  • The test identifies prompts where a delimiter is missing, improperly placed, or incorrect, which can lead to misinterpretation by the LLM.
  • A high-risk scenario may involve complex prompts with multiple tasks or diverse data where correct delimitation is integral to understanding.
  • Low scores (below the threshold) are a clear indicator of high risk.

Strengths:

  • This test ensures clarity in the demarcation of different components of given prompts.
  • It helps reduce ambiguity in understanding prompts, particularly for complex tasks.
  • Scoring allows for quantified insight into the appropriateness of delimiter usage, aiding continuous improvement.

Limitations:

  • The test only checks for the presence and placement of delimiter, not whether the correct delimiter type is used for the specific data or task.
  • It may not fully reveal the impacts of poor delimitation on LLM’s final performance.
  • Depending on the complexity of the tasks and prompts, the preset score threshold may not be refined enough, requiring regular manual adjustment.