Delimitation
Evaluates the proper use of delimiters in prompts provided to Large Language Models.
Purpose: This test, dubbed the “Delimitation Test”, assesses whether prompts provided to the Large Language Model (LLM) correctly use delimiters to mark distinct sections of the input. Well-delimited prompts simplify the LLM’s interpretation task, helping ensure responses are precise and accurate.
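For instance, a prompt along these lines (an illustrative example, not drawn from the test itself) uses a section title and triple quotation marks so the model cannot confuse the instruction with the data:

```
Instruction: Summarize the text between the triple quotes in one sentence.

Text:
"""
Quarterly revenue grew 12% year over year, driven primarily by the
expansion of the subscription business in EMEA.
"""
```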
Test Mechanism: The test employs an LLM to examine each prompt for appropriate use of delimiters such as triple quotation marks, XML tags, and section titles. Each prompt receives a score from 1 to 10 based on its delimitation integrity. Prompts scoring at or above the threshold (7 by default, adjustable as needed) pass the test.
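A minimal sketch of this grader-plus-threshold flow, assuming an OpenAI-style chat client; the model name, rubric wording, and function name are illustrative assumptions, not the test’s actual implementation:

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

GRADING_INSTRUCTIONS = (
    "You are a prompt-quality grader. Rate how well the prompt below uses "
    "delimiters (triple quotation marks, XML tags, section titles) to "
    "separate its parts. Respond with a single integer from 1 (no "
    "delimitation) to 10 (fully delimited)."
)

def delimitation_score(prompt: str, threshold: int = 7) -> tuple[int, bool]:
    """Ask the grading model for a 1-10 score and compare it to the threshold."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder; any capable grading model works
        messages=[
            {"role": "system", "content": GRADING_INSTRUCTIONS},
            {"role": "user", "content": prompt},
        ],
    )
    # A sketch only: production code should parse the reply defensively
    # rather than assume a bare integer comes back.
    score = int(response.choices[0].message.content.strip())
    return score, score >= threshold
```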
Signs of High Risk:
- The test identifies prompts where a delimiter is missing, misplaced, or incorrect, any of which can lead to misinterpretation by the LLM.
- A high-risk scenario often involves complex prompts with multiple tasks or diverse data, where correct delimitation is essential to comprehension (see the example after this list).
- Low scores (below the threshold) are a clear indicator of high risk.
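For example, an undelimited multi-task prompt (illustrative only) leaves the model guessing where the instructions end and the input begins:

```
Translate this to French and count its words the quick brown fox jumps over the lazy dog
```

A delimited rewrite of the same prompt removes that ambiguity:

```
Task 1: Translate the text between the triple quotes into French.
Task 2: Count the words in the same text.
"""
the quick brown fox jumps over the lazy dog
"""
```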
Strengths:
- This test ensures clarity in the demarcation of the different components of a given prompt.
- It helps reduce ambiguity in understanding prompts, particularly for complex tasks.
- Scoring allows for quantified insight into the appropriateness of delimiter usage, aiding continuous improvement.
Limitations:
- The test only checks for the presence and placement of delimiters, not whether the delimiter type chosen suits the specific data or task.
- It may not fully reveal the impact of poor delimitation on the LLM’s final performance.
- Depending on the complexity of the tasks and prompts, the preset score threshold may be too coarse, requiring regular manual adjustment (a tuning sketch follows this list).
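Where the default proves too coarse, the threshold can be tuned per use case; continuing the hypothetical scorer sketched above:

```python
complex_prompt = "..."  # a multi-task prompt under review (placeholder)
simple_prompt = "..."   # a short single-task prompt (placeholder)

# Hypothetical tuning: demand stricter delimitation for complex,
# multi-task prompts, and relax the bar for simple ones.
score, passed = delimitation_score(complex_prompt, threshold=9)
score, passed = delimitation_score(simple_prompt, threshold=5)
```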