validmind.tests

ValidMind Tests Module

  • data_validation
  • model_validation
  • prompt_validation

list_tests

def list_tests(filter: Optional[str] = None, task: Optional[str] = None, tags: Optional[List[str]] = None, pretty: bool = True, truncate: bool = True) → Union[Dict[str, Callable[..., Any]], None]:

List all available tests with optional filtering
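For example, a quick sketch of browsing the catalog; the filter string, task, and tag shown here are illustrative values, not an exhaustive list:

import validmind as vm

# Show every registered test
vm.tests.list_tests()

# Narrow the listing by a substring filter, a task type, and tags
# (the values below are examples only)
vm.tests.list_tests(filter="Imbalance", task="classification", tags=["tabular_data"])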

load_test

def load_test(test_id: str, test_func: Optional[Callable[..., Any]] = None, reload: bool = False) → Callable[..., Any]:

Load a test by test ID

Test IDs are in the format namespace.path_to_module.TestClassOrFuncName[:tag]. The tag is optional and is used to distinguish between multiple results from the same test.

Arguments

  • test_id (str): The test ID in the format namespace.path_to_module.TestName[:tag]
  • test_func (callable, optional): The test function to load. If not provided, the test will be loaded from the test provider. Defaults to None.
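As a brief sketch, loading a built-in data validation test by its ID (ClassImbalance is one of the tests under validmind.data_validation):

from validmind.tests import load_test

# Returns the underlying test function for programmatic use
class_imbalance = load_test("validmind.data_validation.ClassImbalance")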

describe_test

def describe_test(test_id: Optional[TestID (Union of validmind.data_validation.*, validmind.model_validation.*, validmind.prompt_validation.* and str)] = None, raw: bool = False, show: bool = True) → Union[str, HTML, Dict[str, Any]]:

Describe a test's functionality and parameters

run_test

def run_test(
    test_id: Union[TestID (Union of validmind.data_validation.*, validmind.model_validation.*, validmind.prompt_validation.* and str), None] = None,
    name: Union[str, None] = None,
    unit_metrics: Union[List[TestID (Unit metrics from validmind.unit_metrics.*)], None] = None,
    inputs: Union[Dict[str, Any], None] = None,
    input_grid: Union[Dict[str, List[Any]], List[Dict[str, Any]], None] = None,
    params: Union[Dict[str, Any], None] = None,
    param_grid: Union[Dict[str, List[Any]], List[Dict[str, Any]], None] = None,
    show: bool = True,
    generate_description: bool = True,
    title: Optional[str] = None,
    post_process_fn: Union[Callable[[validmind.vm_models.TestResult], None], None] = None,
    **kwargs
) → validmind.vm_models.TestResult:

Run a ValidMind or custom test

This function is the main entry point for running tests. It can run simple unit metrics, built-in ValidMind tests, custom tests, composite tests made up of multiple unit metrics, and comparison tests made up of multiple tests.

Arguments

  • test_id (TestID): Test ID to run. Not required if name and unit_metrics provided.
  • params (dict): Parameters to customize test behavior. See test details for available parameters.
  • param_grid (Union[Dict[str, List[Any]], List[Dict[str, Any]]]): For comparison tests, either:
    • Dict mapping parameter names to lists of values (creates Cartesian product)
    • List of parameter dictionaries to test
  • inputs (Dict[str, Any]): Test inputs (models/datasets initialized with vm.init_model/dataset)
  • input_grid (Union[Dict[str, List[Any]], List[Dict[str, Any]]]): For comparison tests, either:
    • Dict mapping input names to lists of values (creates Cartesian product)
    • List of input dictionaries to test
  • name (str): Test name (required for composite metrics)
  • unit_metrics (list): Unit metric IDs to run as composite metric
  • show (bool, optional): Whether to display results. Defaults to True.
  • generate_description (bool, optional): Whether to generate a description. Defaults to True.
  • title (str): Custom title for the test result
  • post_process_fn (Callable[[TestResult], None]): Function to post-process the test result

Returns

  • A TestResult object containing the test results

Raises

  • ValueError: If the test inputs are invalid
  • LoadTestError: If the test class fails to load
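A rough end-to-end sketch, assuming a pandas DataFrame df and a trained model have already been wrapped with vm.init_dataset and vm.init_model; the parameter name passed in params is illustrative, so use describe_test to see a given test's actual parameters:

import validmind as vm

vm_dataset = vm.init_dataset(dataset=df, target_column="target")
vm_model = vm.init_model(model)

# Run a single built-in test against the dataset
result = vm.tests.run_test(
    "validmind.data_validation.ClassImbalance",
    inputs={"dataset": vm_dataset},
    params={"min_percent_threshold": 10},  # illustrative parameter
)

# Comparison run: param_grid expands into one run per parameter combination
vm.tests.run_test(
    "validmind.data_validation.ClassImbalance",
    inputs={"dataset": vm_dataset},
    param_grid={"min_percent_threshold": [5, 10, 20]},  # illustrative values
)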

list_tags

def list_tags() → Set[str]:

List all available tags

list_tasks

def list_tasks() → Set[str]:

List all available tasks

list_tasks_and_tags

def list_tasks_and_tags(as_json: bool = False) → Union[str, Dict[str, List[str]]]:

List all available tasks and tags

test

def test(func_or_id: Union[Callable[..., Any], str, None]):

Decorator for creating and registering custom tests

This decorator registers the function it wraps as a test function within ValidMind under the provided ID. Once decorated, the function can be run using the validmind.tests.run_test function.

The function can take two different types of arguments:

  • Inputs: ValidMind model or dataset (or list of models/datasets). These arguments must use the following names: model, models, dataset, datasets.
  • Parameters: Any additional keyword arguments, of any type and with any name, each of which must have a default value.

The function should return one of the following types:

  • Table: Either a list of dictionaries or a pandas DataFrame
  • Plot: Either a matplotlib figure or a plotly figure
  • Scalar: A single number (int or float)
  • Boolean: A single boolean value indicating whether the test passed or failed

The function may also include a docstring. This docstring will be used and logged as the metric's description.

Arguments

  • func_or_id (Union[Callable[..., Any], str, None]): Either the function to decorate or the test ID. If None, the function name is used.

Returns

  • The decorated function.
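A minimal sketch of a custom test defined with this decorator; the test ID, parameter, and logic are illustrative, and the dataset argument is assumed to be a dataset initialized with vm.init_dataset:

import validmind as vm

@vm.test("my_custom_tests.MissingValueRatio")
def missing_value_ratio(dataset, max_ratio: float = 0.1):
    """Reports the share of missing values per column against a threshold."""
    ratios = dataset.df.isna().mean()   # per-column missing-value ratio
    table = ratios.reset_index()
    table.columns = ["Column", "Missing Ratio"]
    return table  # a DataFrame is returned as a table result

Once defined, it runs like any other test, for example vm.tests.run_test("my_custom_tests.MissingValueRatio", inputs={"dataset": vm_dataset}, params={"max_ratio": 0.05}).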

tags

def tags(*tags: str):

Decorator for specifying tags for a test.

Arguments

  • *tags: The tags to apply to the test.

tasks

def tasks(*tasks: str):

Decorator for specifying the task types that a test is designed for.

Arguments

  • *tasks: The task types that the test is designed for.
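These decorators are typically stacked on a custom test so it can be found through the list_tests filters; a brief sketch, with illustrative tag and task names:

import validmind as vm

@vm.test("my_custom_tests.MyTabularCheck")
@vm.tags("tabular_data", "data_quality")
@vm.tasks("classification")
def my_tabular_check(dataset):
    """Summarizes the dataset's numeric columns."""
    return dataset.df.describe()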

register_test_provider

def register_test_provider(namespace: str, test_provider: validmind.vm_models.TestProvider):

Register an external test provider

Arguments

  • namespace (str): The namespace of the test provider
  • test_provider (TestProvider): The test provider

LoadTestError

class LoadTestError(BaseError):

Exception raised when an error occurs while loading a test.

Inherited members

  • From BaseError: class BaseError, description
  • From builtins.BaseException: with_traceback, add_note

LoadTestError

LoadTestError(message: str, original_error: Optional[validmind.vm_models.Exception] = None)

LocalTestProvider

class LocalTestProvider:

Test providers in ValidMind are responsible for loading tests from different sources, such as local files, databases, or remote services. The LocalTestProvider specifically loads tests from the local file system.

To use the LocalTestProvider, you need to provide the root_folder, which is the root directory for local tests. The test_id is a combination of the namespace (set when registering the test provider) and the path to the test class module, where slashes are replaced by dots and the .py extension is left out.

Example usage:

from validmind.tests import LocalTestProvider, register_test_provider

# Create an instance of LocalTestProvider with the root folder
test_provider = LocalTestProvider("/path/to/tests/folder")

# Register the test provider with a namespace
register_test_provider("my_namespace", test_provider)

# List all tests in the namespace (returns a list of test IDs)
test_provider.list_tests()
# this is used by the validmind.tests.list_tests() function to aggregate all tests
# from all test providers

# Load a test using the test_id (namespace + path to test class module)
test = test_provider.load_test("my_namespace.my_test_class")
# full path to the test class module is /path/to/tests/folder/my_test_class.py

Arguments

  • root_folder (str): The root directory for local tests.

LocalTestProvider

LocalTestProvider(root_folder: str)

Initialize the LocalTestProvider with the given root_folder (see class docstring for details)

Arguments

  • root_folder (str): The root directory for local tests.

list_tests

def list_tests(self) → List[str]:

List all tests in the given namespace

Returns

  • A list of test IDs

load_test

def load_test(self, test_id: str) → Callable[..., Any]:

Load the test function identified by the given test_id

Arguments

  • test_id (str): The test ID (does not contain the namespace under which the test is registered)

Returns

  • The test function

Raises

  • FileNotFoundError: If the test is not found

TestProvider

class TestProvider(Protocol):

Protocol for user-defined test providers

list_tests

def list_tests(self) → List[str]:

List all tests in the given namespace

Returns

  • A list of test IDs

load_test

def load_test(self, test_id: str) → callable:

Load the test function identified by the given test_id

Arguments

  • test_id (str): The test ID (does not contain the namespace under which the test is registered)

Returns

  • The test function

Raises

  • FileNotFoundError: If the test is not found
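Because TestProvider is a Protocol, any object that implements these two methods can be passed to register_test_provider. Below is a rough sketch of a provider backed by an in-memory dictionary; the class name, test function, and namespace are illustrative:

from typing import Callable, Dict, List

import validmind as vm


class DictTestProvider:
    """Serves test functions from an in-memory dictionary."""

    def __init__(self, tests: Dict[str, Callable]):
        self._tests = tests

    def list_tests(self) -> List[str]:
        # IDs of all tests this provider knows about (without the namespace prefix)
        return list(self._tests.keys())

    def load_test(self, test_id: str) -> Callable:
        try:
            return self._tests[test_id]
        except KeyError:
            raise FileNotFoundError(f"Test '{test_id}' not found")


def row_count(dataset):
    """Illustrative test: returns the number of rows in the dataset."""
    return len(dataset.df)


provider = DictTestProvider({"row_count": row_count})
vm.tests.register_test_provider("my_provider", provider)
# run as: vm.tests.run_test("my_provider.row_count", inputs={"dataset": vm_dataset})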