janus.metrics.llm_metrics#
Classes#
- LLMMetricOutput – The output of an LLM evaluation metric.
Functions#
- load_prompt – Load a default prompt from a file.
- evaluate – Calculate the LLM self-evaluation score.
- llm_evaluate_option – CLI option to calculate the LLM self-evaluation score.
- llm_evaluate_ref_option – CLI option to calculate the LLM self-evaluation score, for evaluations which require a reference file (e.g. faithfulness).
Module Contents#
- class janus.metrics.llm_metrics.LLMMetricOutput#
Bases: langchain_core.pydantic_v1.BaseModel
The output of an LLM evaluation metric.
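Because LLMMetricOutput is a langchain_core.pydantic_v1.BaseModel, the standard Pydantic v1 API applies to it. A minimal sketch (the model's field schema is not documented on this page, and llm_response_json is a hypothetical variable):

```python
from janus.metrics.llm_metrics import LLMMetricOutput

# Parse an LLM's raw JSON reply into the output model, then inspect it.
# parse_raw() and dict() are standard pydantic v1 BaseModel methods; the
# actual fields are defined in the janus source.
output = LLMMetricOutput.parse_raw(llm_response_json)
print(output.dict())
```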
- janus.metrics.llm_metrics.load_prompt(path, language, parser)#
Load a default prompt from a file.
- Parameters:
path (pathlib.Path) – The path to the file.
language (str) – The language of the prompt.
parser (langchain_core.output_parsers.BaseOutputParser) – The parser to use for parsing the output.
- Returns:
The prompt text.
- Return type:
langchain_core.prompts.PromptTemplate
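A usage sketch: the parser wraps the module's own output model so that its format instructions can be embedded in the loaded prompt (the prompt path shown is hypothetical):

```python
from pathlib import Path

from langchain_core.output_parsers import PydanticOutputParser

from janus.metrics.llm_metrics import LLMMetricOutput, load_prompt

# Build a parser around the module's output model, then load a prompt
# template for the given language. The file path is a placeholder.
parser = PydanticOutputParser(pydantic_object=LLMMetricOutput)
prompt = load_prompt(
    Path("prompts/eval/quality.txt"),
    language="python",
    parser=parser,
)
# prompt is a langchain_core.prompts.PromptTemplate
```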
- janus.metrics.llm_metrics.evaluate(target, language, model, prompt_path, reference=None)#
Calculate the LLM self-evaluation score.
- Parameters:
target (str) – The target text.
language (str) – The language that the target code is written in.
prompt_path (pathlib.Path) – The filepath of the prompt text.
reference (str | None) – The reference text.
model (str) – The model to use for the evaluation.
- Returns:
The LLM evaluation score.
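A direct-call sketch; the model name, prompt path, and target variable are placeholders, not values prescribed by this module:

```python
from pathlib import Path

from janus.metrics.llm_metrics import evaluate

# Score a piece of target code with a self-evaluation prompt. The metric
# here needs no reference, so reference is left as None.
score = evaluate(
    target=translated_code,                        # hypothetical: code being judged
    language="python",
    model="gpt-4o",                                # hypothetical model name
    prompt_path=Path("prompts/eval/quality.txt"),  # hypothetical prompt file
    reference=None,
)
```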
- janus.metrics.llm_metrics.llm_evaluate_option(target, metric='quality', prompt=None, num_eval=1, **kwargs)#
CLI option to calculate the LLM self-evaluation score.
- Parameters:
target (str) – The target text.
reference – The reference text, if any (passed through **kwargs).
metric (typing_extensions.Annotated[str, typer.Option('--metric', '-m', help='The pre-defined metric to use for evaluation.', click_type=click.Choice(['quality', 'clarity', 'faithfulness', 'completeness', 'hallucination', 'readability', 'usefulness']))]) – The pre-defined metric to use for evaluation.
prompt (typing_extensions.Annotated[str, None, typer.Option('--prompt', '-P', help='A custom prompt in a .txt file to use for evaluation.')]) – The prompt text.
num_eval (typing_extensions.Annotated[int, typer.Option('-n', '--num-eval', help='Number of times to run the evaluation')]) – Number of times to run the evaluation.
- Returns:
The LLM evaluation score.
- Return type:
Any
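A direct-call sketch of the reference-free variant. On the command line the same parameters surface as the --metric/-m, --prompt/-P, and -n/--num-eval options documented above; anything else (such as the model) presumably arrives through **kwargs, so this call is illustrative rather than exact:

```python
from janus.metrics.llm_metrics import llm_evaluate_option

# Judge a target for clarity, running the evaluation three times.
# (How the three runs are aggregated is assumed, not documented here.)
score = llm_evaluate_option(
    target=translated_code,  # hypothetical target text
    metric="clarity",        # any of the choices listed above
    num_eval=3,
)
```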
- janus.metrics.llm_metrics.llm_evaluate_ref_option(target, reference, metric='faithfulness', prompt=None, num_eval=1, **kwargs)#
CLI option to calculate the LLM self-evaluation score, for evaluations that require a reference file (e.g. faithfulness).
- Parameters:
target (str) – The target text.
reference (str) – The reference text.
metric (typing_extensions.Annotated[str, typer.Option('--metric', '-m', help='The pre-defined metric to use for evaluation.', click_type=click.Choice(['faithfulness']))]) – The pre-defined metric to use for evaluation.
prompt (typing_extensions.Annotated[str, None, typer.Option('--prompt', '-P', help='A custom prompt in a .txt file to use for evaluation.')]) – The prompt text.
num_eval (typing_extensions.Annotated[int, typer.Option('-n', '--num-eval', help='Number of times to run evaluation for pair')]) – Number of times to run the evaluation for each pair.
- Returns:
The LLM evaluation score.
- Return type:
Any
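And a matching sketch for the reference-based variant (variable names are hypothetical):

```python
from janus.metrics.llm_metrics import llm_evaluate_ref_option

# Judge the faithfulness of a target against a reference text.
score = llm_evaluate_ref_option(
    target=translated_code,   # hypothetical target text
    reference=original_code,  # hypothetical reference it is compared against
    metric="faithfulness",    # the only pre-defined choice for this variant
)
```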