Integrations
ragas.integrations.langchain
EvaluatorChain
EvaluatorChain(metric: Metric, **kwargs: Any)
Bases: Chain, RunEvaluator
Wrapper around ragas metrics so they can be used with LangSmith.
Source code in src/ragas/integrations/langchain.py
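A minimal sketch of wrapping a ragas metric in an EvaluatorChain and invoking it directly; the faithfulness import follows standard ragas usage, but the expected input keys (question, answer, contexts here) vary between ragas versions, so treat them as assumptions.

```python
from ragas.integrations.langchain import EvaluatorChain
from ragas.metrics import faithfulness  # any ragas Metric should work here

# Wrap the metric so it behaves like an ordinary LangChain Chain.
faithfulness_chain = EvaluatorChain(metric=faithfulness)

# Invoked directly, the chain expects the columns the metric needs
# (key names assumed from typical ragas usage; adjust to your version).
result = faithfulness_chain.invoke({
    "question": "How many people live in Berlin?",
    "answer": "Roughly 3.7 million people live in Berlin.",
    "contexts": ["Berlin has a population of about 3.7 million."],
})
print(result)  # the metric score appears under the metric's name
```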
evaluate_run
Evaluate a LangSmith run.
Source code in src/ragas/integrations/langchain.py
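Because EvaluatorChain also implements RunEvaluator, LangSmith can call its evaluate_run method for every run it produces on a dataset. The sketch below assumes the legacy langchain.smith helpers (RunEvalConfig and run_on_dataset) commonly paired with this wrapper; the dataset name and the chain under test are placeholders.

```python
from langchain.smith import RunEvalConfig, run_on_dataset
from langsmith import Client

from ragas.integrations.langchain import EvaluatorChain
from ragas.metrics import faithfulness

client = Client()

# EvaluatorChain implements RunEvaluator, so LangSmith invokes its
# evaluate_run() on each run generated over the dataset.
eval_config = RunEvalConfig(
    custom_evaluators=[EvaluatorChain(metric=faithfulness)],
)

run_on_dataset(
    client=client,
    dataset_name="MyDataset",       # placeholder; must exist in LangSmith
    llm_or_chain_factory=my_chain,  # placeholder chain / factory under test
    evaluation=eval_config,
)
```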
ragas.integrations.langsmith
upload_dataset
upload_dataset(
dataset: Testset,
dataset_name: str,
dataset_desc: str = "",
) -> Dataset
Uploads a new dataset to LangSmith, converting it from a TestDataset object to a pandas DataFrame before upload. If a dataset with the specified name already exists, the function raises an error.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| dataset | TestDataset | The dataset to be uploaded. | required |
| dataset_name | str | The name for the new dataset in LangSmith. | required |
| dataset_desc | str | A description for the new dataset. The default is an empty string. | '' |
Returns:

| Type | Description |
|---|---|
| Dataset | The dataset object as stored in LangSmith after upload. |

Raises:

| Type | Description |
|---|---|
| ValueError | If a dataset with the specified name already exists in LangSmith. |
Notes
The function attempts to read a dataset by the given name to check its existence. If not found, it proceeds to upload the dataset after converting it to a pandas DataFrame. This involves specifying input and output keys for the dataset being uploaded.
Source code in src/ragas/integrations/langsmith.py
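A minimal sketch of uploading a ragas test set to LangSmith; the testset variable is assumed to be a Testset produced elsewhere (for example by ragas' test set generation step), and the dataset name and description are placeholders.

```python
from ragas.integrations.langsmith import upload_dataset

# `testset` is assumed to be a ragas Testset produced elsewhere,
# e.g. by ragas' test set generator.
langsmith_dataset = upload_dataset(
    dataset=testset,
    dataset_name="basic_rag_testset",  # must not already exist in LangSmith
    dataset_desc="Synthetic QA test set for the basic RAG pipeline",
)
print(langsmith_dataset.id)  # the LangSmith Dataset object returned after upload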
evaluate
evaluate(
dataset_name: str,
llm_or_chain_factory: Any,
experiment_name: Optional[str] = None,
metrics: Optional[list] = None,
verbose: bool = False,
) -> Dict[str, Any]
Evaluates a language model or a chain factory on a specified dataset using LangSmith, with the option to customize metrics and verbosity.
Parameters:

| Name | Type | Description | Default |
|---|---|---|---|
| dataset_name | str | The name of the dataset to use for evaluation. This dataset must exist in LangSmith. | required |
| llm_or_chain_factory | Any | The language model or chain factory to be evaluated. This parameter is flexible and can accept a variety of objects depending on the implementation. | required |
| experiment_name | Optional[str] | The name of the experiment. This can be used to categorize or identify the evaluation run within LangSmith. The default is None. | None |
| metrics | Optional[list] | A list of custom metrics (functions or evaluators) to be used for the evaluation. If None, a default set of metrics (answer relevancy, context precision, context recall, and faithfulness) is used. The default is None. | None |
| verbose | bool | If True, detailed progress and results are printed during the evaluation process. The default is False. | False |
Returns:

| Type | Description |
|---|---|
| Dict[str, Any] | A dictionary containing the results of the evaluation. |

Raises:

| Type | Description |
|---|---|
| ValueError | If the specified dataset does not exist in LangSmith. |
See Also
Client.read_dataset : Method to read an existing dataset.
Client.run_on_dataset : Method to run the evaluation on the specified dataset.
Examples:
>>> results = evaluate(
...     dataset_name="MyDataset",
...     llm_or_chain_factory=my_llm,
...     experiment_name="experiment_1_with_vanilla_rag",
...     verbose=True,
... )
>>> print(results)
{'evaluation_result': ...}
Notes
The function initializes a client to interact with LangSmith, validates the existence of the specified dataset, prepares evaluation metrics, and runs the evaluation, returning the results. Custom evaluation metrics can be specified, or a default set will be used if none are provided.
Source code in src/ragas/integrations/langsmith.py
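A hedged sketch of overriding the default metric set; the metric imports come from ragas.metrics, while the dataset name, experiment name, and my_llm are placeholders for whatever you are evaluating.

```python
from ragas.integrations.langsmith import evaluate
from ragas.metrics import answer_relevancy, faithfulness

# Evaluate a model/chain on an existing LangSmith dataset with a
# custom metric list instead of the default four metrics.
results = evaluate(
    dataset_name="MyDataset",        # must already exist in LangSmith
    llm_or_chain_factory=my_llm,     # placeholder model / chain factory
    experiment_name="experiment_2_custom_metrics",
    metrics=[faithfulness, answer_relevancy],
    verbose=True,
)
```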
ragas.integrations.llama_index
ragas.integrations.opik
OpikTracer
Bases: OpikTracer
Callback for Opik that can be used to log traces and evaluation scores to the Opik platform.
Attributes:

| Name | Type | Description |
|---|---|---|
| tags | list[string] | The tags to set on each trace. |
| metadata | dict | Additional metadata to log for each trace. |
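A minimal sketch of logging a ragas evaluation to Opik by passing the tracer as a callback; it assumes the tags and metadata attributes above can be supplied as constructor keyword arguments, and eval_dataset is a placeholder for a ragas evaluation dataset.

```python
from ragas import evaluate
from ragas.integrations.opik import OpikTracer
from ragas.metrics import faithfulness

# Tags and metadata are attached to every trace logged to Opik
# (passing them at construction time is an assumption here).
opik_tracer = OpikTracer(
    tags=["ragas", "rag-eval"],
    metadata={"pipeline": "basic-rag"},
)

# `eval_dataset` is a placeholder for a ragas evaluation dataset.
results = evaluate(
    eval_dataset,
    metrics=[faithfulness],
    callbacks=[opik_tracer],
)
```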