Inference

class elasticsearch.client.InferenceClient(client)
Parameters:

client (BaseClient)

delete(*, inference_id, task_type=None, dry_run=None, error_trace=None, filter_path=None, force=None, human=None, pretty=None)

Delete an inference endpoint

https://www.elastic.co/guide/en/elasticsearch/reference/8.16/delete-inference-api.html

Parameters:
  • inference_id (str) – The inference Id

  • task_type (str | Literal['completion', 'rerank', 'sparse_embedding', 'text_embedding'] | None) – The task type

  • dry_run (bool | None) – When true, the endpoint is not deleted, and a list of ingest processors which reference this endpoint is returned

  • force (bool | None) – When true, the inference endpoint is forcefully deleted even if it is still being used by ingest processors or semantic text fields

  • error_trace (bool | None)

  • filter_path (str | Sequence[str] | None)

  • human (bool | None)

  • pretty (bool | None)

Return type:

ObjectApiResponse[Any]

get(*, task_type=None, inference_id=None, error_trace=None, filter_path=None, human=None, pretty=None)

Get an inference endpoint

https://www.elastic.co/guide/en/elasticsearch/reference/8.16/get-inference-api.html

Parameters:
  • task_type (str | Literal['completion', 'rerank', 'sparse_embedding', 'text_embedding'] | None) – The task type

  • inference_id (str | None) – The inference Id

  • error_trace (bool | None)

  • filter_path (str | Sequence[str] | None)

  • human (bool | None)

  • pretty (bool | None)

Return type:

ObjectApiResponse[Any]

inference(*, inference_id, input=None, task_type=None, error_trace=None, filter_path=None, human=None, pretty=None, query=None, task_settings=None, timeout=None, body=None)

Perform inference on the service

https://www.elastic.co/guide/en/elasticsearch/reference/8.16/post-inference-api.html

Parameters:
  • inference_id (str) – The inference Id

  • input (str | Sequence[str] | None) – Inference input. Either a string or an array of strings.

  • task_type (str | Literal['completion', 'rerank', 'sparse_embedding', 'text_embedding'] | None) – The task type

  • query (str | None) – Query input, required for rerank task. Not required for other tasks.

  • task_settings (Any | None) – Optional task settings

  • timeout (str | Literal[-1] | ~typing.Literal[0] | None) – Specifies the amount of time to wait for the inference request to complete.

  • error_trace (bool | None)

  • filter_path (str | Sequence[str] | None)

  • human (bool | None)

  • pretty (bool | None)

  • body (Dict[str, Any] | None)

Return type:

ObjectApiResponse[Any]

put(*, inference_id, inference_config=None, body=None, task_type=None, error_trace=None, filter_path=None, human=None, pretty=None)

Create an inference endpoint

https://www.elastic.co/guide/en/elasticsearch/reference/8.16/put-inference-api.html

Parameters:
  • inference_id (str) – The inference Id

  • inference_config (Mapping[str, Any] | None)

  • task_type (str | Literal['completion', 'rerank', 'sparse_embedding', 'text_embedding'] | None) – The task type

  • body (Mapping[str, Any] | None)

  • error_trace (bool | None)

  • filter_path (str | Sequence[str] | None)

  • human (bool | None)

  • pretty (bool | None)

Return type:

ObjectApiResponse[Any]