Machine Learning (ML)
- class elasticsearch.client.MlClient(client)
- Parameters:
client (BaseClient)
- clear_trained_model_deployment_cache(*, model_id, error_trace=None, filter_path=None, human=None, pretty=None)
Clear trained model deployment cache. Cache will be cleared on all nodes where the trained model is assigned. A trained model deployment may have an inference cache enabled. As requests are handled by each allocated node, their responses may be cached on that individual node. Calling this API clears the caches without restarting the deployment.
- close_job(*, job_id, allow_no_match=None, error_trace=None, filter_path=None, force=None, human=None, pretty=None, timeout=None, body=None)
Close anomaly detection jobs. A job can be opened and closed multiple times throughout its lifecycle. A closed job cannot receive data or perform analysis operations, but you can still explore and navigate results. When you close a job, it runs housekeeping tasks such as pruning the model history, flushing buffers, calculating final results and persisting the model snapshots. Depending upon the size of the job, it could take several minutes to close and the equivalent time to re-open. After it is closed, the job has a minimal overhead on the cluster except for maintaining its metadata. Therefore it is a best practice to close jobs that are no longer required to process data. If you close an anomaly detection job whose datafeed is running, the request first tries to stop the datafeed. This behavior is equivalent to calling the stop datafeed API with the same timeout and force parameters as the close job request. When a datafeed that has a specified end date stops, it automatically closes its associated job.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/ml-close-job.html
- Parameters:
job_id (str) – Identifier for the anomaly detection job. It can be a job identifier, a group name, or a wildcard expression. You can close multiple anomaly detection jobs in a single API request by using a group name, a comma-separated list of jobs, or a wildcard expression. You can close all jobs by using _all or by specifying * as the job identifier.
allow_no_match (bool | None) – Refer to the description for the allow_no_match query parameter.
force (bool | None) – Refer to the description for the force query parameter.
timeout (str | Literal[-1] | Literal[0] | None) – Refer to the description for the timeout query parameter.
error_trace (bool | None)
human (bool | None)
pretty (bool | None)
- Return type:
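As a sketch of how close_job's force and timeout parameters combine in practice, the helper below tries a graceful close first and falls back to a force close. The `es` argument is assumed to be an elasticsearch.Elasticsearch client instance, and the job identifier is hypothetical; this has not been run against a live cluster.

```python
def close_job_safely(es, job_id):
    """Try a graceful close with a timeout; fall back to force-closing.

    `es` is assumed to be an elasticsearch.Elasticsearch client.
    """
    try:
        # Graceful close: runs housekeeping (pruning, flushing, snapshots).
        return es.ml.close_job(job_id=job_id, timeout="5m")
    except Exception:
        # Force close skips waiting and stops the job immediately.
        return es.ml.close_job(job_id=job_id, force=True)

# Usage (requires a running cluster):
# es = Elasticsearch("http://localhost:9200")  # hypothetical endpoint
# close_job_safely(es, "my-anomaly-job")
```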
- delete_calendar(*, calendar_id, error_trace=None, filter_path=None, human=None, pretty=None)
Delete a calendar. Removes all scheduled events from a calendar, then deletes it.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/ml-delete-calendar.html
- delete_calendar_event(*, calendar_id, event_id, error_trace=None, filter_path=None, human=None, pretty=None)
Delete events from a calendar.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/ml-delete-calendar-event.html
- Parameters:
- Return type:
- delete_calendar_job(*, calendar_id, job_id, error_trace=None, filter_path=None, human=None, pretty=None)
Delete anomaly jobs from a calendar.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/ml-delete-calendar-job.html
- Parameters:
- Return type:
- delete_data_frame_analytics(*, id, error_trace=None, filter_path=None, force=None, human=None, pretty=None, timeout=None)
Delete a data frame analytics job.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/delete-dfanalytics.html
- Parameters:
id (str) – Identifier for the data frame analytics job.
force (bool | None) – If true, it deletes a job that is not stopped; this method is quicker than stopping and deleting the job.
timeout (str | Literal[-1] | Literal[0] | None) – The time to wait for the job to be deleted.
error_trace (bool | None)
human (bool | None)
pretty (bool | None)
- Return type:
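The force flag documented above lets you skip the stop-then-delete sequence. A minimal sketch, assuming a duck-typed `es` client and a hypothetical job id:

```python
def force_delete_analytics(es, job_id):
    """Delete a data frame analytics job even if it is not stopped.

    Quicker than stopping and then deleting the job; the id and
    timeout values here are illustrative.
    """
    return es.ml.delete_data_frame_analytics(id=job_id, force=True, timeout="1m")

# Usage (requires a running cluster):
# force_delete_analytics(es, "my-dfa-job")
```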
- delete_datafeed(*, datafeed_id, error_trace=None, filter_path=None, force=None, human=None, pretty=None)
Delete a datafeed.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/ml-delete-datafeed.html
- Parameters:
datafeed_id (str) – A numerical character string that uniquely identifies the datafeed. This identifier can contain lowercase alphanumeric characters (a-z and 0-9), hyphens, and underscores. It must start and end with alphanumeric characters.
force (bool | None) – Use to forcefully delete a started datafeed; this method is quicker than stopping and deleting the datafeed.
error_trace (bool | None)
human (bool | None)
pretty (bool | None)
- Return type:
- delete_expired_data(*, job_id=None, error_trace=None, filter_path=None, human=None, pretty=None, requests_per_second=None, timeout=None, body=None)
Delete expired ML data. Deletes all job results, model snapshots and forecast data that have exceeded their retention days period. Machine learning state documents that are not associated with any job are also deleted. You can limit the request to a single or set of anomaly detection jobs by using a job identifier, a group name, a comma-separated list of jobs, or a wildcard expression. You can delete expired data for all anomaly detection jobs by using _all, by specifying * as the <job_id>, or by omitting the <job_id>.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/ml-delete-expired-data.html
- Parameters:
job_id (str | None) – Identifier for an anomaly detection job. It can be a job identifier, a group name, or a wildcard expression.
requests_per_second (float | None) – The desired requests per second for the deletion processes. The default behavior is no throttling.
timeout (str | Literal[-1] | Literal[0] | None) – How long the underlying delete processes can run until they are canceled.
error_trace (bool | None)
human (bool | None)
pretty (bool | None)
- Return type:
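A throttled cleanup across all jobs, combining the job_id wildcard behavior with requests_per_second, might look like this sketch. The rate and timeout values are illustrative assumptions, and `es` is assumed to be an Elasticsearch client instance:

```python
def prune_expired(es):
    """Delete expired results, snapshots, and forecast data for all
    anomaly detection jobs, throttled so the cleanup does not
    saturate the cluster (rate and timeout are illustrative).
    """
    return es.ml.delete_expired_data(
        job_id="_all",            # all jobs; omitting job_id would also work
        requests_per_second=500.0,  # default is no throttling
        timeout="8h",             # cancel the delete processes after this
    )
```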
- delete_filter(*, filter_id, error_trace=None, filter_path=None, human=None, pretty=None)
Delete a filter. If an anomaly detection job references the filter, you cannot delete the filter. You must update or delete the job before you can delete the filter.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/ml-delete-filter.html
- delete_forecast(*, job_id, forecast_id=None, allow_no_forecasts=None, error_trace=None, filter_path=None, human=None, pretty=None, timeout=None)
Delete forecasts from a job. By default, forecasts are retained for 14 days. You can specify a different retention period with the expires_in parameter in the forecast jobs API. The delete forecast API enables you to delete one or more forecasts before they expire.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/ml-delete-forecast.html
- Parameters:
job_id (str) – Identifier for the anomaly detection job.
forecast_id (str | None) – A comma-separated list of forecast identifiers. If you do not specify this optional parameter or if you specify _all or * the API deletes all forecasts from the job.
allow_no_forecasts (bool | None) – Specifies whether an error occurs when there are no forecasts. In particular, if this parameter is set to false and there are no forecasts associated with the job, attempts to delete all forecasts return an error.
timeout (str | Literal[-1] | Literal[0] | None) – Specifies the period of time to wait for the completion of the delete operation. When this period of time elapses, the API fails and returns an error.
error_trace (bool | None)
human (bool | None)
pretty (bool | None)
- Return type:
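Since forecast_id accepts a comma-separated list, deleting a specific set of forecasts can be sketched as follows (the `es` client and forecast identifiers are assumptions):

```python
def delete_forecasts(es, job_id, forecast_ids):
    """Delete specific forecasts by ID, tolerating the case where
    none of them exist (allow_no_forecasts=True suppresses the error).
    """
    return es.ml.delete_forecast(
        job_id=job_id,
        forecast_id=",".join(forecast_ids),  # comma-separated list of IDs
        allow_no_forecasts=True,
    )

# Passing "_all" (or omitting forecast_id) would delete every forecast.
```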
- delete_job(*, job_id, delete_user_annotations=None, error_trace=None, filter_path=None, force=None, human=None, pretty=None, wait_for_completion=None)
Delete an anomaly detection job. All job configuration, model state and results are deleted. It is not currently possible to delete multiple jobs using wildcards or a comma separated list. If you delete a job that has a datafeed, the request first tries to delete the datafeed. This behavior is equivalent to calling the delete datafeed API with the same timeout and force parameters as the delete job request.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/ml-delete-job.html
- Parameters:
job_id (str) – Identifier for the anomaly detection job.
delete_user_annotations (bool | None) – Specifies whether annotations that have been added by the user should be deleted along with any auto-generated annotations when the job is reset.
force (bool | None) – Use to forcefully delete an opened job; this method is quicker than closing and deleting the job.
wait_for_completion (bool | None) – Specifies whether the request should return immediately or wait until the job deletion completes.
error_trace (bool | None)
human (bool | None)
pretty (bool | None)
- Return type:
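With wait_for_completion=False the deletion proceeds in the background; the response is expected to carry a task identifier you can poll rather than the usual acknowledgement. A sketch, assuming a duck-typed `es` client:

```python
def delete_job_async(es, job_id):
    """Start an asynchronous anomaly detection job deletion.

    Returns immediately instead of blocking until deletion completes;
    the response should contain a task reference to poll (assumption
    based on the wait_for_completion parameter description).
    """
    return es.ml.delete_job(job_id=job_id, wait_for_completion=False)
```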
- delete_model_snapshot(*, job_id, snapshot_id, error_trace=None, filter_path=None, human=None, pretty=None)
Delete a model snapshot. You cannot delete the active model snapshot. To delete that snapshot, first revert to a different one. To identify the active model snapshot, refer to the model_snapshot_id in the results from the get jobs API.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/ml-delete-snapshot.html
- delete_trained_model(*, model_id, error_trace=None, filter_path=None, force=None, human=None, pretty=None)
Delete an unreferenced trained model. The request deletes a trained inference model that is not referenced by an ingest pipeline.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/delete-trained-models.html
- Parameters:
- Return type:
- delete_trained_model_alias(*, model_id, model_alias, error_trace=None, filter_path=None, human=None, pretty=None)
Delete a trained model alias. This API deletes an existing model alias that refers to a trained model. If the model alias is missing or refers to a model other than the one identified by the model_id, this API returns an error.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/delete-trained-models-aliases.html
- estimate_model_memory(*, analysis_config=None, error_trace=None, filter_path=None, human=None, max_bucket_cardinality=None, overall_cardinality=None, pretty=None, body=None)
Estimate job model memory usage. Makes an estimation of the memory usage for an anomaly detection job model. It is based on analysis configuration details for the job and cardinality estimates for the fields it references.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/ml-apis.html
- Parameters:
analysis_config (Mapping[str, Any] | None) – The analysis configuration for the job. For a list of the properties that you can specify in the analysis_config component of the body of this API, refer to the anomaly detection job documentation.
max_bucket_cardinality (Mapping[str, int] | None) – Estimates of the highest cardinality in a single bucket that is observed for influencer fields over the time period that the job analyzes data. To produce a good answer, values must be provided for all influencer fields. Providing values for fields that are not listed as influencers has no effect on the estimation.
overall_cardinality (Mapping[str, int] | None) – Estimates of the cardinality that is observed for fields over the whole time period that the job analyzes data. To produce a good answer, values must be provided for fields referenced in the by_field_name, over_field_name and partition_field_name of any detectors. Providing values for other fields has no effect on the estimation. It can be omitted from the request if no detectors have a by_field_name, over_field_name or partition_field_name.
error_trace (bool | None)
human (bool | None)
pretty (bool | None)
- Return type:
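Putting analysis_config and overall_cardinality together, a memory-estimate request might be sketched like this. The detector, field names, and cardinality values are illustrative only, and `es` is assumed to be an Elasticsearch client:

```python
def estimate_memory(es, analysis_config, overall_cardinality):
    """Request a model memory estimate for a prospective job."""
    return es.ml.estimate_model_memory(
        analysis_config=analysis_config,
        overall_cardinality=overall_cardinality,
    )

# Hypothetical request body: one detector partitioned on `status`,
# whose overall cardinality we estimate at 10 distinct values.
example_config = {
    "bucket_span": "15m",
    "detectors": [
        {"function": "sum", "field_name": "bytes", "partition_field_name": "status"}
    ],
}
example_cardinality = {"status": 10}
```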
- evaluate_data_frame(*, evaluation=None, index=None, error_trace=None, filter_path=None, human=None, pretty=None, query=None, body=None)
Evaluate data frame analytics. The API packages together commonly used evaluation metrics for various types of machine learning features. This has been designed for use on indexes created by data frame analytics. Evaluation requires both a ground truth field and an analytics result field to be present.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/evaluate-dfanalytics.html
- Parameters:
evaluation (Mapping[str, Any] | None) – Defines the type of evaluation you want to perform.
index (str | None) – Defines the index in which the evaluation will be performed.
query (Mapping[str, Any] | None) – A query clause that retrieves a subset of data from the source index.
error_trace (bool | None)
human (bool | None)
pretty (bool | None)
- Return type:
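The evaluation body names the evaluation type plus the ground truth and result fields. A minimal outlier-detection sketch; the index name and field names are assumptions, not part of the reference above:

```python
def evaluate_outliers(es, index):
    """Evaluate outlier detection results stored in a destination index.

    `is_outlier` (ground truth) and `ml.outlier_score` (analytics
    result) are hypothetical field names.
    """
    evaluation = {
        "outlier_detection": {
            "actual_field": "is_outlier",
            "predicted_probability_field": "ml.outlier_score",
        }
    }
    return es.ml.evaluate_data_frame(evaluation=evaluation, index=index)
```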
- explain_data_frame_analytics(*, id=None, allow_lazy_start=None, analysis=None, analyzed_fields=None, description=None, dest=None, error_trace=None, filter_path=None, human=None, max_num_threads=None, model_memory_limit=None, pretty=None, source=None, body=None)
Explain data frame analytics config. This API provides explanations for a data frame analytics config that either exists already or one that has not been created yet. The following explanations are provided: which fields are included or not in the analysis and why, and how much memory is estimated to be required. The estimate can be used when deciding the appropriate value for the model_memory_limit setting later on. If you have object fields or fields that are excluded via source filtering, they are not included in the explanation.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/explain-dfanalytics.html
- Parameters:
id (str | None) – Identifier for the data frame analytics job. This identifier can contain lowercase alphanumeric characters (a-z and 0-9), hyphens, and underscores. It must start and end with alphanumeric characters.
allow_lazy_start (bool | None) – Specifies whether this job can start when there is insufficient machine learning node capacity for it to be immediately assigned to a node.
analysis (Mapping[str, Any] | None) – The analysis configuration, which contains the information necessary to perform one of the following types of analysis: classification, outlier detection, or regression.
analyzed_fields (Mapping[str, Any] | None) – Specify includes and/or excludes patterns to select which fields will be included in the analysis. The patterns specified in excludes are applied last, therefore excludes takes precedence. In other words, if the same field is specified in both includes and excludes, then the field will not be included in the analysis.
description (str | None) – A description of the job.
dest (Mapping[str, Any] | None) – The destination configuration, consisting of index and optionally results_field (ml by default).
max_num_threads (int | None) – The maximum number of threads to be used by the analysis. Using more threads may decrease the time necessary to complete the analysis at the cost of using more CPU. Note that the process may use additional threads for operational functionality other than the analysis itself.
model_memory_limit (str | None) – The approximate maximum amount of memory resources that are permitted for analytical processing. If your elasticsearch.yml file contains an xpack.ml.max_model_memory_limit setting, an error occurs when you try to create data frame analytics jobs that have model_memory_limit values greater than that setting.
source (Mapping[str, Any] | None) – The configuration of how to source the analysis data. It requires an index. Optionally, query and _source may be specified.
error_trace (bool | None)
human (bool | None)
pretty (bool | None)
- Return type:
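Because this API also accepts a config that has not been created yet, it works as a dry run before put. A sketch for a not-yet-created outlier detection config, assuming a duck-typed `es` client and a hypothetical source index:

```python
def explain_config(es, source_index):
    """Dry-run an outlier detection config to learn which fields
    would be analyzed and the estimated memory requirement, without
    creating the job.
    """
    return es.ml.explain_data_frame_analytics(
        source={"index": source_index},      # required: where the data lives
        analysis={"outlier_detection": {}},  # analysis type with defaults
    )
```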
- flush_job(*, job_id, advance_time=None, calc_interim=None, end=None, error_trace=None, filter_path=None, human=None, pretty=None, skip_time=None, start=None, body=None)
Force buffered data to be processed. The flush jobs API is only applicable when sending data for analysis using the post data API. Depending on the content of the buffer, it might additionally calculate new results. Both flush and close operations are similar, however the flush is more efficient if you are expecting to send more data for analysis. When flushing, the job remains open and is available to continue analyzing data. A close operation additionally prunes and persists the model state to disk and the job must be opened again before analyzing further data.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/ml-flush-job.html
- Parameters:
job_id (str) – Identifier for the anomaly detection job.
advance_time (str | Any | None) – Refer to the description for the advance_time query parameter.
calc_interim (bool | None) – Refer to the description for the calc_interim query parameter.
end (str | Any | None) – Refer to the description for the end query parameter.
skip_time (str | Any | None) – Refer to the description for the skip_time query parameter.
start (str | Any | None) – Refer to the description for the start query parameter.
error_trace (bool | None)
human (bool | None)
pretty (bool | None)
- Return type:
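A common flush pattern is requesting interim results for the latest, possibly incomplete bucket. A minimal sketch, assuming an Elasticsearch client instance and a hypothetical job name:

```python
def flush_with_interim(es, job_id):
    """Flush buffered data and calculate interim results, keeping the
    job open for further post-data calls.
    """
    return es.ml.flush_job(job_id=job_id, calc_interim=True)
```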
- forecast(*, job_id, duration=None, error_trace=None, expires_in=None, filter_path=None, human=None, max_model_memory=None, pretty=None, body=None)
Predict future behavior of a time series. Forecasts predict future behavior based on historical data. Forecasts are not supported for jobs that perform population analysis; an error occurs if you try to create a forecast for a job that has an over_field_name in its configuration.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/ml-forecast.html
- Parameters:
job_id (str) – Identifier for the anomaly detection job. The job must be open when you create a forecast; otherwise, an error occurs.
duration (str | Literal[-1] | Literal[0] | None) – Refer to the description for the duration query parameter.
expires_in (str | Literal[-1] | Literal[0] | None) – Refer to the description for the expires_in query parameter.
max_model_memory (str | None) – Refer to the description for the max_model_memory query parameter.
error_trace (bool | None)
human (bool | None)
pretty (bool | None)
- Return type:
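A forecast request with an explicit horizon and retention might look like the sketch below. The durations are illustrative, `es` is assumed to be an Elasticsearch client, and the job must already be open:

```python
def start_forecast(es, job_id):
    """Request a 3-day forecast retained for one week (values are
    illustrative; the default retention is 14 days).
    """
    return es.ml.forecast(job_id=job_id, duration="3d", expires_in="7d")
```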
- get_buckets(*, job_id, timestamp=None, anomaly_score=None, desc=None, end=None, error_trace=None, exclude_interim=None, expand=None, filter_path=None, from_=None, human=None, page=None, pretty=None, size=None, sort=None, start=None, body=None)
Get anomaly detection job results for buckets. The API presents a chronological view of the records, grouped by bucket.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/ml-get-bucket.html
- Parameters:
job_id (str) – Identifier for the anomaly detection job.
timestamp (str | Any | None) – The timestamp of a single bucket result. If you do not specify this parameter, the API returns information about all buckets.
anomaly_score (float | None) – Refer to the description for the anomaly_score query parameter.
desc (bool | None) – Refer to the description for the desc query parameter.
end (str | Any | None) – Refer to the description for the end query parameter.
exclude_interim (bool | None) – Refer to the description for the exclude_interim query parameter.
expand (bool | None) – Refer to the description for the expand query parameter.
from – Skips the specified number of buckets.
size (int | None) – Specifies the maximum number of buckets to obtain.
sort (str | None) – Refer to the description for the sort query parameter.
start (str | Any | None) – Refer to the description for the start query parameter.
error_trace (bool | None)
from_ (int | None)
human (bool | None)
pretty (bool | None)
- Return type:
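The from_ and size parameters above support paging through bucket results. A paging sketch, assuming the response exposes a top-level buckets list (as the standard get buckets response does) and a duck-typed `es` client:

```python
def iter_buckets(es, job_id, page_size=100):
    """Yield all final bucket results for a job, one page at a time."""
    offset = 0
    while True:
        resp = es.ml.get_buckets(
            job_id=job_id,
            from_=offset,            # skip buckets already seen
            size=page_size,          # maximum buckets per page
            exclude_interim=True,    # final results only
        )
        buckets = resp["buckets"]
        if not buckets:
            break
        yield from buckets
        offset += page_size
```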
- get_calendar_events(*, calendar_id, end=None, error_trace=None, filter_path=None, from_=None, human=None, job_id=None, pretty=None, size=None, start=None)
Get info about events in calendars.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/ml-get-calendar-event.html
- Parameters:
calendar_id (str) – A string that uniquely identifies a calendar. You can get information for multiple calendars by using a comma-separated list of ids or a wildcard expression. You can get information for all calendars by using _all or * or by omitting the calendar identifier.
end (str | Any | None) – Returns events with timestamps earlier than this time.
from – Skips the specified number of events.
job_id (str | None) – Returns events for a specific anomaly detection job identifier or job group. It must be used with a calendar identifier of _all or *.
size (int | None) – Specifies the maximum number of events to obtain.
start (str | Any | None) – Returns events with timestamps after this time.
error_trace (bool | None)
from_ (int | None)
human (bool | None)
pretty (bool | None)
- Return type:
- get_calendars(*, calendar_id=None, error_trace=None, filter_path=None, from_=None, human=None, page=None, pretty=None, size=None, body=None)
Get calendar configuration info.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/ml-get-calendar.html
- Parameters:
calendar_id (str | None) – A string that uniquely identifies a calendar. You can get information for multiple calendars by using a comma-separated list of ids or a wildcard expression. You can get information for all calendars by using _all or * or by omitting the calendar identifier.
from – Skips the specified number of calendars. This parameter is supported only when you omit the calendar identifier.
page (Mapping[str, Any] | None) – This object is supported only when you omit the calendar identifier.
size (int | None) – Specifies the maximum number of calendars to obtain. This parameter is supported only when you omit the calendar identifier.
error_trace (bool | None)
from_ (int | None)
human (bool | None)
pretty (bool | None)
- Return type:
- get_categories(*, job_id, category_id=None, error_trace=None, filter_path=None, from_=None, human=None, page=None, partition_field_value=None, pretty=None, size=None, body=None)
Get anomaly detection job results for categories.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/ml-get-category.html
- Parameters:
job_id (str) – Identifier for the anomaly detection job.
category_id (str | None) – Identifier for the category, which is unique in the job. If you specify neither the category ID nor the partition_field_value, the API returns information about all categories. If you specify only the partition_field_value, it returns information about all categories for the specified partition.
from – Skips the specified number of categories.
page (Mapping[str, Any] | None) – Configures pagination. This parameter has the from and size properties.
partition_field_value (str | None) – Only return categories for the specified partition.
size (int | None) – Specifies the maximum number of categories to obtain.
error_trace (bool | None)
from_ (int | None)
human (bool | None)
pretty (bool | None)
- Return type:
- get_data_frame_analytics(*, id=None, allow_no_match=None, error_trace=None, exclude_generated=None, filter_path=None, from_=None, human=None, pretty=None, size=None)
Get data frame analytics job configuration info. You can get information for multiple data frame analytics jobs in a single API request by using a comma-separated list of data frame analytics jobs or a wildcard expression.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/get-dfanalytics.html
- Parameters:
id (str | None) – Identifier for the data frame analytics job. If you do not specify this option, the API returns information for the first hundred data frame analytics jobs.
allow_no_match (bool | None) – Specifies what to do when the request: 1. Contains wildcard expressions and there are no data frame analytics jobs that match. 2. Contains the _all string or no identifiers and there are no matches. 3. Contains wildcard expressions and there are only partial matches. The default value returns an empty data_frame_analytics array when there are no matches and the subset of results when there are partial matches. If this parameter is false, the request returns a 404 status code when there are no matches or only partial matches.
exclude_generated (bool | None) – Indicates if certain fields should be removed from the configuration on retrieval. This allows the configuration to be in an acceptable format to be retrieved and then added to another cluster.
from – Skips the specified number of data frame analytics jobs.
size (int | None) – Specifies the maximum number of data frame analytics jobs to obtain.
error_trace (bool | None)
from_ (int | None)
human (bool | None)
pretty (bool | None)
- Return type:
- get_data_frame_analytics_stats(*, id=None, allow_no_match=None, error_trace=None, filter_path=None, from_=None, human=None, pretty=None, size=None, verbose=None)
Get data frame analytics jobs usage info.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/get-dfanalytics-stats.html
- Parameters:
id (str | None) – Identifier for the data frame analytics job. If you do not specify this option, the API returns information for the first hundred data frame analytics jobs.
allow_no_match (bool | None) – Specifies what to do when the request: 1. Contains wildcard expressions and there are no data frame analytics jobs that match. 2. Contains the _all string or no identifiers and there are no matches. 3. Contains wildcard expressions and there are only partial matches. The default value returns an empty data_frame_analytics array when there are no matches and the subset of results when there are partial matches. If this parameter is false, the request returns a 404 status code when there are no matches or only partial matches.
from – Skips the specified number of data frame analytics jobs.
size (int | None) – Specifies the maximum number of data frame analytics jobs to obtain.
verbose (bool | None) – Defines whether the stats response should be verbose.
error_trace (bool | None)
from_ (int | None)
human (bool | None)
pretty (bool | None)
- Return type:
- get_datafeed_stats(*, datafeed_id=None, allow_no_match=None, error_trace=None, filter_path=None, human=None, pretty=None)
Get datafeeds usage info. You can get statistics for multiple datafeeds in a single API request by using a comma-separated list of datafeeds or a wildcard expression. You can get statistics for all datafeeds by using _all, by specifying * as the <feed_id>, or by omitting the <feed_id>. If the datafeed is stopped, the only information you receive is the datafeed_id and the state. This API returns a maximum of 10,000 datafeeds.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/ml-get-datafeed-stats.html
- Parameters:
datafeed_id (str | Sequence[str] | None) – Identifier for the datafeed. It can be a datafeed identifier or a wildcard expression. If you do not specify one of these options, the API returns information about all datafeeds.
allow_no_match (bool | None) – Specifies what to do when the request: 1. Contains wildcard expressions and there are no datafeeds that match. 2. Contains the _all string or no identifiers and there are no matches. 3. Contains wildcard expressions and there are only partial matches. The default value is true, which returns an empty datafeeds array when there are no matches and the subset of results when there are partial matches. If this parameter is false, the request returns a 404 status code when there are no matches or only partial matches.
error_trace (bool | None)
human (bool | None)
pretty (bool | None)
- Return type:
- get_datafeeds(*, datafeed_id=None, allow_no_match=None, error_trace=None, exclude_generated=None, filter_path=None, human=None, pretty=None)
Get datafeeds configuration info. You can get information for multiple datafeeds in a single API request by using a comma-separated list of datafeeds or a wildcard expression. You can get information for all datafeeds by using _all, by specifying * as the <feed_id>, or by omitting the <feed_id>. This API returns a maximum of 10,000 datafeeds.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/ml-get-datafeed.html
- Parameters:
datafeed_id (str | Sequence[str] | None) – Identifier for the datafeed. It can be a datafeed identifier or a wildcard expression. If you do not specify one of these options, the API returns information about all datafeeds.
allow_no_match (bool | None) – Specifies what to do when the request: 1. Contains wildcard expressions and there are no datafeeds that match. 2. Contains the _all string or no identifiers and there are no matches. 3. Contains wildcard expressions and there are only partial matches. The default value is true, which returns an empty datafeeds array when there are no matches and the subset of results when there are partial matches. If this parameter is false, the request returns a 404 status code when there are no matches or only partial matches.
exclude_generated (bool | None) – Indicates if certain fields should be removed from the configuration on retrieval. This allows the configuration to be in an acceptable format to be retrieved and then added to another cluster.
error_trace (bool | None)
human (bool | None)
pretty (bool | None)
- Return type:
- get_filters(*, filter_id=None, error_trace=None, filter_path=None, from_=None, human=None, pretty=None, size=None)
Get filters. You can get a single filter or all filters.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/ml-get-filter.html
- Parameters:
- Return type:
- get_influencers(*, job_id, desc=None, end=None, error_trace=None, exclude_interim=None, filter_path=None, from_=None, human=None, influencer_score=None, page=None, pretty=None, size=None, sort=None, start=None, body=None)
Get anomaly detection job results for influencers. Influencers are the entities that have contributed to, or are to blame for, the anomalies. Influencer results are available only if an influencer_field_name is specified in the job configuration.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/ml-get-influencer.html
- Parameters:
job_id (str) – Identifier for the anomaly detection job.
desc (bool | None) – If true, the results are sorted in descending order.
end (str | Any | None) – Returns influencers with timestamps earlier than this time. The default value means it is unset and results are not limited to specific timestamps.
exclude_interim (bool | None) – If true, the output excludes interim results. By default, interim results are included.
from – Skips the specified number of influencers.
influencer_score (float | None) – Returns influencers with anomaly scores greater than or equal to this value.
page (Mapping[str, Any] | None) – Configures pagination. This parameter has the from and size properties.
size (int | None) – Specifies the maximum number of influencers to obtain.
sort (str | None) – Specifies the sort field for the requested influencers. By default, the influencers are sorted by the influencer_score value.
start (str | Any | None) – Returns influencers with timestamps after this time. The default value means it is unset and results are not limited to specific timestamps.
error_trace (bool | None)
from_ (int | None)
human (bool | None)
pretty (bool | None)
- Return type:
- get_job_stats(*, job_id=None, allow_no_match=None, error_trace=None, filter_path=None, human=None, pretty=None)
Get anomaly detection jobs usage info.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/ml-get-job-stats.html
- Parameters:
job_id (str | None) – Identifier for the anomaly detection job. It can be a job identifier, a group name, a comma-separated list of jobs, or a wildcard expression. If you do not specify one of these options, the API returns information for all anomaly detection jobs.
allow_no_match (bool | None) – Specifies what to do when the request: 1. Contains wildcard expressions and there are no jobs that match. 2. Contains the _all string or no identifiers and there are no matches. 3. Contains wildcard expressions and there are only partial matches. If true, the API returns an empty jobs array when there are no matches and the subset of results when there are partial matches. If false, the API returns a 404 status code when there are no matches or only partial matches.
error_trace (bool | None)
human (bool | None)
pretty (bool | None)
- Return type:
- get_jobs(*, job_id=None, allow_no_match=None, error_trace=None, exclude_generated=None, filter_path=None, human=None, pretty=None)
Get anomaly detection jobs configuration info. You can get information for multiple anomaly detection jobs in a single API request by using a group name, a comma-separated list of jobs, or a wildcard expression. You can get information for all anomaly detection jobs by using _all, by specifying * as the <job_id>, or by omitting the <job_id>.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/ml-get-job.html
- Parameters:
job_id (str | Sequence[str] | None) – Identifier for the anomaly detection job. It can be a job identifier, a group name, or a wildcard expression. If you do not specify one of these options, the API returns information for all anomaly detection jobs.
allow_no_match (bool | None) – Specifies what to do when the request: 1. Contains wildcard expressions and there are no jobs that match. 2. Contains the _all string or no identifiers and there are no matches. 3. Contains wildcard expressions and there are only partial matches. The default value is true, which returns an empty jobs array when there are no matches and the subset of results when there are partial matches. If this parameter is false, the request returns a 404 status code when there are no matches or only partial matches.
exclude_generated (bool | None) – Indicates if certain fields should be removed from the configuration on retrieval. This allows the configuration to be in an acceptable format to be retrieved and then added to another cluster.
error_trace (bool | None)
human (bool | None)
pretty (bool | None)
- Return type:
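The allow_no_match behavior described above can be sketched in plain Python. This is a hedged illustration of the documented semantics, not client code; resolve_jobs is a hypothetical helper name.

```python
def resolve_jobs(matched, allow_no_match=True):
    """Mimic the documented allow_no_match semantics (pure sketch):
    when a wildcard or _all expression matches nothing,
    allow_no_match=True yields an empty jobs array, while False
    corresponds to the API returning a 404 status code."""
    if not matched and not allow_no_match:
        raise LookupError("404: no anomaly detection jobs matched")
    return matched

# A wildcard with no matches: the tolerant call returns an empty list.
resolve_jobs([], allow_no_match=True)  # -> []
```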
- get_memory_stats(*, node_id=None, error_trace=None, filter_path=None, human=None, master_timeout=None, pretty=None, timeout=None)
Get machine learning memory usage info. Get information about how machine learning jobs and trained models are using memory, on each node, both within the JVM heap, and natively, outside of the JVM.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/get-ml-memory.html
- Parameters:
node_id (str | None) – The names of particular nodes in the cluster to target. For example, nodeId1,nodeId2 or ml:true
master_timeout (str | Literal[-1] | ~typing.Literal[0] | None) – Period to wait for a connection to the master node. If no response is received before the timeout expires, the request fails and returns an error.
timeout (str | Literal[-1] | ~typing.Literal[0] | None) – Period to wait for a response. If no response is received before the timeout expires, the request fails and returns an error.
error_trace (bool | None)
human (bool | None)
pretty (bool | None)
- Return type:
- get_model_snapshot_upgrade_stats(*, job_id, snapshot_id, allow_no_match=None, error_trace=None, filter_path=None, human=None, pretty=None)
Get anomaly detection job model snapshot upgrade usage info.
- Parameters:
job_id (str) – Identifier for the anomaly detection job.
snapshot_id (str) – A numerical character string that uniquely identifies the model snapshot. You can get information for multiple snapshots by using a comma-separated list or a wildcard expression. You can get all snapshots by using _all, by specifying * as the snapshot ID, or by omitting the snapshot ID.
allow_no_match (bool | None) – Specifies what to do when the request: - Contains wildcard expressions and there are no jobs that match. - Contains the _all string or no identifiers and there are no matches. - Contains wildcard expressions and there are only partial matches. The default value is true, which returns an empty jobs array when there are no matches and the subset of results when there are partial matches. If this parameter is false, the request returns a 404 status code when there are no matches or only partial matches.
error_trace (bool | None)
human (bool | None)
pretty (bool | None)
- Return type:
- get_model_snapshots(*, job_id, snapshot_id=None, desc=None, end=None, error_trace=None, filter_path=None, from_=None, human=None, page=None, pretty=None, size=None, sort=None, start=None, body=None)
Get model snapshots info.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/ml-get-snapshot.html
- Parameters:
job_id (str) – Identifier for the anomaly detection job.
snapshot_id (str | None) – A numerical character string that uniquely identifies the model snapshot. You can get information for multiple snapshots by using a comma-separated list or a wildcard expression. You can get all snapshots by using _all, by specifying * as the snapshot ID, or by omitting the snapshot ID.
desc (bool | None) – Refer to the description for the desc query parameter.
end (str | Any | None) – Refer to the description for the end query parameter.
from – Skips the specified number of snapshots.
size (int | None) – Specifies the maximum number of snapshots to obtain.
sort (str | None) – Refer to the description for the sort query parameter.
start (str | Any | None) – Refer to the description for the start query parameter.
error_trace (bool | None)
from_ (int | None)
human (bool | None)
pretty (bool | None)
- Return type:
- get_overall_buckets(*, job_id, allow_no_match=None, bucket_span=None, end=None, error_trace=None, exclude_interim=None, filter_path=None, human=None, overall_score=None, pretty=None, start=None, top_n=None, body=None)
Get overall bucket results. Retrieves overall bucket results that summarize the bucket results of multiple anomaly detection jobs. The overall_score is calculated by combining the scores of all the buckets within the overall bucket span. First, the maximum anomaly_score per anomaly detection job in the overall bucket is calculated. Then the top_n of those scores are averaged to result in the overall_score. This means that you can fine-tune the overall_score so that it is more or less sensitive to the number of jobs that detect an anomaly at the same time. For example, if you set top_n to 1, the overall_score is the maximum bucket score in the overall bucket. Alternatively, if you set top_n to the number of jobs, the overall_score is high only when all jobs detect anomalies in that overall bucket. If you set the bucket_span parameter (to a value greater than its default), the overall_score is the maximum overall_score of the overall buckets that have a span equal to the jobs’ largest bucket span.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/ml-get-overall-buckets.html
- Parameters:
job_id (str) – Identifier for the anomaly detection job. It can be a job identifier, a group name, a comma-separated list of jobs or groups, or a wildcard expression. You can summarize the bucket results for all anomaly detection jobs by using _all or by specifying * as the <job_id>.
allow_no_match (bool | None) – Refer to the description for the allow_no_match query parameter.
bucket_span (str | Literal[-1] | ~typing.Literal[0] | None) – Refer to the description for the bucket_span query parameter.
end (str | Any | None) – Refer to the description for the end query parameter.
exclude_interim (bool | None) – Refer to the description for the exclude_interim query parameter.
overall_score (float | str | None) – Refer to the description for the overall_score query parameter.
start (str | Any | None) – Refer to the description for the start query parameter.
top_n (int | None) – Refer to the description for the top_n query parameter.
error_trace (bool | None)
human (bool | None)
pretty (bool | None)
- Return type:
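The overall_score calculation described above (per-job maxima, then the average of the top_n) can be sketched as pure Python. The function name and input shape are assumptions for illustration only.

```python
def overall_score(job_max_scores, top_n=1):
    """Sketch of the documented overall_score calculation: given the
    maximum anomaly_score per job within an overall bucket, average
    the top_n of those maxima."""
    top = sorted(job_max_scores, reverse=True)[:top_n]
    return sum(top) / len(top)

# Two jobs with per-job maxima 80 and 40 in the same overall bucket:
overall_score([80, 40], top_n=1)  # -> 80.0 (most sensitive: any one job)
overall_score([80, 40], top_n=2)  # -> 60.0 (high only when jobs agree)
```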
- get_records(*, job_id, desc=None, end=None, error_trace=None, exclude_interim=None, filter_path=None, from_=None, human=None, page=None, pretty=None, record_score=None, size=None, sort=None, start=None, body=None)
Get anomaly records for an anomaly detection job. Records contain the detailed analytical results. They describe the anomalous activity that has been identified in the input data based on the detector configuration. There can be many anomaly records depending on the characteristics and size of the input data. In practice, there are often too many to be able to manually process them. The machine learning features therefore perform a sophisticated aggregation of the anomaly records into buckets. The number of record results depends on the number of anomalies found in each bucket, which relates to the number of time series being modeled and the number of detectors.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/ml-get-record.html
- Parameters:
job_id (str) – Identifier for the anomaly detection job.
desc (bool | None) – Refer to the description for the desc query parameter.
end (str | Any | None) – Refer to the description for the end query parameter.
exclude_interim (bool | None) – Refer to the description for the exclude_interim query parameter.
from – Skips the specified number of records.
record_score (float | None) – Refer to the description for the record_score query parameter.
size (int | None) – Specifies the maximum number of records to obtain.
sort (str | None) – Refer to the description for the sort query parameter.
start (str | Any | None) – Refer to the description for the start query parameter.
error_trace (bool | None)
from_ (int | None)
human (bool | None)
pretty (bool | None)
- Return type:
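Because record results are often too numerous to process in one response, callers typically page through them with from_ and size. A minimal paging sketch, where fetch stands in for an adapter around a call such as client.get_records(job_id=..., from_=offset, size=size) (assumed names):

```python
def page_records(fetch, size=100):
    """Iterate over all anomaly records by repeatedly advancing the
    offset. `fetch(offset, size)` must return the list of records for
    that slice; an empty batch ends the iteration."""
    offset = 0
    while True:
        batch = fetch(offset, size)
        if not batch:
            return
        yield from batch
        offset += size
```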
- get_trained_models(*, model_id=None, allow_no_match=None, decompress_definition=None, error_trace=None, exclude_generated=None, filter_path=None, from_=None, human=None, include=None, pretty=None, size=None, tags=None)
Get trained model configuration info.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/get-trained-models.html
- Parameters:
model_id (str | Sequence[str] | None) – The unique identifier of the trained model or a model alias. You can get information for multiple trained models in a single API request by using a comma-separated list of model IDs or a wildcard expression.
allow_no_match (bool | None) – Specifies what to do when the request: - Contains wildcard expressions and there are no models that match. - Contains the _all string or no identifiers and there are no matches. - Contains wildcard expressions and there are only partial matches. If true, it returns an empty array when there are no matches and the subset of results when there are partial matches.
decompress_definition (bool | None) – Specifies whether the included model definition should be returned as a JSON map (true) or in a custom compressed format (false).
exclude_generated (bool | None) – Indicates if certain fields should be removed from the configuration on retrieval. This allows the configuration to be in an acceptable format to be retrieved and then added to another cluster.
from – Skips the specified number of models.
include (str | Literal['definition', 'definition_status', 'feature_importance_baseline', 'hyperparameters', 'total_feature_importance'] | None) – A comma delimited string of optional fields to include in the response body.
size (int | None) – Specifies the maximum number of models to obtain.
tags (str | Sequence[str] | None) – A comma delimited string of tags. A trained model can have many tags, or none. When supplied, only trained models that contain all the supplied tags are returned.
error_trace (bool | None)
from_ (int | None)
human (bool | None)
pretty (bool | None)
- Return type:
- get_trained_models_stats(*, model_id=None, allow_no_match=None, error_trace=None, filter_path=None, from_=None, human=None, pretty=None, size=None)
Get trained models usage info. You can get usage information for multiple trained models in a single API request by using a comma-separated list of model IDs or a wildcard expression.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/get-trained-models-stats.html
- Parameters:
model_id (str | Sequence[str] | None) – The unique identifier of the trained model or a model alias. It can be a comma-separated list or a wildcard expression.
allow_no_match (bool | None) – Specifies what to do when the request: - Contains wildcard expressions and there are no models that match. - Contains the _all string or no identifiers and there are no matches. - Contains wildcard expressions and there are only partial matches. If true, it returns an empty array when there are no matches and the subset of results when there are partial matches.
from – Skips the specified number of models.
size (int | None) – Specifies the maximum number of models to obtain.
error_trace (bool | None)
from_ (int | None)
human (bool | None)
pretty (bool | None)
- Return type:
- infer_trained_model(*, model_id, docs=None, error_trace=None, filter_path=None, human=None, inference_config=None, pretty=None, timeout=None, body=None)
Evaluate a trained model.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/infer-trained-model.html
- Parameters:
model_id (str) – The unique identifier of the trained model.
docs (Sequence[Mapping[str, Any]] | None) – An array of objects to pass to the model for inference. The objects should contain fields matching your configured trained model input. Typically, for NLP models, the field name is text_field. Currently, for NLP models, only a single value is allowed.
inference_config (Mapping[str, Any] | None) – The inference configuration updates to apply on the API call
timeout (str | Literal[-1] | ~typing.Literal[0] | None) – Controls the amount of time to wait for inference results.
error_trace (bool | None)
human (bool | None)
pretty (bool | None)
- Return type:
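Per the docs parameter notes above, NLP inference requests typically wrap each input in a text_field document. A small sketch of assembling that array (the helper name is hypothetical):

```python
def build_infer_docs(texts, field="text_field"):
    """Assemble the docs array for infer_trained_model. For NLP models
    the input field is typically text_field, and only a single value
    per document is currently allowed."""
    return [{field: text} for text in texts]

build_infer_docs(["Elasticsearch is great"])
# -> [{"text_field": "Elasticsearch is great"}]
```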
- info(*, error_trace=None, filter_path=None, human=None, pretty=None)
Return ML defaults and limits. Returns defaults and limits used by machine learning. This endpoint is designed to be used by a user interface that needs to fully understand machine learning configurations where some options are not specified, meaning that the defaults should be used. This endpoint may be used to find out what those defaults are. It also provides information about the maximum size of machine learning jobs that could run in the current cluster configuration.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/get-ml-info.html
- open_job(*, job_id, error_trace=None, filter_path=None, human=None, pretty=None, timeout=None, body=None)
Open anomaly detection jobs. An anomaly detection job must be opened to be ready to receive and analyze data. It can be opened and closed multiple times throughout its lifecycle. When you open a new job, it starts with an empty model. When you open an existing job, the most recent model state is automatically loaded. The job is ready to resume its analysis from where it left off, once new data is received.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/ml-open-job.html
- post_calendar_events(*, calendar_id, events=None, error_trace=None, filter_path=None, human=None, pretty=None, body=None)
Add scheduled events to the calendar.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/ml-post-calendar-event.html
- Parameters:
calendar_id (str) – A string that uniquely identifies a calendar.
events (Sequence[Mapping[str, Any]] | None) – A list of one or more scheduled events. The event’s start and end times can be specified as integer milliseconds since the epoch or as a string in ISO 8601 format.
error_trace (bool | None)
human (bool | None)
pretty (bool | None)
- Return type:
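Since event times may be given as epoch milliseconds or ISO 8601 strings, a small builder can convert aware datetimes to the millisecond form. A sketch with an assumed helper name:

```python
from datetime import datetime, timezone

def calendar_event(description, start, end):
    """Build one scheduled event for post_calendar_events, converting
    timezone-aware datetimes to integer epoch milliseconds."""
    def to_millis(dt):
        return int(dt.timestamp() * 1000)

    return {
        "description": description,
        "start_time": to_millis(start),
        "end_time": to_millis(end),
    }

maintenance = calendar_event(
    "January maintenance window",
    datetime(2024, 1, 1, tzinfo=timezone.utc),
    datetime(2024, 1, 2, tzinfo=timezone.utc),
)
```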
- post_data(*, job_id, data=None, body=None, error_trace=None, filter_path=None, human=None, pretty=None, reset_end=None, reset_start=None)
Send data to an anomaly detection job for analysis. IMPORTANT: For each job, data can be accepted from only a single connection at a time. It is not currently possible to post data to multiple jobs using wildcards or a comma-separated list.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/ml-post-data.html
- Parameters:
job_id (str) – Identifier for the anomaly detection job. The job must have a state of open to receive and process the data.
reset_end (str | Any | None) – Specifies the end of the bucket resetting range.
reset_start (str | Any | None) – Specifies the start of the bucket resetting range.
error_trace (bool | None)
human (bool | None)
pretty (bool | None)
- Return type:
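The post data API accepts newline-delimited JSON in the request body. A serialization sketch (the Python client also accepts a sequence of dicts via the data parameter, so this helper is illustrative, not required):

```python
import json

def to_ndjson(docs):
    """Serialize documents as newline-delimited JSON, one compact JSON
    object per line, for the post data request body."""
    return "".join(json.dumps(doc, separators=(",", ":")) + "\n" for doc in docs)
```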
- preview_data_frame_analytics(*, id=None, config=None, error_trace=None, filter_path=None, human=None, pretty=None, body=None)
Preview features used by data frame analytics. Previews the extracted features used by a data frame analytics config.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/preview-dfanalytics.html
- Parameters:
id (str | None) – Identifier for the data frame analytics job.
config (Mapping[str, Any] | None) – A data frame analytics config as described in create data frame analytics jobs. Note that id and dest don’t need to be provided in the context of this API.
error_trace (bool | None)
human (bool | None)
pretty (bool | None)
- Return type:
- preview_datafeed(*, datafeed_id=None, datafeed_config=None, end=None, error_trace=None, filter_path=None, human=None, job_config=None, pretty=None, start=None, body=None)
Preview a datafeed. This API returns the first “page” of search results from a datafeed. You can preview an existing datafeed or provide configuration details for a datafeed and anomaly detection job in the API. The preview shows the structure of the data that will be passed to the anomaly detection engine. IMPORTANT: When Elasticsearch security features are enabled, the preview uses the credentials of the user that called the API. However, when the datafeed starts it uses the roles of the last user that created or updated the datafeed. To get a preview that accurately reflects the behavior of the datafeed, use the appropriate credentials. You can also use secondary authorization headers to supply the credentials.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/ml-preview-datafeed.html
- Parameters:
datafeed_id (str | None) – A numerical character string that uniquely identifies the datafeed. This identifier can contain lowercase alphanumeric characters (a-z and 0-9), hyphens, and underscores. It must start and end with alphanumeric characters. NOTE: If you use this path parameter, you cannot provide datafeed or anomaly detection job configuration details in the request body.
datafeed_config (Mapping[str, Any] | None) – The datafeed definition to preview.
end (str | Any | None) – The end time when the datafeed preview should stop
job_config (Mapping[str, Any] | None) – The configuration details for the anomaly detection job that is associated with the datafeed. If the datafeed_config object does not include a job_id that references an existing anomaly detection job, you must supply this job_config object. If you include both a job_id and a job_config, the latter information is used. You cannot specify a job_config object unless you also supply a datafeed_config object.
start (str | Any | None) – The start time from where the datafeed preview should begin
error_trace (bool | None)
human (bool | None)
pretty (bool | None)
- Return type:
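The body-composition rules above (job_config is needed only when datafeed_config lacks a job_id, and never without a datafeed_config) can be sketched as a small builder with an assumed name:

```python
def preview_body(datafeed_config, job_config=None):
    """Assemble a preview_datafeed request body. job_config is only
    supplied when datafeed_config does not reference an existing job
    via job_id; it cannot appear without datafeed_config."""
    body = {"datafeed_config": datafeed_config}
    if job_config is not None:
        body["job_config"] = job_config
    return body
```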
- put_calendar(*, calendar_id, description=None, error_trace=None, filter_path=None, human=None, job_ids=None, pretty=None, body=None)
Create a calendar.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/ml-put-calendar.html
- put_calendar_job(*, calendar_id, job_id, error_trace=None, filter_path=None, human=None, pretty=None)
Add anomaly detection job to calendar.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/ml-put-calendar-job.html
- Parameters:
- Return type:
- put_data_frame_analytics(*, id, analysis=None, dest=None, source=None, allow_lazy_start=None, analyzed_fields=None, description=None, error_trace=None, filter_path=None, headers=None, human=None, max_num_threads=None, model_memory_limit=None, pretty=None, version=None, body=None)
Create a data frame analytics job. This API creates a data frame analytics job that performs an analysis on the source indices and stores the outcome in a destination index.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/put-dfanalytics.html
- Parameters:
id (str) – Identifier for the data frame analytics job. This identifier can contain lowercase alphanumeric characters (a-z and 0-9), hyphens, and underscores. It must start and end with alphanumeric characters.
analysis (Mapping[str, Any] | None) – The analysis configuration, which contains the information necessary to perform one of the following types of analysis: classification, outlier detection, or regression.
dest (Mapping[str, Any] | None) – The destination configuration.
source (Mapping[str, Any] | None) – The configuration of how to source the analysis data.
allow_lazy_start (bool | None) – Specifies whether this job can start when there is insufficient machine learning node capacity for it to be immediately assigned to a node. If set to false and a machine learning node with capacity to run the job cannot be immediately found, the API returns an error. If set to true, the API does not return an error; the job waits in the starting state until sufficient machine learning node capacity is available. This behavior is also affected by the cluster-wide xpack.ml.max_lazy_ml_nodes setting.
analyzed_fields (Mapping[str, Any] | None) – Specifies includes and/or excludes patterns to select which fields will be included in the analysis. The patterns specified in excludes are applied last, therefore excludes takes precedence. In other words, if the same field is specified in both includes and excludes, then the field will not be included in the analysis. If analyzed_fields is not set, only the relevant fields will be included. For example, all the numeric fields for outlier detection. The supported fields vary for each type of analysis. Outlier detection requires numeric or boolean data to analyze. The algorithms don’t support missing values therefore fields that have data types other than numeric or boolean are ignored. Documents where included fields contain missing values, null values, or an array are also ignored. Therefore the dest index may contain documents that don’t have an outlier score. Regression supports fields that are numeric, boolean, text, keyword, and ip data types. It is also tolerant of missing values. Fields that are supported are included in the analysis, other fields are ignored. Documents where included fields contain an array with two or more values are also ignored. Documents in the dest index that don’t contain a results field are not included in the regression analysis. Classification supports fields that are numeric, boolean, text, keyword, and ip data types. It is also tolerant of missing values. Fields that are supported are included in the analysis, other fields are ignored. Documents where included fields contain an array with two or more values are also ignored. Documents in the dest index that don’t contain a results field are not included in the classification analysis. Classification analysis can be improved by mapping ordinal variable values to a single number. For example, in case of age ranges, you can model the values as 0-14 = 0, 15-24 = 1, 25-34 = 2, and so on.
description (str | None) – A description of the job.
max_num_threads (int | None) – The maximum number of threads to be used by the analysis. Using more threads may decrease the time necessary to complete the analysis at the cost of using more CPU. Note that the process may use additional threads for operational functionality other than the analysis itself.
model_memory_limit (str | None) – The approximate maximum amount of memory resources that are permitted for analytical processing. If your elasticsearch.yml file contains an xpack.ml.max_model_memory_limit setting, an error occurs when you try to create data frame analytics jobs that have model_memory_limit values greater than that setting.
version (str | None)
error_trace (bool | None)
human (bool | None)
pretty (bool | None)
- Return type:
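The required pieces of a data frame analytics job (analysis, dest, source) can be sketched as a minimal config builder. Index names and the memory limit here are placeholders:

```python
def outlier_detection_config(source_index, dest_index, memory_limit="1gb"):
    """Minimal config sketch for put_data_frame_analytics using the
    outlier detection analysis type; values are placeholders."""
    return {
        "source": {"index": source_index},
        "dest": {"index": dest_index},
        "analysis": {"outlier_detection": {}},
        "model_memory_limit": memory_limit,
    }
```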
- put_datafeed(*, datafeed_id, aggregations=None, allow_no_indices=None, chunking_config=None, delayed_data_check_config=None, error_trace=None, expand_wildcards=None, filter_path=None, frequency=None, headers=None, human=None, ignore_throttled=None, ignore_unavailable=None, indexes=None, indices=None, indices_options=None, job_id=None, max_empty_searches=None, pretty=None, query=None, query_delay=None, runtime_mappings=None, script_fields=None, scroll_size=None, body=None)
Create a datafeed. Datafeeds retrieve data from Elasticsearch for analysis by an anomaly detection job. You can associate only one datafeed with each anomaly detection job. The datafeed contains a query that runs at a defined interval (frequency). If you are concerned about delayed data, you can add a delay (query_delay) at each interval. When Elasticsearch security features are enabled, your datafeed remembers which roles the user who created it had at the time of creation and runs the query using those same roles. If you provide secondary authorization headers, those credentials are used instead. You must use Kibana, this API, or the create anomaly detection jobs API to create a datafeed. Do not add a datafeed directly to the .ml-config index. Do not give users write privileges on the .ml-config index.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/ml-put-datafeed.html
- Parameters:
datafeed_id (str) – A numerical character string that uniquely identifies the datafeed. This identifier can contain lowercase alphanumeric characters (a-z and 0-9), hyphens, and underscores. It must start and end with alphanumeric characters.
aggregations (Mapping[str, Mapping[str, Any]] | None) – If set, the datafeed performs aggregation searches. Support for aggregations is limited and should be used only with low cardinality data.
allow_no_indices (bool | None) – If true, wildcard indices expressions that resolve into no concrete indices are ignored. This includes the _all string or when no indices are specified.
chunking_config (Mapping[str, Any] | None) – Datafeeds might be required to search over long time periods, for several months or years. This search is split into time chunks in order to ensure the load on Elasticsearch is managed. Chunking configuration controls how the size of these time chunks are calculated; it is an advanced configuration option.
delayed_data_check_config (Mapping[str, Any] | None) – Specifies whether the datafeed checks for missing data and the size of the window. The datafeed can optionally search over indices that have already been read in an effort to determine whether any data has subsequently been added to the index. If missing data is found, it is a good indication that the query_delay is set too low and the data is being indexed after the datafeed has passed that moment in time. This check runs only on real-time datafeeds.
expand_wildcards (Sequence[str | Literal['all', 'closed', 'hidden', 'none', 'open']] | str | ~typing.Literal['all', 'closed', 'hidden', 'none', 'open'] | None) – Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values.
frequency (str | Literal[-1] | ~typing.Literal[0] | None) – The interval at which scheduled queries are made while the datafeed runs in real time. The default value is either the bucket span for short bucket spans, or, for longer bucket spans, a sensible fraction of the bucket span. When frequency is shorter than the bucket span, interim results for the last (partial) bucket are written then eventually overwritten by the full bucket results. If the datafeed uses aggregations, this value must be divisible by the interval of the date histogram aggregation.
ignore_throttled (bool | None) – If true, concrete, expanded, or aliased indices are ignored when frozen.
ignore_unavailable (bool | None) – If true, unavailable indices (missing or closed) are ignored.
indexes (str | Sequence[str] | None) – An array of index names. Wildcards are supported. If any of the indices are in remote clusters, the machine learning nodes must have the remote_cluster_client role.
indices (str | Sequence[str] | None) – An array of index names. Wildcards are supported. If any of the indices are in remote clusters, the machine learning nodes must have the remote_cluster_client role.
indices_options (Mapping[str, Any] | None) – Specifies index expansion options that are used during search
job_id (str | None) – Identifier for the anomaly detection job.
max_empty_searches (int | None) – If a real-time datafeed has never seen any data (including during any initial training period), it automatically stops and closes the associated job after this many real-time searches return no documents. In other words, it stops after frequency times max_empty_searches of real-time operation. If not set, a datafeed with no end time that sees no data remains started until it is explicitly stopped. By default, it is not set.
query (Mapping[str, Any] | None) – The Elasticsearch query domain-specific language (DSL). This value corresponds to the query object in an Elasticsearch search POST body. All the options that are supported by Elasticsearch can be used, as this object is passed verbatim to Elasticsearch.
query_delay (str | Literal[-1] | ~typing.Literal[0] | None) – The number of seconds behind real time that data is queried. For example, if data from 10:04 a.m. might not be searchable in Elasticsearch until 10:06 a.m., set this property to 120 seconds. The default value is randomly selected between 60s and 120s. This randomness improves the query performance when there are multiple jobs running on the same node.
runtime_mappings (Mapping[str, Mapping[str, Any]] | None) – Specifies runtime fields for the datafeed search.
script_fields (Mapping[str, Mapping[str, Any]] | None) – Specifies scripts that evaluate custom expressions and returns script fields to the datafeed. The detector configuration objects in a job can contain functions that use these script fields.
scroll_size (int | None) – The size parameter that is used in Elasticsearch searches when the datafeed does not use aggregations. The maximum value is the value of index.max_result_window, which is 10,000 by default.
error_trace (bool | None)
human (bool | None)
pretty (bool | None)
- Return type:
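A minimal datafeed definition needs little more than a job association, the indices to search, and a query. A sketch, leaving frequency and query_delay to their server-side defaults (the helper name is an assumption):

```python
def datafeed_body(job_id, indices, query=None):
    """Minimal put_datafeed body sketch: associates the datafeed with
    one anomaly detection job and defaults the query to match_all."""
    return {
        "job_id": job_id,
        "indices": list(indices),
        "query": query or {"match_all": {}},
    }
```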
- put_filter(*, filter_id, description=None, error_trace=None, filter_path=None, human=None, items=None, pretty=None, body=None)
Create a filter. A filter contains a list of strings. It can be used by one or more anomaly detection jobs. Specifically, filters are referenced in the custom_rules property of detector configuration objects.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/ml-put-filter.html
- Parameters:
filter_id (str) – A string that uniquely identifies a filter.
description (str | None) – A description of the filter.
items (Sequence[str] | None) – The items of the filter. A wildcard * can be used at the beginning or the end of an item. Up to 10000 items are allowed in each filter.
error_trace (bool | None)
human (bool | None)
pretty (bool | None)
- Return type:
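The item limit described above can be enforced client-side before the request is sent. A hedged sketch with an assumed helper name:

```python
def filter_body(description, items):
    """Build a put_filter request body. Per the docs, a filter may hold
    at most 10000 items, with a * wildcard allowed at the beginning or
    end of an item; only the size limit is checked here."""
    items = list(items)
    if len(items) > 10000:
        raise ValueError("a filter may contain at most 10000 items")
    return {"description": description, "items": items}
```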
- put_job(*, job_id, analysis_config=None, data_description=None, allow_lazy_open=None, analysis_limits=None, background_persist_interval=None, custom_settings=None, daily_model_snapshot_retention_after_days=None, datafeed_config=None, description=None, error_trace=None, filter_path=None, groups=None, human=None, model_plot_config=None, model_snapshot_retention_days=None, pretty=None, renormalization_window_days=None, results_index_name=None, results_retention_days=None, body=None)
Create an anomaly detection job. If you include a datafeed_config, you must have read index privileges on the source index.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/ml-put-job.html
- Parameters:
job_id (str) – The identifier for the anomaly detection job. This identifier can contain lowercase alphanumeric characters (a-z and 0-9), hyphens, and underscores. It must start and end with alphanumeric characters.
analysis_config (Mapping[str, Any] | None) – Specifies how to analyze the data. After you create a job, you cannot change the analysis configuration; all the properties are informational.
data_description (Mapping[str, Any] | None) – Defines the format of the input data when you send data to the job by using the post data API. Note that when you configure a datafeed, these properties are automatically set. When data is received via the post data API, it is not stored in Elasticsearch. Only the results for anomaly detection are retained.
allow_lazy_open (bool | None) – Advanced configuration option. Specifies whether this job can open when there is insufficient machine learning node capacity for it to be immediately assigned to a node. By default, if a machine learning node with capacity to run the job cannot immediately be found, the open anomaly detection jobs API returns an error. However, this is also subject to the cluster-wide xpack.ml.max_lazy_ml_nodes setting. If this option is set to true, the open anomaly detection jobs API does not return an error and the job waits in the opening state until sufficient machine learning node capacity is available.
analysis_limits (Mapping[str, Any] | None) – Limits can be applied for the resources required to hold the mathematical models in memory. These limits are approximate and can be set per job. They do not control the memory used by other processes, for example the Elasticsearch Java processes.
background_persist_interval (str | Literal[-1] | Literal[0] | None) – Advanced configuration option. The time between each periodic persistence of the model. The default value is a randomized value between 3 and 4 hours, which avoids all jobs persisting at exactly the same time. The smallest allowed value is 1 hour. For very large models (several GB), persistence could take 10-20 minutes, so do not set the background_persist_interval value too low.
custom_settings (Any | None) – Advanced configuration option. Contains custom meta data about the job.
daily_model_snapshot_retention_after_days (int | None) – Advanced configuration option, which affects the automatic removal of old model snapshots for this job. It specifies a period of time (in days) after which only the first snapshot per day is retained. This period is relative to the timestamp of the most recent snapshot for this job. Valid values range from 0 to model_snapshot_retention_days.
datafeed_config (Mapping[str, Any] | None) – Defines a datafeed for the anomaly detection job. If Elasticsearch security features are enabled, your datafeed remembers which roles the user who created it had at the time of creation and runs the query using those same roles. If you provide secondary authorization headers, those credentials are used instead.
description (str | None) – A description of the job.
groups (Sequence[str] | None) – A list of job groups. A job can belong to no groups or many.
model_plot_config (Mapping[str, Any] | None) – This advanced configuration option stores model information along with the results. It provides a more detailed view into anomaly detection. If you enable model plot it can add considerable overhead to the performance of the system; it is not feasible for jobs with many entities. Model plot provides a simplified and indicative view of the model and its bounds. It does not display complex features such as multivariate correlations or multimodal data. As such, anomalies may occasionally be reported which cannot be seen in the model plot. Model plot config can be configured when the job is created or updated later. It must be disabled if performance issues are experienced.
model_snapshot_retention_days (int | None) – Advanced configuration option, which affects the automatic removal of old model snapshots for this job. It specifies the maximum period of time (in days) that snapshots are retained. This period is relative to the timestamp of the most recent snapshot for this job. By default, snapshots ten days older than the newest snapshot are deleted.
renormalization_window_days (int | None) – Advanced configuration option. The period over which adjustments to the score are applied, as new data is seen. The default value is the longer of 30 days or 100 bucket spans.
results_index_name (str | None) – A text string that affects the name of the machine learning results index. By default, the job generates an index named .ml-anomalies-shared.
results_retention_days (int | None) – Advanced configuration option. The period of time (in days) that results are retained. Age is calculated relative to the timestamp of the latest bucket result. If this property has a non-null value, once per day at 00:30 (server time), results that are the specified number of days older than the latest bucket result are deleted from Elasticsearch. The default value is null, which means all results are retained. Annotations generated by the system also count as results for retention purposes; they are deleted after the same number of days as results. Annotations added by users are retained forever.
error_trace (bool | None)
human (bool | None)
pretty (bool | None)
- Return type:
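A sketch of a minimal put_job call. Because analysis_config cannot be changed after creation, bucket_span and the detectors must be decided up front. The job id, field names, and retention value are illustrative; `es` is assumed to be a connected client.

```python
# Hypothetical job configuration; analysis_config is immutable once the job exists.
analysis_config = {
    "bucket_span": "15m",
    "detectors": [{"function": "mean", "field_name": "responsetime"}],
}
data_description = {"time_field": "@timestamp", "time_format": "epoch_ms"}

def create_job(es):
    # `es` is assumed to be a connected Elasticsearch client.
    return es.ml.put_job(
        job_id="mean_response_time",   # lowercase alphanumerics, hyphens, underscores
        description="Mean response time per 15m bucket",
        analysis_config=analysis_config,
        data_description=data_description,
        results_retention_days=60,
    )
```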
- put_trained_model(*, model_id, compressed_definition=None, defer_definition_decompression=None, definition=None, description=None, error_trace=None, filter_path=None, human=None, inference_config=None, input=None, metadata=None, model_size_bytes=None, model_type=None, platform_architecture=None, prefix_strings=None, pretty=None, tags=None, wait_for_completion=None, body=None)
Create a trained model. It enables you to supply a trained model that is not created by data frame analytics.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/put-trained-models.html
- Parameters:
model_id (str) – The unique identifier of the trained model.
compressed_definition (str | None) – The compressed (GZipped and Base64 encoded) inference definition of the model. If compressed_definition is specified, then definition cannot be specified.
defer_definition_decompression (bool | None) – If set to true and a compressed_definition is provided, the request defers definition decompression and skips relevant validations.
definition (Mapping[str, Any] | None) – The inference definition for the model. If definition is specified, then compressed_definition cannot be specified.
description (str | None) – A human-readable description of the inference trained model.
inference_config (Mapping[str, Any] | None) – The default configuration for inference. This can be either a regression or classification configuration. It must match the underlying definition.trained_model’s target_type. For pre-packaged models such as ELSER the config is not required.
input (Mapping[str, Any] | None) – The input field names for the model definition.
metadata (Any | None) – An object map that contains metadata about the model.
model_size_bytes (int | None) – The estimated memory usage in bytes to keep the trained model in memory. This property is supported only if defer_definition_decompression is true or the model definition is not supplied.
model_type (str | Literal['lang_ident', 'pytorch', 'tree_ensemble'] | None) – The model type.
platform_architecture (str | None) – The platform architecture (if applicable) of the trained model. If the model only works on one platform, because it is heavily optimized for a particular processor architecture and OS combination, this field specifies which one. The format of the string must match the platform identifiers used by Elasticsearch, so it must be one of linux-x86_64, linux-aarch64, darwin-x86_64, darwin-aarch64, or windows-x86_64. For portable models (those that work independently of processor architecture or OS features), leave this field unset.
prefix_strings (Mapping[str, Any] | None) – Optional prefix strings applied at inference.
tags (Sequence[str] | None) – An array of tags to organize the model.
wait_for_completion (bool | None) – Whether to wait for all child operations (e.g. model download) to complete.
error_trace (bool | None)
human (bool | None)
pretty (bool | None)
- Return type:
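Since compressed_definition expects a GZipped, Base64-encoded string and is mutually exclusive with definition, a small encoding helper is the natural companion to this API. The model id, inference_config, and field names below are illustrative assumptions.

```python
import base64
import gzip
import json

def compress_definition(definition: dict) -> str:
    """GZip then Base64-encode a model definition, the format expected by
    the compressed_definition parameter."""
    raw = json.dumps(definition).encode("utf-8")
    return base64.b64encode(gzip.compress(raw)).decode("ascii")

def upload_model(es, definition: dict):
    # `es` is assumed to be a connected client; ids and fields are illustrative.
    return es.ml.put_trained_model(
        model_id="my-regression-model",
        model_type="tree_ensemble",
        compressed_definition=compress_definition(definition),
        inference_config={"regression": {}},  # must match the definition's target_type
        input={"field_names": ["f0", "f1"]},
        tags=["example"],
    )
```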
- put_trained_model_alias(*, model_id, model_alias, error_trace=None, filter_path=None, human=None, pretty=None, reassign=None)
Create or update a trained model alias. A trained model alias is a logical name used to reference a single trained model. You can use aliases instead of trained model identifiers to make it easier to reference your models. For example, you can use aliases in inference aggregations and processors. An alias must be unique and refer to only a single trained model. However, you can have multiple aliases for each trained model. If you use this API to update an alias such that it references a different trained model ID and the model uses a different type of data frame analytics, an error occurs. For example, this situation occurs if you have a trained model for regression analysis and a trained model for classification analysis; you cannot reassign an alias from one type of trained model to another. If you use this API to update an alias and there are very few input fields in common between the old and new trained models for the model alias, the API returns a warning.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/put-trained-models-aliases.html
- Parameters:
model_id (str) – The identifier for the trained model that the alias refers to.
model_alias (str) – The alias to create or update. This value cannot end in numbers.
reassign (bool | None) – Specifies whether the alias gets reassigned to the specified trained model if it is already assigned to a different model. If the alias is already assigned and this parameter is false, the API returns an error.
error_trace (bool | None)
human (bool | None)
pretty (bool | None)
- Return type:
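A sketch of the common "promote a newly trained model" pattern: repoint a stable alias at the new model id. The alias name "champion" is an illustrative assumption; `es` is assumed to be a connected client.

```python
def is_valid_alias(alias: str) -> bool:
    # An alias must be non-empty and cannot end in a number.
    return bool(alias) and not alias[-1].isdigit()

def promote_model(es, new_model_id: str):
    # reassign=True is required when the alias already points at another model;
    # without it the API returns an error for an already-assigned alias.
    assert is_valid_alias("champion")
    return es.ml.put_trained_model_alias(
        model_id=new_model_id,
        model_alias="champion",
        reassign=True,
    )
```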
- put_trained_model_definition_part(*, model_id, part, definition=None, total_definition_length=None, total_parts=None, error_trace=None, filter_path=None, human=None, pretty=None, body=None)
Create part of a trained model definition.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/put-trained-model-definition-part.html
- Parameters:
model_id (str) – The unique identifier of the trained model.
part (int) – The definition part number. When the definition is loaded for inference the definition parts are streamed in the order of their part number. The first part must be 0 and the final part must be total_parts - 1.
definition (str | None) – The definition part for the model. Must be a base64 encoded string.
total_definition_length (int | None) – The total uncompressed definition length in bytes. Not base64 encoded.
total_parts (int | None) – The total number of parts that will be uploaded. Must be greater than 0.
error_trace (bool | None)
human (bool | None)
pretty (bool | None)
- Return type:
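Since parts must be numbered 0 through total_parts - 1 and each definition part must be Base64 encoded, uploads are naturally driven by a chunking loop. The 1 MiB chunk size is an illustrative choice; `es` is assumed to be a connected client.

```python
import base64

def split_into_parts(compressed: bytes, chunk_size: int) -> list:
    """Split a compressed model definition into fixed-size chunks."""
    return [compressed[i:i + chunk_size] for i in range(0, len(compressed), chunk_size)]

def upload_definition_parts(es, model_id, compressed, total_uncompressed_length,
                            chunk_size=1024 * 1024):
    # Parts are numbered from 0; the final part must be total_parts - 1.
    parts = split_into_parts(compressed, chunk_size)
    for number, chunk in enumerate(parts):
        es.ml.put_trained_model_definition_part(
            model_id=model_id,
            part=number,
            definition=base64.b64encode(chunk).decode("ascii"),
            total_definition_length=total_uncompressed_length,
            total_parts=len(parts),
        )
```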
- put_trained_model_vocabulary(*, model_id, vocabulary=None, error_trace=None, filter_path=None, human=None, merges=None, pretty=None, scores=None, body=None)
Create a trained model vocabulary. This API is supported only for natural language processing (NLP) models. The vocabulary is stored in the index as described in inference_config.*.vocabulary of the trained model definition.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/put-trained-model-vocabulary.html
- Parameters:
model_id (str) – The unique identifier of the trained model.
vocabulary (Sequence[str] | None) – The model vocabulary, which must not be empty.
merges (Sequence[str] | None) – The optional model merges if required by the tokenizer.
scores (Sequence[float] | None) – The optional vocabulary value scores if required by the tokenizer.
error_trace (bool | None)
human (bool | None)
pretty (bool | None)
- Return type:
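A sketch of loading a newline-delimited vocabulary (one token per line, a common tokenizer export format) and uploading it. The parsing convention is an assumption about the input file, not something this API mandates; `es` is assumed to be a connected client.

```python
def parse_vocabulary(text: str) -> list:
    """Parse a newline-delimited vocabulary (one token per line), dropping blanks."""
    vocab = [line for line in text.splitlines() if line.strip()]
    if not vocab:
        raise ValueError("vocabulary must not be empty")
    return vocab

def upload_vocabulary(es, model_id: str, vocab_text: str):
    # merges and scores are passed only when the model's tokenizer requires them.
    return es.ml.put_trained_model_vocabulary(
        model_id=model_id,
        vocabulary=parse_vocabulary(vocab_text),
    )
```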
- reset_job(*, job_id, delete_user_annotations=None, error_trace=None, filter_path=None, human=None, pretty=None, wait_for_completion=None)
Reset an anomaly detection job. All model state and results are deleted. The job is ready to start over as if it had just been created. It is not currently possible to reset multiple jobs using wildcards or a comma-separated list.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/ml-reset-job.html
- Parameters:
job_id (str) – The ID of the job to reset.
delete_user_annotations (bool | None) – Specifies whether annotations that have been added by the user should be deleted along with any auto-generated annotations when the job is reset.
wait_for_completion (bool | None) – Specifies whether the request should wait until the operation has completed before returning.
error_trace (bool | None)
human (bool | None)
pretty (bool | None)
- Return type:
- revert_model_snapshot(*, job_id, snapshot_id, delete_intervening_results=None, error_trace=None, filter_path=None, human=None, pretty=None, body=None)
Revert to a snapshot. The machine learning features react quickly to anomalous input, learning new behaviors in data. Highly anomalous input increases the variance in the models whilst the system learns whether this is a new step-change in behavior or a one-off event. In the case where this anomalous input is known to be a one-off, then it might be appropriate to reset the model state to a time before this event. For example, you might consider reverting to a saved snapshot after Black Friday or a critical system failure.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/ml-revert-snapshot.html
- Parameters:
job_id (str) – Identifier for the anomaly detection job.
snapshot_id (str) – You can specify empty as the <snapshot_id>. Reverting to the empty snapshot means the anomaly detection job starts learning a new model from scratch when it is started.
delete_intervening_results (bool | None) – Refer to the description for the delete_intervening_results query parameter.
error_trace (bool | None)
human (bool | None)
pretty (bool | None)
- Return type:
- set_upgrade_mode(*, enabled=None, error_trace=None, filter_path=None, human=None, pretty=None, timeout=None)
Set upgrade_mode for ML indices. Sets a cluster wide upgrade_mode setting that prepares machine learning indices for an upgrade. When upgrading your cluster, in some circumstances you must restart your nodes and reindex your machine learning indices. In those circumstances, there must be no machine learning jobs running. You can close the machine learning jobs, do the upgrade, then open all the jobs again. Alternatively, you can use this API to temporarily halt tasks associated with the jobs and datafeeds and prevent new jobs from opening. You can also use this API during upgrades that do not require you to reindex your machine learning indices, though stopping jobs is not a requirement in that case. You can see the current value for the upgrade_mode setting by using the get machine learning info API.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/ml-set-upgrade-mode.html
- Parameters:
enabled (bool | None) – When true, it enables upgrade_mode which temporarily halts all job and datafeed tasks and prohibits new job and datafeed tasks from starting.
timeout (str | Literal[-1] | Literal[0] | None) – The time to wait for the request to be completed.
error_trace (bool | None)
human (bool | None)
pretty (bool | None)
- Return type:
- start_data_frame_analytics(*, id, error_trace=None, filter_path=None, human=None, pretty=None, timeout=None)
Start a data frame analytics job. A data frame analytics job can be started and stopped multiple times throughout its lifecycle. If the destination index does not exist, it is created automatically the first time you start the data frame analytics job. The index.number_of_shards and index.number_of_replicas settings for the destination index are copied from the source index. If there are multiple source indices, the destination index copies the highest setting values. The mappings for the destination index are also copied from the source indices. If there are any mapping conflicts, the job fails to start. If the destination index exists, it is used as is. You can therefore set up the destination index in advance with custom settings and mappings.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/start-dfanalytics.html
- Parameters:
id (str) – Identifier for the data frame analytics job. This identifier can contain lowercase alphanumeric characters (a-z and 0-9), hyphens, and underscores. It must start and end with alphanumeric characters.
timeout (str | Literal[-1] | Literal[0] | None) – Controls the amount of time to wait until the data frame analytics job starts.
error_trace (bool | None)
human (bool | None)
pretty (bool | None)
- Return type:
- start_datafeed(*, datafeed_id, end=None, error_trace=None, filter_path=None, human=None, pretty=None, start=None, timeout=None, body=None)
Start datafeeds. A datafeed must be started in order to retrieve data from Elasticsearch. A datafeed can be started and stopped multiple times throughout its lifecycle. Before you can start a datafeed, the anomaly detection job must be open. Otherwise, an error occurs. If you restart a stopped datafeed, it continues processing input data from the next millisecond after it was stopped. If new data was indexed for that exact millisecond between stopping and starting, it will be ignored. When Elasticsearch security features are enabled, your datafeed remembers which roles the last user to create or update it had at the time of creation or update and runs the query using those same roles. If you provided secondary authorization headers when you created or updated the datafeed, those credentials are used instead.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/ml-start-datafeed.html
- Parameters:
datafeed_id (str) – A numerical character string that uniquely identifies the datafeed. This identifier can contain lowercase alphanumeric characters (a-z and 0-9), hyphens, and underscores. It must start and end with alphanumeric characters.
end (str | Any | None) – Refer to the description for the end query parameter.
start (str | Any | None) – Refer to the description for the start query parameter.
timeout (str | Literal[-1] | Literal[0] | None) – Refer to the description for the timeout query parameter.
error_trace (bool | None)
human (bool | None)
pretty (bool | None)
- Return type:
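The start and end parameters accept several time formats, including milliseconds since the epoch, so a small conversion helper keeps call sites explicit. The datafeed id and timeout value are illustrative; `es` is assumed to be a connected client.

```python
from datetime import datetime, timezone

def to_epoch_ms(dt: datetime) -> int:
    """Convert a datetime to millisecond epoch, one of the accepted start/end formats."""
    return int(dt.timestamp() * 1000)

def start_feed(es, datafeed_id: str, start_dt: datetime, end_dt: datetime = None):
    # The associated anomaly detection job must already be open, or this errors.
    return es.ml.start_datafeed(
        datafeed_id=datafeed_id,
        start=to_epoch_ms(start_dt),
        end=to_epoch_ms(end_dt) if end_dt is not None else None,
        timeout="30s",
    )
```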
- start_trained_model_deployment(*, model_id, cache_size=None, deployment_id=None, error_trace=None, filter_path=None, human=None, number_of_allocations=None, pretty=None, priority=None, queue_capacity=None, threads_per_allocation=None, timeout=None, wait_for=None)
Start a trained model deployment. It allocates the model to every machine learning node.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/start-trained-model-deployment.html
- Parameters:
model_id (str) – The unique identifier of the trained model. Currently, only PyTorch models are supported.
cache_size (int | str | None) – The inference cache size (in memory outside the JVM heap) per node for the model. The default value is the same size as the model_size_bytes. To disable the cache, 0b can be provided.
deployment_id (str | None) – A unique identifier for the deployment of the model.
number_of_allocations (int | None) – The number of model allocations on each node where the model is deployed. All allocations on a node share the same copy of the model in memory but use a separate set of threads to evaluate the model. Increasing this value generally increases the throughput. If this setting is greater than the number of hardware threads it will automatically be changed to a value less than the number of hardware threads.
priority (str | Literal['low', 'normal'] | None) – The deployment priority.
queue_capacity (int | None) – Specifies the number of inference requests that are allowed in the queue. After the number of requests exceeds this value, new requests are rejected with a 429 error.
threads_per_allocation (int | None) – Sets the number of threads used by each model allocation during inference. This generally increases the inference speed. The inference process is a compute-bound process; any number greater than the number of available hardware threads on the machine does not increase the inference speed. If this setting is greater than the number of hardware threads it will automatically be changed to a value less than the number of hardware threads.
timeout (str | Literal[-1] | Literal[0] | None) – Specifies the amount of time to wait for the model to deploy.
wait_for (str | Literal['fully_allocated', 'started', 'starting'] | None) – Specifies the allocation status to wait for before returning.
error_trace (bool | None)
human (bool | None)
pretty (bool | None)
- Return type:
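Because allocations × threads_per_allocation beyond the node's hardware threads is clamped anyway, a simple sizing rule keeps the two settings consistent. The deployment id convention, queue capacity, and thread counts below are illustrative assumptions; `es` is assumed to be a connected client.

```python
def allocations_for(hardware_threads: int, threads_per_allocation: int) -> int:
    """Rough per-node sizing: keep allocations * threads within the hardware
    thread count (Elasticsearch clamps larger values anyway)."""
    return max(1, hardware_threads // threads_per_allocation)

def deploy(es, model_id: str, hardware_threads: int = 8):
    threads = 2
    return es.ml.start_trained_model_deployment(
        model_id=model_id,                       # only PyTorch models are supported
        deployment_id=f"{model_id}-prod",        # illustrative naming convention
        number_of_allocations=allocations_for(hardware_threads, threads),
        threads_per_allocation=threads,
        priority="normal",
        queue_capacity=1024,   # requests beyond this are rejected with a 429
        cache_size="0b",       # disable the per-node inference cache
        wait_for="started",
        timeout="2m",
    )
```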
- stop_data_frame_analytics(*, id, allow_no_match=None, error_trace=None, filter_path=None, force=None, human=None, pretty=None, timeout=None)
Stop data frame analytics jobs. A data frame analytics job can be started and stopped multiple times throughout its lifecycle.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/stop-dfanalytics.html
- Parameters:
id (str) – Identifier for the data frame analytics job. This identifier can contain lowercase alphanumeric characters (a-z and 0-9), hyphens, and underscores. It must start and end with alphanumeric characters.
allow_no_match (bool | None) – Specifies what to do when the request: 1. Contains wildcard expressions and there are no data frame analytics jobs that match. 2. Contains the _all string or no identifiers and there are no matches. 3. Contains wildcard expressions and there are only partial matches. The default value is true, which returns an empty data_frame_analytics array when there are no matches and the subset of results when there are partial matches. If this parameter is false, the request returns a 404 status code when there are no matches or only partial matches.
force (bool | None) – If true, the data frame analytics job is stopped forcefully.
timeout (str | Literal[-1] | Literal[0] | None) – Controls the amount of time to wait until the data frame analytics job stops. Defaults to 20 seconds.
error_trace (bool | None)
human (bool | None)
pretty (bool | None)
- Return type:
- stop_datafeed(*, datafeed_id, allow_no_match=None, error_trace=None, filter_path=None, force=None, human=None, pretty=None, timeout=None, body=None)
Stop datafeeds. A datafeed that is stopped ceases to retrieve data from Elasticsearch. A datafeed can be started and stopped multiple times throughout its lifecycle.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/ml-stop-datafeed.html
- Parameters:
datafeed_id (str) – Identifier for the datafeed. You can stop multiple datafeeds in a single API request by using a comma-separated list of datafeeds or a wildcard expression. You can close all datafeeds by using _all or by specifying * as the identifier.
allow_no_match (bool | None) – Refer to the description for the allow_no_match query parameter.
force (bool | None) – Refer to the description for the force query parameter.
timeout (str | Literal[-1] | Literal[0] | None) – Refer to the description for the timeout query parameter.
error_trace (bool | None)
human (bool | None)
pretty (bool | None)
- Return type:
- stop_trained_model_deployment(*, model_id, allow_no_match=None, error_trace=None, filter_path=None, force=None, human=None, pretty=None)
Stop a trained model deployment.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/stop-trained-model-deployment.html
- Parameters:
model_id (str) – The unique identifier of the trained model.
allow_no_match (bool | None) – Specifies what to do when the request: contains wildcard expressions and there are no deployments that match; contains the _all string or no identifiers and there are no matches; or contains wildcard expressions and there are only partial matches. By default, it returns an empty array when there are no matches and the subset of results when there are partial matches. If false, the request returns a 404 status code when there are no matches or only partial matches.
force (bool | None) – Forcefully stops the deployment, even if it is used by ingest pipelines. You can’t use these pipelines until you restart the model deployment.
error_trace (bool | None)
human (bool | None)
pretty (bool | None)
- Return type:
- update_data_frame_analytics(*, id, allow_lazy_start=None, description=None, error_trace=None, filter_path=None, human=None, max_num_threads=None, model_memory_limit=None, pretty=None, body=None)
Update a data frame analytics job.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/update-dfanalytics.html
- Parameters:
id (str) – Identifier for the data frame analytics job. This identifier can contain lowercase alphanumeric characters (a-z and 0-9), hyphens, and underscores. It must start and end with alphanumeric characters.
allow_lazy_start (bool | None) – Specifies whether this job can start when there is insufficient machine learning node capacity for it to be immediately assigned to a node.
description (str | None) – A description of the job.
max_num_threads (int | None) – The maximum number of threads to be used by the analysis. Using more threads may decrease the time necessary to complete the analysis at the cost of using more CPU. Note that the process may use additional threads for operational functionality other than the analysis itself.
model_memory_limit (str | None) – The approximate maximum amount of memory resources that are permitted for analytical processing. If your elasticsearch.yml file contains an xpack.ml.max_model_memory_limit setting, an error occurs when you try to create data frame analytics jobs that have model_memory_limit values greater than that setting.
error_trace (bool | None)
human (bool | None)
pretty (bool | None)
- Return type:
- update_datafeed(*, datafeed_id, aggregations=None, allow_no_indices=None, chunking_config=None, delayed_data_check_config=None, error_trace=None, expand_wildcards=None, filter_path=None, frequency=None, human=None, ignore_throttled=None, ignore_unavailable=None, indexes=None, indices=None, indices_options=None, job_id=None, max_empty_searches=None, pretty=None, query=None, query_delay=None, runtime_mappings=None, script_fields=None, scroll_size=None, body=None)
Update a datafeed. You must stop and start the datafeed for the changes to be applied. When Elasticsearch security features are enabled, your datafeed remembers which roles the user who updated it had at the time of the update and runs the query using those same roles. If you provide secondary authorization headers, those credentials are used instead.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/ml-update-datafeed.html
- Parameters:
datafeed_id (str) – A numerical character string that uniquely identifies the datafeed. This identifier can contain lowercase alphanumeric characters (a-z and 0-9), hyphens, and underscores. It must start and end with alphanumeric characters.
aggregations (Mapping[str, Mapping[str, Any]] | None) – If set, the datafeed performs aggregation searches. Support for aggregations is limited and should be used only with low cardinality data.
allow_no_indices (bool | None) – If true, wildcard indices expressions that resolve into no concrete indices are ignored. This includes the _all string or when no indices are specified.
chunking_config (Mapping[str, Any] | None) – Datafeeds might search over long time periods, for several months or years. This search is split into time chunks in order to ensure the load on Elasticsearch is managed. Chunking configuration controls how the size of these time chunks are calculated; it is an advanced configuration option.
delayed_data_check_config (Mapping[str, Any] | None) – Specifies whether the datafeed checks for missing data and the size of the window. The datafeed can optionally search over indices that have already been read in an effort to determine whether any data has subsequently been added to the index. If missing data is found, it is a good indication that the query_delay is set too low and the data is being indexed after the datafeed has passed that moment in time. This check runs only on real-time datafeeds.
expand_wildcards (Sequence[str | Literal['all', 'closed', 'hidden', 'none', 'open']] | str | Literal['all', 'closed', 'hidden', 'none', 'open'] | None) – Type of index that wildcard patterns can match. If the request can target data streams, this argument determines whether wildcard expressions match hidden data streams. Supports comma-separated values. Valid values are: * all: Match any data stream or index, including hidden ones. * closed: Match closed, non-hidden indices. Also matches any non-hidden data stream. Data streams cannot be closed. * hidden: Match hidden data streams and hidden indices. Must be combined with open, closed, or both. * none: Wildcard patterns are not accepted. * open: Match open, non-hidden indices. Also matches any non-hidden data stream.
frequency (str | Literal[-1] | Literal[0] | None) – The interval at which scheduled queries are made while the datafeed runs in real time. The default value is either the bucket span for short bucket spans, or, for longer bucket spans, a sensible fraction of the bucket span. When frequency is shorter than the bucket span, interim results for the last (partial) bucket are written then eventually overwritten by the full bucket results. If the datafeed uses aggregations, this value must be divisible by the interval of the date histogram aggregation.
ignore_throttled (bool | None) – If true, concrete, expanded or aliased indices are ignored when frozen.
ignore_unavailable (bool | None) – If true, unavailable indices (missing or closed) are ignored.
indexes (Sequence[str] | None) – An array of index names. Wildcards are supported. If any of the indices are in remote clusters, the machine learning nodes must have the remote_cluster_client role.
indices (Sequence[str] | None) – An array of index names. Wildcards are supported. If any of the indices are in remote clusters, the machine learning nodes must have the remote_cluster_client role.
indices_options (Mapping[str, Any] | None) – Specifies index expansion options that are used during search.
job_id (str | None)
max_empty_searches (int | None) – If a real-time datafeed has never seen any data (including during any initial training period), it automatically stops and closes the associated job after this many real-time searches return no documents. In other words, it stops after frequency times max_empty_searches of real-time operation. If not set, a datafeed with no end time that sees no data remains started until it is explicitly stopped. By default, it is not set.
query (Mapping[str, Any] | None) – The Elasticsearch query domain-specific language (DSL). This value corresponds to the query object in an Elasticsearch search POST body. All the options that are supported by Elasticsearch can be used, as this object is passed verbatim to Elasticsearch. Note that if you change the query, the analyzed data is also changed. Therefore, the time required to learn might be long and the understandability of the results is unpredictable. If you want to make significant changes to the source data, it is recommended that you clone the job and datafeed and make the amendments in the clone. Let both run in parallel and close one when you are satisfied with the results of the job.
query_delay (str | Literal[-1] | Literal[0] | None) – The number of seconds behind real time that data is queried. For example, if data from 10:04 a.m. might not be searchable in Elasticsearch until 10:06 a.m., set this property to 120 seconds. The default value is randomly selected between 60s and 120s. This randomness improves the query performance when there are multiple jobs running on the same node.
runtime_mappings (Mapping[str, Mapping[str, Any]] | None) – Specifies runtime fields for the datafeed search.
script_fields (Mapping[str, Mapping[str, Any]] | None) – Specifies scripts that evaluate custom expressions and returns script fields to the datafeed. The detector configuration objects in a job can contain functions that use these script fields.
scroll_size (int | None) – The size parameter that is used in Elasticsearch searches when the datafeed does not use aggregations. The maximum value is the value of index.max_result_window.
error_trace (bool | None)
human (bool | None)
pretty (bool | None)
- Return type: ObjectApiResponse[Any]
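The datafeed parameters above can be sketched locally as plain dictionaries before passing them to the client. A minimal sketch, assuming these parameters belong to a datafeed update call; the job ID, query, and all values are hypothetical:

```python
# Illustrative sketch of datafeed-update parameters described above.
# All identifiers and values here are hypothetical examples.
datafeed_update = {
    "job_id": "my-anomaly-job",
    "query": {"bool": {"filter": [{"term": {"status": "active"}}]}},
    "query_delay": "90s",       # default is randomized between 60s and 120s
    "scroll_size": 1000,        # must not exceed index.max_result_window
    "max_empty_searches": 10,   # stop after 10 empty real-time searches
}

def empty_search_cutoff_seconds(frequency_s: int, max_empty_searches: int) -> int:
    """Per the docs, a real-time datafeed that never sees data stops after
    roughly frequency * max_empty_searches of real-time operation."""
    return frequency_s * max_empty_searches

# With a 60s frequency and max_empty_searches=10, the datafeed would stop
# after about 600 seconds of seeing no documents.
cutoff = empty_search_cutoff_seconds(60, datafeed_update["max_empty_searches"])
```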
- update_filter(*, filter_id, add_items=None, description=None, error_trace=None, filter_path=None, human=None, pretty=None, remove_items=None, body=None)
Update a filter. Updates the description of a filter, adds items, or removes items from the list.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/ml-update-filter.html
- Parameters:
filter_id (str) – A string that uniquely identifies a filter.
add_items (Sequence[str] | None) – The items to add to the filter.
description (str | None) – A description for the filter.
remove_items (Sequence[str] | None) – The items to remove from the filter.
error_trace (bool | None)
human (bool | None)
pretty (bool | None)
- Return type: ObjectApiResponse[Any]
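A minimal sketch of an `update_filter` request; the client instance (`es`), filter ID, and item values are hypothetical:

```python
# Hypothetical filter ID and items, illustrating the parameters above.
filter_id = "safe_domains"
filter_update = {
    "description": "Updated list of safe domains",
    "add_items": ["*.myorg.com"],
    "remove_items": ["wikipedia.org"],
}
# With a connected client this would be:
# resp = es.ml.update_filter(filter_id=filter_id, **filter_update)
```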
- update_job(*, job_id, allow_lazy_open=None, analysis_limits=None, background_persist_interval=None, categorization_filters=None, custom_settings=None, daily_model_snapshot_retention_after_days=None, description=None, detectors=None, error_trace=None, filter_path=None, groups=None, human=None, model_plot_config=None, model_prune_window=None, model_snapshot_retention_days=None, per_partition_categorization=None, pretty=None, renormalization_window_days=None, results_retention_days=None, body=None)
Update an anomaly detection job. Updates certain properties of an anomaly detection job.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/ml-update-job.html
- Parameters:
job_id (str) – Identifier for the job.
allow_lazy_open (bool | None) – Advanced configuration option. Specifies whether this job can open when there is insufficient machine learning node capacity for it to be immediately assigned to a node. If false and a machine learning node with capacity to run the job cannot immediately be found, the open anomaly detection jobs API returns an error. However, this is also subject to the cluster-wide xpack.ml.max_lazy_ml_nodes setting. If this option is set to true, the open anomaly detection jobs API does not return an error and the job waits in the opening state until sufficient machine learning node capacity is available.
background_persist_interval (str | Literal[-1] | Literal[0] | None) – Advanced configuration option. The time between each periodic persistence of the model. The default value is a randomized value between 3 and 4 hours, which avoids all jobs persisting at exactly the same time. The smallest allowed value is 1 hour. For very large models (several GB), persistence could take 10-20 minutes, so do not set the value too low. If the job is open when you make the update, you must stop the datafeed, close the job, then reopen the job and restart the datafeed for the changes to take effect.
custom_settings (Mapping[str, Any] | None) – Advanced configuration option. Contains custom meta data about the job. For example, it can contain custom URL information as shown in Adding custom URLs to machine learning results.
daily_model_snapshot_retention_after_days (int | None) – Advanced configuration option, which affects the automatic removal of old model snapshots for this job. It specifies a period of time (in days) after which only the first snapshot per day is retained. This period is relative to the timestamp of the most recent snapshot for this job. Valid values range from 0 to model_snapshot_retention_days. For jobs created before version 7.8.0, the default value matches model_snapshot_retention_days.
description (str | None) – A description of the job.
detectors (Sequence[Mapping[str, Any]] | None) – An array of detector update objects.
groups (Sequence[str] | None) – A list of job groups. A job can belong to no groups or many.
model_prune_window (str | Literal[-1] | Literal[0] | None)
model_snapshot_retention_days (int | None) – Advanced configuration option, which affects the automatic removal of old model snapshots for this job. It specifies the maximum period of time (in days) that snapshots are retained. This period is relative to the timestamp of the most recent snapshot for this job.
per_partition_categorization (Mapping[str, Any] | None) – Settings related to how categorization interacts with partition fields.
renormalization_window_days (int | None) – Advanced configuration option. The period over which adjustments to the score are applied, as new data is seen.
results_retention_days (int | None) – Advanced configuration option. The period of time (in days) that results are retained. Age is calculated relative to the timestamp of the latest bucket result. If this property has a non-null value, once per day at 00:30 (server time), results that are the specified number of days older than the latest bucket result are deleted from Elasticsearch. The default value is null, which means all results are retained.
error_trace (bool | None)
human (bool | None)
pretty (bool | None)
- Return type: ObjectApiResponse[Any]
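The constraint on `daily_model_snapshot_retention_after_days` (valid values range from 0 to `model_snapshot_retention_days`) can be checked locally before sending the update. A minimal sketch; the job ID and all values are hypothetical:

```python
def valid_daily_snapshot_retention(daily_after_days: int,
                                   model_snapshot_retention_days: int) -> bool:
    """Per the docs above, daily_model_snapshot_retention_after_days must be
    between 0 and model_snapshot_retention_days, inclusive."""
    return 0 <= daily_after_days <= model_snapshot_retention_days

# Hypothetical update body for an update_job call.
job_update = {
    "description": "Nightly web-logs anomaly job",
    "model_snapshot_retention_days": 10,
    "daily_model_snapshot_retention_after_days": 1,
    "results_retention_days": 60,
}
# With a connected client this would be:
# resp = es.ml.update_job(job_id="web-logs", **job_update)
```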
- update_model_snapshot(*, job_id, snapshot_id, description=None, error_trace=None, filter_path=None, human=None, pretty=None, retain=None, body=None)
Update a snapshot. Updates certain properties of a snapshot.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/ml-update-snapshot.html
- Parameters:
job_id (str) – Identifier for the anomaly detection job.
snapshot_id (str) – Identifier for the model snapshot.
description (str | None) – A description of the model snapshot.
retain (bool | None) – If true, this snapshot will not be deleted during automatic cleanup of snapshots older than model_snapshot_retention_days. However, this snapshot will be deleted when the job is deleted.
error_trace (bool | None)
human (bool | None)
pretty (bool | None)
- Return type: ObjectApiResponse[Any]
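A minimal sketch of marking a snapshot as retained so that automatic cleanup skips it; the job ID, snapshot ID, and description are hypothetical:

```python
# Hypothetical snapshot update: keep this snapshot past the normal
# model_snapshot_retention_days cleanup window.
snapshot_update = {
    "description": "Known-good baseline before config change",
    "retain": True,
}
# With a connected client this would be:
# resp = es.ml.update_model_snapshot(
#     job_id="web-logs", snapshot_id="1575402236", **snapshot_update)
```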
- update_trained_model_deployment(*, model_id, error_trace=None, filter_path=None, human=None, number_of_allocations=None, pretty=None, body=None)
Update a trained model deployment.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/update-trained-model-deployment.html
- Parameters:
model_id (str) – The unique identifier of the trained model. Currently, only PyTorch models are supported.
number_of_allocations (int | None) – The number of model allocations on each node where the model is deployed. All allocations on a node share the same copy of the model in memory but use a separate set of threads to evaluate the model. Increasing this value generally increases the throughput. If this setting is greater than the number of hardware threads, it is automatically changed to a value less than the number of hardware threads.
error_trace (bool | None)
human (bool | None)
pretty (bool | None)
- Return type: ObjectApiResponse[Any]
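The clamping behaviour described for `number_of_allocations` can be mirrored locally. A rough sketch under the assumption that an over-large request is reduced to below the hardware thread count; the exact server-side value is not specified in the docs:

```python
import os

def effective_allocations(requested: int) -> int:
    """Illustrative approximation: a requested allocation count above the
    hardware thread count is reduced to a value below it (the docs do not
    specify the exact reduced value, so this is an assumption)."""
    threads = os.cpu_count() or 1
    if requested <= threads:
        return requested
    return max(1, threads - 1)
```

Used before calling `update_trained_model_deployment`, this avoids requesting more allocations than the node can usefully run.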
- upgrade_job_snapshot(*, job_id, snapshot_id, error_trace=None, filter_path=None, human=None, pretty=None, timeout=None, wait_for_completion=None)
Upgrade a snapshot. Upgrades an anomaly detection model snapshot to the latest major version. Over time, older snapshot formats are deprecated and removed. Anomaly detection jobs support only snapshots that are from the current or previous major version. This API provides a means to upgrade a snapshot to the current major version. This aids in preparing the cluster for an upgrade to the next major version. Only one snapshot per anomaly detection job can be upgraded at a time and the upgraded snapshot cannot be the current snapshot of the anomaly detection job.
https://www.elastic.co/guide/en/elasticsearch/reference/8.16/ml-upgrade-job-model-snapshot.html
- Parameters:
job_id (str) – Identifier for the anomaly detection job.
snapshot_id (str) – A numerical character string that uniquely identifies the model snapshot.
timeout (str | Literal[-1] | Literal[0] | None) – Controls the time to wait for the request to complete.
wait_for_completion (bool | None) – When true, the API won’t respond until the upgrade is complete. Otherwise, it responds as soon as the upgrade task is assigned to a node.
error_trace (bool | None)
human (bool | None)
pretty (bool | None)
- Return type: ObjectApiResponse[Any]
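A minimal sketch of an `upgrade_job_snapshot` call that blocks until the upgrade finishes; the job ID and snapshot ID are hypothetical:

```python
# Hypothetical upgrade parameters, illustrating the options above.
upgrade_params = {
    "job_id": "web-logs",
    "snapshot_id": "1575402236",
    "timeout": "30m",              # how long to wait for completion
    "wait_for_completion": True,   # block until the upgrade finishes
}
# With a connected client this would be:
# resp = es.ml.upgrade_job_snapshot(**upgrade_params)
```

Note that only one snapshot per job can be upgraded at a time, so concurrent calls for the same job will fail.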
- validate(*, analysis_config=None, analysis_limits=None, data_description=None, description=None, error_trace=None, filter_path=None, human=None, job_id=None, model_plot=None, model_snapshot_id=None, model_snapshot_retention_days=None, pretty=None, results_index_name=None, body=None)
Validate an anomaly detection job.
https://www.elastic.co/guide/en/machine-learning/8.16/ml-jobs.html
- validate_detector(*, detector=None, body=None, error_trace=None, filter_path=None, human=None, pretty=None)
Validate an anomaly detection detector.
https://www.elastic.co/guide/en/machine-learning/8.16/ml-jobs.html
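A minimal sketch of a detector object that could be passed to `validate_detector`, with a trivial local pre-check; the field names follow the anomaly-detection detector schema, and the specific values are hypothetical:

```python
# Hypothetical detector configuration for a validate_detector call.
detector = {
    "function": "mean",            # the analysis function, e.g. mean, count
    "field_name": "responsetime",  # field the function operates on
    "by_field_name": "airline",    # split the analysis per airline
}

def has_analysis_function(d: dict) -> bool:
    # A detector must name an analysis function; this is a cheap local
    # sanity check before the server-side validation call.
    return bool(d.get("function"))

# With a connected client this would be:
# resp = es.ml.validate_detector(detector=detector)
```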