Rollup Indices
- class elasticsearch.client.RollupClient(client)
- Parameters:
client (BaseClient)
- delete_job(*, id, error_trace=None, filter_path=None, human=None, pretty=None)
Delete a rollup job. A job must be stopped before it can be deleted. If you attempt to delete a started job, an error occurs. Similarly, if you attempt to delete a nonexistent job, an exception occurs.
IMPORTANT: When you delete a job, you remove only the process that is actively monitoring and rolling up data. The API does not delete any previously rolled up data. This is by design; a user may wish to roll up a static data set. Because the data set is static, after it has been fully rolled up there is no need to keep the indexing rollup job around (as there will be no new data). Thus the job can be deleted, leaving behind the rolled up data for analysis.
If you wish to also remove the rollup data and the rollup index contains the data for only a single job, you can delete the whole rollup index. If the rollup index stores data from several jobs, you must issue a delete-by-query that targets the rollup job’s identifier in the rollup index. For example:
` POST my_rollup_index/_delete_by_query { "query": { "term": { "_rollup.id": "the_rollup_job_id" } } } `
https://www.elastic.co/guide/en/elasticsearch/reference/8.17/rollup-delete-job.html
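The stop, delete, and delete-by-query sequence described above could be sketched with the Python client roughly as follows. The helper name `delete_job_and_data` is hypothetical, and `client` is assumed to be an `elasticsearch.Elasticsearch` instance connected to a cluster that still has rollup features enabled.

```python
def delete_job_and_data(client, job_id, rollup_index):
    """Stop and delete a rollup job, then remove its rolled-up documents.

    Hypothetical helper, not part of the client library. `client` is
    assumed to be an elasticsearch.Elasticsearch instance.
    """
    # A job must be stopped before it can be deleted, so block until
    # the indexer has stopped (or the timeout elapses).
    client.rollup.stop_job(id=job_id, wait_for_completion=True, timeout="10s")
    client.rollup.delete_job(id=job_id)
    # Deleting the job leaves the rolled-up data in place. Remove only
    # the documents that carry this job's identifier.
    return client.delete_by_query(
        index=rollup_index,
        query={"term": {"_rollup.id": job_id}},
    )
```

If the rollup index holds data for this job only, deleting the whole index is simpler than the delete-by-query step.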
- get_jobs(*, id=None, error_trace=None, filter_path=None, human=None, pretty=None)
Get rollup job information. Get the configuration, stats, and status of rollup jobs. NOTE: This API returns only active (both STARTED and STOPPED) jobs. If a job was created, ran for a while, then was deleted, the API does not return any details about it. For details about a historical rollup job, the rollup capabilities API may be more useful.
https://www.elastic.co/guide/en/elasticsearch/reference/8.17/rollup-get-job.html
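As a small illustration, the response could be reduced to a mapping of job ID to state. This sketch assumes the response shape shown in the linked Elasticsearch reference (a top-level `jobs` list whose entries carry `config.id` and `status.job_state`); the helper name `job_states` is hypothetical.

```python
def job_states(response):
    """Map each rollup job ID to its reported state.

    Assumes `response` looks like
    {"jobs": [{"config": {"id": ...}, "status": {"job_state": ...}, ...}]}.
    """
    return {
        job["config"]["id"]: job["status"]["job_state"]
        for job in response["jobs"]
    }
```

It might be called as `job_states(client.rollup.get_jobs())`, where `client` is an assumed `elasticsearch.Elasticsearch` instance. Deleted jobs will not appear in the result.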
- get_rollup_caps(*, id=None, error_trace=None, filter_path=None, human=None, pretty=None)
Get the rollup job capabilities. Get the capabilities of any rollup jobs that have been configured for a specific index or index pattern. This API is useful because a rollup job is often configured to roll up only a subset of fields from the source index. Furthermore, only certain aggregations can be configured for various fields, leading to a limited subset of functionality depending on that configuration. This API enables you to inspect an index and determine:
1. Does this index have associated rollup data somewhere in the cluster?
2. If yes to the first question, what fields were rolled up, what aggregations can be performed, and where does the data live?
https://www.elastic.co/guide/en/elasticsearch/reference/8.17/rollup-get-rollup-caps.html
- Parameters:
- Return type:
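To answer the second question programmatically, the capabilities response could be flattened into the fields and aggregations available per job. The sketch below assumes the response shape from the linked Elasticsearch reference (index pattern keys, each with a `rollup_jobs` list whose entries have `job_id` and `fields`); the helper name `rolled_up_fields` is hypothetical, and its input is expected to be a plain dict such as the response body.

```python
def rolled_up_fields(caps):
    """Summarize which aggregations each job supports per field.

    Assumes `caps` looks like
    {"<index pattern>": {"rollup_jobs": [
        {"job_id": ..., "fields": {"<field>": [{"agg": ...}, ...]}}]}}.
    """
    summary = {}
    for pattern_caps in caps.values():
        for job in pattern_caps["rollup_jobs"]:
            summary[job["job_id"]] = {
                field: [entry["agg"] for entry in entries]
                for field, entries in job["fields"].items()
            }
    return summary
```

An empty result means no rollup job has been configured for the requested index or pattern.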
- get_rollup_index_caps(*, index, error_trace=None, filter_path=None, human=None, pretty=None)
Get the rollup index capabilities. Get the rollup capabilities of all jobs inside of a rollup index. A single rollup index may store the data for multiple rollup jobs and may have a variety of capabilities depending on those jobs. This API enables you to determine:
* What jobs are stored in an index (or indices specified via a pattern)?
* What target indices were rolled up, what fields were used in those rollups, and what aggregations can be performed on each job?
https://www.elastic.co/guide/en/elasticsearch/reference/8.17/rollup-get-rollup-index-caps.html
- put_job(*, id, cron=None, groups=None, index_pattern=None, page_size=None, rollup_index=None, error_trace=None, filter_path=None, headers=None, human=None, metrics=None, pretty=None, timeout=None, body=None)
Create a rollup job.
WARNING: From 8.15.0, calling this API in a cluster with no rollup usage will fail with a message about the deprecation and planned removal of rollup features. A cluster needs to contain either a rollup job or a rollup index in order for this API to be allowed to run.
The rollup job configuration contains all the details about how the job should run, when it indexes documents, and what future queries will be able to run against the rollup index. There are three main sections to the job configuration: the logistical details about the job (for example, the cron schedule), the fields that are used for grouping, and what metrics to collect for each group.
Jobs are created in a STOPPED state. You can start them with the start rollup jobs API.
https://www.elastic.co/guide/en/elasticsearch/reference/8.17/rollup-put-job.html
- Parameters:
id (str) – Identifier for the rollup job. This can be any alphanumeric string and uniquely identifies the data that is associated with the rollup job. The ID is persistent; it is stored with the rolled up data. If you create a job, let it run for a while, then delete the job, the data that the job rolled up is still associated with this job ID. You cannot create a new job with the same ID since that could lead to problems with mismatched job configurations.
cron (str | None) – A cron string which defines the intervals when the rollup job should be executed. When the interval triggers, the indexer attempts to roll up the data in the index pattern. The cron pattern is unrelated to the time interval of the data being rolled up. For example, you may wish to create hourly rollups of your documents but to only run the indexer on a daily basis at midnight, as defined by the cron. The cron pattern is defined just like a Watcher cron schedule.
groups (Mapping[str, Any] | None) – Defines the grouping fields and aggregations that are defined for this rollup job. These fields will then be available later for aggregating into buckets. These aggs and fields can be used in any combination. Think of the groups configuration as defining a set of tools that can later be used in aggregations to partition the data. Unlike raw data, we have to think ahead to which fields and aggregations might be used. Rollups provide enough flexibility that you simply need to determine which fields are needed, not in what order they are needed.
index_pattern (str | None) – The index or index pattern to roll up. Supports wildcard-style patterns (logstash-*). The job attempts to roll up the entire index or index pattern.
page_size (int | None) – The number of bucket results that are processed on each iteration of the rollup indexer. A larger value tends to execute faster, but requires more memory during processing. This value has no effect on how the data is rolled up; it is merely used for tweaking the speed or memory cost of the indexer.
rollup_index (str | None) – The index that contains the rollup results. The index can be shared with other rollup jobs. The data is stored so that it doesn’t interfere with unrelated jobs.
metrics (Sequence[Mapping[str, Any]] | None) – Defines the metrics to collect for each grouping tuple. By default, only the doc_counts are collected for each group. To make rollup useful, you will often add metrics like averages, mins, maxes, etc. Metrics are defined on a per-field basis and for each field you configure which metric should be collected.
timeout (str | Literal[-1] | Literal[0] | None) – Time to wait for the request to complete.
error_trace (bool | None)
human (bool | None)
pretty (bool | None)
- Return type:
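A minimal sketch of the three configuration sections (logistics, grouping fields, metrics) might look like the following. The job ID, index names, and field names (`timestamp`, `node`, `temperature`) are illustrative assumptions, `client` is assumed to be an `elasticsearch.Elasticsearch` instance, and the `groups` and `metrics` structures follow the linked Elasticsearch reference.

```python
def create_sensor_rollup_job(client):
    """Create an hourly rollup job whose indexer runs daily at midnight.

    Hypothetical helper; all IDs, index names, and field names are
    illustrative only.
    """
    return client.rollup.put_job(
        # Logistical details.
        id="sensor",
        index_pattern="sensor-*",
        rollup_index="sensor_rollup",
        cron="0 0 0 * * ?",  # Watcher-style cron: every day at 00:00:00
        page_size=1000,      # affects indexer speed/memory, not the result
        # Grouping fields: hourly buckets, further split by node.
        groups={
            "date_histogram": {"field": "timestamp", "fixed_interval": "1h"},
            "terms": {"fields": ["node"]},
        },
        # Metrics collected for each group, configured per field.
        metrics=[
            {"field": "temperature", "metrics": ["min", "max", "avg"]},
        ],
    )
```

The cron schedule (daily) and the `fixed_interval` of the histogram (hourly) are deliberately different here, since the two are unrelated. The job is created in a STOPPED state and still needs to be started.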
- rollup_search(*, index, aggregations=None, aggs=None, error_trace=None, filter_path=None, human=None, pretty=None, query=None, rest_total_hits_as_int=None, size=None, typed_keys=None, body=None)
Search rolled-up data. The rollup search endpoint is needed because, internally, rolled-up documents utilize a different document structure than the original data. It rewrites standard Query DSL into a format that matches the rollup documents, then takes the response and rewrites it back to what a client would expect given the original query.
The request body supports a subset of features from the regular search API. The following functionality is not available:
* size: Because rollups work on pre-aggregated data, no search hits can be returned and so size must be set to zero or omitted entirely.
* highlighter, suggestors, post_filter, profile, explain: These are similarly disallowed.
Searching both historical rollup and non-rollup data
The rollup search API has the capability to search across both “live” non-rollup data and the aggregated rollup data. This is done by simply adding the live indices to the URI. For example:
` GET sensor-1,sensor_rollup/_rollup_search { "size": 0, "aggregations": { "max_temperature": { "max": { "field": "temperature" } } } } `
The rollup search endpoint does two things when the search runs:
* The original request is sent to the non-rollup index unaltered.
* A rewritten version of the original request is sent to the rollup index.
When the two responses are received, the endpoint rewrites the rollup response and merges the two together. During the merging process, if there is any overlap in buckets between the two responses, the buckets from the non-rollup index are used.
https://www.elastic.co/guide/en/elasticsearch/reference/8.17/rollup-search.html
- Parameters:
index (str | Sequence[str]) – A comma-separated list of data streams and indices used to limit the request. This parameter has the following rules:
* At least one data stream, index, or wildcard expression must be specified. This target can include a rollup or non-rollup index. For data streams, the stream’s backing indices can only serve as non-rollup indices. Omitting the parameter or using _all is not permitted.
* Multiple non-rollup indices may be specified.
* Only one rollup index may be specified. If more than one is supplied, an exception occurs.
* Wildcard expressions (*) may be used. If they match more than one rollup index, an exception occurs. However, you can use an expression to match multiple non-rollup indices or data streams.
aggregations (Mapping[str, Mapping[str, Any]] | None) – Specifies aggregations.
aggs (Mapping[str, Mapping[str, Any]] | None) – Specifies aggregations.
query (Mapping[str, Any] | None) – Specifies a DSL query that is subject to some limitations.
rest_total_hits_as_int (bool | None) – Indicates whether hits.total should be rendered as an integer or an object in the REST search response.
size (int | None) – Must be zero if set, as rollups work on pre-aggregated data.
typed_keys (bool | None) – Specify whether aggregation and suggester names should be prefixed by their respective types in the response.
error_trace (bool | None)
human (bool | None)
pretty (bool | None)
- Return type:
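With the Python client, a combined search over a live index and a rollup index could be sketched as below. The index names `sensor-1` and `sensor_rollup` and the `temperature` field are taken from the REST example; the helper name `max_temperature` is hypothetical, and `client` is assumed to be an `elasticsearch.Elasticsearch` instance.

```python
def max_temperature(client, live_index="sensor-1", rollup_index="sensor_rollup"):
    """Run a max aggregation across one live index and one rollup index.

    Hypothetical helper. Only one rollup index may be included in `index`.
    """
    return client.rollup.rollup_search(
        index=[live_index, rollup_index],
        size=0,  # rollups hold pre-aggregated data, so no hits are returned
        aggregations={
            "max_temperature": {"max": {"field": "temperature"}},
        },
    )
```

Where buckets from the two indices overlap, the merged response uses the ones from the live index.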
- start_job(*, id, error_trace=None, filter_path=None, human=None, pretty=None)
Start rollup jobs. If you try to start a job that does not exist, an exception occurs. If you try to start a job that is already started, nothing happens.
https://www.elastic.co/guide/en/elasticsearch/reference/8.17/rollup-start-job.html
- stop_job(*, id, error_trace=None, filter_path=None, human=None, pretty=None, timeout=None, wait_for_completion=None)
Stop rollup jobs. If you try to stop a job that does not exist, an exception occurs. If you try to stop a job that is already stopped, nothing happens. Since only a stopped job can be deleted, it can be useful to block the API until the indexer has fully stopped. This is accomplished with the wait_for_completion query parameter, and optionally a timeout. For example:
` POST _rollup/job/sensor/_stop?wait_for_completion=true&timeout=10s `
The parameter blocks the API call from returning until either the job has moved to STOPPED or the specified time has elapsed. If the specified time elapses without the job moving to STOPPED, a timeout exception occurs.
https://www.elastic.co/guide/en/elasticsearch/reference/8.17/rollup-stop-job.html
- Parameters:
id (str) – Identifier for the rollup job.
timeout (str | Literal[-1] | Literal[0] | None) – If wait_for_completion is true, the API blocks for (at maximum) the specified duration while waiting for the job to stop. If more than timeout time has passed, the API throws a timeout exception. NOTE: Even if a timeout occurs, the stop request is still processing and eventually moves the job to STOPPED. The timeout simply means the API call itself timed out while waiting for the status change.
wait_for_completion (bool | None) – If set to true, causes the API to block until the indexer state completely stops. If set to false, the API returns immediately and the indexer is stopped asynchronously in the background.
error_trace (bool | None)
human (bool | None)
pretty (bool | None)
- Return type:
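The REST example for the `sensor` job translates to the Python client roughly as follows. This is a sketch only: `stop_job_blocking` is a hypothetical wrapper, and `client` is assumed to be an `elasticsearch.Elasticsearch` instance.

```python
def stop_job_blocking(client, job_id, timeout="10s"):
    """Stop a rollup job and wait until it reports STOPPED.

    Hypothetical wrapper. If `timeout` elapses first, the call fails,
    but the stop request itself keeps processing on the cluster.
    """
    return client.rollup.stop_job(
        id=job_id,
        wait_for_completion=True,
        timeout=timeout,
    )
```

For example, `stop_job_blocking(client, "sensor")` mirrors `POST _rollup/job/sensor/_stop?wait_for_completion=true&timeout=10s`.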