Helpers
Collection of simple helper functions that abstract some specifics of the raw
API.
-
elasticsearch.helpers.bulk_index(client, docs, chunk_size=500, stats_only=False, raise_on_error=False, **kwargs)
Helper for the bulk() API that provides
a more human-friendly interface: it consumes an iterator of documents and
sends them to Elasticsearch in chunks.
This function expects each doc to be in the format returned by
search(), for example:
    {
        '_index': 'index-name',
        '_type': 'document',
        '_id': 42,
        '_parent': 5,
        '_ttl': '1d',
        '_source': {
            ...
        }
    }
Alternatively, if _source is not present, it will pop all metadata fields
from the doc and use the rest as the document data.
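The two accepted document shapes can be illustrated with a minimal sketch. It is not the library's implementation: split_doc and METADATA_KEYS are hypothetical names, and the key list is assumed from the example document above (the real helper may recognise other metadata fields).

```python
# Metadata keys taken from the example document above; the real helper may
# recognise a different set (assumption).
METADATA_KEYS = ("_index", "_type", "_id", "_parent", "_ttl")


def split_doc(doc):
    """Return (metadata, data) for one document. Hypothetical helper."""
    doc = dict(doc)  # shallow copy so the caller's dict is not mutated
    metadata = {key: doc.pop(key) for key in METADATA_KEYS if key in doc}
    # Prefer an explicit _source; otherwise whatever is left is the data.
    data = doc.pop("_source", doc)
    return metadata, data


# Same document, once in the search() format and once "flat".
search_style = {"_index": "index-name", "_type": "document", "_id": 42,
                "_source": {"title": "Hello"}}
flat_style = {"_index": "index-name", "_type": "document", "_id": 42,
              "title": "Hello"}

# Both shapes yield the same metadata and the same document data.
assert split_doc(search_style) == split_doc(flat_style)
```

Under these assumptions both shapes end up as the same metadata and data pair, which is why either one can be passed in.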
Parameters:
- client – instance of Elasticsearch to use
- docs – iterator containing the docs
- chunk_size – number of docs in one chunk sent to Elasticsearch (default: 500)
- stats_only – if True, report only the number of successful and failed
operations, instead of the number of successful operations and a list of error responses
- raise_on_error – if True, raise BulkIndexError when some documents fail to
index (and stop sending chunks to the server)
Any additional keyword arguments will be passed to the bulk API itself.
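As a rough illustration of what chunk_size means, the sketch below groups an iterator of docs into lists of at most chunk_size items. The chunks function is a hypothetical name used only here; how the helper batches internally is not shown on this page.

```python
from itertools import islice


def chunks(docs, chunk_size=500):
    """Yield lists of at most chunk_size docs from any iterable."""
    iterator = iter(docs)
    while True:
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


# A generator works as well as a list: docs are consumed lazily.
docs = ({"_index": "index-name", "_type": "document", "_id": i, "n": i}
        for i in range(1200))
sizes = [len(chunk) for chunk in chunks(docs, chunk_size=500)]
print(sizes)  # [500, 500, 200]
```

With the helper itself, the equivalent call would follow the signature above, for example bulk_index(client, docs, chunk_size=500), with one bulk request sent per chunk.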
-
elasticsearch.helpers.scan(client, query=None, scroll='5m', **kwargs)
Simple abstraction on top of the
scroll() API: a simple iterator that
yields all hits as returned by the underlying scroll requests.
Parameters:
- client – instance of Elasticsearch to use
- query – body for the search() API
- scroll – how long a consistent view of the index should be
maintained for the scrolled search
Any additional keyword arguments will be passed to the initial
search() call.
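The scroll pattern that this helper wraps can be sketched as follows. This is an illustration under stated assumptions, not the library's code: FakeClient is a hypothetical stand-in that serves canned pages so the example runs without a cluster, and scan_sketch is a hypothetical name. Only the response keys `_scroll_id` and `hits.hits` are taken from the usual Elasticsearch response shape.

```python
class FakeClient:
    """Hypothetical stand-in for an Elasticsearch client; serves canned pages."""

    def __init__(self, pages):
        self._pages = list(pages)

    def _next_page(self):
        hits = self._pages.pop(0) if self._pages else []
        return {"_scroll_id": "fake-scroll-id", "hits": {"hits": hits}}

    def search(self, body=None, scroll="5m", **kwargs):
        return self._next_page()

    def scroll(self, scroll_id, scroll="5m"):
        return self._next_page()


def scan_sketch(client, query=None, scroll="5m", **kwargs):
    """Yield every hit, requesting further pages until one comes back empty."""
    response = client.search(body=query, scroll=scroll, **kwargs)
    while response["hits"]["hits"]:
        for hit in response["hits"]["hits"]:
            yield hit
        response = client.scroll(response["_scroll_id"], scroll=scroll)


client = FakeClient([[{"_id": 1}, {"_id": 2}], [{"_id": 3}]])
ids = [hit["_id"] for hit in scan_sketch(client, query={"query": {"match_all": {}}})]
print(ids)  # [1, 2, 3]
```

The caller sees one flat iterator of hits and never handles scroll ids directly, which is the convenience scan() is described as providing.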
-
elasticsearch.helpers.reindex(client, source_index, target_index, target_client=None, chunk_size=500, scroll='5m')
Reindex all documents from one index to another, potentially (if
target_client is specified) on a different cluster.
Note
This helper doesn’t transfer mappings, just the data.
Parameters:
- client – instance of Elasticsearch to use (for
reading only, if target_client is specified as well)
- source_index – index (or list of indices) to read documents from
- target_index – name of the index in the target cluster to populate
- target_client – optional; if specified, it will be used for writing (thus
enabling reindex between clusters)
- chunk_size – number of docs in one chunk sent to Elasticsearch (default: 500)
- scroll – how long a consistent view of the index should be
maintained for the scrolled search
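Conceptually, a reindex reads hits from the source index, points each one at the target index, and writes it with whichever client is responsible for writing. The sketch below shows only those two decisions; retarget and pick_writer are hypothetical names, and the library's actual implementation may differ.

```python
def retarget(hits, target_index):
    """Yield copies of hits with _index pointing at target_index."""
    for hit in hits:
        doc = dict(hit)  # copy so the original hit is left untouched
        doc["_index"] = target_index
        yield doc


def pick_writer(client, target_client=None):
    """Write with target_client when given, otherwise reuse client."""
    return client if target_client is None else target_client


hits = [
    {"_index": "old-index", "_type": "document", "_id": 1, "_source": {"n": 1}},
    {"_index": "old-index", "_type": "document", "_id": 2, "_source": {"n": 2}},
]
moved = list(retarget(hits, "new-index"))
print([doc["_index"] for doc in moved])  # ['new-index', 'new-index']
```

Only the documents move in this picture; as noted above, mappings are not transferred, so the target index should already have the mapping it needs.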