Helpers

Collection of simple helper functions that abstract some specifics of the raw API.

elasticsearch.helpers.bulk_index(client, docs, chunk_size=500, stats_only=False, raise_on_error=False, **kwargs)

Helper for the bulk() API that provides a more human-friendly interface: it consumes an iterator of documents and sends them to Elasticsearch in chunks.

This function expects each doc to be in the format returned by search(), for example:

{
    '_index': 'index-name',
    '_type': 'document',
    '_id': 42,
    '_parent': 5,
    '_ttl': '1d',
    '_source': {
        ...
    }
}

Alternatively, if _source is not present, it will pop all metadata fields from the doc and use the rest as the document data.
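
The two accepted shapes can be illustrated with a minimal sketch. The name `split_action` and the `METADATA_FIELDS` tuple are hypothetical and cover only the fields shown in the example above; the library's own list and implementation may differ.

```python
# Hypothetical illustration, not the library's implementation.
# Assumption: only the metadata fields shown in the example above.
METADATA_FIELDS = ('_index', '_type', '_id', '_parent', '_ttl')


def split_action(doc):
    """Return (metadata, data) for a single document dict."""
    doc = dict(doc)  # shallow copy so the caller's dict is left untouched
    metadata = {}
    for field in METADATA_FIELDS:
        if field in doc:
            metadata[field] = doc.pop(field)
    # If '_source' is present it is the document body; otherwise whatever
    # remains after removing the metadata fields is the body.
    data = doc.pop('_source', doc)
    return metadata, data


# The same document written in both accepted shapes.
search_style = {'_index': 'index-name', '_type': 'document', '_id': 42,
                '_source': {'title': 'Hello'}}
flat_style = {'_index': 'index-name', '_type': 'document', '_id': 42,
              'title': 'Hello'}
```

Under these assumptions both dicts split into the same metadata and the same document data.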

Parameters:
  • client – instance of Elasticsearch to use
  • docs – iterator containing the docs
  • chunk_size – number of docs in one chunk sent to es (default: 500)
  • stats_only – if True, only report the number of successful/failed operations instead of the number of successful operations and a list of error responses
  • raise_on_error – raise BulkIndexError if some documents failed to index (and stop sending chunks to the server)

Any additional keyword arguments will be passed to the bulk API itself.
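
The effect of chunk_size can be pictured with a small, library-independent sketch. The `chunked` generator below is a hypothetical helper written for illustration; it only shows how an iterator of docs could be grouped into batches and does not talk to Elasticsearch.

```python
from itertools import islice


def chunked(docs, chunk_size=500):
    """Yield lists of at most chunk_size items from any iterable."""
    iterator = iter(docs)
    while True:
        # islice consumes up to chunk_size items without loading the rest.
        chunk = list(islice(iterator, chunk_size))
        if not chunk:
            return
        yield chunk


sizes = [len(chunk) for chunk in chunked(range(1200), chunk_size=500)]
print(sizes)  # [500, 500, 200]
```

Because the input is consumed lazily, a generator of docs never has to fit in memory as a whole; each batch would correspond to one request to the bulk API.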

elasticsearch.helpers.scan(client, query=None, scroll='5m', **kwargs)

Simple abstraction on top of the scroll() API: a simple iterator that yields all hits as returned by the underlying scroll requests.

Parameters:
  • client – instance of Elasticsearch to use
  • query – body for the search() api
  • scroll – Specify how long a consistent view of the index should be maintained for scrolled search

Any additional keyword arguments will be passed to the initial search() call.
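
The scrolling pattern behind such an iterator can be sketched as follows. This is an illustration under stated assumptions, not the helper's source: `FakeClient` is a hypothetical stand-in that serves canned pages, the response shape (`_scroll_id`, `hits.hits`) is assumed to mirror a search response, and the real helper may differ in details such as the search type used for the initial request.

```python
class FakeClient:
    """Hypothetical stand-in for a client; serves canned pages of hits."""

    def __init__(self, pages):
        self._pages = list(pages)

    def _next_page(self):
        hits = self._pages.pop(0) if self._pages else []
        return {'_scroll_id': 'dummy-scroll-id', 'hits': {'hits': hits}}

    def search(self, body=None, scroll='5m', **kwargs):
        return self._next_page()

    def scroll(self, scroll_id, scroll='5m'):
        return self._next_page()


def scan_sketch(client, query=None, scroll='5m', **kwargs):
    """Yield every hit, following scroll ids until a page comes back empty."""
    response = client.search(body=query, scroll=scroll, **kwargs)
    while True:
        hits = response['hits']['hits']
        if not hits:
            break
        for hit in hits:
            yield hit
        # Ask for the next page using the scroll id from the last response.
        response = client.scroll(response['_scroll_id'], scroll=scroll)


client = FakeClient([[{'_id': 1}, {'_id': 2}], [{'_id': 3}]])
ids = [hit['_id'] for hit in scan_sketch(client, query={'query': {'match_all': {}}})]
```

The caller sees one flat stream of hits and never handles scroll ids directly, which is the convenience the helper is described as providing.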

elasticsearch.helpers.reindex(client, source_index, target_index, target_client=None, chunk_size=500, scroll='5m')

Reindex all documents from one index to another, potentially (if target_client is specified) on a different cluster.

Note

This helper doesn’t transfer mappings, just the data.

Parameters:
  • client – instance of Elasticsearch to use (for read if target_client is specified as well)
  • source_index – index (or list of indices) to read documents from
  • target_index – name of the index in the target cluster to populate
  • target_client – optional, if specified will be used for writing (thus enabling reindex between clusters)
  • chunk_size – number of docs in one chunk sent to es (default: 500)
  • scroll – Specify how long a consistent view of the index should be maintained for scrolled search
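
Conceptually, reindexing can be seen as streaming hits out of the source index, pointing each one at the target index, and sending the results in bulk. The sketch below shows only the middle step, assuming hits use the search() format described earlier; `retarget` is a hypothetical name and not part of the library.

```python
def retarget(hits, target_index):
    """Yield copies of hits with '_index' rewritten to target_index."""
    for hit in hits:
        action = dict(hit)  # shallow copy; the original hit is not modified
        action['_index'] = target_index
        yield action


hits = [
    {'_index': 'old', '_type': 'document', '_id': 1, '_source': {'title': 'a'}},
    {'_index': 'old', '_type': 'document', '_id': 2, '_source': {'title': 'b'}},
]
actions = list(retarget(hits, 'new'))
```

Since only the data is transferred, any mappings the target index needs would have to be created there before the documents are written.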