Connection Layer API¶

This section covers the classes responsible for handling the connection to the Elasticsearch cluster. The default subclasses can be overridden by passing parameters to the Elasticsearch class. All of the arguments to the client will be passed on to Transport, ConnectionPool and Connection.

For example, if you want to use your own implementation of the ConnectionSelector class, pass it in as the selector_class parameter.

Note

ConnectionPool and related options (like selector_class) will only be used if more than one connection is defined, either directly or via the sniffing mechanism.
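As a rough illustration of the selector_class hook, the sketch below defines a hypothetical FirstLiveSelector. It is written as a plain class so that it runs without the library installed; in real use it would presumably subclass ConnectionSelector. The class name, host names and connection labels are all made up for the example.

```python
class FirstLiveSelector(object):
    """Hypothetical selector that always prefers the first live connection.

    Assumption: real code would subclass elasticsearch.ConnectionSelector;
    a plain class with the same two methods keeps this sketch self-contained.
    """

    def __init__(self, opts):
        # opts: dictionary of connection instances and their options
        self.opts = opts

    def select(self, connections):
        # connections: list of currently live connections
        return connections[0]


# With the library installed, the class would be handed to the client, e.g.:
#   from elasticsearch import Elasticsearch
#   es = Elasticsearch(["node1", "node2"], selector_class=FirstLiveSelector)
# Two hosts are listed because the selector is only consulted when more than
# one connection is defined.

selector = FirstLiveSelector({})
chosen = selector.select(["conn-a", "conn-b"])  # strings stand in for connections
```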

Transport¶

class elasticsearch.Transport(hosts, connection_class=Urllib3HttpConnection, connection_pool_class=ConnectionPool, nodes_to_host_callback=construct_hosts_list, sniff_on_start=False, sniffer_timeout=None, sniff_on_connection_fail=False, serializer=JSONSerializer(), max_retries=3, **kwargs)¶

Encapsulation of transport-related logic. Handles instantiation of the individual connections as well as creating a connection pool to hold them.

Main interface is the perform_request method.

Parameters:

hosts – list of dictionaries, each containing keyword arguments to create a connection_class instance
connection_class – subclass of Connection to use
connection_pool_class – subclass of ConnectionPool to use
host_info_callback – callback responsible for taking the node information from /_cluster/nodes, along with already extracted information, and producing a list of arguments (same as hosts parameter)
sniff_on_start – flag indicating whether to obtain a list of nodes from the cluster at startup time
sniffer_timeout – number of seconds between automatic sniffs
sniff_on_connection_fail – flag controlling if connection failure triggers a sniff
sniff_timeout – timeout used for the sniff request. It should be a fast API call that may have to talk to several nodes, so it should fail quickly.
serializer – serializer instance
serializers – optional dict of serializer instances that will be used for deserializing data coming from the server. (key is the mimetype)
default_mimetype – when no mimetype is specified by the server response assume this mimetype, defaults to ‘application/json’
max_retries – maximum number of retries before an exception is propagated
retry_on_status – set of HTTP status codes on which we should retry on a different node. Defaults to (503, 504)
retry_on_timeout – whether a timeout should trigger a retry on a different node (default False)
send_get_body_as – for GET requests with a body, this option allows you to specify an alternate way of execution for environments that don’t support passing bodies with GET requests. If set to ‘POST’, a POST method will be used instead; if set to ‘source’, the body will be serialized and passed as the query parameter source.
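The send_get_body_as behaviour described in the last parameter can be sketched roughly as follows. The helper name prepare_get_with_body is hypothetical, json.dumps stands in for whatever serializer is configured, and the default of 'GET' (leave the request untouched) is an assumption that the text above does not state.

```python
import json


def prepare_get_with_body(method, params, body, send_get_body_as="GET"):
    """Hypothetical helper modelling the send_get_body_as option.

    Returns the (method, params, body) triple that would be handed to the
    connection. Not the library's implementation.
    """
    if body is None:
        return method, params, body
    body = json.dumps(body)  # stand-in for the configured serializer
    if method == "GET" and send_get_body_as != "GET":
        if send_get_body_as == "POST":
            method = "POST"              # same body, different HTTP method
        elif send_get_body_as == "source":
            params = dict(params or {})
            params["source"] = body      # body travels as a query parameter
            body = None
    return method, params, body


query = {"query": {"match_all": {}}}
as_post = prepare_get_with_body("GET", None, query, send_get_body_as="POST")
as_source = prepare_get_with_body("GET", {"size": 1}, query, send_get_body_as="source")
```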

Any extra keyword arguments will be passed to the connection_class when creating an instance unless overridden by that connection’s options provided as part of the hosts parameter.
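The precedence rule for extra keyword arguments can be pictured with a small sketch. The helper build_connection_kwargs is hypothetical and only models the merge order; the host names and option values are placeholders.

```python
def build_connection_kwargs(hosts, **kwargs):
    """Hypothetical helper: merge shared kwargs with per-host options."""
    merged = []
    for host in hosts:
        options = dict(kwargs)   # start from the shared keyword arguments
        options.update(host)     # per-host settings take precedence
        merged.append(options)
    return merged


hosts = [
    {"host": "node1", "port": 9200},
    {"host": "node2", "port": 9200, "use_ssl": False},
]
per_connection = build_connection_kwargs(hosts, use_ssl=True, maxsize=25)
```

Here node1 inherits use_ssl=True from the shared arguments, while node2 keeps its own use_ssl=False.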

add_connection(host)¶

Create a new Connection instance and add it to the pool.

Parameters:	host – kwargs that will be used to create the instance

get_connection()¶

Retrieve a Connection instance from the ConnectionPool instance.

mark_dead(connection)¶

Mark a connection as dead (failed) in the connection pool. If sniffing on failure is enabled this will initiate the sniffing process.

Parameters:	connection – instance of `Connection` that failed

perform_request(method, url, params=None, body=None)¶

Perform the actual request. Retrieve a connection from the connection pool, pass all the information to its perform_request method and return the data.

If an exception was raised, mark the connection as failed and retry (up to max_retries times).

If the operation was successful and the connection used was previously marked as dead, mark it as live, resetting its failure count.

Parameters:

method – HTTP method to use
url – absolute url (without host) to target
params – dictionary of query parameters, will be handed over to the underlying Connection class for serialization
body – body of the request, will be serialized using serializer and passed to the connection
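The retry behaviour of perform_request can be modelled with a short, self-contained sketch. FakeConnection and MiniTransport are hypothetical stand-ins rather than the library's classes, IOError stands in for the library's connection errors, and the plain rotation in get_connection is a simplification of what the connection pool does.

```python
class FakeConnection(object):
    """Stand-in connection that fails a fixed number of times, then succeeds."""

    def __init__(self, name, failures=0):
        self.name = name
        self.failures = failures  # number of calls that will fail first

    def perform_request(self, method, url, params=None, body=None):
        if self.failures > 0:
            self.failures -= 1
            raise IOError("%s is unreachable" % self.name)
        return {"served_by": self.name, "url": url}


class MiniTransport(object):
    """Hypothetical, heavily simplified model of the retry loop."""

    def __init__(self, connections, max_retries=3):
        self.connections = list(connections)
        self.max_retries = max_retries
        self.dead = set()   # names of connections currently marked as dead
        self._next = 0

    def get_connection(self):
        # Plain rotation; a real pool would skip dead connections.
        connection = self.connections[self._next % len(self.connections)]
        self._next += 1
        return connection

    def perform_request(self, method, url, params=None, body=None):
        for attempt in range(self.max_retries + 1):
            connection = self.get_connection()
            try:
                data = connection.perform_request(method, url, params, body)
            except IOError:
                self.dead.add(connection.name)      # mark as failed
                if attempt == self.max_retries:
                    raise                           # retries exhausted
            else:
                self.dead.discard(connection.name)  # mark as live again
                return data


transport = MiniTransport([FakeConnection("node1", failures=1), FakeConnection("node2")])
first = transport.perform_request("GET", "/_cluster/health")
dead_after_first = set(transport.dead)
second = transport.perform_request("GET", "/_cluster/health")
```

In this run the first request fails on node1 and is retried on node2; the second request succeeds on node1, which clears its dead mark.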

set_connections(hosts)¶

Instantiate all the connections and create a new connection pool to hold them. Tries to identify unchanged hosts and re-use existing Connection instances.

Parameters:	hosts – same as __init__

sniff_hosts()¶

Obtain a list of nodes from the cluster and create a new connection pool using the information retrieved.

To extract the node connection parameters use the nodes_to_host_callback.

Connection Pool¶

class elasticsearch.ConnectionPool(connections, dead_timeout=60, selector_class=RoundRobinSelector, randomize_hosts=True, **kwargs)¶

Container holding the Connection instances, managing the selection process (via a ConnectionSelector) and dead connections.

Its only interactions are with the Transport class that drives all the actions within ConnectionPool.

Initially connections are stored on the class as a list and, along with the connection options, get passed to the ConnectionSelector instance for future reference.

Upon each request the Transport will ask for a Connection via the get_connection method. If the connection fails (its perform_request raises a ConnectionError) it will be marked as dead (via mark_dead) and put on a timeout. If it fails N times in a row the timeout grows exponentially; the formula is dead_timeout * 2 ** (fail_count - 1). When the timeout is over the connection will be resurrected and returned to the live pool. A connection that has been previously marked as dead and then succeeds will be marked as live (its fail count will be deleted).

Parameters:

connections – list of tuples containing the Connection instance and its options
dead_timeout – number of seconds a connection should be retired for after a failure, increases on consecutive failures
timeout_cutoff – number of consecutive failures after which the timeout doesn’t increase
selector_class – ConnectionSelector subclass to use if more than one connection is live
randomize_hosts – shuffle the list of connections upon arrival to avoid a dog-piling effect across processes
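To make the back-off arithmetic concrete, here is a minimal sketch of the timeout formula, taking the base value to be the dead_timeout parameter. The function name is hypothetical, and both the default timeout_cutoff of 5 and the way the cutoff caps the exponent are assumptions made for illustration; the text above only says that the timeout stops increasing after that many consecutive failures.

```python
def dead_timeout_for(fail_count, dead_timeout=60, timeout_cutoff=5):
    """Seconds a connection stays retired after fail_count consecutive failures.

    Assumption: the cutoff is applied by capping the exponent.
    """
    exponent = min(fail_count - 1, timeout_cutoff)
    return dead_timeout * 2 ** exponent


# Timeouts for 1 to 8 consecutive failures with the values assumed above.
timeouts = [dead_timeout_for(n) for n in range(1, 9)]
# → [60, 120, 240, 480, 960, 1920, 1920, 1920]
```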

get_connection()¶

Return a connection from the pool using the ConnectionSelector instance.

It tries to resurrect eligible connections, forces a resurrection when no connections are available, and passes the list of live connections to the selector instance to choose from.

Returns a connection instance and its current fail count.

mark_dead(connection, now=None)¶

Mark the connection as dead (failed). Remove it from the live pool and put it on a timeout.

Parameters:	connection – the failed instance

mark_live(connection)¶

Mark connection as healthy after a resurrection. Resets the fail counter for the connection.

Parameters:	connection – the connection to redeem

resurrect(force=False)¶

Attempt to resurrect a connection from the dead pool. It will try to locate one (not all) eligible connection (one whose timeout is over) to return to the live pool. Any resurrected connection is also returned.

Parameters:	force – resurrect a connection even if there is none eligible (used when we have no live connections). If force is specified resurrect always returns a connection.
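The interplay of mark_dead, mark_live, resurrect and get_connection can be sketched with a small model. MiniPool is hypothetical and much simpler than the real pool: it uses strings as connections, a heap as the dead pool, no selector, and an extra now argument on resurrect and get_connection (not part of the documented signatures) so that the example is deterministic.

```python
import heapq
import time


class MiniPool(object):
    """Hypothetical, simplified model of the live/dead bookkeeping."""

    def __init__(self, connections, dead_timeout=60):
        self.connections = list(connections)  # live connections
        self.dead = []                        # heap of (revive_at, connection)
        self.dead_count = {}                  # consecutive failures per connection
        self.dead_timeout = dead_timeout

    def mark_dead(self, connection, now=None):
        now = now if now is not None else time.time()
        if connection in self.connections:
            self.connections.remove(connection)
        count = self.dead_count.get(connection, 0) + 1
        self.dead_count[connection] = count
        timeout = self.dead_timeout * 2 ** (count - 1)
        heapq.heappush(self.dead, (now + timeout, connection))

    def mark_live(self, connection):
        self.dead_count.pop(connection, None)  # reset the fail counter

    def resurrect(self, force=False, now=None):
        now = now if now is not None else time.time()
        if not self.dead:
            return None
        revive_at, connection = self.dead[0]
        if not force and revive_at > now:
            return None                        # nothing eligible yet
        heapq.heappop(self.dead)
        self.connections.append(connection)
        return connection

    def get_connection(self, now=None):
        self.resurrect(now=now)
        if not self.connections:
            self.resurrect(force=True, now=now)
        return self.connections[0]  # a real pool would ask the selector here


pool = MiniPool(["node1", "node2"], dead_timeout=60)
pool.mark_dead("node1", now=0)         # retired until t=60
early = pool.resurrect(now=30)         # not eligible yet
late = pool.resurrect(now=61)          # timeout is over
pool.mark_dead("node1", now=100)       # second failure in a row: 120 seconds
pool.mark_dead("node2", now=100)       # first failure: 60 seconds
forced = pool.get_connection(now=101)  # no live connections, so one is forced back
```

The forced resurrection picks node2 because its timeout would have expired first.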

Connection Selector¶

class elasticsearch.ConnectionSelector(opts)¶

Simple class used to select a connection from a list of currently live connection instances. At init time it is passed a dictionary containing all the connections’ options, which it can then use during the selection process. When the select method is called it is given a list of currently live connections to choose from.

The options dictionary is the one passed to Transport as the hosts parameter, the same one used to construct the Connection object itself. When the Connection was created from information retrieved from the cluster via the sniffing process, it will be the dictionary returned by the host_info_callback.

An example of where this would be useful is a zone-aware selector that only selects connections from its own zones and falls back to other connections only when there are none in its zones.

Parameters:	opts – dictionary of connection instances and their options

select(connections)¶

Select a connection from the given list.

Parameters:	connections – list of live connections to choose from
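The zone-aware selector mentioned in the class description could look roughly like the sketch below. Everything specific here is an assumption: the "zone" key in the host dictionaries, the local_zone value, and the use of a plain class instead of subclassing ConnectionSelector (done so the example runs without the library installed).

```python
import random


class ZoneAwareSelector(object):
    """Hypothetical zone-aware selector.

    Assumptions: each host dictionary carries a made-up "zone" key and the
    local zone is a class attribute.
    """

    local_zone = "zone-a"

    def __init__(self, opts):
        # opts maps each connection instance to its options dictionary
        self.opts = opts

    def select(self, connections):
        local = [
            c for c in connections
            if self.opts.get(c, {}).get("zone") == self.local_zone
        ]
        # Fall back to every live connection when the local zone has none.
        return random.choice(local or connections)


# Strings stand in for Connection instances in this sketch.
opts = {
    "conn-a": {"host": "a", "zone": "zone-a"},
    "conn-b": {"host": "b", "zone": "zone-b"},
}
selector = ZoneAwareSelector(opts)
same_zone = selector.select(["conn-a", "conn-b"])
fallback = selector.select(["conn-b"])
```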

Urllib3HttpConnection (default connection_class)¶

class elasticsearch.Urllib3HttpConnection(host='localhost', port=9200, http_auth=None, use_ssl=False, verify_certs=False, ca_certs=None, client_cert=None, maxsize=10, **kwargs)¶

Default connection class using the urllib3 library and the http protocol.

Parameters:

http_auth – optional http auth information as either a ‘:’-separated string or a tuple
use_ssl – use ssl for the connection if True
verify_certs – whether to verify SSL certificates
ca_certs – optional path to CA bundle. See http://urllib3.readthedocs.org/en/latest/security.html#using-certifi-with-urllib3 for instructions on how to get the default set
client_cert – path to the file containing the private key and the certificate
maxsize – the maximum number of connections which will be kept open to this host.
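A host dictionary combining these options might look like the sketch below. All values (host name, port, credentials, CA bundle path) are placeholders invented for illustration.

```python
# Placeholder values throughout; nothing here refers to a real cluster.
secure_host = {
    "host": "es1.example.com",
    "port": 9243,
    "use_ssl": True,
    "verify_certs": True,
    "ca_certs": "/path/to/ca_bundle.pem",
    "http_auth": ("user", "secret"),  # or the string "user:secret"
    "maxsize": 25,
}

# With the library installed, the dictionary would be passed via hosts, e.g.:
#   from elasticsearch import Elasticsearch
#   es = Elasticsearch([secure_host])
hosts = [secure_host]
```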