Connection Layer API
These are the classes responsible for handling the connection to the Elasticsearch
cluster. The default subclasses used can be overridden by passing parameters to the
Elasticsearch class. All of the arguments to the client
will be passed on to Transport, ConnectionPool and Connection.
For example, if you wanted to use your own implementation of the
ConnectionSelector class you can just pass in the selector_class parameter.
Transport(hosts, connection_class=Urllib3HttpConnection, connection_pool_class=ConnectionPool, host_info_callback=construct_hosts_list, sniff_on_start=False, sniffer_timeout=None, sniff_on_connection_fail=False, serializer=JSONSerializer(), max_retries=3, **kwargs)
Encapsulation of transport-related logic. Handles instantiation of the individual connections as well as creating a connection pool to hold them.
Main interface is the perform_request method.
- hosts – list of dictionaries, each containing keyword arguments to create a connection_class instance
- connection_class – subclass of Connection
- connection_pool_class – subclass of ConnectionPool
- host_info_callback – callback responsible for taking the node information from /_cluster/nodes, along with already extracted information, and producing a list of arguments (same as hosts parameter)
- sniff_on_start – flag indicating whether to obtain a list of nodes from the cluster at startup time
- sniffer_timeout – number of seconds between automatic sniffs
- sniff_on_connection_fail – flag controlling if connection failure triggers a sniff
- sniff_timeout – timeout used for the sniff request. It should be a fast API call, and we are potentially talking to more nodes, so we want to fail quickly. Not used during initial sniffing (if sniff_on_start is on) when the connection still isn’t initialized.
- serializer – serializer instance
- serializers – optional dict of serializer instances that will be used for deserializing data coming from the server. (key is the mimetype)
- default_mimetype – when no mimetype is specified by the server response assume this mimetype, defaults to ‘application/json’
- max_retries – maximum number of retries before an exception is propagated
- retry_on_status – set of HTTP status codes on which we should retry on a different node. Defaults to (502, 503, 504)
- retry_on_timeout – should a timeout trigger a retry on a different node? (default False)
- send_get_body_as – for GET requests with a body, this option lets you specify an alternate way of execution for environments that don’t support passing bodies with GET requests. If set to ‘POST’, the POST method is used instead; if set to ‘source’, the body is serialized and passed as the query parameter source.
- meta_header – If True will send the ‘X-Elastic-Client-Meta’ HTTP header containing simple client metadata. Setting to False will disable the header. Defaults to True.
Any extra keyword arguments will be passed to the connection_class when creating an instance, unless overridden by that connection’s options provided as part of the hosts parameter.
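The send_get_body_as option described above can be illustrated with a small standalone sketch. It does not import the client and is not the library's code; the helper name adapt_get_with_body is hypothetical and only mirrors the behaviour the parameter description states.

```python
import json


def adapt_get_with_body(method, params, body, send_get_body_as="GET"):
    """Return (method, params, body) adjusted as send_get_body_as describes.

    Hypothetical helper for illustration only; not part of the client.
    """
    params = dict(params or {})
    if method == "GET" and body is not None:
        if send_get_body_as == "POST":
            # Environments that cannot send a GET body use POST instead.
            method = "POST"
        elif send_get_body_as == "source":
            # Serialize the body and move it into the `source` query parameter.
            params["source"] = json.dumps(body)
            body = None
    return method, params, body


query = {"query": {"match_all": {}}}
as_post = adapt_get_with_body("GET", None, query, send_get_body_as="POST")
as_source = adapt_get_with_body("GET", None, query, send_get_body_as="source")
```

With ‘POST’ only the method changes; with ‘source’ the request stays a GET and carries no body.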
Create a new Connection instance and add it to the pool.
Parameters: host – kwargs that will be used to create the instance
Explicitly closes connections
Mark a connection as dead (failed) in the connection pool. If sniffing on failure is enabled this will initiate the sniffing process.
Parameters: connection – instance of Connection
perform_request(method, url, headers=None, params=None, body=None)
Perform the actual request. Retrieve a connection from the connection pool, pass all the information to its perform_request method and return the data.
If an exception was raised, mark the connection as failed and retry (up to max_retries times).
If the operation was successful and the connection used was previously marked as dead, mark it as live, resetting its failure count.
- method – HTTP method to use
- url – absolute url (without host) to target
- headers – dictionary of headers, will be handed over to the Connection class
- params – dictionary of query parameters, will be handed over to the Connection class for serialization
- body – body of the request, will be serialized using serializer and passed to the connection
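The retry behaviour of perform_request described above can be sketched without the library. FakeConnection and FakePool are hypothetical stand-ins, and Python's built-in ConnectionError stands in for the client's own exception class; the real method also handles headers, params, serialization and status-based retries, which are left out here.

```python
class FakeConnection:
    """Hypothetical stand-in for a Connection; fails on demand."""

    def __init__(self, name, fail=False):
        self.name = name
        self.fail = fail

    def perform_request(self, method, url):
        if self.fail:
            raise ConnectionError(f"{self.name} is unreachable")
        return f"{method} {url} served by {self.name}"


class FakePool:
    """Hypothetical pool: returns the first live connection, tracks dead ones."""

    def __init__(self, connections):
        self.connections = list(connections)
        self.dead = []

    def get_connection(self):
        live = [c for c in self.connections if c not in self.dead]
        # Fall back to every known connection when none is live.
        return (live or self.connections)[0]

    def mark_dead(self, connection):
        if connection not in self.dead:
            self.dead.append(connection)

    def mark_live(self, connection):
        if connection in self.dead:
            self.dead.remove(connection)


def perform_request(pool, method, url, max_retries=3):
    """Try the request, marking failed connections dead and retrying."""
    for attempt in range(max_retries + 1):
        connection = pool.get_connection()
        try:
            data = connection.perform_request(method, url)
        except ConnectionError:
            pool.mark_dead(connection)
            if attempt == max_retries:
                raise  # retries exhausted, propagate the last error
        else:
            pool.mark_live(connection)
            return data


pool = FakePool([FakeConnection("node-a", fail=True), FakeConnection("node-b")])
result = perform_request(pool, "GET", "/_cluster/health")
```

Here the first attempt fails on node-a, which is marked dead, and the retry is served by node-b.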
Instantiate all the connections and create a new connection pool to hold them. Tries to identify unchanged hosts and re-use existing Connection instances.
Parameters: hosts – same as __init__
Obtain a list of nodes from the cluster and create a new connection pool using the information retrieved.
To extract the node connection parameters use the host_info_callback.
Parameters: initial – flag indicating if this is during startup (sniff_on_start), ignore the sniff_timeout if True
ConnectionPool(connections, dead_timeout=60, selector_class=RoundRobinSelector, randomize_hosts=True, **kwargs)
Container holding the Connection instances, managing the selection process (via a ConnectionSelector) and dead connections.
Its only interactions are with the Transport class that drives all the actions within ConnectionPool.
Initially connections are stored on the class as a list and, along with the connection options, get passed to the ConnectionSelector instance for future reference.
Upon each request the Transport will ask for a Connection via the get_connection method. If the connection fails (its perform_request raises a ConnectionError) it will be marked as dead (via mark_dead) and put on a timeout. If it fails N times in a row the timeout grows exponentially; the formula is dead_timeout * 2 ** (fail_count - 1). When the timeout is over the connection will be resurrected and returned to the live pool. A connection that has been previously marked as dead and succeeds will be marked as live (its fail count will be deleted).
- connections – list of tuples containing the Connection instance and its options
- dead_timeout – number of seconds a connection should be retired for after a failure, increases on consecutive failures
- timeout_cutoff – number of consecutive failures after which the timeout doesn’t increase
- selector_class – ConnectionSelector subclass to use if more than one connection is live
- randomize_hosts – shuffle the list of connections upon arrival to avoid a dog-piling effect across processes
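The retirement timeout can be worked through with a few lines of arithmetic. This is a sketch, not the library's code: the function name is hypothetical, the default timeout_cutoff=5 is an assumption, and so is the way the cutoff is applied (capping the exponent), since the text only says the timeout stops increasing after that many consecutive failures.

```python
def retirement_timeout(fail_count, dead_timeout=60, timeout_cutoff=5):
    """Seconds a connection stays retired after `fail_count` consecutive failures.

    Hypothetical helper; timeout_cutoff=5 and the capping rule are assumptions.
    """
    exponent = min(fail_count - 1, timeout_cutoff)
    return dead_timeout * 2 ** exponent


# Timeouts for the first eight consecutive failures.
timeouts = [retirement_timeout(n) for n in range(1, 9)]
# timeouts == [60, 120, 240, 480, 960, 1920, 1920, 1920]
```

Under these assumptions the timeout doubles with each failure until the cutoff is reached, then stays flat.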
Explicitly closes connections
Return a connection from the pool using the ConnectionSelector instance.
It tries to resurrect eligible connections, forces a resurrection when no connections are available and passes the list of live connections to the selector instance to choose from.
Returns a connection instance and its current fail count.
Mark the connection as dead (failed). Remove it from the live pool and put it on a timeout.
Parameters: connection – the failed instance
Mark connection as healthy after a resurrection. Resets the fail counter for the connection.
Parameters: connection – the connection to redeem
Attempt to resurrect a connection from the dead pool. It will try to locate one (not all) eligible (its timeout is over) connection to return to the live pool. Any resurrected connection is also returned.
Parameters: force – resurrect a connection even if there is none eligible (used when we have no live connections). If force is specified resurrect always returns a connection.
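The interplay of mark_dead, resurrect and get_connection described above can be sketched as a small standalone class. SketchPool is hypothetical and simplified: it picks the first live connection instead of delegating to a selector, omits timeout_cutoff, and takes an injectable clock so the behaviour can be followed deterministically.

```python
import heapq
import itertools
import time


class SketchPool:
    """Hypothetical, simplified pool; not the library's ConnectionPool."""

    def __init__(self, connections, dead_timeout=60, clock=None):
        self.all_connections = list(connections)
        self.live = list(connections)
        self.dead = []  # heap of (resurrect_at, tiebreak, connection)
        self.fail_counts = {}
        self.dead_timeout = dead_timeout
        self.clock = clock or time.monotonic
        self._tiebreak = itertools.count()

    def mark_dead(self, connection):
        if connection not in self.live:
            return
        self.live.remove(connection)
        count = self.fail_counts.get(connection, 0) + 1
        self.fail_counts[connection] = count
        timeout = self.dead_timeout * 2 ** (count - 1)
        entry = (self.clock() + timeout, next(self._tiebreak), connection)
        heapq.heappush(self.dead, entry)

    def mark_live(self, connection):
        # A successful request clears the failure history.
        self.fail_counts.pop(connection, None)

    def resurrect(self, force=False):
        if not self.dead:
            # Nothing is retired; when forced, still hand back a connection.
            return self.all_connections[0] if force else None
        resurrect_at, _, connection = self.dead[0]
        if not force and resurrect_at > self.clock():
            return None  # earliest retired connection is not eligible yet
        heapq.heappop(self.dead)
        self.live.append(connection)
        return connection

    def get_connection(self):
        self.resurrect()
        if not self.live:
            return self.resurrect(force=True)
        return self.live[0]


now = [0.0]  # fake clock value, advanced by hand
pool = SketchPool(["a", "b"], dead_timeout=60, clock=lambda: now[0])
pool.mark_dead("a")
first = pool.get_connection()   # only "b" is live
pool.mark_dead("b")
forced = pool.get_connection()  # nothing live, so one is force-resurrected
now[0] = 61.0
pool.get_connection()           # "b" is past its timeout and rejoins the pool
```

Only one connection is resurrected per call, matching the "one (not all)" wording above.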
Simple class used to select a connection from a list of currently live connection instances. At init time it is passed a dictionary containing all the connections’ options, which it can then use during the selection process. When the select method is called it is given a list of currently live connections to choose from.
The options dictionary is the one that has been passed to Transport as the hosts param and the same that is used to construct the Connection object itself. When the Connection was created from information retrieved from the cluster via the sniffing process it will be the dictionary returned by the host_info_callback.
An example of where this would be useful is a zone-aware selector that would only select connections from its own zones and only fall back to other connections when there are none in its zones.
Parameters: opts – dictionary of connection instances and their options
Select a connection from the given list.
Parameters: connections – list of live connections to choose from
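The zone-aware selector mentioned above might look like the following sketch. It is written as a plain class so it runs without the client installed; in real use it would presumably subclass ConnectionSelector and be passed as selector_class. The "zone" key in the host options and the local_zone attribute are hypothetical names chosen for illustration.

```python
import random


class ZoneAwareSelector:
    """Sketch of a selector that prefers connections in its own zone.

    Assumes each host's options carry a hypothetical "zone" key.
    """

    local_zone = "zone-1"  # hypothetical; set to the zone this process runs in

    def __init__(self, opts):
        # opts maps each connection instance to its options dictionary.
        self.connection_opts = opts

    def select(self, connections):
        local = [
            c for c in connections
            if self.connection_opts.get(c, {}).get("zone") == self.local_zone
        ]
        # Fall back to any live connection when the local zone has none.
        return random.choice(local or connections)


# Strings stand in for Connection instances in this demo.
opts = {
    "conn-a": {"host": "a.example", "zone": "zone-1"},
    "conn-b": {"host": "b.example", "zone": "zone-2"},
}
selector = ZoneAwareSelector(opts)
preferred = selector.select(["conn-a", "conn-b"])
fallback = selector.select(["conn-b"])
```

The zone is a class attribute rather than a constructor argument because, per the description above, the selector receives only the options dictionary at init time.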
Urllib3HttpConnection (default connection_class)
If you have complex SSL logic for connecting to Elasticsearch, using an SSLContext object might be more helpful. You can create one natively using the Python ssl library with the create_default_context (https://docs.python.org/3/library/ssl.html#ssl.create_default_context) function.
To create an SSLContext object you only need to use one of cafile, capath or cadata:
>>> from ssl import create_default_context
>>> context = create_default_context(cafile=None, capath=None, cadata=None)
- cafile is the path to your CA File
- capath is the path to a directory containing a collection of CA certificates
- cadata is either an ASCII string of one or more PEM-encoded certificates or a bytes-like object of DER-encoded certificates.
Please note that the use of SSLContext is only available for Urllib3.
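As a minimal sketch of the above, the following builds a context from the system's default trust store, which is what create_default_context does when none of cafile, capath or cadata is given. The file path in the comment is hypothetical, and the client call is shown only as a comment because it needs a reachable cluster.

```python
import ssl

# No CA source given: the system's default CA certificates are loaded.
context = ssl.create_default_context()

# To trust a private CA instead, supply one source, for example:
# context = ssl.create_default_context(cafile="/path/to/ca.pem")  # hypothetical path

# The default context verifies the certificate chain and the hostname.
verifies_chain = context.verify_mode == ssl.CERT_REQUIRED
verifies_hostname = context.check_hostname

# The context would then be handed to the connection via the ssl_context
# parameter documented below, for example:
# Elasticsearch(["https://localhost:9200"], ssl_context=context)
```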
Urllib3HttpConnection(host='localhost', port=None, http_auth=None, use_ssl=False, verify_certs=<object object>, ssl_show_warn=<object object>, ca_certs=None, client_cert=None, client_key=None, ssl_version=None, ssl_assert_hostname=None, ssl_assert_fingerprint=None, maxsize=10, headers=None, ssl_context=None, http_compress=None, cloud_id=None, api_key=None, opaque_id=None, **kwargs)
Default connection class using the urllib3 library and the http protocol.
- host – hostname of the node (default: localhost)
- port – port to use (integer, default: 9200)
- url_prefix – optional url prefix for elasticsearch
- timeout – default timeout in seconds (float, default: 10)
- http_auth – optional http auth information as either ‘:’ separated string or a tuple
- use_ssl – use ssl for the connection if True
- verify_certs – whether to verify SSL certificates
- ssl_show_warn – show warning when verify certs is disabled
- ca_certs – optional path to CA bundle. See https://urllib3.readthedocs.io/en/latest/security.html#using-certifi-with-urllib3 for instructions on how to get the default set
- client_cert – path to the file containing the private key and the certificate, or cert only if using client_key
- client_key – path to the file containing the private key if using separate cert and key files (client_cert will contain only the cert)
- ssl_version – version of the SSL protocol to use. Choices are: SSLv23 (default), SSLv2, SSLv3, TLSv1 (see the PROTOCOL_* constants in the ssl module for exact options for your environment).
- ssl_assert_hostname – use hostname verification if not False
- ssl_assert_fingerprint – verify the supplied certificate fingerprint if not None
- maxsize – the number of connections which will be kept open to this host. See https://urllib3.readthedocs.io/en/1.4/pools.html#api for more information.
- headers – any custom http headers to be added to requests
- http_compress – Use gzip compression
- cloud_id – The Cloud ID from ElasticCloud. Convenient way to connect to cloud instances. Other host connection params will be ignored.
- api_key – optional API Key authentication as either base64 encoded string or a tuple.
- opaque_id – send this value in the ‘X-Opaque-Id’ HTTP header, for tracing all requests made by this transport.
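The two accepted forms of api_key can be related with a short sketch. Elasticsearch's API key scheme sends an Authorization header of the form "ApiKey" followed by the base64 encoding of id:api_key. The helper name below is hypothetical, the credentials are made up, and it is an assumption that the connection class builds the header in exactly this way.

```python
import base64


def api_key_header_value(api_key):
    """Build an Authorization header value from either api_key form.

    Hypothetical helper for illustration; not the client's own function.
    """
    if isinstance(api_key, (tuple, list)):
        key_id, secret = api_key
        raw = f"{key_id}:{secret}".encode("utf-8")
        token = base64.b64encode(raw).decode("ascii")
    else:
        # Already a base64 encoded string.
        token = api_key
    return "ApiKey " + token


from_tuple = api_key_header_value(("my-id", "my-secret"))  # made-up credentials
encoded = from_tuple.split(" ", 1)[1]
from_string = api_key_header_value(encoded)
```

Both forms yield the same header value, which is why either can be passed.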
Explicitly closes the connection.