Over the last six months I have noticed a lot of code going into CPython that works with SSL and TLS. At first I did not think anything of these commits and brushed them off as bug fixes and improvements. However, as the months went by and I kept seeing these commits come through, my curiosity was piqued about what might be happening in the language under the hood. To find out more, I started digging through the Python Enhancement Proposals, better known as PEPs, and I found one entitled: PEP 543 -- A Unified TLS API for Python. At first I was not sure what to make of this proposal, but as I read more I understood that this is what all of the TLS and SSL commits in CPython have been about. It looks like the Python Core development team and the community have been slowly reworking how CPython works with OpenSSL to make TLS connections, and PEP 543 is the proposal that explains this long-term plan. That is why I wanted to write this article: to explain the current problem CPython is facing, how PEP 543 proposes to solve it, and what I find interesting about the TLS Agnosticism that implementing this proposal would create. I hope you enjoy, let's jump in!
The Current Problem with OpenSSL Re-Distribution 🐍
To begin, I believe it is fundamental to understand the problem the Python Core team is trying to solve with this proposal. In my opinion, the Core team is trying to remove the heavy reliance on shipping OpenSSL with different distributions of Python. At the time this article was written, OpenSSL has to be shipped with different versions of Python to support creating TLS connections across different operating systems. This puts a heavy burden on the Python Core team: whenever a network security patch needs to be addressed, they have to distribute and support two code bases on each operating system, CPython and OpenSSL.
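To see this coupling on your own machine, the following minimal sketch asks the standard ssl module which OpenSSL build the interpreter is linked against. The helper name describe_tls_backend is my own and not part of any library, and ssl.HAS_TLSv1_3 only exists on Python 3.7 and later, so the sketch falls back to False when it is missing.

```python
import ssl


def describe_tls_backend():
    """Report which OpenSSL build this interpreter's ssl module is linked against."""
    return {
        # Human-readable version string of the linked library.
        "openssl_version": ssl.OPENSSL_VERSION,
        # The same version as a tuple of integers.
        "openssl_version_info": ssl.OPENSSL_VERSION_INFO,
        # HAS_TLSv1_3 was added in Python 3.7; treat it as unsupported if absent.
        "has_tls_1_3": getattr(ssl, "HAS_TLSv1_3", False),
    }


if __name__ == "__main__":
    for key, value in describe_tls_backend().items():
        print(f"{key}: {value}")
```

Whatever this reports is what every TLS connection made through the ssl module on that interpreter is limited to, which is the dependency the proposal wants to loosen.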
The proposed solution is to stop shipping OpenSSL with Python and to rework the TLS APIs inside CPython so that they do not rely on OpenSSL but instead provide the ability to hook into a secure back end of a developer's choosing. That way Python developers would have the option to use the secure back end that makes sense for their project rather than being reliant on the one shipped with Python. This gives the developer flexibility when using CPython and does not require the SSL module within Python to handle TLS connections. As an example, with this proposed solution, if an application was opening a secure connection on the client side only, a developer would have the option to use the ClientContext with the TLSWrappedSocket connection class instead of the SSL module, reducing the overhead from the SSL module when making the connection. Below is an example of the proposed TLS APIs from the proposal. Keep in mind that these APIs are not yet available at the time of writing this article.
TLSBufferObject = Union[TLSWrappedSocket, TLSWrappedBuffer]


class _BaseContext(metaclass=ABCMeta):
    @abstractmethod
    def __init__(self, configuration: TLSConfiguration):
        """
        Create a new context object from a given TLS configuration.
        """

    @property
    @abstractmethod
    def configuration(self) -> TLSConfiguration:
        """
        Returns the TLS configuration that was used to create the context.
        """


class ClientContext(_BaseContext):
    def wrap_socket(self,
                    socket: socket.socket,
                    server_hostname: Optional[str]) -> TLSWrappedSocket:
        """
        Wrap an existing Python socket object ``socket`` and return a
        ``TLSWrappedSocket`` object. ``socket`` must be a ``SOCK_STREAM``
        socket: all other socket types are unsupported.

        The returned SSL socket is tied to the context, its settings and
        certificates. The socket object originally passed to this method
        should not be used again: attempting to use it in any way will lead
        to undefined behaviour, especially across different TLS
        implementations. To get the original socket object back once it has
        been wrapped in TLS, see the ``unwrap`` method of the
        TLSWrappedSocket.

        The parameter ``server_hostname`` specifies the hostname of the
        service which we are connecting to. This allows a single server to
        host multiple SSL-based services with distinct certificates, quite
        similarly to HTTP virtual hosts. This is also used to validate the
        TLS certificate for the given hostname. If hostname validation is not
        desired, then pass ``None`` for this parameter. This parameter has no
        default value because opting-out of hostname validation is dangerous,
        and should not be the default behaviour.
        """
        buffer = self.wrap_buffers(server_hostname)
        return TLSWrappedSocket(socket, buffer)

    @abstractmethod
    def wrap_buffers(self, server_hostname: Optional[str]) -> TLSWrappedBuffer:
        """
        Create an in-memory stream for TLS, using memory buffers to store
        incoming and outgoing ciphertext. The TLS routines will read
        received TLS data from one buffer, and write TLS data that needs to
        be emitted to another buffer.

        The implementation details of how this buffering works are up to the
        individual TLS implementation. This allows TLS libraries that have
        their own specialised support to continue to do so, while allowing
        those without to use whatever Python objects they see fit.

        The ``server_hostname`` parameter has the same meaning as in
        ``wrap_socket``.
        """


class TLSWrappedSocket:
    def do_handshake(self) -> None:
        """
        Performs the TLS handshake. Also performs certificate validation
        and hostname verification. This must be called after the socket has
        connected (either via ``connect`` or ``accept``), before any other
        operation is performed on the socket.
        """

    def cipher(self) -> Optional[Union[CipherSuite, int]]:
        """
        Returns the CipherSuite entry for the cipher that has been
        negotiated on the connection. If no connection has been negotiated,
        returns ``None``. If the cipher negotiated is not defined in
        CipherSuite, returns the 16-bit integer representing that cipher
        directly.
        """

    def negotiated_protocol(self) -> Optional[Union[NextProtocol, bytes]]:
        """
        Returns the protocol that was selected during the TLS handshake.
        This selection may have been made using ALPN, NPN, or some future
        negotiation mechanism.

        If the negotiated protocol is one of the protocols defined in the
        ``NextProtocol`` enum, the value from that enum will be returned.
        Otherwise, the raw bytestring of the negotiated protocol will be
        returned.

        If ``Context.set_inner_protocols()`` was not called, if the other
        party does not support protocol negotiation, if this socket does
        not support any of the peer's proposed protocols, or if the
        handshake has not happened yet, ``None`` is returned.
        """

    @property
    def context(self) -> Context:
        """
        The ``Context`` object this socket is tied to.
        """

    def negotiated_tls_version(self) -> Optional[TLSVersion]:
        """
        The version of TLS that has been negotiated on this connection.
        """

    def unwrap(self) -> socket.socket:
        """
        Cleanly terminate the TLS connection on this wrapped socket. Once
        called, this ``TLSWrappedSocket`` can no longer be used to transmit
        data. Returns the socket that was wrapped with TLS.
        """
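Since the interface above is not implemented yet, here is a rough sketch of what one back end behind wrap_buffers might look like if it were built on today's ssl module and its in-memory BIOs. The class names WrappedBuffer, StdlibWrappedBuffer and StdlibClientContext, and the methods take_outgoing and feed_incoming, are my own inventions for illustration and are not part of PEP 543; only the ssl calls are real.

```python
import ssl
from abc import ABC, abstractmethod
from typing import Optional


class WrappedBuffer(ABC):
    """Hypothetical backend-neutral in-memory TLS stream."""

    @abstractmethod
    def do_handshake(self) -> bool:
        """Advance the handshake; return True once it has completed."""

    @abstractmethod
    def take_outgoing(self) -> bytes:
        """Return ciphertext that the caller must send to the peer."""

    @abstractmethod
    def feed_incoming(self, data: bytes) -> None:
        """Hand ciphertext received from the peer to the TLS engine."""


class StdlibWrappedBuffer(WrappedBuffer):
    """One possible back end: the standard ssl module over memory BIOs."""

    def __init__(self, context: ssl.SSLContext, server_hostname: Optional[str]):
        self._incoming = ssl.MemoryBIO()
        self._outgoing = ssl.MemoryBIO()
        self._tls = context.wrap_bio(
            self._incoming, self._outgoing, server_hostname=server_hostname
        )

    def do_handshake(self) -> bool:
        try:
            self._tls.do_handshake()
        except ssl.SSLWantReadError:
            # The engine needs more data from the peer before it can go on.
            return False
        return True

    def take_outgoing(self) -> bytes:
        return self._outgoing.read()

    def feed_incoming(self, data: bytes) -> None:
        self._incoming.write(data)


class StdlibClientContext:
    """Hypothetical client context that hides the back end from the caller."""

    def wrap_buffers(self, server_hostname: Optional[str]) -> WrappedBuffer:
        context = ssl.create_default_context()
        # Passing None opts out of hostname validation, as in the proposal;
        # certificate verification itself stays enabled.
        context.check_hostname = server_hostname is not None
        return StdlibWrappedBuffer(context, server_hostname)


if __name__ == "__main__":
    buffer = StdlibClientContext().wrap_buffers("example.com")
    finished = buffer.do_handshake()
    print("handshake finished:", finished)
    print("bytes waiting to be sent:", len(buffer.take_outgoing()))
```

The calling code only touches the abstract WrappedBuffer methods, so a second back end could be swapped in without changing it. Moving the bytes between take_outgoing, the network, and feed_incoming is left to the caller in this sketch.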
TLS Agnosticism 🐍
TLS Agnosticism frees the Python developer from being tied to the version of OpenSSL that was shipped with Python. Going back to the original problem, if a version of OpenSSL ships with support for TLS v1.1 and v1.2, then that is what the Python developer is tied to using. However, if OpenSSL is taken out of the equation, the Python developer is free to configure their own secure back end and, with it, the TLS version that makes sense for their project. There would be no shimming or providing support for other versions of TLS on your own; you would simply take advantage of these new APIs as you deem necessary. Removing the dependency on OpenSSL would also allow Python developers to take advantage of the TLS v1.3 draft. It would also give embedded Python developers the opportunity to configure their own TLS back end without the extra bytes of a larger library.
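To make the idea of choosing a TLS version independently of the back end a little more concrete, here is a small sketch. The TLSVersion enum and TLSConfiguration class below are simplified stand-ins that I wrote for illustration, with field names borrowed from the proposal; they are not the real PEP 543 objects. The translation function targets the standard ssl module and assumes Python 3.7 or later, where ssl.TLSVersion and the minimum_version and maximum_version attributes exist.

```python
import ssl
from dataclasses import dataclass
from enum import Enum


class TLSVersion(Enum):
    """Stand-in, backend-neutral TLS version names (illustrative only)."""
    TLSv1_2 = "TLSv1.2"
    TLSv1_3 = "TLSv1.3"


@dataclass(frozen=True)
class TLSConfiguration:
    """Stand-in configuration object holding the allowed version range."""
    lowest_supported_version: TLSVersion = TLSVersion.TLSv1_2
    highest_supported_version: TLSVersion = TLSVersion.TLSv1_3


# Each back end would carry its own table like this one.
_STDLIB_VERSIONS = {
    TLSVersion.TLSv1_2: ssl.TLSVersion.TLSv1_2,
    TLSVersion.TLSv1_3: ssl.TLSVersion.TLSv1_3,
}


def to_stdlib_context(config: TLSConfiguration) -> ssl.SSLContext:
    """Translate the neutral configuration into an ssl.SSLContext."""
    context = ssl.create_default_context()
    context.minimum_version = _STDLIB_VERSIONS[config.lowest_supported_version]
    context.maximum_version = _STDLIB_VERSIONS[config.highest_supported_version]
    return context


if __name__ == "__main__":
    only_tls_1_2 = TLSConfiguration(
        lowest_supported_version=TLSVersion.TLSv1_2,
        highest_supported_version=TLSVersion.TLSv1_2,
    )
    context = to_stdlib_context(only_tls_1_2)
    print(context.minimum_version, context.maximum_version)
```

The application describes what it wants once, in neutral terms, and each back end is responsible for its own translation table. A different back end would replace to_stdlib_context and leave the configuration untouched.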
In Summary ⌛️
In summary, I think that the core development team is taking an excellent step forward with the PEP 543 proposal. It would take a heavy burden off the core team, who would no longer have to maintain a distribution of OpenSSL alongside Python whenever a security exploit arises. From there, I think that once these new TLS APIs are implemented they will level the playing field, making CPython less reliant on OpenSSL and giving developers options in how they configure TLS connections. I think that this proposal does a lot for using TLS efficiently based upon a project's context, i.e., client or server, and these moves are very timely with the next draft release of TLS v1.3 just around the corner. So in conclusion, I am interested to take advantage of these new TLS APIs once they are finished, and I think this proposal does a lot to extend the network capabilities of Python!
Thank you for reading and as always, if you have any questions, comments, or concerns, please leave a comment and I will get back to you as soon as possible.
References:
PEP 543 -- A Unified TLS API for Python
https://www.python.org/dev/peps/pep-0543/
TLS/SSL wrapper for socket objects - 2.7.14
https://docs.python.org/2/library/ssl.html
TLS/SSL wrapper for socket objects - 3.6.4
https://docs.python.org/3.6/library/ssl.html
OpenSSL - Old Releases
https://www.openssl.org/source/old/
Comments
Piqued. The word is “piqued.”
Peaked and peeked are different things.
JimD, thank you very much! I missed that while proof reading. Appreciate it!