== CPython's trust of bytecode is not a security problem

Yesterday I wrote about [[how CPython trusts bytecode so much that you can use it to read or write arbitrary memory BytecodeIsTrusted]]. In comments, Ewen McNeil had a typical reaction to this:

> It appears this means if you can get arbitrary Python execution (eg,
> unwisely trusting YAML, XML, pickle, etc...), then you can probably
> get arbitrary memory read/write in the Python process, which is a
> fairly short step away from arbitrary assembly code execution.

This is true, but it also misunderstands the security situation of Python bytecode. Even without this issue, it is game over in general if an attacker can load arbitrary bytecode into your Python process.

The obvious weakness is that [[ctypes https://docs.python.org/3/library/ctypes.html]] is part of the standard library these days and it can also be used to give you this level of access to memory without any need to corrupt the bytecode interpreter. But even without ctypes an attacker has plenty of options to achieve binary code execution. They can transfer a binary, write it out, and then execute it. They can transfer a native code Python module (in _.so_ form), manipulate the Python load path, and then _import_ it (which gets them code execution even in the Python process). They can run other existing vulnerable binaries on your system and exploit their bugs. And so on.

You can certainly try to stop this by creating a Python environment that blocks access to the Python features necessary for this. The problem is that there have proven to be many features that can be exploited to help here and many paths through Python to reach them. The runtime environment of Python is a complex, intertangled thing, and all attackers need is one crack that lets them bootstrap a reference to, say, the _os_ module. And there are a lot of potential cracks.

(Python used to have [[a restricted execution module https://docs.python.org/2/library/restricted.html]].
As you can see, it was disabled in Python 2.3 because it had basically unfixable holes.)

The simple truth is ~~Python is not a safe execution environment for untrusted code~~. The only important thing about bytecode being able to read and write arbitrary memory all by itself is that it shows how impossible the job of securing CPython is. Even if you managed to reliably cut off all access to modules and code that could be used to escape your sandbox at the Python level, you would have to audit and fix the innards of the bytecode interpreter itself to be safe.

This is why I say that this trust of bytecode is not a security problem; it doesn't really make the situation any worse than it already is. It's just an amusingly baroque alternate path to a security issue that is already there in general.
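
(To make the ctypes point concrete, here is a minimal sketch of reading an object's raw memory from ordinary Python code. It assumes CPython, where _id()_ happens to return the object's address and a bytes object stores its data inline; both are implementation details, not language guarantees. The function name and the sample value are made up for illustration.)

```python
import ctypes


def raw_bytes_object_memory(b):
    """Return the raw in-memory representation of a bytes object.

    Assumption: CPython, where id(b) is the address of the object and
    b.__sizeof__() is the size of the object header plus its inline data.
    """
    if type(b) is not bytes:
        raise TypeError("this sketch only handles exact bytes objects")
    # ctypes.string_at(address, size) copies `size` bytes starting at `address`.
    return ctypes.string_at(id(b), b.__sizeof__())


# Hypothetical sample value standing in for something sensitive.
secret = b"hunter2-not-a-real-password"
dump = raw_bytes_object_memory(secret)

# On CPython the payload sits inside the dump, after the object header.
found = secret in dump
print(found)
```

Nothing in this touches the bytecode interpreter at all; it is entirely documented standard library functionality, which is the point. The sketch only reads memory, but ctypes has equally ordinary ways to write to an address.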