I feel like we're discussing how to install a fridge when all you want is a nice, cold beer, but we're nothing without solid infrastructure.
In CodeProject.AI we support adding backend analysis modules in any language and tech stack, as long as that module can be installed (and uninstalled) neatly.
Python is obviously the biggest player in the AI space so supporting Python is critical. It's also tricky, and we rely on Python Virtual Environments to keep things clean. Clean-ish.
The name "Python Virtual Environment" can lead one to think of a sandbox or virtual machine: an environment completely encased in a leak-proof container that protects it from interference from the outside, and protects the outside from harmful processes on the inside.
This isn't a Python Virtual Environment.
To create a Python Virtual Environment for a given version of Python you must first have that version of Python installed in your environment. You run the command to create the virtual environment (python -m venv) and what you get is a folder that contains a copy of, or symlink to, your original Python interpreter, as well as a folder for any Python packages you install into that virtual environment.
This is your Python Virtual Environment.
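To see how little is in that folder, here is a minimal sketch that creates an environment with the standard library's venv module and reads back its pyvenv.cfg file. The helper names create_venv and describe_venv are made up for this illustration, and pip is skipped (with_pip=False) only to keep the example quick and offline.

```python
import tempfile
import venv
from pathlib import Path


def create_venv(target: Path) -> Path:
    """Create a virtual environment at `target` and return its path."""
    # with_pip=False keeps the sketch fast and offline; the command-line
    # `python -m venv` installs pip into the environment by default.
    venv.create(target, with_pip=False)
    return target


def describe_venv(target: Path) -> dict:
    """Return the key/value pairs stored in the environment's pyvenv.cfg."""
    settings = {}
    for line in (target / "pyvenv.cfg").read_text().splitlines():
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip()
    return settings


if __name__ == "__main__":
    env_dir = create_venv(Path(tempfile.mkdtemp()) / "venv")
    # Top-level contents: an interpreter folder, a packages folder, pyvenv.cfg.
    print(sorted(entry.name for entry in env_dir.iterdir()))
    # "home" points back at the base Python installation the venv was made from.
    print(describe_venv(env_dir).get("home"))
```

The "home" entry is the giveaway: the environment is a thin layer that points back at the Python installation it was created from.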
The secret sauce is the "Activate" script, which simply sets environment variables so that when you type "python" at the command line, the Python interpreter inside your Virtual Environment is launched, not the system-wide version. Deactivate the Virtual Environment and the environment variables are reset, so your OS-default version of Python is the default again.
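For the curious, the effect of activation can be approximated in a few lines. This is a rough sketch, not the real script: activated_environ is a hypothetical helper that mimics the main changes the standard activate scripts make (set VIRTUAL_ENV, put the environment's executables folder first on PATH, clear PYTHONHOME). It leaves out cosmetic changes such as the shell prompt.

```python
import os
from pathlib import Path


def activated_environ(venv_dir, base_env=None) -> dict:
    """Return a copy of the environment variables as 'activate' would leave them."""
    env = dict(os.environ if base_env is None else base_env)
    # Executables live in "Scripts" on Windows and "bin" elsewhere.
    scripts = Path(venv_dir) / ("Scripts" if os.name == "nt" else "bin")
    env["VIRTUAL_ENV"] = str(venv_dir)
    # Putting the venv first on PATH is what makes "python" resolve to it.
    old_path = env.get("PATH", "")
    env["PATH"] = str(scripts) + (os.pathsep + old_path if old_path else "")
    # A leftover PYTHONHOME would point the interpreter at the wrong install.
    env.pop("PYTHONHOME", None)
    return env


if __name__ == "__main__":
    env = activated_environ("demo-venv", {"PATH": "/usr/bin"})
    print(env["VIRTUAL_ENV"])
    print(env["PATH"])
```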
The details of this are not important.
The important part is that when you "activate" the virtual environment and then install packages via pip install, those packages will live inside the Virtual Environment's folder, and not inside the OS's global Python package folder. This allows you to have different versions of the same package installed in different virtual environments.
Python installed in your OS with package A, package B
and also Python installed in Virtual Environment 1 with package A 2.0, package B
and also Python installed in Virtual Environment 2 with package A, package B 2.0
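The layout above can be simulated without installing anything. The sketch below writes bare-bones package metadata into two temporary folders standing in for the two environments' package folders, then asks importlib.metadata what is "installed" in each. The fake_install and installed_versions helpers, the package names and the 1.0 baseline version are all assumptions made for the illustration.

```python
import tempfile
from importlib import metadata
from pathlib import Path


def fake_install(site_packages: Path, name: str, version: str) -> None:
    """Write just enough metadata for `name` to look installed (no real code)."""
    info_dir = site_packages / f"{name}-{version}.dist-info"
    info_dir.mkdir(parents=True)
    (info_dir / "METADATA").write_text(
        f"Metadata-Version: 2.1\nName: {name}\nVersion: {version}\n"
    )


def installed_versions(site_packages: Path) -> dict:
    """Map package name to version for one package folder only."""
    found = metadata.distributions(path=[str(site_packages)])
    return {dist.metadata["Name"]: dist.version for dist in found}


if __name__ == "__main__":
    root = Path(tempfile.mkdtemp())
    env1 = root / "venv1" / "site-packages"
    env2 = root / "venv2" / "site-packages"
    fake_install(env1, "package_a", "2.0")
    fake_install(env1, "package_b", "1.0")
    fake_install(env2, "package_a", "1.0")
    fake_install(env2, "package_b", "2.0")
    print(installed_versions(env1))
    print(installed_versions(env2))
```

Each folder reports its own versions, and neither knows about the other.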
Installing Python in CodeProject.AI
When we install Python in a CodeProject.AI setup we either XCOPY in the Python executables (for Windows), or use brew (Linux / macOS) to ensure we have a base Python installation. We then create the Virtual Environment using the standard python -m venv and install the Python packages.
But we do it like this:
"[PythonFolder]python.exe" -m venv "[PythonFolder]venv"
"[PythonFolder]venv\scripts\pip.exe" --no-cache-dir install -r requirements.txt --target "[PythonFolder]venv\Lib\site-packages"
Here [PythonFolder] is \src\AnalysisLayer\bin\Windows\Python37\ for Windows and Python 3.7.
This calls the base Python interpreter by its full path to create the Virtual Environment, then calls the pip utility from within the Virtual Environment directly. The --target option ensures the installed packages also land within the Virtual Environment. No need to activate the Virtual Environment during this.
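If you were driving those two commands from a script, the argument lists might be assembled along these lines. This is only a sketch of the commands shown above: build_setup_commands is a hypothetical helper, not part of the CodeProject.AI installer, and PureWindowsPath is used so the Windows-style paths come out the same on any OS.

```python
from pathlib import PureWindowsPath


def build_setup_commands(python_folder: str, requirements: str = "requirements.txt"):
    """Return (create_venv_cmd, install_cmd) as argument lists using full paths."""
    root = PureWindowsPath(python_folder)
    venv_dir = root / "venv"
    # Step 1: the base interpreter creates the virtual environment.
    create_cmd = [str(root / "python.exe"), "-m", "venv", str(venv_dir)]
    # Step 2: the venv's own pip installs straight into the venv's packages folder.
    install_cmd = [
        str(venv_dir / "scripts" / "pip.exe"),
        "--no-cache-dir",
        "install",
        "-r",
        requirements,
        "--target",
        str(venv_dir / "Lib" / "site-packages"),
    ]
    return create_cmd, install_cmd


if __name__ == "__main__":
    # The folder below is a placeholder, not the real install location.
    for command in build_setup_commands(r"C:\demo\Python37"):
        print(command)
```

Each list could be handed to subprocess.run as-is. Nothing in either command depends on PATH or on an activated environment.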
Even more importantly: you do not need to have Python or any Python tools set up on your machine for this to work. The whole point of our installer (and our dev installation scripts) is to allow you to be up and running with no prep. We do all that for you.
Using Python in CodeProject.AI
We create a virtual environment for each version of Python that we install. We still do not need to activate those environments. This is partly because we're calling the Python process from our server directly, and not from a command terminal. Calling a script to set environment variables prior to launching Python is a little cumbersome.
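Launching a specific interpreter by path from another process needs no activation step at all. Our server is not written in Python, so treat the following purely as an illustration of the idea: run_with_interpreter is a made-up helper, and sys.executable stands in for the full path to a virtual environment's interpreter.

```python
import subprocess
import sys


def run_with_interpreter(python_exe: str, code: str) -> str:
    """Run `code` with the interpreter at `python_exe` and return its output."""
    result = subprocess.run(
        [python_exe, "-c", code],
        capture_output=True,
        text=True,
        check=True,  # raise if the child process exits with an error
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # The child reports which interpreter it is actually running under.
    print(run_with_interpreter(sys.executable, "import sys; print(sys.executable)"))
```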
We could set the PATH variables directly, but we don't need to: we call the specific Python and pip executables from the virtual environment using their full paths. On top of this, in our codeprojectAI.py module wrapper we add the path to the virtual environment's packages directory to sys.path in the code itself.
This means that:
- The correct version of Python will be called
- The packages installed in the virtual environment will be used in preference to the packages installed globally
So there is no need to activate the Virtual Environment.
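The sys.path part can be sketched as follows. This is not the code from codeprojectAI.py, just one plausible way to get the same effect: prefer_packages_from is a hypothetical helper, and a temporary folder with a one-line module stands in for the virtual environment's packages directory.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def prefer_packages_from(packages_dir) -> None:
    """Put `packages_dir` at the front of sys.path so its packages win."""
    path = str(packages_dir)
    if path in sys.path:
        sys.path.remove(path)
    # Index 0 is searched first, ahead of any globally installed packages.
    sys.path.insert(0, path)


if __name__ == "__main__":
    packages_dir = Path(tempfile.mkdtemp())
    (packages_dir / "venv_only_demo.py").write_text('SOURCE = "virtual environment"\n')

    prefer_packages_from(packages_dir)
    importlib.invalidate_caches()  # make sure the new file is noticed
    demo = importlib.import_module("venv_only_demo")
    print(demo.SOURCE, "from", demo.__file__)
```

Inserting at the front rather than appending is what gives the virtual environment's packages priority over same-named global ones.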
What we gain from this approach
We control the environment, we gain the flexibility of virtual environments, and we can add as many versions of Python and as many combinations of installed packages as we need.
What we lose from this approach
The biggest loss is size. Each Virtual Environment contains copies of packages that are probably installed elsewhere. PyTorch, for instance, can be well over 1.3GB in size for the GPU-enabled version. While size can be important, the size of Python modules will quickly pale in comparison to AI models, and even more so when one considers the data being input to these models. Video footage, for instance, gets really big, really fast.
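If you want to see what a given environment costs you on disk, a small script will do. A minimal sketch: folder_size_bytes is a made-up helper, and the path you pass in is whatever virtual environment folder you care about.

```python
import os
import sys


def folder_size_bytes(folder) -> int:
    """Total size of all regular files under `folder`, in bytes."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(folder):
        for filename in filenames:
            file_path = os.path.join(dirpath, filename)
            # Skip symlinks so a linked interpreter is not counted as a copy.
            if not os.path.islink(file_path):
                total += os.path.getsize(file_path)
    return total


if __name__ == "__main__":
    # Pass a venv folder on the command line; defaults to the current folder.
    target = sys.argv[1] if len(sys.argv) > 1 else "."
    print(f"{target}: {folder_size_bytes(target) / 1_000_000:.1f} MB")
```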
The other downside is that modules installed in CodeProject.AI must either use our Python module wrapper, which ensures the search path is set correctly for the import statements to work, or set the path themselves. It's a small tradeoff given that using our SDK means 90% of the plumbing is done for you.