3  Python Virtual Environment

3.1 Package management

The two most popular package management tools for Python are pip and conda.

  • pip is the official package management tool for Python.
  • conda was originally developed by Anaconda, Inc., and later became open source.

In this course, we mainly focus on conda, since it is designed with data science in mind.

  1. Install packages. You may specify a particular version number.
conda install <pkg name>
conda install <pkg name>=<version number>
  2. List all installed packages, or only the installed packages whose names match a given name or regular expression:
conda list
conda list <pkg name>
  3. Update packages.
conda update <pkg name>
  4. Remove packages.
conda remove <pkg name>
  5. If the package you want is on PyPI but not in any conda channel, you may use pip to install it.
pip install <pkg name>

Note that if the package is in both conda channels and PyPI, it is recommended not to mix pip and conda. If you start from conda, just stick to conda. Only use pip when you have to.
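Whichever tool installed a package, Python itself can report what is available in the current environment. The following is a minimal sketch using the standard library's importlib.metadata; the function name installed_packages is only illustrative. It lists Python distributions only, so non-Python packages installed by conda will not appear in it.

```python
from importlib import metadata


def installed_packages():
    """Return {distribution name: version} for the running interpreter's environment."""
    found = {}
    for dist in metadata.distributions():
        name = dist.metadata["Name"]
        version = dist.version
        if name and version:  # skip entries with incomplete metadata
            found[name] = version
    return found


if __name__ == "__main__":
    # Print one "name==version" line per installed distribution.
    for name, version in sorted(installed_packages().items()):
        print(f"{name}=={version}")
```

Comparing this output before and after an install is a quick way to confirm that a package landed in the environment you expected.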

The package managers download packages from an online repository and install them in your environment. By default, pip and conda use two different repositories.

  • PyPI: PyPI stands for Python Package Index. It is the official third-party software repository for Python. pip uses it as the default source for packages and their dependencies. Packages on PyPI are typically uploaded by the author of the Python package.
  • conda-forge: conda downloads packages from channels. Anaconda, Inc. maintains several channels, and conda-forge is a separate, community-run channel that is among the most widely used. Although it is a community project, it is commonly recommended as the channel to get packages from through conda. In conda-forge, a package's maintainers can be different from the original author of the package.

Usually PyPI contains more packages than conda-forge, and new versions of packages reach PyPI faster. However, when installing through conda from conda-forge, more safety checks are done, and conda tries its best to keep the installed packages compatible with one another.

To install from conda-forge using conda, specify the channel with the -c option:

conda install -c conda-forge <pkg name>
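When installs are scripted, the version pin and the channel option can be assembled programmatically. The sketch below only builds the argument list; the helper name conda_install_args and the numpy version shown are illustrative assumptions, not part of conda.

```python
def conda_install_args(package, version=None, channel=None):
    """Build the argument list for a `conda install` call (nothing is executed)."""
    args = ["conda", "install"]
    if channel:
        args += ["-c", channel]  # e.g. -c conda-forge
    # A version pin uses the <pkg name>=<version number> form.
    args.append(f"{package}={version}" if version else package)
    return args


print(" ".join(conda_install_args("numpy", version="1.26", channel="conda-forge")))
# conda install -c conda-forge numpy=1.26
```

The returned list can be passed to subprocess.run if you want Python to launch the command.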

3.2 Virtual environments

Virtual environments provide a project-specific set of installed packages. This helps you both to reproduce your environment faithfully and to isolate packages, so that upgrading a package in one project doesn’t break other projects. In this section we use conda to manage environments. The main reference is the official documentation. We only list minimal working examples here; for more features, please read the official documentation.

  1. To create an environment:
conda create --name <env name>
  2. To activate an environment:
conda activate <env name>

After you create a new environment and activate it, you may start to use the commands from the previous section to install packages.

  3. To deactivate an environment:
conda deactivate
  4. To remove an environment, either of the following commands works:
conda remove --name <env name> --all
conda env remove -n <env name>
  5. To list all environments on the system, either of the following commands works:
conda env list
conda info --envs
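From inside Python it is sometimes useful to check which conda environment is active. A minimal sketch, relying on the CONDA_PREFIX and CONDA_DEFAULT_ENV environment variables that conda activate sets; the function name active_conda_env is illustrative.

```python
import os


def active_conda_env(environ=None):
    """Return (name, prefix) of the active conda environment, or None if there is none."""
    env = os.environ if environ is None else environ
    prefix = env.get("CONDA_PREFIX")
    if not prefix:
        return None
    # Fall back to the last path component when the name variable is missing.
    name = env.get("CONDA_DEFAULT_ENV") or os.path.basename(prefix)
    return name, prefix


print(active_conda_env())
```

The optional environ argument exists only so the function can be tried with a plain dictionary instead of the real process environment.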

Unlike conda, pip is only a package manager and doesn’t provide any virtual environment functions. The default virtual environment tool for Python is venv. See the official documentation for more information about venv.
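For comparison with the conda commands, here is a minimal sketch of venv driven from Python through the standard library's venv module. The helper names create_venv and in_virtual_env and the directory name demo-env are illustrative. On the command line, the usual equivalent is python -m venv <env dir>.

```python
import sys
import tempfile
import venv
from pathlib import Path


def create_venv(env_dir):
    """Create a virtual environment at env_dir and return the path of its pyvenv.cfg."""
    # with_pip=False keeps this sketch fast and offline; pass with_pip=True
    # to have pip bootstrapped into the new environment.
    venv.create(env_dir, with_pip=False)
    return Path(env_dir) / "pyvenv.cfg"


def in_virtual_env():
    """True when the running interpreter belongs to a venv-style environment."""
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cfg = create_venv(Path(tmp) / "demo-env")
        print(cfg.read_text())
    print("inside a virtual environment:", in_virtual_env())
```

Note that in_virtual_env only detects venv-style environments; it does not report an active conda environment.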

3.3 Building identical environments

Sometimes people want to build identical environments. This is done by recording the versions of all packages in the current environment in a spec list file. When rebuilding, the package manager reads the spec list file and installs the specific versions listed in it.

The process has two steps. First, generate the package spec list file. Second, create an environment based on the list. Note that spec list files generated by different methods have different formats, so you have to use the matching command to restore the environment.

3.3.1 Using conda

To produce a spec list, use the following command:

conda list --explicit > spec-file.txt

To install the packages, use the following command:

conda create --name <env name> --file spec-file.txt

A typical conda spec list file looks like the following example.

# This file may be used to create an environment using:
# $ conda create --name <env> --file <this file>
# platform: win-64
@EXPLICIT
https://repo.anaconda.com/pkgs/main/win-64/conda-env-2.6.0-1.conda
https://repo.anaconda.com/pkgs/r/win-64/_r-mutex-1.0.0-anacondar_1.conda
https://repo.anaconda.com/pkgs/main/win-64/blas-1.0-mkl.conda
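Each URL in the spec file encodes a package name, a version, and a build string. The sketch below pulls these out of the example above. It assumes file names follow the <name>-<version>-<build> pattern with a .conda or .tar.bz2 extension, and the function name parse_explicit_spec is illustrative.

```python
from urllib.parse import urlparse

KNOWN_EXTENSIONS = (".conda", ".tar.bz2")


def parse_explicit_spec(text):
    """Return (platform, [(name, version, build), ...]) from an @EXPLICIT spec file's text."""
    platform = None
    packages = []
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# platform:"):
            platform = line.split(":", 1)[1].strip()
        if not line or line.startswith("#") or line == "@EXPLICIT":
            continue
        # Drop an optional "#<hash>" suffix, then keep only the file name.
        filename = urlparse(line.split("#", 1)[0]).path.rsplit("/", 1)[-1]
        for ext in KNOWN_EXTENSIONS:
            if filename.endswith(ext):
                filename = filename[: -len(ext)]
                break
        # Assumed naming scheme: <name>-<version>-<build> (the name may contain hyphens).
        name, version, build = filename.rsplit("-", 2)
        packages.append((name, version, build))
    return platform, packages


SPEC = """\
# platform: win-64
@EXPLICIT
https://repo.anaconda.com/pkgs/main/win-64/conda-env-2.6.0-1.conda
https://repo.anaconda.com/pkgs/r/win-64/_r-mutex-1.0.0-anacondar_1.conda
https://repo.anaconda.com/pkgs/main/win-64/blas-1.0-mkl.conda
"""

platform, packages = parse_explicit_spec(SPEC)
print(platform)      # win-64
print(packages[2])   # ('blas', '1.0', 'mkl')
```

The platform line is worth checking before reuse, since the URLs point at builds for that platform only.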

3.3.2 Using pip freeze

The classic approach is pip freeze. To produce a spec list, use the following command:

pip freeze > requirements.txt

To install the packages, use the following command:

pip install -r requirements.txt

A typical pip requirements.txt file looks like the following example.

anyio @ file:///C:/ci/anyio_1644481856696/work/dist
argon2-cffi @ file:///opt/conda/conda-bld/argon2-cffi_1645000214183/work
argon2-cffi-bindings @ file:///C:/ci/argon2-cffi-bindings_1644569876605/work
asttokens==2.0.7
attrs @ file:///C:/b/abs_09s3y775ra/croot/attrs_1668696195628/work
Babel @ file:///tmp/build/80754af9/babel_1620871417480/work
backcall==0.2.0

Note

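Neither approach is fully satisfactory. For example, the conda spec file above is tied to a single platform (win-64), and the pip freeze output above contains local file:// paths that will not exist on another machine. For this reason, many newer tools, such as pipreqs and poetry, have appeared to handle this task. For simplicity, we only briefly introduce the most basic ones here.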
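To see how portable a given requirements.txt is, its lines can be sorted by form. The sketch below handles only the two forms that appear in the pip freeze example in 3.3.2, name==version and name @ URL. Real requirements files allow more syntax than this, and the function names are illustrative.

```python
def classify_requirements(text):
    """Split requirement lines into pinned versions and direct references."""
    pinned, direct = {}, {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        if " @ " in line:
            # Direct reference: the package comes from a URL or local path.
            name, url = line.split(" @ ", 1)
            direct[name.strip()] = url.strip()
        elif "==" in line:
            # Pinned version: can be fetched again from a package index.
            name, version = line.split("==", 1)
            pinned[name.strip()] = version.strip()
    return pinned, direct


def non_portable(direct):
    """Names whose source is a local file path, which other machines will not have."""
    return sorted(name for name, url in direct.items() if url.startswith("file://"))


REQUIREMENTS = """\
asttokens==2.0.7
attrs @ file:///C:/b/abs_09s3y775ra/croot/attrs_1668696195628/work
backcall==0.2.0
"""

pinned, direct = classify_requirements(REQUIREMENTS)
print(pinned)                # {'asttokens': '2.0.7', 'backcall': '0.2.0'}
print(non_portable(direct))  # ['attrs']
```

Entries reported by non_portable would need to be replaced by pinned versions before the file can be used elsewhere.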

3.4 mamba

mamba is a reimplementation of the conda package manager in C++. It can be installed directly from conda-forge.

conda install -c conda-forge mamba

mamba is meant to be installed in the base environment. Once it is installed there, you may use it to manage all your environments.

mamba can be treated as a drop-in replacement for conda. The commands we mentioned above can be rewritten by replacing conda with mamba. One of the main reasons to use mamba over conda is speed: mamba is typically much faster, especially when resolving dependencies.
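Because the two tools share a command-line interface, a script can prefer mamba when it is installed and fall back to conda otherwise. A minimal sketch using the standard library's shutil.which; the function name pick_frontend is illustrative.

```python
import shutil


def pick_frontend(which=shutil.which):
    """Return "mamba" if it is on PATH, otherwise "conda", otherwise None."""
    for tool in ("mamba", "conda"):
        if which(tool):
            return tool
    return None


frontend = pick_frontend()
if frontend is None:
    print("neither mamba nor conda was found on PATH")
else:
    # The same subcommands work with either tool.
    print(f"{frontend} install -c conda-forge <pkg name>")
```

The which parameter is there only so the lookup can be replaced in tests; normal use calls pick_frontend() with no arguments.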