Scripting Environments

If all the dependencies for a model can be satisfied on the new platform, the next stage is to install the scripting environments used to configure, prep, and post-process the model.

imsi environment

The imsi-based system requires an environment that gives users access to all the supporting code and infrastructure. imsi is version controlled here, with its environment defined in this file.

This environment can generally be installed with something like the following:

# clone the imsi code and check out desired version
>> git clone git@gitlab.com:cccma/imsi.git /path/to/cloned/imsi
>> ( cd /path/to/cloned/imsi ; git checkout <desired_commit|branch|tag> )

# create a simple venv with the desired python version
>> /path/to/desired/version/python -m venv /path/to/desired/env

# activate environment
>> source /path/to/desired/env/bin/activate

# install imsi
>> pip install /path/to/cloned/imsi
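
After the install, it can help to confirm that the intended environment is actually active before relying on it. The following is a minimal sketch: the `venv_is_active` helper is a hypothetical name (it is not part of imsi), and it relies only on the `VIRTUAL_ENV` variable that a standard venv `activate` script exports.

```shell
# Hypothetical helper (not part of imsi): succeed only when a venv is active
# and its interpreter is the first "python" found on PATH.
venv_is_active() {
    # VIRTUAL_ENV is exported by the standard venv "activate" script
    [ -n "${VIRTUAL_ENV:-}" ] || return 1
    [ "$(command -v python)" = "${VIRTUAL_ENV}/bin/python" ]
}
```

A typical use would be `venv_is_active && pip show imsi`, which only queries pip for the installed imsi package once the environment check has passed.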

Note

Once imsi is packaged and served via PyPI, you will be able to skip the clone step and install the desired imsi version directly via:

>> pip install imsi==X.Y.Z

cmorization environment

To support “cmorization”, a python environment is required to run the pycmor codebase. Unfortunately, the easiest way to install this environment is via conda, specifically with this file. This is because the cmor package, which is the backbone of pycmor, doesn’t have a pip installer.

On systems that don’t allow conda usage, mamba can be used, or the cmor dependency can be built from source. Apart from the cmor dependency, the other packages can likely be installed directly via pip.

Note

On niagara, with the v5.1_cp4c branch, we are still using a python2 version of the cmorization software. Additionally, the niagara system administrators strongly prefer that users not rely on personal conda installs. As such, building the supporting environment on niagara was painful. Nevertheless, Clint built one here:

/project/c/cp4c/cp4c/pyenvs/pyenv-ncconv-0.1

using this directory as the build location:

/project/c/cp4c/cp4c/canesm_deps/pycmor

Clint attempted to document the process there but, for posterity, some pertinent files are included here:

# README.md

This directory contains everything necessary to build a pycmor-supporting virtualenv (using python 2.7)
on DRAC machines. However, getting this to work had a few hiccups:

### Python Version and the CCCma Format Problem
Since the atmosphere model _still_ writes out CCCma format, instead of something like netcdf, we need
a way to read CCCma format files into python. This is achieved through an old python library (binary
format) that interacts with old CCCma tools and acts as an interface between fortran and python.
This module was written for python2.7 and would require entirely new bindings to upgrade to python3 (work is
scheduled to do this, but it will take some time).

As such, we need to do some painful gymnastics here to create a python 2.7 environment.

The use of this "cccma_py.so" module also means that we have a dependence on libgfortran.so.3 (note the 3). This
isn't available under NiaEnv/2022a, so we need to carry a version of this around here and add its location
manually to LD_LIBRARY_PATH
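
One way to confirm that the dynamic loader can resolve everything a compiled module needs is to inspect it with `ldd`. The sketch below is illustrative only: the `missing_libs` helper name is hypothetical, and the module path in the usage note is a placeholder.

```shell
# Hypothetical helper: print the shared libraries that the dynamic loader
# cannot resolve for the given binary or shared object.
# Empty output means every dependency was found.
missing_libs() {
    # ldd marks unresolved dependencies as "<name> => not found"
    ldd "$1" 2>/dev/null | awk '/=> not found/ { print $1 }'
}
```

For example, `missing_libs /path/to/cccma_py.so` would be expected to list `libgfortran.so.3` until the directory holding that library has been added to LD_LIBRARY_PATH.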

### Generating a requirements.txt file
For reference the cmor.yml file has been provided here, but it can't be used to install. We need
a requirements.txt file that can be used by pip. As such, Clint Seinen generated one by activating
the environment on the science network, and running

    $ pip freeze > requirements.txt

However, this ended up including some modules that aren't truly required and couldn't be found by pip.
As such, Clint removed them. For reference the original requirements file is provided as
pycmor-requirements-science-work.txt, while the modified version that is used is pycmor-requirements.txt
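
The manual pruning described above could also be scripted so that it is repeatable. This is a minimal sketch, assuming a hypothetical exclusion file that lists one package name per line. The `filter_requirements` helper and the exclusion file are illustrative and were not part of the original build.

```shell
# Hypothetical helper: filter_requirements REQUIREMENTS EXCLUDES
# Print the pip-freeze style REQUIREMENTS file, dropping every "name==version"
# line whose name appears in EXCLUDES (one name per line, case-insensitive).
filter_requirements() {
    awk -F'==' '
        # first file on the command line: collect the names to exclude
        FILENAME == ARGV[1] { if ($0 != "") skip[tolower($0)] = 1; next }
        # second file: print only the lines whose package name is not excluded
        !(tolower($1) in skip)
    ' "$2" "$1"
}
```

With an exclusion list in a hypothetical `exclude.txt`, the trimmed file could then be produced with `filter_requirements pycmor-requirements-science-work.txt exclude.txt > pycmor-requirements.txt`.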

### Installing CMOR and jsonc
On the ECCC science network, conda is utilized to serve the necessary environment but conda is considered
problematic on DRAC machines, and virtualenv is the desired environment solution. However, this complicates
matters because CMOR is only provided via conda, or by building from source - see here:

https://cmor.llnl.gov/mydoc_cmor3_conda/

As such, to use CMOR (and its dependency jsonc) we need to build them from source.

NOTE: json-c needs to have its library added to LD_LIBRARY_PATH

### Setting LD_LIBRARY_PATH

To properly pick up the following libraries:
  - libgfortran.so.3 and
  - json-c's library

these need to be added to LD_LIBRARY_PATH when the environment is activated. As such, the
build-env.sh script appends to the environment's 'activate' script to do so.

NOTE: it also makes the 'deactivate' function (within the 'activate' script) remove these
from LD_LIBRARY_PATH
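
The add and remove behaviour described above can be sketched with two small helpers. These are illustrative and are not the code that build-env.sh injects; the helper names are hypothetical. The removal compares whole entries exactly, so a directory is only dropped when the full path is identical, and paths that merely contain it as a substring are left alone.

```shell
# Hypothetical helpers for colon-separated path values such as LD_LIBRARY_PATH.

# path_append CURRENT DIR: print CURRENT with DIR appended
# (no stray leading colon when CURRENT is empty)
path_append() {
    if [ -n "$1" ]; then
        printf '%s:%s\n' "$1" "$2"
    else
        printf '%s\n' "$2"
    fi
}

# path_remove CURRENT DIR: print CURRENT without the entries exactly equal to DIR
path_remove() {
    # -x matches whole lines only, -F treats DIR as a fixed string (no regex)
    printf '%s\n' "$1" | tr ':' '\n' | grep -vxF -- "$2" | paste -s -d ':' -
}
```

On activation this could be used as `export LD_LIBRARY_PATH=$(path_append "$LD_LIBRARY_PATH" "$JSONC_LIB_PATH")`, and on deactivation as `export LD_LIBRARY_PATH=$(path_remove "$LD_LIBRARY_PATH" "$JSONC_LIB_PATH")`.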

### PYCMOR_DEPS

An additional dependency, from the legacy cccma_py module (see above), is on the PYCMOR_DEPS
variable, which amor.py uses to find 'cccma_py'. On other platforms, for non-imsi systems, we've
set this variable in a "site-profile", but in this instance we've made it so the activate/deactivate
script/function handles the setting/clearing of this.

The 'build-env.sh' script handles this.

# build-env.sh
#!/bin/bash
# THIS IS A SCRIPT THAT IS USED TO BUILD A PYCMOR2 SUPPORTING VIRTUAL ENV.
# NOTE THAT THIS SCRIPT IS MOSTLY TO DOCUMENT THE BUILD PROCESS AND IF NEW VERSIONS
# OF JSON-C OR CMOR NEED TO BE BUILT, THE SCRIPT WILL LIKELY NEED TO BE ADAPTED
#
# AUTHOR: Clint Seinen clint.seinen@ec.gc.ca
#
#   When using this script, it builds all the dependencies in _this_ directory, and it will clear
#   old directories/files associated with the CMOR and JSONC downloads
set -ex

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Set configuration vars!
#
#   env_name : the name of the python environment
#
#   env_location : where the environment directory will be stored
#
#   modules_for_build : defines what modules will be loaded when json-c and cmor are built from source.
#                           Upon guidance from DRAC support, we built CMOR with the netcdf libraries
#                           that will be loaded when we try to use it.
#
#   modules_for_env_init : defines the modules that will be loaded to initiate the python environment.
#                           These need to be different from modules_for_build so we can get a python2.7
#                           module. Note that if the modules within modules_for_build have a supporting
#                           python2.7 module, this would NOT be needed.
#
#   jsonc_version : the version of jsonc that will support CMOR, that will be built from source
#
#   cmor_version : the version of cmor that will be built from source
#
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
env_name="pyenv-ncconv-0.1"
env_location="/project/c/cp4c/cp4c/pyenvs"
modules_for_build="NiaEnv/2022a intel/2022u2 ucx/1.11.2 intelmpi/2022u2+ucx-1.11.2 hdf5/1.10.9 netcdf/4.9.0 cmake cdo udunits/2.2.28"
modules_for_env_init="NiaEnv/2019b python/2.7.15"
jsonc_version="0.13.1-20180305"
cmor_version="3.5.0"

#~~~~~~~~~~~~~~~~~~~~~~~~
# Define local locations
#~~~~~~~~~~~~~~~~~~~~~~~~
install_dir=$(pwd)
lib_dir=$(pwd)/lib
pycmor_deps_dir=$(pwd)/pycmor_deps

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Initiate python environment and activate it
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
for mod in $modules_for_env_init; do
    module load $mod
done
( cd $env_location; virtualenv $env_name; )
source ${env_location}/${env_name}/bin/activate
pip install numpy six

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Build the libraries from source
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# load libs for building of libs
for mod in $modules_for_build; do
    module load $mod
done

#-- JSON-C
# get and install json-c and add its locations to required path vars
rm -rf json-c*
wget https://github.com/json-c/json-c/archive/refs/tags/json-c-${jsonc_version}.tar.gz
tar -zxf json-c-${jsonc_version}.tar.gz
cd json-c-json-c-${jsonc_version}
rm -rf build install
mkdir build install
cd build
cmake .. -DCMAKE_INSTALL_PREFIX=$(pwd)/../install
make install
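cd ..
# export so the CMOR configure/make steps see these even if the loaded modules did not already export them
export CPATH=$(pwd):$CPATH
cd install/include
export CPATH=$(pwd):$CPATH
cd ../lib
export LIBRARY_PATH=$(pwd):$LIBRARY_PATH
export LD_LIBRARY_PATH=$(pwd):$LD_LIBRARY_PATH
JSONC_LIB_PATH=$(pwd)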

#-- CMOR
cd ${install_dir}
rm -rf cmor-${cmor_version}
rm -rf ${cmor_version}.tar.gz
wget https://github.com/PCMDI/cmor/archive/refs/tags/${cmor_version}.tar.gz
tar -zxf ${cmor_version}.tar.gz
cd cmor-${cmor_version}
mkdir install
./configure --prefix=$(pwd)/install --with-python
make install || : # apparently we accept a failure here...

# install into this environment
pip install .

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Install the remaining python modules
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cd ${install_dir}
pip install -r pycmor-requirements.txt

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Add extra env considerations to activate script
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
echo "export LD_LIBRARY_PATH=\${LD_LIBRARY_PATH}:${JSONC_LIB_PATH}" >> ${env_location}/${env_name}/bin/activate
echo "export LD_LIBRARY_PATH=\${LD_LIBRARY_PATH}:${lib_dir}" >> ${env_location}/${env_name}/bin/activate
echo "export PYCMOR_DEPS=${pycmor_deps_dir}" >> ${env_location}/${env_name}/bin/activate

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Add extra env considerations to deactivate function
#   - remove LD_LIBRARY_PATH entries
#   - unset PYCMOR_DEPS
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# find the starting line of the function
deactivate_function_start_line=$(grep -n 'deactivate () {' ${env_location}/${env_name}/bin/activate | awk -F ':' '{print $1}')
inject_line=$(( deactivate_function_start_line + 1 ))
sed -i "${inject_line}i"'export LD_LIBRARY_PATH=$( tr : $\x27\\n\x27 <<< $LD_LIBRARY_PATH | grep -v'" $JSONC_LIB_PATH "'| paste -s -d : )' ${env_location}/${env_name}/bin/activate
sed -i "${inject_line}i"'export LD_LIBRARY_PATH=$( tr : $\x27\\n\x27 <<< $LD_LIBRARY_PATH | grep -v'" $lib_dir "'| paste -s -d : )' ${env_location}/${env_name}/bin/activate
sed -i "${inject_line}i"'unset PYCMOR_DEPS' ${env_location}/${env_name}/bin/activate

#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
# Make the environment sharable by making it readable
#   and executable to all
#~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
deactivate
chmod -R a+rx ${env_location}/${env_name}

# pycmor-requirements.txt
asn1crypto==0.24.0
backports-abc==0.5
backports.shutil-get-terminal-size==1.0.0
bokeh==1.0.4
certifi==2018.11.29
cffi==1.11.5
cftime==1.0.3.4
chardet==3.0.4
Click==7.0
cloudpickle==0.8.0
CMOR==3.5.0
cryptography==2.4.2
cytoolz==0.9.0.1
dask==1.1.2
decorator==4.3.0
distributed==1.26.0
enum34==1.1.6
et-xmlfile==1.0.1
future==0.17.1
futures==3.2.0
heapdict==1.0.0
idna==2.8
ipaddress==1.0.22
ipython==5.8.0
ipython-genutils==0.2.0
jdcal==1.4
Jinja2==2.10
locket==0.2.0
MarkupSafe==1.1.1
msgpack==0.6.1
netCDF4==1.5.1.2
numpy==1.16.0
olefile==0.46
openpyxl==2.5.12
packaging==19.0
pandas==0.23.4
partd==0.3.9
pathlib2==2.3.3
pexpect==4.6.0
pickleshare==0.7.5
Pillow==5.4.1
prompt-toolkit==1.0.15
psutil==5.5.0
ptyprocess==0.6.0
pycparser==2.19
Pygments==2.3.1
pyOpenSSL==18.0.0
pyparsing==2.3.1
PySocks==1.6.8
python-dateutil==2.7.5
pytz==2018.9
PyYAML==3.13
requests==2.21.0
scandir==1.9.0
simplegeneric==0.8.1
singledispatch==3.4.0.3
six==1.12.0
sortedcontainers==2.1.0
tblib==1.3.2
toolz==0.9.0
tornado==5.1.1
traitlets==4.3.2
urllib3==1.24.1
wcwidth==0.1.7
xarray==0.11.0
zict==0.1.3

classic rtd environment

Work was recently done to add a python-based rtd program specifically for CLASSIC variables. Nic Annau worked with Vivek and Luke Grant to integrate this into the internal CanESM5.3 and CanESM6.0 versions, so these versions require a classic rtd environment.

On eccc-u2 this environment is currently provided as part of our centralized python environments.

Note

Due to the higher memory load of CanESM6.0, extra work was done to reduce the memory footprint of the classic-rtd job. This is documented here. As a result of this, there are differences between the develop_canesm (CanESM5.3) and v6.0_release (CanESM6.0) versions.

Note

The classic rtds currently rely on a compiled version of nccrip to convert ccc files into netcdf files before loading them and running the analysis. Once netcdf files are available by default, these tools should be adjusted accordingly.

Note

This is only required for CanESM versions >= 5.3.

agcm diagnostic environment

For CanESM6.0, as of July 2025, work is under way to integrate a new AGCM diagnostic pipeline that uses the new canesm-processor package. As such, we have installed a centralized environment for it; see here for details.

Note

This is only required for CanESM versions >= 5.3.

cdo

With the use of netcdf files within the CanESM ecosystem, users/devs often write scripts that do simple file manipulations using cdo (see here). Notably, cdo must be available to run some of the ocean diagnostics. See below for version details and how it has been loaded on different machines.

eccc-u2

>> . ssmuse-sh -x eccc/cmd/cmds/ext/20220331
>> cdo --version
Climate Data Operators version 2.0.3 (https://mpimet.mpg.de/cdo)
System: x86_64-pc-linux-gnu
CXX Compiler: c++ -std=gnu++14 -march=x86-64 -mtune=generic -O2 -pipe -fopenmp -pthread
CXX version : unknown
C Compiler: gcc -march=x86-64 -mtune=generic -O2 -pipe -fopenmp -pthread -pthread
C version : gcc (GCC) 8.3.1 20191121 (Red Hat 8.3.1-5)
F77 Compiler: gfortran -g -O2
F77 version : GNU Fortran (GCC) 8.3.1 20191121 (Red Hat 8.3.1-5)
Features: 992GB 80threads C++14 OpenMP45 Fortran PTHREADS HDF5 NC4/HDF5 OPeNDAP SZ PROJ XML2 CURL FFTW3 SSE2
Libraries: HDF5/1.10.5 proj/6.3.2 xml2/2.9.7 curl/7.61.1
CDI data types: SizeType=size_t  DateType=int64_t
CDI file types: srv ext ieg grb1 grb2 nc1 nc2 nc4 nc4c nc5
     CDI library version : 2.0.3
 cgribex library version : 2.0.0
 ecCodes library version : 2.24.2
  NetCDF library version : 4.7.0 of Aug 15 2019 22:03:10 $
    hdf5 library version : 1.10.5
    exse library version : 1.4.2
    FILE library version : 1.9.1

niagara

>> module load CCEnv arch/avx512 StdEnv/2020 cdo/2.0.5
>> cdo --version
Climate Data Operators version 2.0.5 (https://mpimet.mpg.de/cdo)
System: x86_64-pc-linux-gnu
CXX Compiler: mpicxx -std=gnu++14 -O2 -xCore-AVX512 -ftz -fp-speculation=safe -fp-model source -fPIC -fopenmp -pthread
CXX version : Intel(R) C++ Intel(R) 64 Compiler for applications running on Intel(R) 64, Version 19.1.1.217 Build 20200306
C Compiler: mpicc -O2 -xCore-AVX512 -ftz -fp-speculation=safe -fp-model source -fPIC -fopenmp -pthread -pthread
C version : icc (ICC) 19.1.1.217 20200306
F77 Compiler: mpifort -O2 -xCore-AVX512 -ftz -fp-speculation=safe -fp-model source -fPIC
F77 version : ifort (IFORT) 19.1.1.217 20200306
Features: 93GB 40threads C++14 OpenMP45 Fortran PTHREADS HDF5 NC4/HDF5 OPeNDAP PROJ AVX2
Libraries: HDF5/1.10.6 proj/9.0.0
CDI data types: SizeType=size_t  DateType=int64_t
CDI file types: srv ext ieg grb1 grb2 nc1 nc2 nc4 nc4c nc5
     CDI library version : 2.0.5
 cgribex library version : 2.0.1
 ecCodes library version : 2.25.0
  NetCDF library version : 4.7.4 of May 14 2020 00:07:21 $
    hdf5 library version : 1.10.6
    exse library version : 1.4.2
    FILE library version : 1.9.1
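
When porting to a new machine, the version reported by cdo can be checked in a script rather than by eye. The following is a minimal sketch: both helper names are hypothetical, the comparison relies on the `-V` (version sort) option of GNU `sort`, and the threshold in the usage note is only an example taken from the eccc-u2 listing above, not a documented requirement.

```shell
# Hypothetical helpers for checking the cdo version reported on a machine.

# Read "cdo --version" style text on stdin and print the version number
# taken from the "Climate Data Operators version X.Y.Z" line.
cdo_version_from_text() {
    sed -n 's/^Climate Data Operators version \([0-9][0-9.]*\).*/\1/p' | head -n 1
}

# version_at_least HAVE WANT: succeed when HAVE >= WANT (requires GNU sort -V)
version_at_least() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}
```

A possible use is `have=$(cdo --version 2>&1 | cdo_version_from_text)` followed by `version_at_least "$have" 2.0.3 || echo "cdo is older than 2.0.3" >&2`. The `2>&1` is included on the assumption that some cdo builds print their version information to stderr.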