Cdo{rb,py} allows you to use CDO from Python and Ruby as if it were a native library. It offers several features that make it more convenient than using CDO in a plain shell:

  • automatic tempfile handling
  • conditional processing (i.e. only create output files that do not exist yet) as a configuration option
  • flexible multi-threading
  • direct data access via numpy/narray (together with optional dep. python-netcdf4/scipy)
  • write new operators out of old ones
  • work with multiple CDO versions simultaneously

The current release is 1.2.3. There is also a github repository for easy code sharing, where the changelog is tracked.

If you have questions, please use the CDO forum.

what it is (not) ...

This scripting language package is essentially a wrapper around the CDO binary. It parses method arguments and options, builds a command line and executes it. There is no shared C-library backend which calls CDO operators. This has some advantages:

  • operator chaining is fully supported
  • multiple CDO binaries can be used at the same time using setCdo()
  • packages are highly portable, because they are pure python/ruby implementations
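
The wrapper idea described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the package's actual implementation: the function name build_command and its parameters are hypothetical, and nothing is executed. It shows how a method-style call with operator arguments, input, output and options could be translated into a CDO command line.

```python
import shlex

def build_command(operator, *args, input="", output="", options="", cdo="cdo"):
    """Translate a method-style call into a CDO argument list (hypothetical helper)."""
    cmd = [cdo]
    # global options such as '-f nc' come first
    cmd += shlex.split(options)
    # operator arguments are appended comma-separated, e.g. remap,grid,weights
    op = ",".join([operator] + [str(a) for a in args])
    cmd.append("-" + op)
    # the input may itself be an operator chain, so it is split on whitespace
    cmd += shlex.split(input)
    if output:
        cmd.append(output)
    return cmd

print(" ".join(build_command("seltimestep", "1/10",
                             input="in.nc", output="out.nc", options="-f nc")))
# prints: cdo -f nc -seltimestep,1/10 in.nc out.nc
```

A real wrapper would hand such a list to a subprocess call. Because only the binary name is fixed per call, several CDO binaries can coexist.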


Almost all features are covered by unit tests. These should be a good starting point to get an impression of how to use the package:

  • Python: source:trunk/cdo/contrib/python/test/ or at github
  • Ruby: source:trunk/cdo/contrib/ruby/test/test_cdo.rb or at github

Both bindings are tested with the unix and the win32 versions of CDO. Please note that returning arrays by setting returnCdf is not tested on windows, due to the lack of the corresponding netcdf library there. There are precompiled windows versions of netcdf, but I will not spend time getting them to run.

Before doing anything else, the libraries must be loaded in the usual way:

from cdo import *   # python version
cdo = Cdo()

require 'cdo'       # ruby version

In the python version an object has to be created for internal reasons, whereas this is not necessary for Ruby. This may change in the future, but for now it is only a minor difference.

online/offline help

For all non-operator methods, the automatically generated documentation from rubygems might be helpful. Operator documentation can be viewed online, directly on the command line by calling

cdo -h <operator pattern>
or within the interactive python/ruby shell. Both of the following examples display the built-in help for sinfov:

  • Python:
    from cdo import *
    cdo = Cdo()
    help(cdo.sinfov)
  • Ruby
    require 'cdo'
    Cdo.help(:sinfov)
    Cdo.help('sinfov')     # or


Input and output files can be set with the keywords input and output:

    Cdo.infov(:input => ifile)      #ruby version
    Cdo.showlevel(:input => ifile)  #ruby version
    cdo.infov(input=ifile)          #python version
    Cdo.timmin(:input => ifile ,:output => ofile)   #ruby version
    cdo.timmin(input = ifile,    output =  ofile)   #python version


Command-line options like '-f' or '-P' can be used via the options keyword:

    Cdo.timmin(:input => ifile ,:output => ofile,:options => '-f nc') #ruby version
    cdo.timmin(input = ifile,    output = ofile,  options = '-f nc')  #python version

Operator arguments have to be given as the first method argument:

    Cdo.remap(gridFile,    weightFile,:input => ifile,:output => ofile,:options => '-f nc') #ruby version
    cdo.remap(gridFile+","+weightFile, input =  ifile, output =  ofile, options = '-f nc')  #python version

    Cdo.seltimestep('1/10',:input => ifile,:output => ofile,:options => '-r -B F64') #ruby version
    cdo.seltimestep('1/10', input =  ifile, output =  ofile, options =  '-r -B F64') #python version

Operator Chains

To take real advantage of CDO's internal parallelism, you should work with operator chains as much as possible:

    Cdo.setname('random',:input => "-mul -random,r10x10 -enlarge,r10x10 -setyear,2000 -for,1,4",:output => ofile,:options => '-f nc') #ruby version
    cdo.setname('random', input =  "-mul -random,r10x10 -enlarge,r10x10 -setyear,2000 -for,1,4", output =  ofile, options =  '-f nc') #python version

Another good example, taken from the Tutorial, illustrates the different ways of chaining. The chain

cdo sub -dayavg ifile2 -timavg ifile1 ofile
is represented by

Cdo.sub(:input => "-dayavg #{ifile2} -timavg #{ifile1}", :output => ofile)  #ruby
cdo.sub(input = "-dayavg " + ifile2 + " -timavg " +ifile1, output = ofile)  #python

The serial version, which prevents internal parallelism and creates unnecessary temporary files, is mentioned only for educational reasons. It would look like

Cdo.sub(:input => Cdo.dayavg(:input => ifile2) + " " + Cdo.timavg(:input => ifile1), :output => ofile)  #ruby
cdo.sub(input  =  cdo.dayavg(input  =  ifile2) + " " + cdo.timavg(input  =  ifile1), output  =  ofile)  #python

or using the join-method:
Cdo.sub(:input => [Cdo.dayavg(:input => ifile2),Cdo.timavg(:input => ifile1)].join(" "), :output => ofile)  #ruby
cdo.sub(input  =  " ".join([cdo.dayavg(input  =  ifile2),cdo.timavg(input  =  ifile1)]), output  =  ofile)  #python
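
Since a chain is passed as an ordinary string, it can also be assembled programmatically. The following is a minimal sketch under that assumption. The helper chain is hypothetical and not part of the package; it only builds the string that would be given to the input keyword.

```python
def chain(*steps):
    """Join operator tuples and file names into one CDO input string.

    Each step is either a plain file name (str) or a tuple of the form
    (operator, arg1, arg2, ...). Hypothetical helper for illustration only.
    """
    parts = []
    for step in steps:
        if isinstance(step, str):
            parts.append(step)                      # plain file name
        else:
            operator, *args = step
            parts.append("-" + ",".join([operator] + [str(a) for a in args]))
    return " ".join(parts)

print(chain(("dayavg",), "ifile2.nc", ("timavg",), "ifile1.nc"))
# prints: -dayavg ifile2.nc -timavg ifile1.nc
```

The resulting string could then be used in a call such as cdo.sub(input = ..., output = ofile), so that CDO still sees one single chain.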

Special Features

Tempfile handling

If the output stream is omitted, a temporary file is written and its name is the return value of the call:

    ofile = Cdo.timmin(:input => ifile ,:options => '-f nc')   #ruby version
    ofile = cdo.timmin(input  =  ifile,  options =  '-f nc')   #python version

Here, the output files are automatically removed when the script finishes. Manual cleanup is no longer necessary.
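
How such automatic cleanup can work in principle is sketched below. This is an assumption about the general technique (registering temporary files and deleting them at interpreter exit), not a copy of the package's code. The class name TempfileRegistry and the file prefix are hypothetical.

```python
import atexit
import os
import tempfile

class TempfileRegistry:
    """Hand out temporary file names and delete the files at interpreter exit."""

    def __init__(self):
        self.paths = []
        atexit.register(self.cleanup)   # runs when the script finishes

    def new_path(self, suffix=".nc"):
        fd, path = tempfile.mkstemp(prefix="cdo_sketch_", suffix=suffix)
        os.close(fd)                    # only the name is needed by an external tool
        self.paths.append(path)
        return path

    def cleanup(self):
        for path in self.paths:
            if os.path.exists(path):
                os.remove(path)
        self.paths = []

registry = TempfileRegistry()
ofile = registry.new_path()             # would be returned instead of a user-given output
print(os.path.exists(ofile))            # prints: True
```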

Conditional Processing

When processing a large number of input files, as is the case in a running experiment, it can be very helpful to suppress the creation of intermediate output if these files already exist. This can speed up your post-processing. By default, output is created regardless of whether something is overwritten. Conditional processing can be used in two different ways:

  • global setting
    cdo.forceOutput = False   #python
    Cdo.forceOutput = false   #ruby
    This switch changes the default behavior.
  • operator option
    cdo.stdatm("0,10,20",output = ofile, force =  False)  #python
    Cdo.stdatm(0,10,20,:output => ofile,:force => false)  #ruby
    This option allows you to control the output action very precisely without changing the default.
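
The interplay of the global switch and the per-call option can be summarized in a small decision function. This is a sketch of the logic as described above, not the package's code. The function should_run and its parameter names are hypothetical.

```python
import os
import tempfile

def should_run(output, force=None, force_output=True):
    """Decide whether an operator call has to be executed.

    force        -- per-call switch; None means "use the global default"
    force_output -- global default, mirroring the forceOutput setting
    """
    if force is None:
        force = force_output
    # run if overwriting is allowed, or if the output does not exist yet
    return bool(force) or not os.path.exists(output)

# demonstrate with a file that exists for the duration of the block
with tempfile.NamedTemporaryFile(suffix=".nc") as existing:
    print(should_run(existing.name))               # prints: True  (default overwrites)
    print(should_run(existing.name, force=False))  # prints: False (output already there)
```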


When things can be done in parallel, Python and Ruby offer a smart way to handle this without too much overhead. A Ruby example should illustrate how it can be done (see also the Tutorial):

require 'cdo'
require 'jobqueue'

iFile                 = ARGV[0].nil? ? 'ifs_oper_T1279_2011010100.grb' : ARGV[0]
targetGridFile        = ARGV[1].nil? ? ''            : ARGV[1] # grid file
targetGridweightsFile = ARGV[2].nil? ? ''          : ARGV[2] # pre-computed interpolation weights
nWorkers              = ARGV[3].nil? ? 8                               : ARGV[3].to_i # number of parallel threads

# let's work in debug mode
Cdo.debug = true

# create a queue with a predefined number of workers
jq = JobQueue.new(nWorkers)

# split the input file wrt to variable names,codes,levels,grids,timesteps,...
splitTag = "ifs2icon_skel_split_" 
#Cdo.splitcode(:in => iFile, :out => splitTag,:options => '-f nc')
Cdo.splitname(:in => iFile, :out => splitTag,:options => '-f nc')

# collect files from the split
files = Dir.glob("#{splitTag}*.nc")

# remap variables in parallel
files.each {|file|
  jq.push {
    basename = file[0..-(File.extname(file).size+1)]
    Cdo.remap(targetGridFile,targetGridweightsFile,
              :in => file,
              :out => "remapped_#{basename}.nc")
  }
}
jq.run

# Merge all the results together
Cdo.merge(:in => Dir.glob("remapped_*.nc").join(" "),:out => '')

In this case the parallelization is done per variable. The only lines which had to be added to let the code run on a user-defined number of threads (see line 7) are 2, 13, 25, 30 and 32. This approach uses a queue, which collects all jobs and is then started with its run method. A python version of JobQueue should be easy to implement. Contributions would be appreciated!
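
A Python counterpart of such a queue could look roughly like the following sketch. It is not part of the package. The class JobQueue below is a hypothetical re-implementation on top of the standard library's ThreadPoolExecutor, and the lambda is only a stand-in for a real call like cdo.remap(...).

```python
from concurrent.futures import ThreadPoolExecutor

class JobQueue:
    """Collect jobs first, then run them on a fixed number of worker threads."""

    def __init__(self, n_workers=2):
        self.n_workers = n_workers
        self.jobs = []

    def push(self, func, *args, **kwargs):
        # store the job; nothing is executed yet
        self.jobs.append((func, args, kwargs))

    def run(self):
        # execute all collected jobs and return their results in push order
        with ThreadPoolExecutor(max_workers=self.n_workers) as pool:
            futures = [pool.submit(f, *a, **kw) for f, a, kw in self.jobs]
            results = [future.result() for future in futures]
        self.jobs = []
        return results

jq = JobQueue(n_workers=4)
for name in ["a.nc", "b.nc", "c.nc"]:
    jq.push(lambda f: "remapped_" + f, name)   # stand-in for a remap call
print(jq.run())
# prints: ['remapped_a.nc', 'remapped_b.nc', 'remapped_c.nc']
```

Threads are sufficient here because each job would mostly wait for an external CDO process rather than compute in Python itself.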

A multiprocessing based example may look like

from cdo import *
import multiprocessing

def showlevel(arg):
    return cdo.showlevel(input=arg)

cdo       = Cdo()
cdo.debug = True
ifile     = '/home/ram/local/data/cdo/'
pool      = multiprocessing.Pool(1)
results   = []

for i in range(0,5):
    results.append(pool.apply_async(showlevel, [ifile]))


for res in results:
    print(res.get())

Data access via numpy/narray

When working with netcdf, it is possible to get access to the data in three additional ways:

  1. a file handle: Using a file handle offers the flexibility to go through the whole file with all its information like variables, dimensions and attributes. To get such a handle from a cdo call, use the returnCdf keyword or the readCdf method:
    cdo.stdatm("0", options = "-f nc", returnCdf  =  True).variables["P"][:]  #python, access variable 'P' with
    Cdo.stdatm(0, :options => "-f nc", :returnCdf => true).var("P").get       #ruby , access with ruby-netcdf
    or return the pure handle with
    cdo.readCdf(ifile)  #python
    Cdo.readCdf(ifile)  #ruby
  2. a numpy/narray object: If a certain variable should be read in, use the returnArray instead of returnCdf:
    pressure = cdo.stdatm("0", options = "-f nc",  returnArray = 'P')  #python
    pressure = Cdo.stdatm(0, :options => "-f nc", :returnArray => 'P')  #ruby
  3. a masked array: If the target variable has missing values, i.e. makes use of the FillValue, the returned structure reflects this. For python a masked array is returned, whereas the ruby version uses a special version of NArray called NArrayMiss. As an example, let's mask out the ocean from the global topography:
    oro = cdo.setrtomiss(-10000,0, input =  cdo.topo( options =  '-f nc'), returnMaArray =  'topo')  #python
    oro = Cdo.setrtomiss(-10000,0,:input => Cdo.topo(:options => '-f nc'),:returnMaArray => 'topo')  #ruby

Have a look at the documentation of the underlying netcdf libraries to get an overview of their functionality.


The python module requires scipy or netcdf4-python (or pycdf as a fallback), whereas the ruby module needs ruby-netcdf. These dependencies are not handled automatically by pip or gem, because they are optional. Scipy and netcdf4-python are available for most linux/unix distributions as precompiled packages. If this is not the case for your favorite one, you can also install them with pip. The ruby-netcdf package is available as a gem:

  • Ruby:
    gem install ruby-netcdf
    gem install ruby-netcdf --user-install
  • Python:
    pip install scipy
    or visit the homepage for help on manual installation

Use Cases: Plotting

Examples: Python

from cdo import *
cdo   = Cdo()                                                         # create the CDO caller
ifile = ''                                                    # input: surface temperature
cdo.fldsum(input=ifile)                                               # compute the timeseries of global sum, return a temporary filename
vals  = cdo.fldsum(input=ifile,returnCdf=True).variables['tsurf'][:]  # return the timeseries as numpy array
print(cdo.fldsum(input=ifile,returnCdf=True).variables)               # get a list of all variables in the file 

Basic plotting:

from cdo import *
import matplotlib.pyplot as plt
ifile = ''
cdo   = Cdo()

# Compute the field mean value timeseries and return it as a numpy array
vals  = cdo.fldmean(input=ifile,returnCdf=True).variables['tsurf'][:] 

# make it 1D
tmean = vals.flatten()

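# Plot the cumulative sum of the variation
plt.plot((tmean - tmean.mean()).cumsum())
plt.show()

This produces the plot in tmean_py.png; the original time series is shown in tsurfOrg.png.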

Examples: Ruby

require 'cdo'
ifile = ''                                                   # input: surface temperature
vals  = Cdo.fldsum(:in => ifile,:returnCdf=> true).var('tsurf').get  # return the global sum timeseries as narray
puts Cdo.fldsum(:in=> ifile,:returnCdf => true).var_names            # get a list of all variables in the file 

If you want some basic plotting, use the Ruby bindings of the GNU Scientific Library. They can be installed in the same way as cdo. Here's a short example:

require 'cdo'
require 'gsl'
tmean = Cdo.fldmean(:in => ifile,:returnCdf => true).var('tsurf').get
tmean.to_gv.plot("w d title 'AMPI global mean surface temp'")

which shows the plot in plotTest.png.

In this context the variable tmean is of type narray, which is the ruby counterpart of numpy. It has several methods of its own. To filter out the temporal behaviour of the above time series, you could subtract the mean value and display the cumulative sum by adding:

(tmean-tmean.mean)[0,0,0..-1].cumsum.to_gv.plot("w d title 'CUMSUM of global mean surface temp variation'")

with the result shown in plotTestCumsum.png.

Use Cases: Interpolation, Root finding, Data fitting, ...

Through the numpy/narray interface, both the python and the ruby version offer a huge amount of extra functionality via several 3rd party libraries.

Write your own operators

Future versions

Neither cdo module is tied to a specific CDO version. Instead, you can switch to whatever CDO version you have installed: use the setCdo method to select another CDO binary. When CDO is updated and new operators are available, they are usable in the python and ruby modules automatically without any update.
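
One way such version independence can be achieved is dynamic method dispatch: any unknown method name is simply treated as an operator name. The sketch below illustrates this idea with Python's __getattr__. It is an assumption about the general technique, not the package's actual code. The class OperatorProxy, the operator name some_future_operator and the path /opt/cdo-new/bin/cdo are hypothetical.

```python
class OperatorProxy:
    """Turn any attribute access into an operator call (hypothetical sketch)."""

    def __init__(self, binary="cdo"):
        self.binary = binary

    def __getattr__(self, operator):
        # Only called for names that are not regular attributes,
        # i.e. for every operator name, known or not yet known.
        def call(*args, input="", output=""):
            op = ",".join([operator] + [str(a) for a in args])
            parts = (self.binary, "-" + op, input, output)
            return " ".join(p for p in parts if p)
        return call

old = OperatorProxy("cdo")
new = OperatorProxy("/opt/cdo-new/bin/cdo")   # hypothetical second installation
print(old.timmin(input="in.nc", output="out.nc"))
# prints: cdo -timmin in.nc out.nc
print(new.some_future_operator(3, input="in.nc", output="out.nc"))
# prints: /opt/cdo-new/bin/cdo -some_future_operator,3 in.nc out.nc
```

The sketch only returns the command line as a string. A real wrapper would execute it and could additionally check the operator name against the list reported by the selected binary.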


CDO can be easily accessed via Ruby and Python. For each of these two there is a dedicated package which can be installed from public servers with their own package management systems: gem for Ruby and pypi for Python. The interfaces of both packages are identical.


Ruby's package system is called gem. The cdo module is located here. Its installation is rather easy: Just type

gem install cdo
and you'll get the latest version installed on your system. Installation usually requires root privileges, so it might be necessary to prepend sudo to the command. gem has a great built-in help:
gem help install
will show all you need for installation. If you do not have root access to your machine, another installation directory should be chosen with the --install-dir option, or you can use
gem install cdo --user-install
for an installation under $HOME/.gem.

Ruby 1.9.x comes with gem included, but some distros like debian and its derivatives create extra packages for it. You might have to look out for a rubygems package.


The cdo module can also be installed for python using pypi, the python package index. Cdo can be found here. If pip is installed on your system, just type

pip install cdo
For user installations, use
pip install --build=/tmp/pip_build --src=/tmp/src_build --user cdo
Please Note: For upgrading with pip, you have to remove the temporary directories first. Otherwise the upgrade will not take place:
rm -rf /tmp/pip /tmp/src && pip install --build=/tmp/pip --src=/tmp/src --user cdo --upgrade

Without pip, you should download the tar file and run (possibly requiring root privileges)
python setup.py install
in the extracted directory.

For ZMAW users

Depending on the target language, use module load python or module load ruby before you use the cdo bindings. For getting easy access to the data, the netcdf bindings for Python/Ruby have to be installed:

  • Python: pycdf is installed for the python-2.7 module on all lenny-64 workstations. Using it should work with and without returnCdf=True.
  • Ruby: ruby-netcdf is installed for the available ruby module. In case of problems it can be installed by every user locally:
    1. Create a directory structure (if you didn't do it for cdo.rb) for local gems with
      mkdir -p $HOME/.gem/ruby/1.9.1
    2. Call
      gem install -r ruby-netcdf --install-dir=$HOME/.gem/ruby/1.9.1 -- --with-netcdf-dir=/sw/lenny-x64/netcdf-4.1.3-gccsys

plotTest.png (26.4 KB) Ralf Mueller, 2012-01-20 10:06

plotTestCumsum.png (18.6 KB) Ralf Mueller, 2012-01-20 10:18

tmean_py.png (38.6 KB) Ralf Mueller, 2012-01-24 14:08

tsurfOrg.png (30.4 KB) Ralf Mueller, 2012-11-22 09:22

tsurfOrg.png (59.9 KB) Ralf Mueller, 2012-11-22 12:30

tmean_py.png (36.7 KB) Ralf Mueller, 2012-11-22 12:30 (482 KB) Ralf Mueller, 2015-04-20 10:20 (70.6 MB) Ralf Mueller, 2015-04-20 10:26