Introduction

Recipe description

This cookbook recipe describes the automatic deallocation of memory blocks allocated via malloc() calls in C, when the corresponding Python numpy array objects are destroyed. The recipe uses SWIG and a modified numpy.i helper file.

To be more specific, new fragments were added to the existing numpy.i to handle the automatic deallocation of arrays whose size is not known in advance. As with the original fragments, a block of malloc() memory is converted into a returned numpy Python object via a call to PyArray_SimpleNewFromData(). In the new fragments, however, the memory block is also wrapped in a CObject created with PyCObject_FromVoidPtr() and attached to the returned array, which ensures that the allocated memory is automatically freed when the Python object is destroyed. The examples below show how these new fragments avoid leaking memory.
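The ownership idea can be sketched in pure Python. This is only an illustration under stated assumptions, not the numpy.i code: fake_malloc, fake_free, Owner and ArrayView are hypothetical names, the "heap" is a plain dictionary, and weakref.finalize stands in for the destructor attached to the CObject.

```python
import gc
import itertools
import weakref

_ids = itertools.count(1)
live_blocks = {}  # hypothetical stand-in for the C heap: block id -> contents


def fake_malloc(n):
    """Allocate a pretend block of n ints and return its id."""
    block_id = next(_ids)
    live_blocks[block_id] = [0] * n
    return block_id


def fake_free(block_id):
    """Release a pretend block."""
    del live_blocks[block_id]


class Owner(object):
    """Plays the role of the CObject: frees its block when it is destroyed."""
    def __init__(self, block_id):
        self.block_id = block_id
        weakref.finalize(self, fake_free, block_id)


class ArrayView(object):
    """Plays the role of the returned array: a view that may hold an owner."""
    def __init__(self, block_id, managed):
        self.data = live_blocks[block_id]
        self.base = Owner(block_id) if managed else None


def alloc(n, managed):
    return ArrayView(fake_malloc(n), managed)


a = alloc(4, managed=True)
del a            # the view dies, its owner dies, the block is freed
gc.collect()
managed_left = len(live_blocks)

b = alloc(4, managed=False)
del b            # the view dies, but nobody frees the block
gc.collect()
unmanaged_left = len(live_blocks)

print(managed_left, unmanaged_left)
```

The managed view leaves no block behind, while the unmanaged one leaves its block allocated, which is the leak the managed fragments are meant to avoid.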

Since the new fragments are based on the ARGOUTVIEW ones, the name ARGOUTVIEWM was chosen, where M stands for managed. All managed fragments (ARRAY1, 2 and 3, FARRAY1, 2 and 3) were implemented, and have now been extensively tested.

Where to get the files

At the moment, the modified numpy.i file is available here (last updated 2012-04-22):

How the code came about

The original memory deallocation code was written by Travis Oliphant (see http://blog.enthought.com/?p=62 ). As far as I know, the clever people behind pynifti were the first to use it in a SWIG file (see http://niftilib.sourceforge.net/pynifti, file nifticlib.i). Lisandro Dalcin then pointed out a simplified implementation using CObjects, which Travis details in this updated blog post.

How to use the new fragments

Important steps

In yourfile.i, the %init function uses the same import_array() call you already know:

    %init %{
        import_array();
    %}

Then just use ARGOUTVIEWM_ARRAY1 instead of ARGOUTVIEW_ARRAY1, and memory deallocation is handled automatically when the Python array is destroyed (see examples below).

A simple ARGOUTVIEWM_ARRAY1 example

The SWIG-wrapped C function creates an array of N integers, using malloc() to allocate the memory. From Python, this function is called repeatedly (M times), and the returned array is destroyed after each call.

Using the ARGOUTVIEW_ARRAY1 provided in numpy.i, this will create memory leaks (I know ARGOUTVIEW_ARRAY1 has not been designed for this purpose but it's tempting!).

Using the ARGOUTVIEWM_ARRAY1 fragment instead, the memory allocated with malloc() will be automatically deallocated when the array is deleted.

The Python test program creates and deletes an array of 1024^2 ints 2048 times, first using ARGOUTVIEWM_ARRAY1 and then using ARGOUTVIEW_ARRAY1. When a memory allocation fails, the C code sets errno, which the wrapper turns into a Python exception. The test program catches it and reports which iteration caused the allocation to fail.

The C source (ezalloc.c and ezalloc.h)

Here is the ezalloc.h file:

    void alloc(int ni, int** veco, int *n);

Here is the ezalloc.c file:

    #include <stdlib.h>  /* malloc() */
    #include <errno.h>
    #include "ezalloc.h"

    void alloc(int ni, int** veco, int *n)
    {
        int *temp;
        temp = (int *)malloc(ni*sizeof(int));

        //report the failure through errno, checked by the SWIG wrapper
        if (temp == NULL)
            errno = ENOMEM;

        //veco is either NULL or pointing to the allocated block of memory...
        *veco = temp;
        *n = ni;
    }

The interface file (ezalloc.i)

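The file (available here: ezalloc.i) does a couple of interesting things: it applies the managed and unmanaged typemaps to two differently named argument pairs, translates errno values set by the C code into Python exceptions through an %exception block, and exposes two inline wrappers around alloc() under the names alloc_managed and alloc_leaking: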

    %module ezalloc
    %{
    #define SWIG_FILE_WITH_INIT

    #include <errno.h>
    #include "ezalloc.h"
    %}

    %include "numpy.i"

    %init %{
        import_array();
    %}

    /* (veco1, n1) is returned as a managed array, (veco2, n2) as a plain view */
    %apply (int** ARGOUTVIEWM_ARRAY1, int *DIM1) {(int** veco1, int* n1)}
    %apply (int** ARGOUTVIEW_ARRAY1, int *DIM1) {(int** veco2, int* n2)}

    %include "ezalloc.h"

    /* Turn a non-zero errno left by the wrapped call into a Python exception */
    %exception
    {
        errno = 0;
        $action

        if (errno != 0)
        {
            switch(errno)
            {
                case ENOMEM:
                    PyErr_Format(PyExc_MemoryError, "Failed malloc()");
                    break;
                default:
                    PyErr_Format(PyExc_Exception, "Unknown exception");
            }
            SWIG_fail;
        }
    }

    %rename (alloc_managed) my_alloc1;
    %rename (alloc_leaking) my_alloc2;

    %inline %{

    void my_alloc1(int ni, int** veco1, int *n1)
    {
        /* Same alloc() call, returned through ARGOUTVIEWM_ARRAY1 (managed) */
        alloc(ni, veco1, n1);
    }

    void my_alloc2(int ni, int** veco2, int *n2)
    {
        /* Same alloc() call, returned through ARGOUTVIEW_ARRAY1 (leaking) */
        alloc(ni, veco2, n2);
    }

    %}

Don't forget that you will need the numpy.i file in the same directory for this to compile.

Setup file (setup_alloc.py)

This is the setup_alloc.py file:

    #! /usr/bin/env python

    # System imports
    from distutils.core import setup, Extension

    # Third-party modules - we depend on numpy for everything
    import numpy

    # Obtain the numpy include directory.  This logic works across numpy versions.
    try:
        numpy_include = numpy.get_include()
    except AttributeError:
        numpy_include = numpy.get_numpy_include()

    # ezalloc extension module
    _ezalloc = Extension("_ezalloc",
                       ["ezalloc.i","ezalloc.c"],
                       include_dirs = [numpy_include],

                       extra_compile_args = ["--verbose"]
                       )

    # ezalloc setup
    setup(  name        = "alloc functions",
            description = "Testing managed arrays",
            author      = "Egor Zindy",
            version     = "1.0",
            ext_modules = [_ezalloc]
            )

Compiling the module

The setup command-line is (in Windows, using mingw):

$> python setup_alloc.py build --compiler=mingw32

or in UN*X, simply

$> python setup_alloc.py build

Testing the module

If everything goes according to plan, there should be a _ezalloc.pyd file (_ezalloc.so on UN*X) available in the build\lib.XXX directory. The file needs to be copied into the directory containing the ezalloc.py file (generated by SWIG).

A python test program is provided in the SVN repository (test_alloc.py) and reproduced below:

    import ezalloc

    # number of allocations
    n = 2048

    # number of ints per array; multiply by sizeof(int) to get the size in bytes
    # (assuming sizeof(int) == 4)
    m = 1024 * 1024
    err = 0

    print("ARGOUTVIEWM_ARRAY1 (managed arrays) - %d allocations (%d bytes each)" % (n,4*m))
    for i in range(n):
        try:
            #allocating some memory
            a = ezalloc.alloc_managed(m)
            #deleting the array
            del a
        except MemoryError:
            err = 1
            print("Step %d failed" % i)
            break

    if err == 0:
        print("Done!\n")

    # reset the flag so the second run is reported independently
    err = 0

    print("ARGOUTVIEW_ARRAY1 (unmanaged, leaking) - %d allocations (%d bytes each)" % (n,4*m))
    for i in range(n):
        try:
            #allocating some memory
            a = ezalloc.alloc_leaking(m)
            #deleting the array
            del a
        except MemoryError:
            err = 1
            print("Step %d failed" % i)
            break

    if err == 0:
        print("Done? Increase n!\n")

Then, running

$> python test_alloc.py

will produce an output similar to this:

ARGOUTVIEWM_ARRAY1 (managed arrays) - 2048 allocations (4194304 bytes each)
Done!

ARGOUTVIEW_ARRAY1 (unmanaged, leaking) - 2048 allocations (4194304 bytes each)
Step 483 failed

The unmanaged array leaks memory every time the array view is deleted. The managed one will delete the memory block seamlessly. This was tested both in Windows XP and Linux.
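The numbers in the sample output can be checked with a little arithmetic. The sketch below assumes sizeof(int) is 4 bytes, as the test program does, and takes the failing step (483) from the output above; that step will differ from one machine to another.

```python
SIZEOF_INT = 4               # assumption: 4-byte int, as in the test program
ints_per_array = 1024 * 1024
n_allocations = 2048
failed_step = 483            # taken from the sample output above

bytes_per_array = SIZEOF_INT * ints_per_array
total_requested = n_allocations * bytes_per_array
leaked_before_failure = failed_step * bytes_per_array

GIB = 2.0 ** 30
print("bytes per array: %d" % bytes_per_array)
print("total requested over the run: %.2f GiB" % (total_requested / GIB))
print("leaked before the failure: %.2f GiB" % (leaked_before_failure / GIB))
```

Each array takes 4 MiB, so the full run requests 8 GiB in total. The leaking version failed after roughly 1.89 GiB had accumulated, which is in the region where a 32-bit process typically runs out of address space. The managed version never holds more than one block at a time.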

A simple ARGOUTVIEWM_ARRAY2 example

The following example shows how to return a two-dimensional array from C which also benefits from the automatic memory deallocation.

A naive "crop" function is wrapped using SWIG/numpy.i and returns a slice of the input array. When used as array_out = crop.crop(array_in, d1_0,d1_1, d2_0,d2_1), it is equivalent to the native numpy slicing array_out = array_in[d1_0:d1_1, d2_0:d2_1].
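To make the index arithmetic concrete, here is a minimal pure-Python sketch of the same crop on a flat, row-major list. The name crop_flat is hypothetical, and the bounds check is an assumption that follows slicing semantics (the exclusive end index may equal the dimension), so it is not a line-by-line copy of the C code.

```python
def crop_flat(arr_in, dim1, dim2, d1_0, d1_1, d2_0, d2_1):
    """Crop a row-major flat list of dim1*dim2 values.

    Returns (flat_output, rows, cols); end indices are exclusive.
    """
    if not (0 <= d1_0 <= d1_1 <= dim1 and 0 <= d2_0 <= d2_1 <= dim2):
        raise IndexError("Index error")
    rows = d1_1 - d1_0
    cols = d2_1 - d2_0
    out = [0] * (rows * cols)
    for j in range(rows):
        for i in range(cols):
            # element (j, i) of the output is element (j+d1_0, i+d2_0) of the input
            out[j * cols + i] = arr_in[(j + d1_0) * dim2 + (i + d2_0)]
    return out, rows, cols


# 5x9 input where row r holds (1..9) * 10**r
dim1, dim2 = 5, 9
flat = [c * 10 ** r for r in range(dim1) for c in range(1, dim2 + 1)]

cropped, rows, cols = crop_flat(flat, dim1, dim2, 2, 4, 1, 5)

# The same region obtained by slicing nested lists, for comparison
nested = [flat[r * dim2:(r + 1) * dim2] for r in range(dim1)]
expected = [row[1:5] for row in nested[2:4]]

print(cropped, rows, cols)
```

Reshaping the flat result to rows by cols gives the same values as the nested-list slice, which is the equivalence the wrapped function is meant to have with numpy slicing.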

The C source (crop.c and crop.h)

Here is the crop.h file:

    void crop(int *arr_in, int dim1, int dim2, int d1_0, int d1_1, int d2_0, int d2_1, int **arr_out, int *dim1_out, int *dim2_out);

Here is the crop.c file:

    #include <stdio.h>   /* printf() */
    #include <stdlib.h>  /* malloc() */
    #include <errno.h>

    #include "crop.h"

    void crop(int *arr_in, int dim1, int dim2, int d1_0, int d1_1, int d2_0, int d2_1, int **arr_out, int *dim1_out, int *dim2_out)
    {
        int *arr=NULL;
        int dim1_o=0;
        int dim2_o=0;
        int i,j;

        //value checks (d1_1 and d2_1 are exclusive end indices, as in numpy slicing)
        if ((d1_1 < d1_0) || (d2_1 < d2_0) ||
            (d1_0 >= dim1) || (d1_1 > dim1) || (d1_0 < 0) || (d1_1 < 0) ||
            (d2_0 >= dim2) || (d2_1 > dim2) || (d2_0 < 0) || (d2_1 < 0))
        {
            errno = EPERM;
            goto end;
        }

        //output sizes
        dim1_o = d1_1-d1_0;
        dim2_o = d2_1-d2_0;

        //memory allocation
        arr = (int *)malloc(dim1_o*dim2_o*sizeof(int));
        if (arr == NULL)
        {
            errno = ENOMEM;
            goto end;
        }

        //copying the cropped arr_in region to arr (naive implementation)
        printf("\n--- d1_0=%d d1_1=%d (rows)  -- d2_0=%d d2_1=%d (columns)\n",d1_0,d1_1,d2_0,d2_1);
        for (j=0; j<dim1_o; j++)
        {
            for (i=0; i<dim2_o; i++)
            {
                arr[j*dim2_o+i] = arr_in[(j+d1_0)*dim2+(i+d2_0)];
                printf("%d ",arr[j*dim2_o+i]);
            }
            printf("\n");
        }
        printf("---\n\n");

    end:
        //on error, arr is NULL and both output sizes are 0
        *dim1_out = dim1_o;
        *dim2_out = dim2_o;
        *arr_out = arr;
    }

The interface file (crop.i)

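The file (available here: crop.i) does a couple of interesting things: an %exception block restricted to crop translates EPERM into an IndexError and ENOMEM into a MemoryError, the input array is passed in through IN_ARRAY2, and the cropped copy is returned through ARGOUTVIEWM_ARRAY2: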

    %module crop
    %{
    #define SWIG_FILE_WITH_INIT

    #include <errno.h>
    #include "crop.h"
    %}

    %include "numpy.i"

    %init %{
        import_array();
    %}

    /* Turn a non-zero errno left by crop() into a Python exception */
    %exception crop
    {
        errno = 0;
        $action

        if (errno != 0)
        {
            switch(errno)
            {
                case EPERM:
                    PyErr_Format(PyExc_IndexError, "Index error");
                    break;
                case ENOMEM:
                    PyErr_Format(PyExc_MemoryError, "Not enough memory");
                    break;
                default:
                    PyErr_Format(PyExc_Exception, "Unknown exception");
            }
            SWIG_fail;
        }
    }

    /* Input array and its two dimensions; managed output array and its two dimensions */
    %apply (int* IN_ARRAY2, int DIM1, int DIM2) {(int *arr_in, int dim1, int dim2)}
    %apply (int** ARGOUTVIEWM_ARRAY2, int* DIM1, int* DIM2) {(int **arr_out, int *dim1_out, int *dim2_out)}

    %include "crop.h"

Don't forget that you will need the numpy.i file in the same directory for this to compile.
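The errno-to-exception translation done by the %exception block can be mirrored in a few lines of Python. This is only a sketch of the mapping, not code generated by SWIG; raise_for_errno and describe are hypothetical helper names.

```python
import errno

# Mapping mirrored from the %exception block: errno value -> (exception, message)
ERRNO_TO_EXCEPTION = {
    errno.EPERM: (IndexError, "Index error"),
    errno.ENOMEM: (MemoryError, "Not enough memory"),
}


def raise_for_errno(code):
    """Raise the exception matching a non-zero errno value; do nothing for 0."""
    if code == 0:
        return
    exc_type, message = ERRNO_TO_EXCEPTION.get(code, (Exception, "Unknown exception"))
    raise exc_type(message)


def describe(code):
    """Return the name of the exception raised for code, or 'no error'."""
    try:
        raise_for_errno(code)
    except Exception as exc:
        return type(exc).__name__
    return "no error"


print(describe(0), describe(errno.EPERM), describe(errno.ENOMEM), describe(errno.EINVAL))
```

Any errno value that is not listed falls through to a generic Exception, just like the default branch of the switch statement.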

Setup file (setup_crop.py)

This is the setup_crop.py file:

    #! /usr/bin/env python

    # System imports
    from distutils.core import setup, Extension

    # Third-party modules - we depend on numpy for everything
    import numpy

    # Obtain the numpy include directory.  This logic works across numpy versions.
    try:
        numpy_include = numpy.get_include()
    except AttributeError:
        numpy_include = numpy.get_numpy_include()

    # crop extension module
    _crop = Extension("_crop",
                       ["crop.i","crop.c"],
                       include_dirs = [numpy_include],

                       extra_compile_args = ["--verbose"]
                       )

    # crop setup
    setup(  name        = "crop test",
            description = "A simple crop test to demonstrate the use of ARGOUTVIEWM_ARRAY2",
            author      = "Egor Zindy",
            version     = "1.0",
            ext_modules = [_crop]
            )

Testing the module

If everything goes according to plan, there should be a _crop.pyd file (_crop.so on UN*X) available in the build\lib.XXX directory. The file needs to be copied into the directory containing the crop.py file (generated by SWIG).

A python test program is provided in the SVN repository (test_crop.py) and reproduced below:

    import crop
    import numpy

    # plain int is used here: the numpy.int alias was removed from recent numpy releases
    a = numpy.zeros((5,10), int)
    a[numpy.arange(5),:] = numpy.arange(10)

    b = numpy.transpose([(10 ** numpy.arange(5))])
    a = (a*b)[:,1:] #this array is most likely NOT contiguous

    print(a)
    print("dim1=%d dim2=%d" % (a.shape[0],a.shape[1]))

    d1_0 = 2
    d1_1 = 4
    d2_0 = 1
    d2_1 = 5

    c = crop.crop(a, d1_0,d1_1, d2_0,d2_1)
    d = a[d1_0:d1_1, d2_0:d2_1]

    print("returned array:")
    print(c)

    print("native slicing:")
    print(d)

This is what the output looks like:

$ python test_crop.py 
[[    1     2     3     4     5     6     7     8     9]
 [   10    20    30    40    50    60    70    80    90]
 [  100   200   300   400   500   600   700   800   900]
 [ 1000  2000  3000  4000  5000  6000  7000  8000  9000]
 [10000 20000 30000 40000 50000 60000 70000 80000 90000]]
dim1=5 dim2=9

--- d1_0=2 d1_1=4 (rows)  -- d2_0=1 d2_1=5 (columns)
200 300 400 500 
2000 3000 4000 5000 
---

returned array:
[[ 200  300  400  500]
 [2000 3000 4000 5000]]
native slicing:
[[ 200  300  400  500]
 [2000 3000 4000 5000]]

numpy.i takes care of making the array contiguous if needed, so the only thing left to take care of is the array orientation.
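What "making the array contiguous" means can be illustrated without numpy. The sketch below is an assumption-laden illustration, not numpy.i's actual code: is_c_contiguous and pack are hypothetical helpers, and strides are counted in elements rather than in bytes as numpy does.

```python
def is_c_contiguous(shape, strides):
    """True if the element strides match a packed row-major layout for shape."""
    expected = 1
    for dim, stride in zip(reversed(shape), reversed(strides)):
        if dim > 1 and stride != expected:
            return False
        expected *= dim
    return True


def pack(buffer, offset, shape, strides):
    """Copy a 2-D strided view into a new packed row-major list."""
    rows, cols = shape
    return [buffer[offset + r * strides[0] + c * strides[1]]
            for r in range(rows) for c in range(cols)]


# A 5x10 row-major buffer, viewed as columns 1..9 of every row (like a[:, 1:])
buffer = list(range(50))
shape, strides, offset = (5, 9), (10, 1), 1

packed = pack(buffer, offset, shape, strides)

print(is_c_contiguous(shape, strides))   # the strided view
print(is_c_contiguous(shape, (9, 1)))    # the packed copy
print(packed[:9])
```

The view skips one element at the start of every row, so a C function that indexes with j*dim2+i would read the wrong values. The packed copy has the layout such a function expects.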

Conclusion and comments

That's all folks! Files are available on the Google code SVN. As usual, comments welcome!

Regards, Egor



Cookbook/SWIG Memory Deallocation (last edited 2012-04-22 10:06:50 by EgorZindy)