1. Extending Python with C or C++
It is quite easy to add new built-in modules to Python if you know how to program in C. Such extension modules can do two things that can’t be done directly in Python: they can implement new built-in object types, and they can call C library functions and system calls.
To support extensions, the Python API (Application Programming Interface) defines a set of functions, macros and variables that provide access to most aspects of the Python run-time system. The Python API is incorporated in a C source file by including the header "Python.h".
How an extension module is compiled depends on its intended use as well as on your system setup; details are given in later chapters.
Note
The C extension interface is specific to CPython, and extension modules do not work on other Python implementations. In many cases, it is possible to avoid writing C extensions and preserve portability to other implementations. For example, if your use case is calling C library functions or system calls, you should consider using the ctypes module or the cffi library rather than writing custom C code. These modules let you write Python code to interface with C code and are more portable between implementations of Python than writing and compiling a C extension module.
1.1. A Simple Example
Let’s create an extension module called spam (the favorite food of Monty Python fans...) and let’s say we want to create a Python interface to the C library function system(). This function takes a null-terminated character string as argument and returns an integer. We want this function to be callable from Python as follows:
>>> import spam
>>> status = spam.system("ls -l")
Begin by creating a file spammodule.c. (Historically, if a module is called spam, the C file containing its implementation is called spammodule.c; if the module name is very long, like spammify, the file can be named just spammify.c.)
The first line of our file can be:
#include <Python.h>
which pulls in the Python API (you can add a comment describing the purpose of the module and a copyright notice if you like).
Note
Since Python may define some pre-processor definitions which affect the standard headers on some systems, you must include Python.h before any standard headers are included.
All user-visible symbols defined by Python.h have a prefix of Py or PY, except those defined in standard header files. For convenience, and since they are used extensively by the Python interpreter, "Python.h" includes a few standard header files: <stdio.h>, <string.h>, <errno.h>, and <stdlib.h>. If the last of these, <stdlib.h>, does not exist on your system, Python.h declares the functions malloc(), free() and realloc() directly.
The next thing we add to our module file is the C function that will be called when the Python expression spam.system(string) is evaluated (we’ll see shortly how it ends up being called):
static PyObject *
spam_system(PyObject *self, PyObject *args)
{
    const char *command;
    int sts;

    if (!PyArg_ParseTuple(args, "s", &command))
        return NULL;
    sts = system(command);
    return PyLong_FromLong(sts);
}
There is a straightforward translation from the argument list in Python (for example, the single expression "ls -l") to the arguments passed to the C function. The C function always has two arguments, conventionally named self and args.
The self argument points to the module object for module-level functions; for a method it would point to the object instance (roughly the counterpart of the this pointer in C++).
The args argument will be a pointer to a Python tuple object containing the arguments (Python packs the call's positional arguments into this tuple). Each item of the tuple corresponds to an argument in the call’s argument list. The arguments are Python objects, passed as PyObject * pointers; in order to do anything with them in our C function we have to convert them to C values. The function PyArg_ParseTuple() in the Python API checks the argument types and converts them to C values. It uses a template string to determine the required types of the arguments as well as the types of the C variables into which to store the converted values. More about this later.
PyArg_ParseTuple() returns true (nonzero) if all arguments have the right type and their components have been stored in the variables whose addresses are passed (the results are written through those addresses, much like output parameters passed by reference in C++). It returns false (zero) if an invalid argument list was passed. In the latter case it also raises an appropriate exception so the calling function can return NULL immediately (as we saw in the example).
1.2. Intermezzo: Errors and Exceptions
An important convention throughout the Python interpreter is the following: when a function fails, it should set an exception condition and return an error value (usually a NULL pointer). Exceptions are stored in a static global variable inside the interpreter; if this variable is NULL no exception has occurred. A second global variable stores the “associated value” of the exception (the second argument to raise). A third variable contains the stack traceback in case the error originated in Python code. These three variables are the C equivalents of the result in Python of sys.exc_info() (see the section on module sys in the Python Library Reference). It is important to know about them to understand how errors are passed around.
The Python API defines a number of functions to set various types of exceptions.
The most common one is PyErr_SetString(). Its arguments are an exception object and a C string. The exception object is usually a predefined object like PyExc_ZeroDivisionError. The C string indicates the cause of the error and is converted to a Python string object and stored as the “associated value” of the exception.
Another useful function is PyErr_SetFromErrno(), which only takes an exception argument and constructs the associated value by inspection of the global variable errno. The most general function is PyErr_SetObject(), which takes two object arguments, the exception and its associated value. You don’t need to Py_INCREF() the objects passed to any of these functions.
You can test non-destructively whether an exception has been set with PyErr_Occurred(). This returns the current exception object, or NULL if no exception has occurred. You normally don’t need to call PyErr_Occurred() to see whether an error occurred in a function call, since you should be able to tell from the return value.
When a function f that calls another function g detects that the latter fails, f should itself return an error value (usually NULL or -1). It should not call one of the PyErr_*() functions, because one has already been called by g. f's caller is then supposed to also return an error indication to its caller, again without calling PyErr_*(), and so on: the most detailed cause of the error was already reported by the function that first detected it. Once the error reaches the Python interpreter’s main loop, this aborts the currently executing Python code and tries to find an exception handler specified by the Python programmer.
(There are situations where a module can actually give a more detailed error message by calling another PyErr_*() function, and in such cases it is fine to do so. As a general rule, however, this is not necessary, and can cause information about the cause of the error to be lost: most operations can fail for a variety of reasons.)
To ignore an exception set by a function call that failed, the exception condition must be cleared explicitly by calling PyErr_Clear(). The only time C code should call PyErr_Clear() is if it doesn’t want to pass the error on to the interpreter but wants to handle it completely by itself (possibly by trying something else, or pretending nothing went wrong).
Every failing malloc() call must be turned into an exception: the direct caller of malloc() (or realloc()) must call PyErr_NoMemory() and return a failure indicator itself. All the object-creating functions (for example, PyLong_FromLong()) already do this, so this note is only relevant to those who call malloc() directly.
Also note that, with the important exception of PyArg_ParseTuple() and friends, functions that return an integer status usually return a positive value or zero for success and -1 for failure, like Unix system calls.
Finally, be careful to clean up garbage (by making Py_XDECREF() or Py_DECREF() calls for objects you have already created) when you return an error indicator!
The choice of which exception to raise is entirely yours. There are predeclared C objects corresponding to all built-in Python exceptions, such as PyExc_ZeroDivisionError, which you can use directly. Of course, you should choose exceptions wisely: don’t use PyExc_TypeError to mean that a file couldn’t be opened (that should probably be PyExc_IOError, which is an alias of PyExc_OSError since Python 3.3). If something’s wrong with the argument list, the PyArg_ParseTuple() function usually raises PyExc_TypeError. If you have an argument whose value must be in a particular range or must satisfy other conditions, PyExc_ValueError is appropriate.
You can also define a new exception that is unique to your module. For this, you usually declare a static object variable at the beginning of your file:
static PyObject *SpamError;
and initialize it in your module’s initialization function (PyInit_spam()) with an exception object (leaving out the error checking for now):
PyMODINIT_FUNC
PyInit_spam(void)
{
    PyObject *m;

    m = PyModule_Create(&spammodule);
    if (m == NULL)
        return NULL;

    SpamError = PyErr_NewException("spam.error", NULL, NULL);
    Py_INCREF(SpamError);
    PyModule_AddObject(m, "error", SpamError);
    return m;
}
Note that the Python name for the exception object is spam.error. The PyErr_NewException() function creates a class whose base class is Exception (unless another class is passed in instead of NULL), described in Built-in Exceptions.
Note also that the SpamError variable retains a reference to the newly created exception class; this is intentional! Since the exception could be removed from the module by external code, an owned reference to the class is needed to ensure that it will not be discarded, causing SpamError to become a dangling pointer. Should it become a dangling pointer, C code which raises the exception could cause a core dump or other unintended side effects.
We discuss the use of PyMODINIT_FUNC as a function return type later in this sample.
The spam.error exception can be raised in your extension module using a call to PyErr_SetString() as shown below:
static PyObject *
spam_system(PyObject *self, PyObject *args)
{
    const char *command;
    int sts;

    if (!PyArg_ParseTuple(args, "s", &command))
        return NULL;
    sts = system(command);
    if (sts < 0) {
        PyErr_SetString(SpamError, "System command failed");
        return NULL;
    }
    return PyLong_FromLong(sts);
}
Original article: http://www.cnblogs.com/lijie-vs/p/4330772.html