![]() |
Database »
Database »
MySQL
Intermediate
MySQL User Defined FunctionsBy DarkLizenerTutorial on writing your own MySQL User Defines Functions |
C, SQL, VC6, Windows, SQL Server, Visual Studio, MySQL, DBA, Dev
|
|
Advanced Search |
|
|
|
||||||||||||||||

This tutorial explains what an User Defined Function (UDF) is, what it does and why/when they are useful.
Basically an User Defined Function (UDF) is a piece code that extends the functionality of a MySQL server by adding a new function that behaves just like a native (built-in) MySQL function like abs() or concat(). UDFs are written in C (or C++ if you really need to)! ... well maybe there is a way to write them in BASIC, .NET or whatever but I don't see why anybody would want to do that.
As implied by the name UDFs are useful when you need to extend the functionality of your MySQL server. This little table should make it clear which method is best for a given situation:
|
Method |
Speed |
Language |
Development |
|---|---|---|---|
|
Stored Procedures |
slow |
SQL |
~minutes (for small functions) |
|
UDF |
fast |
C |
~hour |
|
Native Function |
fast |
C |
major pain in the *** |
And by "slow" I mean: "slower than the others"! Stored Procedures are still much faster then normal SQL statements!
A little explanation on native functions: The code you have to write here is essentially the same as the one for an UDF. BUT you have to write it in the MySQL source code and recompile the whole thing. This will (believe me) be a lot of work because you have to do it again and again with every new version of MySQL.
This part is really easy. When you have your UDF finished you just use it like every other native function. For example : "SELECT MyFunction(data1, data2) FROM table"
Now let's get started on writing our first UDF in steps:
Create a new shared-library project (in the example I used VC++ 6.0 with a standard DLL)
First we need some headers. These headers are either standard library headers or from the MySQL Server's include directory
#ifdef STANDARD
/* STANDARD is defined, don't use any mysql functions */
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#ifdef __WIN__
typedef unsigned __int64 ulonglong; /* Microsofts 64 bit types */
typedef __int64 longlong;
#else
typedef unsigned long long ulonglong;
typedef long long longlong;
#endif /*__WIN__*/
#else
#include <my_global.h>
#include <my_sys.h>
#endif
#include <mysql.h>
#include <ctype.h>
static pthread_mutex_t LOCK_hostname;
Now we have to decide what kind of function we want. There are essentially two choices to be made:
is the function an aggregate function or not? (we will learn more about aggregate functions later)
which type of return value should the function return? Here we have 4 options:
|
Type |
Description |
|---|---|
|
STRING |
A string literal. Translates to char* in C |
|
INTEGER |
A normal integer. Translates to a 64 bit integer in C |
|
REAL |
A floating point number. Translates to double in C |
|
DECIAML |
This one isn't really finished at this time. MySQL treats it as STRING |
Let's talk about non-aggregate functions first. Now we have to declare and implement some functions the MySQL server needs to use our UDF. But first some structs we'll need for that:
UDF_INIT:
|
Type |
Name |
Description |
|---|---|---|
<code>
|
maybe_null |
1 if function can return NULL |
|
unsigned int |
decimals |
for REAL functions |
|
unsigned long |
max_length |
For string functions |
|
char * |
ptr |
free pointer for function data |
|
my_bool |
const_item |
0 if result is independent of argument |
UDF_ARGS:
|
Type |
Name |
Description |
|---|---|---|
|
unsigned int |
arg_count |
Number of argument |
|
enum Item_result * |
arg_type |
Array containing the types of the arguments |
|
char ** |
args |
Array of pointer to the arguments |
|
unsigned long * |
lengths |
Array of the argument's lengths (only needed for strings) |
|
char * |
maybe_null |
Array of "maybe_null" flags (1 if argument maybe null) |
|
char ** |
attributes |
Array of pointers to the arguments' attributes (see chapter x for details) |
|
unsigned long * |
attribute_lengths |
Array of attributes lengths |
now let's take a look at the functions:
De-/Initialization:
extern "C" my_bool MyTest_init(UDF_INIT *initid, UDF_ARGS *args,
char *message)
{
// The most important thing to do here is setting up the memory
// you need...
// Lets say we need a lonlong type variable to keep a checksum
// Although we do not need one in this case
longlong* i = new longlong; // create the variable
*i = 0; // set it to a value
// store it as a char pointer in the pointer variable
// Make sure that you don`t run in typecasting troubles later!!
initid->ptr = (char*)i;
// check the arguments format
if (args->arg_count != 1)
{
strcpy(message,"MyTest() requires one arguments");
return 1;
}
if (args->arg_type[0] != INT_RESULT)
{
strcpy(message,"MyTest() requires an integer");
return 1;
}
return 0;
}
extern "C" void MyTest_deinit(UDF_INIT *initid)
{
// Here you have to free the memory you allocated in the
// initialization function
delete (longlong*)initid->ptr;
}
The actual function:
extern "C" longlong MyTest(UDF_INIT *initid, UDF_ARGS *args, char *is_null, char *error) { /* So finally this is the part were we do the real work. This function is called for every record and the current value(s) or better pointers to the current values are stroed in the UDF_ARGS variable. We have to get the values, do our calculation and return the result. NOTE: You can access the memory allocated in MyTest_init through the UDF_INIT variable. In this example we will just add 5 to every value...*/ return *((longlong*)args->args[0])+5; }
All done! Now we have to compile the library and copy it to a directory where our OS can find it. On Windows that would be anywhere the PATH System variable says. Personally I use the MySQL servers bin directory. You have to make sure that the library is in one of those directories otherwise MySQL can't use it! And also make sure to export all the functions MySQL needs!
And at last we have to tell MySQL about it. This is really straightforward: Just execute the following SQL command:
CREATE [AGGREGATE] FUNCTION MyTest
RETURNS [INTEGER|STRING|REAL|DECIMAL] SONAME the_libraries_exact_name
Now you can use it like any other function.
Now some words to aggregate functions. When your UDF is an aggregate function you have to add some more functions and some functions are used in a different way. The calling sequence is:
MyTest_init to allocate memory (just like a normal UDF)MyTest_clear for the first row in each groupMyTest_add for each row that belongs to the same groupMyTest to get the result when the group changes or the last row has been processedMyTest_deinit to free any used memoryNow let's look at the new functions needed for the aggregate function. In this example we'll simply add up all the values. (like the native SUM function)
void MyTest_clear(UDF_INIT *initid, char *is_null, char *error)
{
/* The clear function resets the sum to 0 for each new group
Of course you have to allocate a longlong variable in the init
function and assign it to the pointer as seen above */
*((longlong*)initid->ptr) = 0;
}
void MyTest_add(UDF_INIT *initid, UDF_ARGS *args, char *is_null, char *error)
{
// For each row the current value is added to the sum
*((longlong*)initid->ptr) = *((longlong*)initid->ptr) +
*((longlong*)args->args[0]);
}
longlong MyTest(UDF_INIT *initid, UDF_ARGS *args, char *is_null, char *error)
{
// And in the end the sum is returned
return *((longlong*)initid->ptr);
}
Here are some things you should know when you write more complex UDFs:
A string function should return a pointer to the result and set *result and *length to the contents and length of the return value. For example:
memcpy(result, "result string", 13);
*length = 13;
The result buffer that is passed to the MyTest function is 255 bytes long. If your result fits in this, you don't have to worry about memory allocation for results.
If your string function needs to return a string longer than 255 bytes, you must allocate the space for it with malloc or new in your MyTest_init function or your MyTest function and free it in your MyTest_deinit function. You can store the allocated memory in the ptr slot in the UDF_INIT structure for reuse by future MyTest calls.
To indicate an error return in the main function, set *error to 1: If MyTest() sets *error to 1 for any row, the function value is NULL for the current row and for any subsequent rows processed by the statement in which MyTest() was invoked.
For more information see the MySQL Online Manual
Here are some guidelines that you should follow if you want to make sure your UDF runs smoothly ;-)
Do not call any other applications or processes inside an UDF!
Do not store any information locally! (This goes for shared libraries in general)
Do not allocate any global or static variables!
Always check the type of your arguments. As you can see MySQL converts everything to char pointers. This can lead to major troubles if you convert a string literal to a integer pointer and so on.
Be extra careful with the memory allocation! If you have memory leaks you can bring down the sever in no time.
Debugging an UDF can be pretty nerve wracking because every time your UDF crashes it takes down the whole MySQL server along with it. So I wrote a little command line tool to work around that problem. Just execute it after compilation and it does the rest. Meaning, it emulates a call to the function by calling an "SELECT" command and then passing the results to the library and printing out the result on the command line . So when the UDF produces some serious errors only the little helper goes down and not the whole server. It is still in "beta" so don't expect to much....
| You must Sign In to use this message board. | |||||||||||||||
|
|||||||||||||||
|
|||||||||||||||
|
|||||||||||||||
|
|||||||||||||||
General
News
Question
Answer
Joke
Rant
Admin
|
PermaLink |
Privacy |
Terms of Use
Last Updated: 20 Sep 2006 Editor: Chris Maunder |
Copyright 2006 by DarkLizener Everything else Copyright © CodeProject, 1999-2009 Web17 | Advertise on the Code Project |