PostgreSQL Programmer's Guide
Chapter 4. Extending SQL: Functions

Programming Language Functions

Programming Language Functions on Base Types

Internally, Postgres regards a base type as a "blob of memory." The user-defined functions that you define over a type in turn define how Postgres can operate on it. That is, Postgres only stores the data on disk and retrieves it from disk; it relies on your user-defined functions to input, process, and output the data. Base types can have one of three internal formats:

pass by value, fixed-length
pass by reference, fixed-length
pass by reference, variable-length

By-value types can only be 1, 2 or 4 bytes in length (even if your computer supports by-value types of other sizes). Postgres itself only passes integer types by value. Be careful to define your types so that they have the same size (in bytes) on all architectures. For example, the long type is dangerous because it is 4 bytes on some machines and 8 bytes on others, whereas the int type is 4 bytes on most UNIX machines (though not on most personal computers). A reasonable implementation of the int4 type on UNIX machines might be:

    /* 4-byte integer, passed by value */
    typedef int int4;
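
The size concern above can also be stated directly in code. The following is a minimal sketch, assuming a C99 compiler that provides <stdint.h>; the names int4_fixed and int4_fixed_width are hypothetical and are not part of Postgres.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical alternative to "typedef int int4": int32_t is exactly
 * 32 bits wide on every platform that defines it.
 */
typedef int32_t int4_fixed;

/* Report the width of int4_fixed in bytes. */
size_t
int4_fixed_width(void)
{
    /* Reading 32 bits as "4 bytes" assumes 8-bit bytes. */
    assert(CHAR_BIT == 8);
    return sizeof(int4_fixed);
}
```

Unlike long, a type defined this way should not change size when the code is moved between 32-bit and 64-bit machines.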

On the other hand, fixed-length types of any size may be passed by reference. For example, here is a sample implementation of a Postgres type:

    /* 16-byte structure, passed by reference */
    typedef struct
    {
        double  x, y;
    } Point;

Only pointers to such types can be used when passing them in and out of Postgres functions. Finally, all variable-length types must also be passed by reference. Every variable-length type must begin with a length field of exactly 4 bytes, and all data stored in that type must be located in the memory immediately following the length field. The length field holds the total length of the structure (i.e., it includes the size of the length field itself). We can define the text type as follows:

         typedef struct {
             int4 length;
             char data[1];
         } text;

Obviously, the data field is not long enough to hold all possible strings; it is impossible to declare such a structure in C. When manipulating variable-length types, we must be careful to allocate the correct amount of memory and to initialize the length field. For example, if we wanted to store 40 bytes in a text structure, we might use a code fragment like this:

         #include "postgres.h"
         ...
         char buffer[40]; /* our source data */
         ...
         text *destination = (text *) palloc(VARHDRSZ + 40);
         destination->length = VARHDRSZ + 40;
         memmove(destination->data, buffer, 40);
         ...
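
The same allocation pattern can be tried outside the server. The following is a minimal sketch, assuming plain calloc as a stand-in for palloc; demo_text, DEMO_HDRSZ and demo_text_new are hypothetical names that only mirror text, VARHDRSZ and the fragment above.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for VARHDRSZ: the size of the 4-byte length field. */
#define DEMO_HDRSZ sizeof(int32_t)

/* Hypothetical stand-in for the text type shown above. */
typedef struct {
    int32_t length;   /* total size in bytes, header included */
    char    data[1];  /* really (length - DEMO_HDRSZ) bytes */
} demo_text;

/*
 * Allocate a demo_text holding nbytes of data copied from src.
 * calloc stands in for palloc here, so the caller must free() the result.
 */
demo_text *
demo_text_new(const char *src, size_t nbytes)
{
    size_t     total = DEMO_HDRSZ + nbytes;
    demo_text *t;

    assert(src != NULL || nbytes == 0);

    /* Never allocate less than the declared struct size. */
    t = calloc(1, total < sizeof(demo_text) ? sizeof(demo_text) : total);
    if (t == NULL)
        return NULL;

    t->length = (int32_t) total;          /* the length counts the header too */
    if (nbytes > 0)
        memcpy((char *) t + DEMO_HDRSZ, src, nbytes);
    return t;
}
```

Note that the length field is set to the header size plus the data size, matching the rule that the length includes the length field itself.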

Now that we've gone over all of the possible structures for base types, we can show some examples of real functions. Suppose funcs.c looks like:

         #include <string.h>
         #include "postgres.h"


         /* By Value */
         
         int
         add_one(int arg)
         {
             return(arg + 1);
         }
         

         /* By Reference, Fixed Length */
         
         Point *
         makepoint(Point *pointx, Point *pointy )
         {
             Point     *new_point = (Point *) palloc(sizeof(Point));
        
             new_point->x = pointx->x;
             new_point->y = pointy->y;
                
             return new_point;
         }
        

         /* By Reference, Variable Length */

         text *
         copytext(text *t)
         {
             /*
              * VARSIZE is the total size of the struct in bytes.
              */
             text *new_t = (text *) palloc(VARSIZE(t));

             memset(new_t, 0, VARSIZE(t));
             VARSIZE(new_t) = VARSIZE(t);

             /*
              * VARDATA is a pointer to the data region of the struct.
              */
             memcpy((void *) VARDATA(new_t), /* destination */
                    (void *) VARDATA(t),     /* source */
                    VARSIZE(t) - VARHDRSZ);  /* how many bytes */
             return(new_t);
         }
         
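         text *
         concat_text(text *arg1, text *arg2)
         {
             int32 new_text_size = VARSIZE(arg1) + VARSIZE(arg2) - VARHDRSZ;
             text *new_text = (text *) palloc(new_text_size);

             memset((void *) new_text, 0, new_text_size);
             VARSIZE(new_text) = new_text_size;

             /*
              * Copy the raw bytes with memcpy: text data is not
              * NUL-terminated, and a string routine such as strncat would
              * write a terminating NUL one byte past the end of the buffer.
              */
             memcpy(VARDATA(new_text), VARDATA(arg1), VARSIZE(arg1) - VARHDRSZ);
             memcpy(VARDATA(new_text) + (VARSIZE(arg1) - VARHDRSZ),
                    VARDATA(arg2), VARSIZE(arg2) - VARHDRSZ);
             return (new_text);
         }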
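
The size arithmetic used for concatenation can be checked in isolation. The following is a minimal sketch, assuming calloc in place of palloc and a bare byte buffer in place of the text struct; DEMO_HDRSZ, demo_size, demo_data, demo_make and demo_concat are hypothetical helpers, not Postgres routines.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for VARHDRSZ. */
#define DEMO_HDRSZ ((int32_t) sizeof(int32_t))

/* Read the leading 4-byte total length (the role VARSIZE plays above). */
int32_t
demo_size(const void *v)
{
    int32_t n;

    memcpy(&n, v, sizeof n);
    return n;
}

/* Point at the data region after the header (the role VARDATA plays above). */
const char *
demo_data(const void *v)
{
    return (const char *) v + DEMO_HDRSZ;
}

/* Build a length-prefixed value from nbytes of raw data; caller frees it. */
void *
demo_make(const char *src, int32_t nbytes)
{
    int32_t total = DEMO_HDRSZ + nbytes;
    char   *v;

    assert(nbytes >= 0 && (src != NULL || nbytes == 0));
    v = calloc(1, (size_t) total);
    if (v == NULL)
        return NULL;
    memcpy(v, &total, sizeof total);
    if (nbytes > 0)
        memcpy(v + DEMO_HDRSZ, src, (size_t) nbytes);
    return v;
}

/* Concatenate two values: one header plus both data regions. */
void *
demo_concat(const void *a, const void *b)
{
    int32_t alen, blen, total;
    char   *r;

    assert(a != NULL && b != NULL);
    alen = demo_size(a) - DEMO_HDRSZ;
    blen = demo_size(b) - DEMO_HDRSZ;
    total = DEMO_HDRSZ + alen + blen;   /* same as size(a) + size(b) - header */

    r = calloc(1, (size_t) total);
    if (r == NULL)
        return NULL;
    memcpy(r, &total, sizeof total);
    memcpy(r + DEMO_HDRSZ, demo_data(a), (size_t) alen);
    memcpy(r + DEMO_HDRSZ + alen, demo_data(b), (size_t) blen);
    return r;
}
```

Because the data is copied by byte count rather than as a C string, a value containing an embedded zero byte is concatenated in full.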

On OSF/1 we would type:

         CREATE FUNCTION add_one(int4) RETURNS int4
              AS 'PGROOT/tutorial/funcs.so' LANGUAGE 'c';

         CREATE FUNCTION makepoint(point, point) RETURNS point
              AS 'PGROOT/tutorial/funcs.so' LANGUAGE 'c';
    
         CREATE FUNCTION concat_text(text, text) RETURNS text
              AS 'PGROOT/tutorial/funcs.so' LANGUAGE 'c';
                                  
         CREATE FUNCTION copytext(text) RETURNS text
              AS 'PGROOT/tutorial/funcs.so' LANGUAGE 'c';

On other systems, we might have to make the filename end in .sl (to indicate that it's a shared library).

Programming Language Functions on Composite Types

Composite types do not have a fixed layout like C structures. Instances of a composite type may contain null fields. In addition, composite types that are part of an inheritance hierarchy may have different fields than other members of the same inheritance hierarchy. Therefore, Postgres provides a procedural interface for accessing fields of composite types from C. As Postgres processes a set of instances, each instance will be passed into your function as an opaque structure of type TUPLE. Suppose we want to write a function to answer the query

         * SELECT name, c_overpaid(EMP, 1500) AS overpaid
           FROM EMP
           WHERE name = 'Bill' or name = 'Sam';

In the query above, we can define c_overpaid as:

         #include "postgres.h"

         #include "executor/executor.h"  /* for GetAttributeByName() */

         bool
         c_overpaid(TupleTableSlot *t, /* the current instance of EMP */
                    int4 limit)
         {
             bool isnull = false;
             int4 salary;
             salary = (int4) GetAttributeByName(t, "salary", &isnull);
             if (isnull)
                 return (false);
             return(salary > limit);
         }

GetAttributeByName is the Postgres system function that returns attributes out of the current instance. It has three arguments: the argument of type TUPLE passed into the function, the name of the desired attribute, and a return parameter that describes whether the attribute is null. GetAttributeByName will align data properly so you can cast its return value to the desired type. For example, if you have an attribute name which is of the type name, the GetAttributeByName call would look like:

         char *str;
         ...
         str = (char *) GetAttributeByName(t, "name", &isnull);
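
The calling pattern (look up by name, then check the null flag before using the value) can be sketched without the server. The following is a minimal sketch using hypothetical stand-in types; demo_attr, demo_tuple, demo_get_attribute_by_name and demo_overpaid are not Postgres APIs, and treating a missing attribute as null is an assumption made only for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one attribute of an instance. */
typedef struct {
    const char *name;
    long        value;
    int         isnull;
} demo_attr;

/* Hypothetical stand-in for the opaque instance passed to the function. */
typedef struct {
    const demo_attr *attrs;
    size_t           nattrs;
} demo_tuple;

/*
 * Return the named attribute's value and report through *isnull
 * whether it is null. A missing attribute is reported as null
 * (an assumption of this sketch).
 */
long
demo_get_attribute_by_name(const demo_tuple *t, const char *name, int *isnull)
{
    size_t i;

    assert(t != NULL && name != NULL && isnull != NULL);
    for (i = 0; i < t->nattrs; i++) {
        if (strcmp(t->attrs[i].name, name) == 0) {
            *isnull = t->attrs[i].isnull;
            return t->attrs[i].isnull ? 0 : t->attrs[i].value;
        }
    }
    *isnull = 1;
    return 0;
}

/* Same shape as c_overpaid: a null salary is never "overpaid". */
int
demo_overpaid(const demo_tuple *emp, long limit)
{
    int  isnull = 0;
    long salary = demo_get_attribute_by_name(emp, "salary", &isnull);

    if (isnull)
        return 0;
    return salary > limit;
}
```

The returned value is meaningful only when the null flag is clear, which is why the flag is tested before the comparison.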

The following query lets Postgres know about the c_overpaid function:

         * CREATE FUNCTION c_overpaid(EMP, int4) RETURNS bool
              AS 'PGROOT/tutorial/obj/funcs.so' LANGUAGE 'c';

While there are ways to construct new instances or modify existing instances from within a C function, these are far too complex to discuss in this manual.

Caveats

We now turn to the more difficult task of writing programming language functions. Be warned: this section of the manual will not make you a programmer. You must have a good understanding of C (including the use of pointers and the malloc memory manager) before trying to write C functions for use with Postgres. While it may be possible to load functions written in languages other than C into Postgres, this is often difficult (when it is possible at all) because other languages, such as FORTRAN and Pascal, often do not follow the same "calling convention" as C. That is, other languages do not pass arguments and return values between functions in the same way. For this reason, we will assume that your programming language functions are written in C. The basic rules for building C functions are as follows:

Most of the header (include) files for Postgres should already be installed in PGROOT/include (see Figure 2). You should always include
```
                -I$PGROOT/include
```
on your cc command lines. Sometimes, you may find that you require header files that are in the server source itself (i.e., you need a file we neglected to install in include). In those cases you may need to add one or more of
```
                -I$PGROOT/src/backend
                -I$PGROOT/src/backend/include
                -I$PGROOT/src/backend/port/<PORTNAME>
                -I$PGROOT/src/backend/obj
```
(where <PORTNAME> is the name of the port, e.g., alpha or sparc).
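When allocating memory, use the Postgres routines palloc and pfree instead of the corresponding C library routines malloc and free. Memory allocated by palloc is freed automatically at the end of each transaction, which prevents memory leaks.
Always zero the bytes of your structures using memset or bzero. Many routines (such as the hash access method, hash join and the sort algorithm) compute functions of the raw bits contained in your structure. Even if you initialize all fields of your structure, there may be several bytes of alignment padding (holes in the structure) that contain garbage values.
Most of the internal Postgres types are declared in postgres.h, so it is a good idea to always include that file. Including postgres.h will also include elog.h and palloc.h for you.
Compiling and loading your object code so that it can be dynamically loaded into Postgres always requires special flags. See Appendix A for a detailed explanation of how to do this for your particular operating system.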
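
The rule about zeroing structures can be illustrated with a struct that has alignment padding. The following is a minimal sketch; demo_padded, demo_padded_init and demo_padded_same_bits are hypothetical names, and the byte-wise comparison only stands in for routines that hash or compare raw bits. It relies on the usual compiler behavior that storing to a member leaves zeroed padding untouched.

```c
#include <assert.h>
#include <string.h>

/* On most platforms there are padding bytes between tag and value. */
typedef struct {
    char   tag;
    double value;
} demo_padded;

/* Zero the whole structure first, then set the fields. */
void
demo_padded_init(demo_padded *p, char tag, double value)
{
    assert(p != NULL);
    memset(p, 0, sizeof(*p));   /* clears the padding bytes as well */
    p->tag = tag;
    p->value = value;
}

/* Compare raw bytes, as a hashing or sorting routine might. */
int
demo_padded_same_bits(const demo_padded *a, const demo_padded *b)
{
    return memcmp(a, b, sizeof(*a)) == 0;
}
```

Without the memset, two structures with equal fields could still differ in their padding bytes, so a routine working on raw bits might treat them as different values.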