critcl-cproc-types - Critcl - cproc Type Reference
C Runtime In Tcl, or CriTcl, is a system for compiling C code embedded in Tcl on the fly and either loading the resulting objects into Tcl for immediate use or packaging them for distribution. Use CriTcl to improve performance by rewriting in C those routines that are performance bottlenecks.
This document is a breakout of the descriptions for the predefined argument- and result-types usable with the critcl::cproc command, as detailed in the reference manpage for the critcl package, plus the information on how to extend the predefined set with custom types. The breakout was made to make this information easier to find (toplevel document vs. having to search the large main reference).
Its intended audience is developers wishing to write Tcl packages with embedded C code.
Before going into the details first a quick overview:
Critcl type | C type         | Tcl type  | Notes
----------- | -------------- | --------- | ------------------------------
Tcl_Interp* | Tcl_Interp*    | n/a       | Special, only first
----------- | -------------- | --------- | ------------------------------
Tcl_Obj*    | Tcl_Obj*       | Any       | Read-only
object      |                |           | Alias of Tcl_Obj* above
list        | critcl_list    | List      | Read-only
----------- | -------------- | --------- | ------------------------------
char*       | const char*    | Any       | Read-only, string rep
pstring     | critcl_pstring | Any       | Read-only
bytes       | critcl_bytes   | ByteArray | Read-only
----------- | -------------- | --------- | ------------------------------
int         | int            | Int       |
long        | long           | Long      |
wideint     | Tcl_WideInt    | WideInt   |
double      | double         | Double    |
float       | float          | Double    |
----------- | -------------- | --------- | ------------------------------
X > 0       |                |           | For X in int ... float above.
X >= 0      |                |           | C types as per the base type X.
X < 0       |                |           | Allowed argument values are
X <= 0      |                |           | restricted as per the shown
X > 1       |                |           | relation
X >= 1      |                |           |
X < 1       |                |           | This is not a general mechanism
X <= 1      |                |           | open to other values. Only 0/1.
----------- | -------------- | --------- | ------------------------------
boolean     | int            | Boolean   |
bool        |                |           | Alias of boolean above
----------- | -------------- | --------- | ------------------------------
bytearray   |                |           | DEPRECATED
rawchar     |                |           | DEPRECATED
rawchar*    |                |           | DEPRECATED
double*     |                |           | DEPRECATED
float*      |                |           | DEPRECATED
int*        |                |           | DEPRECATED
void*       |                |           | DEPRECATED
And now the details:
Attention: Tcl_Interp* is a special argument type. It can only be used for the first argument of a function. Using it for any other argument causes critcl to throw an error.
When used, the argument will contain a reference to the current interpreter that the function body may use. Furthermore the argument will not be an argument of the Tcl command for the function.
This is useful when the function has to do more than simply returning a value. Examples would be setting up error messages on failure, or querying the interpreter for variables and other data.
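A minimal sketch of this, assuming a hypothetical command name safe_div that is not part of critcl: the cproc takes the interpreter as its first argument and uses it to report an error. At the Tcl level the command accepts only the two integer arguments.

```tcl
package require critcl

# Hypothetical example command: integer division which reports
# division by zero through the interpreter result.
critcl::cproc safe_div {Tcl_Interp* interp int a int b} ok {
    if (b == 0) {
        /* The interpreter reference lets the body set an error message. */
        Tcl_SetObjResult(interp, Tcl_NewStringObj("division by zero", -1));
        return TCL_ERROR;
    }
    Tcl_SetObjResult(interp, Tcl_NewIntObj(a / b));
    return TCL_OK;
}
```

Calling safe_div 7 2 from Tcl passes two words only; the interpreter is supplied by the generated wrapper.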
For Tcl_Obj* and its alias object, the function takes an argument of type Tcl_Obj*. No argument checking is done. The Tcl level word is passed to the argument as-is. Note that this value must be treated as read-only (except for hidden changes to its internal representation, i.e. shimmering).
For pstring, the function takes an argument of type critcl_pstring containing the original Tcl_Obj* reference of the Tcl argument, plus the length of the string and a pointer to the character array.
typedef struct critcl_pstring {
    Tcl_Obj*    o;
    const char* s;
    int         len;
} critcl_pstring;
Note the const. The string is read-only. Any modification can have arbitrary effects, from pulling out the rug under the script because of string value and internal representation not matching anymore, up to crashes anytime later.
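As a small illustration of the structure fields, here is a sketch of a cproc that only reads the string. The command name count_spaces is hypothetical. Note that len counts the bytes of the string representation, not characters.

```tcl
package require critcl

# Hypothetical example command: counts the space bytes in the
# string representation of its argument, without modifying it.
critcl::cproc count_spaces {pstring text} int {
    int i, n = 0;
    for (i = 0; i < text.len; i++) {
        if (text.s[i] == ' ') n++;
    }
    return n;
}
```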
For list, the function takes an argument of type critcl_list containing the original Tcl_Obj* reference of the Tcl argument, plus the length of the Tcl list and a pointer to the array of the list elements.
typedef struct critcl_list {
    Tcl_Obj*        o;
    Tcl_Obj* const* v;
    int             c;
} critcl_list;
The Tcl argument must be convertible to List, an error is thrown otherwise.
Note the const. The list is read-only. Any modification can have arbitrary effects, from pulling out the rug under the script because of string value and internal representation not matching anymore, up to crashes anytime later.
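The following sketch shows one way to walk the element array of a critcl_list. The command name sum_list is hypothetical; the elements are converted with the standard Tcl_GetIntFromObj call, which leaves an error message in the interpreter on failure.

```tcl
package require critcl

# Hypothetical example command: sums the integer elements of a list.
critcl::cproc sum_list {Tcl_Interp* interp list values} ok {
    int i, v;
    Tcl_WideInt total = 0;
    for (i = 0; i < values.c; i++) {
        /* values.v is read-only; each element is only inspected. */
        if (Tcl_GetIntFromObj(interp, values.v[i], &v) != TCL_OK) {
            return TCL_ERROR;
        }
        total += v;
    }
    Tcl_SetObjResult(interp, Tcl_NewWideIntObj(total));
    return TCL_OK;
}
```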
For bytearray, rawchar, and rawchar*, the function takes an argument of type char*. The Tcl argument must be convertible to ByteArray, an error is thrown otherwise. Note that the length of the ByteArray is not passed to the function, which severely limits the usefulness of these types.
Attention: These types are considered DEPRECATED. It is planned to remove their documentation in release 3.2, and their implementation in release 3.3. Their deprecation can be undone if good use cases are shown.
The bytes type is the new and usable ByteArray type.
The function takes an argument of type critcl_bytes containing the original Tcl_Obj* reference of the Tcl argument, plus the length of the byte array and a pointer to the byte data.
typedef struct critcl_bytes {
    Tcl_Obj*             o;
    const unsigned char* s;
    int                  len;
} critcl_bytes;
The Tcl argument must be convertible to ByteArray, an error is thrown otherwise.
Note the const. The bytes are read-only. Any modification can have arbitrary effects, from pulling out the rug under the script because of string value and internal representation not matching anymore, up to crashes anytime later.
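A brief sketch of reading the byte data through the structure, using a hypothetical command name byte_sum:

```tcl
package require critcl

# Hypothetical example command: adds up the values of all bytes
# in a byte array. The data is only read, never written.
critcl::cproc byte_sum {bytes data} int {
    int i, sum = 0;
    for (i = 0; i < data.len; i++) {
        sum += data.s[i];
    }
    return sum;
}
```

From Tcl this might be invoked as byte_sum [binary format c3 {1 2 3}].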
For char*, the function takes an argument of type const char*. The string representation of the Tcl argument is passed in.
Note the const. The string is read-only. Any modification can have arbitrary effects, from pulling out the rug under the script because of string value and internal representation not matching anymore, up to crashes anytime later.
The function takes an argument of type double. The Tcl argument must be convertible to Double, an error is thrown otherwise.
These are variants of double above, restricting the argument value to the shown relation. An error is thrown for Tcl arguments outside of the specified range. Note: This is not a general range specification syntax. Only the listed types exist.
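A sketch of a restricted type in use. It assumes that the type name, which contains spaces, is written as a single braced word in the argument list; the command name reciprocal is hypothetical.

```tcl
package require critcl

# Hypothetical example command. The restricted type "double > 0"
# rejects zero and negative values before the C body runs.
critcl::cproc reciprocal {{double > 0} x} double {
    return 1.0 / x;
}
```

With this definition the body never sees a value for which the division is undefined; the argument conversion throws the error instead.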
The function takes an argument of type float. The Tcl argument must be convertible to Double, an error is thrown otherwise.
These are variants of float above, restricting the argument value to the shown relation. An error is thrown for Tcl arguments outside of the specified range. Note: This is not a general range specification syntax. Only the listed types exist.
For boolean and its alias bool, the function takes an argument of type int. The Tcl argument must be convertible to Boolean, an error is thrown otherwise.
For channel, the function takes an argument of type Tcl_Channel. The Tcl argument must be convertible to type Channel, an error is thrown otherwise. The channel is further assumed to be already registered with the interpreter.
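As a sketch of a function receiving a registered channel, the hypothetical command chan_put below writes a string to it with the standard Tcl_WriteChars call and returns whatever count that call reports (-1 on error).

```tcl
package require critcl

# Hypothetical example command: writes text to an already
# registered channel, for example stdout or a channel from [open].
critcl::cproc chan_put {channel chan char* text} int {
    return (int) Tcl_WriteChars(chan, text, -1);
}
```

The channel stays registered with the interpreter, so the script remains responsible for closing it.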
The type unshared-channel is an extension of channel above. All of the information above applies.
Beyond that the channel must not be shared by multiple interpreters, an error is thrown otherwise.
The type take-channel is an extension of unshared-channel above. All of the information above applies.
Beyond that the code removes the channel from the current interpreter without closing it, and disables all pre-existing event handling for it.
With this the function takes full ownership of the channel in question, taking it away from the interpreter invoking it. It is then responsible for the lifecycle of the channel, up to and including closing it.
Should the system the function is a part of wish to return control of the channel to the interpreter, it has to use the result type return-channel. This will undo the registration changes made by this argument type. Note however that the removal of pre-existing event handling done here cannot be undone.
Attention: Removal from the interpreter without closing the channel is effected by incrementing the channel's reference count without providing an interpreter, before decrementing the same for the current interpreter. This leaves the overall reference count intact without causing Tcl to close it when it is removed from the interpreter structures. At this point the channel is effectively a globally-owned part of the system not associated with any interpreter.
The complementary result type then runs this sequence in reverse. And if the channel is never returned to Tcl either the function or the system it is a part of have to unregister the global reference when they are done with it.
The function takes an argument of type int. The Tcl argument must be convertible to Int, an error is thrown otherwise.
These are variants of int above, restricting the argument value to the shown relation. An error is thrown for Tcl arguments outside of the specified range. Note: This is not a general range specification syntax. Only the listed types exist.
The function takes an argument of type long int. The Tcl argument must be convertible to Long, an error is thrown otherwise.
These are variants of long above, restricting the argument value to the shown relation. An error is thrown for Tcl arguments outside of the specified range. Note: This is not a general range specification syntax. Only the listed types exist.
The function takes an argument of type Tcl_WideInt. The Tcl argument must be convertible to WideInt, an error is thrown otherwise.
These are variants of wideint above, restricting the argument value to the shown relation. An error is thrown for Tcl arguments outside of the specified range. Note: This is not a general range specification syntax. Only the listed types exist.
For void*, double*, float*, and int*, the function takes an argument of the same-named C type. The Tcl argument must be convertible to ByteArray, an error is thrown otherwise. The bytes in the ByteArray are then re-interpreted as the raw representation of a single C pointer of the given type which is then passed as argument to the function. In other words, this is for Tcl values somehow holding raw C pointers, i.e. memory addresses.
Attention: These types are considered DEPRECATED. It is planned to remove their documentation in release 3.2, and their implementation in release 3.3. Their deprecation can be undone if good use cases are shown.
Before going into the details first a quick overview:
Critcl type   | C type         | Tcl type  | Notes
------------- | -------------- | --------- | ------------------------------
void          | n/a            | n/a       | Always OK. Body sets result
ok            | int            | n/a       | Result code. Body sets result
------------- | -------------- | --------- | ------------------------------
int           | int            | Int       |
boolean       |                |           | Alias of int above
bool          |                |           | Alias of int above
long          | long           | Long      |
wideint       | Tcl_WideInt    | WideInt   |
double        | double         | Double    |
float         | float          | Double    |
------------- | -------------- | --------- | ------------------------------
char*         | char*          | String    | Makes a copy
vstring       |                |           | Alias of char* above
const char*   | const char*    |           | Behavior of char* above
------------- | -------------- | --------- | ------------------------------
string        |                | String    | Freeable string set directly
              |                |           | No copy is made
dstring       |                |           | Alias of string above
------------- | -------------- | --------- | ------------------------------
              |                |           | For all below: Null is ERROR
              |                |           | Body has to set any message
Tcl_Obj*      | Tcl_Obj*       | Any       | refcount --
object        |                |           | Alias of Tcl_Obj* above
Tcl_Obj*0     |                | Any       | refcount unchanged
object0       |                |           | Alias of Tcl_Obj*0 above
------------- | -------------- | --------- | ------------------------------
known-channel | Tcl_Channel    | String    | Assumed to be already registered
new-channel   | Tcl_Channel    | String    | New channel, will be registered
And now the details:
For Tcl_Obj* and its alias object: if the returned Tcl_Obj* is NULL, the Tcl return code is TCL_ERROR and the function should set an error message as the interpreter result. Otherwise, the returned Tcl_Obj* is set as the interpreter result.
Note that setting an error message requires the function body to have access to the interpreter the function is running in. See the argument type Tcl_Interp* for the details on how to make that happen.
Note further that the returned Tcl_Obj* should have a reference count greater than 0. This is because the converter decrements the reference count to release possession after setting the interpreter result. It assumes that the function incremented the reference count of the returned Tcl_Obj*. If a Tcl_Obj* with a reference count of 0 were returned, the reference count would become 1 when set as the interpreter result, and immediately thereafter be decremented to 0 again, causing the memory to be freed. The system is then likely to crash at some point after the return due to reuse of the freed memory.
Tcl_Obj*0 and its alias object0 behave like Tcl_Obj*, except that this conversion assumes that the returned value has a reference count of 0 and does not decrement it. Returning a value whose reference count is greater than 0 is therefore likely to cause a memory leak.
Note that setting an error message requires the function body to have access to the interpreter the function is running in. See the argument type Tcl_Interp* for the details on how to make that happen.
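The two reference-count conventions described above can be contrasted in a short sketch. The command names make_pair and make_pair0 are hypothetical; the result type names object and object0 are taken from the overview table.

```tcl
package require critcl

# Hypothetical example, result type "object": the converter decrements
# the reference count, so the body takes a reference before returning.
critcl::cproc make_pair {int a int b} object {
    Tcl_Obj* elems[2];
    Tcl_Obj* result;
    elems[0] = Tcl_NewIntObj(a);
    elems[1] = Tcl_NewIntObj(b);
    result = Tcl_NewListObj(2, elems);
    Tcl_IncrRefCount(result);   /* released again by the converter */
    return result;
}

# Hypothetical example, result type "object0": the fresh value is
# returned with a reference count of 0 and is not decremented.
critcl::cproc make_pair0 {int a int b} object0 {
    Tcl_Obj* elems[2];
    elems[0] = Tcl_NewIntObj(a);
    elems[1] = Tcl_NewIntObj(b);
    return Tcl_NewListObj(2, elems);
}
```

Both commands return the same list; they differ only in who is responsible for the reference held during the return.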
For new-channel, a String Tcl_Obj holding the name of the returned Tcl_Channel is set as the interpreter result. The channel is further assumed to be new, and is therefore registered with the interpreter to make it known.
For known-channel, a String Tcl_Obj holding the name of the returned Tcl_Channel is set as the interpreter result. The channel is further assumed to be already registered with the interpreter.
The type return-channel is a variant of new-channel above. It varies slightly from it in the registration sequence to be properly complementary to the argument type take-channel. A String Tcl_Obj holding the name of the returned Tcl_Channel is set as the interpreter result. The channel is further assumed to be new, and is therefore registered with the interpreter to make it known.
For char* and its alias vstring, a String Tcl_Obj holding a copy of the returned char* is set as the interpreter result. If the value is allocated then the function itself and the extension it is a part of are responsible for releasing the memory when the data is not in use any longer.
The type const char* behaves like char* above, except that the returned string is const-qualified.
For string and its alias dstring, the returned char* is directly set as the interpreter result without making a copy. Therefore it must be dynamically allocated via Tcl_Alloc. Release happens automatically when the interpreter finds that the value is not required any longer.
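A sketch of a function handing a Tcl_Alloc'ed buffer to the interpreter, assuming the result type name string from the overview table. The command name greet is hypothetical.

```tcl
package require critcl

critcl::ccode {
    #include <string.h>
}

# Hypothetical example command: builds "hello <name>" in a buffer
# obtained from Tcl_Alloc and returns it without copying.
critcl::cproc greet {char* name} string {
    static const char prefix[] = "hello ";
    char* buf = (char*) Tcl_Alloc(strlen(prefix) + strlen(name) + 1);
    strcpy(buf, prefix);
    strcat(buf, name);
    return buf;   /* ownership passes to the interpreter; do not free */
}
```

Returning a pointer to static or stack memory through this result type would be an error, because the interpreter later releases the buffer.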
The returned double or float is converted to a Double Tcl_Obj and set as the interpreter result.
For boolean and bool, the returned int value is converted to an Int Tcl_Obj and set as the interpreter result.
For int, the returned int value is converted to an Int Tcl_Obj and set as the interpreter result.
The returned long int value is converted to a Long Tcl_Obj and set as the interpreter result.
The returned Tcl_WideInt value is converted to a WideInt Tcl_Obj and set as the interpreter result.
For ok, the returned int value becomes the Tcl return code. The interpreter result is left untouched and can be set by the function if desired. Note that doing this requires the function body to have access to the interpreter the function is running in. See the argument type Tcl_Interp* for the details on how to make that happen.
For void, the function does not return a value. The interpreter result is left untouched and can be set by the function if desired.
While the critcl::cproc command understands the most common C types (as per the previous 2 sections), sometimes this is not enough.
To get around this limitation the commands in this section enable users of critcl to extend the set of argument and result types understood by critcl::cproc. In other words, they allow them to define their own, custom, types.
The command critcl::has-resulttype name tests whether the named result type is known. It returns a boolean value, true if the type is known and false otherwise.
The command critcl::resulttype name body ?ctype? defines the result type name, and associates it with the C code doing the conversion (body) from C to Tcl. The C return type of the associated function, also the C type of the result variable, is ctype. This type defaults to name if it is not specified.
If name is already declared an error is thrown. Attention! The standard result type void is special as it has no accompanying result variable. This cannot be expressed by this extension command.
The body's responsibility is the conversion of the function's result into a Tcl result and a Tcl status. The first has to be set into the interpreter we are in, and the second has to be returned.
The C code of body is guaranteed to be called last in the wrapper around the actual implementation of the cproc in question and has access to the following environment:
interp: A Tcl_Interp* typed C variable referencing the interpreter the result has to be stored into.
rv: The C variable holding the result to convert, of type ctype.
As examples here are the definitions of two standard result types:
    resulttype int {
	Tcl_SetObjResult(interp, Tcl_NewIntObj(rv));
	return TCL_OK;
    }
    resulttype ok {
	/* interp result must be set by cproc body */
	return rv;
    } int
The form critcl::resulttype name = origname declares name as an alias of result type origname, which has to be defined already. If this is not the case an error is thrown.
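The commands described above can be combined as in the following sketch. The type names errcode and status and the command check_positive are hypothetical; the check with critcl::has-resulttype avoids the error thrown when a name is declared twice.

```tcl
package require critcl

# Hypothetical result type: 0 means success, any other value is
# turned into a Tcl error carrying the numeric code.
if {![critcl::has-resulttype errcode]} {
    critcl::resulttype errcode {
        if (rv != 0) {
            Tcl_SetObjResult(interp, Tcl_ObjPrintf("failed with code %d", rv));
            return TCL_ERROR;
        }
        return TCL_OK;
    } int

    # Alias form: "status" reuses the conversion of "errcode".
    critcl::resulttype status = errcode
}

critcl::cproc check_positive {int n} status {
    return (n > 0) ? 0 : 1;
}
```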
The command critcl::has-argtype name tests whether the named argument type is known. It returns a boolean value, true if the type is known and false otherwise.
The command critcl::argtype name body ?ctype? ?ctypefun? defines the argument type name, and associates it with the C code doing the conversion (body) from Tcl to C. ctype is the C type of the variable to hold the conversion result and ctypefun is the type of the function argument itself. Both types default to name if they are the empty string or are not provided.
If name is already declared an error is thrown.
body is a C code fragment that converts a Tcl_Obj* into a C value which is stored in a helper variable in the underlying function.
body is called inside its own code block to isolate local variables, and the following items are in scope:
interp: A variable of type Tcl_Interp* which is the interpreter the code is running in.
@@: A placeholder for an expression that evaluates to the Tcl_Obj* to convert.
@A: A placeholder for the name of the variable to store the converted argument into.
As examples, here are the definitions of two standard argument types:
    argtype int {
	if (Tcl_GetIntFromObj(interp, @@, &@A) != TCL_OK) return TCL_ERROR;
    }
    argtype float {
	double t;
	if (Tcl_GetDoubleFromObj(interp, @@, &t) != TCL_OK) return TCL_ERROR;
	@A = (float) t;
    }
The form critcl::argtype name = origname declares name as an alias of argument type origname, which has to be defined already. If this is not the case an error is thrown.
The command critcl::argtypesupport name code ?guard? defines a C code fragment for the already defined argument type name which is inserted before all functions using that type. Its purpose is the definition of any supporting C types needed by the argument type. If the type is used by many functions the system ensures that only the first of the multiple insertions of the code fragment is active, and the others disabled. The guard identifier is normally derived from name, but can also be set explicitly, via guard. The latter allows different custom types to share a common support structure without having to perform their own guarding.
The command critcl::argtyperelease name code defines a C code fragment for the already defined argument type name which is inserted whenever the worker function of a critcl::cproc returns to the shim. It is the responsibility of this fragment to unconditionally release any resources the critcl::argtype conversion code allocated. An example of this are the variadic types for the support of the special, variadic args argument to critcl::cproc's. They allocate a C array for the collected arguments which has to be released when the worker returns. This command defines the C code for doing that.
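A sketch of a custom argument type that validates its value, built from the placeholders described above. The type name even_int and the command half are hypothetical.

```tcl
package require critcl

# Hypothetical argument type: an integer which must be even.
# @@ expands to the Tcl_Obj* to convert, @A to the target variable.
critcl::argtype even_int {
    if (Tcl_GetIntFromObj(interp, @@, &@A) != TCL_OK) return TCL_ERROR;
    if (@A % 2 != 0) {
        Tcl_SetObjResult(interp,
                         Tcl_NewStringObj("expected even integer", -1));
        return TCL_ERROR;
    }
} int int

critcl::cproc half {even_int n} int {
    return n / 2;
}
```

Here both ctype and ctypefun are given explicitly as int, since the type name even_int is not itself a C type.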
The examples shown here have been drawn from section "Embedding C" in the document about Using CriTcl. Please see that document for many more examples.
Starting simple, let us assume that the Tcl code in question is something like
    proc math {x y z} {
        return [expr {(sin($x)*rand())/$y**log($z)}]
    }
with the expression pretending to be something very complex and slow. Converting this to C we get:
    critcl::cproc math {double x double y double z} double {
        double up   = rand () * sin (x);
        double down = pow(y, log (z));
        return up/down;
    }
Notable about this translation:
All the arguments got type information added to them, here "double". Like in C the type precedes the argument name. Other than that it is pretty much a Tcl dictionary, with keys and values swapped.
We now also have to declare the type of the result, here "double", again.
The reference manpage lists all the legal C types supported as arguments and results.
When writing bindings to external libraries critcl::cproc is usually the most convenient way of writing the lower layers. This is however hampered by the fact that critcl on its own only supports a few standard (arguably the most important) types, whereas the functions we wish to bind will almost certainly use many more, specific to the library in question.
The critcl commands argtype, resulttype and their adjuncts are provided to help here, by allowing a developer to extend critcl's type system with custom conversions.
This and the three following sections will demonstrate this, from trivial to complex.
The most trivial use is to create types which are aliases of existing types, standard or other. As an alias it simply copies and uses the conversion code from the referenced types.
Our example is pulled from an incomplete project of mine, a binding to Jeffrey Kegler's libmarpa library managing Earley parsers. Several custom types simply reflect the typedefs done by the library, to make the critcl::cprocs as self-documenting as the underlying library functions themselves.
    critcl::argtype Marpa_Symbol_ID     = int
    critcl::argtype Marpa_Rule_ID       = int
    critcl::argtype Marpa_Rule_Int      = int
    critcl::argtype Marpa_Rank          = int
    critcl::argtype Marpa_Earleme       = int
    critcl::argtype Marpa_Earley_Set_ID = int
    ...
    method sym-rank: proc {
        Marpa_Symbol_ID sym
        Marpa_Rank      rank
    } Marpa_Rank {
        return marpa_g_symbol_rank_set (instance->grammar, sym, rank);
    }
    ...
A more involved custom argument type would be to map from Tcl strings to some internal representation, like an integer code.
The first example is taken from the tclyaml package, a binding to the libyaml library. In a few places we have to map readable names for block styles, scalar styles, etc. to the internal enumeration.
    critcl::argtype yaml_sequence_style_t {
        if (!encode_sequence_style (interp, @@, &@A)) return TCL_ERROR;
    }
    ...
    critcl::ccode {
        static const char* ty_block_style_names [] = {
            "any", "block", "flow", NULL
        };
        static int
        encode_sequence_style (Tcl_Interp* interp, Tcl_Obj* style,
                               yaml_sequence_style_t* estyle)
        {
            int value;
            if (Tcl_GetIndexFromObj (interp, style, ty_block_style_names,
                                     "sequence style", 0, &value) != TCL_OK) {
                return 0;
            }
            *estyle = value;
            return 1;
        }
    }
    ...
    method sequence_start proc {
        pstring anchor
        pstring tag
        int implicit
        yaml_sequence_style_t style
    } ok {
        /* Syntax: <instance> seq_start <anchor> <tag> <implicit> <style> */
        ...
    }
    ...
It should be noted that this code predates the supporting generator package critcl::emap. Using the generator, the definition of the mapping becomes much simpler:
    critcl::emap::def yaml_sequence_style_t {
        any   0
        block 1
        flow  2
    }
Note that the generator will not only provide the conversions, but also define the argument and result types needed for their use by critcl::cproc.
Another example of such a semi-trivial argument type can be found in the CRIMP package, which defines a Tcl_ObjType for image values. This not only provides a basic argument type for any image, but also derived types which check that the image has a specific format. Here we see for the first time non-integer arguments, and the need to define the C types used for variables holding the C level value, and the type of function parameters (due to C promotion rules we may need different types).
    critcl::argtype image {
        if (crimp_get_image_from_obj (interp, @@, &@A) != TCL_OK) {
            return TCL_ERROR;
        }
    } crimp_image* crimp_image*
    ...
        set map [list <<type>> $type]
        critcl::argtype image_$type [string map $map {
            if (crimp_get_image_from_obj (interp, @@, &@A) != TCL_OK) {
                return TCL_ERROR;
            }
            if (@A->itype != crimp_imagetype_find ("crimp::image::<<type>>")) {
                Tcl_SetObjResult (interp,
                                  Tcl_NewStringObj ("expected image type <<type>>",
                                                    -1));
                return TCL_ERROR;
            }
        }] crimp_image* crimp_image*
    ...
The adjunct command critcl::argtypesupport is for when the conversion needs additional definitions, for example a helper structure.
An example of this can be found among the standard types of critcl itself, the pstring type. This type provides the C function with not only the string pointer, but also the string length, and the Tcl_Obj* this data came from. As critcl::cproc's calling conventions allow us only one argument for the data of the parameter a structure is needed to convey these three pieces of information.
Thus the argument type is defined as
    critcl::argtype pstring {
        @A.s = Tcl_GetStringFromObj(@@, &(@A.len));
        @A.o = @@;
    } critcl_pstring critcl_pstring
    critcl::argtypesupport pstring {
        typedef struct critcl_pstring {
            Tcl_Obj*    o;
            const char* s;
            int         len;
        } critcl_pstring;
    }
In the case of such a structure being large we may wish to allocate it on the heap instead of having it taking space on the stack. If we do that we need another adjunct command, critcl::argtyperelease. This command specifies the code required to release dynamically allocated resources when the worker function returns, before the shim returns to the caller in Tcl. To keep things simple our example is synthetic, a modification of pstring above, to demonstrate the technique. An actual, but more complex example is the code to support the variadic args argument of critcl::cproc.
    critcl::argtype pstring {
        @A = (critcl_pstring*) ckalloc(sizeof(critcl_pstring));
        @A->s = Tcl_GetStringFromObj(@@, &(@A->len));
        @A->o = @@;
    } critcl_pstring* critcl_pstring*
    critcl::argtypesupport pstring {
        typedef struct critcl_pstring {
            Tcl_Obj*    o;
            const char* s;
            int         len;
        } critcl_pstring;
    }
    critcl::argtyperelease pstring {
        ckfree ((char*) @A);
    }
Note that the above example shows only the simplest case of an allocated argument, with a conversion that cannot fail (namely, string retrieval). If the conversion can fail then either the allocation has to be deferred to happen only on successful conversion, or the conversion code has to release the allocated memory itself in the failure path, because it will never reach the code defined via critcl::argtyperelease in that case.
All of the previous sections dealt with argument conversions, i.e. going from Tcl into C. Custom result types are for the reverse direction, from C to Tcl. This is usually easier, as most of the time errors should not be possible. Supporting structures, or allocating them on the heap are not really required and therefore not supported.
The example of a result type shown below was pulled from KineTcl. It is a variant of the builtin result type Tcl_Obj*, aka object. The builtin conversion assumes that the object returned by the function has a refcount of 1 (or higher), with the function having held the reference, and releases that reference after placing the value into the interp result. The conversion below on the other hand assumes that the value has a refcount of 0 and thus that decrementing it is forbidden, lest it be released much too early, crashing the system.
    critcl::resulttype KTcl_Obj* {
        if (rv == NULL) { return TCL_ERROR; }
        Tcl_SetObjResult(interp, rv);
        /* No refcount adjustment */
        return TCL_OK;
    } Tcl_Obj*
This type of definition is also found in Marpa, and recent hacking on CRIMP introduced it there as well. This is why the definition became a builtin type starting with version 3.1.16, under the names Tcl_Obj*0 and object0.
Going back to errors and their handling, of course, if a function we are wrapping signals them in-band, then the conversion of such results has to deal with that. This happens for example in KineTcl, where we find
    critcl::resulttype XnStatus {
        if (rv != XN_STATUS_OK) {
            Tcl_AppendResult (interp, xnGetStatusString (rv), NULL);
            return TCL_ERROR;
        }
        return TCL_OK;
    }
    critcl::resulttype XnDepthPixel {
        if (rv == ((XnDepthPixel) -1)) {
            Tcl_AppendResult (interp,
                              "Inheritance error: Not a depth generator",
                              NULL);
            return TCL_ERROR;
        }
        Tcl_SetObjResult (interp, Tcl_NewIntObj (rv));
        return TCL_OK;
    }
Jean Claude Wippler, Steve Landers, Andreas Kupries
This document, and the package it describes, will undoubtedly contain bugs and other problems. Please report them at https://github.com/andreas-kupries/critcl/issues. Ideas for enhancements you may have for either package, application, and/or the documentation are also very welcome and should be reported at https://github.com/andreas-kupries/critcl/issues as well.
C code, Embedded C Code, code generator, compile & run, compiler, dynamic code generation, dynamic compilation, generate package, linker, on demand compilation, on-the-fly compilation
Glueing/Embedded C code
Copyright © Jean-Claude Wippler
Copyright © Steve Landers
Copyright © 2011-2018 Andreas Kupries