-
Notifications
You must be signed in to change notification settings - Fork 62
/
HACKING
437 lines (289 loc) · 14.1 KB
/
HACKING
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
347
348
349
350
351
352
353
354
355
356
357
358
359
360
361
362
363
364
365
366
367
368
369
370
371
372
373
374
375
376
377
378
379
380
381
382
383
384
385
386
387
388
389
390
391
392
393
394
395
396
397
398
399
400
401
402
403
404
405
406
407
408
409
410
411
412
413
414
415
416
417
418
419
420
421
422
423
424
425
426
427
428
429
430
431
432
433
434
435
GDL HACKING HOWTO
=================
A NOTE ON EXCEPTION-SAFE PROGRAMMING
====================================
Exception-safe programming takes into account, that in C++ exceptions can be
thrown anytime. The purpose of exception-safe programming is to keep a
program always in a consistent state even if an exception occurs.
Most often this is about avoiding memory leaks. They occur when a object is
allocated on the heap and an exception is thrown. Then the allocated heap
memory is never returned. The local pointer to the heap memory is destroyed
when the function it was defined in goes out of scope.
In GDL this happens always when there is some error in a GDL program, as
this errors lead to the throw of an exception.
A way to avoid such memory leaks is to define for every such allocated memory
a "guard". This is a local object which is destroyed when the function is
left. In it's destructor it frees the heap memory of the heap object or more
general, it performs a proper cleanup of the newly created object.
Of course the guard must be informed when some other entity takes care of the
new heap object (e. g. it is returned from a function).
GDL provides several kinds of such guards:
for normal objects Guard<> is used (to set the guarded object free
Release() must be called). For array objects (destroyed by "delete[]" instead
of "delete") GDL provides the templated ArrayGuard<> class.
For objects with a cleanup function (e. g. GSL objects) the use of use
GDLGuard is recommended. Please see the source code for examples.
As this is not a proper place to discuss exception-safe programming in detail,
here are just some rules of thumb, when this might concern a code contributor:
- you need almost always a guard when using (operator) new somewhere in
your code
- objects returned from Convert2(..., COPY) methods are to be guarded as well
- same for objects created with Dup() methods.
General:
One of the main ideas of this project is to be able to eventually
use the huge code base written in IDL.
Therefore any extension to GDL should be done with compatibility as
the highest priority in mind.
Extensions of the language are welcome also, but then the routine
names should not collide with IDL routine names.
Therefore as a standard, if you want to port a library XXX to GDL with
non compatible functions make the (GDL-)names XXX_ROUTINENAME.
And if you implement a non IDL routine, bear in mind that some
documentation is needed in order to be able to use it.
Furthermore please keep in mind that GDL should be able to compile at
least under GNU/Linux (GCC), OS X and (soon) windows (MSVC). Therefore the
code should be as portable as possible (note that it is not mandatory
to make it run on all platforms, just portability should be kept in
mind avoiding any obvious obstacles like compiler specific extensions).
Extensions can be also done in python. See file PYTHON.txt for details.
In the distribution there is an empty file: new.cpp
This file is part of the project (will be compiled and linked). It is
intended for quick and easy extensions to GDL. Furthermore there is a new.hpp
file which is included (#include "new.hpp") in libinit.cpp (see below).
If you use these files, you don't need to edit any build system related files
for your extension.
This hacking guide isn't far from complete yet and intended to be extended.
Please send questions.
For now just some basic instructions:
Base class for all GDL data types is declared in basegdl.hpp.
The data structure is declared in datatypes.hpp (all types except structs)
and dstructgdl.hpp. For assoc variables in assocdata.hpp.
If you want to write a new library function or procedure, look at
basic_fun.cpp and basic_pro.cpp for examples.
The library functions communicate with the rest via the
environment.
See envt.hpp for the library function programming interface.
Three steps to install a new library subroutine:
1. Write it. Source in new.cpp header in new.hpp
(use namespace lib).
2. Make it known to the interpreter:
Add (preferable at the bottom) in libinit.cpp:
new DLibPro( subroutine name, GDL name, max. number of arguments, keyword list);
if the max. number of args is -1 an arbitrary number is allowed (like for
the PRINT procedure)
(new DLibFunRetNew( subroutine name ...) for functions,
new DLibFun( subroutine name ...) for functions which might return an existing value)
For the keyword list look at libinit.cpp for examples.
Using GSL (the Gnu Scientific Library)
Since 0.9.3: When you write routines utilizing the GSL, all GSL objects allocated with the
particular gsl_..._alloc(...) functions should be guarded, using GSLGuard. Please look
in src/gsl_fun.cpp for examples. GDL sets its own error handler for GSL. This way, errors in GSL
functions are properly handled and do not crash (via a call to abort()) GDL anymore.
The API (from envt.hpp)
=======================
implement a new library function:
this example function simply returns its number of parameters
#include "envt.hpp"
namespace lib {
BaseGDL* example( EnvT* e)
{
SizeT nPar = e->NParam();
BaseGDL* result = new DUIntGDL( nPar);
return result;
}
}
returns the actual number of parameters passed to a library function
minPar is the minimal number of parameters the function needs
(if less it throws)
SizeT NParam( SizeT minPar = 0);
raise an exception from within a library function
automatically cares for adding line/column info and the
function name. 's' should be set to the 'raw' error message:
void Throw( const std::string& s);
Within our example function the call would look like:
e->Throw( "An error occurred.");
'guards' a newly created variable which should be deleted
upon library routines exit (normal or on error)
eliminates the need of auto_ptr
void Guard( BaseGDL* toGuard);
Eg.:
char* tempVal = new DIntGDL( 10);
e->Guard( tempVal);
Note that all guarded objects are deleted with 'delete', ie. you
cannot guard arrays this way (which require 'delete[]', see
ArrayGuard<> class).
New since GDL 0.9.4: In the rare case that a library function returns an
existing variable (as opposed to a new one), it must notify the interpreter
about it by calling EnvBaseT::SetPtrToReturnValue( BaseGDL** ptrToRetVal).
Such a function must be a declared (in libinit.cpp) as a DLibFun, never as
DLibFunRetNew. While any other library function should be declared as
DLibFunRetNew.
for library functions (keyword must be an exact match)
returns the index of keyword k
int KeywordIx( const std::string& k);
Eg.:
static int structureIx = e->KeywordIx( "STRUCTURE");
returns environment data, by value (but that by C++ reference)
BaseGDL*& GetKW(SizeT ix) { return env[ix];}
Eg.:
BaseGDL* structureData = e->GetKW( structureIx);
returns the ix'th parameter (NULL if not defined)
BaseGDL*& GetPar(SizeT i);
the next are variations of GetKW() or GetPar() with some additional
checks:
get i'th parameter
throws if not defined (ie. never returns NULL)
BaseGDL*& GetParDefined(SizeT i); //, const std::string& subName = "");
throw for STRING, STRUCT, PTR and OBJECT useful if a function
only accepts numeric data
BaseGDL*& GetNumericParDefined( SizeT ix);
get i'th parameter
throws if not global (might be NULL), for assigning a new variable to
BaseGDL*& GetParGlobal(SizeT i);
next are some variations of GetKW() or GetPar() with perform a
conversion into an desired type:
get the pIx'th parameter and converts it to T if necessary
implies that the parameter must be defined
if it converts it cares for the destruction of the copy
CAUTION: this is for *read only* data, as the returned data might
be a copy or not
template <typename T>
T* GetParAs( SizeT pIx)
same as before for keywords
template <typename T>
T* GetKWAs( SizeT ix)
next two same as last two, but return NULL if parameter/keyword is not defined
template <typename T>
T* IfDefGetParAs( SizeT pIx);
same as before for keywords
template <typename T>
T* IfDefGetKWAs( SizeT ix);
returns the struct of a valid object reference or throws
DStructGDL* GetObjectPar( SizeT pIx);
for use within library functions
consider to use (note: 'static' is the point here):
static int kwIx = env->KeywordIx( "KEYWORD");
bool kwSet = env->KeywordSet( kwIx);
instead of:
bool kwSet = env->KeywordSet( "KEYWORD");
this one adds some overhead, but is easy to use
bool KeywordSet( const std::string& kw);
this one together with a static int holding the index is faster
(after the first call)
bool KeywordSet( SizeT ix);
keyword present (but might be an undefined variable)
bool KeywordPresent( SizeT ix);
local/global keyword/parameter
global -> passed in GDL by reference
local -> passes in GDL by value
bool LocalKW( SizeT ix);
bool GlobalKW( SizeT ix);
bool LocalPar( SizeT ix);
bool GlobalPar( SizeT ix);
next two to set keywords/parameters
note that the value MUST be created in the library function
with operator new
Before it must be tested with KeywordPresent() or NParam() if
the keyword/parameter is present
this is not done automatically because its more effective, to
create the data (newVal) only if its necessary
if the functions throw, they delete newVal before -> no
guarding of newVal is needed
void SetKW( SizeT ix, BaseGDL* newVal);
void SetPar( SizeT ix, BaseGDL* newVal);
Assure functions:
if name contains "Par" they must be used for parameters, else for keywords
(the error messages are defined for this usage and the indexing is
done respectively)
next two: NO CONVERSION (throw if wrong type)
NOTE: only few functions need to be so restrictive
converts parameter to scalar, throws if parameter is of different type,
non-scalar or not defined
template <typename T>
void AssureScalarPar( SizeT pIx, typename T::Ty& scalar);
same as before for keywords
template <typename T>
void AssureScalarKW( SizeT ix, typename T::Ty& scalar);
void AssureGlobalPar( SizeT pIx);
void AssureGlobalKW( SizeT ix);
if keyword 'kw' is not set, 'scalar' is left unchanged
void AssureLongScalarKWIfPresent( const std::string& kw, DLong& scalar);
converts keyword 'kw' if necessary and sets 'scalar'
void AssureLongScalarKW( const std::string& kw, DLong& scalar);
converts ix'th keyword if necessary and sets 'scalar'
void AssureLongScalarKW( SizeT ix, DLong& scalar);
converts parameter 'ix' if necessary and sets 'scalar'
void AssureLongScalarPar( SizeT ix, DLong& scalar);
same as for Long
void AssureDoubleScalarKWIfPresent( const std::string& kw, DDouble& scalar);
void AssureDoubleScalarKW( const std::string& kw, DDouble& scalar);
void AssureDoubleScalarKW( SizeT ix, DDouble& scalar);
void AssureDoubleScalarPar( SizeT ix, DDouble& scalar);
same as for Long
void AssureStringScalarKWIfPresent( const std::string& kw, DString& scalar);
void AssureStringScalarKW( const std::string& kw, DString& scalar);
void AssureStringScalarKW( SizeT ix, DString& scalar);
void AssureStringScalarPar( SizeT ix, DString& scalar);
About defining GDL structs in C++
=================================
'structList' contains a list of 'DStructDesc*' (struct descriptors).
For each structure, there must be exactly one descriptor but there
might be any number of instances (DStructGDL).
You have to distinguish between instances and descriptors.
For descriptors (DStructDesc) there is only (dstructdesc.hpp):
void AddTag( const std::string& tagName, BaseGDL* data);
this one does NOT grab the data, hence no new operator.
This is used in 'InitStructs()' (objects.cpp) for defining some
structures where there is no initial instance.
For instances (actually holding the data) there are (dstructgdl.hpp):
template< class DataGDL>
void InitTag(const std::string& tName, const DataGDL& data)
For already defined structures (DStructDesc exists).
And:
void DStructGDL::NewTag(const string& tName, BaseGDL* data)
For defining the descriptor AND the instance at the same time.
Used for example in initsysvar.cpp. This one grabs the data (hence
it has to be created newly with new).
Note that if there is already an instance elsewhere,
DStructDesc must not be changed anymore. Within GDL this is checked
(it is not allowed for a named struct to be redefined after defined
once), but from C++ its the programmers responsibility.
Generally: For defining a system variable NewTag should be used,
for returning a named struct, its descriptor should be defined in
'InitStructs()' and in the library function 'var = new DstructGDL(...)'
and 'var->InitTag(...)' should be used.
('should' means here 'I cannot think of any other usage so far')
Short overview of how GDL works internally
==========================================
Programs (*.pro files) or command line input is parsed (GDLLexer.cpp,
GDLParser.cpp generated with ANTLR from gdlc.g). These results in an
abstract syntax tree (AST) consisting of 'DNode' (dnode.hpp).
This syntax tree is further manipulated (compiled) with a tree parser
(GDLTreeParser.cpp generated with ANTLR from gdlc.tree.g,
dcompiler.hpp).
Here the AST is split into the different functions/procedures and
the DNode(s) are annotated with further information and converted to
ProgNode(s).
Then these compiled (ProgNode) ASTs are interpreted
(GDLInterpreter.cpp generated with ANTLR from gdlc.i.g, dinterpreter.cpp).
LINKIMAGE
=========
Joel Gales provided support for the LINKIMAGE procedure.
It allows users to add their own routines as dynamic libraries to
GDL. LINKIMAGE currently only works for the GNU/Linux version.
Note that you cannot use shared libraries written for IDL without change.
There is a simple example in "src/dll/two.cpp".
To use it:
1. Build the shared library:
g++ -I/(GDL header file directory) -c two.cpp
(64 bit: g++ -DHAVE_64BIT_OS -I/(GDL header file directory) -c two.cpp -fpic)
g++ -shared -lm -lc -o two.so two.o
2. Run LINKIMAGE:
GDL> linkimage,'TWO','/usr/local/src/gdl-0.9/src/dll/two.so',1,'two_fun'
with:
'TWO' is the GDL function name
'/usr/local/src/gdl-0.9/src/dll/two.so' is the location of the DLL
1 tells GDL that the TWO function is a function, use "0" for a procedure
'two_fun' is the entry point for this function within the "two.cpp" file
GDL> print,two(findgen(3))
0.000000 2.00000 4.00000