/*! @mainpage %Fimex User Documentation
%Fimex is the File Interpolation, Manipulation and EXtraction library for
gridded geospatial data. It converts between different, extensible data formats
(currently netcdf, grib1/2 and felt). It enables you to change the projection
and interpolation of scalar and vector grids. It makes it possible to subset the
gridded data and to extract only parts of the files.
%Fimex can be used as a library called @em %Fimex and as a command-line program
called @em fimex, which gives access to most but not all functions of the
library.
%Fimex is built around the Common Data Model version 1 developed by Unidata
and describes data using the CF-Convention http://cf-pcmdi.llnl.gov/documents/cf-conventions/1.0/cf-conventions.html .
Knowledge of that convention is not required, but will help in understanding the config files needed for conversion.
The API of %Fimex as included in this document is not stable yet and can
change without warning. The setup-files are considered to be mostly stable.
The fimex program can thus safely be used. If you want to use the API, please
contact me.
@section toc Table of Contents
-# @ref programDoc
-# @ref mergerDoc
-# @ref fillWriterDoc
-# @ref setup
-# @ref feltConfigDoc
-# @ref ncmlConfiguration
-# @ref qualityExtractorDoc
-# @ref gribReaderDoc
-# @ref gribWriterDoc
-# @ref netcdfWriterDoc
-# @ref parallelization
-# @ref memory_usage
-# @ref fortran90
@page programDoc fimex Program Options
@section fimex fimex Program Options
@em fimex is a command-line program. It has the following options:
@verbatim
usage: fimex --input.file FILENAME [--input.type INPUT_TYPE]
[--output.file FILENAME | --output.fillFile FILENAME [--output.type OUTPUT_TYPE]]
[--input.config CFGFILENAME] [--output.config CFGFILENAME]
[--extract....]
[--interpolate....]
[--timeInterpolate....]
Generic options:
-h [ --help ] help message
--version program version
--debug debug program
--print-options print all options
-c [ --config ] arg (=fimex.cfg) configuration file
Configurational options:
--input.file arg input file
--input.type arg filetype of input file, e.g. nc, nc4,
ncml, felt, grib1, grib2
--input.config arg non-standard input configuration
--input.printNcML print NcML description of input file
--input.printCS print CoordinateSystems of input file
--output.file arg output file
--output.fillFile arg output file, which should be filled
--output.type arg filetype of output file, e.g. nc,
nc4, grib1, grib2
--output.config arg non-standard output configuration
--process.accumulateVariable arg accumulate variable along unlimited
dimension
--process.deaccumulateVariable arg deaccumulate variable along unlimited
dimension
--process.rotateVectorToLatLonX arg rotate this vector x component from
grid-direction to latlon direction
--process.rotateVectorToLatLonY arg rotate this vector y component from
grid-direction to latlon direction
--process.printNcML [=arg(=-)] print NcML description of process
--process.printCS print CoordinateSystems of process
--extract.removeVariable arg remove variables
--extract.selectVariables arg select only those variables
--extract.reduceDimension.name arg name of a dimension to reduce
--extract.reduceDimension.start arg start position of the dimension to
reduce (>=0)
--extract.reduceDimension.end arg end position of the dimension to
reduce
--extract.reduceTime.start arg start-time as iso-string
--extract.reduceTime.end arg end-time by iso-string
--extract.reduceVerticalAxis.unit arg unit of vertical axis to reduce
--extract.reduceVerticalAxis.start arg start value of vertical axis
--extract.reduceVerticalAxis.end arg end value of the vertical axis
--extract.reduceToBoundingBox.south arg geographical bounding-box in degree
--extract.reduceToBoundingBox.north arg geographical bounding-box in degree
--extract.reduceToBoundingBox.east arg geographical bounding-box in degree
--extract.reduceToBoundingBox.west arg geographical bounding-box in degree
--extract.printNcML print NcML description of extractor
--extract.printCS print CoordinateSystems of extractor
--qualityExtract.autoConfString arg configure the quality-assignment
using CF-1.3 status-flag
--qualityExtract.config arg configure the quality-assignment with
a xml-config file
--qualityExtract.printNcML print NcML description of extractor
--qualityExtract.printCS print CoordinateSystems of extractor
--interpolate.projString arg proj4 input string describing the new
projection
--interpolate.method arg interpolation method, one of
nearestneighbor, bilinear, bicubic,
coord_nearestneighbor, coord_kdtree,
forward_min, forward_max, forward_mean,
forward_median or forward_sum,
forward_undef_min, forward_undef_max,
forward_undef_mean,
forward_undef_median or forward_undef_sum,
--interpolate.xAxisValues arg string with values on x-Axis, use ...
to continue, i.e. 10.5,11,...,29.5,
see Fimex::SpatialAxisSpec for full
definition
--interpolate.yAxisValues arg string with values on y-Axis, use ...
to continue, i.e. 10.5,11,...,29.5,
see Fimex::SpatialAxisSpec for full
definition
--interpolate.xAxisUnit arg unit of x-Axis given as udunits
string, i.e. m or degrees_east
--interpolate.yAxisUnit arg unit of y-Axis given as udunits
string, i.e. m or degrees_north
--interpolate.latitudeName arg name for auto-generated projection
coordinate latitude
--interpolate.longitudeName arg name for auto-generated projection
coordinate longitude
--interpolate.preprocess arg add a 2d preprocess to before the
interpolation, i.e.
"fill2d(critx,cor,maxLoop)"
--interpolate.latitudeValues arg string with latitude values in
degree, i.e. 60.5,70,90
--interpolate.longitudeValues arg string with longitude values in
degree, i.e. -10.5,-10.5,29.5
--interpolate.template arg netcdf file containing lat/lon
list used in interpolation
see Fimex::CDMInterpolator::changeProjection
--interpolate.printNcML print NcML description of
interpolator
--interpolate.printCS print CoordinateSystems of
interpolator
--merge.inner.file arg inner file for merge
--merge.inner.type arg filetype of inner merge file, e.g. nc,
nc4, ncml, felt, grib1, grib2
--merge.inner.config arg non-standard configuration for inner
merge file
--merge.inner.cfg arg recursive fimex.cfg setup-file to
enable all fimex-processing steps (i.e.
not input and output) to the
merge.inner source before merging
--merge.smoothing arg smoothing function for merge, e.g.
"LINEAR(5,2)" for linear smoothing, 5
grid points transition, 2 grid points
border
--merge.keepOuterVariables keep all outer variables, default: only
keep variables existing in inner and
outer
--merge.method arg interpolation method for grid
conversions, one of nearestneighbor,
bilinear, bicubic, coord_nearestneighbor,
coord_kdtree, forward_min, forward_max,
forward_mean, forward_median or
forward_sum
--merge.projString arg proj4 input string describing the new
projection
--merge.xAxisValues arg string with values on x-Axis, use ...
to continue, i.e. 10.5,11,...,29.5, see
Fimex::SpatialAxisSpec for full
definition
--merge.yAxisValues arg string with values on y-Axis, use ...
to continue, i.e. 10.5,11,...,29.5, see
Fimex::SpatialAxisSpec for full
definition
--merge.xAxisUnit arg unit of x-Axis given as udunits string,
i.e. m or degrees_east
--merge.yAxisUnit arg unit of y-Axis given as udunits string,
i.e. m or degrees_north
--merge.xAxisType arg (=double) datatype of x-axis (double,float,int,short)
--merge.yAxisType arg (=double) datatype of y-axis
--verticalInterpolate.type arg pressure, height or depth
--verticalInterpolate.method arg linear, log or loglog interpolation
--verticalInterpolate.level1 arg specification of first level, see
Fimex::CDMVerticalInterpolator for a
full definition
--verticalInterpolate.level2 arg specification of second level, only
required for hybrid levels, see
Fimex::CDMVerticalInterpolator for a
full definition
--verticalInterpolate.dataConversion arg
vertical data-conversion: theta2T,
omega2vwind or add4Dpressure
--verticalInterpolate.printNcML [=arg(=-)]
print NcML description of vertical interpolator
--verticalInterpolate.printCS print CoordinateSystems of vertical
interpolator
--timeInterpolate.timeSpec arg specification of times to interpolate
to, see MetNoFimex::TimeSpec for a full
definition
--timeInterpolate.printNcML print NcML description of
timeInterpolator
--timeInterpolate.printCS print CoordinateSystems of
timeInterpolator
--ncml.config modify/configure with ncml-file
--ncml.printNcML print NcML description after
ncml-configuration
--ncml.printCS print CoordinateSystems after
ncml-configuration
@endverbatim
All the configurational options can be set in a configuration file, which is supplied
using the --config option. Command line options (CLO) override the corresponding settings in the config-file. As a rule of
thumb, use the CLO for testing and the config-file for production use. The CLOs are explained further
in @ref fimex_config.
@subsection fimex_config fimex Setup File
@verbinclude test/felt2netcdf.cfg
The @em SpatialAxisSpec used in xAxisValues or yAxisValues for the spatial interpolation
should be formatted as explained in detail in MetNoFimex::SpatialAxisSpec. It also allows
autotuning to the original data values.
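The continuation syntax shown in the options above, e.g. 10.5,11,...,29.5, can be illustrated with a small stand-alone sketch. This is not the %Fimex parser: the function name expandAxisValues is hypothetical, and the sketch assumes that "..." continues with the step given by the two preceding values until the following value is reached. MetNoFimex::SpatialAxisSpec remains the authoritative definition.

```cpp
#include <cassert>
#include <cmath>
#include <sstream>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper, not part of Fimex: expand "10.5,11,...,29.5"
// into the explicit list of axis values.
std::vector<double> expandAxisValues(const std::string& spec)
{
    // split the comma-separated string into tokens
    std::vector<std::string> tokens;
    std::stringstream in(spec);
    std::string token;
    while (std::getline(in, token, ','))
        tokens.push_back(token);

    std::vector<double> values;
    for (std::size_t i = 0; i < tokens.size(); ++i) {
        if (tokens[i] != "...") {
            values.push_back(std::stod(tokens[i]));
            continue;
        }
        if (values.size() < 2 || i + 1 >= tokens.size())
            throw std::invalid_argument("'...' needs two values before and one value after");
        // assumption: the step is the difference of the two preceding values
        const double last = values.back();
        const double step = last - values[values.size() - 2];
        const double end = std::stod(tokens[i + 1]);
        if (step == 0 || (end - last) / step < 0.5)
            throw std::invalid_argument("'...' cannot reach the end value");
        // number of steps from last to end, rounded to absorb floating point error
        const long n = std::lround((end - last) / step);
        assert(n >= 1);
        for (long k = 1; k < n; ++k)
            values.push_back(last + k * step);
        // the end value itself is appended by the next loop iteration
    }
    return values;
}
```

With this sketch, expandAxisValues("10.5,11,...,29.5") yields 39 values from 10.5 to 29.5 in steps of 0.5.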
The @em TimeSpec string used for the timeInterpolate should be formatted as explained in
detail in MetNoFimex::TimeSpec.
@section setup Setup Files
Detailed information on the different configuration files can be found at:
- @ref feltConfigDoc
- @ref ncmlConfiguration
- @ref gribWriterDoc
- @ref netcdfWriterDoc
- @ref qualityExtractorDoc
@page feltConfigDoc Configuration files for felt reader
@subsection felt_config Configuration files for felt reader
The xml configuration files are defined by the @em felt2nc_variables.dtd
definition. Some parts of this configuration are quite stable, e.g.
the axes (time, level, lat, lon, x, y), while other parts, e.g.
the variables to translate, change very often. It is therefore useful
to split the variables from the rest of the configuration via @em xinclude.
When writing a new configuration for a new set of felt-files, usually from
a new model, it is wise to group the configuration by
-# time resolution, e.g. one config for 3hourly files, one config for hourly files
-# spatial resolution: fimex doesn't allow different spatial resolutions, but some
models use coarser resolution for higher levels
-# vertical levels: it is difficult to have the same parameter with sigma levels
and with height in m
Grouping can be done in two ways, the first one being faster in operation,
the second being easier to configure and change consistently:
-# write a different configuration-file for each group of parameters, specifying
the parameters as precisely as possible.
-# write one configuration-file for all parameters, keeping the parameters as
flexible as possible. Use a preprocessing step to extract each group,
e.g. with @em nyfelt or @em felt2felt as preprocessor.
By default, all data is read as @em type="short" data with a scaling factor. While
felt allows for one scaling factor for each timestep, height and parameter,
the CDM allows only one scaling factor per parameter. When the scaling factor
changes within height or timestep, @em fimex will fail to read the data as short.
It is therefore useful to read the data as @em type="float", which will automatically apply
the scaling factor. If the resulting file is too big, it is possible to
convert to short with one scaling factor and offset using the @ref netcdfWriterConfig.
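The following stand-alone sketch shows what storing float data as short with a single scaling factor and offset means, following the CF packing rule unpacked = packed * scale_factor + add_offset. It is an illustration only and not the %Fimex writer code: the names Packing, choosePacking, pack and unpack are hypothetical, and reserving -32768 for a _FillValue is an assumption of the sketch.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// one scale_factor/add_offset pair for a whole variable (hypothetical type)
struct Packing {
    double scaleFactor;
    double addOffset;
};

// choose a pair that maps the value range of the data onto -32767..32767,
// leaving -32768 free, e.g. for a _FillValue (assumption of this sketch)
Packing choosePacking(const std::vector<float>& data)
{
    assert(!data.empty());
    const auto mm = std::minmax_element(data.begin(), data.end());
    const double lo = *mm.first;
    const double hi = *mm.second;
    const double scale = (hi > lo) ? (hi - lo) / (2.0 * 32767.0) : 1.0;
    return {scale, (hi + lo) / 2.0};
}

// float -> short: subtract the offset, divide by the scale, round to nearest
std::int16_t pack(float value, const Packing& p)
{
    return static_cast<std::int16_t>(std::lround((value - p.addOffset) / p.scaleFactor));
}

// short -> floating point, as a CF-aware reader would do it
double unpack(std::int16_t packed, const Packing& p)
{
    return packed * p.scaleFactor + p.addOffset;
}
```

The round-trip error of a value is at most half a scaling factor, which is why a single factor only works well when the value range of the whole variable is moderate.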
Before running fimex with a new felt configuration, make sure the file
is valid, e.g. with
@code
xmllint --valid --noout felt2nc_config.xml
@endcode
Unfortunately, xinclude and validation don't play well together, since
validation usually happens before the inclusion of external parts. xmllint
provides special options to work around this problem:
@code
xmllint --xinclude --postvalid --noout felt2nc_config.xml
@endcode
Below follows a complete felt-configuration.
@verbinclude share/etc/felt2nc_variables.xml
@page ncmlConfiguration ncml Configuration
@section ncmlConfiguration ncml Configuration
@verbinclude share/etc/ncmlCDMConfig.ncml
Unidata's NetCDF Markup Language (NcML) as described in http://www.unidata.ucar.edu/software/netcdf/ncml/
makes it possible to change all information written in the CDM. With the --ncml.config option, the CDM
will be configured immediately after reading a file. It is also possible to read in a ncml file with the
--input.file=xxx.ncml option. In this case, the real data must be linked with the 'location' markup. As an extension
to the Unidata ncml-location field, fimex allows adding a type and a config field to location, e.g.
@code
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
location="test.flt felt felt2netcdfconfig.xml">
</netcdf>
@endcode
Input-files can and should be validated against the included ncml-2.2.xsd.
%Fimex now also supports ncml-aggregation. Simple examples are:
@code
<?xml version="1.0" encoding="UTF-8"?>
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<!-- same as above, but with scan -->
<aggregation type="joinExisting">
<scan location=". felt felt2nc.xml" regExp="joinExistingAgg\d+\.flt" />
</aggregation>
</netcdf>
@endcode
@code
<?xml version="1.0" encoding="UTF-8"?>
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
<aggregation type="union">
<netcdf location="joinExistingAgg1.nc" />
<netcdf location="unionAgg2.nc3" />
</aggregation>
</netcdf>
@endcode
MetNoFimex::NcmlCDMReader contains the up-to-date list of features.
@see MetNoFimex::NcmlCDMReader
@page qualityExtractorDoc quality-extraction Configuration
@section qualityExtractorDoc quality-extraction Configuration
@warning The quality-extraction is still in a very early stage of development. The configuration
and the outcome are very likely to change in further development. Any feedback is very welcome.
@verbinclude share/etc/cdmQualityConfig.xml
In cases where the data should only be extracted if certain conditions (qualities) apply, e.g.
the status-flag indicates a properly working instrument, or the sea-surface-temperature is
above 300K, the #MetNoFimex::CDMQualityExtractor allows adding these rules. The cdmQualityConfig.xml
file shown above gives an example of such a configuration.
- The variable "bla" will only be set, if "blub" has integer-values between 1 and 6.
- The variable "air_temperature" will only be extracted for an "altitude" above 1000. The value
1000 is the actual data value in the variable "altitude" without any scaling or unit-conversion applied.
- The variable "sea_surface_temperature" will set the fill-values from the "land_mask" found in "land.dat", which is an
external felt file and configured by "felt2nc.xml".
The following use-values can be selected:
- @c all select all valid values (within valid_max, valid_min or valid_range, without _FillValue)
- @c highest the highest numerical value found in the data-slice which is valid
- @c lowest the lowest numerical value found in the data-slice which is valid
- @c max:xxx.x all valid values below or equal to xxx.x
- @c min:xxx.x all valid values above or equal to xxx.x
All values which do not match the quality-criteria will be set to the _FillValue of the
variable.
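The effect of a min:/max: rule can be sketched in a few lines of stand-alone C++. This is a simplified illustration and not the MetNoFimex::CDMQualityExtractor implementation: AllowedRange and applyQuality are hypothetical names, and the sketch assumes that data and status values are already available as plain arrays of the same shape.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// allowed range of the status variable (hypothetical type);
// a rule like use="min:1000" would set only minAllowed
struct AllowedRange {
    double minAllowed = -std::numeric_limits<double>::infinity();
    double maxAllowed = std::numeric_limits<double>::infinity();
};

// keep data[i] where the status value lies within the allowed range,
// set all other values to fillValue
std::vector<double> applyQuality(const std::vector<double>& data,
                                 const std::vector<double>& status,
                                 const AllowedRange& range, double fillValue)
{
    assert(data.size() == status.size());
    std::vector<double> out(data);
    for (std::size_t i = 0; i < out.size(); ++i) {
        if (status[i] < range.minAllowed || status[i] > range.maxAllowed)
            out[i] = fillValue;
    }
    return out;
}
```

For the air_temperature example above, data would hold the temperatures, status the altitude values, and the range would have minAllowed set to 1000.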
@see MetNoFimex::CDMQualityExtractor
@page mergerDoc field merging Options
@section mergerDoc field merging Options
The purpose of the merging functionality is to produce a combined
field from two input fields, typically one with low horizontal
resolution covering a large area (base or outer),
and one with high horizontal
resolution covering a small area (top or inner).
The merge process has two steps:
first, the transition between the outer border of the high-resolution
field and the low-resolution field is smoothed; second, both are
interpolated to the final grid, using the smoothed high-resolution
field where defined, and the low-resolution field elsewhere.
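As an illustration of the smoothing step, the sketch below blends an outer and an inner value with a linear weight, in the spirit of a smoothing such as LINEAR(5,2) (5 grid points transition, 2 grid points border). The exact weighting used by MetNoFimex::CDMMerger is not described here, so the formula is an assumption: the inner weight is taken to be 0 within the border, to rise linearly over the transition, and to be 1 further inside. The names innerWeight and blend are hypothetical.

```cpp
#include <algorithm>
#include <cassert>

// Weight of the inner (high-resolution) field at a point lying `distance`
// grid points inside the edge of the inner domain.
// Assumption of this sketch: 0 within the border, linear ramp over the
// transition, 1 beyond border + transition.
double innerWeight(double distance, double transition, double border)
{
    assert(transition > 0 && border >= 0);
    const double w = (distance - border) / transition;
    return std::min(1.0, std::max(0.0, w));
}

// weighted mean of the outer and the inner value at one grid point
double blend(double outer, double inner, double distance,
             double transition, double border)
{
    const double w = innerWeight(distance, transition, border);
    return w * inner + (1.0 - w) * outer;
}
```

Under these assumptions and with transition 5 and border 2, a point 4.5 grid points inside the inner domain gets equal weight from both fields.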
When using the fimex program, input.file/type/config specify the outer
"low resolution" file, while merge.inner.file/type/config specify the
inner "high resolution" file. All compatible variables will be merged,
where compatibility means that names agree, and shapes match except for
length-1 dimensions and horizontal axes. To keep all other variables from
the base-file, one can specify merge.keepOuterVariables.
The final grid may be
specified with merge.projString,x/yAxisValues,x/yAxisUnit,x/yAxisType;
if not specified, it will be derived automatically by extending the
high resolution grid until it covers the outer grid.
Often, one needs to heavily manipulate both the outer and the
inner data to make them match each other. The outer data can be manipulated with the
usual fimex-commands, while the inner data can be modified with a fimex-setup file
and the --merge.inner.cfg option.
A complex example is shown below, run as
@verbatim
fimex -c ecPlusArome.cfg --output.file=out.nc4
@endverbatim
It joins global EC-data with Norwegian local AROME data. EC-data comes in 3hourly
timesteps, while Arome has hourly data. EC precipitation is split into convective
and large-scale precipitation, Arome has only one type of precipitation (arome precipitation
will be put into convective precipitation, while large-scale precipitation will be set
to 0).
Arome contains only surface data, while the output
should contain all model-level data from EC model (keepOuterVariables).
@verbatim
# file: ecPlusArome.cfg
[input]
file=ec_n1d_20151208_00.nc
# extract only certain parameters used for dispersion modelling
[extract]
selectVariables=air_temperature_2m
selectVariables=lwe_thickness_of_convective_precipitation_amount
selectVariables=lwe_thickness_of_stratiform_precipitation_amount
selectVariables=x_wind_10m
selectVariables=y_wind_10m
selectVariables=air_temperature_ml
selectVariables=x_wind_ml
selectVariables=y_wind_ml
selectVariables=air_pressure_at_sea_level
selectVariables=surface_air_pressure
selectVariables=surface_geopotential
# merge with arome 2.5km data
[merge]
inner.file=arome_metcoop2_5km_20151208_00.nc
# manipulate names with arome2EcPrecip.ncml
inner.config=arome2EcPrecip.ncml
# further configure data with arome3hprecip.cfg
inner.cfg=arome3hprecip.cfg
keepOuterVariables=1
smoothing=LINEAR(3,2)
method=bilinear
# ec - 0.05deg
projString=+proj=longlat +R=6.371e+06 +no_defs
xAxisUnit=degree_east
yAxisUnit=degree_north
# create 3hourly data
[timeInterpolate]
timeSpec=2015-12-08T00:00:00Z,2015-12-08T03:00:00Z,...,2015-12-10T12:00:00
@endverbatim
@verbatim
# file: arome2EcPrecip.ncml
<?xml version="1.0" encoding="UTF-8"?>
<netcdf xmlns="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2"
xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xsi:schemaLocation="http://www.unidata.ucar.edu/namespaces/netcdf/ncml-2.2 http://www.unidata.ucar.edu/schemas/netcdf/ncml-2.2.xsd">
<!-- change from kg/m2 (mm) to m and rename -->
<variable orgName="precipitation_amount_acc" name="lwe_thickness_of_convective_precipitation_amount">
<attribute name="units" type="String" value="m" />
<attribute name="scale_factor" type="float" value="0.001" />
</variable>
<!-- create a dummy with everything, will be made 0 in quality extractor -->
<variable orgName="surface_air_pressure" name="lwe_thickness_of_stratiform_precipitation_amount">
<attribute name="units" type="String" value="m" />
<attribute name="scale_factor" type="float" value="0.001" />
</variable>
</netcdf>
@endverbatim
@verbatim
# file: arome3hprecip.cfg
[extract]
# this is a dummy 0 generated with ncml
selectVariables=lwe_thickness_of_stratiform_precipitation_amount
# was precipitation_amount_acc before ncml
selectVariables=lwe_thickness_of_convective_precipitation_amount
# select same period as in ec
[timeInterpolate]
timeSpec=2015-12-08T00:00:00Z,2015-12-08T03:00:00Z,...,2015-12-10T12:00:00
[qualityExtract]
config=noConvectivePrec.xml
@endverbatim
@verbatim
# file: noConvectivePrec.xml
<?xml version="1.0" encoding="UTF-8"?>
<cdmQualityConfig>
<!-- set all values to 0 (no negative precip) -->
<variable name="lwe_thickness_of_stratiform_precipitation_amount" fillValue="0">
<status_flag_variable name="lwe_thickness_of_stratiform_precipitation_amount">
<allowed_values use="max:0" />
</status_flag_variable>
</variable>
</cdmQualityConfig>
@endverbatim
@see MetNoFimex::CDMMerger
@ref programDoc
@page fillWriterDoc fillWriter Options
@section fillWriterDoc fillWriter Options
The fillWriter enables the filling of an existing template-file with real data.
It was developed to speed up the creation of output-files of NWP-models. These models
usually create output one timestep at a time, and the timesteps are
then merged at the end in another stage, often on another machine. The %Fimex fillWriter makes sure
that one can look at the final file even before the model is finished. It is even
possible to restart the model at any state, replacing only the newly created slices.
@subsection howToUseFillWriter How to use the FillWriter
The axes of the output should be well known before the model starts, and the output-template
should be written by e.g.
@verbatim
fimex --input.file=example.nc --input.printNcML=template.ncml
# if needed, add axes values or similar to the ncml-file
fimex --input.file=template.ncml --output.file=template.nc --output.type=nc4
@endverbatim
This can also be achieved with more tuning possibilities with ncdump -s and ncgen -b.
@warning ncgen until 4.2.1.1 has some bugs when using 'special' attributes. Please use ncgen >= 4.3.
@verbatim
netcdf fillIn2 {
dimensions:
time = UNLIMITED ; // (2 currently)
sigma = 5 ;
lon = 3 ;
lat = 4 ;
variables:
short time(time) ;
time:standard_name = "time" ;
time:units = "hours since 2000-01-01 00:00:00 +00:00" ;
time:_Storage = "chunked" ;
time:_ChunkSizes = 1 ;
short sigma(sigma) ;
sigma:standard_name = "atmosphere_sigma_coordinate" ;
sigma:positive = "up" ;
sigma:scale_factor = 0.001f ;
time:_Storage = "chunked" ;
time:_ChunkSizes = 1 ;
short lon(lon) ;
lon:units = "degrees_east";
short lat(lat) ;
lat:units = "degrees_north";
short cloud_area_fraction(time, sigma, lat, lon);
cloud_area_fraction:units = "%";
cloud_area_fraction:_Storage = "chunked";
cloud_area_fraction:_ChunkSizes = 1, 1, 4, 3 ;
cloud_area_fraction:_Shuffle = "true" ;
cloud_area_fraction:_DeflateLevel = 3 ;
// global attributes:
:Conventions = "CF-1.4" ;
:_Format = "netCDF-4 classic model" ;
data:
time = 12, 24 ;
sigma = 200, 300, 500, 850, 1000 ;
lon = -10, 0, 10;
lat = 58, 59, 60, 61;
}
@endverbatim
In particular when writing compressed netcdf-4 files, make sure to set the _ChunkSizes to match the usual output-size, e.g.
horizontal size when reading from grib-files. When creating the template with fimex from ncml, this is done automatically.
The usage of the program is then very simple:
@verbatim
fimex --input.file=in.nc --output.fillFile=outFill.nc
@endverbatim
@see MetNoFimex::FillWriter
@ref programDoc
@page gribReaderDoc gribReader Configuration
@section gribReaderConfig gribReader Configuration
@subsection fiIndexGribs Indexing Grib-messages
Grib-files contain a sequence of many grib-messages, each representing
one layer of one parameter. Fimex first builds an index of all available
messages to be able to access the required messages quickly. For applications
which need only a few messages but a fast startup-time (e.g. viewer applications),
it is an advantage to pre-build the index:
@verbatim
fiIndexGribs -i input.grb
@endverbatim
This will create a file input.grbml, which will be automatically opened as long as
the file is newer than input.grb and in the directory defined by the environment
variable GRIB_INDEX_PATH. GRIB_INDEX_PATH might be absolute, starting with /, or
relative to the grib-file (input.grb). If the environment-variable is not set, the
directory of the input.grb-file is assumed. In an operational environment
GRIB_INDEX_PATH should be defined globally, and fiIndexGribs should be called from
the grib-files directory with
@verbatim
fiIndexGribs -i input.grb -o $GRIB_INDEX_PATH
@endverbatim
It is also possible to create one big grbml for many grib-files, e.g. when model-output is
spread over several files. Use the -a (appendFile) option of fiIndexGribs to generate the
file, e.g.
@verbatim
rm -f atmos1028_00.grbml
for i in atmosfc10280000_0*.grib \
         atmopl_lb10280000_0*.grib; \
do
    fiIndexGribs -i "$i" -a atmos1028_00.grbml;
done
@endverbatim
This grbml file can later be read like one big grib-file, i.e.
@verbatim
fimex \
--input.file=atmos1028_00.grbml \
--input.type=grbml \
--input.config=cdmGribReaderConfigEC_MAD_LB_FOG.xml \
--output.file=fromGribML.nc
@endverbatim
It is left to the user to ensure that all data-files are exactly in the same place and have
the same contents as when the grbml file was generated.
@subsection gribReaderConcat Concatenation of grib-messages
Grib-data is often split across several files. These files can be
combined by fimex, or generally by CDMReaderFactory, using a glob, e.g.
@verbatim
fimex --input.file=glob:*.grb
@endverbatim
which will read all files ending with .grb. Wildcards are: * for zero or many
characters, ? for exactly one character and ** for all subdirectories, e.g.
@verbatim
/home/heikok/**/*.grb
@endverbatim
will match all grib-files somewhere in my home-directory,
while
@verbatim
/home/heikok/*/*.grb
@endverbatim
will only match all grib-files one directory below my
home-directory.
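The wildcard rules can be made concrete by translating a glob into a regular expression. The sketch below is stand-alone and is not the code used by CDMReaderFactory: globToRegex and globMatches are hypothetical names, and the sketch assumes that * and ? do not match the directory separator while ** does.

```cpp
#include <cassert>
#include <regex>
#include <string>

// translate a glob pattern into an ECMAScript regular expression
std::string globToRegex(const std::string& glob)
{
    assert(!glob.empty());
    const std::string meta = "\\^$.|+()[]{}";
    std::string re;
    for (std::size_t i = 0; i < glob.size(); ++i) {
        const char c = glob[i];
        if (c == '*') {
            if (i + 1 < glob.size() && glob[i + 1] == '*') {
                re += ".*"; // double star: any characters, including '/'
                ++i;
            } else {
                re += "[^/]*"; // single star: zero or many characters, no '/'
            }
        } else if (c == '?') {
            re += "[^/]"; // exactly one character, no '/'
        } else if (meta.find(c) != std::string::npos) {
            re += '\\'; // escape regex meta-characters such as '.'
            re += c;
        } else {
            re += c;
        }
    }
    return re;
}

// true if the complete path matches the glob
bool globMatches(const std::string& glob, const std::string& path)
{
    return std::regex_match(path, std::regex(globToRegex(glob)));
}
```

Under these assumptions, the pattern with ** from above matches grib-files in any subdirectory depth, while the pattern with a single * only matches files exactly one directory below.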
Files might also be concatenated by simply adding new files with <tt>--input.optional</tt>, e.g.
@verbatim
fimex --input.file=file1.grb --input.optional=file2.grb --input.optional=file3.grb
@endverbatim
<tt>--input.optional</tt> does not accept glob:-syntax.
@subsection gribReaderConcatEnsemble Concatenation of grib-model output to combined ensembledata
Grib data might come with a built-in ensemble axis. But in some cases, one wants
to combine models into an ensemble although the ensemble axis is not included in the files,
e.g. in GLAMEPS ( http://www.hirlam.org/index.php?option=com_content&view=article&id=61&Itemid=103 ).
In this case, fimex allows matching the filenames by names or by regular expressions,
e.g.
@verbatim
fimex --input.file=*.grb --input.optional=memberName:mbrABC --input.optional=memberName:mbr001
@endverbatim
would add ensemble-id 0 to all files containing mbrABC and ensemble-id 1 to all files containing mbr001.
It is also possible to match the members with regular expressions, e.g.
@verbatim
fimex --input.file=*.grb --input.optional="memberRegex:mbrABC.*" --input.optional="memberRegex:mbr001.*"
@endverbatim
The order of @c memberName or @c memberRegex determines the position in the final file. memberNames are
internally translated to a regex like
@code
.*\QmemberName\E.*
@endcode
To have nicely formatted names instead of the @c memberRegex, one can add another parameter to @c memberRegex and
@c memberName, e.g. <tt>memberRegex:mbrABC.*:mbrABC</tt> or <tt>--input.optional=memberName:mbrABC:ensembleABC</tt>.
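The translation of a memberName into a quoted regular expression, and the assignment of ensemble-ids by order, can be sketched as follows. This is a stand-alone illustration, not the %Fimex implementation: quoteRegex and ensembleIndex are hypothetical names. Since std::regex has no \Q...\E quoting, the sketch escapes the meta-characters explicitly, and it assumes that the first matching member wins.

```cpp
#include <cassert>
#include <regex>
#include <string>
#include <vector>

// escape all regex meta-characters, the equivalent of \Q...\E quoting
std::string quoteRegex(const std::string& literal)
{
    static const std::regex meta(R"([.^$|()\[\]{}*+?\\])");
    // "$&" is the matched character, so each one gets a backslash in front
    return std::regex_replace(literal, meta, R"(\$&)");
}

// position (ensemble-id) of the first member whose name occurs in the
// filename, or -1 if no member matches
int ensembleIndex(const std::string& filename,
                  const std::vector<std::string>& memberNames)
{
    for (std::size_t i = 0; i < memberNames.size(); ++i) {
        const std::regex re(".*" + quoteRegex(memberNames[i]) + ".*");
        if (std::regex_match(filename, re))
            return static_cast<int>(i);
    }
    return -1;
}
```

With the member names mbrABC and mbr001 in this order, a file whose name contains mbrABC gets ensemble-id 0 and a file whose name contains mbr001 gets ensemble-id 1.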
@subsection gribReaderConfigXml Grib-Table to netcdf/CF translation
Within fimex, the data-structure needs to conform to CF. Since grib is defined by external tables, these tables
need to be translated to their netcdf/CF equivalents in the <tt>cdmGribReaderConfig.xml</tt>, which is required
for all grib-files.
When extraKeys (e.g. localDefinitionNumber) are needed to define a grib-message, and fiIndexGribs is used
to index the grib-file, all extraKeys need to be passed to fiIndexGribs:
@verbatim
fiIndexGribs -i input.grb --extraKey=localDefinitionNumber --extraKey=anotherKey
@endverbatim
Long example of a cdmGribReaderConfig.xml:
@verbinclude share/etc/cdmGribReaderConfig.xml
@see MetNoFimex::GribCDMReader
@page gribWriterDoc gribWriter Configuration
@section gribWriterConfig gribWriter Configuration
@verbinclude share/etc/cdmGribWriterConfig.xml
@page netcdfWriterDoc netcdfWriter Configuration
@section netcdfWriterConfig netcdfWriter Configuration
The netcdfWriterConfig makes it possible to set some features
specific to netcdf-files, e.g. file-format (netcdf3/4) or compression.
It is also possible to add an @ref ncmlConfiguration to the output to change the
internal structure just before writing.
It is also possible to change units, including all values, in the netcdfWriterConfig.
Changing the units in the ncmlConfiguration would change the attribute value only, but
not the data.
The CDM resembles a netcdf datastructure. In general, there is
no need to use a configuration for this writer, but it might be useful
in the following cases:
- Output-files are too big, and a change of datatype, e.g. from float to short,
  is desired.
- Different attributes are required for special usages, but the input-configuration
  of the reader should not be changed.
- Different variable- or dimension-names are required for special usages.
@verbinclude share/etc/cdmWriterConfig.xml
@see MetNoFimex::GribApiCDMWriter
@page parallelization
@section parallelization Parallelization: Fork, Threads, MPI and OpenMP
@subsection Fork-safety
The %Fimex library as of version 0.56 can be used with forked processes. It requires a fork
system call as provided by Unix/Linux environments. %Fimex processes can be forked just before
the data-fetching, which achieves very good scaling for reading data. An example of how
to use getDataSlice with pre-forking can be found in share/doc/examples/parallelRead.cpp
(see Examples).
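The pre-forking pattern can be sketched with plain POSIX calls as below. This is not the code from
parallelRead.cpp: handleSlice is a hypothetical stand-in for fetching and processing one data-slice,
the round-robin distribution of slices is an assumption, and the children report only success or
failure through their exit status.

```cpp
#include <cstddef>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical stand-in for the per-slice work, e.g. fetching one
// data-slice from a reader and processing it. Returns true on success.
bool handleSlice(std::size_t slice)
{
    (void)slice;
    return true;
}

// Fork nProcs children just before the data-fetching. The child with number
// rank handles the slices rank, rank + nProcs, rank + 2*nProcs, ...
// Returns the number of children that finished without error.
int processSlicesForked(std::size_t nSlices, std::size_t nProcs)
{
    std::size_t started = 0;
    for (std::size_t rank = 0; rank < nProcs; ++rank) {
        const pid_t pid = fork();
        if (pid < 0)
            break; // fork failed, wait for the children started so far
        if (pid == 0) {
            bool ok = true;
            for (std::size_t i = rank; i < nSlices; i += nProcs)
                ok = handleSlice(i) && ok;
            _exit(ok ? 0 : 1); // leave the child without running parent cleanup
        }
        ++started;
    }
    int succeeded = 0;
    for (std::size_t n = 0; n < started; ++n) {
        int status = 0;
        if (wait(&status) > 0 && WIFEXITED(status) && WEXITSTATUS(status) == 0)
            ++succeeded;
    }
    return succeeded;
}
```

Everything opened before the fork is inherited by the children, while results computed in a child are
not visible to the parent unless they are written to a file or sent back through some other channel.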
@subsection Thread-safety
The %Fimex library can be used in threaded environments. %Fimex objects are
generally not thread-safe, so every object should only be used from a single
thread. But several threads can create their own %Fimex objects.
In addition, all CDMReader::get*Data*() operations are thread-safe and the following
code will work nicely:
@verbatim
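size_t unlimSlices = unLimDim->getLength();
// the pragma must be followed directly by the for-loop;
// exceptions must be caught inside the loop body
#pragma omp parallel for default(shared)
for (size_t i = 0; i < unlimSlices; ++i) {
    try {
        doSomething(reader->getDataSlice(varName, i));
    } catch (...) {
        // handle or record the error here
    }
}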
@endverbatim
@subsection OpenMP OpenMP
%Fimex can be built with OpenMP parallelization support using the --enable-openmp
flag of @c configure. The following code-parts are currently (0.35) parallelized:
- NetCDF-writer/Null-writer: fetches each data-slice in a thread of its own.
  Nearly perfect scaling until the IO-system is saturated. The memory-consumption is
  linear with the number of threads.
- interpolation: repositioning of values
scales about factor 1.8 per processor for bilinear, better for bicubic, worse for nearestneighbor
- interpolation: fill2d
This scales well with the number of input layers (sigma, depth)
- interpolation with coord_nearestneighbor
  Parts of the startup of the interpolation are parallelized, but
  this is still much slower than coord_kdtree.
Often, the performance is limited by the IO-system.
On the fimex-commandline, the number of threads can be set using:
@code
fimex --num_threads=2 -c test.cfg
@endcode
When using the library, one should use:
@code
#include "fimex/ThreadPool.h"
...
if (MIFI_OK == mifi_setNumThreads(2)) {
    // the rest of the fimex code starts below
}
...
@endcode
@subsection MPI
To get MPI to work, the following prerequisites have to be met:
- hdf5 library compiled with --enable-hl, --enable-parallel and --enable-shared
- netcdf4 library compiled against above hdf5 library with
''CC=mpicc ./configure --enable-parallel-tests --enable-netcdf-4''
- fimex compiled against above netcdf like
''CXX=mpic++ CC=mpicc CFLAGS=-O2 CXXFLAGS=-O2 ./configure --disable-openmp''
fimex can then be called with ''mpiexec -n 8 fimex'' and will use parallel MPI-IO to write the
netcdf-files, with the following caveats:
- compression of netcdf does not work due to hdf5 limitations in parallel mode: http://www.hdfgroup.org/hdf5-quest.html#p5comp
- unlimited dimensions are disabled - bug in netcdf-4.3.*?
- please read the created files with a ncml-file re-setting the unlimited dimension,
e.g. ''<dimension name="time" isUnlimited="true"/>''
- The implementation is currently only tested with OpenMPI 1.8.3
- The implementation only works for creating netcdf4 files; the other
  file-formats supported by fimex do not allow parallel writing with MPI-IO
Performance of reading an 11GB compressed netcdf4 file on a 16-core, 32-thread 2.6GHz machine
connected to a lustre parallel filesystem (factor is the speed-up relative to the previous row):
@verbatim
nproc  time [s]  factor
    1     158.7
    2      79.2     2
    4      52.2     1.5
    8      29.0     1.8
   16      19.4     1.5
   32      21.5     0.9
@endverbatim
Reading an 11GB compressed netcdf4 file and writing it as an uncompressed 37GB netcdf4 file:
@verbatim
nproc  time [s]  factor
    1     232.0
    2     147.6     1.6
    4     116.9     1.3
    8      99.4     1.2
   16     104.0     0.9
   32     119.6     0.8
@endverbatim
Using other compute-intensive data-manipulations will usually improve the scaling.
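The timing tables above can be evaluated further, e.g. for the total speed-up and the parallel
efficiency. The sketch below is only a worked calculation: the type and function names are hypothetical,
and the measurements are copied from the read-only table.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Timing {
    int nproc;
    double seconds;
};

struct Scaling {
    int nproc;
    double stepFactor; // time of the previous row divided by time of this row
    double speedup;    // time of the first row divided by time of this row
    double efficiency; // speedup divided by the increase in process count
};

// Derive the scaling figures for each row of a timing table.
std::vector<Scaling> scaling(const std::vector<Timing>& t)
{
    assert(!t.empty());
    std::vector<Scaling> out;
    for (std::size_t i = 0; i < t.size(); ++i) {
        const double prev = (i == 0) ? t[i].seconds : t[i - 1].seconds;
        const double speedup = t[0].seconds / t[i].seconds;
        const double procRatio = static_cast<double>(t[i].nproc) / t[0].nproc;
        out.push_back({t[i].nproc, prev / t[i].seconds, speedup, speedup / procRatio});
    }
    return out;
}

// Measurements of the read-only case from the first table above.
const std::vector<Timing> readTimings = {
    {1, 158.7}, {2, 79.2}, {4, 52.2}, {8, 29.0}, {16, 19.4}, {32, 21.5}};
```

For the read-only case, 16 processes give a total speed-up of about 8.2 compared to one process,
which corresponds to a parallel efficiency of about 51%.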
@page memory_usage
@section memory_usage Memory Usage
The fimex command line utility usually needs memory for one slice of the unlimited dimension. This slice
is often copied between different buffers, so the minimum memory-usage is the slice size multiplied by 2 or 3.
Interpolation needs an interpolation-cache of the horizontal resolution times 5 when vector-reprojection
is used.
NetCDF4/hdf5 uses an hdf chunk cache of size 1009*4M (10*4M per variable)
(as default in netcdf-4.3.0). The environment variable FIMEX_CHUNK_CACHE_SIZE
changes that size, e.g. FIMEX_CHUNK_CACHE_SIZE=2185232384 sets the global cache to
521*4M (4M=4194304). A size of 0 disables the cache; as long as
complete/compressed chunks are read, this usually causes no performance degradation. The number of slots
can be changed with FIMEX_CHUNK_CACHE_SLOTS and defaults to 521. Good values are large primes,
much larger than the number of chunks.
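The cache size quoted above is simply the number of 4M blocks multiplied by 4194304 bytes. A small
sketch of that arithmetic, with hypothetical helper names, is:

```cpp
#include <cstdint>
#include <string>

// 4M in bytes, as used in the text above
constexpr std::uint64_t bytesPerBlock = 4194304;

// Cache size in bytes for the given number of 4M blocks.
constexpr std::uint64_t chunkCacheBytes(std::uint64_t blocks)
{
    return blocks * bytesPerBlock;
}

// Checked at compile time against the value quoted in the text.
static_assert(chunkCacheBytes(521) == 2185232384ULL, "521*4M");

// Build the environment assignment for a cache of the given number of blocks.
std::string chunkCacheSetting(std::uint64_t blocks)
{
    return "FIMEX_CHUNK_CACHE_SIZE=" + std::to_string(chunkCacheBytes(blocks));
}
```

The value for 521 blocks no longer fits into a signed 32-bit integer, which is why 64-bit arithmetic is
used here.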
@page fortran90
@section fortran90 Fortran90 interface
Fimex comes with a Fortran90 interface in modules/F90/fimex.f90 using Fortran 2003.
The libfimexf library and fimex.mod needed for using it are built when --enable-fortran is passed
to ./configure of fimex, and programs should be linked with -lfimexf. Documentation of the
interface can be found in fimex.f90. Of most interest are the methods belonging to
the high-level interface Fimex::FimexIO.
An example can be found in modules/F90/fortran_test.f90 (see Examples).
For working with 2D x/y-fields, as common with numerical weather prediction models
and grib-files, the fimexf library also contains the module modules/F90/fimex2d.F90 .
An example of using the 2d high-level library can be found in
modules/F90/fimex2d_example.F90 (see Examples).
*/