|
|
|
|
||
|
DEFINITION MODULE DataFrames; (******************************************************************* Module DataFrames (DF_Version_2.2) Copyright (c) 1997-2006 by Andreas Fischlin, Dimitrios Gyalistras and ETH Zurich. Purpose Core routines to work with Data Frames (see also module DatFraAux for more advanced routines). Data Frames are read from file(s) and stored in memory for analysis and subsequent retrieval. Remarks Current implementation assumes any number of data frames (spread over any number of files via so-called file references) can be read into memory for subsequent retrieval. Actual data are always specified in a tabular form. The syntax of an input file is as follows: DataFrameFile = (FileReference | DataFrame) {FileReference | DataFrame}. FileReference = "FILE" "=" fileName ";" [FilterSpecif]. fileName = STRING. FilterSpecif = "USE" "IF" "FILTER" "=" filterVal ";". filterVal = LONGINT. DataFrame = "DATAFRAME" dataFrameIdent ";" [DataFrameParamList] "DATA" ":" Table "END" dataFrameIdent ";". dataFrameIdent = IDENTIFIER. DataFrameParamList = { [FilterSpecif] | [ParentOrModelSpecif] | [RemarkSpecif] | [KeyColumnSpecif] }. ParentOrModelSpecif = ("PARENT"|"MODEL") "=" parentOrModelID ";". parentOrModelID = IDENTIFIER | "ANY" | "ALL". RemarkSpecif = "REMARK" "=" STRING ";". KeyColumnSpecif = "KEYCOLUMN" "=" keyColumnID ";". keyColumnID = IDENTIFIER. Table = TableHeader TableLine {";" TableLine}. TableHeader = IDENTIFIER {IDENTIFIER} ";". TableLine = TableEle {TableEle}. TableEle = (INTEGER | LONGINT | REAL | LONGREAL | IDENTIFIER | STRING | BOOLEAN). Syntax of the elementary data types is (regular expression notation): INTEGER = [+-]?[0-9]+ LONGINT = [+-]?[0-9]+ "D" REAL = [+-]?[0-9]+ "." [0-9]+ (("E"|"e")[+-]?[0-9]+)? LONGREAL = [+-]?[0-9]+ "." [0-9]+ (("D"|"d")[+-]?[0-9]+)? IDENTIFIER = [a-zA-Z] [_a-zA-Z0-9]* STRING = ('.*')|(".*") BOOLEAN = "TRUE" | "FALSE" Comments start with "(*" and close with "*)" and may be nested. Examples: DATAFRAME SiteBatch; MODEL = ForClim; DATA: SiteName; "Bern"; "Davos"; "Bever_S" END SiteBatch; or DATAFRAME ModelParameters; REMARK = 'For any logistic growth model'; MODEL = ANY; KEYCOLUMN = Ident; DATA: (*===========================================================*) Ident Descr val min max unit ; (*-----------------------------------------------------------*) r 'Relative growth rate' 0.7 0.0 10.0 'd^-1' ; K 'Carrying capacity' 700.0 0.0 1.0E+38 'g/m^2' ; (*-----------------------------------------------------------*) END ModelParameters; Filter specifications serve as a reading filter to control conditional recognition of file references or data frames. Only those file references or data frames, respectively, are scanned for content and loaded into memory, whose filter value (see symbol filterVal in EBNF above) actually fits within the filter range [fromFilter..toFilter] passed as actual arguments to the reading routines ReadDataFramesIntoMemory or LoadDataFrames (see below). Otherwise they are entirely skipped and the scanning process continues after the file reference or data frame, respectively. Any file reference or data frame without a filter specification (optional symbol FilterSpecif, see EBNF above) will be treated to have a filter specification like "USE IF FILTER = 0 (* = defaultFilter *)". To recognize, i.e. load, all file references and data frames regardless of their filter specifications, pass the range [minFilter..maxFilter] to ReadDataFramesIntoMemory or LoadDataFrames, respectively. Note, this module does not export an abstract data type DataFrame. See module DatFraAux instead, which exports this type and allows to access and operate on individual data frames via its routines. NOTE This module uses package DataTables (all data tables are used in locked mode only). Programming o Design Andreas Fischlin 04/03/1997 Dimitrios Gyalistras 23/07/1997 o Implementation Andreas Fischlin 04/03/1997 ETH Zurich Systems Ecology CHN E 35.1 Universitaetstrasse 16 8092 Zurich SWITZERLAND URLs: <mailto:RAMSES@env.ethz.ch> <http://www.sysecol.ethz.ch> <http://www.sysecol.ethz.ch/SimSoftware/RAMSES> Last revision of definition: 05/06/1998 AF *******************************************************************) (********************************) (*##### Error Handling #####*) (********************************) FROM DMLanguage IMPORT userBase; CONST dtfOffset = userBase + 200; (* constants used for resCode *) badLoadFromAnchor = dtfOffset + 0; (* internally used by LoadDataFrames *) loadUserInterrupted = dtfOffset + 1; (* internally used by LoadDataFrames *) integerOverflow = dtfOffset + 2; (* internally used by VI,VIVec *) realOverflow = dtfOffset + 3; (* internally used by VR,VRVec *) datFraHasNoKeyCol = dtfOffset + 4; (* internally used by GetValDefsKeyColumn *) subKeyIsRealOrBoolCol = dtfOffset + 5; (* internally used by GetValDefsSubKeyColumn *) (* Note: Error #'s 500..550 (i.e. userBase + 200..userBase + 250) are reserved for the entire package DataFrames. The underlying DataTables uses range 600..650. The DataFrames package uses only module Errors and its modes for error generation and display. WARNING: In case the mode is suppressing and deferring error messages, you won't learn about any errors which have possibly occurred. See also module Errors. *) (*************************************************) (*##### Loading Data Frames Into Memory #####*) (*************************************************) TYPE ReadingFilter = LONGINT; VAR defaultFilter: ReadingFilter; (* READ ONLY *) minFilter: ReadingFilter; (* READ ONLY *) maxFilter: ReadingFilter; (* READ ONLY *) PROCEDURE ReadDataFramesIntoMemory (startOnFileWithName: ARRAY OF CHAR; fromFilter, tillFilter: ReadingFilter; VAR resCode: INTEGER); PROCEDURE LoadDataFrames (VAR startOnFileWithName: ARRAY OF CHAR; fromFilter, tillFilter: ReadingFilter; VAR resCode: INTEGER); (* Above procedures use the file with name startOnFileWithName (may contain a path) as an anchor to scan and interprete data frames. Any value definitions found during the reading process are transferred to memory and can subsequently be retrieved by any of the value definition retrieval routines described below (e.g. VR or GetRVec). Any file reference or data frame whose filter specification is out of range [fromFilter..tillFilter] will be ignored and skipped by the scanning process. Note, file references or data frames which do not specify its filter explicitely, i.e. those without the phrase "USE IF FILTER = x" are treated as if they would have the specification "USE IF FILTER = defaultFilter (* = 0 *)". To load without any filtering pass the range [minFilter..maxFilter] to above routines. Any data frames already in memory with the same identifier as a data frame encountered during the reading process will be discarded and replaced (overwritten) with the content of the newly read data frame. This allows for a partial or total updating of value definitions or loading of additional data frames etc. See also procedure DropAllDataFrames for freeing memory and module DatFraAux for more advanced operations on data frames. LoadDataFrames is very similar to ReadDataFramesIntoMemory; in fact, it calls the same core function as ReadDataFramesIntoMemory, but offers additional features like offering a file opening dialog in case that startOnFileWithName is the empty string, displays the progress by showing in a small window which file is corrently processed, and features an automatic error display in case that the anchor file can't be accessed properly or ReadDataFramesIntoMemory returns with resCode<>Errors.allOk. startOnFileWithName is also an in/out parameter which allows to learn about which file has actually been opened as anchor. If the file opening fails for whichever reason, the startOnFileWithName returns untouched. All currently installed handlers for a loading event (see module DatFraAux, routine AddAnyDFChangedHandler) will always be called at the end of ReadDataFramesIntoMemory. *) (**********************************************) (*##### Retrieving Value Definitions #####*) (**********************************************) TYPE ALPHA = ARRAY [0..63] OF CHAR; (* also maximum size of identifiers *) ValDefType = (undefined, integer, longint, singlereal, doublereal, boolean, identifier, string); (* Types of value definitions supported by data frames. undefined denotes a value definition which has never been made, i.e. the associated identifier has never been found in any data frames by previous calls to ReadDataFramesIntoMemory (see also procedure GetValDefType). Note, in contrast to a general string an identifier follows a particular syntax and starts with a letter and contains no special characters, i.e. its EBNF is: identifier ::= letter {letter|digit|"_"}. *) VAR undefInteger: INTEGER; (* READ ONLY; = MIN(INTEGER)+1 *) undefLongInt: LONGINT; (* READ ONLY; = MIN(LONGINT)+1 *) undefReal: REAL; (* READ ONLY; = DMConversions.UndefREAL() *) undefLongReal: LONGREAL; (* READ ONLY; = DMConversions.UndefLONGREAL() *) undefBoolean: BOOLEAN; (* READ ONLY; = BOOLEAN(ORD(MAX(BOOLEAN))+1) *) (* WARNING: undefBoolean may be TRUE or FALSE in a compiler dependent manner. *) undefString: ARRAY [0..0] OF CHAR; (* READ ONLY; undefString[0] = 0C *) (* Most above constants are the defaults also used by DataTables. They are returned by the value definition retrieval procedures in case the value definition could not be found or an error occurred, like a number overflow. *) (*--------------------------------*) (*===== scalar retrieval =====*) (*--------------------------------*) PROCEDURE VI (ident: ARRAY OF CHAR): INTEGER; PROCEDURE VLI (ident: ARRAY OF CHAR): LONGINT; PROCEDURE VR (ident: ARRAY OF CHAR): REAL; PROCEDURE VLR (ident: ARRAY OF CHAR): LONGREAL; PROCEDURE VB (ident: ARRAY OF CHAR): BOOLEAN; (* returns undefBoolean if undefined *) PROCEDURE Flag (ident: ARRAY OF CHAR): BOOLEAN; (* returns FALSE if undefined *) PROCEDURE GetVS (ident: ARRAY OF CHAR; VAR s: ARRAY OF CHAR); PROCEDURE GetVId(ident: ARRAY OF CHAR; VAR id: ARRAY OF CHAR); (*--------------------------------*) (*===== vector retrieval =====*) (*--------------------------------*) PROCEDURE GetIVec (ident: ARRAY OF CHAR; VAR ivec: ARRAY OF INTEGER; VAR n: INTEGER); PROCEDURE GetLIVec(ident: ARRAY OF CHAR; VAR livec: ARRAY OF LONGINT; VAR n: INTEGER); PROCEDURE GetRVec (ident: ARRAY OF CHAR; VAR rvec: ARRAY OF REAL; VAR n: INTEGER); PROCEDURE GetLRVec(ident: ARRAY OF CHAR; VAR lrvec: ARRAY OF LONGREAL; VAR n: INTEGER); PROCEDURE GetBVec(ident: ARRAY OF CHAR; VAR bvec: ARRAY OF BOOLEAN; VAR n: INTEGER); PROCEDURE GetSVec(ident: ARRAY OF CHAR; VAR svec: ARRAY OF ALPHA; VAR n: INTEGER); PROCEDURE GetIdVec(ident: ARRAY OF CHAR; VAR idvec: ARRAY OF ALPHA; VAR n: INTEGER); PROCEDURE GetSs (ident: ARRAY OF CHAR; sepChar: CHAR; VAR ssvec: ARRAY OF CHAR; VAR n: INTEGER); PROCEDURE GetIds (ident: ARRAY OF CHAR; sepChar: CHAR; VAR idvec: ARRAY OF CHAR; VAR n: INTEGER); (* Above procedures allow to fetch value definitions as contained in all data frames currently stored in memory. Basic methods of retrieving value definitions --------------------------------------------- Value definitions are given by an identifier consisting of 2 (vector retrieval) or 3 parts (scalar retrieval), where parts are delimited by ".". Ex.: "MyModel.Bern.Bucketsize" - scalar retrieval returns single value "MyModel.Bucketsize" - vector retrieval returns vector of n values Formation of ident ------------------ The identifier ident denotes a particular value definition, the procedures return the corresponding value (in case of vector retrieval additional information on the dimension of the vector is also returned). In case of scalar retrieval, the middle part needs not be an identifier, but can be any string and may even contain a period. Ex.: good idents for value definitions: "ForClim.Bern.Beech_species" "FC_C.St. Gallen.Bucketsize" bad idents for value definitions: "ForClim.Bern.Beech specs." "FC-C.St. Gallen.Bucketsize" Note: There are auxiliary routines (ConcatIdent and SplitIdent) available from this module for construction and analysis of ident. The first part of ident is always formed exactly the same way: It is either the identifier of the data frame (see also routine DatFraAux.GetDataFrameIdent) or it is the parent or model identifier, respectively. Any data frame may use a so called parent or model specification (see EBNF-production ParentOrModelSpecif), hereby using a common identifier for value definitions. This allows to construct large sets of value definitions denoted by a common first part from several data frames. Ex.: DATAFRAME SiteEdaphics; MODEL = ForClim; Data: (*-------------------------------------*) Site Bucketsize SiteID; (*-------------------------------------*) "Bern" 30.0 333333 ; "Bever S" 20.0 222222 ; "Davos" 25.0 666666 ; END SiteEdaphics; DATAFRAME TreeParams; MODEL = ForClim; KEYCOLUMN = SpecIdent; DATA: (*-------------------------------------------------------------*) SpecIdent Scientific_name Common_name; (*-------------------------------------------------------------*) Fsil "Fagus silvatica L." "European beech" ; Aalb "Abies alba Mill." "European silver fir" ; END TreeParams; Above data frames allow for accessing value definitions, which all start with an identical first part, i.e. idents like "ForClim.Site", "ForClim.BucketSize", or "ForClim.Fsil.Scientific_name" (see also following explanations). The remainder parts of ident are constructed in two different ways, depending on the retrieval method: Retrieval method Examples of ident ---------------- ------------------------- scalar retrieval "MyModel.ThirdRowEntry.MyColumnHead" "SiteEdaphics.Bern.BucketSize" vector retrieval "MyModel.MyColumnHead" "SiteEdaphics.BucketSize" IMPORTANT NOTE: Scalar retrieval is only possible if the data frame uses a keyColumn. Vector retrieval is always possible. In case of scalar retrieval, the first part of ident is followed by a '.' and the identifier of a particular column. Such an ident always denotes an entire vector of values. The tabulated number of rows defines its dimension n. E.g. the following data frame DATAFRAME SiteEdaphics1; DATA: (*-------------------------------------*) Site Bucketsize SiteID; (*-------------------------------------*) "Bern" 30.0 333333 ; "Bever S" 20.0 222222 ; "Davos" 25.0 666666 ; END SiteEdaphics1; allows to retrieve all bucket sizes by code similar to this VAR bs: ARRAY [0..maxNoSites] OF LONGREAL; ... GetRVec("SiteEdaphics1.Bucketsize",bs,n); IF n<=maxNoSites THEN (* use bs *) ELSE Errors.Info(... END(*IF*); Similarily the corresponding site identifiers could be retrieved by VAR ss: ARRAY [0..maxNoSites] OF ALPHA; ... GetSVec("SiteEdaphics1.Site",ss,n); In case of scalar retrieval (a keyColumn is in use), the remainder parts of the identifier is built as follows: Again first part is followed by a '.', but then comes the string found in the particular row of the column headed by keyColumn, another '.', plus the identifier of the column. E.g. the following data frame lists in the key column Site strings; each of these can be used to denote a particular row. DATAFRAME SiteEdaphics2; KEYCOLUMN = Site; DATA: (*-------------------------------------*) Site Bucketsize SiteID; (*-------------------------------------*) "Basel" 30.0 111111 ; "Bern" 30.0 333333 ; "Bever S" 20.0 222222 ; "St. Gotthard" 20.0 444444 ; "Bever_N" 15.0 555555 ; "Davos" 25.0 666666 ; "Bern" 25.0 333333 ; END SiteEdaphics2; Above data frame allows to retrieve individual bucket sizes (Bucketsize) or site identification numbers (SiteID) as follows myBucketSize := VR("SiteEdaphics2.Basel.Bucketsize"); theSiteID := VI("SiteEdaphics2.Davos.SiteID"); or GetSVec("SiteEdaphics2.Site",ss,n); IF n<=maxNoSites THEN FOR i := 0 TO n-1 DO ConcatIdent("SiteEdaphics2",ss[i],"Bucketsize",id); myBucketSize[i] := VR(id); END(*FOR*); ELSE Errors.Info(... END(*IF*); To allow for model specific retrieval above header of the data frame would need the additional MODEL clause similar to this: DATAFRAME SiteEdaphics2; KEYCOLUMN = Site; MODEL = ForClimS; DATA: (*-------------------------------------*) Site Bucketsize SiteID; (*-------------------------------------*) "Basel" 30.0 111111 ; ... ... allowing for retrievals similar to this myBucketSize := VR("ForClimS.Basel.Bucketsize"); above retrieval would also work if the headers of data frame SiteEdaphics2 would contain the reserved word ANY or ALL, i.e. "MODEL = ANY" or "MODEL = ALL". Uniqueness of Value Definitions ------------------------------- ASSERTION: Since the reading process of data frames unifies any value definitions referring to the same value, the access of a particular value definition via its identifier ident and its expected type is unique at all times. Note, in the above example for data frame SiteEdaphics2, which contains 2 entries for "Bern", above assertion becomes relevant. Module DataFrames always unifies all data definitions encountered by treating any conflicts like redifinitions and overwriting a particular value definition with the last found during reading, i.e. VR("SiteEdaphics2.Bern.Bucketsize") returns in the above example 25.0 and not 30.0. To learn about the fact of multiple definitions use procedure DatFraAux.InspectValDef (returned value of nrOfDefs). Note also, in case the same identifier denotes value definitions of different types, they can still be retrieved in a unique way. E.g. DATAFRAME SiteEdaphics3; MODEL = ForClim; KEYCOLUMN = Site; DATA: (*-------------------------------------*) Site Bucketsize ; (*-------------------------------------*) "Bern" 30.0 ; "Bever S" 20.0 ; "Davos" 25.0 ; END SiteEdaphics3; DATAFRAME SiteEdaphics4; MODEL = ForClim; KEYCOLUMN = Site; DATA: (*-------------------------------------*) Site Bucketsize ; (*-------------------------------------*) "Bern" high ; "Bever S" low ; "Davos" medium ; END SiteEdaphics4; In above example you can either call VR("ForClim.Bern.Bucketsize") or you can call GetVS("ForClim.Bern.Bucketsize", bucketSizeCode) successfully. The result is different and lets you distinguish between the two types of value definitions. However, within a particular type, the identifier denotes uniquely a single value definition without ambiguity. Important note: Be aware that for the sake of simplicity, the following types (see ValDefType) are merged and columns of one of these types is treated as the same: integer with longint, singlereal with doublereal, This means that a real value definition for a given identifier is uniqe within all present real types, i.e. regardless wether a particular column is of type singlereal or doublereal etc. Note also this means that you can retrieve the very same real value either with VR or with VLR. If the actual value is too large to fit into a single real (type REAL), the difference is that you get an overflow by using VR. This means, that VR vs. VLR, or VI vs. VLI etc. differ only in their memory requirements for storing the returned values, but not which original value definition they are referencing in the data base. IMPLEMENTATION RESTRICTION: A column which contains only NA is of no defined type. Thus any retrieval of values from such a column would always only return undefined values. Therefore, for the sake of simplicity and efficiency such columns will be completely ignored and attempts to retrieve data from them will be treated by this model as if the column would be completely absent. Finally note that certain priority rules are observed during data retrieval to allow for easier assignment of data to models in the context of modeling and simulation (clauses "MODEL = ANY" and "MODEL = ALL"). See below type definition DataFrameSet and routine LastValDefFoundIn. Vector dimension (in case of vector retrieval) ---------------- In case of vector retrieval n returns the dimension of the actual vector as stored in the data frame where ident was found. Usually n is the number of elements actually assigned to the returned vector variable like rvect, ivect etc. Note however, if the callee passes a vector of insufficient length, only HIGH(vect) values will be assigned and no error message is produced (incomplete retrieval). Only n will return the actual, larger number of elements contained in the data frame. To learn about the actual dimension of the vector data use routine GetVectDim in order to proved a sufficiently large object for data retrieval. If the value definition couldn't be found, n is 0. Type of Value Definitions ------------------------- Note, a data frame defines the type of values hold in a particular column by implicit definition. The first row tabulated defines the type of the data expected in the entire table (for details see modules DatFraAux and DataTables, especially the latter which is used by DataFrames to load data frames). Any subsequent row, which contains an entry in a particular column which does not match the expected type, will be treated as an erronous condition. Such errors will be normally reported during any calls to LoadDataFrames or ReadDataFramesIntoMemory and no such faulty data frame should ever be available for retrieval of value definitions. If the type of the value definition does not match the expected one as given by the routine like VR or GetRVect, one of the read only variables undefInteger, undefReal, or undefBoolean are returned instead; for the string or identifier types, the emtpy string is returned. The same is the case for missing values. The latter are entered in a data frame by using the symbol NA instead of the value. IMPLEMENTATION RESTRICTIONS: To allow for an unambiguous detection of a missing value definition (symbol NA) in case VI, VB, GetVS, or GetVId (GetIVec, GetBVec, GetSVec, GetIdVec, GetSS, or GetIds respectively) return undefInteger or undefBoolean, inspect n or use also GetValDefType to learn whether the value definition is really undefined or the data frame happens to contain exactly that value used by undefInteger or undefReal. A column which contains only NA is of no defined type. Thus any retrieval of values from such a column would always only return undefined values. Therefore, for the sake of simplicity and efficiency such columns will be completely ignored and attempts to retrieve data from them will be treated by this model as if the column would be completely absent. GetVS vs. GetVId ---------------- GetVS can retrieve only a string, whereas GetVId can retrieve only an identifier. GetSs/GetIds vs. GetSVec/GetIdVec --------------------------------- GetSs (or GetIds) returns a vector of strings (or identifiers), but in contrast to GetSVec (or GetIdVec) not as an open array, but as a single large string, where the elements of the vector are delimited by the separator sepChar, typically "|". Use DMStrings.ExtractSubString to access invidual elements with the following algorithm GetSs(ident,"|",ssvec,n); ix := 0; count := 0; ExtractSubString(ix,ssvec,element,"|"); WHILE (element[0]<>0C) DO INC(count); (* process element count *) ExtractSubString(ix,ssvec,element,"|"); END(*WHILE*); IF count<>n THEN Errors.Info(... END(*IF*); *) (*************************************************************************) (*##### Construction & Analysis of Value Definition Identifiers #####*) (*************************************************************************) PROCEDURE ConcatIdent( dtfParModId, (* data frame (no key column) or parent or model identifier (with key column) *) rowId, (* = empty string if vector retrieval *) colId: ARRAY OF CHAR; VAR ident: ARRAY OF CHAR); PROCEDURE SplitIdent ( ident: ARRAY OF CHAR; VAR dtfParModId, (* data frame (no key column) or parent or model identifier (with key column) *) rowId, (* = empty string if vector retrieval *) colId: ARRAY OF CHAR); (* Concatenates, respectively splits, ident from/to its parts for subsequent uses while calling routines retrieving value definitions. *) PROCEDURE FixIdent(VAR ident: ARRAY OF CHAR); (* Utility to fix the identifier 'ident', which has recently been used to retrieve a value definition according to the result by LastValDefFoundIn() (from this module). Upon returning from FixIdent the first part of the ident may contain ALL or ANY, respectively, instead of the original model name, depending in which set of data frames the value definition was actually retrieved. If the value definition could not be retrieved, the first part contains "-" instead of the model name. *) (********************************************) (*##### Type of A Value Definition #####*) (********************************************) TYPE ValDefTypes = SET OF ValDefType; (* Since value definitions referred to by the same identifier can have different types, a particular identifier can be used to retrieve not just a single, but a set of value definition types (see also above comments on topic "Uniqueness of Value Definitions" and routine GetValDefType). *) PROCEDURE GetValDefType(ident: ARRAY OF CHAR; VAR fdt: ValDefTypes); (* Allows to learn about the type(s) of the value definitions, without actually having to retrieve the data. Returns the empty set if the value definition is currently not stored in memory. *) (***********************************************************) (*##### Origin of Last Retrieved Value Definition #####*) (***********************************************************) TYPE DataFrameSet = (nowhere, inALL, inSpecific, inANY); (* Each particular value definitions can be retrieved from 3 basic sets of data frames. The first set (inALL) is the set of all data frames with the phrase "MODEL = ALL" respectively "PARENT = ALL". The second set (inSpecific) is the set of all data frames with the phrase "MODEL =" respectively "PARENT = PROCEDURE LastValDefFoundIn(): DataFrameSet; (* Immediately after retrieving a value definition use this routine to learn about the set of data frames from which the value definition has been retrieved. See also explanations on TYPE DataFrameSet (No implementation restrictions). *) PROCEDURE GetLastValDefsDFIdent (VAR lastDataFrameIdent: ARRAY OF CHAR); (* Immediately after retrieving a value definition use this routine to learn about the identifier of the data frame from which the value definition has been retrieved. IMPLEMENTATION RESTRICTION: Works only if the data frame has not been dropped in the meantime. *) (******************************************) (*##### Special Vector Retrieval #####*) (******************************************) PROCEDURE GetVectDim (ident: ARRAY OF CHAR; ofType: ValDefType; VAR n: INTEGER); (* Allows to learn about the dimension n of the vector denoted by identifier ident and of type ofType in the context of vector retrieval without actually having to retrieve the data. Returns n = 0 if value definition is currently not present. *) PROCEDURE GetValDefsKeyColumn (ident: ARRAY OF CHAR; ofType: ValDefType; sepChar: CHAR; VAR keys: ARRAY OF CHAR; VAR n: INTEGER); (* Allows to retrieve the key column of the data frame in which the value definition denoted by identifier ident and of type ofType is contained. n returns the dimension of the key column vector (regardless wether all its content fits within keys or not). Actual keys are delimited by character sepChar, typically "|". See also below routine GetValDefsSubKeyColumn. *) PROCEDURE GetValDefsSubKeyColumn (ident: ARRAY OF CHAR; ofType: ValDefType; subKeyIdent: ARRAY OF CHAR; sepChar: CHAR; VAR subKeys: ARRAY OF CHAR; VAR n: INTEGER); (* Allows to retrieve the column with identifier subKeyIdent of the data frame in which the value definition denoted by identifier ident and of type ofType is contained. The column must be of type integer, long integer, string, or identifier, or the procedure will return nothing (n=0, subKeys empty). Use for subKeys not an open array, but a single large string, where the elements of the vector are delimited by the separator sepChar, typically "|". Consequently, in case of an integer column, you have to convert the strings first to integer values. The ideas of subkeys is simply that in a data frame some columns beside the key column may serve to establish a particular order among the rows within a data frame. Typically you can use such a column to store indices which help to further specify the main key to denote particular rows when multiple value definitions are present in data frames. By means of vector retrieval it is then possible to retrieve more than just the last value definition by searching within the vector denoted by ident in function of the independent main key and sub keys of any number. E.g.: DATAFRAME MonitoringPart1; MODEL = ANY; KEYCOLUMN = Ident; DATA: (*------------------------------------------------------------*) Ident MonitLev Filing Tabulation ; (*------------------------------------------------------------*) grass 3 FALSE FALSE ; grass 4 FALSE FALSE ; grassDot 3 TRUE FALSE ; grassDot 4 TRUE FALSE ; (*------------------------------------------------------------*) END MonitoringPart1; DATAFRAME MonitoringPart2; MODEL = ANY; KEYCOLUMN = Ident; DATA: (*------------------------------------------------------------*) Ident MonitLev ScaleMin ScaleMax Graphing ; (*------------------------------------------------------------*) grass 3 0.0 1000.0 Y ; grass 4 0.0 700.0 X ; grassDot 3 NA NA noG ; grassDot 4 0.0 500.0 Y ; (*------------------------------------------------------------*) END MonitoringPart2; The two data frames define for the same keys as given by the identifier listed in the key column 'Ident' for the two monitoring levels 3 and 4 in the first data frame the filing and tabulation attributes, in the second data frame the graphing attributes. Column 'MonitLev' serves here as a subkey column, since each variable identifier, e.g. grass, has two value definitions tabulated, one for monitoring level 3 and the other for level 4. Of course only vector retrieval can retrieve these multiple definitions, yet the tabulated data can still be interpreted fully, since the order as given by the main key and the subkeys (redundantly present in both data frames) can still be retrieved and used by calling routines GetValDefsKeyColumn and GetValDefsSubKeyColumn like this: GetValDefsKeyColumn("MyModel.Tabulation","|",idents,n); GetValDefsSubKeyColumn("MyModel.Tabulation",boolean, "MonitLev", monLevsArr, n); will return: idents = "grass|grass|grassDot|grassDot" monLevsArr = "3|4|3|4" n = 4 Note in above example the calls GetValDefsKeyColumn("MyModel.Graphing","|",idents,n); GetValDefsSubKeyColumn("MyModel.Graphing",identifier, "MonitLev", monLevsArrStr, n); retrieve the same information as GetSs("MyModel.Ident","|",idents,n); GetIVec("MyModel.MonitLev",monLevsArrInt,n); which is not necessary the same as what GetValDefsKeyColumn("MyModel.Tabulation","... might retrieve, in case the key columns or sub key columns should differ in content from on to the other data frame. The actual content of the key column and possible subkey columns and wether they really correctly define the same ordering among two data frames is fully the users responsibility. Same order means for instance that individual components from retrieved vectors can be paired, e.g. the vectors 'MyModel.Tabulation' with 'MyModel.Graphing'. An analysis of the order defining vectors can reveal the validity of the assumption before actually using any retrieved data. *) (*******************************************************) (*##### Dropping Data Frames / Freeing Memory #####*) (*******************************************************) PROCEDURE DropAllDataFrames; (* Removes all data frames from memory and releases the memory for other uses. Note, if you call several times LoadDataFrames respectively ReadDataFramesIntoMemory without calling DropAllDataFrames inbetween, that data frames tend to accumulate in memory, using up more and more memory. This is of course not true, if you reread the same data frame files starting with the same anchor (and the same reading filers), because data frames with an identical identifier will be overwritten with the latest data found in the scanned data frame files, hereby reusing the memory space efficiently. Note also, this means that you ought to call procedure DropAllDataFrames in case you want to make sure that older value definitions contained in any previously read data frames are forgotten. In case you wish to forget only about individual data frames, see the routines provided by module DatFraAux, which support versatile plus efficient rereading and management of data frames. All currently installed handlers for a dropping event (see module DatFraAux, routines AddAnyDFChangedHandler and AddDFDropHandler) will also be called. *) END DataFrames." or without that phrase but with the data frame identifier . E.g. the value definition given by "MyModel.Bern.Bucketsize" defines as the set inSpecific all data frames with the phrase "MODEL = MyModel", respectively "PARENT = MyModel" plus the single data frame with the name "MyModel" but without the phrase "MODEL = ...". The third set (inANY) is the set of all data frames with the phrase "MODEL = ANY" respectively "PARENT = ANY". While retrieving data the following, fundamental priority rules apply: 1) A value definition present in the set inALL, is retrieved with the highest priority. The first part of the identifier is ignored but the remainder parts have to match exactly the value definition. E.g. the value definition given by "MyModel.Bern.Bucketsize" is retrieved from the set inALL as soon it contains a data frame where the key column contains in a row "Bern" and another column is headed by "Bucketsize". The value definition actually used is "ALL.Bern.Bucketsize". This is the case regardless whether the value definition is also present in a present data frame using "MODEL = MyModel". 2) A value definition present in the set inSpecific, is retrieved when it can't be retrieved from the set inALL. Then the entire identifier denoting the value definition must match; in particular its first part must match exactly used in phrase "MODEL = ", "PARENT = ", or in "DATAFRAME ". E.g. the value definition given by "MyModel.Bern.Bucketsize" is retrieved from the set inSpecific as soon as this set contains a data frame where the key column contains in a row "Bern" and another column is headed by "Bucketsize". The value definition actually used is exactly "MyModel.Bern.Bucketsize". This is the case regardless whether the value definition is also contained in a data frame present using "MODEL = ANY". 3) A value definition present in the set inANY, is retrieved when it can't be retrieved from the set inALL, nor the set inSpecific. Then again as with the set inALL, the first part of the identifier is ignored and only the remainder parts have to match the value definition. E.g. the value definition given by "MyModel.Bern.Bucketsize" is retrieved from the set inANY as soon as this set contains a data frame where the key column contains in a row "Bern" and another column is headed by "Bucketsize". The value definition actually used is "ANY.Bern.Bucketsize". A value definition not present in any of the three sets, is not defined and will return values undefined. To learn about the origin of a particular, successfully retrieved value definition, use procedure LastValDefFoundIn. In the context of modeling and simulation above sets serve particular purposes: The set inALL is particularily useful to define common parameters which are used by several submodels in a structured, complex model system. The set inSpecific of course, does exactly the opposite, i.e. it defines parameters which are specific to individual submodels. The set inANY is useful to define particular model parameters without having already a specific model in mind. Ex.: (*===========================================================*) (* Model Parameters *) (*===========================================================*) DATAFRAME ModelParameters; REMARK = 'For any logistic growth model'; MODEL = ANY; KEYCOLUMN = Ident; DATA: (*===========================================================*) Ident Descr val min max unit ; (*-----------------------------------------------------------*) r 'Relative growth rate' 0.7 0.0 10.0 'd^-1' ; K 'Carrying capacity' 700.0 0.0 1.0E+38 'g/m^2' ; (*-----------------------------------------------------------*) END ModelParameters; (*===========================================================*) Using module ModDatAccess it is easy possible to have the tabulated value definitions be assigned to any logistic growth model, regardless of its identifier. I.e. retrieving the values VR("MyModel.r.val"), GetVS("MyModel.r.Descr",descr), or VR("MyModel.r.min") etc. will always be successful thanks to the fact the data frame ModelParameters is of model ANY. However, if the data frame would be DATAFRAME ModelParameters OF MODEL Logistic retrieving data by VR("MyModel.r.val") would not succeed (For explanations on the procedures VR, GetVS etc. see their comments). *)
|
||
|
|
|