Developer Notes for the XPATH-lite API

This API is built on top of the more fundamental SAX one. The basic
idea is to have a set of programmable handlers that communicate among
themselves and with the parser via module variables.

A new pseudo-handler, "signal_handler", has been added to the optional
argument list of xml_parse(). In the current implementation it just
checks whether a stop signal has been raised by the user side of the
program (for example, after a "begin tag" event, or after going out
of the scope of a parent element).

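The stop-signal mechanism described above can be pictured with a small
toy program. This is only a sketch: the module m_stop_flag, the routine
raise_stop, and the loop standing in for the parser are hypothetical and
are not part of the library. The one-argument logical interface given to
signal_handler here is an assumption, not a statement of the real one.

```fortran
! Toy illustration only: none of this is library code.
module m_stop_flag
   implicit none
   private
   public :: raise_stop, signal_handler

   ! Module variable shared by the "user side" and the "parser side"
   logical, save :: stop_requested = .false.

contains

   ! Called by user code (e.g. from an event handler) to request a stop
   subroutine raise_stop()
      stop_requested = .true.
   end subroutine raise_stop

   ! Assumed shape of the pseudo-handler: it just reports the flag
   subroutine signal_handler(code)
      logical, intent(out) :: code
      code = stop_requested
   end subroutine signal_handler

end module m_stop_flag

program demo_signal
   use m_stop_flag
   implicit none
   character(len=5), dimension(4), parameter :: tags = &
        (/ "root ", "item ", "price", "note " /)
   logical :: must_stop
   integer :: i

   do i = 1, size(tags)
      ! Stand-in for a user handler reacting to a "begin tag" event
      if (trim(tags(i)) == "price") call raise_stop()
      ! Stand-in for the parser polling the pseudo-handler after the event
      call signal_handler(must_stop)
      if (must_stop) then
         print *, "Stopped after element: ", trim(tags(i))
         exit
      end if
   end do
end program demo_signal
```

The loop stops at "price" and never reaches "note", which is the
behavior one would want from a search that has found its target.
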
Since the SAX parser is stream-oriented, and XPATH searches can be
done in any order, there are new routines to rewind the XML file and
to synchronize the physical file reader with a previously saved point
in the XML tree.

These tools, and the rather sophisticated path-matching routine
provided (which allows for wildcards and the "//" construction), would
be enough for standard use, as implemented in the routine
"get_node". However, there is also the possibility of performing
searches constrained to a given ancestor element ("context" searches)
so that blocks of logically related information can be processed
together. A context is implicitly created by a call to
"mark_node". Contexts can be saved and "synched" to, allowing for
repeated constrained searches (calls with relative paths). Contexts
can even be passed to subroutines to package the parsing of common
elements once and for all (see, for example, Examples/xpath/pseudo.f90).
This feature is nevertheless in need of a more rigorous specification:

* What should be the behavior if a "mark_node" is followed by a
  call to "get_node" with an absolute path?
* Should there always be an "automatic rewind" to the beginning of the
  context before any successive calls to "get_node"?

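A context search of the kind described above might look roughly like
the following. This is a sketch under assumptions, not a reference
usage: it needs the library to compile, the file name and element names
("inventory.xml", "item", "price", "description") are hypothetical, and
the module name flib_xpath and the argument lists of open_xmlfile,
mark_node, get_node and sync_xmlfile should be checked against the
sources and Examples/xpath before relying on them.

```fortran
! Sketch only: requires the library; names of the file and elements are
! hypothetical, and the argument lists are assumptions to be verified.
program context_sketch
   use flib_xpath
   implicit none

   type(xml_t)        :: fxml, context
   integer            :: status
   character(len=100) :: description, price

   call open_xmlfile("inventory.xml", fxml, status)
   if (status /= 0) stop "Cannot open file"

   do
      ! Each successful call implicitly creates a context at an <item>
      call mark_node(fxml, path="//item", status=status)
      if (status < 0) exit          ! assumed: negative means no more items
      context = fxml                ! save the context

      price = ""
      description = ""

      ! Relative path: search constrained to the current <item>
      call get_node(fxml, path="price", pcdata=price, status=status)

      ! Explicitly go back to the start of the saved context
      fxml = context
      call sync_xmlfile(fxml, status)

      call get_node(fxml, path="description", pcdata=description, &
                    status=status)
      print *, trim(description), ": ", trim(price)
   end do
end program context_sketch
```

The explicit "fxml = context" plus sync step sidesteps the second open
question: nothing here relies on an automatic rewind between successive
relative searches.
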
LIMITATIONS

The pcdata buffer provided by the user as a character variable could
overflow. Note that the parser itself uses a string of length
MAX_PCDATA_SIZE (currently 65536) as a buffer to hold PCDATA. A
warning is issued if there is not enough space (in either the user
buffer or the system buffer) to hold the data.

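The overflow condition can be illustrated with a self-contained toy.
The routine append_chunk and everything around it are hypothetical; the
notes above do not say how the parser detects the problem, so this only
shows one plausible check (compare the remaining room with the incoming
chunk, truncate, and flag), with a deliberately tiny buffer.

```fortran
! Toy illustration only: append_chunk and its names are hypothetical.
module m_pcdata_demo
   implicit none
   private
   public :: append_chunk

contains

   ! Append chunk after buffer(1:used); truncate and flag if it does not fit
   subroutine append_chunk(buffer, used, chunk, overflow)
      character(len=*), intent(inout) :: buffer
      integer, intent(inout)          :: used
      character(len=*), intent(in)    :: chunk
      logical, intent(out)            :: overflow
      integer :: room, n

      room = len(buffer) - used
      n = min(len(chunk), room)
      overflow = (len(chunk) > room)
      if (n > 0) buffer(used+1:used+n) = chunk(1:n)
      used = used + n
   end subroutine append_chunk

end module m_pcdata_demo

program demo_overflow
   use m_pcdata_demo
   implicit none
   character(len=8) :: pcdata     ! deliberately small user buffer
   integer :: used
   logical :: overflow

   pcdata = ""
   used = 0
   call append_chunk(pcdata, used, "12345", overflow)   ! fits
   call append_chunk(pcdata, used, "67890", overflow)   ! only 3 chars fit
   if (overflow) print *, "WARNING: pcdata buffer too small, data truncated"
   print *, "Stored: ", pcdata(1:used)
end program demo_overflow
```

On the user side, the practical consequence is to declare the pcdata
character variable with a length comfortably larger than the longest
element content expected.
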
Support for converting PCDATA characters to numerical arrays "on the
fly" is planned for a forthcoming version.


The coding style is that of the F subset of Fortran 90. I strongly
believe that it makes for better coding and fewer errors.
Go to http://www.fortran.com/imagine1/ and get a feel for it. You can
download free implementations for Linux and Windows, or get an
inexpensive CD+Book combination to help support the project. Of course,
F *is* Fortran, so you can always compile it with a Fortran compiler.