Developer Notes for the XPATH-lite API

This API is built on top of the more fundamental SAX one. The basic
idea is to have a set of programmable handlers that communicate among
themselves and with the parser via module variables.

A new pseudo-handler, "signal_handler", has been added to the optional
argument list of xml_parse(). In the current implementation it just
checks whether a stop signal has been raised by the user side of the
program (for example, after a "begin tag" event, or after going out
of the scope of a parent element).

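The interplay between handlers, module variables and the stop signal
can be sketched as follows. This is only a toy illustration: the
module name, the "stop_requested" flag, "run_events" and the interface
of "signal_handler" shown here are assumptions made for the example,
not the library's actual code. In the real parser the handlers are
passed to xml_parse() as optional arguments rather than called
directly.

```fortran
! Hypothetical sketch: none of these names are taken from the library.
module m_signal_sketch
  implicit none
  private
  public :: begin_handler, signal_handler, run_events

  ! Module variable shared by the handlers and the driver loop.
  logical, save :: stop_requested = .false.

contains

  ! User-side handler: raises the stop signal at the first "price" tag.
  subroutine begin_handler(name)
    character(len=*), intent(in) :: name
    if (name == "price") stop_requested = .true.
  end subroutine begin_handler

  ! Pseudo-handler polled by the driver after each event.
  subroutine signal_handler(code)
    logical, intent(out) :: code
    code = stop_requested
  end subroutine signal_handler

  ! Toy stand-in for the parser loop: one "begin tag" event per name.
  subroutine run_events(names, nprocessed)
    character(len=*), dimension(:), intent(in) :: names
    integer, intent(out) :: nprocessed
    logical :: stop_now
    integer :: i

    stop_requested = .false.
    nprocessed = 0
    do i = 1, size(names)
       call begin_handler(trim(names(i)))
       nprocessed = nprocessed + 1
       call signal_handler(stop_now)
       if (stop_now) exit
    end do
  end subroutine run_events

end module m_signal_sketch

program signal_demo
  use m_signal_sketch
  implicit none
  character(len=10), dimension(4) :: names
  integer :: n

  names(1) = "inventory"
  names(2) = "item"
  names(3) = "price"
  names(4) = "item"
  call run_events(names, n)
  write(unit=*, fmt="(a,i2)") "events processed before stop: ", n
end program signal_demo
```

In this sketch the loop ends after the third event, because the
"price" handler sets the flag and the driver sees it on its next poll.
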
Since the SAX parser is stream-oriented, and XPATH searches can be
done in any order, there are new routines to rewind the XML file and
to synchronize the physical file reader with a previously saved point
in the XML tree.

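A minimal sketch of the rewind-and-resynchronize idea on a sequential
file is given below. It assumes that a saved point can be represented
by a record count, which is a simplification chosen for the example:
the library presumably also has to restore the position inside a
record and the state of the parser. The scratch file and all names are
hypothetical.

```fortran
! Hypothetical sketch: a saved point is just a count of records read.
program rewind_sync_demo
  implicit none
  integer, parameter :: lun = 17
  character(len=40) :: line
  integer :: nread, saved, i, ios

  ! Build a small scratch file standing in for the XML document.
  open(unit=lun, status="scratch", form="formatted", &
       action="readwrite", iostat=ios)
  if (ios /= 0) stop
  write(unit=lun, fmt="(a)") "<inventory>"
  write(unit=lun, fmt="(a)") "  <item>first</item>"
  write(unit=lun, fmt="(a)") "  <item>second</item>"
  write(unit=lun, fmt="(a)") "</inventory>"

  ! Start from the top and read two records, then save the position.
  rewind(unit=lun)
  nread = 0
  do i = 1, 2
     read(unit=lun, fmt="(a)") line
     nread = nread + 1
  end do
  saved = nread

  ! Keep reading to the end of the file, as a later search might do.
  do
     read(unit=lun, fmt="(a)", iostat=ios) line
     if (ios /= 0) exit
     nread = nread + 1
  end do

  ! "Sync": rewind and skip records until the saved point is reached.
  rewind(unit=lun)
  do i = 1, saved
     read(unit=lun, fmt="(a)") line
  end do
  nread = saved

  ! The next read resumes right after the saved point.
  read(unit=lun, fmt="(a)") line
  write(unit=*, fmt="(a)") trim(line)
  close(unit=lun)
end program rewind_sync_demo
```

The record printed at the end is the third one, that is, the first
record after the saved point.
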
These tools, and the rather sophisticated path-matching routine
provided (which allows for wildcards and the "//" construction), would
be enough for standard use, as implemented in the routine
"get_node". However, there is also the possibility of performing
searches constrained to a given ancestor element ("context" searches),
so that blocks of logically related information can be processed
together. A context is implicitly created by a call to
"mark_node". Contexts can be saved and "synched" to, allowing for
repeated constrained searches (calls with relative paths). Contexts
can even be passed to subroutines to package the parsing of common
elements once and for all. (See, for example, Examples/xpath/pseudo.f90.)
This feature is nevertheless in need of a more rigorous specification:

* What should be the behavior if a call to "mark_node" is followed by
  a call to "get_node" with an absolute path?
* Should there always be an "automatic rewind" to the beginning of the
  context before any successive calls to "get_node"?


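To make the path-matching discussion above concrete, here is a small
self-contained matcher that accepts "*" as a one-element wildcard and
"//" as "any number of intermediate elements". It is a sketch written
for these notes, not the routine used by the library, and the way a
relative path is resolved against a context (by prefixing the context
path) is an assumption. All names are hypothetical.

```fortran
! Hypothetical sketch: not the library's path-matching routine.
module m_path_sketch
  implicit none
  private
  public :: path_matches

contains

  ! True if the whole of "path" (e.g. "/a/b/c") matches pattern "pat".
  recursive function path_matches(pat, path) result(ok)
    character(len=*), intent(in) :: pat, path
    logical :: ok
    integer :: i, pe, se

    ok = .false.
    if (len(pat) == 0) then
       ok = (len(path) == 0)
       return
    end if

    ! "//": the rest of the pattern may start at any depth below here.
    if (len(pat) >= 2) then
       if (pat(1:2) == "//") then
          do i = 1, len(path)
             if (path(i:i) == "/") then
                ok = path_matches(pat(2:), path(i:))
                if (ok) return
             end if
          end do
          return
       end if
    end if

    if (len(path) == 0) return
    if (pat(1:1) /= "/") return
    if (path(1:1) /= "/") return

    ! Compare the leading element names, then recurse on the remainders.
    pe = next_slash(pat)
    se = next_slash(path)
    if (pat(2:pe-1) == "*" .or. pat(2:pe-1) == path(2:se-1)) then
       ok = path_matches(pat(pe:), path(se:))
    end if
  end function path_matches

  ! Position of the second "/" in s, or len(s)+1 if there is none.
  function next_slash(s) result(pos)
    character(len=*), intent(in) :: s
    integer :: pos
    pos = index(s(2:), "/")
    if (pos == 0) then
       pos = len(s) + 1
    else
       pos = pos + 1
    end if
  end function next_slash

end module m_path_sketch

program path_demo
  use m_path_sketch
  implicit none
  character(len=*), parameter :: node = "/inventory/item/price"
  character(len=*), parameter :: context = "/inventory/item"

  write(unit=*, fmt="(a,l2)") "absolute path :", &
       path_matches("/inventory/item/price", node)
  write(unit=*, fmt="(a,l2)") "// construct  :", &
       path_matches("//price", node)
  write(unit=*, fmt="(a,l2)") "wildcard      :", &
       path_matches("/inventory/*/price", node)
  write(unit=*, fmt="(a,l2)") "wrong depth   :", &
       path_matches("/inventory/price", node)
  ! Relative path "price", resolved against the context (assumption).
  write(unit=*, fmt="(a,l2)") "in context    :", &
       path_matches(context // "/" // "price", node)
end program path_demo
```

Under these assumptions a context search is just an absolute search
with the context path as a fixed prefix, which is one possible reading
of the first open question in the list above.
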
LIMITATIONS

The pcdata buffer provided by the user as a character variable could
overflow. Note that the parser itself uses a string of length
MAX_PCDATA_SIZE (currently 65536) as a buffer to hold PCDATA. A
warning is issued if there is not enough space (in the user or in the
system buffer) to hold the data.

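The kind of check behind such a warning can be sketched as follows.
Fortran character assignment truncates silently, so the lengths have
to be compared explicitly. The routine "copy_pcdata" and its interface
are hypothetical and only illustrate the idea; they are not part of
the library.

```fortran
! Hypothetical sketch: not the library's buffer-handling code.
module m_pcdata_sketch
  implicit none
  private
  public :: copy_pcdata

contains

  ! Copy src into the caller's buffer and report whether text was lost.
  subroutine copy_pcdata(src, dest, truncated)
    character(len=*), intent(in)  :: src
    character(len=*), intent(out) :: dest
    logical, intent(out)          :: truncated

    truncated = (len_trim(src) > len(dest))
    dest = src          ! plain assignment cuts or blank-pads as needed
  end subroutine copy_pcdata

end module m_pcdata_sketch

program pcdata_demo
  use m_pcdata_sketch
  implicit none
  character(len=8)  :: small
  character(len=40) :: large
  logical :: cut

  call copy_pcdata("1.0 2.0 3.0 4.0", small, cut)
  if (cut) then
     write(unit=*, fmt="(a)") "warning: pcdata truncated to '" // small // "'"
  end if

  call copy_pcdata("1.0 2.0 3.0 4.0", large, cut)
  if (.not. cut) then
     write(unit=*, fmt="(a)") "ok: '" // trim(large) // "'"
  end if
end program pcdata_demo
```

On the user side, the practical remedy is to declare the pcdata
variable long enough for the largest element expected.
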
Support for converting PCDATA characters to numerical arrays "on the
fly" is planned for a forthcoming version.


The coding style is that of the F subset of Fortran90. I strongly
believe that it makes for better coding and fewer errors.
Go to http://www.fortran.com/imagine1/ and get a feel for it. You can
download free implementations for Linux and Windows, or get an
inexpensive CD+Book combination to help support the project. Of course,
F *is* Fortran, so you can always compile it with a Fortran compiler.