Outdated egg!
This is an egg for CHICKEN 4, the unsupported old release. You're almost certainly looking for the CHICKEN 5 version of this egg, if it exists.
If it does not exist, there may be equivalent functionality provided by another egg; have a look at the egg index. Otherwise, please consider porting this egg to the current version of CHICKEN.
libxml2
Libxml2 is an XML C parser and toolkit with DOM, SAX and text-reader APIs.
TOC »
- Outdated egg!
- libxml2
- LibXML2
- Author
- Upstream
- Egg Source Code
- Miscellaneous
- DOM Parser
- Example
- Node Types
- dom:element-node
- dom:attribute-node
- dom:text-node
- dom:cdata_section_node
- dom:entity-ref-node
- dom:entity-node
- dom:pi-node
- dom:comment-node
- dom:document-node
- dom:document-type-node
- dom:document-frag-node
- dom:notation-node
- dom:html-document-node
- dom:dtd-node
- dom:element-decl
- dom:attribute-decl
- dom:entity-decl
- dom:namespace-decl
- dom:xinclude-start
- dom:xinclude-end
- API
- dom:is-element-node?
- dom:is-text-node?
- dom:is-attribute-node?
- dom:parse-string
- dom:parse-string-default
- dom:cleanup-parser
- dom:parse-file
- dom:free-doc
- dom:make-parser-context
- dom:read-file-with-context
- dom:is-valid?
- dom:free-parser-context
- dom:to-string
- dom:next-node
- dom:node-content
- dom:node-children
- dom:node-name
- dom:is-element-name?
- dom:get-attribute
- dom:attributes
- SAX Parser
- Text Reader Parser
- Example
- Node Types
- text-reader:none
- text-reader:element
- text-reader:attribute
- text-reader:text
- text-reader:cdata
- text-reader:entity-reference
- text-reader:entity
- text-reader:processing-instruction
- text-reader:comment
- text-reader:document
- text-reader:document-type
- text-reader:document-fragmenta
- text-reader:notation
- text-reader:whitespace
- text-reader:significant-whitespace
- text-reader:end-element
- text-reader:end-entity
- text-reader:xml-declaration
- API
- text-reader:element-to-string
- text-reader:end-element-is?
- text-reader:start-element-is?
- text-reader:end-element-node?
- text-reader:text-node?
- text-reader:element-node?
- text-reader:make
- text-reader:read-more
- text-reader:free
- text-reader:node-type
- text-reader:empty-element?
- text-reader:move-to-attribute
- text-reader:all-attributes
- text-reader:move-to-next-attribute
- text-reader:move-to-first-attribute
- text-reader:move-to-element
- text-reader:next
- text-reader:next-sibling
- text-reader:name
- text-reader:value
- About this egg
LibXML2
Libxml2 is the XML C parser and toolkit developed for the GNOME project (but usable outside of the GNOME platform). It is free software available under the MIT License. XML itself is a metalanguage for designing markup languages, i.e. text languages where semantics and structure are added to the content using extra 'markup' information enclosed between angle brackets. HTML is the most well-known markup language. Though the library is written in C, a variety of language bindings make it available in other environments.
Author
David Ireland (djireland79 at gmail dot com)
Upstream
Egg Source Code
https://gitlab.com/maxwell79/chicken-libxml2
libxml
[module] libxml
- attributes->string
- text-reader:element-to-string
- text-reader:end-element-is?
- text-reader:start-element-is?
- text-reader:end-element-node?
- text-reader:text-node?
- text-reader:element-node?
- text-reader:make
- text-reader:read-more
- text-reader:free
- text-reader:depth
- text-reader:node-type
- text-reader:empty-element?
- text-reader:move-to-attribute
- text-reader:all-attributes
- text-reader:move-to-next-attribute
- text-reader:move-to-first-attribute
- text-reader:move-to-element
- text-reader:next
- text-reader:next-sibling
- text-reader:name
- text-reader:value
- sax:attributes->list
- sax:parse-file
- sax:parse-string
- sax:make-handler
- sax:free-handler
- dom:is-element-node?
- dom:is-text-node?
- dom:is-attribute-node?
- dom:parse-string
- dom:parse-string-default
- dom:cleanup-parser
- dom:memory-dump
- dom:parse-file
- dom:free-doc
- dom:make-parser-context
- dom:read-file-with-context
- dom:is-valid?
- dom:free-parser-context
- dom:to-string
- dom:copy-doc
- dom:root-element
- dom:copy-node
- dom:copy-node-list
- dom:next-node
- dom:node-content
- dom:node-children
- dom:node-type
- dom:node-name
- dom:is-element-name?
- dom:get-attribute
- dom:attributes
Miscellaneous
attributes->string
- [procedure] attributes->string attributes
Converts an attribute list to a string
- attributes
- List of attributes
Examples
(attributes->string `(("id1" . "value1") ("id2" . "value2")))
=> " id2=\"value2\" id1=\"value1\""
DOM Parser
DOM stands for the Document Object Model; this is an API for accessing XML or HTML structured documents.
Example
(define (dom-demo)
  ;; Walk the node list starting at NODE, descending into children first
  (define (print-element-names node)
    (let loop ((n node))
      (when n
        (when (dom:is-element-node? n)
          (print "element <" (dom:node-name n) ">")
          (print "@ => " (dom:attributes n)))
        (when (dom:is-text-node? n)
          (print "content => " (dom:node-content n)))
        (print-element-names (dom:node-children n))
        (loop (dom:next-node n)))))
  (define ctx (dom:make-parser-context))
  (define doc (dom:read-file-with-context ctx "foo.xml" #f 0))
  (define root (dom:root-element doc))
  (define valid? (dom:is-valid? ctx))
  (print "XML is valid?: " valid?)
  (print "root: " root)
  (print-element-names root)
  (dom:free-doc doc)
  (dom:cleanup-parser))
Node Types
dom:element-node
- [constant] dom:element-node
DOM element node
dom:attribute-node
- [constant] dom:attribute-node
DOM attribute node
dom:text-node
- [constant] dom:text-node
DOM text node
dom:cdata_section_node
- [constant] dom:cdata_section_node
DOM CDATA section node
dom:entity-ref-node
- [constant] dom:entity-ref-node
DOM entity reference node
dom:entity-node
- [constant] dom:entity-node
DOM entity node
dom:pi-node
- [constant] dom:pi-node
DOM processing instruction (PI) node
dom:comment-node
- [constant] dom:comment-node
DOM comment node
dom:document-node
- [constant] dom:document-node
DOM document node
dom:document-type-node
- [constant] dom:document-type-node
DOM document type node
dom:document-frag-node
- [constant] dom:document-frag-node
DOM document fragment node
dom:notation-node
- [constant] dom:notation-node
DOM notation node
dom:html-document-node
- [constant] dom:html-document-node
DOM HTML document node
dom:dtd-node
- [constant] dom:dtd-node
DOM DTD node
dom:element-decl
- [constant] dom:element-decl
DOM element declaration
dom:attribute-decl
- [constant] dom:attribute-decl
DOM attribute declaration
dom:entity-decl
- [constant] dom:entity-decl
DOM entity declaration
dom:namespace-decl
- [constant] dom:namespace-decl
DOM namespace declaration
dom:xinclude-start
- [constant] dom:xinclude-start
DOM xinclude start declaration
dom:xinclude-end
- [constant] dom:xinclude-end
DOM xinclude end declaration
API
dom:is-element-node?
- [procedure] dom:is-element-node? node
Checks if the specified dom:node is an element node
- node
- A dom:xml-node
dom:is-text-node?
- [procedure] dom:is-text-node? node
Checks if the specified dom:node is a text node
- node
- A dom:xml-node
dom:is-attribute-node?
- [procedure] dom:is-attribute-node? node
Checks if the specified dom:node is an attribute node
- node
- A dom:xml-node
dom:parse-string
- [procedure] dom:parse-string xml-string xml-size URL encoding options
Parses a string using the DOM parser API
- xml-string
- XML string
- xml-size
- Size of the XML string
- URL
- XML URL
- encoding
- Encoding
- options
- Options
dom:parse-string-default
- [procedure] dom:parse-string-default str
Parses a string using the DOM parser API with default options and encoding
- str
- XML string
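As a hedged illustration of parsing from memory rather than from a file, the sketch below parses a literal string with dom:parse-string-default and inspects the root element. The helper name root-name-and-attribute and the sample XML are made up for this example, and loading the module with (use libxml) is an assumption based on the module name listed above. Error handling for malformed input is omitted.

```scheme
;; Assumption: on CHICKEN 4 the egg's module is loaded with (use libxml).
(use libxml)

;; Hypothetical helper: parse XML-STRING and return a pair holding the
;; root element's name and whatever dom:get-attribute returns for KEY.
(define (root-name-and-attribute xml-string key)
  (let* ((doc    (dom:parse-string-default xml-string))
         (root   (dom:root-element doc))
         (result (cons (dom:node-name root)
                       (dom:get-attribute key root))))
    ;; Release the document once the values have been extracted.
    (dom:free-doc doc)
    result))

(print (root-name-and-attribute
        "<book id=\"42\"><title>SICP</title></book>"
        "id"))
```

The helper frees the document before returning, so callers only ever see the extracted values and never a node that belongs to a freed document.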
dom:cleanup-parser
- [procedure] dom:cleanup-parser
Cleans up the parser's global state once all parsing is finished (presumably a wrapper around libxml2's xmlCleanupParser); the DOM example above calls it with no arguments after freeing the document
dom:parse-file
- [procedure] dom:parse-file filename
Parses a file using the DOM parser API
- filename
- XML file
dom:free-doc
- [procedure] dom:free-doc doc
Frees the dom:doc
dom:make-parser-context
- [procedure] dom:make-parser-context
Creates a DOM parser context
dom:read-file-with-context
- [procedure] dom:read-file-with-context context filename encoding options
Parses an XML file using the given DOM parser context
- context
- DOM parser context
- filename
- encoding
- options
dom:is-valid?
- [procedure] dom:is-valid? context
Checks if the parser context is valid after parsing a file
- context
- DOM parser context
dom:free-parser-context
- [procedure] dom:free-parser-context
Frees the dom:parser-context
dom:to-string
- [procedure] dom:to-string
Converts a dom:node to a string, including its child nodes
dom:next-node
- [procedure] dom:next-node node
Returns the next sibling of the dom:node
dom:node-content
- [procedure] dom:node-content node
Returns the contents (text) of the dom:node
dom:node-children
- [procedure] dom:node-children node
Returns the first child node
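Because dom:node-children returns only the first child, visiting every child means following dom:next-node until it yields a false value, as the DOM example above does. A minimal sketch of that pattern follows; the helper name child-element-names and the sample XML are hypothetical, and (use libxml) is an assumption about how the module is loaded.

```scheme
;; Assumption: on CHICKEN 4 the egg's module is loaded with (use libxml).
(use libxml)

;; Hypothetical helper: collect the names of NODE's direct element children.
;; Starts at the first child and follows the sibling chain until it ends.
(define (child-element-names node)
  (let loop ((child (dom:node-children node))
             (names '()))
    (cond ((not child) (reverse names))
          ((dom:is-element-node? child)
           (loop (dom:next-node child)
                 (cons (dom:node-name child) names)))
          ;; Skip text, comments and other non-element nodes.
          (else (loop (dom:next-node child) names)))))

(define doc (dom:parse-string-default "<a><b/>text<c/></a>"))
(print (child-element-names (dom:root-element doc)))
(dom:free-doc doc)
```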
dom:node-name
- [procedure] dom:node-name node
Returns the name of the dom:node
dom:is-element-name?
- [procedure] dom:is-element-name? name dom:node
Checks if the name of the dom:node matches the specified string
- name
- Name (string) to match
- dom:node
dom:get-attribute
- [procedure] dom:get-attribute key dom:node
Returns the attribute for the specified key
- key
- string
- dom:node
dom:attributes
- [procedure] dom:attributes n
Returns the complete set of XML attributes for the given node
- dom:node
SAX Parser
Sometimes the DOM tree output is just too large to fit reasonably into memory. In that case (and if you don't expect to save back the XML document loaded using libxml), it's better to use the SAX interface of libxml. SAX is a callback-based interface to the parser. Before parsing, the application layer registers a customized set of callbacks which are called by the library as it progresses through the XML input.
Example
(define (sax-demo)
  (define sax
    (sax:make-handler
      ;; on-start: element name and attribute list
      (lambda (localname attribute-list)
        (print "<" localname ">")
        (print "@ => " attribute-list))
      ;; on-end: element name
      (lambda (localname)
        (print "<" localname "/>"))
      ;; on-characters: character data
      (lambda (characters)
        (print "[on-chars]: characters: " characters))))
  (sax:parse-file sax #f "foo.xml")
  (sax:free-handler sax))
sax:parse-file
- [procedure] sax:parse-file handler user-data filename
Parses an XML file using the SAX handler
- handler
- SAX handler
- user-data
- SAX parser context
- filename
- XML file
sax:parse-string
- [procedure] sax:parse-string sax-handler user-data xml-string size
Parses an XML string using the SAX handler
- sax-handler
- user-data
- SAX parser context
- xml-string
- size
- The size of the XML string
sax:make-handler
- [procedure] sax:make-handler on-start on-end on-characters
Makes a SAX handler
- on-start
- λ called at the start of an element
- on-end
- λ called at the end of an element
- on-characters
- λ called when character data is read
sax:free-handler
- [procedure] sax:free-handler sax-handler
Frees the SAX handler
- sax-handler
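The callback model described above can also be driven from an in-memory string. The following is a sketch under stated assumptions rather than a definitive recipe: the helper name collect-element-names is hypothetical, #f is passed as user-data as in the file-based example, string-length is assumed to be an acceptable size for plain ASCII input, and (use libxml) is an assumption about how the module is loaded.

```scheme
;; Assumption: on CHICKEN 4 the egg's module is loaded with (use libxml).
(use libxml)

;; Hypothetical helper: return the element names in the order in which
;; the on-start callback fires while parsing XML-STRING.
(define (collect-element-names xml-string)
  (let* ((names '())
         (sax (sax:make-handler
                ;; on-start: remember the element name
                (lambda (localname attribute-list)
                  (set! names (cons localname names)))
                ;; on-end and on-characters: nothing to record
                (lambda (localname) #f)
                (lambda (characters) #f))))
    (sax:parse-string sax #f xml-string (string-length xml-string))
    (sax:free-handler sax)
    (reverse names)))

(print (collect-element-names "<a><b/><c>text</c></a>"))
```

Keeping the accumulator in a closure means no document tree is built; only the data the callbacks choose to keep stays in memory.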
Text Reader Parser
Libxml2's main API is tree based: the parsing operation loads the document completely into memory and exposes it as a tree of nodes, all available at the same time. This is very simple and quite powerful, but has the major limitation that the size of the document that can be handled is limited by the size of the available memory. Libxml2 also provides a SAX-based API, but that version was designed upon one of the early expat versions of SAX, and SAX is not formally defined for C. SAX basically works by registering callbacks which are called directly by the parser as it progresses through the document stream. The problem is that this programming model is relatively complex, is not well standardized, cannot provide validation directly, and makes entity, namespace and base processing relatively hard.
The text-reader API provides a far simpler programming model. The API acts as a cursor going forward on the document stream and stopping at each node on the way. The user's code keeps control of the progress and simply calls a read-next procedure (text-reader:read-more in this egg) repeatedly to progress to each node in sequence in document order. There is direct support for namespaces, xml:base and entity handling, and adding DTD validation on top of it was relatively simple. This API is really close to the DOM Core specification. This provides a far more standard, easy-to-use and powerful API than the existing SAX. Moreover, integrating extension features based on the tree seems relatively easy.
In a nutshell, the text-reader API provides a simpler, more standard and more extensible interface for handling large documents than the existing SAX version.
Example
(define (text-reader-demo)
  (define tr (text-reader:make "foo.xml"))
  (define (helper tr)
    (when (text-reader:element-node? tr)
      (print "<" (text-reader:name tr) ">")
      (print "@ => " (text-reader:all-attributes tr)))
    (when (text-reader:text-node? tr)
      (print "value =>" (text-reader:value tr)))
    ;; keep going while read-more reports another node (a positive result)
    (if (> (text-reader:read-more tr) 0)
        (helper tr)))
  (helper tr)
  (text-reader:free tr))
Node Types
text-reader:none
- [constant] text-reader:none
Text-Reader none
text-reader:element
- [constant] text-reader:element
Text-Reader element
text-reader:attribute
- [constant] text-reader:attribute
Text-Reader attribute
text-reader:text
- [constant] text-reader:text
Text-Reader text
text-reader:cdata
- [constant] text-reader:cdata
Text-Reader CDATA
text-reader:entity-reference
- [constant] text-reader:entity-reference
Text-Reader entity reference
text-reader:entity
- [constant] text-reader:entity
Text-Reader entity
text-reader:processing-instruction
- [constant] text-reader:processing-instruction
Text-Reader processing instruction
text-reader:comment
- [constant] text-reader:comment
Text-Reader comment
text-reader:document
- [constant] text-reader:document
Text-Reader document
text-reader:document-type
- [constant] text-reader:document-type
Text-Reader document type
text-reader:document-fragmenta
- [constant] text-reader:document-fragmenta
Text-Reader document fragment
text-reader:notation
- [constant] text-reader:notation
Text-Reader notation
text-reader:whitespace
- [constant] text-reader:whitespace
Text-Reader whitespace
text-reader:significant-whitespace
- [constant] text-reader:significant-whitespace
Text-Reader significant whitespace
text-reader:end-element
- [constant] text-reader:end-element
Text-Reader element end
text-reader:end-entity
- [constant] text-reader:end-entity
Text-Reader entity end
text-reader:xml-declaration
- [constant] text-reader:xml-declaration
Text-Reader XML declaration
API
text-reader:element-to-string
- [procedure] text-reader:element-to-string r
Converts the text reader's current element to a string, including child nodes
- text-reader
text-reader:end-element-is?
- [procedure] text-reader:end-element-is? name reader
Checks if the current node is an end element with the specified name
- name
- Element name (string)
- text-reader
text-reader:start-element-is?
- [procedure] text-reader:start-element-is? name reader
Checks if the current node is a start element with the specified name
- name
- Element name (string)
- text-reader
text-reader:end-element-node?
- [procedure] text-reader:end-element-node? reader
Checks if the current node is an end element
- reader
text-reader:text-node?
- [procedure] text-reader:text-node? reader
Checks if the current node is a text node
- reader
text-reader:element-node?
- [procedure] text-reader:element-node? reader
Checks if the current node is an element
- reader
text-reader:make
- [procedure] text-reader:make filename
Makes a new text-reader
- filename
text-reader:read-more
- [procedure] text-reader:read-more text-reader
Reads the next node in the text-reader
- text-reader
text-reader:free
- [procedure] text-reader:free text-reader
Frees the specified text-reader
- text-reader
text-reader:node-type
- [procedure] text-reader:node-type text-reader
Returns the node type
- text-reader
text-reader:empty-element?
- [procedure] text-reader:empty-element? text-reader
Checks if the text-reader's current element is empty
- text-reader
text-reader:move-to-attribute
- [procedure] text-reader:move-to-attribute text-reader attribute-name
Moves the text-reader to the specified attribute
- text-reader
- attribute-name
- (string)
text-reader:all-attributes
- [procedure] text-reader:all-attributes r
Extracts all the attributes from the element. Attributes are placed into an association list
- text-reader
text-reader:move-to-next-attribute
- [procedure] text-reader:move-to-next-attribute text-reader
Moves the text-reader to the next attribute
- text-reader
text-reader:move-to-first-attribute
- [procedure] text-reader:move-to-first-attribute text-reader
Moves the text-reader to the first attribute
- text-reader
text-reader:move-to-element
- [procedure] text-reader:move-to-element text-reader
Moves the text-reader to the first element
- text-reader
text-reader:next
- [procedure] text-reader:next text-reader
Moves the text-reader to the next node
- text-reader
text-reader:next-sibling
- [procedure] text-reader:next-sibling text-reader
Moves the text-reader to the next sibling node
- text-reader
text-reader:name
- [procedure] text-reader:name text-reader
Returns the name of the node
- text-reader
text-reader:value
- [procedure] text-reader:value text-reader
Returns the value of the node
- text-reader
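To tie the cursor procedures together, here is a small sketch that counts how many start elements carry a given name. It is an illustration under assumptions, not the egg author's code: the helper name count-start-elements and the sample file name are hypothetical, (use libxml) is an assumption about how the module is loaded, and the loop relies on text-reader:read-more returning a positive value while nodes remain, as the text-reader example above does.

```scheme
;; Assumption: on CHICKEN 4 the egg's module is loaded with (use libxml).
(use libxml)

;; Write a small sample document so the sketch is self-contained.
;; The file name is hypothetical.
(define sample-file "text-reader-sample.xml")
(with-output-to-file sample-file
  (lambda ()
    (display "<list><item>one</item><item>two</item></list>")))

;; Hypothetical helper: advance the cursor node by node and count the
;; start elements called NAME.
(define (count-start-elements filename name)
  (let ((tr (text-reader:make filename)))
    (let loop ((count 0))
      (if (> (text-reader:read-more tr) 0)
          (loop (if (text-reader:start-element-is? name tr)
                    (+ count 1)
                    count))
          (begin
            ;; Free the reader before handing back the result.
            (text-reader:free tr)
            count)))))

(print (count-start-elements sample-file "item"))
```

Only the current node is inspected at each step, so memory use does not grow with the size of the document.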
About this egg
Author
Colophon
Documented by hahn.