xml

✓Verified·Scanned 2/18/2026

Parse, generate, and transform XML with correct namespace handling and encoding.

from clawhub.ai·vb34f19b·2.6 KB·0 installs

Scanned from 1.0.0 at b34f19b

$ vett add clawhub.ai/ivangdavila/xml

Namespaces

XPath /root/child fails if document has default namespace—use //*[local-name()='child'] or register prefix
Default namespace (xmlns="...") applies to elements, not attributes: unprefixed attributes are in no namespace and need an explicit prefix to be in one
Namespace prefix is arbitrary—<foo:element> and <bar:element> are identical if both prefixes map to same URI
Child elements don't inherit a parent's prefixed namespace: the prefix binding stays in scope, but each child must use the prefix explicitly (only a default namespace is inherited)
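The namespace rules above can be sketched with Python's standard xml.etree.ElementTree. The document, the urn:example:... URIs, and the query prefixes d and x are placeholders invented for this illustration; the {*} wildcard needs Python 3.8 or later and plays roughly the role of local-name().

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the URIs are placeholders for illustration.
DOC = """<root xmlns="urn:example:default" xmlns:a="urn:example:a">
  <child id="1"><a:item a:kind="x" plain="y"/></child>
</root>"""

root = ET.fromstring(DOC)

# An unqualified path finds nothing: <child> is in the default namespace.
unqualified = root.findall("child")

# Register prefixes for the query. The prefix names are arbitrary;
# only the URIs have to match the document.
ns = {"d": "urn:example:default", "x": "urn:example:a"}
qualified = root.findall("d:child", ns)

# Namespace wildcard (Python 3.8+), similar in spirit to local-name().
wildcard = root.findall("{*}child")

# <a:item> uses its prefix explicitly, so it is NOT in the default namespace.
item = root.find("d:child/x:item", ns)

# Prefixed attributes carry a namespace; unprefixed ones have none.
attrs = dict(item.attrib)

# Two different prefixes bound to the same URI name the same element.
same_name = (
    ET.fromstring('<foo:e xmlns:foo="urn:example:a"/>').tag
    == ET.fromstring('<bar:e xmlns:bar="urn:example:a"/>').tag
)
```

ElementTree stores qualified names as {uri}local, which is why the prefix chosen in the source document never appears in the parsed tree.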

Encoding

<?xml version="1.0" encoding="UTF-8"?> must match actual file encoding—mismatch corrupts non-ASCII
The XML declaration must be the very first thing in the file: no whitespace or comments before it; only a byte order mark (BOM) may precede it
Without a declaration or BOM, parsers assume UTF-8 (UTF-16 requires a BOM); an explicit declaration is safer across parsers
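A minimal sketch of the encoding rules, assuming Python's standard xml.etree.ElementTree (an expat-based parser). The sample text and the helper name parses are made up for this illustration, and xml_declaration=True needs Python 3.8 or later.

```python
import xml.etree.ElementTree as ET

text = '<?xml version="1.0" encoding="UTF-8"?><name>José</name>'

good = text.encode("utf-8")      # bytes match the declaration
bad = text.encode("latin-1")     # declaration still claims UTF-8


def parses(data: bytes) -> bool:
    """Return True if the parser accepts the bytes as XML."""
    try:
        ET.fromstring(data)
        return True
    except ET.ParseError:
        return False


good_ok = parses(good)
# Here the Latin-1 byte for "é" is not valid UTF-8, so the parser rejects it.
# Other mismatches can decode "successfully" into wrong characters instead.
bad_ok = parses(bad)

# Whitespace before the XML declaration is an error.
leading_space_ok = parses(b" " + good)

# When writing, ask for an explicit declaration rather than relying on defaults.
serialized = ET.tostring(ET.fromstring(good), encoding="utf-8", xml_declaration=True)
round_trip = ET.fromstring(serialized).text
```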

Escaping & CDATA

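Five predefined entities: &amp; &lt; &gt; &quot; &apos;. In text, & and < must always be escaped; > only when it follows ]]; quotes only inside attribute values
CDATA sections <![CDATA[...]]> suit blocks with many special chars, but ]]> inside CDATA ends the section early; split it as ]]]]><![CDATA[>
Attribute values: use &quot; if delimited by ", or &apos; if delimited by '
Numeric character references &#60; and &#x3C; (both mean <) work everywhere; useful for edge cases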
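The escaping rules can be exercised with the standard library's xml.sax.saxutils. The cdata helper below is a hypothetical function written for this sketch, not part of any library; it uses the common trick of splitting ]]> across two CDATA sections.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

raw = 'a < b && c > "d"'

# escape() handles &, < and > only; quotes are left alone in text content.
text_escaped = escape(raw)

# quoteattr() picks a delimiter and escapes whatever conflicts with it.
values = ['say "hi"', "it's", 'it\'s "both" & <more>']
attr_round_trips = [
    ET.fromstring(f"<e a={quoteattr(v)}/>").get("a") == v for v in values
]


def cdata(text: str) -> str:
    """Wrap text in CDATA, splitting any ']]>' so it cannot end the section early."""
    return "<![CDATA[" + text.replace("]]>", "]]]]><![CDATA[>") + "]]>"


tricky = "a ]]> b & <c>"
cdata_round_trip = ET.fromstring("<s>" + cdata(tricky) + "</s>").text

# Decimal and hexadecimal character references both decode to "<".
numeric = ET.fromstring("<s>&#60;&#x3C;</s>").text
```

Letting a serializer or these helpers do the escaping is usually less error-prone than building XML strings by hand.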

Whitespace

Whitespace between elements is preserved by default—pretty-printing adds nodes that may break processing
xml:space="preserve" attribute signals whitespace significance—but not all parsers respect it
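Normalize-space in XPath: normalize-space(text()) trims and collapses internal whitespace; with several text nodes XPath 1.0 uses only the first, so normalize-space(.) is often the safer form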
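To see how pretty-printing changes the tree, here is a small sketch using the standard xml.dom.minidom. The sample documents are invented, and normalize_space is a hypothetical Python stand-in that only approximates XPath's normalize-space() (Python's split() recognises a few more whitespace characters than XML does).

```python
from xml.dom import minidom

compact = minidom.parseString("<list><item>a</item><item>b</item></list>")
pretty = minidom.parseString(
    "<list>\n  <item>a</item>\n  <item>b</item>\n</list>"
)

# The indented version gains whitespace-only text nodes between the elements.
compact_children = len(compact.documentElement.childNodes)
pretty_children = len(pretty.documentElement.childNodes)

# Code that indexes childNodes directly breaks; filter by node type instead.
elements = [
    n for n in pretty.documentElement.childNodes
    if n.nodeType == n.ELEMENT_NODE
]


def normalize_space(text: str) -> str:
    """Trim the ends and collapse internal whitespace runs to single spaces."""
    return " ".join(text.split())


normalized = normalize_space("  a \n\t b  ")
```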

XPath Pitfalls

//element is expensive—traverses entire document; use specific paths when structure is known
Position is 1-indexed: [1] is first, not [0]
text() returns direct text children only—use string() or . for concatenated descendant text
Boolean in predicates: [@attr] tests existence, [@attr=''] tests empty value—different results
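A few of these pitfalls can be reproduced with xml.etree.ElementTree, which supports only a subset of XPath. The sample document is invented. ElementTree has no text() or string() functions, so .text, .tail, and itertext() stand in for them here as an approximation.

```python
import xml.etree.ElementTree as ET

root = ET.fromstring(
    '<doc><p id="">one <b>two</b> three</p><p>second</p><p id="x">third</p></doc>'
)

# Positions are 1-indexed: p[1] is the first <p>.
first = root.find("p[1]")

# [@id] tests for existence, so it also matches the empty id="".
has_id = root.findall("p[@id]")

# An empty value is a different test: get() returns "" here, None when absent.
empty_id = [p for p in has_id if p.get("id") == ""]
missing_id = [p for p in root.findall("p") if p.get("id") is None]

# Direct text versus concatenated descendant text.
direct_text = first.text                 # only the text before the first child
tail_text = first.find("b").tail         # text after </b> lives on the child
full_text = "".join(first.itertext())    # roughly what string(.) would give

# A specific path avoids scanning the whole tree the way .//b does.
specific = root.findall("p/b")
anywhere = root.findall(".//b")
```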

Structure

Self-closing <tag/> and empty <tag></tag> are semantically identical—but some legacy systems choke on self-closing
Comments cannot contain -- anywhere in the body, even inside quoted text; the parser reports a well-formedness error
Processing instructions <?target data?> cannot have ?> in data
Exactly one root element is required: a document with only comments/PIs and no element is not well-formed
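The structural rules can be checked with a small well-formedness probe. This is a sketch using xml.etree.ElementTree; is_well_formed is a hypothetical helper name and the inputs are made up for illustration.

```python
import xml.etree.ElementTree as ET


def is_well_formed(doc: str) -> bool:
    """Return True if the parser accepts the string as a complete XML document."""
    try:
        ET.fromstring(doc)
        return True
    except ET.ParseError:
        return False


ok = is_well_formed("<a/>")
double_hyphen = is_well_formed("<a><!-- x -- y --></a>")   # "--" inside a comment
pi_terminator = is_well_formed("<?t a ?> b ?><a/>")        # "?>" ends the PI early
no_root = is_well_formed("<!-- only a comment -->")        # no element at all
two_roots = is_well_formed("<a/><b/>")                     # more than one root

# <tag/> and <tag></tag> parse to the same tree; the serializer picks the form.
short_form = ET.tostring(ET.fromstring("<tag></tag>"))
long_form = ET.tostring(ET.fromstring("<tag/>"), short_empty_elements=False)
```

Passing short_empty_elements=False is one way to feed legacy consumers that cannot handle self-closing tags.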

Validation

Well-formed ≠ valid—parser may accept structure but fail against schema
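DTD validates but can't express complex constraints; prefer XSD or RELAX NG for new projects
XSD namespace xmlns:xs="http://www.w3.org/2001/XMLSchema" (used inside schema files) is commonly confused with the instance namespace xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" (used in documents for xsi:schemaLocation, xsi:nil, xsi:type)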
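The two schema-related namespaces are easy to mix up, so here is a minimal sketch of reading xsi attributes with xml.etree.ElementTree. The order document and the order.xsd file name are hypothetical. The standard library only checks well-formedness and never loads the schema, so validating against it would need a separate schema-aware library.

```python
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"             # vocabulary of .xsd files
XSI = "http://www.w3.org/2001/XMLSchema-instance"   # attributes in instance docs

# Hypothetical instance document; "order.xsd" is a placeholder file name.
instance = ET.fromstring(
    '<order xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" '
    'xsi:noNamespaceSchemaLocation="order.xsd">'
    '<note xsi:nil="true"/>'
    "</order>"
)

# Parsing succeeded, which only shows the document is well-formed.
schema_hint = instance.get(f"{{{XSI}}}noNamespaceSchemaLocation")
is_nil = instance.find("note").get(f"{{{XSI}}}nil")

# Looking the attribute up under the schema namespace finds nothing.
wrong_namespace = instance.get(f"{{{XS}}}noNamespaceSchemaLocation")
```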