XML - SAX

What SAX Is

  • SAX (Simple API for XML) is an event-driven parser API for XML documents.

  • Instead of reading the whole document into memory, it works in an event-driven way:

    • The parser reads the XML document sequentially (top to bottom).

    • When it encounters something (like the start of an element, text, or end of an element), it triggers an event (callback) in your code.


How It Works

  • You write event handlers (functions) for different events, such as:

    • startElement → called when a start tag is found (<book>).

    • characters → called when text content is found (XML Guide). A parser may deliver one run of text in several calls.

    • endElement → called when an end tag is found (</book>).

  • The SAX parser calls these handlers as it streams through the XML.
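
As a minimal sketch of the handlers described above, here is how they might look with Python's standard xml.sax module. The class name BookHandler and the events list are illustrative choices for this sketch, not part of SAX itself.

```python
import xml.sax


class BookHandler(xml.sax.ContentHandler):
    """Records each callback so the order of events can be inspected."""

    def __init__(self):
        super().__init__()
        self.events = []  # illustrative: a log of (kind, value) tuples

    def startElement(self, name, attrs):
        # Called for every start tag, e.g. <book>.
        self.events.append(("start", name))

    def characters(self, content):
        # Called for text content; one run of text may arrive in several chunks.
        if content.strip():
            self.events.append(("text", content))

    def endElement(self, name):
        # Called for every end tag, e.g. </book>.
        self.events.append(("end", name))


handler = BookHandler()
xml.sax.parseString(b"<book>XML Guide</book>", handler)
print(handler.events)
```

The parser drives the program here: your code never asks for the next element, it only reacts when a callback fires.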


Example

XML:

<book>
  <title>XML Guide</title>
  <author>A. Smith</author>
</book>

SAX parsing events (simplified; the whitespace between tags also triggers characters events, which are omitted here):

  1. startElement("book")

  2. startElement("title")

  3. characters("XML Guide")

  4. endElement("title")

  5. startElement("author")

  6. characters("A. Smith")

  7. endElement("author")

  8. endElement("book")
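
The event sequence above can be reproduced with a short script. This is a sketch assuming Python's xml.sax module; TraceHandler and its trace list are names made up for the example. Whitespace-only text is skipped so the output lines up with the simplified listing.

```python
import xml.sax

XML = b"""<book>
  <title>XML Guide</title>
  <author>A. Smith</author>
</book>"""


class TraceHandler(xml.sax.ContentHandler):
    """Formats every callback as a line of the event listing."""

    def __init__(self):
        super().__init__()
        self.trace = []

    def startElement(self, name, attrs):
        self.trace.append(f'startElement("{name}")')

    def characters(self, content):
        # Newlines and indentation between tags also arrive here;
        # skip them to match the simplified listing.
        if content.strip():
            self.trace.append(f'characters("{content}")')

    def endElement(self, name):
        self.trace.append(f'endElement("{name}")')


handler = TraceHandler()
xml.sax.parseString(XML, handler)
for number, event in enumerate(handler.trace, start=1):
    print(f"{number}. {event}")
```

Removing the `if content.strip()` check shows the extra characters events that the parser reports for the line breaks and indentation.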


Advantages of SAX

  • Fast and memory-efficient: doesn’t load the whole XML document into memory (good for large files).

  • Streaming: processes XML as it’s read.
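
To illustrate the streaming point, the sketch below feeds a document to the parser piece by piece, using the incremental feed() and close() methods of the parser returned by xml.sax.make_parser(). The chunks() generator is a hypothetical stand-in for a file or network stream, and TitleCounter is an invented handler name.

```python
import xml.sax


class TitleCounter(xml.sax.ContentHandler):
    """Counts <title> elements without storing the document."""

    def __init__(self):
        super().__init__()
        self.count = 0

    def startElement(self, name, attrs):
        if name == "title":
            self.count += 1


def chunks():
    # Hypothetical data source: yields the document in small pieces,
    # as a file read or socket might.
    yield b"<library>"
    for i in range(1000):
        yield f"<book><title>Book {i}</title></book>".encode("utf-8")
    yield b"</library>"


handler = TitleCounter()
parser = xml.sax.make_parser()
parser.setContentHandler(handler)
for chunk in chunks():
    parser.feed(chunk)  # callbacks fire as soon as each chunk is parsed
parser.close()          # signals end of input
print(handler.count)
```

Only the current chunk and the counter are held in memory at any time, which is why this style suits large inputs.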

Disadvantages of SAX

  • One-way, forward-only: the parser can’t go back to earlier parts of the document, so your handlers must store anything they will need later.

  • Complex handling: if you need random access to data or to build a tree structure, you have to track the state yourself, so SAX is harder to work with than DOM.
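
The extra bookkeeping can be seen in a small sketch that collects book records from a stream of events. It assumes Python's xml.sax module; BookCollector, its fields, and the sample data (including the second book) are made up for illustration.

```python
import xml.sax

# Made-up sample data for this sketch.
XML = b"""<library>
  <book><title>XML Guide</title><author>A. Smith</author></book>
  <book><title>SAX Notes</title><author>B. Jones</author></book>
</library>"""


class BookCollector(xml.sax.ContentHandler):
    """Rebuilds book records by tracking state across callbacks."""

    def __init__(self):
        super().__init__()
        self.books = []      # finished records
        self.current = None  # record being filled, or None outside <book>
        self.text = []       # text chunks of the element being read

    def startElement(self, name, attrs):
        if name == "book":
            self.current = {}
        self.text = []

    def characters(self, content):
        # Text may arrive in pieces, so collect and join it later.
        self.text.append(content)

    def endElement(self, name):
        if name in ("title", "author") and self.current is not None:
            self.current[name] = "".join(self.text).strip()
        elif name == "book":
            self.books.append(self.current)
            self.current = None
        self.text = []


handler = BookCollector()
xml.sax.parseString(XML, handler)
print(handler.books)
```

With a DOM parser the same data could be read by navigating the tree directly; here the handler has to remember where it is in the document on its own.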


In short:
SAX = event-driven, streaming XML parsing → good for large XML documents, but less convenient than DOM if you need random access to the whole document structure.