XML - StAX

What StAX Is

  • StAX = Streaming API for XML, a Java API (JSR 173) bundled with the JDK since Java 6.

  • Like SAX, it’s a stream-based parser (reads XML sequentially, doesn’t load the whole file into memory).

  • But unlike SAX, it gives you more control:

    • You can pull events from the parser when you need them, instead of reacting to callbacks.

    • This is why StAX is often called a pull parser, while SAX is a push parser.


How It Works

  • You create a reader (an XMLStreamReader or XMLEventReader, obtained from XMLInputFactory) and move through the XML from start to end.

  • As you move forward, you “pull” the next event.

  • Common events include:

    • START_ELEMENT → opening tag of an element

    • CHARACTERS → text content

    • END_ELEMENT → closing tag of an element
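
The pull loop described above can be sketched with the cursor API (XMLStreamReader). This is a minimal sketch, not the only way to structure it: the class name PullLoopSketch, the method countEvents, and the sample <note> document are made up for illustration. Only the javax.xml.stream calls come from the JDK.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical class name, used only for this illustration.
class PullLoopSketch {

    // Returns {START_ELEMENT count, CHARACTERS count, END_ELEMENT count}.
    static int[] countEvents(String xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        int starts = 0, texts = 0, ends = 0;
        try {
            // The caller drives the parser: nothing is parsed until next() is called.
            while (reader.hasNext()) {
                int event = reader.next();
                switch (event) {
                    case XMLStreamConstants.START_ELEMENT:
                        starts++;
                        break;
                    case XMLStreamConstants.CHARACTERS:
                        texts++;
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        ends++;
                        break;
                    default:
                        break; // other event types, e.g. END_DOCUMENT or COMMENT
                }
            }
        } finally {
            reader.close();
        }
        return new int[] {starts, texts, ends};
    }

    public static void main(String[] args) throws XMLStreamException {
        int[] counts = countEvents("<note><to>Ann</to></note>");
        System.out.println("START_ELEMENT=" + counts[0]
                + " CHARACTERS=" + counts[1]
                + " END_ELEMENT=" + counts[2]);
    }
}
```

Because the loop belongs to your code and not to the parser, you can stop early, skip ahead, or hand the reader to another method at any point.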


Example

XML:

<book>
  <title>XML Guide</title>
  <author>A. Smith</author>
</book>

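StAX parsing flow (pulling events; the whitespace-only CHARACTERS events between elements and the final END_DOCUMENT event are left out for brevity):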

  1. START_ELEMENT: book

  2. START_ELEMENT: title

  3. CHARACTERS: XML Guide

  4. END_ELEMENT: title

  5. START_ELEMENT: author

  6. CHARACTERS: A. Smith

  7. END_ELEMENT: author

  8. END_ELEMENT: book
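
To see this flow come out of a real parser, the sketch below feeds the same <book> document to an XMLStreamReader and records one line per event. The names BookEventTrace and trace are hypothetical. The code assumes you want to drop whitespace-only text (the indentation between tags), so it filters with isWhiteSpace(), and it turns on IS_COALESCING so each text run arrives as a single CHARACTERS event.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical class name, used only for this illustration.
class BookEventTrace {

    static final String BOOK_XML =
            "<book>\n"
            + "  <title>XML Guide</title>\n"
            + "  <author>A. Smith</author>\n"
            + "</book>";

    // Returns one "EVENT: value" line per element or non-blank text event.
    static List<String> trace(String xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Ask the parser to merge adjacent character data into one event.
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        List<String> events = new ArrayList<>();
        try {
            while (reader.hasNext()) {
                switch (reader.next()) {
                    case XMLStreamConstants.START_ELEMENT:
                        events.add("START_ELEMENT: " + reader.getLocalName());
                        break;
                    case XMLStreamConstants.CHARACTERS:
                        // Skip the indentation and newlines between tags.
                        if (!reader.isWhiteSpace()) {
                            events.add("CHARACTERS: " + reader.getText());
                        }
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        events.add("END_ELEMENT: " + reader.getLocalName());
                        break;
                    default:
                        break; // END_DOCUMENT and other event types are ignored here
                }
            }
        } finally {
            reader.close();
        }
        return events;
    }

    public static void main(String[] args) throws XMLStreamException {
        for (String line : trace(BOOK_XML)) {
            System.out.println(line);
        }
    }
}
```

With the JDK's default parser, the recorded lines should match the eight steps listed above.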


Advantages of StAX

  • Control: You decide when to get the next event (pull model).

  • Efficient: Still lightweight and memory-friendly like SAX.

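  • Flexible: It covers writing as well as reading, so the two are easy to combine, and it comes in two styles: a cursor API (XMLStreamReader / XMLStreamWriter) and an iterator API (XMLEventReader / XMLEventWriter).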
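
As an illustration of mixing reading and writing, the sketch below uses the iterator API to copy a document while dropping one element. It reads events with XMLEventReader and passes the ones it keeps to XMLEventWriter. The class name DropAuthorSketch, the method withoutAuthor, and the choice to remove <author> are assumptions made for this example. The exact form of the XML declaration in the output depends on the StAX implementation.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.stream.XMLEventReader;
import javax.xml.stream.XMLEventWriter;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLOutputFactory;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.events.XMLEvent;

// Hypothetical class name, used only for this illustration.
class DropAuthorSketch {

    // Copies the document to a string, leaving out every <author> element.
    static String withoutAuthor(String xml) throws XMLStreamException {
        XMLEventReader reader =
                XMLInputFactory.newInstance().createXMLEventReader(new StringReader(xml));
        StringWriter out = new StringWriter();
        XMLEventWriter writer = XMLOutputFactory.newInstance().createXMLEventWriter(out);
        int skipDepth = 0; // > 0 while we are inside an element being dropped
        try {
            while (reader.hasNext()) {
                XMLEvent event = reader.nextEvent();
                if (skipDepth > 0) {
                    // Track nesting so we know when the dropped element ends.
                    if (event.isStartElement()) {
                        skipDepth++;
                    } else if (event.isEndElement()) {
                        skipDepth--;
                    }
                    continue;
                }
                if (event.isStartElement()
                        && "author".equals(event.asStartElement().getName().getLocalPart())) {
                    skipDepth = 1;
                    continue;
                }
                writer.add(event); // forward everything else unchanged
            }
        } finally {
            writer.close();
            reader.close();
        }
        return out.toString();
    }

    public static void main(String[] args) throws XMLStreamException {
        String xml = "<book><title>XML Guide</title><author>A. Smith</author></book>";
        System.out.println(withoutAuthor(xml));
    }
}
```

The same event objects flow from reader to writer, so no intermediate tree is built and memory use stays flat regardless of document size.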

Disadvantages of StAX

  • Forward-only: like SAX, you can’t go back in the document.

  • More coding effort: you write the parsing loop and track state (such as the current element) yourself.
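
The extra effort shows up as soon as you want objects instead of raw events. The sketch below fills a small value holder from the <book> example, and it has to remember which element it is inside to know where each piece of text belongs. The Book class, BookReaderSketch, and readBook are hypothetical names, and the approach (a single "current element" variable) is an assumption that only suits flat documents like this one.

```java
import java.io.StringReader;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamException;
import javax.xml.stream.XMLStreamReader;

// Hypothetical class name, used only for this illustration.
class BookReaderSketch {

    // Hypothetical value holder for the example document.
    static final class Book {
        String title;
        String author;
    }

    static Book readBook(String xml) throws XMLStreamException {
        XMLInputFactory factory = XMLInputFactory.newInstance();
        // Merge adjacent character data so each text run is one event.
        factory.setProperty(XMLInputFactory.IS_COALESCING, Boolean.TRUE);
        XMLStreamReader reader = factory.createXMLStreamReader(new StringReader(xml));
        Book book = new Book();
        String current = null; // state we must track ourselves: the open element
        try {
            while (reader.hasNext()) {
                switch (reader.next()) {
                    case XMLStreamConstants.START_ELEMENT:
                        current = reader.getLocalName();
                        break;
                    case XMLStreamConstants.CHARACTERS:
                        if ("title".equals(current)) {
                            book.title = reader.getText();
                        } else if ("author".equals(current)) {
                            book.author = reader.getText();
                        }
                        break;
                    case XMLStreamConstants.END_ELEMENT:
                        current = null; // text after a closing tag belongs to no field
                        break;
                    default:
                        break;
                }
            }
        } finally {
            reader.close();
        }
        return book;
    }

    public static void main(String[] args) throws XMLStreamException {
        Book book = readBook(
                "<book>\n  <title>XML Guide</title>\n  <author>A. Smith</author>\n</book>");
        System.out.println(book.title + " by " + book.author);
    }
}
```

A DOM parser would hand you the tree and let you look up title and author directly. Here the bookkeeping is yours, which is the price of the low memory footprint.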


In Short

  • DOM → Loads the whole XML document into memory (tree structure, random access).

  • SAX → Event-driven, parser pushes events to your code.

  • StAX → Stream-based, you pull events when you want them (more control than SAX, less memory than DOM).