Monday, 14 July 2014

Java XML Parser DOM vs. SAX

1. SAX (Simple API for XML)

In SAX, the parser triggers events as it reads the XML. When it encounters an opening tag (e.g. <something>), it fires a start-of-element event; when it reaches the matching closing tag (</something>), it fires an end-of-element event. In Java's SAX API these callbacks are named startElement and endElement. Using a SAX parser means you handle these events yourself and make sense of the data delivered with each one.
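
As a rough illustration of the event flow described above, here is a minimal sketch using the JDK's built-in SAX parser. The class name SaxEventDemo, the RecordingHandler helper and the sample XML are made up for this example; only SAXParserFactory, SAXParser and DefaultHandler come from the standard library.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.helpers.DefaultHandler;

public class SaxEventDemo {

    // Hypothetical handler: records every event it receives as a string.
    static class RecordingHandler extends DefaultHandler {
        final List<String> events = new ArrayList<>();

        @Override
        public void startElement(String uri, String localName, String qName, Attributes attrs) {
            events.add("start:" + qName);
        }

        @Override
        public void endElement(String uri, String localName, String qName) {
            events.add("end:" + qName);
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            // The parser may deliver text in several chunks; this sketch
            // simply records each non-blank chunk as it arrives.
            String text = new String(ch, start, length).trim();
            if (!text.isEmpty()) {
                events.add("text:" + text);
            }
        }
    }

    // Parses the XML string and returns the events in the order they fired.
    static List<String> parse(String xml) throws Exception {
        SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
        RecordingHandler handler = new RecordingHandler();
        parser.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)), handler);
        return handler.events;
    }

    public static void main(String[] args) throws Exception {
        String xml = "<books><book id=\"1\">Dune</book></books>";
        for (String event : parse(xml)) {
            System.out.println(event);
        }
    }
}
```

Note that nothing is kept in memory unless the handler stores it, which is why the handler has to build its own list here.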

-Event based parser (sequence of events).
-SAX parses the file as it reads it, i.e. node by node.
-Low memory footprint, since it does not keep the XML content in memory.
-SAX is read only, i.e. it can’t insert or delete nodes.
-Use a SAX parser when the XML content is large.
-SAX reads the XML file from top to bottom; backward navigation is not possible.
-Generally faster at run time than DOM.
Pros
  • event based
  • memory efficient
  • faster than DOM
  • supports schema validation
Cons
  • no object model; you have to tap into the events and build your own
  • single forward-only pass over the XML
  • read only API
  • no XPath support
  • a little harder to use


2. DOM

In DOM (Document Object Model), no events are triggered while parsing. The entire XML document is parsed, and a DOM tree of its nodes is built and returned. Once parsing is done, you can navigate the tree to access the data held in the various nodes.
-Tree model parser (object based, tree of nodes).
-DOM parses the whole file and builds the complete tree in memory before you can use it.
-Has memory constraints, since the whole XML document is held in memory.
-DOM is read and write (can insert or delete nodes).
-If the XML content is small, prefer a DOM parser.
-Both backward and forward navigation are possible when searching for tags and evaluating the information inside them, which makes navigation easy.
-Generally slower at run time than SAX.
Pros
  • in-memory object model
  • preserves element order
  • bi-directional
  • read and write api
  • XML manipulation
  • simple to use
  • supports schema validation
Cons
  • memory hog for larger XML documents (typically used for XML documents smaller than 10 MB)
  • slower
  • generic model, i.e. you work with generic Node objects rather than your own types
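
To make the tree navigation and read/write points above concrete, here is a minimal sketch using the JDK's built-in DOM parser. The class name DomTreeDemo and the sample XML are made up for this example; DocumentBuilderFactory, Document, Element and Node are the standard library types.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

public class DomTreeDemo {

    // Parses the whole XML string into an in-memory DOM tree.
    static Document parse(String xml) throws Exception {
        DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
        return builder.parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    public static void main(String[] args) throws Exception {
        // No whitespace between elements, so siblings are elements only.
        Document doc = parse("<books><book>Dune</book><book>Emma</book></books>");
        Element root = doc.getDocumentElement();

        // Forward navigation: jump straight to the second <book>.
        NodeList books = root.getElementsByTagName("book");
        Node second = books.item(1);

        // Backward navigation: step back to the first <book>.
        Node first = second.getPreviousSibling();
        System.out.println(first.getTextContent()); // prints "Dune"

        // Write access: add one node and delete another.
        Element added = doc.createElement("book");
        added.setTextContent("Ulysses");
        root.appendChild(added);
        root.removeChild(first);

        System.out.println(root.getElementsByTagName("book").getLength()); // prints 2
    }
}
```

Everything here works on generic Node and Element objects, which is the "generic model" trade-off listed in the cons.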
