How can I store XML data into a database?

Bruce Martin

Lots of jGuru members contributed to the answer.

Mark Penningroth writes:

There are a number of ways to do this, but it depends on what you need. <bl>

  • Store whole document in a blob.
  • If you know the structure of the doc (DTD etc.), you could define a schema based on the info and then use DOM or SAX to extract the data and insert/update the database. (There are tools that help. See Ceriumworks)
  • You could look at OO databse extensions. eXcelon and Poet are a couple that have done a lot of work here
  • </bl>

    Pavel Tavoda writes:

    Some database support storing and retrieving data natively. Now Oracle and DB2 support XML as native format and also as query language (XQL). Support for other database is on way. For storing specific XML datas is best way parse XML document with some DOM parser (e.g. IBM, SUN, Oracle, ...) to memory and store it to database how you like and back. Because if you are storing data you must deal with some problems e.g. flat storing, recursion, structure parsing, datatype recognision, formating which are not handled always right way in standard products.

    Thijs Stalenhoe writes:

    There are several possible answers to this questions: <bl>

  • You could use a standard relational database, but then it will be quite tricky to maintain the hierarchical nature of XML documents.
  • You can use an Object Oriented database such as Ozone (which is written in Java) to store the XML. This will preserve the hierarchy.
  • There are specialized databases for XML storage available. One of them is www.dbxml.org (not yet released) but there are many others. www.xml.com has several listed.
  • Loads of companies have jumped on the XML bandwagon, each with their own unique ideas, features and quirks. My advice would be to look well at all that is being offered, evaluate some and then pick what's best for the project you are working on.
  • </bl> Later Thijs adds:

    This is a tricky question. You could of course use a simple SQL database and have a table which takes just two entries, for instance:

    "id" and "xml-doc"

    You would then get it back by just doing a simple SQL query on "id" and returning "xml-doc". This however does not give you much flexibility. You can't really have the database return certain tags and/or parts of a given XML-document without first grabbing the whole thing. What if I wanted to get every <PERSON> tag and its contents out of each document? The amount of SQL queries would be absurd.

    A second way of storing XML in a SQL-db is to try and preserve the hierarchical structure of the XML by using multiple tables linked together by keys.

    This works fine for preserving a single type of XML document. It becomes cumbersome however once you start using loads of different types of documents, as for every type of document you would need to create an (almost) entirely new set of linked tables.

    Of course you could go out and buy an XML store like eXcelon (which isn't really a db) or Tamino (which is quite expensive) but there is another way:
    Store your XML in an LDAP database. LDAP uses an hierarchical model for storing information, which maps nicely to and from XML. It may seems weird at first, but it works and the implementation of such a scheme is pretty straightforward. Once done you can easily implement something like an XPath processor (which btw, maps nicely to LDAP queries) to select various parts of your XML document.

    Also see, http://www.jguru.com/jguru/faq/view.jsp?EID=64722

    jguru m writes:

    Oracle iFS is will allow you to store xml documents in its file system and it will map the xml data to tables so that you can do faster searching and indexing.