The History of XML Databases: Information Technology Essay
A persistent data structure is a data structure that keeps both the previous and the current versions of its data, and all of these versions can be queried. An XML database is a data persistence software system that allows data to be stored in XML format. This data can be queried or rendered into whatever format suits a particular purpose.
XML-enabled databases (XEDB) and native XML databases (NXD) are the two major types of XML databases. The former accepts XML as input and renders XML as output, relying on the database itself to perform the conversion. The latter, in contrast, uses XML documents as the fundamental unit of storage. A third variety is the hybrid XML database (HXD), which can be treated as either an NXD or an XEDB, depending on the requirements of the application.
An XML database contains ‘collections’, each of which is a group of XML documents, and a single XML database can contain many collections. The XML Path Language (XPath) is the query language used to query documents or collections of documents. XSLT (Extensible Stylesheet Language Transformations) is the language used for transforming documents retrieved from the database into other formats, including plain text, XML, or HTML.
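As a rough illustration of querying a collection with path expressions, the following sketch uses Python's standard xml.etree.ElementTree module, which implements only a limited subset of XPath. The collection contents, the document names, and the helper query_collection are made up for this example and do not come from any particular XML database product.

```python
import xml.etree.ElementTree as ET

# Hypothetical collection: document name -> XML text (all content is invented).
COLLECTION = {
    "a.xml": "<article><section id='intro'><title>Why XML</title></section>"
             "<section id='body'><title>Storage</title></section></article>",
    "b.xml": "<article><section id='intro'><title>Queries</title></section></article>",
}


def query_collection(collection, path):
    """Evaluate a path expression on every document; return (doc name, text) pairs."""
    hits = []
    for name, xml_text in collection.items():
        root = ET.fromstring(xml_text)
        # findall returns the matching elements in document order.
        for node in root.findall(path):
            hits.append((name, node.text))
    return hits


# Every section title in the collection.
all_titles = query_collection(COLLECTION, "./section/title")
# Only the titles of sections whose id attribute is 'intro'.
intro_titles = query_collection(COLLECTION, "./section[@id='intro']/title")
print(all_titles)
print(intro_titles)
```

A real XML database would evaluate such expressions against indexed storage rather than re-parsing each document, but the shape of the query is the same.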
History of XML databases:
For three decades, application developers have relied on relational databases as the bedrock of a persistent data storage layer. While the technology is mature, today’s requirements are becoming more complex, and relational databases may not be the right tool for the job in hand. If developers have (unknowingly) been dumbing down their data and creating more work for themselves for many years, XML databases offer an eye-opening new approach to storing and retrieving data.
XML databases came into existence around 2000, and many startups were founded around them. An XML database can organize data whether or not that data is structured, which significantly reduces the complexity of arranging and storing it. XML databases are meant for managing semi-structured data. An XML database is one that treats XML documents and elements as the fundamental structures, rather than tables, records, and fields. Such a database enables developers to use tools and languages that more naturally fit the structure of the documents they’re working with, thereby enhancing productivity. It is also widely believed (if not exactly proven) that XML databases can significantly outperform traditional relational databases for tasks that involve heavy document processing, such as newspaper publishing, Web site management, and Web services.
Why XML databases are needed when existing databases are available:
Relational databases in general, and SQL databases in particular, have been so incredibly successful that they’ve almost completely eliminated the competition, at least in mind share if not always in actual installations. (A lot of data is still locked up in hierarchical, big iron databases like IMS™, and quite a bit more is stored in lower-end, non-SQL databases like FileMaker.) However, although relational databases fit a lot of problems very well, they don’t really fit XML documents, at least not in their full generality. While you can shred an XML document enough to stuff it into a relational table or just treat it as one big blob, neither approach really lends itself to indexing and fast queries. In practice, shredding also tends to lead to the loss of details like element order, processing instructions, comments, white space, and other elements that are important in many applications in which XML documents don’t look exactly like serialized tables in the first place. Field and record boundaries just don’t match the boundaries of an XML document. Applications such as publishing systems that care about these details need to look beyond the relational database for their information storage needs, and this is where XML databases come in.
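The information loss caused by shredding can be seen in a small sketch. The shred function below is a deliberately naive, hypothetical mapping (one column per child tag), not the method of any real product, and the sample document is invented. It is written in Python with the standard xml.etree.ElementTree module.

```python
import xml.etree.ElementTree as ET

# Invented sample: mixed content, where text and elements are interleaved.
DOC = "<step>Press <key>Ctrl</key> and <key>S</key>, then choose <menu>Close</menu>.</step>"


def shred(element):
    """Naively shred an element into a record: one column per child tag.

    The text between the children and the interleaving of different child
    tags are discarded, which is the kind of loss described above.
    """
    record = {}
    for child in element:
        record.setdefault(child.tag, []).append(child.text)
    return record


root = ET.fromstring(DOC)
record = shred(root)
full_text = "".join(root.itertext())
print(record)     # the relational-style view of the document
print(full_text)  # what the document actually says
```

The record still holds the key and menu names, but the sentence that gave them meaning cannot be rebuilt from it.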
Comparison between Relational Databases and XML Databases:
A relational database contains tables.
An XML database contains collections.
A relational table contains records with the same schema.
A collection contains XML documents with the same schema.
A relational record is an unordered list of named values.
An XML document is a tree of nodes.
A SQL query returns an unordered set of records.
An XQuery returns an ordered sequence of nodes.
Key issues unresolved by current technology:
Relational database systems are readily available and can handle vast amounts of data very efficiently. Taking advantage of this, algorithms have been designed specifically to operate on XML data mapped to relational tables. Security is another concern: in an outsourced XML database service model, organizations rely on the premises of external service providers for the storage and retrieval of their XML data. Data confidentiality, user and data privacy, query assurance, secure auditing, and a secure and efficient storage model are all major causes of concern.
Does it truly represent a shift in technology?
Is the RDBMS a competitor to XML databases? The answer is definitely no. A few questions need to be considered when comparing the two kinds of database.
1. Which fits your application’s needs more closely?
2. How large a data set do you need to handle?
3. Are you transferring data between applications or are you going to query it?
We use an RDBMS if we have large data processing and querying needs. We use XML if we need to export data or transfer it between applications. An RDBMS is for storing large amounts of data in a consistent way, and it takes care of the consistency of that data. XML can be used for data exchange between different computer systems, for instance, but it should not be used to store large amounts of data over a long period of time. XML by itself does not take care of data consistency the way an RDBMS does, nor does it take care of transactions.
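To make the point about consistency and transactions concrete, here is a minimal sketch using Python's built-in sqlite3 module as a stand-in for an RDBMS. The account table, the names in it, and the transfer helper are assumptions made up for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The CHECK constraint is a consistency rule enforced by the database itself.
conn.execute(
    "CREATE TABLE account (name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money in one transaction; return False if the database rejects it."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute("UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst))
            conn.execute("UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account ORDER BY name"))


failed = transfer(conn, "alice", "bob", 150)   # would overdraw alice
after_failed = balances(conn)
succeeded = transfer(conn, "alice", "bob", 60)
after_success = balances(conn)
print(failed, after_failed)
print(succeeded, after_success)
```

When the second update violates the constraint, the first update is rolled back as well, so the data never ends up half changed. A plain XML file offers nothing comparable without extra application code.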
Are there any disadvantages of an XML database?
Yes! There are indeed a few disadvantages of XML databases. XML databases are easy to create but have disadvantages for users and software support.
(1) XML databases run slower:
XML documents are verbose. XML requires every opening and closing markup tag to be present in order to work properly. An XML database built from XML documents therefore requires data compression to run quickly. Because XML documents and databases are text based, there is also more information to maintain than if the data were simply stored as cell values.
(2) XML searches are slow:
XML has slower querying and searching functionality than other databases. A search must sort through the text-based information as well as the tags, which is slower than searching only the cell contents in a relational database. XML documents are built into databases as document trees, and a search must go through all branches of the tree before completing, unless the query is written to visit only the nodes relevant to the search.
(3) Difficulty with XML database conversion:
XML is not as widely accepted as a database tool as it is for document encoding. As a result, fewer database tools can handle XML than other database formats. XML is hierarchical, while most other databases are relational, so XML databases may need to be restructured before being converted.
(4) XML limitations as a database:
XML is designed for free-form document creation. While XML documents can be kept indefinitely, XML databases created from those documents are not designed for long-term data storage. XML can be set up with a defined schema or rules. However, XML by itself does not enforce a defined schema; conformance is checked only when a validating parser or the database is set up to do so.
XML databases do not have referential integrity to ensure that data stays where it was placed for storage, which can cause data references to be lost. If a document tree within the database is changed, it will not generate error messages when database references are broken.
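Because broken references raise no error on their own, an application has to look for them itself. The sketch below shows one possible check in Python using the standard xml.etree.ElementTree module. The sample document, the convention that an author attribute refers to another element's id attribute, and the function dangling_references are all assumptions made for this illustration.

```python
import xml.etree.ElementTree as ET

# Invented sample: book b2 refers to an author element that does not exist.
DOC = """
<library>
  <author id="a1">Ann</author>
  <book id="b1" author="a1">XML Basics</book>
  <book id="b2" author="a2">Lost Reference</book>
</library>
"""


def dangling_references(root, ref_attr="author"):
    """Return (element id, missing target) pairs for references with no matching id."""
    known_ids = {el.get("id") for el in root.iter() if el.get("id") is not None}
    broken = []
    for el in root.iter():
        target = el.get(ref_attr)
        if target is not None and target not in known_ids:
            broken.append((el.get("id"), target))
    return broken


root = ET.fromstring(DOC)
broken = dangling_references(root)
print(broken)
```

In a relational database a foreign key constraint would reject the second book outright; here the problem is only found if someone runs the check.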
(5) XML disadvantages in data security:
One of the disadvantages of XML is that it requires the entire data set to be loaded into the database before it can be viewed, so it cannot be checked in part without loading the entire database. XML does not offer role-based security like other database applications. It cannot be set up to limit who can add, delete or change data.
XML databases can set security permissions based on containers, but once a user has permission to access a container, he can view all information stored within it. Access controls can be tightened by creating subcontainers and limiting permissions to subcontainers, but this increases the amount of work required to set up and then maintain access control.
Why Are Companies Using XML?
XML as Document Content is Different:
It isn’t about format. It isn’t for the convenience of printers, compositors, or designers. It doesn’t make any (one) thing easier. It makes many things more difficult. Content Creators and Publishers want it – for their own reasons.
XML is everywhere:
In some circles, XML Web Services are all the rage. Bank transactions are in XML. e-Commerce and e-Business happen in XML. Digital cameras create XML headers on images. Printers use XML for job control. State troopers record traffic warrants in XML.
View of an XML System: (figure not reproduced)
Why a Business Wants XML in Publishing:
Repurposing and new products.
Customization and internationalization.
Multiple products from one source.
Protect content investment.
Real-world applications of XML databases, with examples of companies using them:
(1) Catalog Data:
Elsevier Science is a publisher of scientific, technical, and medical information. It uses Mark Logic’s Content Interaction Server to manage more than two terabytes of data: five million full-text journal articles, 60 million citations and abstracts, thousands of complete books, and five thousand informational pamphlets. The system is used to search and transform documents.
(2) Medical information storage:
Amirsys uses Ipedo XML Store to manage descriptions of radiology diagnostic cases, image data, and data used to drive a document editor. The editor is designed for writing books about radiology and contains features such as queries on existing descriptions and the ability to insert images. Managers can query across documents, such as to track the progress of authors or check that elements are used in the same way by all authors.
(3) Corporate information portals:
An unnamed bank uses Xyleme Zone Server as the basis for an equity research portal, which is composed of multiple sub-portals. It is used by roughly 10,000 employees inside the bank and more than 30,000 customers outside the bank, who perform both contextual and full-text searches. Analysts add thousands of documents daily. As these are loaded, a separate engine queries them and notifies users of any changes of interest. Latency between loading and user notification is “a couple of minutes”.
The Tasmanian Government uses TeraText DBS to power a Web site that allows users to track Tasmanian legislation. A single piece of legislation is stored as a series of time-stamped fragments; this allows users to track changes to the legislation over time. In addition, links in the documents refer to fragments also stored in the database, allowing them to be implemented as queries on the database.
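The idea of storing a piece of legislation as time-stamped fragments, so that its state at any date can be retrieved, can be sketched as follows. This is only an illustration of the general technique in Python and says nothing about how TeraText DBS works internally. The VersionedFragment class, the dates, and the fragment text are hypothetical.

```python
from bisect import bisect_right


class VersionedFragment:
    """Keeps every version of one fragment, ordered by the date it took effect."""

    def __init__(self):
        self._times = []  # sorted effective dates (ISO strings sort chronologically)
        self._texts = []  # fragment text for the date at the same index

    def add_version(self, effective, text):
        # Insert while keeping both lists sorted by effective date.
        index = bisect_right(self._times, effective)
        self._times.insert(index, effective)
        self._texts.insert(index, text)

    def as_of(self, when):
        """Return the text in force on `when`, or None if no version existed yet."""
        index = bisect_right(self._times, when)
        return self._texts[index - 1] if index else None


section = VersionedFragment()
section.add_version("1998-01-01", "<section>Fee is $10.</section>")
section.add_version("2004-07-01", "<section>Fee is $15.</section>")
print(section.as_of("2001-03-15"))
print(section.as_of("1990-01-01"))
```

Old versions are never overwritten, so a question such as "what did this section say in 2001?" becomes an ordinary lookup.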
(5) Document management systems:
Le Monde uses Xyleme Zone Server to manage an archive holding more than 800,000 documents and using 6 gigabytes of storage. The archive is used by employees, partners, and customers.
Products are divided into six categories:
XML-Enabled Web Servers.
Content Management Systems.
Persistent DOM Implementations.
Vendors providing XML databases:
Apache Software Foundation (Xindice), Bluestream Database Software Corporation (XStreamDB), Cognetic Systems, Inc. (XQuantum), Coherity, Inc. (Coherity XML Database), data ex machina GmbH (Natix), EMC Corporation (Documentum XML Store), Endpoint Systems (Figaro, a .NET version of the Oracle Berkeley DB), Oracle Corporation (Oracle Berkeley DB, formerly Sleepycat Berkeley DB XML), etc.
Impact of XML databases:
Today we are living in the information age. Businesses talk to each other via complex XML data structures, with SOAP and RESTful Web Services becoming ever more popular means of information exchange between disparate applications and systems.
The XML messages exchanged are by nature hierarchical and deeply tree-structured. Sometimes the data is unpredictable, and sometimes the structure is prone to change at any time, so developers trying to map this data to a relational structure may find their lives becoming more and more difficult.
XML databases offer the same functionality as object databases: data is structured in a hierarchical manner, except that XML databases store XML documents instead of objects. While in principle this is the same concept of data storage, XML databases have the added benefit of being able to exchange the data in its native format, which suits today’s requirements.
Where object databases have Object Query Language (OQL), XML databases have XQuery, which is a W3C standard. XQuery covers the major functionality of earlier language proposals such as XML-QL, XQL, OQL, and the SQL standard.
Major players in this area:
The XML:DB Initiative works to bring the XML database industry together so that XML databases can make it into the standard toolset used by IT departments worldwide. XML:DB is an industry initiative formed by SMB GmbH, the dbXML Group L.L.C, and the OpenHealth Care Group. It is also supported by a growing list of organizations with an interest in XML and XML databases.
The XML:DB Initiative’s long-term goals can be summarized as:
Development of technology specifications for managing the data in XML Databases
Contribution of reference implementations of those specifications under an Open Source License
Formation of a community where XML database vendors and users can ask questions and exchange information to learn more about XML database technology and applications.
Evangelism of XML database products and technologies to raise the visibility of XML databases in the marketplace
The market and technological forecast for the area over the next few years:
XML databases are used in the real world — most commonly for managing documents, integrating data, and managing semi-structured data. What is important about these uses is that most represent cases where people have tried to use relational or other types of databases and have either failed or written less sophisticated applications than they would like. Native XML databases have succeeded because of their query languages (most notably XQuery, but also XML-aware full-text queries), the flexibility of the XML data model, and their ability to handle schemaless data.
So is an XML database in your future?
This question is best answered by quoting Arun Gaikwad. In the article “XML database from Apache”, he wrote: “A XML database is something which you may think is unnecessary but once you start using it, you wonder how you would survive without it.”
Every technology has its pros and cons, and XML databases are no exception, as the sections above show. Considering all the information presented, I conclude that “for a new project which deals with XML and/or unpredictable data, choosing to use a relational database will not stop the project in its tracks, but a great deal of time will be wasted on trivial matters that could be easily solved by making use of an XML database instead”.