XQUERY,XPATH,XMLSCHEMA,XML INDEX



(XQUERY,XPATH,XMLSCHEMA,XML INDEX)


Ø  XQuery is the language for querying XML data.


   Ø  XPath was designed to navigate an XML document to retrieve the documents elements and attributes.


     Ø  XQuery is built on XPath expressions.

Ø To put it simple XQuery,XPath are used to traverse through XML document/fragment to fetch/modify attribute values or element nodes.



XQuery FLWOR Expression:


  • for - (optional) binds a variable to each item returned by the in expression
  • let - (optional) to assign value to variable
  • where - (optional) specifies a criteria
  • order by - (optional) specifies the sort-order of the result
  • return - specifies what to return in the result.


Ø  The xml data type features several built-in methods that allow you to manipulate XML instance data. 

These methods allow you to query, modify, or shred your XML data into relational form.


query()   -The query() method allows you to perform an XQuery on your xml instance.
The result returned is untyped XML. 

Syntax: DbObject.query('XQuery')


value()   - The value() method allows you to perform an XQuery on your xml instance and returns a scalar value cast to a SQL Server data type.

Syntax: DbObject.value('XQuery', 'SqlType')

               
exist() - The exist() method allows you to specify an XQuery on your xml instance and returns a SQL bit value of 1 if the XQuery returns a result, 0 if the XQuery returns no result, or NULL if the xml instance is NULL.

Syntax :DbObject.exist('XQuery')



modify()                 - The modify() method allows you to execute XML Data Manipulation Language (XML DML) statements against an xml instance. The modify() method can only  be used with a SET clause or statement.

Syntax :DbObject.modify('XQuery')


nodes() - The nodes() method allows you to shred xml instances. Shredding is the process of converting your XML data to relational form.

Syntax :DbObject.nodes('XQuery') AS TableAlias(ColumnAlias)



XML DML Keywords:

Delete - Deletes the node specified by an XQuery path expression.

Insert -  Inserts one or more nodes as the children or siblings of a node specified
by an XQuery path expression.

replace value of    -  Updates the value of a node specified by an XQuery path.



--Examples for XQUERY & XPATH

--lets us  create sample data


USE TEST_XML
GO



CREATE TABLE College_Master 
(College_ID INT NOT NULL, 
College_Name VARCHAR(20),
College_Details XML)



INSERT INTO College_Master VALUES (1,'BCC UNIVERSITY','<STUDENTINFO>
  <student ID="1" name="Sathya">
    <subject ID="1" Name="Electronics and Communication" />
    <subject ID="2" Name="Circuit Analysis" />
    <subject ID="3" Name="Mobile Communication" />
  </student>
  <student ID="2" name="Deepak">
    <subject ID="4" Name="Data Structure" />
    <subject ID="5" Name="Java" />
    <subject ID="6" Name="Database Management System" />
  </student>
  <student ID="3" name="sathish">
    <subject ID="7" Name="Soil Mechanics" />
    <subject ID="8" Name="Steel Design" />
    <subject ID="9" Name="Concrete Design" />
  </student>
</STUDENTINFO>')


INSERT INTO College_Master VALUES (2,'ICC UNIVERSITY','<STUDENTINFO>
  <student ID="1" name="Karthi">
    <subject ID="1" Name="Electronics and Communication" />
    <subject ID="2" Name="Circuit Analysis" />
    <subject ID="3" Name="Mobile Communication" />
  </student>
  <student ID="2" name="Vikram">
    <subject ID="4" Name="Data Structure" />
    <subject ID="5" Name="Java" />
    <subject ID="6" Name="Database Management System" />
  </student>
  <student ID="3" name="Anandh">
    <subject ID="7" Name="Soil Mechanics" />
    <subject ID="8" Name="Steel Design" />
    <subject ID="9" Name="Concrete Design" />
  </student>
</STUDENTINFO>')



SELECT * FROM College_Master

--query() & value() examples:

--query() returns typed XML

SELECT College_Details.query('/STUDENTINFO/student') FROM College_Master
WHERE College_ID = 1;

--value() returns single value of mentioned data type

SELECT College_Details.value('(/STUDENTINFO/student/@name)[1]','varchar(20)') FROM College_Master
WHERE College_ID = 1;


--XPath examples :

--to get all below nodes from root node - STUDENTINFO
 SELECT College_Details.query('/STUDENTINFO') FROM College_Master
WHERE College_ID = 1;

--to get all  element nodes named 'student'
 SELECT College_Details.query('//student') FROM College_Master
WHERE College_ID = 1;

-- to get student node with ID = 1
 SELECT College_Details.query('//student[@ID = 1]') FROM College_Master
WHERE College_ID = 1;

--to get student node at position = 1 from root node
 SELECT College_Details.query('/STUDENTINFO/student[position()=1]') FROM College_Master
WHERE College_ID = 1;

-- to get student node at last position  from root node
 SELECT College_Details.query('/STUDENTINFO/student[last()]') FROM College_Master
WHERE College_ID = 1;

--to return  unknown nodes from root node - STUDENTINFO
 SELECT College_Details.query('/STUDENTINFO/*') FROM College_Master
WHERE College_ID = 1;

--to return unknown nodes from XML column - College_Details
 SELECT College_Details.query('//* ') FROM College_Master
WHERE College_ID = 1;


--XQuery FLWOR example:

--from resultset of $i ,assigning it to variable $j using let clause,
Getting top two students by using where & order by clause.

SELECT College_Details.query('for $i in /STUDENTINFO/student
let $j:= $i
where ($j/@ID) <= 2
order by ($j/@ID)[1] ascending
return $j')
FROM College_Master
WHERE College_ID = 1;



--Example:  
                for XQuery built-in functions & operators:

--    Sql:column can be used to access non-xml columns in a relational table.

--Example:

SELECT College_Details.query ('
<College_Master>
<College_Details> {
sql:column("College_ID"),
sql:column("College_Name")
} </College_Details>
</College_Master>')
FROM College_Master
WHERE College_ID = 1;


-- Sql:variable can be used to use sql variables inside query()

--Example: 
DECLARE @ID INT = 1;
SELECT College_Details.query('/STUDENTINFO/student[@ID = sql:variable("@ID")]') FROM College_Master
WHERE College_ID = 1;

--string()
 SELECT College_Details.query('string((/STUDENTINFO/student[1]/@name)[1])') FROM College_Master
WHERE College_ID = 1;


--data()
 SELECT College_Details.query('data(//student/@name)') FROM College_Master WHERE College_ID = 1;

--Operators examples:

Value comparison - Value comparison operators help compare atomic values.

Example: checking first student id = 1

SELECT College_Details.query('(/STUDENTINFO/student/@ID)[1] eq 1') 
FROM College_Master
WHERE College_ID = 1


SELECT College_Details.query('(/STUDENTINFO/student/@ID)[1] eq 1') 
FROM College_Master
WHERE College_ID = 1

Node comparison- node comparison operator is to compare two nodes to determine if they represent the same node or not.

Example:comparing nodes

SELECT College_Details.query('
   if ( (/STUDENTINFO/student)[1] is (//student)[1] )
   then
      <Result>Nodes are equal</Result>
   else
      <Result>Nodes are not equal</Result>
') as Result
FROM College_Master
WHERE College_ID = 1


SELECT College_Details.query('
   if ( (/STUDENTINFO/student)[1] is (//subject)[1] )
   then
      <Result>Nodes are equal</Result>
   else
      <Result>Nodes are not equal</Result>
') as Result
FROM College_Master
WHERE College_ID = 1



--XML DML examples

--modify() is used for insert/update/delete operations

--to insert 'Remarks' node after first 'student' node

UPDATE College_Master
SET College_Details.modify('
  insert(<Remarks> Passed  </Remarks>)
  after(/STUDENTINFO/student)[1]')
WHERE College_ID = 1;

SELECT * FROM College_Master WHERE College_ID = 1;

--to update 'Remarks' node value & also notice that  text() is used to fetch text value of node.

UPDATE College_Master
SET College_Details.modify('
  replace value of (/STUDENTINFO/Remarks/text())[1]
  with "Passed with Distinction" ')
WHERE College_ID = 1;

SELECT * FROM College_Master WHERE College_ID = 1;

--to delete 'Remarks' node

UPDATE College_Master
SET College_Details.modify('delete(/STUDENTINFO/Remarks)[1]')
WHERE College_ID = 1;

SELECT * FROM College_Master WHERE College_ID = 1;


--exist() example:

--returns 1 if exists

SELECT College_Details.exist('/STUDENTINFO/student[@name = "Sathya"]') FROM College_Master
WHERE College_ID = 1;

--returns 0 if not exists

SELECT College_Details.exist('/STUDENTINFO/student[@name = "sathya"]') FROM College_Master
WHERE College_ID = 1;

--nodes() used for shredding XML data into columns of relational table

Examples:

--to shred StudentID, StudentName into Student table from XML column.

CREATE TABLE Student_Master (ID INT,NAME NVARCHAR(30))

INSERT INTO Student_Master
SELECT
 tab.col.value('@ID[1]', 'INT') StudentID
,tab.col.value('@name[1]', 'NVARCHAR(30)') StudentName
FROM College_Master CROSS APPLY College_Details.nodes('/STUDENTINFO/student') tab(col)
WHERE College_ID = 1

SELECT * FROM Student_Master

--to shred SubjectID, SubjectName into Subject table from XML column.


CREATE TABLE Subject_Master (ID INT,NAME NVARCHAR(30))

INSERT INTO Subject_Master
SELECT
 tab.col.value('@ID[1]', 'INT') SubjectID
,tab.col.value('@Name[1]', 'NVARCHAR(30)') SubjectName
FROM College_Master CROSS APPLY College_Details.nodes('/STUDENTINFO/student/subject') tab(col)
WHERE College_ID = 1

SELECT * FROM Subject_Master





Using OPENXML is another approach for shredding XML data into columns of relational table

Example:


DECLARE @DocHandle int
DECLARE @XmlDocument nvarchar(1000)

SET @XmlDocument = N'<STUDENTINFO>
  <student ID="1" name="Sathya">
    <subject ID="1" Name="Electronics and Communication" />
    <subject ID="2" Name="Circuit Analysis" />
    <subject ID="3" Name="Mobile Communication" />
  </student>
  <student ID="2" name="Deepak">
    <subject ID="4" Name="Data Structure" />
    <subject ID="5" Name="Java" />
    <subject ID="6" Name="Database Management System" />
  </student>
  <student ID="3" name="sathish">
    <subject ID="7" Name="Soil Mechanics" />
    <subject ID="8" Name="Steel Design" />
    <subject ID="9" Name="Concrete Design" />
  </student>
</STUDENTINFO>'

EXEC sp_xml_preparedocument @DocHandle OUTPUT, @XmlDocument

SELECT * FROM OPENXML (@DocHandle, '/STUDENTINFO/student',1)
WITH (ID INT,
name VARCHAR(30))

EXEC sp_xml_removedocument @DocHandle
 
 
 
Ø  The sp_xml_preparedocument stored procedure stores the prepared XML document in SQL Server's internal cache. 

Ø  Calling the sp_xml_removedocument stored procedure is necessary in order to remove the prepared XML document from the cache. 

Ø  When OPENXML approach is used,according to SQL Server 2005 Books on Line, one-eighth of the total memory available to SQL Server may be used by the MSXML parser.


XML SCHEMA COLLECTION:


XML Schema is used to
                       validate structure of XML,
                       validate  data types and
                       to perform restriction on values.

To  associate XSD schemas with a variable or a column of xml type use  XML schema collection

If  a variable or a column is associated with XML schema then it called typed xml,if not it is called untyped XML.


o    CREATE XML SCHEMA COLLECTION - Imports schema components into a database.
o    ALTER XML SCHEMA COLLECTION - Modifies the schema components in an existing XML schema collection.
o    DROP XML SCHEMA COLLECTION - Deletes a complete XML schema collection and all its components.
  

--Examples:

--you can create xml schema,typing it manually  or you generate it automatically with help of various tools.

--step 1: create xml schema

CREATE XML SCHEMA COLLECTION Studentinfo_studentID1_Schema
    AS
    '<xs:schema attributeFormDefault="unqualified" elementFormDefault="qualified" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="STUDENTINFO">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="student" maxOccurs="3" minOccurs="0">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="subject" maxOccurs="unbounded" minOccurs="0">
                <xs:complexType>
                  <xs:simpleContent>
                    <xs:extension base="xs:string">
                      <xs:attribute type="xs:byte" name="ID" use="optional"/>
                      <xs:attribute type="xs:string" name="Name" use="optional"/>
                    </xs:extension>
                  </xs:simpleContent>
                </xs:complexType>
              </xs:element>
            </xs:sequence>
            <xs:attribute type="xs:byte" name="ID" use="optional"/>
            <xs:attribute type="xs:string" name="name" use="optional"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>'


--step 2: create table with xml column registered with xml schema

CREATE TABLE XML_Schema_test
(
  ID INT PRIMARY KEY,
  College_Details XML(Studentinfo_studentID1_Schema)
);

--step 3:insert with 4 <student> element nodes

--it will throw error because in schema,we have mentioned that max occurance of <student> = 3
--  " <xs:element name="student" maxOccurs="3" minOccurs="0"> "

INSERT INTO XML_Schema_test SELECT 1,
 '<STUDENTINFO>
  <student ID="1" name="Sathya">
    <subject ID="1" Name="Electronics and Communication" />
    <subject ID="2" Name="Circuit Analysis" />
    <subject ID="3" Name="Mobile Communication" />
  </student>
  <student ID="2" name="Deepak">
    <subject ID="4" Name="Data Structure" />
    <subject ID="5" Name="Java" />
    <subject ID="6" Name="Database Management System" />
  </student>
  <student ID="3" name="sathish">
    <subject ID="7" Name="Soil Mechanics" />
    <subject ID="8" Name="Steel Design" />
    <subject ID="9" Name="Concrete Design" />
  </student>
  <student ID="4" name="Sunny">
    <subject ID="7" Name="Soil Mechanics" />
    <subject ID="8" Name="Steel Design" />
    <subject ID="9" Name="Concrete Design" />
  </student>
</STUDENTINFO>
'

-- try again with 3 <student> element nodes

INSERT INTO XML_Schema_test SELECT 1,
 '<STUDENTINFO>
  <student ID="1" name="Sathya">
    <subject ID="1" Name="Electronics and Communication" />
    <subject ID="2" Name="Circuit Analysis" />
    <subject ID="3" Name="Mobile Communication" />
  </student>
  <student ID="2" name="Deepak">
    <subject ID="4" Name="Data Structure" />
    <subject ID="5" Name="Java" />
    <subject ID="6" Name="Database Management System" />
  </student>
  <student ID="3" name="sathish">
    <subject ID="7" Name="Soil Mechanics" />
    <subject ID="8" Name="Steel Design" />
    <subject ID="9" Name="Concrete Design" />
  </student>
</STUDENTINFO>
'
---try to do the below insert,you will get error,
--because of data type mismatch(datatype mentioned in schema & what we are inserting are different)
-- "  <xs:attribute type="xs:string" name="Name" use="optional"/> "
INSERT INTO XML_Schema_test SELECT 1,
 '<STUDENTINFO>
  <student ID="1" name="Sathya">
    <subject ID="1" Name="Electronics and Communication" />
    <subject ID="2" Name="Circuit Analysis" />
    <subject ID="3" Name="Mobile Communication" />
  </student>
  <student ID="2" name="Deepak">
    <subject ID="4" Name="Data Structure" />
    <subject ID="5" Name="Java" />
    <subject ID="6" Name="Database Management System" />
  </student>
  <student ID="3" name="sathish">
    <subject ID="7" Name="Soil Mechanics" />
    <subject ID="8" Name="Steel Design" />
    <subject ID="Nine" Name="Concrete Design" />
  </student>
</STUDENTINFO>
'

-- try again with matching data type value

INSERT INTO XML_Schema_test SELECT 1,
 '<STUDENTINFO>
  <student ID="1" name="Sathya">
    <subject ID="1" Name="Electronics and Communication" />
    <subject ID="2" Name="Circuit Analysis" />
    <subject ID="3" Name="Mobile Communication" />
  </student>
  <student ID="2" name="Deepak">
    <subject ID="4" Name="Data Structure" />
    <subject ID="5" Name="Java" />
    <subject ID="6" Name="Database Management System" />
  </student>
  <student ID="3" name="sathish">
    <subject ID="7" Name="Soil Mechanics" />
    <subject ID="8" Name="Steel Design" />
    <subject ID="9" Name="Concrete Design" />
  </student>
</STUDENTINFO>
'


 XML index:

Ø  Types: Primary (base) and Secondary (PATH, VALUE, PROPERTY – Non-clustered on the primary)

Ø  Storage: Increases XML storage cost. Estimate as 3 times the storage requirement of XML instance in base table.


Ø  To create the primary XML index, table must have a clustered primary key


Ø  Primary index has 11 columns, including id, nid (node), hid (hierarchy), value, lvalue, PK[n]

Ø  The primary XML index on an XML column is a clustered index on an internal table known as the node table that users cannot use directly from their T-SQL statements.

Ø  The primary XML index is a B+tree and its usefulness is due to the way that the optimizer creates a plan for the entire query.

Ø  The primary XML index essentially contains one row for each node in the XML instance

Ø  Once the primary XML index has been created, an additional three kinds of secondary XML index can be created.

Ø  The secondary XML indexes assist in certain types of XQuery processing. These are called the PATH, PROPERTY, and VALUE indexes.

XML Indexing – Secondary Indexes:
  • Path index - Good for path queries
  • Value index - Optimized to find the value and we don’t know the path
  • Property index - Optimized for name-value pair storage

Examples:


--create primary key to create primary xml index


ALTER TABLE College_Master ADD CONSTRAINT PK_College_ID
PRIMARY KEY CLUSTERED (College_ID);


--primary xml index

CREATE PRIMARY XML INDEX IDX_College_Details ON College_Master(College_Details)
GO



-- below query will display the columns in the primary XML index (node table)

SELECT * FROM sys.columns c
 JOIN sys.indexes i ON i.object_id = c.object_id
 WHERE i.name = 'IDX_College_Details'
 AND i.type = 1


 --create secondary xml index

CREATE XML INDEX IDX_SEC_College_Details ON College_Master(College_Details)
USING XML INDEX IDX_College_Details FOR PATH

--check execution plan of FLWOR expression example query

SELECT College_Details.query('for $i in /STUDENTINFO/student
let $j:= $i
where ($j/@ID) <= 2
order by ($j/@ID)[1] ascending
return $j')
FROM College_Master
WHERE College_ID = 1;



1 comment:

Anonymous said...

Thank you very much! Excellent stuff!!

One thing for the beginners is XML queries are case sensitive.

-Vijay