30 November 2015

Parsing XML to objects in .NET

Introduction

Many projects need to convert XML to objects and back. Instead of parsing XML by hand, which is inefficient and error prone, the object serialization/deserialization support in .NET can achieve the same result more efficiently and elegantly. The approach is generic: it works for any type that the serializer can handle, provided the XML matches the shape the serializer expects for that type.

Approach

The System.Runtime.Serialization namespace provides classes for serializing and de-serializing XML.
The DataContractSerializer class of this namespace can be used for -

  • XML to object conversion (XML Parsing [De-Serialization])
  • Object to XML conversion (Object Serialization)

In addition, the Generics feature of C# is used to make the methods independent of object types:

  • Generics also enhance the type-safety of the code. The actual type needs to be specified only when a method is called.
  • The method for ‘Serialization’ accepts the data to serialize as a parameter of type ‘T’. It can therefore accept an object of any type, and the type does not have to be fixed when the method is declared.
  • The method for ‘De-Serialization’ returns the de-serialized data with the generic return type ‘T’, so it can de-serialize and return an object of any type.

Implementation

1.    Class declaration
1.1. Create a static class named ‘XmlParserDeparser’.
1.2. The class is static because it acts as a utility class; there is no need to create an instance of it to access its helper methods.
1.3. Add the following using directives to the class file:
1.3.1.   using System;
1.3.2.   using System.Collections.Generic;
1.3.3.   using System.IO;
1.3.4.   using System.Runtime.Serialization;
1.3.5.   using System.Text;

2.    XML to object conversion (XML parsing/de-serialization)
2.1. To achieve this conversion, the DataContractSerializer class of System.Runtime.Serialization will be used.
2.2. Declare a static method named XmlToObject<T>.
2.3. Here ‘T’ is the type of the data object to create by de-serializing the XML.
2.4. The method returns the object created from the XML.
2.5. It takes two input parameters -
·         XML as string
·         Known types to include in the serializer (can be null)
2.6. The complete code for the method is presented below:
/// <summary>
/// Deserializes the specified XML string.
/// </summary>
/// <typeparam name="T">The type of the data to deserialise to.</typeparam>
/// <param name="xml">The XML string representing the data to deserialise.</param>
/// <param name="knownTypes">The known types to include in the serializer (may be null).</param>
/// <returns>The deserialised data in object form.</returns>
public static T XmlToObject<T>(string xml, IEnumerable<Type> knownTypes)
{
    // Read the string through a character-based XmlReader, so the result does
    // not depend on the byte encoding named in the XML declaration.
    // System.Xml is not among the using directives above, so XmlReader is
    // fully qualified here.
    using (var stringReader = new StringReader(xml))
    using (var xmlReader = System.Xml.XmlReader.Create(stringReader))
    {
        var serializer = new DataContractSerializer(typeof(T), knownTypes);
        return (T)serializer.ReadObject(xmlReader);
    }
}

2.7. Here a StringReader is created over the XML string and wrapped in an XmlReader. Because the reader works on characters rather than bytes, the result does not depend on the encoding named in the XML declaration.
2.8. An instance of System.Runtime.Serialization.DataContractSerializer is created with type ‘T’ and the collection of knownTypes as input. The latter is optional (can be null).
2.9. The DataContractSerializer object then reads the XML and returns the result typed as object.
2.10. This object is cast to type ‘T’ and returned.
2.11. We can also create an overload of this method with only one parameter, i.e. ‘XML as string’.
2.12. This overload internally calls the above method with ‘knownTypes’ passed as null.
2.13. The code for this overload is as follows:
/// <summary>
/// Deserializes the specified XML string.
/// </summary>
/// <typeparam name="T">The type of the data to deserialise to</typeparam>
/// <param name="xml">The XML string representing the data to deserialise</param>
/// <returns>The deserialised data in object form</returns>
public static T XmlToObject<T>(string xml)
{
    return XmlToObject<T>(xml, null);
}
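
As a usage illustration, the sketch below de-serializes a small XML string into an object. The `Person` type, the `XmlToObjectDemo` class and the sample XML are assumptions made up for this example; they are not part of the original utility. The helper is inlined in a compact form (reading the string through an XmlReader) so that the sample runs on its own. Note that the element names, their namespace and their order have to match the data contract of the target type.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Xml;

// Hypothetical type, used only for this illustration.
// Namespace = "" lets the sample XML omit an xmlns attribute.
[DataContract(Name = "Person", Namespace = "")]
public class Person
{
    [DataMember]
    public int Age { get; set; }

    [DataMember]
    public string Name { get; set; }
}

public static class XmlToObjectDemo
{
    // Compact stand-in for XmlParserDeparser.XmlToObject<T>(string).
    public static T XmlToObject<T>(string xml)
    {
        using (var stringReader = new StringReader(xml))
        using (var xmlReader = XmlReader.Create(stringReader))
        {
            var serializer = new DataContractSerializer(typeof(T));
            return (T)serializer.ReadObject(xmlReader);
        }
    }

    public static void Main()
    {
        // Child elements are listed alphabetically (Age, then Name), which is
        // the order DataContractSerializer expects by default.
        string xml = "<Person><Age>30</Age><Name>Ann</Name></Person>";

        Person person = XmlToObject<Person>(xml);
        Console.WriteLine("{0} is {1}", person.Name, person.Age);
    }
}
```

The type argument is the only place where the concrete type appears; the helper itself stays independent of `Person`.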


3.    Object to XML conversion (Object serialization)
3.1. To achieve this conversion, the DataContractSerializer class of System.Runtime.Serialization will be used.
3.2. Declare a static method named ObjectToXml<T>.
3.3. Here ‘T’ is the type of the data object to convert to XML by serialization.
3.4. The method returns the XML as a string.
3.5. It takes two input parameters -
·         Object of type ‘T’ to convert to XML
·         Known types to include in the serializer (can be null)
3.6. The complete code for the method is presented below:

/// <summary>
/// Serializes the specified input data, of type T, to XML string.
/// </summary>
/// <typeparam name="T">The type of the data to serialise.</typeparam>
/// <param name="data">The data to serialise.</param>
/// <param name="knownTypes">The known types to include in the serializer (may be null).</param>
/// <returns>The data as an XML string.</returns>
public static string ObjectToXml<T>(T data, IEnumerable<Type> knownTypes)
{
    using (var memoryStream = new MemoryStream())
    {
        var serializer = new DataContractSerializer(typeof(T), knownTypes);
        serializer.WriteObject(memoryStream, data);

        // Rewind so the XML just written can be read back from the start.
        memoryStream.Seek(0, SeekOrigin.Begin);

        // WriteObject emits UTF-8, which is also StreamReader's default encoding.
        using (var reader = new StreamReader(memoryStream))
        {
            return reader.ReadToEnd();
        }
    }
}

3.7. Here a memory stream is created first.
3.8. An instance of System.Runtime.Serialization.DataContractSerializer is created with type ‘T’ and the collection of knownTypes as input. The latter is optional (can be null).
3.9. The DataContractSerializer object then reads the object data and writes the output (XML) into the memory stream.
3.10. The memory stream is rewound and read to the end by a StreamReader object; the resulting string is returned by the method.
3.11. We can also create an overload of this method with only one parameter, i.e. the object to convert to XML.
3.12. This overload internally calls the above method with ‘knownTypes’ passed as null.
3.13. The code for this overload is as follows:
/// <summary>
/// Serializes the specified input data, of type T, to XML string.
/// </summary>
/// <typeparam name="T">The type of the data to serialise.</typeparam>
/// <param name="data">The data to serialise.</param>
/// <returns>The data as an XML string.</returns>
public static string ObjectToXml<T>(T data)
{
    return ObjectToXml<T>(data, null);
}
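
To show where the ‘knownTypes’ parameter matters, here is a minimal round-trip sketch. The `Shape` and `Circle` types and the `RoundTripDemo` class are hypothetical and exist only for this illustration; both helper methods are inlined in a compact form so that the sample is self-contained. The object is declared as `Shape` but is a `Circle` at run time, so `Circle` is passed as a known type to both calls. Without it, DataContractSerializer would be expected to reject the derived type with a SerializationException.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Runtime.Serialization;
using System.Xml;

// Hypothetical types, used only for this illustration.
[DataContract(Namespace = "")]
public class Shape
{
    [DataMember]
    public string Label { get; set; }
}

[DataContract(Namespace = "")]
public class Circle : Shape
{
    [DataMember]
    public double Radius { get; set; }
}

public static class RoundTripDemo
{
    // Compact stand-in for XmlParserDeparser.ObjectToXml<T>.
    public static string ObjectToXml<T>(T data, IEnumerable<Type> knownTypes)
    {
        using (var memoryStream = new MemoryStream())
        {
            var serializer = new DataContractSerializer(typeof(T), knownTypes);
            serializer.WriteObject(memoryStream, data);
            memoryStream.Seek(0, SeekOrigin.Begin);
            using (var reader = new StreamReader(memoryStream))
            {
                return reader.ReadToEnd();
            }
        }
    }

    // Compact stand-in for XmlParserDeparser.XmlToObject<T>.
    public static T XmlToObject<T>(string xml, IEnumerable<Type> knownTypes)
    {
        using (var stringReader = new StringReader(xml))
        using (var xmlReader = XmlReader.Create(stringReader))
        {
            var serializer = new DataContractSerializer(typeof(T), knownTypes);
            return (T)serializer.ReadObject(xmlReader);
        }
    }

    public static void Main()
    {
        // The same known types are supplied for writing and for reading.
        var knownTypes = new[] { typeof(Circle) };
        Shape original = new Circle { Label = "unit circle", Radius = 1.0 };

        string xml = ObjectToXml<Shape>(original, knownTypes);
        Shape copy = XmlToObject<Shape>(xml, knownTypes);

        Console.WriteLine(xml);
        Console.WriteLine(copy is Circle);
    }
}
```

When every object is serialized as its exact declared type, the single-parameter overloads are sufficient and ‘knownTypes’ can stay null.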


Advantages
  • XML can be parsed easily, without writing custom parsing code.
  • Similarly, an object can be converted to XML without any custom code.
  • The code is generic and works for any type that DataContractSerializer supports. We just need to supply XML that matches the object type (T) specified in the call.
