In this post I’ll discuss how to generate Word 2007 documents natively from BizTalk 2006 using the System.IO.Packaging API that shipped with .Net 3.0.
Background
Unless you’ve lived under a rock during the last year, you’ll know that Office Open XML (OOXML) is the new Xml format for the Office 2007 suite, namely Word, Excel and PowerPoint. An OOXML file is a package conforming to the Open Packaging Conventions: a number of individual files (parts) that together form the document, stored in a single ZIP archive to reduce the overall size of the resulting file (a .docx, .xlsx or .pptx).
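To make the package structure concrete, here is a minimal sketch that lists the parts inside any OOXML file using System.IO.Packaging. The PackageInspector class and ListParts method are names invented for illustration; they are not part of the sample solution.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.IO.Packaging; // requires a reference to WindowsBase.dll

// Hypothetical helper, for illustration only.
public static class PackageInspector
{
    // Returns one "uri (content type)" entry for every part in the package.
    public static List<string> ListParts(Stream packageStream)
    {
        List<string> parts = new List<string>();

        // Opening from a stream leaves ownership of the stream with the caller.
        using (Package pkg = Package.Open(packageStream, FileMode.Open, FileAccess.Read))
        {
            foreach (PackagePart part in pkg.GetParts())
            {
                parts.Add(part.Uri + " (" + part.ContentType + ")");
            }
        }

        return parts;
    }
}
```

Pointing this at a document saved from Word 2007 (for example via File.OpenRead) should show /word/document.xml alongside the other parts that Word adds to the package.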
Generating Word Documents – Overview
Generating a Word document is relatively simple and only requires a custom send pipeline component that generates our OOXML package.
In this post I will be using a Sales Report scenario, generating a Word document from the output of a fictional ERP system; to that end, I’ll also be mapping from a fictional sales summary Xml message to the required OOXML format before generating the final .docx. The final document will look something like the following (note that the areas in red will be replaced with content from our ERP sales summary message):

Note that the structure of an OOXML document is outside of the scope of this post (but a good understanding is fundamental when working with these documents) and I would recommend that you read the excellent Open Xml Markup Explained by Wouter van Vugt.
Generating Word Documents – The ‘Main’ Document
The main document body (i.e. document.xml) is the only part that is generated in the BizTalk solution. We don’t actually create a file called document.xml – the packaging API does this for us – instead we simply create a message that conforms to the OOXML schema and pass this into the custom Send pipeline.
In our scenario, we are generating a Sales Report document for distribution to the finance department – we will receive an Xml sales summary document from our fictional ERP system that resembles the following:
<?xml version="1.0" encoding="utf-8"?>
<ns0:SalesReport xmlns:ns0="http://schemas.modhul.com/erp/salesreport-1.0">
  <Author>Nick Heppleston</Author>
  <Email>nick@modhul.com</Email>
  <SalesStart>10th January 2008</SalesStart>
  <SalesEnd>17th January 2008</SalesEnd>
  <SalesSummary>100,48.00</SalesSummary>
</ns0:SalesReport>
which needs to be mapped into our OOXML main document body message (I think the layout of the OOXML message is pretty self-explanatory; however, I would point you at Open Xml Markup Explained if you’re after a more detailed explanation):
<?xml version="1.0" encoding="utf-8" ?>
<w:document xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main">
  <w:body>
    <w:p>
      <w:r>
        <w:rPr>
          <w:b />
          <w:sz w:val="52" />
          <w:rFonts w:ascii="Cambria" />
        </w:rPr>
        <w:t xml:space="preserve">Sales Summary for: </w:t>
        <w:t>Nick Heppleston</w:t>
      </w:r>
    </w:p>
    <w:p>
      <w:r>
        <w:rPr>
          <w:i />
          <w:sz w:val="52" />
          <w:rFonts w:ascii="Cambria" />
          <w:spacing w:val="15" />
          <w:color w:val="48FDB2" />
        </w:rPr>
        <w:t xml:space="preserve">Sales from: </w:t>
        <w:t>10th January 2008</w:t>
        <w:t xml:space="preserve"> to </w:t>
        <w:t>17th January 2008</w:t>
        <w:t xml:space="preserve"> - </w:t>
        <w:t>£100,48.00</w:t>
      </w:r>
    </w:p>
    <w:p>
      <w:r>
        <w:t xml:space="preserve">Contact: </w:t>
        <w:t>Nick Heppleston</w:t>
        <w:t xml:space="preserve"> | </w:t>
        <w:t>nick@modhul.com</w:t>
      </w:r>
    </w:p>
  </w:body>
</w:document>
This transformation can be performed anywhere; in the sample solution I’ve put the map on the Receive Port. I am using custom XSLT to drive the map, because I can’t think of any way to generate this type of message using a standard BizTalk Map: how do I graphically say ‘map from this source node to this destination node’ when all of the destination nodes simply repeat themselves?
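To give a feel for what such a stylesheet looks like, here is a cut-down sketch that can be run outside BizTalk. It is not the stylesheet from the sample solution: the SalesReportMap class is a hypothetical test harness, XslCompiledTransform is simply a convenient way to run XSLT from a console project, and the stylesheet emits only the heading paragraph.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Xsl;

// Hypothetical harness for trying out the custom XSLT outside BizTalk.
public static class SalesReportMap
{
    // Cut-down stylesheet: maps /SalesReport/Author into the heading paragraph only.
    const string Stylesheet = @"<?xml version=""1.0"" encoding=""utf-8""?>
<xsl:stylesheet version=""1.0""
    xmlns:xsl=""http://www.w3.org/1999/XSL/Transform""
    xmlns:s=""http://schemas.modhul.com/erp/salesreport-1.0""
    xmlns:w=""http://schemas.openxmlformats.org/wordprocessingml/2006/main""
    exclude-result-prefixes=""s"">
  <xsl:template match=""/s:SalesReport"">
    <w:document>
      <w:body>
        <w:p>
          <w:r>
            <w:rPr><w:b /></w:rPr>
            <w:t xml:space=""preserve"">Sales Summary for: </w:t>
            <w:t><xsl:value-of select=""Author"" /></w:t>
          </w:r>
        </w:p>
      </w:body>
    </w:document>
  </xsl:template>
</xsl:stylesheet>";

    // Applies the stylesheet to a sales summary message and returns the
    // resulting WordprocessingML document.
    public static XmlDocument Transform(XmlDocument salesReport)
    {
        XslCompiledTransform xslt = new XslCompiledTransform();
        using (XmlReader stylesheetReader = XmlReader.Create(new StringReader(Stylesheet)))
        {
            xslt.Load(stylesheetReader);
        }

        // Write the transform output straight into a new XmlDocument.
        XmlDocument result = new XmlDocument();
        using (XmlWriter writer = result.CreateNavigator().AppendChild())
        {
            xslt.Transform(salesReport, writer);
        }
        return result;
    }
}
```

Each repeated w:t element in the destination is written out literally in the template, with xsl:value-of pulling in the source value; that literal repetition is exactly what is awkward to express in the graphical mapper.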
Note: I’ve yet to find a satisfactory XSD for the WordprocessingML markup so the solution contains an OOXML schema that was automagically generated from the above destination format. I’m working on sourcing the schema; I have a number of ‘feelers’ out with the Office Team and I hope to be able to provide a reference in the next couple of days.
With our Sales Summary message now mapped and in the necessary OOXML format, we can send it to the custom pipeline / pipeline component for it to do its work and generate our .docx package.
Generating Word Documents – The Custom Pipeline Component
The custom pipeline component is relatively simple. It uses the System.IO.Packaging API introduced in .Net 3.0 which can be found in windowsbase.dll (C:\Program Files\Reference Assemblies\Microsoft\Framework\v3.0\windowsbase.dll); full documentation regarding this namespace can be found online at MSDN. The API is invoked in the pipeline component Execute() method as follows:
 1: public IBaseMessage Execute(IPipelineContext pc, IBaseMessage inmsg)
 2: {
 3:     XmlDocument InputXmlDocument = new XmlDocument();
 4:     InputXmlDocument.XmlResolver = null;
 5:
 6:     // Define bodypart instances
 7:     IBaseMessagePart bodyPart = inmsg.BodyPart;
 8:
 9:     // Define stream instances
10:     Stream originalStream = null;
11:     MemoryStream odfStream = new MemoryStream();
12:
13:     string docContentType = "application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml";
14:     string docRelationshipType = "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument";
15:
16:     if (null != bodyPart)
17:     {
18:         // Get a *copy* of the original stream
19:         originalStream = bodyPart.Data;
20:
21:         // Check that the original stream is not null
22:         if (null != originalStream)
23:         {
24:             // Load the original message stream into our input xml document
25:             // to be used as the basis of the OOXML document.
26:             InputXmlDocument.Load(originalStream);
27:
28:             try
29:             {
30:                 // Create a new OOXML package
31:                 Package pkg = Package.Open(odfStream, FileMode.Create, FileAccess.ReadWrite);
32:
33:                 // Create a Uri for the document part
34:                 Uri docPartUri = new Uri("/word/document.xml", UriKind.Relative);
35:
36:                 // Create the document part
37:                 PackagePart mainPart = pkg.CreatePart(docPartUri, docContentType);
38:
39:                 // Add the data from the Xml Document to the document part
40:                 Stream partStream = mainPart.GetStream(FileMode.Create, FileAccess.Write);
41:                 InputXmlDocument.Save(partStream);
42:                 partStream.Close();
43:                 pkg.Flush();
44:
45:                 // Create the relationship between the part and the package.
46:                 PackageRelationship pkgRelationship = pkg.CreateRelationship(docPartUri, TargetMode.Internal, docRelationshipType, "rId1");
47:
48:                 // Flush the changes then close the package
49:                 pkg.Flush();
50:                 pkg.Close();
51:             }
52:             catch (Exception Ex)
53:             {
54:                 EventLog.WriteEntry("BizTalk 2006 - Build ODF Package", "Error encountered building the package: " + Ex.Message, EventLogEntryType.Error);
55:             }
56:
57:             try
58:             {
59:                 // Rewind the new OOXML stream
60:                 odfStream.Seek(0, System.IO.SeekOrigin.Begin);
61:             }
62:             catch (Exception Ex)
63:             {
64:                 EventLog.WriteEntry("BizTalk 2006 - Build ODF Package", "Error encountered rewinding the stream: " + Ex.Message, EventLogEntryType.Error);
65:             }
66:             finally
67:             {
68:                 // Add the new OOXML stream into the return message.
69:                 bodyPart.Data = odfStream;
70:                 pc.ResourceTracker.AddResource(odfStream);
71:             }
72:         }
73:     }
74:
75:     return inmsg;
76: }
A quick overview of the code is as follows:
- Line 26: We load a copy of the original message data part stream into an XmlDocument to use as the main document body (the document.xml) when building the package.
- Line 31: Create a new OOXML package in a new MemoryStream.
- Line 34: Create a URI to the main document body (calling it document.xml).
- Line 37: Create the main document body part from docPartUri and docContentType.
- Lines 40 – 43: Save the contents of our BizTalk message to the main document body part (the message we created in the BizTalk map).
- Line 46: Create a package relationship for the main document body part.
- Lines 60 and 69 – 70: Rewind the MemoryStream and overwrite the original message with our new OOXML package.
- Line 75: We return the message containing the OOXML package.
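The packaging steps above can also be exercised outside the pipeline, which is handy when experimenting with the markup. The following is a sketch under my own assumptions rather than the component's code: DocxBuilder and Build are hypothetical names, and unlike the component it lets exceptions propagate instead of writing them to the event log.

```csharp
using System;
using System.IO;
using System.IO.Packaging; // requires a reference to WindowsBase.dll
using System.Xml;

// Hypothetical stand-alone version of the packaging steps, for illustration only.
public static class DocxBuilder
{
    const string DocContentType =
        "application/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml";
    const string DocRelationshipType =
        "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument";

    // Wraps a WordprocessingML main document in a new package and
    // returns the package as a rewound in-memory stream.
    public static MemoryStream Build(XmlDocument mainDocument)
    {
        MemoryStream packageStream = new MemoryStream();
        Uri docPartUri = new Uri("/word/document.xml", UriKind.Relative);

        // Disposing the package flushes and closes it; the MemoryStream stays open.
        using (Package pkg = Package.Open(packageStream, FileMode.Create, FileAccess.ReadWrite))
        {
            // Create the main document part and write the Xml into it.
            PackagePart mainPart = pkg.CreatePart(docPartUri, DocContentType);
            using (Stream partStream = mainPart.GetStream(FileMode.Create, FileAccess.Write))
            {
                mainDocument.Save(partStream);
            }

            // Relate the package to its main document part.
            pkg.CreateRelationship(docPartUri, TargetMode.Internal, DocRelationshipType, "rId1");
        }

        // Rewind so the caller can read the package from the start.
        packageStream.Seek(0, SeekOrigin.Begin);
        return packageStream;
    }
}
```

From a console project, loading the mapped message into an XmlDocument, calling DocxBuilder.Build and writing the result out with File.WriteAllBytes(path, stream.ToArray()) should give a .docx that can be opened in Word.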
The final message is sent via the FILE adapter and written to the file system. The end result looks like this:

Conclusion
In this post I hope I’ve shown you the tools necessary to generate Word 2007 documents natively using BizTalk 2006. The example I presented is extremely simple and does not include styles, themes, images, headers and footers, font tables etc. that would exist in a real-life document, but I hope it has presented a starting-point for your own custom development.
These same techniques can also be applied to create Excel spreadsheets or PowerPoint presentations – in fact, while writing this post I have had a number of ideas for enhancements to the pipeline component and will endeavour to create a CodePlex project if I can find the time.
Disclaimer
This work is licensed under a Creative Commons Attribution 2.5 License: you can use it commercially and modify it as necessary, but you must give the original author credit. Furthermore, sample projects and code are provided “AS IS” with no warranty.

Interesting post,
I am thinking of the reverse scenario: I have documents in a shared folder or sent by email, and I need to extract the data from them and then pass it to an orchestration!
Any advice?
Hi Essam,
This should be relatively easy in a disassembling receive pipeline component: receive your .docx or .xlsx via either the POP3 or FILE adapter and decompose the package using the System.IO.Packaging API. Once you have extracted the relevant Xml part of your document, simply drop it out of the component and map it either on the receive port or in an orchestration. Should be simple (-ish!)
I do plan on developing a proof-of-concept to demonstrate decomposing a .docx file, however I’m sans-laptop at the moment and can’t do any development outside of work – gaaaah!!
Nick.
Thanks Nick, this is a lifesaver!
Any thoughts on how more advanced formatting like page numbers, headers etc. can be achieved?
Richard
Hi Richard,
Thanks for the comment; off the top of my head I’m not sure how to achieve page numbers and headers, however this will be OOXML detail that will need to be built using the XSLT transformation.
One easy way to determine the OOXML that you need is to create a sample Word 2007 document, extract the body part from the zip file and inspect the Xml that is generated. You can easily then use that as the basis of your transformation.
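If you would rather not unzip by hand, the body part can also be pulled out with the packaging API. This is only a rough sketch: DocxReader and ExtractMainDocument are made-up names, and it assumes a seekable stream and a package whose package-level officeDocument relationship points at the main document part.

```csharp
using System;
using System.IO;
using System.IO.Packaging; // requires a reference to WindowsBase.dll
using System.Xml;

// Hypothetical helper, for illustration only.
public static class DocxReader
{
    const string DocRelationshipType =
        "http://schemas.openxmlformats.org/officeDocument/2006/relationships/officeDocument";

    // Returns the main document part (normally /word/document.xml) as an XmlDocument.
    public static XmlDocument ExtractMainDocument(Stream packageStream)
    {
        using (Package pkg = Package.Open(packageStream, FileMode.Open, FileAccess.Read))
        {
            // Follow the package-level relationship rather than hard-coding the part name.
            foreach (PackageRelationship rel in pkg.GetRelationshipsByType(DocRelationshipType))
            {
                Uri partUri = PackUriHelper.ResolvePartUri(new Uri("/", UriKind.Relative), rel.TargetUri);
                PackagePart mainPart = pkg.GetPart(partUri);

                XmlDocument doc = new XmlDocument();
                doc.XmlResolver = null; // do not resolve external resources
                using (Stream partStream = mainPart.GetStream(FileMode.Open, FileAccess.Read))
                {
                    doc.Load(partStream);
                }
                return doc;
            }
        }

        throw new InvalidDataException("No officeDocument relationship found in the package.");
    }
}
```

Saving the returned document to disk gives you the markup Word generated, ready to copy into the XSLT.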
Cheers, Nick.