Microsoft’s new XML formats, the power of the container model
In this post I explained that I, along with a few thousand others, was pretty excited about Microsoft’s XML format developments. I also pointed to Brian Jones’s blog, which is proving to be a great resource. At TechEd Brian gave some demonstrations showing the power of the new format, stressing the benefits of the ZIP container format and the fact that the different parts of a document are stored as separate items in the ZIP container. Read his post for yourself, or read on for some of his examples, which are pretty cool.
- Updating a diagram in a spec: I showed an example of taking a technical spec with an old diagram, and outside of Word I swapped it out with a more up to date one. The main purpose of this wasn’t to show that an end user would do that to their files, but instead to show that people could easily build solutions that push relevant pieces of content into files.
- Removing comments: Most people who manage collections of documents or deal with publishing documents have seen the problems that extra information in their files can cause. I took an example of a whitepaper with a bunch of comments in it. Often, an end user will just turn the comment view off, and not realize that when they save the file and post it on the web, everyone else can still see those comments. Even if an end user doesn’t know to delete the comments, it’s still easy enough to build an automated step into the publishing process that strips those comments out. In my demo I just unzipped the file, deleted the part called “comments.xml”, and showed that when you then open the file back up all the comments are gone.
- Document corruption: I took a rich Word document and opened it up in a hex editor. I scrolled down to a random spot and just started zeroing out a bunch of bytes. I then tried to open the file with a ZIP tool and showed that it was corrupted and couldn’t be opened. I opened it in Word, though, and it opened just fine. Even all the formatting information was preserved, so most likely the only thing corrupted was some of the metadata or some other piece of information that didn’t affect the display (obviously a much improved experience over the current binary formats).
- Footer & Header update: I took a nice looking whitepaper with a rich header and footer that was synced to the document title and author name. I then opened another whitepaper that had a really lame header and footer. I showed how in an automated process, it was easy to quickly take the header and footer used in one file, and apply it to the other file. This was an example of how easy it will be to update a collection of documents to match a specific corporate standard.
- Bulk style change: I used the System.IO.Packaging API in the WinFX SDK to go over a collection of 100 whitepapers that all had a basic style associated with them, and update the styles to match a more colorful collection of styles. It took just a couple of seconds to update all 100 documents.
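
Several of the demos above (swapping a diagram, deleting the comments part, copying a header or footer, changing styles in bulk) come down to the same operation: copy the ZIP container while dropping or replacing individual parts. Here is a minimal sketch of that idea in Python, using only the standard `zipfile` module rather than the System.IO.Packaging API that Brian used. The function name `rewrite_package` and the part names in the usage note are my own assumptions for illustration, not anything taken from the demos.

```python
import zipfile


def rewrite_package(src_path, dst_path, drop=(), replace=None):
    """Copy a ZIP-based document, skipping or swapping selected parts.

    drop    -- part names to leave out of the copy
    replace -- mapping of part name -> new bytes for that part
    Returns the list of part names written to dst_path.

    Hypothetical helper: it works purely at the ZIP level and does not
    touch any references to a dropped part elsewhere in the package.
    """
    drop = set(drop)
    replace = dict(replace or {})
    written = []
    with zipfile.ZipFile(src_path) as src, \
            zipfile.ZipFile(dst_path, "w", zipfile.ZIP_DEFLATED) as dst:
        for name in src.namelist():
            if name in drop:
                continue  # e.g. the comments part
            # Use the replacement bytes if given, else copy the original.
            data = replace[name] if name in replace else src.read(name)
            dst.writestr(name, data)
            written.append(name)
    return written
```

Stripping comments would then look something like `rewrite_package("whitepaper.docx", "clean.docx", drop={"word/comments.xml"})`, and a bulk style change would be the same call in a loop over a folder, passing the new styles part through `replace`. The exact part names depend on the format, and a real package may also hold references to a removed part that need cleaning up, so treat this as an illustration of the container model and not a finished tool.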