
Office 12 files to adopt fully XML format

  • 03-06-2005 9:17pm
    #1
    Closed Accounts Posts: 17,208 ✭✭✭✭


    Microsoft announced this week that they will be adopting a new, open, XML-based standard for the default file structure for Office 12 documents. There will also be patches released for all versions of Office from 2000 onwards to accommodate the format, though it isn't yet known if they will offer read-only or read/write support.
    Source

    1. Open Format: These formats use XML and ZIP, and they will be fully documented. Anyone will be able to get the full specs on the formats and there will be a royalty free license for anyone that wants to work with the files.
    2. Compressed: Files saved in these new XML formats are less than 50% the size of the equivalent file saved in the binary formats. This is because we take all of the XML parts that make up any given file, and then we ZIP them. We chose ZIP because it’s already widely in use today and we wanted these files to be easy to work with. (ZIP is a great container format. Of course I’m not the only one who thinks so… a number of other applications also use ZIP for their files too.)
    3. Robust: Between the usage of XML, ZIP, and good documentation the files get a lot more robust. By compartmentalizing our files into multiple parts within the ZIP, it becomes a lot less likely that an entire file will be corrupted (instead of just individual parts). The files are also a lot easier to work with, so it’s less likely that people working on the files outside of Office will cause corruptions.
    4. Backward compatible: There will be updates to Office 2000, XP, and 2003 that will allow those versions to read and write this new format. You don’t have to use the new version of Office to take advantage of these formats. (I think this is really cool. I was a big proponent of doing this work)
    5. Binary Format support: You can still use the current binary formats with the new version of Office. In fact, people can easily change to use the binary formats as the default if that’s what they’d rather do.
    6. New Extensions: The new formats will use new extensions (.docx, .pptx, .xlsx) so you can tell what format the files you are dealing with are, but to the average end user they’ll still just behave like any other Office file. Double click & it opens in the right application.
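
The "XML parts inside a ZIP" idea from points 1 to 3 can be sketched roughly as follows. This is only an illustration using Python's standard library: the part names (`document.xml`, `styles.xml`) and element names are made up for the example, since the real Office 12 layout has not been published yet.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

# Hypothetical part names and contents; not the actual Office 12 layout.
PARTS = {
    "document.xml": "<document>" + "<p>Hello, world</p>" * 200 + "</document>",
    "styles.xml": "<styles><style name='Normal'/></styles>",
}


def build_package(parts):
    """Store each XML part as its own compressed entry in one ZIP archive."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as archive:
        for name, xml_text in parts.items():
            archive.writestr(name, xml_text)
    return buffer.getvalue()


def read_part(package_bytes, name):
    """Open the archive and parse a single part, leaving the others untouched."""
    with zipfile.ZipFile(io.BytesIO(package_bytes)) as archive:
        return ET.fromstring(archive.read(name))


package = build_package(PARTS)
raw_size = sum(len(text.encode("utf-8")) for text in PARTS.values())
print(f"raw XML: {raw_size} bytes, zipped package: {len(package)} bytes")
print(read_part(package, "styles.xml").tag)
```

Because each part is a separate archive entry, a tool can read or replace one part without parsing the rest, which is presumably what the "robust" point is getting at. How much the size shrinks depends entirely on the content; repetitive XML like the sample above compresses very well.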

    Technical details on the formats are expected to be announced during TechEd next week.

    One of the most interesting developments is the fact that documents with embedded macros will have a different file extension than those that do not. Macro-enabled documents will append an "m" to the current extension rather than an "x". Mind you, I'm not sure how many macro-based viruses are still going around these days.
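
As a rough sketch of how a mail filter or script could use the new extensions, the snippet below sorts filenames into macro-free and macro-enabled. It assumes the macro variants are simply the announced extensions with "m" in place of "x" (so `.docm`, `.xlsm`, `.pptm`), which is my reading of the announcement rather than a published list. The `classify` function is hypothetical.

```python
from pathlib import Path

# Extensions named in the announcement.
MACRO_FREE = {".docx", ".xlsx", ".pptx"}
# Assumed macro variants: swap the trailing "x" for "m".
MACRO_ENABLED = {ext[:-1] + "m" for ext in MACRO_FREE}


def classify(filename):
    """Classify a file by extension only; the contents are not inspected."""
    suffix = Path(filename).suffix.lower()
    if suffix in MACRO_ENABLED:
        return "macro-enabled"
    if suffix in MACRO_FREE:
        return "macro-free"
    return "unknown"


for name in ("report.docm", "Budget.XLSX", "notes.txt"):
    print(name, "->", classify(name))
```

An extension is only a hint, of course. Whether a renamed file is refused or opened anyway depends on how Office itself enforces the distinction, and that hasn't been documented yet.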


Comments

  • Registered Users Posts: 20,993 ✭✭✭✭Stark


    Hooray, years after they said they would in Office 2000 :) Now where's that object orientated filesystem they promised in Windows 2000 :p


  • Registered Users Posts: 640 ✭✭✭Kernel32


    The XML formats have been around for a while: ExcelML and WordprocessingML. They haven't been that well documented and are also limited in some areas; for example, you couldn't save an embedded image in an Excel file saved as XML. It's good to see it's being opened up more. My guess is it's all part of an effort to standardise RDL, which is the format for SQL Server Reporting Services, and InfoPath's format with Office so they can interop better.

    Good old Microsoft, always thinking about us.


  • Registered Users Posts: 885 ✭✭✭clearz


    Stark wrote:
    Now where's that object orientated filesystem they promised in Windows 2000 :p

    Flushed down the toilet along with the WinFS filesystem promised in Longhorn.


  • Moderators, Recreation & Hobbies Moderators, Science, Health & Environment Moderators, Technology & Internet Moderators Posts: 91,690 Mod ✭✭✭✭Capt'n Midnight


    Microsoft announced this week that they will be adopting a new, open, XML-based standard for the default file structure for Office 12 documents.
    let's not forget that they have certain patents on XML - http://www.boards.ie/vbulletin/showthread.php?t=262280 and Microsoft do not use English the same way we do when it comes to Office. I refer you to the phrase "remove all" and how Office leaves dozens of files and registry entries behind despite the normal usage of the phrase, so I'll keep an "Open" mind.

    Having had to support office for a long time and still seeing Excel bomb out with an uninformative error box means my opinions of their devotion to data integrity haven't changed much since the days of windows 3.1 - I'd prefer more stable products than bells and whistles that 90% of customers will never use. Also does this mean that all macros will have to be rewritten, yet again ?



    As for the article..
    One of the most interesting developments is the fact that documents with embedded macros will have a different file extension than those that do not. Macro enabled documents will append an "m" to the current extension rather than an "x". Mind you, I'm not sure how many macro-based viruses are still going around these days.
    Anyone remember Lotus 123 - version 3 used to save .WK3 files, how many years ago was that? Also MSWord used to save and execute macros in .RFT files - so I'm sceptical.

    1. Open Format: These formats use XML and ZIP, and they will be fully documented. Anyone will be able to get the full specs on the formats and there will be a royalty free license for anyone that wants to work with the files.
    For "documented", read "patented". Will you have to sign an NDA, or will the terms be like benchmarking, where you need written permission to comment on speeds/sizes?

    2. Compressed: Files saved in these new XML formats are less than 50% the size of the equivalent file saved in the binary formats. This is because we take all of the XML parts that make up any given file, and then we ZIP them. We chose ZIP because it’s already widely in use today and we wanted these files to be easy to work with.
    Anyone remember Word 95 documents hitting 25MB per page with embedded graphic? 7zip would give better compression, but it's open source.

    3. Robust: Between the usage of XML, ZIP, and good documentation the files get a lot more robust. By compartmentalizing our files into multiple parts within the ZIP, it becomes a lot less likely that an entire file will be corrupted (instead of just individual parts). The files are also a lot easier to work with, so it’s less likely that people working on the files outside of Office will cause corruptions.
    So they've finally realised that fast save was a bad idea? Just turn on autosave and "always create copy" - the defaults back in the 1980s - or better still save a revision number, the default on VMS from earlier. Ha bloody ha - Excel has never been better than second best at reading documents it corrupted itself - it was always beaten by Lotus/OpenOffice etc.

    4. Backward compatible: There will be updates to Office 2000, XP, and 2003 that will allow those versions to read and write this new format. You don’t have to use the new version of Office to take advantage of these formats. (I think this is really cool. I was a big proponent of doing this work)
    Translation: I remember the corporate backlash with Office 95 and we don't want them to defect to OpenOffice.

    5. Binary Format support: You can still use the current binary formats with the new version of Office. In fact, people can easily change to use the binary formats as the default if that’s what they’d rather do.
    Just like OpenOffice.

    6. New Extensions: The new formats will use new extensions (.docx, .pptx, .xlsx) so you can tell what format the files you are dealing with are, but to the average end user they’ll still just behave like any other Office file. Double click & it opens in the right application.
    Again with the macros in .RFT idea - how did that get through a design walkthrough?


  • Closed Accounts Posts: 17,208 ✭✭✭✭aidan_walsh


    let's not forget that they have certain patents on XML
    IANAL, but AFAIK they only refer to object serialization. There is also speculation as to whether this will ever be enforced, or whether it was taken to stop other companies trying to take it out and enforce it against MS.
    Also does this mean that all macros will have to be rewritten, yet again ?
    AFAIK, no. The most logical way I can see of doing this would be to encapsulate the macro as a CDATA block, and interpret it as usual.
    Also MSWord used to save and execute macros in .RFT files - so I'm sceptical.
    Maybe, but here everything will be DOM controlled, so nothing that's not in the declared schema type will be supported. At least, that's what would be expected.
    anyone remember word 95 documents hitting 25MB per page with embedded graphic?
    I'm pretty sure they're doing a text only analysis there.

    As for the rest of your comments, I can't say but time will tell.
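
To illustrate the CDATA idea mentioned above, here is a minimal sketch using Python's `xml.dom.minidom`. The element names (`document`, `macro`) and the sample macro text are invented for the example; nothing here reflects how Office 12 will actually store macros.

```python
from xml.dom import minidom

# Sample macro text containing characters that would otherwise need escaping.
MACRO_SOURCE = 'Sub Hello()\n    If 1 < 2 Then MsgBox "a & b"\nEnd Sub'


def wrap_macro(source):
    """Embed macro source verbatim inside a CDATA section."""
    doc = minidom.Document()
    root = doc.createElement("document")  # hypothetical element name
    doc.appendChild(root)
    macro = doc.createElement("macro")  # hypothetical element name
    macro.appendChild(doc.createCDATASection(source))
    root.appendChild(macro)
    return doc.toxml()


def unwrap_macro(xml_text):
    """Recover the macro source by joining the text nodes under <macro>."""
    doc = minidom.parseString(xml_text)
    macro = doc.getElementsByTagName("macro")[0]
    return "".join(node.data for node in macro.childNodes)


xml_text = wrap_macro(MACRO_SOURCE)
print(xml_text)
print(unwrap_macro(xml_text) == MACRO_SOURCE)
```

The point is that the macro text comes back unchanged, so existing macro source would not need rewriting just because the container is XML. One known limitation of CDATA is that the text itself cannot contain the sequence `]]>`; a real format would have to split or escape it.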


  • Closed Accounts Posts: 14,483 ✭✭✭✭daveirl


    This post has been deleted.


  • Registered Users Posts: 21,264 ✭✭✭✭Hobbes


    I can see it now, using XML format with zip and the data inside the XML will be encoded in some way. It will accept text but you cannot get the full benefit of the file format without buying Office version of the month.

    Btw, Open Office have been doing this for years. Their document files are just zip files with XML data.


  • Closed Accounts Posts: 14,483 ✭✭✭✭daveirl


    This post has been deleted.


  • Registered Users Posts: 21,264 ✭✭✭✭Hobbes


    I would be wary of anything MS would promise.

    Take for example when they said they would "open their code" for people to see in combating open source. Turned out you had to pay silly sums to see the code (understandable), but any self-respecting developer signing the shared source contract had basically screwed themselves out of developing anything remotely similar to the code they were allowed to look at.


  • Closed Accounts Posts: 14,483 ✭✭✭✭daveirl


    This post has been deleted.

