
Understanding Encoding methods?

  • 15-08-2012 9:20am
    #1
    Registered Users Posts: 7,501 ✭✭✭


    I've read the documentation but can't figure out why the Encoding methods are not working as I expected.
    I'm receiving a byte array which can contain characters encoded as UTF-8.

    If I decode the byte array using the call below and open the result in Notepad, every Unicode character is replaced by a black diamond with a question mark inside it (the U+FFFD replacement character):

    Encoding.UTF8.GetString(byteArray)

    However if i decode it using the below it works fine:

    Encoding.Default.GetString(byteArray)

    Can anyone explain this to me? I'm just not getting why the first one does not work.
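A minimal sketch of the symptom, not the poster's actual data: the class name, the helper, and the sample bytes are made up for illustration. `Encoding.UTF8.GetString` substitutes U+FFFD (the black diamond with a question mark) for any byte sequence that is not valid UTF-8, so seeing that character usually means the bytes were produced with some other encoding.

```csharp
using System;
using System.Text;

public static class ReplacementCharDemo
{
    // Decodes the bytes as UTF-8 and reports whether the decoder had to
    // substitute U+FFFD for something it could not interpret.
    public static bool HasReplacementChar(byte[] bytes)
    {
        string text = Encoding.UTF8.GetString(bytes);
        return text.IndexOf('\uFFFD') >= 0;
    }

    public static void Main()
    {
        // "café" as ISO-8859-1: é is the single byte 0xE9, which is not valid UTF-8 on its own.
        byte[] latin1Bytes = { 0x63, 0x61, 0x66, 0xE9 };
        // "café" as UTF-8: é is the two-byte sequence 0xC3 0xA9.
        byte[] utf8Bytes = { 0x63, 0x61, 0x66, 0xC3, 0xA9 };

        Console.WriteLine(HasReplacementChar(latin1Bytes)); // True
        Console.WriteLine(HasReplacementChar(utf8Bytes));   // False
    }
}
```

If the check comes back true for real input, dumping the first few bytes in hex is often the quickest way to see which encoding the sender actually used.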


Comments

  • Registered Users Posts: 7,157 ✭✭✭srsly78


    Coz unicode chars are 16 bits in size. UTF8 is therefore mangling them.


  • Registered Users Posts: 7,501 ✭✭✭BrokenArrows


    srsly78 wrote: »
    Coz unicode chars are 16 bits in size. UTF8 is therefore mangling them.

    Close.

    I ended up just decoding the original byte array with every available encoding to see which one worked.

    Turns out I was being sent the byte array in UTF-7, not UTF-8.

    Muppets!!

    So I'm now just doing a convert and all is good in the world:
    Encoding.Convert(Encoding.UTF7, Encoding.UTF8, byteArray);
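A hedged sketch of what that call does, assuming the input really is UTF-7; the class name, method names, and sample bytes are illustrative only. `Encoding.Convert` returns a new UTF-8 `byte[]`, not a string, so if text is what you want, `Encoding.UTF7.GetString` gets there in one step. On .NET 5 and later `Encoding.UTF7` is flagged obsolete (warning SYSLIB0001), hence the pragma. One caveat: genuine UTF-7 consists only of ASCII bytes, which UTF-8 decodes without replacement characters, so if `Encoding.UTF8` produced U+FFFD the source may really be a single-byte code page. That is a guess, since the thread does not show the bytes.

```csharp
#pragma warning disable SYSLIB0001 // Encoding.UTF7 is flagged obsolete on .NET 5 and later
using System;
using System.Text;

public static class Utf7Demo
{
    // Decodes UTF-7 bytes straight to a string.
    public static string DecodeUtf7(byte[] bytes)
    {
        return Encoding.UTF7.GetString(bytes);
    }

    // Same result via the route used in the post: re-encode the bytes
    // as UTF-8 first, then decode those UTF-8 bytes.
    public static string DecodeViaConvert(byte[] bytes)
    {
        byte[] utf8Bytes = Encoding.Convert(Encoding.UTF7, Encoding.UTF8, bytes);
        return Encoding.UTF8.GetString(utf8Bytes);
    }

    public static void Main()
    {
        // "café" in UTF-7: é (U+00E9) is written as the ASCII run "+AOk-".
        byte[] byteArray = Encoding.ASCII.GetBytes("caf+AOk-");

        Console.WriteLine(DecodeUtf7(byteArray) == "caf\u00E9");       // True
        Console.WriteLine(DecodeViaConvert(byteArray) == "caf\u00E9"); // True
    }
}
```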


  • Moderators, Sports Moderators, Regional Abroad Moderators Posts: 2,646 Mod ✭✭✭✭TrueDub


    An excellent article on character-encoding:

    http://www.joelonsoftware.com/articles/Unicode.html


  • Registered Users Posts: 7,157 ✭✭✭srsly78


    BrokenArrows wrote: »
    Close.

    I ended up just decoding the original byte array with every available encoding to see which one worked.

    Turns out I was being sent the byte array in UTF-7, not UTF-8.

    Muppets!!

    So I'm now just doing a convert and all is good in the world:
    Encoding.Convert(Encoding.UTF7, Encoding.UTF8, byteArray);

    This isn't Unicode then. Edit: derp, OK, it is encoded Unicode >.<
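To illustrate the "encoded Unicode" point, here is a small sketch (class and method names are hypothetical) that counts how many bytes the same character takes under two encodings. A .NET `char` is a 16-bit UTF-16 code unit, but that says nothing about the byte layout on the wire: UTF-8 uses one to four bytes per code point.

```csharp
using System;
using System.Text;

public static class ByteCountDemo
{
    // Number of bytes the string occupies when encoded as UTF-8.
    public static int Utf8Length(string s)
    {
        return Encoding.UTF8.GetByteCount(s);
    }

    // Number of bytes the string occupies when encoded as UTF-16 (Encoding.Unicode).
    public static int Utf16Length(string s)
    {
        return Encoding.Unicode.GetByteCount(s);
    }

    public static void Main()
    {
        // 'a' (U+0061), 'é' (U+00E9), '€' (U+20AC)
        string[] samples = { "a", "\u00E9", "\u20AC" };
        foreach (string s in samples)
        {
            Console.WriteLine("U+{0:X4}: UTF-8 = {1} byte(s), UTF-16 = {2} byte(s)",
                (int)s[0], Utf8Length(s), Utf16Length(s));
        }
    }
}
```

For these three samples the UTF-8 lengths are 1, 2, and 3 bytes, while UTF-16 uses 2 bytes for each.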

