
C# .NET Xml Serialization - wrong encoding?

PolarisPlus Published in 2018-02-13 19:47:14Z

I am writing a new version of an old program (which was written in a different language). The two must stay compatible; more precisely, export and import have to work between them.

The old program's XML file, opened in Notepad++: [image: original xml]

The XML file created by my C# code, also in Notepad++: [image: new xml]

Code to generate this XML:

        XAttribute rootName = new XAttribute("Name", "");
        XElement root = new XElement("Template", rootName);

        root.Add(new XElement("CODE", "JP„"));

        var document = new XDocument(new XDeclaration("1.0", "ISO-8859-1", "yes"), root);
        document.Save("C:\\temp\\Test.xml");

The special character in my XML is encoded incorrectly. This confuses me, because the file should be in ISO-8859-1, and even Notepad++ reports that the file has this encoding.

How can I make my XML encode special characters the same way the old file does?

Feri Reply to 2018-02-13 20:07:57Z

The „ character (U+201E) does not exist in ISO-8859-1, so there is no way to write it directly to an ISO-8859-1 encoded file.
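One way to confirm this for any string is to ask .NET for an ISO-8859-1 encoder that throws instead of silently substituting `?`. The following is only a minimal sketch; the class and method names (`Latin1Check`, `IsLatin1Encodable`) are made up for illustration and are not part of the question's code.

```csharp
using System.Text;

public static class Latin1Check
{
    // Hypothetical helper: returns true when every character of 'text'
    // can be encoded as a single ISO-8859-1 byte.
    public static bool IsLatin1Encodable(string text)
    {
        // Request an encoding instance that throws on unencodable characters
        // instead of replacing them with '?'.
        Encoding strict = Encoding.GetEncoding(
            "ISO-8859-1",
            new EncoderExceptionFallback(),
            new DecoderExceptionFallback());
        try
        {
            strict.GetBytes(text);
            return true;
        }
        catch (EncoderFallbackException)
        {
            return false;
        }
    }
}
```

Calling `Latin1Check.IsLatin1Encodable("JP\u201E")` should return false, while a string such as `"JP\u00E9"` (é) passes, because é is part of ISO-8859-1.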

Your old program writes some byte after JP, but that is undefined behaviour specific to your old language or application. IND (0x84) is a control character, not a printable ISO-8859-1 character. XDocument does the best it can: it writes the Unicode code point as a numeric character reference. (Your source file is presumably saved as UTF-8, so the „ is stored in your program as UTF-8.)
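To see what XDocument actually emits, the document from the question can be saved to a memory stream and inspected. This is a sketch under the assumption that `XDocument.Save(Stream)` picks up the encoding named in the `XDeclaration`; the class name `Latin1SaveDemo` is hypothetical.

```csharp
using System.IO;
using System.Text;
using System.Xml.Linq;

public static class Latin1SaveDemo
{
    // Builds the same document as in the question and returns the serialized text.
    public static string SaveAsLatin1()
    {
        var root = new XElement("Template", new XAttribute("Name", ""));
        root.Add(new XElement("CODE", "JP\u201E")); // U+201E is the „ character

        var document = new XDocument(new XDeclaration("1.0", "ISO-8859-1", "yes"), root);

        using (var stream = new MemoryStream())
        {
            document.Save(stream);
            // Decode the raw bytes as ISO-8859-1 so that each byte maps to exactly one char.
            return Encoding.GetEncoding("ISO-8859-1").GetString(stream.ToArray());
        }
    }
}
```

The returned text should contain the character as a reference such as `&#x201E;` inside the CODE element rather than as a raw byte, which matches the difference between the old and the new file.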

I suggest investigating what exactly your old program does and mimicking it by hand. For example, you could create a mapping that assigns every possible non-ISO-8859-1 character to a valid ISO-8859-1 character.
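Such a mapping could be sketched roughly as follows. The table entries, the `?` fallback, and the names `Latin1Mapper` and `ToLatin1Safe` are assumptions for illustration only; the real table has to be derived from what the old program writes.

```csharp
using System.Collections.Generic;
using System.Text;

public static class Latin1Mapper
{
    // Hypothetical replacement table: adjust after checking the old program's output.
    private static readonly Dictionary<char, char> Replacements = new Dictionary<char, char>
    {
        { '\u201E', '"' }, // „ double low-9 quotation mark
        { '\u201C', '"' }, // “ left double quotation mark
        { '\u2013', '-' }, // – en dash
    };

    // Replaces every character outside ISO-8859-1 with its mapped value,
    // or with 'fallback' when the table has no entry for it.
    public static string ToLatin1Safe(string text, char fallback = '?')
    {
        var sb = new StringBuilder(text.Length);
        foreach (char c in text)
        {
            if (c <= '\u00FF')
                sb.Append(c); // U+0000..U+00FF map one-to-one onto ISO-8859-1 bytes
            else if (Replacements.TryGetValue(c, out char mapped))
                sb.Append(mapped);
            else
                sb.Append(fallback);
        }
        return sb.ToString();
    }
}
```

The element would then be built with `new XElement("CODE", Latin1Mapper.ToLatin1Safe("JP„"))`. Note that this simple loop works per UTF-16 code unit, so a character outside the Basic Multilingual Plane would turn into two fallback characters.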
