C#, .NET Core, Encoding.UTF8.GetBytes is not consistent

user1379, published on May 25, 2018, 11:11 am

Update. I created a test project on GitHub, where you can see the tests passing on AppVeyor (Windows) and failing on Travis (both Linux and OSX).
https://github.com/nopara73/UTF8Problems/


I have a .NET Core 2 xUnit test project in which I test the UTF-8 encoding of the string "é". The tests pass on Windows but fail on Linux and OSX.


Code.

[Fact]
public void CanEncode()
{
    var character = "é";
    var encoded = Encoding.UTF8.GetBytes(character);

    var bytes = new byte[] { 195, 169 };

    Assert.Equal(bytes, encoded);
}

[Fact]
public void CanDecode()
{
    var character = "é";

    var bytes = new byte[] { 195, 169 };
    var decoded = Encoding.UTF8.GetString(bytes);

    Assert.Equal(character, decoded);
}

[Fact]
public void CanEncodeDecode()
{
    var character = "é";
    var encoded = Encoding.UTF8.GetBytes(character);

    var decoded = Encoding.UTF8.GetString(encoded);

    Assert.Equal(character, decoded);
}

Failing output. Travis, Linux:


Questions.

  • What is the reason for this behavior?
  • How should I encode such strings to make sure I get identical results, regardless of the platform?
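
One way to narrow down the first question is sketched below. It assumes, and nothing in the post confirms this, that the difference might come from how the compiler decodes the source file on each platform rather than from Encoding.UTF8 itself. Writing the character as the escape "\u00E9" keeps the literal pure ASCII, so the file's encoding cannot affect it. The class name Utf8EscapeTests and the constant EAcute are hypothetical names used only for illustration.

```csharp
using System.Text;
using Xunit;

// Hypothetical diagnostic tests; not part of the original project.
public class Utf8EscapeTests
{
    // U+00E9 (é) written as an escape, so the literal does not depend on
    // how the compiler decodes the bytes of the .cs file.
    private const string EAcute = "\u00E9";

    [Fact]
    public void EscapedLiteralIsOneCodeUnit()
    {
        // A mis-decoded literal would typically show up here as a different
        // length or a different code unit value.
        Assert.Equal(1, EAcute.Length);
        Assert.Equal(0x00E9, (int)EAcute[0]);
    }

    [Fact]
    public void CanEncodeEscapedLiteral()
    {
        var encoded = Encoding.UTF8.GetBytes(EAcute);

        // 0xC3, 0xA9 are the same values as 195, 169 in the tests above.
        Assert.Equal(new byte[] { 0xC3, 0xA9 }, encoded);
    }

    [Fact]
    public void CanDecodeToEscapedLiteral()
    {
        var decoded = Encoding.UTF8.GetString(new byte[] { 0xC3, 0xA9 });

        Assert.Equal(EAcute, decoded);
    }
}
```

If these pass on every platform while the tests with the literal "é" still fail, that would point at the encoding of the source file rather than at Encoding.UTF8, and checking that the .cs file is saved as UTF-8 would be a reasonable next step.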