Question by Nis Baggesen · Jan 02, 2014 at 02:31 PM · www, unicode, mime

WWW Object and UTF8 to Unicode conversion

I'm using the WWW object to retrieve data from a webservice.

The data in question is a JSON structure, and typical data might look like this:

 [{"aktivitet_id":18,"afviklinger":[{"afvikling_id":33,"aktivitet_id":18,"lokale_id":"11","lokale_navn":"B39","start":{"day":"28","month":"3","year":"2013","h":"11","m":0,"date":"28-3-2013","datetime":"28-3-2013 11:00","timestamp":1364464800,"mysql":"2013-03-28 11:00:00"},"end":{"day":"28","month":"3","year":"2013","h":"16","m":0,"date":"28-3-2013","datetime":"28-3-2013 16:00","timestamp":1364482800,"mysql":"2013-03-28 16:00:00"},"linked":0,"length":5},{"afvikling_id":34,"aktivitet_id":18,"lokale_id":"30","lokale_navn":"2.02","start":{"day":"29","month":"3","year":"2013","h":"20","m":0,"date":"29-3-2013","datetime":"29-3-2013 20:00","timestamp":1364583600,"mysql":"2013-03-29 20:00:00"},"end":{"day":"30","month":"3","year":"2013","h":"1","m":0,"date":"30-3-2013","datetime":"30-3-2013 1:00","timestamp":1364601600,"mysql":"2013-03-30 01:00:00"},"linked":0,"length":5}],"info":{"title_da":"Klip en h\u00e6l og hug en t\u00e5","text_da":"Af Michael Sonne-J\u00f8rgensen\nDenne bog var ikke som de andre rejsebeskrivelser, der var dumpet ned fra hullet i paddehatten over\u00a0Nimbuuksens hytte. Forfatteren var dens absolutte yndling, Thotvalius Pl\u00f8, den ber\u00f8mte Klyngenkender, legenden der havde sat sin f\u00f8dder overalt p\u00e5 Klyngen.\nBogen,som var blevet kastet i Utrov\u00e6rdighedens Hul af Ord Arsenalet., beskrev 3 historier, som p\u00e5 forunderligvis blev v\u00e6vet sammen til et storsl\u00e5et eventyr p\u00e5 en megetuforudsigelig dag i Kr\u00e6mmerby.\nTag med p\u00e5 eventyr og oplev de dovne skokkeskubbere, der er blevet sendt til byen for at k\u00f8be l\u00f8gposer,\u00a0men som i stedet har lagt en snedig plan om, at bruge alle pengene til at k\u00f8be flamboyant t\u00f8j, og skifte\u00a0navne til deres helte: Ghungas Gumbanik og det beskidte sl\u00e6ng. 
Hvis de alts\u00e5 lige kan komme af med\u00a0bondemandens nev\u00f8, som er sendt med for at holde \u00f8je med dem.\nOplev, hvordan k\u00f8dsnedkeren Mester Mesterhaks samsurlinge har f\u00e5et en genial ide til at f\u00e5 deres elskede\u00a0og ret s\u00e5 talentl\u00f8se mester \u00f8verst i det nepotistiske k\u00f8dsnedkerhierarki. De vil sl\u00e5 byens absolut hotteste\u00a0cirkustrup ihjel, men f\u00f8rst skal de lige ud af K\u00f8dbyen og over lambruskoernes bro.\nOg sidst, men ikke mindst, tr\u00e6d ind i manegen med B-truppen fra Cirkus Let P\u00e5 T\u00e5, som pr\u00f8ver, med livet\u00a0som indsats, at g\u00f8gle sig igennem en spektakul\u00e6r forestilling for at tilfredsstille Klyngens absolut farligste\u00a0bande: Ghungas Gumbanik og det beskidte sl\u00e6ng, fordi den rigtige cirkustrup p\u00e5 mystisk vis, aldrig dukkede\u00a0op. Men f\u00f8rst skal de lige overbevise de blodt\u00f8rstige k\u00f8dnsedkere, som har hyret Cirkus Let P\u00e5 T\u00e5, om\u00a0at de rent faktisk har noget med cirkusset at g\u00f8re. Og s\u00e5 lige finde ud af, hvordan de f\u00e5r transporteret\u00a0artillerinissens k\u00e6mpe kanon ind i manegen.\n\\\"Klip en h\u00e6l og hug en t\u00e5\\\" er en blanding af almindeligt rollespil, hvor spillerne har ansvaret for, at drive\u00a0eventyret frem via deres roller, og fort\u00e6llerrollespil, hvor spillerne selv skal bidrage med input til\u00a0verdenen.\nVarighed: 3-5 timer\nAntal spillere: 4 spillere og 1 spilleleder\nGenre: Eventyr\nSpillertype: Modne spillere, som har lyst til at v\u00e6re drivkraften i eventyret og tage et stort ansvar i den\u00a0f\u00e6lles oplevelse for at bringe Klyngen til live.\nSpilleledertype: Erfaren spilleleder, der kan improvisere p\u00e5 stedet og holde den strenge disciplin det\u00a0kr\u00e6ver at bevare alvoren i en eventyrlig verden.\nOm forfatteren: Michael har skrevet en masse scenarier i alle mulige genre. 
Dette er det andet scenarie som foreg\u00e5r i den\u00a0eventyrlige og sk\u00f8re verden Klyngen.","description_da":"","title_en":"Heel and toe, cut and go","text_en":"","description_en":"","author":["Michael Sonne-J\u00f8rgensen"],"price":0,"min_player":4,"max_player":4,"type":"rolle","play_hours":5,"language":"dansk+engelsk","wp_id":"4784"}}]

So it is pure textual data encoded as UTF-8. As you can see, it contains a number of Unicode escape sequences as well as line breaks, etc.

However, when I read the www.text member of the WWW object, these escape codes are preserved, even though the text has been converted into a regular C# Unicode string. So even the Unicode version of the string contains, for example, the substring "\u00f8" instead of the properly converted 'ø' character.

I've tried grabbing the www.bytes instead and running those through the System.Text.Encoder.Converter, but that gives me the same result.

Is there some way of getting the raw data, or of making the WWW object aware of the MIME type of the data it is receiving (which is properly specified in the header by the webservice)?

2 Replies

Answer by Briksins · Jan 02, 2014 at 04:21 PM

I believe you should convert the string to UTF-8 manually.

It can be done like this:

 byte[] bytes = Encoding.Default.GetBytes(myString);
 myString = Encoding.UTF8.GetString(bytes);

In your case you can read the WWW response as bytes straight away, and then convert those bytes to a string using the encoder:

 yourString = Encoding.UTF8.GetString(www.bytes);
Comment by Nis Baggesen · Jan 02, 2014 at 05:10 PM

But that is exactly what I'm saying is not working: "I've tried grabbing the www.bytes instead and running those through the System.Text.Encoder.Converter, but that gives me the same result."

I've tried

 string sText= Encoding.UTF8.GetString(www.bytes);

and

 byte[] bConvert = UnicodeEncoding.Convert(Encoding.UTF8,Encoding.Unicode,www.bytes);
 string sText= Encoding.Unicode.GetString(bConvert);

and of course just grabbing the raw www.text.

In all cases I end up with unconverted escape codes in the resulting unicode string.
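
A minimal sketch of why both attempts give the same result, assuming the server sends each escape as six plain ASCII characters (a backslash, a 'u' and four hex digits), as the sample JSON above suggests. The class name `EscapeDemo` is hypothetical and only used for illustration. Decoding those bytes as UTF-8 returns the same six characters, because the escape is a JSON-level convention and not part of the byte encoding.

```csharp
using System;
using System.Text;

public static class EscapeDemo
{
    // Simulates receiving the escape over the wire and decoding it as UTF-8.
    public static string RoundTrip()
    {
        // Verbatim string: six characters, backslash 'u' '0' '0' 'f' '8'.
        byte[] wire = Encoding.ASCII.GetBytes(@"\u00f8");

        // UTF-8 decoding maps each ASCII byte to the same character,
        // so the backslash sequence comes back unchanged.
        return Encoding.UTF8.GetString(wire);
    }
}
```

Under that assumption, `EscapeDemo.RoundTrip()` returns a six-character string and not the single character 'ø'.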

Comment by Nis Baggesen · Jan 02, 2014 at 05:11 PM

But thanks for the answer - At least it tells me that I wasn't quite wrong in what I thought should be working. :)

Answer by Nis Baggesen · Jan 02, 2014 at 06:21 PM

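Oh well - It seems like this is a basic problem with C# and Unicode escape sequences. They are only resolved by the compiler, in very specific places in source code (identifiers, character literals and regular string literals), and not in string data at run time: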

http://msdn.microsoft.com/en-us/library/aa664669%28v=vs.71%29.aspx

Guess I will have to write my own converter.
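
A converter along those lines might look like the sketch below. It is not the code the poster ended up with; the class name `UnicodeUnescape` and method name `Decode` are hypothetical. It assumes only `\uXXXX` escapes need resolving and copies every other backslash pair (`\n`, `\"`, `\\`) through unchanged, so an escaped backslash followed by `u` is not misread as an escape.

```csharp
using System;
using System.Globalization;
using System.Text;

public static class UnicodeUnescape
{
    // Replaces each \uXXXX escape with the UTF-16 code unit it names.
    // All other backslash pairs are copied through as-is.
    public static string Decode(string input)
    {
        var sb = new StringBuilder(input.Length);
        int i = 0;
        while (i < input.Length)
        {
            char c = input[i];
            if (c != '\\' || i + 1 >= input.Length)
            {
                sb.Append(c);
                i++;
                continue;
            }

            ushort code;
            if (input[i + 1] == 'u' && i + 6 <= input.Length &&
                ushort.TryParse(input.Substring(i + 2, 4),
                                NumberStyles.AllowHexSpecifier,
                                CultureInfo.InvariantCulture, out code))
            {
                sb.Append((char)code); // surrogate halves are appended in order
                i += 6;
            }
            else
            {
                // Not a \uXXXX escape: keep the backslash and the next char.
                sb.Append(c).Append(input[i + 1]);
                i += 2;
            }
        }
        return sb.ToString();
    }
}
```

For example, `UnicodeUnescape.Decode(www.text)` would turn `h\u00e6l` into `hæl`. One caveat: if this is run on a whole JSON document before parsing, an escape such as `\u0022` becomes a bare quote and can break the structure, so it is safer to apply it to individual string values. A JSON parser normally resolves these escapes itself when it reads string values.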

Comment by DaveA · Jan 02, 2014 at 06:29 PM

You may still leverage the Encoding class(es) but probably will need more than one line of code.

Comment by anonymous · Nov 22, 2017 at 03:24 AM

Did you ever figure this one out? Three and a half years later having similar issues (see: https://answers.unity.com/questions/1433716/jsonutility-deserialization-via-json-file-doesnt-w.html?childToView=1433926#comment-1433926) and the above MSDN link is expired.

Sorry to bring up a cold case, just seeing if you found a workaround.
