10.23.13


Microsoft Culture Against Another Universal Standard: Unicode

Posted in Standard at 5:31 am by Dr. Roy Schestowitz


Summary: Microsoft’s long battle against character encoding standards such as Unicode, which bridge the gap for communication between people, not just applications

HALF A decade ago we spent a lot of time here promoting open standards — the grooves for connectivity between applications, operating systems, and pertinent pieces of code. Without standards, there is little collaboration because the cost of connecting separate pieces of software is quite high.

Assuming that collaboration is the key to rapid advancement and innovation (reusing knowledge, pooling human resources, etc.), standards are important everywhere we look, e.g. electrics, plumbing, energy, automobiles and so on. Encoding of characters is not everyone’s field of expertise; it is a low-level area of computing, akin to assembly code and little/big endianness. But the principles of standards stay the same across fields and standards are almost always beneficial. I have wasted many hours of my life trying to overcome issues associated with Microsoft’s broken character encodings. It was a long time ago that people came to appreciate the value of consistency in some areas (not to be confused with monoculture or monopoly). But to Microsoft consistency was an evil threat; it threatened its monopoly. The Scientist published a piece called “Standards Needed” [1] not too long ago and Linux Journal praised Unicode [2], which helps bridge character encoding barriers. Thanks to Unicode, many of us out there can access and render pages in almost any language, even rare languages (and even if we cannot understand them). The Register, however, thought it would be productive to bash Unicode [3]. And watch who wrote the piece: a Windowshead. What a surprise!

Related/contextual items from the news:

  1. Opinion: Standards Needed
  2. Unicode

    Let’s give credit where credit’s due: Unicode is a brilliant invention that makes life easier for millions—even billions—of people on our planet. At the same time, dealing with Unicode, as well as the various encoding systems that preceded it, can be an incredibly painful and frustrating experience. I’ve been dealing with some Unicode-related frustrations of my own in recent days, so I thought this might be a good time to revisit a topic that every modern software developer, and especially every Web developer, should understand.

  3. Down with Unicode! Why 16 bits per character is a right pain in the ASCII

    In the beginning – well, not in the very beginning, obviously, because that would require a proper discussion of issues such as parity and error correction and Hamming distances; and the famous quarrel between the brothers ASCII, ISCII, VISCII and YUSCII; and how in the 1980s if you tried to send a £ sign to a strange printer that you had not previously befriended (for example, by buying it a lovely new ribbon) your chances of success were negligible; and, and…

    But you are a busy and important person.

    So in the beginning that began in the limited world of late MS-DOS and early Windows programming, O best beloved, there were these things called “code pages”.

    To the idle anglophone Windows programmer (ie: me) code pages were something horrible and fussy that one hoped to get away with ignoring. I was dimly aware that, to process strings in some of the squigglier foreign languages, it was necessary to switch code page and sometimes, blimey, use two bytes per character instead of just one. It was bad enough that They couldn’t decide how many characters it took to mark the end of a line.

    [...]

    As far as I know, there isn’t a creation myth associated with the unification of the world’s character sets.

    [...]

    For Windows C++ programmers, the manifesto identifies specific techniques to make one’s core code UTF-8 based, including a proto-Boost library designed for the purpose. (Ironically, the first thing you have to do is turn the Unicode switch in the Visual C++ compiler to ‘on’.)

    [...]

    Next weekend I will be scraping all my Unicode files off my hard disk, taking them to the bottom of the garden, and burning them. As good citizens of the digital world, I urge you all to do the same.
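For readers who have not run into the “code pages” and bytes-per-character issues mentioned in the excerpt above, here is a minimal Python sketch. It is only an illustration, not something taken from any of the cited articles: the function names are made up for this example, and the three code pages (cp1252, cp437, cp1251) are an arbitrary pick from the codecs that ship with Python. It shows that a single byte means different things under different legacy code pages, and that the same Unicode character takes a different number of bytes in UTF-8 than in UTF-16.

```python
# Illustrative sketch only: function names and the choice of code pages
# are assumptions made for this example.

def decode_with_code_pages(raw, code_pages=("cp1252", "cp437", "cp1251")):
    """Decode the same bytes under several legacy code pages."""
    return {cp: raw.decode(cp) for cp in code_pages}


def encoded_sizes(text):
    """Return how many bytes the text needs in UTF-8 and in UTF-16."""
    return {
        "utf-8": len(text.encode("utf-8")),
        # "utf-16-le" avoids counting the 2-byte byte-order mark
        "utf-16": len(text.encode("utf-16-le")),
    }


if __name__ == "__main__":
    # One byte, three readings: the result depends on the assumed code page.
    for code_page, char in decode_with_code_pages(b"\xe9").items():
        print(f"byte 0xE9 under {code_page}: {char}")

    # One code point per character, but the byte count depends on the encoding.
    for char in ("A", "£", "€", "😀"):
        sizes = encoded_sizes(char)
        print(f"U+{ord(char):04X}: utf-8={sizes['utf-8']} bytes, "
              f"utf-16={sizes['utf-16']} bytes")
```

Neither encoding is fixed-width: UTF-8 uses one to four bytes per character, while UTF-16 uses two bytes for most characters and four for those outside the Basic Multilingual Plane, such as the emoji in the last line.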


A Single Comment

  1. UndeadPotato said,

    October 23, 2013 at 8:17 am


    It feels like you did not read the article. From what I understood from my reading of it, it’s arguing that UTF-8 is more expandable and does not depend on a certain number of bytes per character. I admit, I don’t fully understand the technical merits of Unicode vs. UTF-8 vs. UTF-16, etc. but there’s more to the article than what you say here. It seems like a reasonable opinion based on valid arguments.
