Do you get a small feeling of excitement when you see a character on a page that is not 0-9 or A-Z? Do you want to spice up your social media posts, writing and digital art with the full range and capabilities of text that computers can handle? This post is a must-read for anyone bored of normal text. (Ｕｎｉｃｏｄｅ ｉｓ ｎｏｔ ｆｏｒ ｎｏｒｍｉｅｓ)
Extremely Brief Explanation Of Unicode
Unicode is the text encoding standard that covers well over a hundred thousand characters. Basically, it lets any computer and web browser display characters like 漢 (as long as the font supports them), and it also includes some special features that we will show you. This was a great step up from the previous leading standard, ASCII, which has only 128 characters (0-9, A-Z, a-z, basic punctuation and control codes) and hence left out most of the world’s writing systems, including those of Asia and the Middle East.
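If you want to poke at this yourself, here is a minimal Python sketch. It only uses the standard library, and the helper name `describe` is made up for this post. It shows that every character has a numbered code point and that an encoding like UTF-8 stores non-ASCII characters in more than one byte.

```python
import unicodedata

def describe(ch: str) -> str:
    """Return the code point, Unicode name and UTF-8 size of one character."""
    code_point = f"U+{ord(ch):04X}"               # e.g. U+0041
    name = unicodedata.name(ch, "<unnamed>")      # official Unicode name
    utf8_len = len(ch.encode("utf-8"))            # bytes needed in UTF-8
    return f"{code_point} {name} ({utf8_len} UTF-8 bytes)"

for ch in "A漢":
    # str.isascii() tells us whether plain old ASCII could have stored it
    print(describe(ch), "| ASCII:", ch.isascii())
```

The ASCII letter fits in one byte, while the Chinese character needs three.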
The standard typing characters aren’t enough for some bands. The witch house genre includes many cool examples of bands using nonstandard/‘special’ characters in their band and song titles. For instance, the band Black Ceiling is often stylised as BL▲CK † CEILING and has song titles including †† [like a prayer].
Another possibility is using the script of a non-English language. In fact, this is by far the most common use of Unicode special characters. For instance, take the excellent song Machine Girl – 覆面調査員 (GabberTrap Mix) [Frenesi remix]. This allows you to include non-English scripts in writing, translations on webpages and exotic characters in your digital art. Translation to Arabic: هذا يسمح لك لتشمل النصوص غير الإنجليزية في الكتابة والترجمات على صفحات الويب والأحرف الغريبة في الفن الرقمي الخاص بك.
You may have noticed something about that previous Arabic translation. Arabic is written right-to-left, whereas English is always written left-to-right.
Unicode includes a feature called ‘bidirectional text’ which allows left-to-right and right-to-left text to be mixed in the same line. Arabic script on its own will go right-to-left unless formatting is applied. Inside a left-to-right text such as an English sentence, each run of Arabic still reads right-to-left, but the runs themselves are laid out left-to-right, and neutral characters like punctuation take their direction from their neighbours, which sometimes puts them in the wrong place. To fix this, you can insert an invisible left-to-right mark (U+200E) or right-to-left mark (U+200F) next to the ambiguous character so that it is treated as part of the run you intended. Here’s an example from this site. Compare:
The title is “مفتاح معايير الويب!” in Arabic.
The title is “مفتاح معايير الويب!” in Arabic.
The second one contains an invisible directional mark that changes the order in which the Arabic phrase and its punctuation are displayed. Nifty.
Using these markers does require a bit of fiddling around though, and it’s a bit difficult because they are invisible. Some text editors, such as Vim, can display them, however.
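If your editor doesn’t show the markers, a few lines of Python can. This is just a sketch: the function name `reveal` is invented for this post, and it assumes that flagging everything in Unicode’s ‘format’ category (Cf) is good enough, which covers the directional marks and the zero-width characters.

```python
import unicodedata

LRM = "\u200e"  # LEFT-TO-RIGHT MARK
RLM = "\u200f"  # RIGHT-TO-LEFT MARK

def reveal(text: str) -> str:
    """Replace invisible format characters (category Cf) with a visible tag."""
    out = []
    for ch in text:
        if unicodedata.category(ch) == "Cf":
            out.append(f"<U+{ord(ch):04X} {unicodedata.name(ch, 'UNKNOWN')}>")
        else:
            out.append(ch)
    return "".join(out)

sample = "abc!" + RLM          # looks exactly like "abc!" on screen
print(reveal(sample))          # the hidden mark is now spelled out
print(unicodedata.bidirectional(LRM), unicodedata.bidirectional(RLM))
```

Paste a suspicious string in place of `sample` to see what is hiding in it.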
It should be noted that some scripts, particularly Chinese and Japanese, often run top-to-bottom. Unicode doesn’t have a feature for this because it is fairly easy to achieve using formatting.
Breaking And Spaces
There is a lot more to spaces than just pressing the space bar. Here is a quick run-through of Unicode’s main ‘space’ characters.
Space: The space bar character. In HTML, multiple spaces are collapsed into one space.
Tab: A flexible-width character that extends to the next tab stop (an imaginary vertical line down the page). Used to align text, but doesn’t work in most cases in HTML.
Carriage return and line feed: These are the Enter/newline characters. Yes, despite appearances, a line break is essentially a character (or two: Windows uses a carriage return followed by a line feed, while Linux and macOS use a line feed alone).
Non-breaking space (nbsp): Like ‘Space’, except multiple nbsp aren’t collapsed into one in HTML. Furthermore, these are non-breaking like non-space characters. Basically, characters that aren’t spaces are treated like words when strung together, i.e. they never get cut in half by the right margin of the page. Non-breaking spaces act in the same way despite being spaces, i.e. you can ‘glue’ two words together with one so that they won’t get separated by the page margin. A cool usage of nbsp that I’ve seen is to add a disconcertingly long separator within your Facebook name. Like: Kate Summer
Zero-width space: An invisible space character. It’s ‘breaking’, which means you can insert it between two consecutive characters so that the line can break there even when it otherwise wouldn’t. E.g. you could insert it between the words in ZeroWidth. Or, a common use is if you have a string of characters like ☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆☆ and you want them to wrap, you insert a zero-width space between each one.
Joiners and Non-joiners: Invisible zero-width characters that cause adjacent characters that would otherwise stay separate to be joined (zero-width joiner), or characters that would otherwise connect to stay separate (zero-width non-joiner). They matter mostly in cursive scripts such as Arabic, and they are fairly advanced and rare characters.
A lot of different sizes of spaces: There are way more than just your standard space. They go from hair space ( ) all the way up to em space ( ).
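The space tricks from the list above can be sketched in a few lines of Python. The helper names `make_breakable` and `glue` are made up for this post; the code points themselves are the real ones.

```python
import unicodedata

ZWSP = "\u200b"  # ZERO WIDTH SPACE: invisible, allows a line break
NBSP = "\u00a0"  # NO-BREAK SPACE: visible, forbids a line break

def make_breakable(text: str) -> str:
    """Insert a zero-width space between every pair of characters."""
    return ZWSP.join(text)

def glue(*words: str) -> str:
    """Join words with non-breaking spaces so they stay on one line."""
    return NBSP.join(words)

stars = make_breakable("☆" * 5)
print(len(stars))  # 9: five stars plus four invisible spaces

# A few of the differently sized spaces, narrowest first
for cp in (0x200A, 0x2009, 0x2002, 0x2003):
    print(f"U+{cp:04X}", unicodedata.name(chr(cp)))
```

Use `glue("Kate", "Summer")` for the name trick, or `NBSP * 10` between the words if you want the disconcertingly long version.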
Unicode Text Variants
Not everyone knows that you don’t need word processors and HTML/CSS to emphasise text and change fonts. You can do so much by just using Unicode characters even though it’s less convenient. But the main advantage is: you can post this almost anywhere, including on Facebook where most people think spicing up your message with ‘fonts’ is out of the question. This site is the go-to for Unicode text conversion. For instance, I want to write ‘Unicode rules!’ in some pretty ways:
And there are sooo many more variants than this.
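In case you are wondering how the converter sites do it, here is a rough Python sketch of one variant. It is a guess at the general approach, not the code of any particular site, and the function name `to_math_bold` is made up. It relies on the fact that Unicode’s ‘mathematical bold’ letters sit in two unbroken runs starting at U+1D400 (A-Z) and U+1D41A (a-z).

```python
import unicodedata

def to_math_bold(text: str) -> str:
    """Swap ASCII letters for their 'mathematical bold' look-alikes."""
    out = []
    for ch in text:
        if "A" <= ch <= "Z":
            out.append(chr(0x1D400 + ord(ch) - ord("A")))
        elif "a" <= ch <= "z":
            out.append(chr(0x1D41A + ord(ch) - ord("a")))
        else:
            out.append(ch)  # digits, spaces and punctuation stay as they are
    return "".join(out)

fancy = to_math_bold("Unicode rules!")
print(fancy)

# NFKC normalisation folds the look-alikes back into ordinary letters
print(unicodedata.normalize("NFKC", fancy))
```

Other styles work the same way with a different starting code point, although some of them (italic and script, for example) have gaps where a letter lives elsewhere in Unicode, so a lookup table is safer for those.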
There are even more variants when you use combining characters which we will talk about at the end. But check out this site if you want to underline text for instance:
Or do some other weird thing like this:
Really Large Characters
There’s a thing called ‘fullwidth’ that makes a character take up two spaces in text (the width of a Chinese or Japanese character). Also, there’s a thing called ‘variable-width spacing’, which means printed characters don’t all have to be the same width. For instance, ‘w’ is wider than ‘i’. Because of these two properties, some characters can be extremely large. Check out the following:
Also, interestingly, here is the most complex Chinese character by strokes (64): ?. And here is the densest one: 龘
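Fullwidth text is easy to make yourself. The sketch below (the function name `to_fullwidth` is invented for this post) uses the fact that the fullwidth forms of the printable ASCII characters sit exactly 0xFEE0 code points above the originals, and it asks Python’s `unicodedata` module how wide Unicode considers a character to be.

```python
import unicodedata

def to_fullwidth(text: str) -> str:
    """Convert printable ASCII to its fullwidth form."""
    out = []
    for ch in text:
        if ch == " ":
            out.append("\u3000")              # IDEOGRAPHIC SPACE
        elif "!" <= ch <= "~":
            out.append(chr(ord(ch) + 0xFEE0))  # fixed offset to U+FF01..U+FF5E
        else:
            out.append(ch)
    return "".join(out)

print(to_fullwidth("Unicode is not for normies"))

# Na = narrow, F = fullwidth, W = wide
for ch in ("A", to_fullwidth("A"), "龘"):
    print(ch, unicodedata.east_asian_width(ch))
```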
Now for something really cool, check out this: ẃ̶̞̉o̴͍̎ẇ̷͖̿ọ̸̄͊w̶̲̬̳͌̌̀͑o̵͓͐w̷̟̠̰̿̀o̴̅͌͜w̴̘̔̂̈́̃ǫ̴̙̔͒̈́͑w̵̛̼̥̳͓̍͠ö̷͓́̈́͋̕ẁ̸̟̗̻
You can seriously have these things going all over the page using this tool or similar.
These little characters above and below the regular ones are called combining characters in Unicode.
You can do all sorts of things like this:
Basically, Unicode has these characters that don’t occupy a space on their own; instead, they combine with the character before them to modify it. There are a lot of these, which allow you to add diacritical marks to any letter, and this is useful when writing in other languages. For instance, you can create this letter x̅ even though Unicode doesn’t have it as a single character. You can do all kinds of things like strike out characters, encircle them, etc.
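Here is a small Python sketch of combining characters in action. The helper name `decorate` is made up; the three marks are real code points. A combining mark is simply written after the character it modifies.

```python
import unicodedata

OVERLINE = "\u0305"  # COMBINING OVERLINE
STRIKE = "\u0336"    # COMBINING LONG STROKE OVERLAY
ACUTE = "\u0301"     # COMBINING ACUTE ACCENT

def decorate(text: str, mark: str) -> str:
    """Put the same combining mark after every character in the text."""
    return "".join(ch + mark for ch in text)

x_bar = "x" + OVERLINE
print(x_bar, len(x_bar))          # looks like one letter, but is 2 code points

print(decorate("struck out", STRIKE))

# When a ready-made character exists, NFC normalisation merges the pair
e_acute = unicodedata.normalize("NFC", "e" + ACUTE)
print(e_acute, len(e_acute))      # a single code point, U+00E9
```

The glitchy text earlier in the post is the same idea taken to extremes: lots of random marks stacked on each letter.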
And now Unicode has emoji combinations, which means you can combine an emoji with an emoji modifier character to make what looks like a new emoji. For instance, you can modify the skin tone of some emojis using one of these modifiers: 🏻 🏼 🏽 🏾 🏿. 👋 + 🏽 = 👋🏽. Also, emojis in sequence with zero-width joiners in between them can combine to make new emojis in some cases. For instance: 👩 + ❤ + 💋 + 👨 = 👩‍❤️‍💋‍👨
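To see what is really inside a combined emoji, you can build one by hand in Python. This is only a sketch, and the particular emojis (a waving hand, a woman and a man) are arbitrary choices for illustration. Whether the result shows up as a single picture depends on your font and platform.

```python
import unicodedata

ZWJ = "\u200d"    # ZERO WIDTH JOINER
VS16 = "\ufe0f"   # VARIATION SELECTOR-16: asks for the colourful emoji look

WAVING_HAND = "\U0001F44B"
MEDIUM_SKIN_TONE = "\U0001F3FD"   # one of the five skin tone modifiers

# Skin tone: the modifier goes straight after the emoji, no joiner needed
toned = WAVING_HAND + MEDIUM_SKIN_TONE

# Joiner sequence: woman, heart, kiss mark and man glued together with ZWJs
kiss = ZWJ.join(["\U0001F469", "\u2764" + VS16, "\U0001F48B", "\U0001F468"])

print(toned, len(toned))   # 2 code points
print(kiss, len(kiss))     # 8 code points

for ch in kiss:
    print(f"U+{ord(ch):04X}", unicodedata.name(ch))
```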
Now Get Creative
Now hopefully your eyes have been opened to a world in which the text that you enter on websites, text documents, Word documents, Photoshop and music titles can be formatted in numerous ways. Your imagination is the only limit when it comes to combining characters, special characters, spacing, text direction and any combination of those.