Codepoints/awesome-codepoints

Awesome Code Points

Jul 18th - Jul 24th, 2022

Other Lists of Code Points

  • Cross-platform terminal characters - a list of characters that work on most terminals.
  • Dec 14th - Dec 20th, 2020

    Record Holders and Extremes

  • A close second place for the most useless code points goes to twelve CJK unified ideographs. These so-called “ghost characters” came to Unicode via the Japanese JIS standard, where they were added because they were misread or misinterpreted from other signs when JIS was compiled from original printed text sources.
  • U+1FBA8 BOX DRAWINGS LIGHT DIAGONAL UPPER CENTRE TO MIDDLE LEFT AND MIDDLE RIGHT TO LOWER CENTRE and U+1FBA9 BOX DRAWINGS LIGHT DIAGONAL UPPER CENTRE TO MIDDLE RIGHT AND MIDDLE LEFT TO LOWER CENTRE - longest name: 88 characters each.
  • Standalone Code Points

  • U+FF03 FULLWIDTH NUMBER SIGN - the “Japanese hashtag”. Sites like Twitter accept it as equivalent to the regular # (U+0023).
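One way an application could treat the two number signs as equivalent is compatibility normalization. The sketch below is only an illustration of that idea; the helper name `normalize_hashtag` is hypothetical, and nothing here claims to be how any particular site implements it.

```python
import unicodedata

FULLWIDTH_NUMBER_SIGN = "\uff03"  # U+FF03


def normalize_hashtag(text: str) -> str:
    """Fold compatibility characters such as U+FF03 into their plain forms (NFKC)."""
    return unicodedata.normalize("NFKC", text)


if __name__ == "__main__":
    tag = FULLWIDTH_NUMBER_SIGN + "unicode"
    # The fullwidth sign is folded to U+0023; the ASCII letters are unchanged.
    print(normalize_hashtag(tag))
```

NFKC also folds many other compatibility characters, so it is a blunt tool if only the number sign should be mapped.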
  • Jan 29th - Feb 4th, 2018

    Record Holders and Extremes

  • U+1F4C0 DVD - only code point name without any vowel (source)
  • Nov 13th - Nov 19th, 2017

    Record Holders and Extremes

  • U+10FFFF (non-character) - last code point. Apart from U+10FFFE, the rest of its plane (the code points in the U+100000 to U+10FFFD range) consists of private use characters, guaranteed never to be assigned by a future Unicode standard.
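The layout of the last plane can be inspected with Python's standard `unicodedata` module. This is a small sketch; the helper name `category_of` is made up for the example.

```python
import sys
import unicodedata


def category_of(code_point: int) -> str:
    """Return the Unicode general category of a code point given as an integer."""
    return unicodedata.category(chr(code_point))


if __name__ == "__main__":
    print(hex(sys.maxunicode))    # highest code point Python accepts
    print(category_of(0x100000))  # "Co": private use
    print(category_of(0x10FFFD))  # "Co": private use
    print(category_of(0x10FFFE))  # "Cn": non-character, reported as unassigned
    print(category_of(0x10FFFF))  # "Cn": non-character, reported as unassigned
```

Anything above U+10FFFF is rejected outright: `chr(0x110000)` raises `ValueError`.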
  • May 15th - May 21st, 2017

    For Funsies

  • U+2800 BRAILLE PATTERN BLANK - a Braille pattern that has zero of its six or eight dots filled in. According to the standard: “while this character is imaged as a fixed-width blank in many fonts, it does not act as a space”. Essentially it is rendered as white-space, but since it is not designated as white-space, it isn't matched by white-space-validating regular expressions. This can be used to bypass all kinds of validation that disallows or trims white-space.
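The effect is easy to reproduce with Python's built-in white-space handling. In this sketch, `survives_blank_check` is a hypothetical stand-in for a typical "must not be empty or only white-space" validation.

```python
import re
import unicodedata

BRAILLE_BLANK = "\u2800"  # U+2800 BRAILLE PATTERN BLANK


def survives_blank_check(value: str) -> bool:
    """Naive validation: accept the value if stripping and \\s matching leave something."""
    return bool(value.strip()) and re.fullmatch(r"\s*", value) is None


if __name__ == "__main__":
    print(survives_blank_check("   "))            # ordinary spaces are rejected
    print(survives_blank_check(BRAILLE_BLANK))    # the Braille blank gets through
    print(unicodedata.category(BRAILLE_BLANK))    # "So" (symbol), not a space separator
```

A stricter check would have to list such look-alike blanks explicitly, since neither `str.strip()` nor `\s` covers them.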
  • Nov 14th - Nov 20th, 2016

    Code Points that Affect Others

  • U+FE0F VARIATION SELECTOR-16 - force colorful emoji. If this code point follows an emoji, an explicit colorful rendering of the emoji is requested (if the client supports it).
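Requesting a presentation style amounts to appending one extra code point. The sketch below uses a hypothetical helper, `with_presentation`, and also covers the text-style counterpart U+FE0E VARIATION SELECTOR-15 that appears elsewhere in this list; whether the request is honoured depends on the font and client.

```python
VS15 = "\ufe0e"  # VARIATION SELECTOR-15: request text (monochrome) presentation
VS16 = "\ufe0f"  # VARIATION SELECTOR-16: request emoji (colorful) presentation


def with_presentation(base: str, emoji: bool = True) -> str:
    """Append the variation selector that requests emoji or text presentation."""
    return base + (VS16 if emoji else VS15)


if __name__ == "__main__":
    heart = "\u2764"  # U+2764 HEAVY BLACK HEART
    print(with_presentation(heart))               # colorful rendering requested
    print(with_presentation(heart, emoji=False))  # monochrome rendering requested
    print(len(with_presentation(heart)))          # base plus selector: 2 code points
```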
  • Nov 23rd - Nov 29th, 2015

    Record Holders and Extremes

  • U+006F LATIN SMALL LETTER O - leads the list of characters with confusable shapes. Of all the possible mappings in the list of confusable characters, the small “o” leads with a whopping 73 entries of similar looking glyphs, followed by U+006C LATIN SMALL LETTER L with 70 entries.
  • Oct 26th - Nov 1st, 2015

    Code Points that Affect Others

  • Skin color of emoji: There are five code points that control the skin color of emoji, U+1F3FB to U+1F3FF. They are called “Emoji Modifier Fitzpatrick Type” 1 to 6 (types 1 and 2 share one modifier), with 1 the palest and 6 the darkest. If one of these characters follows an emoji, that emoji is meant to be rendered in the appropriate skin color of the Fitzpatrick scale. If no such modifier is added, the skin tone should be unnatural, e.g., bright yellow. Fun fact: Since the Fitzpatrick modifiers are normal code points, emoji with such skin colors have a length of 2 code points, which Twitter users noticed first. Here is a comparison chart directly from the specification:

    Code      Name
    U+1F3FB   EMOJI MODIFIER FITZPATRICK TYPE-1-2
    U+1F3FC   EMOJI MODIFIER FITZPATRICK TYPE-3
    U+1F3FD   EMOJI MODIFIER FITZPATRICK TYPE-4
    U+1F3FE   EMOJI MODIFIER FITZPATRICK TYPE-5
    U+1F3FF   EMOJI MODIFIER FITZPATRICK TYPE-6
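The length effect can be checked directly. This is a minimal sketch; `with_skin_tone` and `utf16_length` are illustrative helper names, and U+1F44B WAVING HAND SIGN is just an arbitrary emoji that accepts a modifier.

```python
import unicodedata

# The five modifiers U+1F3FB..U+1F3FF, palest first.
FITZPATRICK_MODIFIERS = [chr(cp) for cp in range(0x1F3FB, 0x1F400)]


def with_skin_tone(emoji: str, modifier_index: int) -> str:
    """Append one of the five Fitzpatrick modifiers (index 0..4) to an emoji."""
    return emoji + FITZPATRICK_MODIFIERS[modifier_index]


def utf16_length(text: str) -> int:
    """Length in UTF-16 code units, which is what JavaScript's .length reports."""
    return len(text.encode("utf-16-le")) // 2


if __name__ == "__main__":
    wave = "\U0001F44B"
    toned = with_skin_tone(wave, 2)  # U+1F3FD, TYPE-4
    print(toned)
    print(len(toned))           # 2 code points
    print(utf16_length(toned))  # 4 UTF-16 code units
    for modifier in FITZPATRICK_MODIFIERS:
        print(f"U+{ord(modifier):04X}", unicodedata.name(modifier))
```

Which "length" a platform reports depends on whether it counts code points, UTF-16 code units, or grapheme clusters.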
  • Oct 19th - Oct 25th, 2015

    Standalone Code Points

  • The code points of the Unicode blocks Box Drawing (U+2500 to U+257F) and Block Elements (U+2580 to U+259F) cover most of your monospace command-line visualization needs.

      ╭───────╮
      │Unicode│
      │rules! │
      ╰┬─────┬╯
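A box like the one above can be generated rather than typed. The sketch below is one possible approach with a hypothetical `draw_box` helper; it uses a plain bottom edge and assumes every character occupies exactly one terminal column, which does not hold for wide or combining characters.

```python
def draw_box(lines: list[str]) -> str:
    """Frame the given lines with rounded Box Drawing characters."""
    width = max((len(line) for line in lines), default=0)
    top = "╭" + "─" * width + "╮"
    body = ["│" + line.ljust(width) + "│" for line in lines]
    bottom = "╰" + "─" * width + "╯"
    return "\n".join([top, *body, bottom])


if __name__ == "__main__":
    print(draw_box(["Unicode", "rules!"]))
```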
    
  • Record Holders and Extremes

  • U+0000 <control> - first code point.
  • U+0F33 TIBETAN DIGIT HALF ZERO - code point that represents the lowest “single-digit” number and at the same time the only negative one, -½.
  • The trophy for most useless code points goes to U+0080, U+0081 and U+0099. These so-called C1 control characters are more or less unspecified. They got into Unicode because they were present in the very first version of what would later become ISO 10646, the ISO-standardized version of Unicode. They were meant to be part of an upgrade to ISO 2022 that never came to be.
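Some of these properties can be read from Python's `unicodedata` module, which reflects the Unicode Character Database version bundled with the interpreter. A short sketch, with `numeric_value` as an illustrative helper name:

```python
import unicodedata


def numeric_value(char: str):
    """Return the numeric value assigned to a character, or None if it has none."""
    return unicodedata.numeric(char, None)


if __name__ == "__main__":
    print(numeric_value("\u0f33"))            # TIBETAN DIGIT HALF ZERO: -0.5
    print(numeric_value("\x00"))              # the first code point has no numeric value
    print(unicodedata.category("\x80"))       # "Cc": a control character
    print(unicodedata.name("\x80", "<none>")) # C1 controls carry no character name
```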
  • Code Points that Affect Others

    Breaking and Gluing other characters

  • U+00A0 NO-BREAK SPACE - force adjacent characters to stick together. Well known as &nbsp; in HTML.
  • U+00AD SOFT HYPHEN - (in HTML: &shy;) like ZERO WIDTH SPACE, but shows a hyphen if (and only if) a break occurs.
  • U+200B ZERO WIDTH SPACE - the inverse to U+00A0: create no space, but allow word breaking.
  • U+200D ZERO WIDTH JOINER - force adjacent characters to be joined together (e.g., Arabic characters or supported emoji). Apple uses this to compose some emoji like different families.
  • U+2060 WORD JOINER - the same as U+00A0, but completely invisible. Good for writing @font-face on Twitter.
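As an illustration of the joiner, a family emoji can be assembled from its members. This is a sketch with a hypothetical `zwj_join` helper; whether the result renders as a single glyph depends on the platform's fonts.

```python
ZWJ = "\u200d"  # U+200D ZERO WIDTH JOINER


def zwj_join(*parts: str) -> str:
    """Glue emoji together with zero width joiners."""
    return ZWJ.join(parts)


if __name__ == "__main__":
    man, woman, girl = "\U0001F468", "\U0001F469", "\U0001F467"
    family = zwj_join(man, woman, girl)
    print(family)                  # one family glyph where supported
    print(len(family))             # 5 code points: three emoji plus two joiners
    print(family.split(ZWJ))       # splitting on the joiner recovers the members
    print("\u200b".isspace())      # ZERO WIDTH SPACE is not white-space for str methods
```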
  • Oct 12th - Oct 18th, 2015

    Record Holders and Extremes

  • U+5146 and U+16B61 - code points that represent the highest “single-digit” number. In both cases that’s 1,000,000,000,000, a trillion.
  • U+1F402 OX - shortest name.
  • U+FDFA ARABIC LIGATURE SALLALLAHOU ALAYHE WASALLAM - longest decomposition form: 18 characters.
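The name and the decomposition length can be looked up with `unicodedata`. A brief sketch, where `decomposed_length` is a helper introduced only for this example:

```python
import unicodedata


def decomposed_length(char: str) -> int:
    """Number of code points in the full compatibility decomposition (NFKD)."""
    return len(unicodedata.normalize("NFKD", char))


if __name__ == "__main__":
    print(unicodedata.name("\U0001F402"))  # OX
    print(unicodedata.name("\ufdfa"))      # the ligature's full name
    print(decomposed_length("\ufdfa"))     # 18
    print(decomposed_length("a"))          # 1: no decomposition
```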
  • For Funsies

    Games

  • Chess figures.
  • Card suits and even a whole deck of cards complete with joker and back of card.
  • Die faces and a nice die emoji.
  • Go pieces.
  • Draughts (or checkers) pieces.
  • Shogi pieces, a Japanese variant of chess.
  • Domino tiles
  • Mahjong tiles
  • Standalone Code Points

  • U+FEFF ZERO WIDTH NO-BREAK SPACE - its name suggests that it can be used like U+2060 WORD JOINER, and in fact the latter was introduced to inherit its semantics. This is because U+FEFF had become a special beacon called the byte order mark, which is placed at the beginning of some UTF-8 files. In complying software (including many text editors) this character is stripped from the start of a file and handled as metadata. In non-complying software (like the PHP interpreter) this leads to all sorts of fun behaviour.
  • U+2E2E REVERSED QUESTION MARK - the “irony mark” to express irony/sarcasm. A useful character⸮
  • U+D800 to U+DFFF - surrogate code points. They are only reserved to ease UTF-16 encoding.
  • U+FFFD REPLACEMENT CHARACTER - when a character cannot be displayed (e.g., when decoding an erroneous UTF-8 sequence), this code point steps into the breach.
  • U+1D455 is missing. It would be an italic small “h”. It was not encoded, because it would be identical to the Planck constant ℎ (U+210E).
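Several of these code points show up as codec behaviour. The sketch below uses Python's standard codecs; the two helper names are made up for the example.

```python
def decode_keeping_bom(data: bytes) -> str:
    """Plain UTF-8 decoding leaves a leading U+FEFF in the text."""
    return data.decode("utf-8")


def decode_stripping_bom(data: bytes) -> str:
    """The 'utf-8-sig' codec treats a leading byte order mark as metadata and drops it."""
    return data.decode("utf-8-sig")


if __name__ == "__main__":
    data = b"\xef\xbb\xbfhello"  # UTF-8 encoding of U+FEFF followed by "hello"
    print(repr(decode_keeping_bom(data)))    # '\ufeffhello'
    print(repr(decode_stripping_bom(data)))  # 'hello'

    # Invalid bytes become U+FFFD when errors="replace" is requested.
    print(repr(b"abc\xff".decode("utf-8", errors="replace")))

    # A lone surrogate is not a character and cannot be encoded as UTF-8.
    try:
        "\ud800".encode("utf-8")
    except UnicodeEncodeError as exc:
        print("cannot encode surrogate:", exc.reason)
```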
  • Code Points that Affect Others

  • The Regional Indicator Symbols U+1F1E6 to U+1F1FF correspond to the 26 Latin letters. They are used to create flag emoji. Since the Unicode consortium didn’t feel like getting on board with international politics, the solution for flags is to combine two of these 26 characters into the respective ISO code for a country. Examples:

    Country   ISO Code   Code Points          Emoji (if supported)
    USA       US         U+1F1FA + U+1F1F8    🇺🇸
    Germany   DE         U+1F1E9 + U+1F1EA    🇩🇪
    China     CN         U+1F1E8 + U+1F1F3    🇨🇳
  • U+FE0E VARIATION SELECTOR-15 - force black-&-white emoji. If this code point follows an emoji, an explicit monochrome rendering of the emoji is requested (if the client supports it).
  • U+202D LEFT-TO-RIGHT OVERRIDE and U+202E RIGHT-TO-LEFT OVERRIDE - change the text direction.

  • Diacritics and combining marks: There is a host of characters that add to the character before them. Those are called Combining Marks. Unicode provides a handy FAQ on the details, but in a nutshell: If you add one after a character, it is placed on top of that previous one. So, a + ̊ = å. This may lead to all kinds of funny problems, because for some combinations there are pre-composed characters. Our little å here can also be encoded as U+00E5. Note that while this has a length of one character, the combination of a and the combining ring has a length of two characters.

    Of course, one can also do fun things with those characters like this answer on StackOverflow.
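Two of the mechanisms in this section, flag composition and combining marks, can be sketched in a few lines. The helpers `flag` and `compose` are hypothetical names; `flag` simply maps each letter to its Regional Indicator Symbol and does not check that the code is an assigned ISO country code.

```python
import unicodedata

REGIONAL_INDICATOR_A = 0x1F1E6  # REGIONAL INDICATOR SYMBOL LETTER A


def flag(iso_code: str) -> str:
    """Map a two-letter code such as 'US' to its pair of Regional Indicator Symbols."""
    letters = iso_code.upper()
    if len(letters) != 2 or not all("A" <= ch <= "Z" for ch in letters):
        raise ValueError("expected two ASCII letters")
    return "".join(chr(REGIONAL_INDICATOR_A + ord(ch) - ord("A")) for ch in letters)


def compose(text: str) -> str:
    """Replace base-plus-combining-mark sequences by pre-composed characters (NFC)."""
    return unicodedata.normalize("NFC", text)


if __name__ == "__main__":
    print(flag("US"), flag("DE"), flag("CN"))

    decomposed = "a\u030a"  # a + COMBINING RING ABOVE
    print(len(decomposed), len(compose(decomposed)))  # 2 1
    print(compose(decomposed) == "\u00e5")            # True
```

Normalizing both sides before comparing strings avoids the mismatch between the one-character and two-character encodings of å.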

  • For Funsies

  • U+1DD2 COMBINING US ABOVE - this is the most romantic code point.
  • U+1F596 RAISED HAND WITH PART BETWEEN MIDDLE AND RING FINGERS - the Vulcan salute. Live long and prosper! 🖖
  • U+1F918 SIGN OF THE HORNS - Rock on! 🤘
  • U+1680 OGHAM SPACE MARK - a space that looks like a dash. Great to bring programmers close to madness: 1 +  2 === 3.
  • U+037E GREEK QUESTION MARK - a look-alike to the semicolon. Also a fun way to annoy developers.
  • U+F8FF PRIVATE USE CODEPOINT - this private use code point is rendered as Apple logo on many Apple devices.
  • U+1F574 MAN IN BUSINESS SUIT LEVITATING - a rather curious character that only made it into Unicode because of its appearance in the Webdings font (for reasons of backwards compatibility).
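The look-alikes above can be told apart by their properties. The sketch below shows one possible check using `unicodedata`; `describe` is an illustrative helper, not part of any library.

```python
import unicodedata

OGHAM_SPACE_MARK = "\u1680"
GREEK_QUESTION_MARK = "\u037e"
APPLE_LOGO_PUA = "\uf8ff"


def describe(char: str) -> str:
    """Return 'U+XXXX <category> <name>' for a single character."""
    name = unicodedata.name(char, "<unnamed>")
    return f"U+{ord(char):04X} {unicodedata.category(char)} {name}"


if __name__ == "__main__":
    for ch in (OGHAM_SPACE_MARK, GREEK_QUESTION_MARK, ";", APPLE_LOGO_PUA):
        print(describe(ch))

    # The Ogham space mark counts as white-space even though it draws a dash.
    print(OGHAM_SPACE_MARK.isspace())

    # The Greek question mark is a different code point from ";" ...
    print(GREEK_QUESTION_MARK == ";")
    # ... but canonical normalization maps it to the ordinary semicolon.
    print(unicodedata.normalize("NFC", GREEK_QUESTION_MARK) == ";")
```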
  • Last Checked At: 2022-09-21T15:12:48.638Z