The `runes` Property

Posted on April 05, 2020 in Dart

Background

In the digital world, every character, digit, or symbol is identified by a unique numeric value called a Unicode code point. The Unicode standard defines three encoding forms for these code points: UTF-8 (8-bit units), UTF-16 (16-bit units), and UTF-32 (32-bit units). Text is converted to one of these encodings when it is stored or transferred. In this article, we'll explore runes, a property of Dart's String class that exposes a string's Unicode code points.

What is the runes Property?

In Dart, a string is represented as a sequence of UTF-16 code units. A code unit is the fixed-size chunk of bits that an encoding works with; for UTF-16, each code unit is 16 bits.

Code Point: A code point is the numeric value that Unicode assigns to a character. A character as the reader sees it is made up of one or more code points, and each code point is encoded as one or more code units.

Code Unit: A code unit is the smallest unit of storage that an encoding uses. In UTF-16, a code unit is 16 bits (two bytes), and a code point takes either one or two code units.

The runes property returns the Unicode code points of a string as an iterable of type Runes. A Runes object is an Iterable<int> in which each integer is one Unicode code point of the String.
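To make the distinction between code units and code points concrete, here is a minimal sketch that counts both for the same string. The helper names codeUnitCount and codePointCount are made up for this illustration; they only wrap the length and runes members of String.

```dart
/// Number of UTF-16 code units in [s]; this is what `String.length` reports.
/// (Hypothetical helper, named for this example only.)
int codeUnitCount(String s) => s.length;

/// Number of Unicode code points in [s], counted through `runes`.
/// (Hypothetical helper, named for this example only.)
int codePointCount(String s) => s.runes.length;

void main() {
  const ascii = 'Mask';
  const withEmoji = 'A\u{1F637}'; // 'A' followed by the code point U+1F637

  // Every character of 'Mask' fits in one code unit.
  print('${codeUnitCount(ascii)} code units, '
      '${codePointCount(ascii)} code points'); // 4 code units, 4 code points

  // U+1F637 needs two code units but is still a single code point.
  print('${codeUnitCount(withEmoji)} code units, '
      '${codePointCount(withEmoji)} code points'); // 3 code units, 2 code points
}
```

The two counts only differ when the string contains a code point above U+FFFF.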


Usage

The String class offers the following members for reading a string's code units and code points.

The codeUnitAt Method

This method returns the UTF-16 (16-bit) code unit at the given index for the String.

  String myStr = "Mask";

  ///Code unit at 0th position is M.
  ///UTF-16 decimal representation for ASCII character 'M' is 77
  ///Reference: https://www.fileformat.info/info/unicode/char/004d/index.htm
  print("Code Unit for Letter M: ${myStr.codeUnitAt(0)}");

Output:

Code Unit for Letter M: 77
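As a small follow-up, codeUnitAt can be called for every index from 0 to length - 1 to walk the whole string. The sketch below does that with a hypothetical helper, collectCodeUnits, which is not part of the String API.

```dart
/// Collects every UTF-16 code unit of [s] by calling `codeUnitAt` per index.
/// (Hypothetical helper, named for this example only.)
List<int> collectCodeUnits(String s) =>
    [for (var i = 0; i < s.length; i++) s.codeUnitAt(i)];

void main() {
  print(collectCodeUnits('Mask')); // [77, 97, 115, 107]
}
```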

The codeUnits Property

This property provides an unmodifiable list of the UTF-16 code unit(s) of this string.

  String myStr = "Mask";
  print("Code Units for word Mask: ${myStr.codeUnits}");

Output:

Code Units for word Mask: [77, 97, 115, 107]
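Because the list is unmodifiable, writing to it fails, and a copy is needed before changing any code unit. The sketch below shows one way to do that; editableCodeUnits and codeUnitsRejectWrites are hypothetical helper names used only here.

```dart
/// Returns a modifiable copy of the code units of [s].
/// (Hypothetical helper, named for this example only.)
List<int> editableCodeUnits(String s) => List<int>.of(s.codeUnits);

/// Returns true if writing to `s.codeUnits` is rejected with UnsupportedError.
/// (Hypothetical helper, named for this example only.)
bool codeUnitsRejectWrites(String s) {
  try {
    s.codeUnits[0] = 0;
    return false;
  } on UnsupportedError {
    return true;
  }
}

void main() {
  const myStr = 'Mask';
  print(codeUnitsRejectWrites(myStr)); // true

  // Work on a copy instead of the unmodifiable view.
  final units = editableCodeUnits(myStr);
  units[0] = 84; // 84 is the code unit for 'T'
  print(String.fromCharCodes(units)); // Task
}
```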

The runes Property

This property returns the Unicode code points of the given string. For "Mask", every character fits in a single UTF-16 code unit, so the code points are the same as the code units. myStr.runes returns a Runes object, which is an Iterable<int>.

  String myStr = "Mask";
  print("Runes for word Mask: ${myStr.runes}");

Output:

Runes for word Mask: (77, 97, 115, 107)
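Since runes is an Iterable<int>, it can be used in a loop like any other iterable. As a rough sketch, the hypothetical helper splitByCodePoint below turns each code point back into a one-character string with String.fromCharCode.

```dart
/// Splits [s] into strings that each hold exactly one code point.
/// (Hypothetical helper, named for this example only.)
List<String> splitByCodePoint(String s) =>
    [for (final rune in s.runes) String.fromCharCode(rune)];

void main() {
  print(splitByCodePoint('Mask')); // [M, a, s, k]
}
```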

Example

Let's use the hex representation of the word "Mask". The hex value for "M" is "004D", for "a" is "0061", for "s" is "0073", and for "k" is "006B".

//Representing Mask in hex
String hexStr = "\u004D\u0061\u0073\u006B";
print("From hex to String - maskStr: ${hexStr}");

Runes codeUnits = Runes(hexStr);
print("Code Units: ${codeUnits}");
print("Code Units To String: ${String.fromCharCodes(codeUnits)}");

Output:

From hex to String - maskStr: Mask
Code Units: (77, 97, 115, 107)
Code Units To String: Mask
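Going the other way, the decimal values printed above can be turned back into hex. The sketch below assumes a hypothetical helper, toUnicodeNotation, that formats a code point in the usual U+XXXX style using int.toRadixString.

```dart
/// Formats [codePoint] in U+XXXX notation, padded to at least four hex digits.
/// (Hypothetical helper, named for this example only.)
String toUnicodeNotation(int codePoint) {
  final hex = codePoint.toRadixString(16).toUpperCase().padLeft(4, '0');
  return 'U+$hex';
}

void main() {
  print('Mask'.runes.map(toUnicodeNotation).toList());
  // [U+004D, U+0061, U+0073, U+006B]
}
```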

Example

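Let's try another example: getting the runes of a symbol whose code point does not fit in 16 bits, the Unicode character for the mask emoji. In UTF-16 it is stored as two code units (a surrogate pair), but runes reports it as a single code point.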

//Unicode character encoding : https://www.compart.com/en/unicode/U+1F637
//Decimal representation : 128567
String smileyMask = "\u{1F637}"; //code point above U+FFFF: takes two UTF-16 code units
print("Smiley with mask: ${smileyMask}");

// Prints Code Points for smiley with mask
print("Code Points in smiley with mask: ${smileyMask.runes}");

//Printing String for code points
print(String.fromCharCodes(Runes(smileyMask)));

Output:

Smiley with mask: 😷
Code Points in smiley with mask: (128567)
😷
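To see why runes reports one value here, it helps to look at the two code units behind the emoji. The sketch below recombines them by hand using the standard UTF-16 surrogate arithmetic; combineSurrogates is a hypothetical helper written for this illustration, not something the String class provides.

```dart
/// Combines a UTF-16 surrogate pair into the code point it encodes.
/// [high] must be in 0xD800..0xDBFF and [low] in 0xDC00..0xDFFF.
/// (Hypothetical helper, named for this example only.)
int combineSurrogates(int high, int low) =>
    0x10000 + ((high - 0xD800) << 10) + (low - 0xDC00);

void main() {
  const smileyMask = '\u{1F637}';

  // The string holds two code units: a high and a low surrogate.
  final units = smileyMask.codeUnits;
  print(units); // [55357, 56887]

  // Recombining them by hand gives the same value that runes reports.
  print(combineSurrogates(units[0], units[1])); // 128567
  print(smileyMask.runes.single); // 128567
}
```

This is also why codeUnitAt(0) on this string returns 55357 rather than 128567: it reads a single code unit, not a whole code point.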

Source Code

Please check out the source code on GitHub here


Happy Darting :)
