Text.lk SMS Gateway Sri Lanka | Cheapest price on the market (0.64LKR per SMS)


How Does Text.lk Encode Your Messages? UTF-16, GSM-7, GSM-7 Extension

This guide explains GSM-7, the GSM-7 Extension, and UTF-16 encoding as used on text.lk, in plain terms for all users.


GSM-7, GSM-7 Extension, and UTF-16 Encoding on text.lk

1. GSM-7 Character Encoding (GSM_7BIT)

GSM-7 is the main character set used for sending SMS messages on mobile networks. It includes the most common letters, numbers, and symbols used in many Western languages.

  • How it works: Each character is stored in 7 bits instead of a full 8-bit byte, so the 140-byte SMS payload holds up to 160 characters in one SMS message.
  • What it includes: Basic Latin letters (A-Z, a-z), digits (0-9), and common punctuation marks.
  • Limitations: It cannot represent non-Latin scripts (such as Sinhala or Tamil), emojis, or many special symbols.
  • Special characters: Some symbols, such as {, }, |, and ^, need an extra escape code, so each one takes up two character spaces in the message.
  • Why use it: It’s efficient and lets you send more characters per SMS if you only use supported characters [1, 2, 4].
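The 160-character figure and the "supported characters" check can be sketched in a few lines of Python. This is a minimal illustration, not text.lk's own code: the character table is the GSM 03.38 default alphabet written out by hand, and the names `GSM7_BASIC`, `max_chars`, and `is_gsm7_basic` are made up for this example.

```python
# GSM 03.38 basic character set (default alphabet), typed out by hand.
# The escape code itself is not a printable character, so it is left out.
GSM7_BASIC = set(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)

SMS_PAYLOAD_BYTES = 140  # user-data size of a single SMS


def max_chars(bits_per_char):
    """How many fixed-width characters fit in one SMS payload."""
    return SMS_PAYLOAD_BYTES * 8 // bits_per_char


def is_gsm7_basic(text):
    """True if every character is in the basic GSM-7 table (hypothetical helper)."""
    return all(ch in GSM7_BASIC for ch in text)


print(max_chars(7))                    # 7-bit characters per SMS
print(is_gsm7_basic("Hello, world!"))  # plain Latin text
print(is_gsm7_basic("Price: {5}"))     # braces are not in the basic table
```

Braces fail this check because they belong to the extension table rather than the basic one.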

Learn More about GSM-7 Character Encoding (GSM_7BIT)

2. GSM-7 Extension Table (GSM_7BIT_EX)

The GSM-7 Extension is a small extra set of characters not included in the main GSM-7 alphabet.

  • How it works: These extra characters are sent using an escape code followed by the character code.
  • Examples: Characters like €, ^, {, }, \, [, ], ~, and |.
  • Impact: Each extended character counts as two characters in your SMS, reducing the total number of characters you can send in one message.
  • When it matters: If your message includes any of these extended characters, it will use more space and might cost more [1, 4].
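The "counts as two" rule can be illustrated with a short Python sketch. It assumes the text contains only GSM-7 characters (basic or extended); the set below follows the GSM 03.38 extension table, and the function name `gsm7_length` is hypothetical rather than anything text.lk exposes.

```python
# Characters in the GSM 03.38 extension table. Each one is sent as an
# escape code followed by a character code, so it occupies two slots.
GSM7_EXTENSION = set("^{}\\[~]|€\f")


def gsm7_length(text):
    """Length in 7-bit slots, assuming text uses only GSM-7 characters."""
    return sum(2 if ch in GSM7_EXTENSION else 1 for ch in text)


print(gsm7_length("Hello"))     # no extended characters
print(gsm7_length("{Hello}"))   # two braces, each counted twice
print(gsm7_length("Cost: 5€"))  # the euro sign is counted twice
```

A message that looks like 160 characters on screen can therefore exceed the single-SMS limit if it contains even one extended character.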

Learn More about GSM-7 Extension Table (GSM_7BIT_EX)

3. UTF-16 Character Encoding

UTF-16 is a different way to encode text that can represent every Unicode character, including emojis, non-Latin scripts, and special symbols.

  • How it works: Most common characters use 2 bytes (16 bits). Characters like emojis or historic scripts use 4 bytes (two 16-bit units).
  • When it’s used: If your SMS contains any character not supported by GSM-7 or its extension (such as emojis, or Sinhala, Tamil, or Chinese characters), the whole message is sent using UTF-16.
  • Effect on SMS length: Because most characters use 2 bytes, the limit drops to 70 characters for a single SMS, or 67 characters per part when a longer message is split into parts. Emojis and other 4-byte characters count as two.
  • Why it matters: UTF-16 messages can cost more because fewer characters fit in one SMS, so the same text may need more parts [4, 5].
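The 70 and 67 figures follow from simple arithmetic on the 140-byte SMS payload. The Python sketch below shows that arithmetic and how to count 16-bit units in a string. It assumes the common 6-byte header that multi-part messages carry; the constant and function names are illustrative only.

```python
SMS_PAYLOAD_BYTES = 140   # user-data size of a single SMS
CONCAT_HEADER_BYTES = 6   # assumed header size in each part of a split message


def utf16_units(text):
    """Number of 16-bit units the text needs in UTF-16."""
    # "utf-16-be" avoids the 2-byte byte-order mark that "utf-16" would add.
    return len(text.encode("utf-16-be")) // 2


single_limit = SMS_PAYLOAD_BYTES // 2                        # one-part message
part_limit = (SMS_PAYLOAD_BYTES - CONCAT_HEADER_BYTES) // 2  # each part of a split message

print(single_limit, part_limit)
print(utf16_units("hello"))  # one unit per Latin letter
print(utf16_units("ආ"))      # a Sinhala letter fits in one unit
print(utf16_units("😀"))     # an emoji needs two units
```

Counting units rather than visible characters explains why a message with several emojis reaches the limit sooner than expected.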

Learn More about UTF-16 Character Encoding


Summary for text.lk Users

  • Use GSM-7 characters for longer SMS messages and lower costs.
  • If you use special symbols from the GSM-7 Extension, each one counts as two characters, so slightly fewer characters fit in one SMS.
  • If you include emojis or non-Latin characters, your message switches to UTF-16, which cuts the limit from 160 to 70 characters per SMS and may increase cost.
  • Always check your message content to know which encoding will be used and how many characters you can send in one SMS.
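One way to check a message before sending is a small estimator like the Python sketch below. It is a rough illustration under stated assumptions, not text.lk's actual counter: the character tables are typed out from GSM 03.38, the function name `estimate_sms` is hypothetical, and the per-part limits of 153 (GSM-7) and 67 (UTF-16) assume the usual 6-byte header on split messages. It also ignores the detail that a two-slot character is never split across parts, so the part count can occasionally be one too low.

```python
import math

# GSM 03.38 basic table and extension table, typed out by hand.
GSM7_BASIC = set(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
GSM7_EXTENSION = set("^{}\\[~]|€\f")


def estimate_sms(text):
    """Return (encoding, units used, estimated number of parts)."""
    if all(ch in GSM7_BASIC or ch in GSM7_EXTENSION for ch in text):
        # Extended characters take two 7-bit slots each.
        units = sum(2 if ch in GSM7_EXTENSION else 1 for ch in text)
        encoding, single, multi = "GSM-7", 160, 153
    else:
        # Any other character forces the whole message into UTF-16.
        units = len(text.encode("utf-16-be")) // 2
        encoding, single, multi = "UTF-16", 70, 67
    parts = 1 if units <= single else math.ceil(units / multi)
    return encoding, units, parts


print(estimate_sms("Your code is 1234"))
print(estimate_sms("Thanks! 😀"))
print(estimate_sms("a" * 161))
```

A single emoji in an otherwise plain message is enough to switch the encoding, which is usually the first thing to look for when a short message is billed as more than one SMS.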

This simple guide helps you understand how text.lk handles SMS character encoding to optimize message length and cost.

Check sources

  1. https://supportcenter.everbridge.com/hc/en-us/articles/19141701423771-EBS-GSM-7-and-UCS-2-Character-Encodings-and-Their-Importance-When-Sending-SMS-Messages
  2. https://en.wikipedia.org/wiki/GSM_03.38
  3. https://www.infobip.com/docs/sms/language
  4. https://help.aninja.com/knowledge-base/sms-encoding-gsm-7-and-ucs-2/
  5. https://thesmsworks.co.uk/blog/gsm-character-set/
  6. https://help.goacoustic.com/hc/en-us/articles/360043843154–How-character-encoding-affects-SMS-message-length
  7. https://support.dotdigital.com/en/articles/8199189-gsm-character-set

