Text.lk SMS Gateway Sri Lanka | Cheapest price on the market (0.64LKR per SMS)


What is GSM-7 Character Encoding in SMS?

GSM-7 is the foundational character encoding standard for SMS messaging across GSM mobile networks. It is designed to maximize the efficiency of text message transmission by encoding the most commonly used Latin letters, numbers, and symbols into 7 bits per character, as opposed to the more typical 8 bits used in other encodings [1][2][5].

What is GSM-7?

GSM-7, defined in the GSM 03.38 and 3GPP TS 23.038 standards, is a 7-bit character encoding scheme specifically optimized for SMS (Short Message Service), Cell Broadcast (CB), and USSD (Unstructured Supplementary Service Data) on GSM networks. By using only 7 bits per character, it allows more characters to be packed into the limited data space of an SMS message.

Key Features of GSM-7

Efficient Use of SMS Space

  • Each SMS message is transmitted in a payload of 140 bytes (1,120 bits).
  • With GSM-7, each character uses 7 bits, allowing up to 160 characters per SMS:
\[ \frac{140 \times 8\ \text{bits}}{7\ \text{bits/character}} = 160\ \text{characters} \]
  • This efficiency is crucial for applications where message length directly impacts cost and user experience.
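To make the arithmetic above concrete, here is a minimal Python sketch of 7-bit packing, the step that squeezes 160 characters into 140 bytes. The function name `pack_septets` is illustrative, not taken from any library, and the sample text uses plain lowercase letters because their GSM-7 codes happen to match their ASCII codes.

```python
def pack_septets(septets):
    """Pack 7-bit values into bytes, least significant bits first."""
    bits = 0    # bit accumulator
    nbits = 0   # number of valid bits currently in the accumulator
    out = bytearray()
    for s in septets:
        bits |= (s & 0x7F) << nbits
        nbits += 7
        while nbits >= 8:
            out.append(bits & 0xFF)
            bits >>= 8
            nbits -= 8
    if nbits:
        # Flush the last partial byte; unused high bits stay zero.
        out.append(bits & 0xFF)
    return bytes(out)


# Plain letters share their code between ASCII and GSM-7,
# so ord() is good enough for this demonstration.
packed = pack_septets([ord(c) for c in "hello"])
print(packed.hex())                        # e8329bfd06
print(len(pack_septets([0x41] * 160)))     # 140 bytes for 160 characters
```

Five characters take 35 bits, so they fit in 5 bytes, and 160 characters take exactly 1,120 bits, which is the full 140-byte payload.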

Character Set Coverage

  • The GSM-7 default alphabet consists of 128 characters, including:
    • Uppercase and lowercase Latin letters (A-Z, a-z)
    • Digits (0-9)
    • Basic punctuation and several special symbols
  • It covers most Western European languages and some additional European characters, but does not support all international scripts (e.g., Cyrillic, Arabic, Asian scripts).

Special and Extended Characters

  • Some characters, such as { } [ ] | ^ ~ \ and €, are not part of the default set.
  • These require an escape sequence: an escape character (0x1B) followed by the code for the special character.
  • Each such character consumes two character slots in the message, reducing the effective maximum length.
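The escape mechanism can be sketched in a few lines of Python. This is a simplified illustration, not a production encoder: the names `GSM7_BASIC`, `GSM7_EXT`, and `gsm7_encode` are made up for this example, and national language shift tables are ignored. The code values follow the default alphabet and extension table in 3GPP TS 23.038.

```python
# Default alphabet in code order (0x00 to 0x7F); position 0x1B is the escape.
GSM7_BASIC = (
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞ\x1bÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
# Extension table: character -> code sent after the escape (0x1B).
GSM7_EXT = {
    "\f": 0x0A, "^": 0x14, "{": 0x28, "}": 0x29, "\\": 0x2F,
    "[": 0x3C, "~": 0x3D, "]": 0x3E, "|": 0x40, "€": 0x65,
}
ESCAPE = 0x1B
BASIC_CODES = {ch: i for i, ch in enumerate(GSM7_BASIC) if i != ESCAPE}


def gsm7_encode(text):
    """Return the list of 7-bit codes for text, or raise ValueError."""
    codes = []
    for ch in text:
        if ch in BASIC_CODES:
            codes.append(BASIC_CODES[ch])
        elif ch in GSM7_EXT:
            codes.extend([ESCAPE, GSM7_EXT[ch]])  # two slots
        else:
            raise ValueError(f"{ch!r} is not in the GSM-7 tables used here")
    return codes


print(gsm7_encode("€"))          # [27, 101]
print(len(gsm7_encode("[ok]")))  # 6 slots for 4 visible characters
```

The length of the returned list is the number of character slots the text consumes, so a message made up only of extension characters tops out at 80 visible characters instead of 160.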

National Language Support

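  • For languages whose characters are missing from the default alphabet (for example Turkish, Spanish, and Portuguese), GSM-7 can be extended with national language shift tables: a locking shift table replaces the default alphabet, and a single shift table replaces the extension table.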
  • If a message contains characters outside the GSM-7 set and no suitable shift table is available, encoding switches to UCS-2 (Unicode), which uses 16 bits per character and reduces the SMS capacity to 70 characters per message.
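The fallback rule can be illustrated with a short Python sketch. It assumes no national language shift table is in use, so anything outside the default alphabet and its extension table forces UCS-2. The names `pick_encoding` and `single_sms_limit` are hypothetical helpers for this example only.

```python
# Default alphabet in code order; "\x1b" is the escape code, not a character.
GSM7_BASIC = (
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞ\x1bÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
GSM7_EXTENSION = "^{}\\[~]|€"
GSM7_ALLOWED = (set(GSM7_BASIC) - {"\x1b"}) | set(GSM7_EXTENSION)

PAYLOAD_BYTES = 140  # payload of a single SMS


def pick_encoding(text):
    """Return "GSM-7" if every character is encodable, else "UCS-2"."""
    return "GSM-7" if all(ch in GSM7_ALLOWED for ch in text) else "UCS-2"


def single_sms_limit(encoding):
    """Characters that fit in one SMS for the given encoding."""
    bits_per_char = 7 if encoding == "GSM-7" else 16
    return (PAYLOAD_BYTES * 8) // bits_per_char


for sample in ["Hello, world!", "Olá", "ආයුබෝවන්"]:
    enc = pick_encoding(sample)
    print(sample, enc, single_sms_limit(enc))
```

Note that a single unsupported character is enough to switch the whole message: "Olá" falls back to UCS-2 because á, unlike à, is not in the default alphabet.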

GSM-7 vs. Unicode (UCS-2) Encoding

| Feature | GSM-7 | Unicode (UCS-2) |
| --- | --- | --- |
| Bits per character | 7 | 16 |
| Max characters/SMS | 160 | 70 |
| Language coverage | Most Western European | Virtually all scripts |
| Special characters | Some require escape sequence | All supported directly |
| Fallback mechanism | Switches to UCS-2 if needed | N/A |

Practical Implications

  • Message Length: Using only GSM-7 characters ensures you can send up to 160 characters in a single SMS. Including any extended characters or switching to Unicode reduces this limit.
  • Cost Efficiency: More characters per message mean fewer messages sent, which is especially important for bulk SMS, marketing, and notifications.
  • Language Limitations: If your message includes unsupported characters (e.g., emojis, non-Latin scripts), it will automatically switch to Unicode, reducing the number of characters per SMS and potentially increasing costs.

GSM-7 Character Set Overview

  • Default Alphabet: Includes the basic Latin alphabet, numbers, and common punctuation.
  • Extension Table: Extra symbols (e.g., {}\) accessed via an escape character, each taking up two slots.
  • National Language Tables: Additional shift tables for specific languages, selectable via SMS headers.

Conclusion

GSM-7 remains the backbone of SMS messaging on GSM networks, balancing efficient data use with broad—though not universal—language support. Understanding its character set, limitations, and the impact of special characters is essential for anyone working with SMS, especially in global or multilingual contexts. For full compatibility and cost efficiency, always aim to keep messages within the GSM-7 character set whenever possible.


References:
Sources are available upon request and include technical documentation and industry glossaries from Twilio, Wikipedia, DevelopersHome, GMS Worldwide, Phone2, Dexatel, and others.

