TL (Type Language) used by Telegram

The TL documents can be dense, and it's easy to get lost in the details without a clear overall picture. Let's break down how to approach TL syntax and build a more intuitive understanding, focusing on putting everything together rather than getting bogged down in individual details.

A Simplified, Step-by-Step Approach to Understanding TL Syntax

  1. The Core Idea: Data Description and Serialization
  2. Dissecting a Combinator Declaration
  3. Understanding Types
  4. The Role of Optional Arguments and Flags
  5. Serialization and Deserialization
  6. The %, !, and $ Modifiers
  7. Putting it All Together - Example

vector {t:Type} # [t] = Vector t;

user {flags:#} id:int first_name:flags.0?string last_name:flags.1?string = User;

messages.sendMessage text:string = Messages.SentMessage;
            


*   `vector {t:Type} # [t] = Vector t;`: This declares a polymorphic type (`Vector`). When creating a `Vector<int>`, `t` is replaced with `int`. The `# [t]` syntax indicates that the vector is represented as an integer element count followed by that many values of type `t`.

*   `user {flags:#} id:int first_name:flags.0?string last_name:flags.1?string = User;`: Defines the `User` data structure, and conditional fields are controlled by a flag integer.

*  `messages.sendMessage text:string = Messages.SentMessage;`: This is an RPC call (a method). The server will respond with a type of `Messages.SentMessage`.
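
To make the `# [t]` layout a little more concrete, here is a minimal Python sketch of how a vector of ints might be laid out. The function names are hypothetical. The constant 0x1cb5c415 is the constructor number published for `vector` in Telegram schemas. Whether a value is written bare or boxed depends on the type of the field that contains it.

```python
import struct
from typing import Sequence

VECTOR_ID = 0x1CB5C415  # constructor number published for `vector`


def serialize_bare_int_vector(items: Sequence[int]) -> bytes:
    """`# [ int ]`: a 32-bit element count, then each int as 4 little-endian bytes."""
    count = struct.pack("<I", len(items))
    return count + b"".join(struct.pack("<i", item) for item in items)


def serialize_boxed_int_vector(items: Sequence[int]) -> bytes:
    """Boxed `Vector int`: the constructor number, then the bare body."""
    return struct.pack("<I", VECTOR_ID) + serialize_bare_int_vector(items)
```

For example, `serialize_bare_int_vector([1, 2])` produces three 32-bit words: the count 2, then 1, then 2.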
            


How to Approach TL Schemas

  1. Start with Types: Try to get a feel for the different type definitions in a schema (User, Chat, etc.). This will give you an idea of the domain being described.
  2. Focus on Common Structures: Look for generic types like Vector or Tuple. They are often reused.
  3. Trace Data Flow: Try to imagine a scenario. How would you create a User object, what information would you send across the network, and how would it be serialized?

How to Reduce Confusion

  • Don't Try to Learn Everything at Once: Focus on the core concepts first (combinator structure, types, flags), and gradually add complexity.
  • Work Through Examples: The document provides useful examples. Work through them, step by step, to understand the serialization process and structure of each data element.
  • Sketch Diagrams: Sometimes drawing out the structure of combinators can help with understanding how all the different parameters fit together and get serialized.
  • Relate it to Programming: TL is not a programming language, but programmers can think of combinators as analogous to functions in which some parameters are implicit or deduced from the return type, and in which every type must be fixed at declaration time.
  • Ask Specific Questions: When confused, try to formulate very specific questions. Asking targeted questions can help pinpoint the source of confusion.
  • Practice: The more you look at TL schemas, the more comfortable you will become with the syntax.

Key Takeaway:

The goal is not to memorize all the details but to develop an intuitive understanding of how TL works. TL is all about structuring and serializing data in an efficient way, and that has implications about how its syntax is organized.

I hope this revised approach helps to clarify things and make TL more approachable for you. Feel free to ask more questions. I'm here to assist you in making the learning process smoother.



This document provides a comprehensive overview of the TL (Type Language) used by Telegram for defining data structures and remote procedure calls (RPC). Here's a breakdown of the key concepts and their implications:

What is TL?

  • A language for describing data: TL is not a programming language in the traditional sense. It's a description language that specifies how data is structured, how to serialize and deserialize it into a binary format, and how RPC calls are defined.
  • Focus on serialization: The primary purpose of TL is to enable efficient and unambiguous communication between different components of the Telegram system (e.g., client, server).
  • Type system: TL has a type system that includes built-in types (like Int, Long, String, Double, Null) as well as the ability to define custom aggregate types (like Vector, CoupleInt, User, etc.).
  • Combinators: The building blocks of TL are called combinators. These represent either constructors, which build values of a type, or functions, which describe RPC calls.
  • Binary Data Serialization: TL leverages the combinator descriptions for binary data serialization. Every combinator corresponds to a serialized structure.
  • Little-Endian: The serialized data is represented as a sequence of 32-bit integers, each of which is serialized as 4 bytes in little-endian order.
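
As a small illustration of the 32-bit, little-endian layout, the following Python sketch encodes an `int` and a `string`. The string rule used here (a one-byte length, or the byte 254 followed by a three-byte length, with zero padding to a multiple of 4 bytes) comes from the separate TL binary serialization rules rather than from this summary, so treat it as an assumption. The function names are hypothetical.

```python
import struct


def serialize_int(value: int) -> bytes:
    """A TL `int` is one 32-bit word, written as 4 little-endian bytes."""
    return struct.pack("<i", value)


def serialize_string(data: bytes) -> bytes:
    """Length-prefixed bytes, zero-padded so the total is a whole number of words."""
    if len(data) <= 253:
        header = bytes([len(data)])
    else:
        header = b"\xfe" + len(data).to_bytes(3, "little")
    body = header + data
    padding = (-len(body)) % 4  # bytes needed to reach a multiple of 4
    return body + b"\x00" * padding
```

The padding is what keeps every serialized value aligned to 32-bit words, which is why the stream can be described as a sequence of integers.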

Key Concepts Explained:

  1. Types and Constructors:
  2. Combinator Declarations:
  3. Functions:
  4. Namespaces:
  5. Comments:
  6. Serialization Process:
  7. RPC Example:
  8. Templates:

Implications:

  • Efficiency: The use of binary serialization and well-defined data structures leads to efficient data transfer between client and server.
  • Unambiguity: TL ensures there's no ambiguity about how data is structured and interpreted on both sides of the communication.
  • Extensibility: The ability to define custom types and functions makes it easier to extend the protocol as new features are added.
  • Interoperability: Since all Telegram clients use the same TL schemas, data can be exchanged smoothly between them regardless of the platform.
  • Formal verification: The provided documentation mentions formal descriptions of TL combinators and types, enabling a better understanding, specification, and reasoning about the protocol.

In Summary:

TL is a vital part of Telegram's architecture. It is a special purpose language that enables the specification of types, constructors, functions, and how they are to be serialized to a binary stream. By defining a clear, efficient, and unambiguous way to describe data and function calls, TL ensures seamless communication between various components of the system. This approach is crucial to the performance, scalability, and security of the entire platform.

This document dives into the more advanced aspects of Telegram's TL (Type Language) related to dependent types, polymorphism, and special syntax for repetitions. It also touches upon optional combinator parameters. Here's a comprehensive breakdown:

1. Dependent Types in TL

  • Beyond Polymorphism: TL not only supports polymorphism (types parameterized by other types) but also limited dependent types, where types can depend on the values of other parameters, specifically natural numbers (#).
  • Natural Numbers (# or nat): This special type represents non-negative integers (0 to 2^31 - 1). It's used as an index or length parameter in dependent types.
  • Limited Scope: The dependence is restricted to values of type #, and this dependence is only supported in the form of parameters for types. TL does not allow arbitrary computations or logic dependent on values.
  • Serialization: Values of type # are serialized as 32-bit signed integers.

2. Examples of Dependent Types

  • Integer Tuples (Vectors):
  • Type nat: The # type behaves like a recursively defined data type:
  • Formal type definition:
  • Binary Trees: The document shows how dependent types can define structures like binary trees where the height is a dependent parameter.
  • Random Tree: Another example, in which leaf nodes are at a fixed distance from the root.

3. Polymorphic Dependent Types

  • Combining Polymorphism and Dependence: The document defines Tuple X n, which depends both on a type X and a natural number n.
  • Definition:
  • Formal Type Definition:

4. Dependent Sums

  • Tuple vs Vector:
  • Defining Vector using Tuple: The Vector type can be considered as a "sum" of Tuple across all possible lengths n.
  • Advantages of Tuple: Allows definition of fixed-size data structures like matrices.
  • Arbitrarily Sized Matrix:
  • Serialization Similarities (with caveats): The serialized representations of Tuple X n and Vector X are very similar for n > 0, involving a length (n) prefix followed by n serialized objects, as long as n is available and can be derived.

5. Special Syntax for Repetitions

  • [ array-field-name ":" ] [ nat-ident "*" ] "[" field-descr ... "]": This syntax provides a shorthand for creating repeated substructures, leveraging dependent types implicitly.
  • Usage: Inside a constructor definition, it represents a tuple of objects of field-descr (or a combination of fields), of a length given by nat-ident
  • Abstract Definition: The structure is converted to a use of %Tuple in combination with a helper type.
  • Examples:
  • n+const and const+n: These expressions provide more flexibility to the syntax, allowing for small constant offsets in the repeat count.
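
As a rough sketch of what a nested repetition means on the wire, consider a declaration of the form matrix {m n : #} a : m* [ n* [ double ] ] = Matrix m n;. Assuming the receiver already knows m and n from the type `Matrix m n`, the body is just m*n doubles in row-major order. The Python function below is a hypothetical helper written under that assumption.

```python
import struct
from typing import Sequence


def serialize_matrix_body(rows: Sequence[Sequence[float]]) -> bytes:
    """Write m rows of n doubles each; m and n themselves are not written."""
    n = len(rows[0]) if rows else 0
    if any(len(row) != n for row in rows):
        raise ValueError("every row must contain exactly n values")
    # Each TL double occupies 8 bytes (two 32-bit words), little-endian.
    return b"".join(struct.pack("<d", x) for row in rows for x in row)
```

A 2 by 3 matrix therefore occupies 48 bytes, with no length prefix at either level of the repetition.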

6. Serialization of Dependent and Polymorphic Types

  • Combinators with Type Values: Types themselves are treated as values during serialization (e.g., Tuple double 10 would serialize to a form of 'Tuple' '%double' 10).
  • Practicality: At the moment (when this documentation was written), there is little need to serialize types.

7. Optional Combinator Parameters

  • Requirements:
  • Motivation: To simplify common cases where the types are already known and can be deduced from context.
  • Example:
  • Best Practice: Move optional parameters to the start of the combinator definition and order them as they appear in the result type.
  • Full Versions of Combinators: Only needed when transmitting values as the universal object type.

Key Takeaways:

  • TL's Powerful Type System: TL goes beyond simple types and polymorphism by introducing limited dependent types, which allow types to depend on the value of natural number parameters. This feature makes it possible to represent complex data structures with inherent size dependencies (e.g., matrices, vectors).
  • Emphasis on Serialization Efficiency: All of the concepts in TL are ultimately designed to ensure efficient and unambiguous data serialization and deserialization.
  • Code Readability and Flexibility: The special repetition syntax is a crucial enhancement to the language, providing better readability and flexibility for dealing with repeated substructures.
  • Context-Aware Deserialization: Optional parameters allow the serialization to skip explicit information when the type is known from the context.
  • Focus on Practicality: While dependent types are a powerful feature, they are used carefully and pragmatically to address specific needs in data serialization and protocol definitions.

This document provides a deep dive into the advanced capabilities of TL, showcasing the careful design and considerations that go into creating a robust and efficient communication protocol for Telegram. Understanding these details is crucial for anyone developing applications or services based on the Telegram API.

This document provides a formal description of Telegram's TL (Type Language), focusing on its lexical structure, syntax, and predefined identifiers. It’s a more technical breakdown compared to previous documents, targeting those who need a deep understanding of how TL is parsed and interpreted.

1. Lexical Structure (Tokens)

  • Comments: Same as C/C++, removed during lexical analysis.
  • Whitespace: Used to separate tokens, otherwise ignored.
  • Character Classes:
  • Simple Identifiers and Keywords:
  • Tokens:
  • Final is a Reserved Keyword: The text explicitly states that Final is a reserved keyword. Other words, such as Type, are not keywords but identifiers with predefined meanings.

2. Syntax (Grammar)

  • General Program Structure:
  • Declarations:
  • Expressions:
  • Combinator Declarations:
  • Built-in Combinator Declaration:
  • Partial Applications (Patterns):
  • Type Finalization:

3. Predefined Identifiers

  • Built-in Types (declared via pseudo-declarations):
  • Boolean Emulation:
  • Generic Types:
  • Result (Maybe) types:
  • Pair and Map:
  • Empty, True, and Unit:
  • Type: The type of all types, primarily used for optional parameters in polymorphic types.
  • # (Alias nat): Non-negative integers (0 to 2^31-1), with constructors O : # (zero/null) and S : # -> # (successor).
  • Tuple: A type parameterized by a type and a number (Tuple X n is a sequence of n values of type X).
  • Bool: Used to transmit boolean values.
  • False: A constructor-less type that triggers an error during serialization/deserialization. It acts as a placeholder for undefined or invalid types, which might be changed in the future.
  • True: A type with a single null constructor (true), similar to void in C/C++. It has zero length serialization when used as a bare type and it can indicate the presence (or not) of a parameter in a conditional field.
  • Unit: Similar to True, with a single null constructor (unit).

Example with False, True and Unit:

The document includes a detailed example on how False, True and Unit can be used in type definitions, specifically in a user constructor. It demonstrates the following points:

  • Conditional Fields: Using the ? syntax, a field can be optional based on a bit in the flags parameter.
  • False for Reserved Fields: reserved3:flags.3?False or reserved4:flags.4?False indicates that these are reserved for future use and should cause an error if present during deserialization.
  • True for Presence Indication: bot:flags.3?true means if bit 3 is set, the user is a bot. The parameter is present, but no value is assigned during deserialization.
  • Unit: Similar to True, useful as a type with a single null constructor.
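
The conditional-field mechanism can be sketched in Python as follows. This assumes a hypothetical schema of the form user flags:# id:int first_name:flags.0?string last_name:flags.1?string = User;, in which `flags` is an ordinary field written as the first 32-bit word. The helper names and the string encoding rule are assumptions for illustration, not part of the summarized document.

```python
import struct
from typing import Optional


def tl_string(data: bytes) -> bytes:
    """Length-prefixed bytes padded to a multiple of 4 (assumed TL string rule)."""
    header = bytes([len(data)]) if len(data) <= 253 else b"\xfe" + len(data).to_bytes(3, "little")
    body = header + data
    return body + b"\x00" * (-len(body) % 4)


def serialize_user_body(user_id: int,
                        first_name: Optional[str] = None,
                        last_name: Optional[str] = None) -> bytes:
    """Set one flag bit per present field and write only the present fields."""
    flags = 0
    tail = b""
    if first_name is not None:
        flags |= 1 << 0  # flags.0 guards first_name
        tail += tl_string(first_name.encode("utf-8"))
    if last_name is not None:
        flags |= 1 << 1  # flags.1 guards last_name
        tail += tl_string(last_name.encode("utf-8"))
    return struct.pack("<I", flags) + struct.pack("<i", user_id) + tail
```

A reader of this format would do the reverse: read `flags` first, then read each conditional field only if its bit is set. A field of type `true` guarded by a bit would add nothing to `tail`, since its presence is carried by the bit alone.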

ANTLR Definition:

The document mentions an ANTLR definition of the TL grammar which can be used for parsing and processing TL.

Key Takeaways:

  • Formal Specification: This document provides a detailed and rigorous specification of the TL language, making it a valuable resource for anyone needing to understand the underlying mechanics.
  • Lexical and Syntactic Rules: It defines the exact set of characters and tokens and the grammar rules that dictate the legal structure of TL programs.
  • Predefined Types and Identifiers: It clearly outlines the built-in types and their roles in the TL ecosystem.
  • Understanding the Purpose of Special Types: It explains the rationale behind types like False, True and Unit, especially their use in optional and conditional fields.
  • Foundation for Implementation: This formal description serves as a solid foundation for building parsers, interpreters, and other tools related to the TL language.

This rigorous specification is key to the interoperability and reliability of the Telegram system. Developers and researchers can utilize this information to build custom tools, analyze protocols, or extend the platform.


This document provides a formal description of how TL combinators are declared, building upon the syntax defined in the “Formal description of TL” document. It focuses specifically on explaining the meaning of various components within a combinator declaration.

1. Combinator Declaration Syntax

The core syntax for declaring combinators is:

combinator-decl ::= full-combinator-id { opt-args } { args } = result-type ;
full-combinator-id ::= lc-ident-full | _
combinator-id ::= lc-ident-ns | _
opt-args ::= { var-ident { var-ident } : [excl-mark] type-expr }
args ::= var-ident-opt : [ conditional-arg-def ] [ ! ] type-term
args ::= [ var-ident-opt : ] [ multiplicity *] [ { args } ]
args ::= ( var-ident-opt { var-ident-opt } : [!] type-term )
args ::= [ ! ] type-term
multiplicity ::= nat-term
var-ident-opt ::= var-ident | _
conditional-arg-def ::= var-ident [ . nat-const ] ?
result-type ::= boxed-type-ident { subexpr }
result-type ::= boxed-type-ident < subexpr { , subexpr } >
            


Let's break down each part:

  • combinator-decl: The complete declaration of a combinator, terminated by a semicolon.
  • full-combinator-id: The full identifier of the combinator:
  • combinator-id: The basic identifier of the combinator (without combinator number).
  • opt-args: Optional arguments (zero or more). They appear at the beginning of the argument list.
  • args: Required arguments (zero or more). These can appear in different forms:
  • result-type: The type of the value returned by the combinator.
  • ;: A semicolon that terminates the declaration.

2. Key Concepts in Detail

  • Combinator Identifier (Name): A combinator is uniquely identified by its name, a 32-bit number. This number is either automatically calculated (using CRC32) or explicitly provided using the # notation followed by 8 hexadecimal digits.
  • Optional vs. Required Fields:
  • Field (Variable/Argument) Identifiers: These can be identifiers starting with either a lowercase (lc-ident) or uppercase (uc-ident) Latin letter, as long as it doesn't reference any namespace. They are names for the parameters being passed to the combinator.
  • Result Type: The result type may be composite, appear for the first time, or depend on any of the combinator's fields with types Type or #.
  • Optional Field Declarations:
  • Required Field Declarations:
  • Repetitions:
  • Conditional fields:
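
As an illustration of how a combinator number might be obtained, the Python sketch below returns the explicit `#xxxxxxxx` value when one is present and otherwise takes the CRC32 of the declaration text. The normalization here (dropping the trailing semicolon and collapsing whitespace) is a simplification chosen for this example. The real rules for preparing the string before hashing are more detailed, so the function name and behavior should be read as assumptions.

```python
import re
import zlib


def combinator_number(declaration: str) -> int:
    """Return the explicit number if given, else CRC32 of the normalized text."""
    # Drop the terminating semicolon and collapse runs of whitespace.
    text = " ".join(declaration.strip().rstrip(";").split())
    # An explicit number looks like `name#1cb5c415` at the start of the declaration.
    match = re.match(r"^[\w.]+#([0-9a-fA-F]{1,8})(?=\s|$)", text)
    if match:
        return int(match.group(1), 16)
    return zlib.crc32(text.encode("utf-8")) & 0xFFFFFFFF
```

Because the number is derived from the declaration text, changing a field name or type changes the number, which is how both sides detect that they are using different schema versions.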

3. Examples

  • matrix {m n : #} a : m* [ n* [ double ] ] = Matrix m n;
  • tnil {X : Type} = Tuple X 0;
  • tcons {X : Type} {n : #} hd:X tl:%(Tuple X n) = Tuple X (S n);
  • vector {X : Type} (n : #) (v : %(Tuple X n)) = Vector X;
  • vector {t : Type} # [ t ] = Vector t;
  • The declaration above is equivalent to vector {t : Type} n:# v:(%Tuple t n) = Vector t; in both forms the result type is Vector t.
  • user {fields:#} id:int first_name:(fields.0?string) last_name:(fields.1?string) friends:(fields.2?%(Vector int)) = User fields;

Key Takeaways:

  • Formal Rules for TL Combinators: This document provides precise rules for declaring TL combinators.
  • Separation of Concerns: It clearly distinguishes between optional and required arguments, repetition syntax, and the result type.
  • Relationship with Repetitions: The document explains how repetitions are internally represented with Tuple types and auxiliary types.
  • Foundation for TL Processing: This knowledge is essential for implementing TL parsers, code generators, and other tools.
  • Implications of Optional Fields: This is a mechanism for using values that can be derived by the context.

This deep understanding of how combinators are formally declared is vital for building tools that can correctly process, generate, and manipulate TL schemas.


This document explains how types themselves (i.e., values of type Type in TL) are serialized, building upon the concept of type constructors and their corresponding names. Here's a breakdown of the key points:

1. Types as Values

  • In TL, types are first-class values, meaning they can be passed as arguments, returned from functions, and stored in data structures.
  • This is especially important for polymorphism where you can define types that depend on other types.
  • Since types are values, they need to be serializable, just like any other data.

2. Type Constructors

  • Arity: Type constructors can have different arities (number of parameters):
  • Type Names: Each type constructor must have a unique 32-bit name. These names are used in the serialization process.

3. Serialization of Types

  • Similar to Other Recursive Types: Types are serialized much like any other recursive data structure with a defined set of constructors of differing arity.
  • The core principle: Once each type constructor has an assigned 32 bit name, the types can be serialized by serializing the type constructors and the arguments associated with it.

4. How 32-bit Names are Assigned

  • Type Name: It's not the name of the type itself, but a numerical identifier associated with the type constructor. This is assigned via:
  • Example of List:
  • Example of IntList:
  • Bare Types:
  • Built-in Types:

5. Updated Description Needed

  • The document notes that the provided description is "somewhat outdated" and may need to be updated.
  • Missing Treatment of ! Modifier: Specifically, the document does not explain how the ! modifier is handled during type serialization.

Key Takeaways

  • Types as Serialized Values: Types are treated as regular values that need serialization, enabling polymorphic type definitions in TL.
  • Unique Type Constructor Names: Each type constructor is assigned a unique 32-bit "name" to allow their serialization and deserialization.
  • CRC32 and Summation: The "name" is calculated by combining the CRC32 of the type's string definition with the sum of the names of the associated constructors.
  • Bare Types and Logical Negation: Bare types have their names calculated by logical negation of one of the corresponding boxed type's constructor name.
  • Built-in Types and Pseudo-Declarations: Pseudo declarations are used to link bare types with boxed types
  • Need for Further Clarification: The treatment of the ! modifier in type serialization is not explicitly discussed in this document.

Implications:

  • Polymorphism Enabled: This serialization method allows the proper functioning of polymorphic types in TL.
  • Unambiguous Type Transfer: It ensures that types can be passed between different parts of the Telegram system (client, server, etc.) without any ambiguity.
  • Foundation for Type System: It provides a crucial piece of the puzzle for how TL's type system is implemented.

This document is essential for fully understanding how the TL type system operates at a low level, particularly when dealing with polymorphic types. The details of type serialization are important for creating robust applications that rely on the Telegram protocol.

This document describes how a TL schema itself is serialized using TL, effectively creating a self-describing format. This allows for the efficient storage and transmission of TL schemas and facilitates the development of tools that operate on them.

1. Motivation

  • Binary Representation of TL Schemas: The primary goal is to define a way to serialize a TL schema (e.g., a .tl file) into a binary format (e.g., a .tlo file).
  • Simplified Tooling: This enables developers to write a parser once to convert text TL schema files to binary files. All other tools, such as auto-generators for (de)serializers, only need to know how to read the binary format.
  • Self-Description: The TL schema for serializing TL schemas is defined using TL itself, making the format self-describing and less prone to versioning issues.

2. Core Components

  • common.tl Fragment: The document starts with a fragment from a common.tl file. This fragment provides basic built-in types and structures required for defining the TL schema serialization:
  • tl.tl (Schema for TL Schemas): The tl.tl file defines the TL schema used to serialize TL schemas. It consists of the following combinators, structured into a hierarchy:

3. Remarks

  • Magic Number (0x3a2f9be2): The serialized schema (version 2) always starts with the combinator number of tls.schema_v2, which is 0x3a2f9be2. This serves as a "magic number" for .tlo files of this format, allowing for quick identification of the file type.
  • Extensibility: Future versions can use a new constructor (e.g., tls.schema_v3) with a different number, allowing for the evolution of the schema format.
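
A tool could use the magic number to recognize a version 2 `.tlo` file before attempting to parse it. The Python sketch below shows one way to do that. The function name is hypothetical, and the check only inspects the first 32-bit word.

```python
import struct

TLS_SCHEMA_V2 = 0x3A2F9BE2  # combinator number of tls.schema_v2


def looks_like_tlo_v2(data: bytes) -> bool:
    """Check whether the first little-endian 32-bit word is the tls.schema_v2 number."""
    if len(data) < 4:
        return False
    (magic,) = struct.unpack_from("<I", data, 0)
    return magic == TLS_SCHEMA_V2
```

On disk the four magic bytes appear in the order e2 9b 2f 3a because of the little-endian encoding.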

4. Example

  • tl.tlo Data: The document includes a long hexadecimal dump (tl.tlo), which represents the serialized binary data of the tl.tl schema itself.
  • Content: This tl.tlo file contains the serialized representation of all combinators present in the tl.tl schema, including the necessary metadata such as names, argument types, return types, and flags.

5. Serialization Process

  • tls.schema_v2 as Root: Serialization starts with the tls.schema_v2 combinator, which contains all the other schema information.
  • Recursive Serialization: The binary representation is created by traversing the tls.Schema, recursively serializing all of its components, such as the types, constructors, functions, and their associated parameters.
  • Integer Sequences: The resulting binary data consists of a sequence of 32-bit integers, as dictated by TL serialization rules.
  • Little-Endian Ordering: The 32-bit integers are serialized in little-endian order, as is usual in TL.

Key Takeaways

  • Self-Describing Format: The tl.tl schema defines how a TL schema is serialized, creating a self-describing format. This significantly simplifies the tooling.
  • Efficient Representation: The binary format is designed for efficiency, making it faster to parse than the textual format.
  • Versioned Schema: The use of constructor numbers and the possibility of a tls.schema_v3 ensures extensibility in the future.
  • Foundation for Tools: This approach allows for the creation of tools that can reliably parse and process TL schemas with just one parser for .tlo files.
  • Example of a self-defined schema: This is a good example of how to create a TL schema for serializing a schema.

Implications

  • Simplified Tool Development: This approach significantly reduces the complexity of tools that work with TL schemas (e.g., code generators).
  • Improved Performance: Binary serialized schema files allow for faster parsing compared to text representations.
  • Flexibility: The scheme can be easily updated while maintaining backwards compatibility.

In summary, this document demonstrates the power and flexibility of TL by using the language to describe its own serialization format, highlighting the self-describing and self-referential nature of the protocol. This approach allows for the creation of efficient and versatile tools while remaining adaptable for future modifications to the language.

This document delves into the intricacies of optional combinator parameters in TL, explaining how they are handled during serialization and deserialization, especially in the context of polymorphism and functional combinators. Here's a detailed breakdown:

1. Optional Parameters

  • First Few Parameters: TL allows the first few parameters of a combinator to be declared as optional.
  • Implicit Values: These parameters are often not explicitly stated in the serialized form because their values can be inferred.
  • Polymorphism and Optional Parameters: Optional parameters are closely tied to polymorphism, where types can depend on other type parameters.
  • Benefit: It reduces redundancy in the serialized representation and simplifies the most common use cases.

2. Serialization/Deserialization

A (sub)expression is serialized/deserialized in two main ways:

  • Known Result Type: When the result type is known beforehand (e.g., when receiving a response to an RPC call), the result type is used to infer the values of the optional parameters.
  • Unknown Result Type: When the result type is unknown (e.g., when sending an RPC call), all optional parameters must be explicitly specified and serialized using the full version of the combinator.

3. Functional Combinators and ! Modifier

  • Distinction from Constructors: A functional combinator differs from a constructor by implicitly having the ! modifier before its result type.
  • eval Function: The (remote or local) computation of a functional expression can be thought of as the execution of a polymorphic eval : !X -> X function, that converts a type annotated with ! into a type without it.

4. Rules for Optional Parameters

Consider a constructor:

      C {a1:T1} ... {am:Tm} b1:U1 ... bn:Un = T;
            


Where a1 to am are optional parameters and b1 to bn are the regular parameters, and T is the result type. The following rules must hold:

  1. Type Dependencies: Each type (T1 to Tm, U1 to Un, and T) can depend on type parameters of type Type or # that have been declared to the left of the type's usage.
  2. No ! on Optional Parameter Types: The types of optional parameters (T1 to Tm) must not be modified by !.
  3. Types of Optional Parameters: Optional parameters must have type Type or #.
  4. Usage Requirement: Each optional parameter (a1 to am) must be used at least once, either in one of the Ui types that have a ! modifier, or in the result type T if it does not have an explicit or implicit !.
  5. First Usage with ! Modifier: If an optional parameter ai is not used in the result type or if the result type has a !, its first (leftmost) usage must be within a type Uj that is modified by !.

5. Rationale Behind the Rules

  • Inference: The key idea is that the values of optional parameters should be inferable either from the known result type (when a value of a regular type is serialized), or from a parameter of type modified by ! (when a value of a type modified by ! is serialized).
  • Information Flow:

6. The ! Modifier and Information Flow

  • Reverse Direction: The ! modifier indicates the direction of type information flow.
  • Source of Information: Without !, the result type is usually known, so all optional type parameters should be derivable from it.
  • Recipient of Information: With !, a result type is typically unknown, so it relies on the parameters modified by ! to provide the values of the type parameters.

7. Examples (Hypothetical)

  • Valid Scenario (without ! on the result):
  • Valid Scenario (with ! on an argument):

      maybe {X:Type}  result:!X = Maybe<X>;
            


  • Here X is an optional parameter, it will be inferred from the type of !X
  • Invalid Scenario (without ! modifier):

Key Takeaways:

  • Optional Parameter Inference: TL's optional parameters are not just a syntactic convenience; they are designed to optimize serialization by inferring type parameters based on the context.
  • Importance of ! Modifier: The ! modifier is used to denote the inversion of the direction of information flow, which is fundamental for working with polymorphic functional types.
  • Rules for Consistency: The rules for optional parameters ensure that all parameters can be correctly inferred or explicitly provided during (de)serialization, avoiding ambiguity.
  • Type Information Flow: This is done by making the result type a source or recipient of information.

Implications:

  • Efficient Serialization: The ability to omit optional parameters when the result type is known results in smaller serialized messages.
  • Simplified APIs: TL functions can be used without having to supply all of the implicit parameters.
  • Robustness: These rules ensure that serialization and deserialization work correctly even with complex polymorphic types, and ensure that a proper type can be derived in all cases.

Understanding these rules for optional parameters is critical for building tools that correctly handle TL serialization and work with functional combinators.


This document explores the intricacies of binary serialization within the context of abstract types in Telegram's TL language, as well as introducing the concepts of constant, surface, and functional values, and the TL modifiers % , ! and $. Here's a breakdown of the key concepts:

1. Abstract vs. Concrete Types

  • Abstract Types: TL defines abstract data types conceptually, without specifying how their values are represented in memory or serialized. These types are defined in the spirit of theories of dependent intuitionistic types.
  • Concrete Types: The process of serialization defines concrete types [T]. They are subsets of A* (words built from 32-bit integers). [T] is a set of the possible serializations of a value of type T.
  • Serialization Process: The serialization process defines a mapping from abstract values of type T to elements in the concrete set [T].
  • Deserialization: The reverse of serialization.
  • Values as S-expressions: Abstract values are conceptually represented as S-expressions for easier manipulation and understanding. S-expressions are recursively defined as either an atom (primitive value or identifier) or a space delimited list of S-expressions.
  • S-expression structure: An S-expression representing a value of type T is built by composing the combinator identifier that returns a value of type T, with as many S-expressions as arguments the combinator requires.

2. Example S-Expressions

Given the TL schema:

      pair x:int y:int = Pair;
      pnil = PairList;
      pcons hd:Pair tl:PairList = PairList;


  • (pnil) is a value of type PairList.
  • (pcons (pair 2 3) (pcons (pair 9 4) (pnil))) is another value of type PairList.
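
To make the S-expression view concrete, here is a minimal Python sketch that builds the two values above and prints them. The `SExpr` class and the `render` helper are hypothetical names introduced only for this illustration; they are not part of TL or of any Telegram library.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class SExpr:
    """A combinator identifier applied to zero or more argument values."""
    combinator: str
    args: Tuple["Value", ...] = ()


# An abstract value is either an atom (int or str) or a nested S-expression.
Value = Union[int, str, SExpr]


def render(value):
    """Render a value in the space-delimited S-expression notation."""
    if isinstance(value, SExpr):
        parts = [value.combinator] + [render(arg) for arg in value.args]
        return "(" + " ".join(parts) + ")"
    if isinstance(value, str):
        return '"' + value + '"'
    return str(value)


# One helper per combinator of the example schema.
def pair(x, y):
    return SExpr("pair", (x, y))


def pnil():
    return SExpr("pnil")


def pcons(hd, tl):
    return SExpr("pcons", (hd, tl))


example = pcons(pair(2, 3), pcons(pair(9, 4), pnil()))
print(render(pnil()))
print(render(example))
```

Running it prints `(pnil)` followed by `(pcons (pair 2 3) (pcons (pair 9 4) (pnil)))`, matching the two values listed above.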

3. Serialization Mapping

  • Primitive Types (int, string): Serialization is straightforward, defined in another document ("Binary serialization").
  • S-expressions (C E1 ... Er):
  • Formal Definition of Concrete Types:
  • Clothed Types (Int, String): These built-in types are serialized as if they were defined as: int x:int = Int; and string s:string = String; so (int 5) or (string "Test").
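
As a rough illustration of the mapping from S-expressions to 32-bit words, the sketch below assumes that an expression (C E1 ... Er) is written as a 32-bit number identifying C followed by the serializations of its arguments, and that a bare int is a 32-bit little-endian word. The combinator numbers in `COMBINATOR_IDS` are made-up placeholders, not the real values from any schema.

```python
import struct

# Hypothetical 32-bit combinator numbers, chosen only for illustration.
COMBINATOR_IDS = {
    "pair": 0x11111111,
    "pnil": 0x22222222,
    "pcons": 0x33333333,
}


def serialize(value):
    """Serialize an int atom or a (name, arg1, ..., argN) tuple to bytes."""
    if isinstance(value, int):
        # Assumption: a bare int is one 32-bit little-endian word.
        return struct.pack("<i", value)
    name, *args = value
    out = struct.pack("<I", COMBINATOR_IDS[name])
    for arg in args:
        out += serialize(arg)
    return out


value = ("pcons", ("pair", 2, 3), ("pnil",))
data = serialize(value)
print(len(data), data.hex())
```

Under these assumptions the result is always a whole number of 32-bit words, which is what lets the concrete type [T] be described as a set of words over 32-bit integers.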

4. Constant, Surface, and Functional Values

  • Constant Expressions (c(T)):
  • Surface Expressions (f(T)):
  • Functional Expressions:

5. TL Modifiers: %, !, and $

  • % (Naked Type Modifier):
  • ! (Surface Value Modifier):
  • $ (Functional Value Modifier):

6. Implicit c() Modifier

  • Default Behavior: By default, TL implicitly adds the c() modifier to all combinator parameter types and results, unless modified by ! or $.
  • Cancellation: The ! modifier cancels the implicit c(), while $ reverses it.
  • Reason: TL assumes functions operate on and return constant values by default (unless otherwise explicitly specified with a modifier).

7. ! Modifier (Revisited)

  • Twin Type: The ! modifier creates a new type (a twin) for every type, which allows surface values instead of only constants.
  • No Constructors: The twin type has no inherent constructors.
  • Functional Combinators: Differ from constructors in that ! is implicitly added to their result type.
  • eval Function: The local or remote calculation of an expression is represented by a polymorphic eval : !X -> X function.

8. Optional Parameters

  • Optional combinator parameters are described in another document.

Key Takeaways

  • Abstract vs. Concrete Representation: TL clearly distinguishes between abstract data types and their serialized concrete representations.
  • S-expressions for Conceptualization: S-expressions offer a convenient way to conceptually represent abstract values, which are mapped to binary sequences during serialization.
  • %, !, and $ Modifiers: These modifiers provide fine-grained control over the types of expressions, enabling the use of constants, surface values, and full functional values where needed.
  • Information Flow: The ! modifier explicitly inverts the direction of information flow, allowing for a more flexible type system.
  • Implicit c() modifier: Most types and parameters are by default treated as constant, which is reversed by ! or $.
  • Complex System: Together, these concepts form a complex but powerful and well-defined system for describing data types and the computations that operate on them.

Implications

  • Flexibility in Data Handling: TL allows the use of different kinds of values (constants, surface, functional) as needed.
  • Efficient RPC Calls: Surface expressions are particularly suitable for representing RPC queries, which often involve executing a function with constant arguments.
  • Support for Complex Computation: The $ modifier enables deep computations through RPCs.
  • Foundation of TL: These concepts provide the foundation for the TL language, enabling the serialization and deserialization of arbitrary TL types.

This document provides a comprehensive view of how binary serialization and abstract types are handled in TL, and how TL's type modifiers work, revealing the careful thought and design that enables Telegram's efficient and flexible communication protocol.


This document explains the concept of TL patterns (also known as partial applications) and why they are not fully utilized in the current implementation of Telegram's TL. It also addresses the trade-offs and potential issues associated with the current approach.

1. TL Patterns (Partial Applications)

  • Purpose: TL patterns are declarations that create specialized versions of types or combinators by fixing some of their parameters. They are meant to simplify code and make it more efficient in some cases.
  • Syntax: TL patterns are defined using the following syntax:
  • Original Design: Templates (another name for partial applications or patterns) were designed to:

2. Current Approach

  • Universal Constructors: Instead of using templates extensively, TL currently favors universal constructors.
  • Example: Instead of having specialized constructors for each Vector type, a single universal constructor vector {t:Type} # [t] = Vector t is used.
  • Inference of Optional Parameters: The values of optional parameters are inferred from the result type.

3. Advantages of the Current Approach

  • No Predefined Templates: It avoids the need to declare templates for every possible type usage beforehand, such as Vector SomeType for all possible SomeType.
  • Reduced Code Duplication: It eliminates the requirement to generate specific constructors for common structures.
  • Flexibility: A single constructor can handle multiple situations.

4. Drawbacks of the Current Approach

  • Ambiguity When Serializing as Object: If a value like Vector int is to be serialized as type Object (a generic type), a problem arises.
  • Lack of Information: The type information is not readily available when the result type is unknown (e.g. when using a generic type like Object).
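
The ambiguity can be shown with a few lines of Python. This sketch serializes only the `# [t]` body of a vector (an element count followed by the elements) and leaves out any constructor number; the function names are hypothetical. The same bytes decode cleanly under two different choices of t, so a reader that only knows "this is an Object" cannot tell which was meant.

```python
import struct


def serialize_vector_long(items):
    """Element count, then each element as a 64-bit little-endian integer."""
    body = b"".join(struct.pack("<q", item) for item in items)
    return struct.pack("<I", len(items)) + body


def parse_vector_long(data):
    """Read the bytes assuming t = long."""
    (count,) = struct.unpack_from("<I", data, 0)
    return [struct.unpack_from("<q", data, 4 + 8 * i)[0] for i in range(count)]


def parse_vector_int_pair(data):
    """Read the same bytes assuming t is a bare pair of two ints."""
    (count,) = struct.unpack_from("<I", data, 0)
    return [struct.unpack_from("<ii", data, 4 + 8 * i) for i in range(count)]


data = serialize_vector_long([(3 << 32) | 2])
print(parse_vector_long(data))      # one 64-bit value
print(parse_vector_int_pair(data))  # one pair of 32-bit values
```

Both parsers consume exactly the same 12 bytes without error, which is why the proposed solutions below have to carry the type alongside the value.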

5. Potential Solutions

The document proposes two solutions for the Object type serialization ambiguity, both requiring type serialization:

  • Full Form of the Constructor (@vector): Use the full version of the constructor corresponding to vector (denoted by @vector), which includes all of the optional parameters as required parameters.
  • TypedObject Constructor: Define a special constructor:

6. Type Serialization Requirement

Both potential solutions require type serialization, which is covered in a separate article, "Type serialization".

Key Takeaways:

  • TL Patterns for Specialization (Unused): TL patterns (partial applications) were originally intended to specialize polymorphic types and constructors for specific use cases, but are not extensively used.
  • Universal Constructors for Flexibility: The current TL implementation favors universal constructors and relies on inference to determine the values of optional parameters.
  • Trade-offs: This approach is simpler and more flexible, but it can cause issues when serializing values as a generic type such as Object.
  • Need for Type Serialization: The proposed solutions to solve the Object serialization problem both require type serialization.
  • Lack of a full solution: TL patterns are not currently used, and the proposed solutions are just that: proposed, and not implemented.

Implications

  • Simplified TL Schemas: The use of universal constructors keeps TL schemas shorter and easier to maintain.
  • Potential for Future Enhancement: Type serialization could lead to the implementation of a proper Object type as well as the full potential of TL patterns.
  • Performance Implications: The use of type serialization might have an impact on the size of the resulting serialized messages, which has to be taken into account.

In summary, this document provides valuable insights into the design decisions behind TL's handling of polymorphism and demonstrates why TL patterns are not used at the moment. The discussion of potential solutions highlights the ongoing evolution of the TL language and its capabilities.


This is a comprehensive TL (Type Language) schema, likely used by Telegram, defining a wide range of data structures (constructors) and remote procedure call (RPC) methods. Let's break it down into key areas:

General Structure

  • Constructors: Define how data is structured and serialized. They are primarily used to create values of specific types. Each constructor has:
  • Methods: Define RPC calls. They represent actions that can be performed on the server and return a result type.
  • Type system:
  • There is a rich type system that includes int, long, string, Bool, Vector<>, bytes, double.
  • Custom Types are created with constructors such as InputPeer, User, Chat, Document.
  • Type parameters are denoted with lowercase letters like t in vector {t:Type} # [t] = Vector t;

Key Functionality Areas

  1. Core Types (Bool, Null, Error, Vector)
  2. Input and Peer Types
  3. Files and Media
  4. Users and Profiles
  5. Chats and Groups
  6. Messages
  7. Authentication
  8. Notifications
  9. Settings and Configuration
  10. Privacy
  11. Updates
  12. Channels and Topics
  13. Payments and Invoices
  14. Secure Values
  15. Other features:

Key Design Aspects

  • Flags: The use of bitwise flags (# type) to indicate the presence of optional fields is a common pattern to conserve bandwidth and optimize performance. This is a direct usage of TL dependent types to avoid sending values when they are not required.
  • Polymorphism: The use of generics in types like vector {t:Type} # [t] = Vector t; allows reuse of these constructors for many different concrete types.
  • Namespaces: Constructors and methods are categorized into namespaces (e.g., auth, messages, channels, contacts, account, help, phone, stats). This aids in code organization and reduces naming conflicts.
  • Idempotency: The TL modifiers, in particular ! for functional combinators and % for bare types, are idempotent, so !!X = !X and %%X = %X.
  • Self-Documentation: The TL schema itself serves as a form of documentation for the API.
  • Error Handling: The error type is used for returning information about errors.
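
The flags pattern can be sketched in Python as follows. The sketch assumes a simplified, hypothetical declaration `user flags:# id:int first_name:flags.0?string last_name:flags.1?string = User;` in which the flags word is written explicitly, leaves out the constructor number, and encodes strings with a one-byte length prefix and zero padding to a 4-byte boundary (with a longer prefix for strings of 254 bytes or more). Function names are illustrative only.

```python
import struct


def tl_string(text):
    """Length-prefixed UTF-8 bytes, zero-padded to a multiple of 4 bytes."""
    raw = text.encode("utf-8")
    if len(raw) <= 253:
        out = bytes([len(raw)]) + raw
    else:
        out = b"\xfe" + len(raw).to_bytes(3, "little") + raw
    return out + b"\x00" * (-len(out) % 4)


def serialize_user(user_id, first_name=None, last_name=None):
    """Write flags, id, then only the optional fields whose bit is set."""
    flags = 0
    if first_name is not None:
        flags |= 1 << 0
    if last_name is not None:
        flags |= 1 << 1
    out = struct.pack("<I", flags) + struct.pack("<i", user_id)
    if first_name is not None:
        out += tl_string(first_name)
    if last_name is not None:
        out += tl_string(last_name)
    return out


print(serialize_user(7).hex())
print(serialize_user(7, first_name="Ann").hex())
```

A user with no names costs 8 bytes here, and adding a three-letter first name adds only 4 more, which is the bandwidth saving the flags mechanism is meant to provide.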

Implications

  • Comprehensive API: This schema defines a very extensive API, which is required for the vast capabilities of Telegram.
  • Interoperability: The usage of TL ensures that different Telegram clients are able to communicate in a consistent and reliable way.
  • Code Generation: This schema can be used to generate code for client-side and server-side implementations automatically.
  • Extensibility: New features and improvements can be easily added and versioned using this system.

Overall

This TL schema is a critical component of the Telegram platform. It's a formal, well-structured definition of data and operations that enables efficient and reliable communication between various parts of the system. Its careful design reflects the complex functionality that Telegram provides, and is a good example of the power and flexibility of the TL language.


This document presents a series of TL (Type Language) schemas used for end-to-end encrypted (secret) chats in Telegram. It focuses on the data structures used in these chats, not the underlying encryption protocols (which are covered in other documents). Here's a breakdown of the key aspects:

Overall Structure

  • Layered Schemas: The document shows multiple schema versions (layers), indicated by Layer 185, Layer 17, etc. Each layer potentially adds new features or modifies existing data structures.
  • Secret Chat Focus: The schemas are specifically for end-to-end encrypted ("secret") chats in Telegram, not for normal chats.
  • TL Notation: Uses familiar TL notation with constructor names, parameters, and their types.
  • Key Types:

Schema Breakdown by Layer

Each layer introduces changes or enhancements to the previous layer.

Layer 185

  • Core: decryptedMessage, decryptedMessageService (for actions within a message), and DecryptedMessageMedia for media.
  • Basic Media: Includes decryptedMessageMediaPhoto, decryptedMessageMediaVideo, decryptedMessageMediaGeoPoint, decryptedMessageMediaContact.
  • Actions: includes decryptedMessageActionSetMessageTTL, decryptedMessageActionReadMessages, decryptedMessageActionDeleteMessages, decryptedMessageActionScreenshotMessages, and decryptedMessageActionFlushHistory.
  • Document Attributes and Files: Includes the basic documentAttributeImageSize, documentAttributeAnimated, documentAttributeSticker, documentAttributeVideo, documentAttributeAudio, and documentAttributeFilename, along with fileLocationUnavailable, fileLocation, photoSizeEmpty, photoSize, and photoCachedSize.

Layer 17

  • Message Structure: Adds ttl (time-to-live) in decryptedMessage.
  • Media Types: Introduces mime_type in decryptedMessageMediaVideo and decryptedMessageMediaAudio.
  • New Type: Adds decryptedMessageLayer to wrap a DecryptedMessage.
  • Message Actions: Includes several sendMessage...Action types for typing and upload states.
  • Actions: Adds decryptedMessageActionResend, decryptedMessageActionNotifyLayer, decryptedMessageActionTyping.

Layer 20

  • Key Exchange Actions: Introduces several new message actions related to key exchange:

Layer 23

  • Document Attributes: Defines specific document attributes using documentAttribute... constructors. This includes documentAttributeImageSize, documentAttributeAnimated, documentAttributeSticker, documentAttributeVideo, documentAttributeAudio, and documentAttributeFilename.

Layer 45

  • New Flags: Adds flags for decryptedMessage.
  • Captions: Introduces caption for photos and videos, and a new structure for documents decryptedMessageMediaDocument.
  • StickerSets: Adds inputStickerSetShortName and inputStickerSetEmpty for managing stickers.
  • Audio Attributes: Adds title and performer to documentAttributeAudio.
  • Venue Support: Adds decryptedMessageMediaVenue for sharing locations and points of interest.
  • Web Page Support: Adds decryptedMessageMediaWebPage.
  • MessageEntities: Includes a variety of MessageEntity types for formatting text such as messageEntityUnknown, messageEntityMention, messageEntityHashtag, messageEntityBotCommand, messageEntityUrl, messageEntityEmail, messageEntityBold, messageEntityItalic, messageEntityCode, messageEntityPre, and messageEntityTextUrl.

Layer 46

  • Adds optional fields to the audio document attribute: documentAttributeAudio#9852f9c6 flags:# voice:flags.10?true duration:int title:flags.0?string performer:flags.1?string waveform:flags.2?bytes
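
As a small worked example of how the bit positions in this declaration are used, the Python sketch below computes a flags value from the set of optional fields that are present and recovers that set again when reading. The table `AUDIO_FLAG_BITS` is copied from the bit numbers in the declaration above; the function names are hypothetical, and the note that a `true` field is carried entirely by its flag bit is an assumption of this sketch.

```python
# Bit positions taken from the documentAttributeAudio declaration above.
AUDIO_FLAG_BITS = {
    "title": 0,
    "performer": 1,
    "waveform": 2,
    "voice": 10,  # assumed to occupy no bytes of its own, only this bit
}


def compute_flags(present):
    """Set one bit for every optional field that is present."""
    flags = 0
    for name in present:
        flags |= 1 << AUDIO_FLAG_BITS[name]
    return flags


def fields_present(flags):
    """Recover the set of optional fields a reader should expect."""
    return {name for name, bit in AUDIO_FLAG_BITS.items() if flags & (1 << bit)}


flags = compute_flags({"voice", "waveform"})
print(flags, sorted(fields_present(flags)))
```

A reader first decodes flags, then uses `fields_present` to decide which of title, performer, and waveform to read after the mandatory duration field.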

Layer 66

  • Adds support for round messages with documentAttributeVideo#ef02ce6 flags:# round_message:flags.0?true duration:int w:int h:int and the respective sendMessageRecordRoundAction and sendMessageUploadRoundAction.

Layer 73

  • Adds grouped_id to decryptedMessage; it is only added if a type is modified.

Layer 101

  • Adds styling to MessageEntity such as messageEntityUnderline, messageEntityStrike and messageEntityBlockquote.

Layer 143

  • Adds a size field of type long to decryptedMessageMediaDocument.

Layer 144

  • Adds messageEntitySpoiler and messageEntityCustomEmoji types.

Key Takeaways

  • Evolutionary Design: The layered approach allows Telegram to incrementally add features and modify structures while maintaining compatibility.
  • Focus on End-to-End Encryption: These schemas specifically define how encrypted messages and related data are structured.
  • Detailed Media Handling: The schemas are complex to accommodate various forms of media (photos, videos, audios, documents) within encrypted chats.
  • Extensibility: The use of flags and optional fields allows new features to be added with minimal changes to existing structures.
  • Flags: TL flags are heavily used to define conditional fields using dependent types, which are a powerful way to compress serialized data.
  • Flexibility: These schemas enable Telegram to support various features such as mentions, hashtags, styled text, inline bot support, and URLs in encrypted messages.

Implications

  • Security: The schemas help enforce end-to-end encryption by defining how data is structured and exchanged.
  • Interoperability: All Telegram clients need to use the same schemas to ensure that encrypted chats work seamlessly.
  • Efficiency: The usage of efficient serializations and data representation promotes lower bandwidth usage.

This document provides valuable insights into the structure of end-to-end encrypted communication within Telegram, demonstrating how various types and combinators work together to secure message exchanges. Understanding these schemas is crucial for anyone working with Telegram's security features or implementing clients supporting end-to-end encryption.


This document provides a high-level overview of the MTProto mobile protocol, used by Telegram for communication between clients and servers. It breaks down the protocol into its key components and explains how they interact.

Key Concepts

  • Mobile-Centric Design: MTProto is designed for mobile clients. It's not meant for web browsers.
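  • Modular Architecture: The protocol is divided into three largely independent layers: the high-level component (RPC/API), the authorization and encryption layer, and the transport component.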
  • MTProto 2.0: The current version of MTProto used for cloud chats. It replaces the deprecated MTProto v1.0.

1. High-Level Component (RPC/API)

  • Sessions: Clients and servers exchange messages within a session, which is associated with the device and user key ID. Sessions are not bound to specific network connections.
  • Multiple Connections: A single client may have multiple open connections to a server.
  • Connection Independence: Responses don't need to be returned through the same connection that sent the query. However, in most cases they are returned on the same connection to reduce overhead.
  • UDP caveat: When using UDP, a response may come from a different IP address.
  • Message Types: Several types of messages exist:
  • Binary Data Format: At lower levels, messages are binary data streams aligned on 4-byte or 16-byte boundaries.
  • Message Structure:

2. Authorization and Encryption

  • Authorization Key: The client generates an authorization key upon first run (this usually doesn't change). It's the basis of authentication with the server.
  • External Header: Each message is prepended with:
  • AES-256 Encryption: Message encryption uses AES-256.
  • Variable Data: The initial part of a message (including the message header, session ID, message ID, and sequence number) is included in the encryption to guarantee unpredictability of the ciphertext even for consecutive identical payloads.
  • Perfect Forward Secrecy (PFS): MTProto provides Perfect Forward Secrecy (PFS), ensuring that if the authorization key is compromised, past encrypted messages are not revealed. PFS is achieved through a complex key derivation mechanism.

3. Time Synchronization

  • Time Discrepancies: If the client's time significantly differs from the server's time, the server might ignore the messages.
  • Time Synchronization Message: The server sends a special message containing:
  • Synchronization Logic: The client:
  • New Session Generation: If time synchronization is not done, the client needs to generate a new session because its message identifiers will no longer be monotonic.
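
The link between clock correction and monotonic message identifiers can be sketched as below. This is an illustration under stated assumptions, not the reference algorithm: it assumes identifiers are derived from Unix time multiplied by 2^32, that client identifiers must be divisible by 4, and that they must strictly increase within a session. `MessageIdGenerator` and its methods are hypothetical names.

```python
import time


class MessageIdGenerator:
    """Produces increasing, time-based identifiers with a server offset."""

    def __init__(self, clock=time.time):
        self._clock = clock   # injectable so the sketch can be tested
        self._offset = 0.0    # server time minus local time, in seconds
        self._last = 0

    def sync(self, server_time):
        """Record how far the local clock is from the server's clock."""
        self._offset = server_time - self._clock()

    def next_id(self):
        now = self._clock() + self._offset
        # Assumption: id is roughly unixtime * 2**32, low two bits cleared.
        candidate = int(now * (1 << 32)) & ~3
        if candidate <= self._last:
            # Keep the sequence strictly increasing and divisible by 4.
            candidate = self._last + 4
        self._last = candidate
        return candidate


gen = MessageIdGenerator(clock=lambda: 1000.0)  # a deliberately wrong clock
gen.sync(server_time=1500.0)
first, second = gen.next_id(), gen.next_id()
print(first >> 32, second - first)
```

With the fake clock stuck at 1000 seconds and the server reporting 1500, the identifiers are built from the corrected time, and the second one is bumped by 4 to stay strictly increasing.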

4. MTProto Transport Protocols

  • Secondary Header: Before being sent over the network, messages are wrapped with a secondary header defined by the MTProto transport protocol.
  • Supported Transports: The document lists multiple transport protocols:
  • Protocol Identification: The server determines the specific protocol based on the header.
  • Optional features:
  • Example implementations for transport protocols are seen in libraries like tdlib and MadelineProto.
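
To show what such a secondary header can look like, here is a Python sketch of the length prefix used by the "abridged" transport as this sketch understands it: the payload length in 4-byte words is sent as one byte when it is below 127, and otherwise as the byte 0x7f followed by a 3-byte little-endian length. The transport name, the exact layout, and the function names are assumptions brought in for illustration, since the list of transports is not reproduced above; connection setup and acknowledgement bits are left out.

```python
def frame_abridged(payload):
    """Prefix a 4-byte-aligned payload with its length in 32-bit words."""
    if len(payload) % 4 != 0:
        raise ValueError("payload must be aligned to 4 bytes")
    words = len(payload) // 4
    if words < 0x7F:
        header = bytes([words])
    else:
        header = b"\x7f" + words.to_bytes(3, "little")
    return header + payload


def unframe_abridged(frame):
    """Strip the length prefix and return the payload it describes."""
    if frame[0] == 0x7F:
        words, offset = int.from_bytes(frame[1:4], "little"), 4
    else:
        words, offset = frame[0], 1
    return frame[offset:offset + 4 * words]


short = frame_abridged(b"\x00" * 8)
long_frame = frame_abridged(b"\x01" * 800)
print(short[:1].hex(), long_frame[:4].hex())
```

The framing relies on the 4-byte alignment of messages mentioned earlier, which is what allows the length to be expressed in words rather than bytes.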

5. Transport Protocols

  • Purpose: Transport protocols handle the delivery of encrypted message payloads with their external headers from the client to the server and back.
  • Supported Protocols:

6. Protocol Stack (Recap)

The document maps MTProto onto the ISO/OSI stack:

  • Layer 7 (Application): High-level RPC API.
  • Layer 6 (Presentation): Type Language (TL) - defines how data is structured and serialized.
  • Layer 5 (Session): MTProto session management.
  • Layer 4 (Transport):
  • Layer 3 (Network): IP
  • Layer 2 (Data link): MAC/LLC
  • Layer 1 (Physical): IEEE 802.3, 802.11, etc.

Key Takeaways

  • Multi-Layered Protocol: MTProto has clear separation of layers, each responsible for different aspects of communication.
  • Binary-Centric: The entire protocol revolves around the transmission of binary data streams.
  • Security Focused: Strong emphasis on encryption, authorization, and perfect forward secrecy for security.
  • Session Management: Sessions are core to how communication is handled.
  • Flexibility: MTProto supports different transport protocols.
  • Time-aware: The protocol includes a time synchronization mechanism to avoid data loss when the client and server clocks are not synchronized.

Implications:

  • Efficiency: The binary nature of the protocol makes it efficient in terms of data usage and processing.
  • Security: The multi-layered approach and security features ensure the integrity of communications between the client and the server.
  • Scalability: The architecture is robust enough to handle a large number of clients and connections to the server.

This document provides a well-structured understanding of MTProto, crucial for anyone developing Telegram clients or understanding the inner workings of the Telegram network.

