Alternative Routing-ID Interpretation:
- ARI is an optional feature that increases the number of functions a single device can support from 8 to 256.
- It achieves this by reinterpreting the Device Number and Function Number fields as a single 8-bit Function Number.
- ARI benefits both virtualized and non-virtualized environments by allowing more concurrent users of a multi-Function device.
- Requires new software and is managed via the ARI Capability structure in ARI Devices and the Device Capabilities 2 and Device Control 2 registers in ARI Downstream Ports.
- Backward Compatibility: ARI is designed to be backward compatible with existing PCI Express specifications.
- Extended Functions: Functions with a Function Number greater than 7 are Extended Functions, accessible after ARI Forwarding is enabled in the Downstream Port above the ARI Device.
- Function Groups: Functions within an ARI Device can be configured into Function Groups for VC arbitration or access control.
- Topology: Root Ports and Switch Downstream Ports must support ARI Forwarding to access Extended Functions.
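The reinterpretation of the Routing ID described above can be sketched in a few lines of Python. This is an illustrative decoder only: the function names and the dictionary layout are assumptions made for this example, and it assumes the usual 16-bit Routing ID layout (8-bit Bus Number, then 5-bit Device Number and 3-bit Function Number in the traditional case).

```python
def decode_routing_id(routing_id: int, ari: bool = False) -> dict:
    """Split a 16-bit Routing ID into its fields (illustrative helper)."""
    if not 0 <= routing_id <= 0xFFFF:
        raise ValueError("Routing ID must fit in 16 bits")
    bus = (routing_id >> 8) & 0xFF
    if ari:
        # ARI: the low 8 bits form a single Function Number (0-255).
        return {"bus": bus, "function": routing_id & 0xFF}
    # Traditional: 5-bit Device Number followed by a 3-bit Function Number.
    return {
        "bus": bus,
        "device": (routing_id >> 3) & 0x1F,
        "function": routing_id & 0x07,
    }


def is_extended_function(function_number: int) -> bool:
    """Extended Functions are those with a Function Number greater than 7."""
    return function_number > 7


traditional = decode_routing_id(0x0142)
with_ari = decode_routing_id(0x0142, ari=True)
```

The same 16 bits yield Device 8, Function 2 under the traditional interpretation and Function 66 (an Extended Function) under ARI.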
Traditional Multi-Function Devices (non-ARI):
- Limited to a single Device Number but can implement up to eight independent Functions.
- Each internal Function is selected based on decoded address information provided as part of the address portion of Configuration Request packets.
- Phantom Functions Supported Field: In non-ARI multi-Function devices, the Phantom Functions Supported field within each Function's Device Capabilities register indicates support for using unclaimed Function Numbers to extend the number of outstanding transactions. This is achieved by logically combining unclaimed Function Numbers (called Phantom Functions) with the Tag identifier.
- ARI Devices and Phantom Functions: In every Function of an ARI Device, the Phantom Functions Supported field must be set to 00b, indicating that Phantom Functions are not supported in ARI Devices.
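As a rough illustration of how Phantom Functions extend the number of outstanding transactions, the sketch below multiplies the Tag space by the number of borrowed Function Number values. It assumes the 2-bit field value equals the number of Function Number bits that may be borrowed (00b = none, 11b = all three); the function name and the `tag_bits` parameter are hypothetical and chosen for this example.

```python
def max_outstanding_requests(tag_bits: int, phantom_field: int,
                             ari_device: bool = False) -> int:
    """Upper bound on outstanding requests for one Function (sketch).

    phantom_field is the Phantom Functions Supported value (0b00..0b11),
    assumed here to equal the number of borrowed Function Number bits.
    """
    if phantom_field not in (0b00, 0b01, 0b10, 0b11):
        raise ValueError("Phantom Functions Supported is a 2-bit field")
    if ari_device and phantom_field != 0b00:
        # ARI Devices must report 00b: Phantom Functions are not supported.
        raise ValueError("ARI Devices must set Phantom Functions Supported to 00b")
    # Each borrowed Function Number bit doubles the effective Tag space.
    return (1 << tag_bits) * (1 << phantom_field)


plain = max_outstanding_requests(tag_bits=8, phantom_field=0b00)
extended = max_outstanding_requests(tag_bits=8, phantom_field=0b11)
```

With an 8-bit Tag, borrowing all three Function Number bits raises the bound from 256 to 2048 in this model.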
- Mechanism: Flow Control (FC) uses Flow Control Packets (FCPs), carried as InitFC and UpdateFC DLLPs, to transfer credit information between transmitter and receiver.
- Independence: Each VC has its own FC credit pool, managed separately to prevent inter-VC blocking.
- Credit Unit: One data credit covers 4 Double Words (DW, 16 bytes) of payload; one header credit covers one maximum-size header plus TLP Digest.
- Prefixes: Receivers supporting End-End TLP Prefixes must also include the maximum prefix size in each header credit.
2. TLP Types and Credit Tracking
- Categories: FC distinguishes three TLP types: Posted Requests (P), e.g., Memory Writes, which require no completion; Non-Posted Requests (NP), e.g., Memory Reads, which require a completion; and Completions (Cpl), the responses to Non-Posted Requests.
- Subdivision: Each type splits into Header (H) and Data (D), yielding six credit types per VC: PH (Posted Headers), PD (Posted Data), NPH (Non-Posted Headers), NPD (Non-Posted Data), CplH (Completion Headers), and CplD (Completion Data).
- Purpose: Separate tracking prevents one type (e.g., PD) from consuming credits needed for another (e.g., NPH).
- Calculation: TLPs consume credits based on type and size. Example: a Memory Write (posted) consumes 1 PH credit for the header and n PD credits for the data, where n = Roundup(payload size / 16 bytes), i.e., Roundup(Length / 4) with Length in DW. E.g., a 10-byte payload → Roundup(10 / 16) = 1 PD credit.
- Granularity: Credits align to 4 DW units for data, ensuring efficient buffer allocation.
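The credit calculation above can be expressed as a small helper. This is a minimal sketch: the function names and the `kind` strings ("P", "NP", "Cpl") are conventions chosen for this example, and it assumes every TLP consumes exactly one header credit of its type.

```python
FC_UNIT_BYTES = 16  # one data credit = 4 DW = 16 bytes


def data_credits(payload_bytes: int) -> int:
    """Number of data credits needed, rounded up to whole 4-DW units."""
    if payload_bytes < 0:
        raise ValueError("payload size cannot be negative")
    return (payload_bytes + FC_UNIT_BYTES - 1) // FC_UNIT_BYTES


def credits_for_tlp(kind: str, payload_bytes: int = 0) -> dict:
    """Credits consumed by one TLP of type 'P', 'NP', or 'Cpl'."""
    if kind not in ("P", "NP", "Cpl"):
        raise ValueError("kind must be 'P', 'NP', or 'Cpl'")
    # One header credit of the matching type, plus data credits for the payload.
    return {kind + "H": 1, kind + "D": data_credits(payload_bytes)}


small_write = credits_for_tlp("P", payload_bytes=10)
read_request = credits_for_tlp("NP")
```

A 10-byte Memory Write needs one PH and one PD credit, while a Memory Read request carries no payload and needs only one NPH credit.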
4. Virtual Channel Management
- VC0: Initialized autonomously by hardware after reset (no software required).
- Other VCs: Enabled/disabled by software via VC Capability structures (Section 7.9.1); enabling or disabling a VC resets its FC tracking.
- InitFC DLLPs: InitFC1 and InitFC2 are used during VC initialization to establish credit limits. They are sent after reset or when a new VC is enabled.
5. Credit Limits and Errors
- Limits (non-scaled FC): A receiver may have at most 2047 unused data credits (PD, NPD, CplD) and 127 unused header credits (PH, NPH, CplH) outstanding. Exceeding these limits is a Flow Control Protocol Error (FCPE), reported per Section 6.2.
- Infinite Credits: If a credit type is advertised as infinite during initialization (a credit value of 0 in the InitFC DLLP), no further UpdateFC FCPs are needed for that type.
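A hedged sketch of how an implementation might interpret an initial credit advertisement under these rules follows. It assumes non-scaled FC and that an initial value of 0 encodes infinite credits; the exception class and function name are invented for illustration.

```python
import math


class FlowControlProtocolError(Exception):
    """Illustrative stand-in for an FCPE report."""


MAX_HEADER_CREDITS = 127   # PH, NPH, CplH (non-scaled FC)
MAX_DATA_CREDITS = 2047    # PD, NPD, CplD (non-scaled FC)


def classify_initial_credits(credit_type: str, value: int) -> float:
    """Return math.inf for an infinite advertisement, else the finite count."""
    if credit_type not in ("PH", "PD", "NPH", "NPD", "CplH", "CplD"):
        raise ValueError("unknown credit type")
    if value == 0:
        # Assumption: 0 at initialization means infinite credits.
        return math.inf
    limit = MAX_HEADER_CREDITS if credit_type.endswith("H") else MAX_DATA_CREDITS
    if value > limit:
        raise FlowControlProtocolError(
            f"{credit_type} advertisement {value} exceeds limit {limit}")
    return value
```

A type classified as infinite would then be excluded from UpdateFC handling for the lifetime of the VC.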
6. Virtual Channel Enablement
- Disabled VC: TLPs using a disabled VC are treated as Malformed TLPs, discarded, and reported (Section 6.2).
- Transmission Rule: No TLP transmission until VC initialization completes (credits established).
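- Variables (transmitter): CREDITS_CONSUMED counts the FC units consumed by transmitted TLPs (incremented per TLP, modulo 2^[Field Size]). CREDIT_LIMIT holds the most recent credit total advertised by the receiver (set via InitFC/UpdateFC).
- Gating: Transmission is allowed if (CREDIT_LIMIT - CUMULATIVE_CREDITS_REQUIRED) mod 2^[Field Size] ≤ 2^[Field Size]/2, where CUMULATIVE_CREDITS_REQUIRED = (CREDITS_CONSUMED + credits required by the pending TLP) mod 2^[Field Size]. For non-scaled FC, [Field Size] is 8 for header credits and 12 for data credits. Infinite Credits: If CREDIT_LIMIT is infinite, gating is always satisfied (no credit check).
- Overflow Prevention: The transmitter holds a TLP until sufficient credits are available, so the receiver's buffers cannot overflow.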
- Variables (receiver): CREDITS_ALLOCATED counts the total credits granted to the transmitter since initialization; its value is advertised via InitFC/UpdateFC DLLPs. CREDITS_RECEIVED (optional) counts the credits consumed by received TLPs.
- Overflow Check (optional): A receiver overflow is flagged if (CREDITS_ALLOCATED - CREDITS_RECEIVED) mod 2^[Field Size] ≥ 2^[Field Size]/2, which detects a transmitter that has exceeded its allocated credits.
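The transmitter gating test and the optional receiver overflow check both rely on modular arithmetic so that the counters can wrap. The sketch below implements the two equations as written above; the function names are illustrative, and the field sizes of 8 bits (header) and 12 bits (data) assume non-scaled FC.

```python
HEADER_FIELD_SIZE = 8   # bits, non-scaled FC (assumption)
DATA_FIELD_SIZE = 12    # bits, non-scaled FC (assumption)


def may_transmit(credit_limit: int, credits_consumed: int,
                 credits_required: int, field_size: int,
                 infinite: bool = False) -> bool:
    """Transmitter gating: True if the pending TLP may be sent."""
    if infinite:
        return True  # infinite credits: no check needed
    modulus = 1 << field_size
    cumulative_required = (credits_consumed + credits_required) % modulus
    # Python's % always yields a non-negative result for a positive modulus.
    return (credit_limit - cumulative_required) % modulus <= modulus // 2


def receiver_overflow(credits_allocated: int, credits_received: int,
                      field_size: int) -> bool:
    """Optional receiver check: True if more was received than allocated."""
    modulus = 1 << field_size
    return (credits_allocated - credits_received) % modulus >= modulus // 2


ok = may_transmit(10, 9, 1, HEADER_FIELD_SIZE)        # uses the last credit
blocked = may_transmit(10, 10, 1, HEADER_FIELD_SIZE)  # would exceed the limit
wrapped = may_transmit(2, 250, 1, HEADER_FIELD_SIZE)  # limit wrapped past 255
```

Comparing against half the counter range is what lets both checks stay correct after CREDIT_LIMIT or CREDITS_ALLOCATED wraps around.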
9. UpdateFC FCP Scheduling
- Conditions: UpdateFC DLLPs are sent when available credits for a type (e.g., PH) reach zero or, with Scaled FC active, when credits fall below an implementation-specific threshold.
- Purpose: Ensures transmitter knows current buffer availability, maintaining flow.
Virtual Channel (VC) Mechanism
The Virtual Channel (VC) mechanism in PCIe enables the differentiation of traffic flows by associating them with independent virtual channels, each identified by a Virtual Channel Identification (VC ID). This allows multiple traffic streams, labeled with Traffic Classes (TCs), to coexist on the same physical link with separate flow control, preventing a single flow from blocking others. The VC mechanism is foundational to Quality of Service (QoS) and efficient resource utilization in PCIe systems.
Key Points
1. Purpose and Functionality
- Traffic Differentiation: VCs use TC labels to categorize traffic (e.g., TC0 for general I/O, TC7 for high-priority), enabling differentiated handling across the PCIe fabric.
- Independent Resources: Each VC has dedicated queues/buffers and control logic, ensuring fully independent flow control. This eliminates bottlenecks where one traffic flow could stall others (e.g., flow-control-induced blocking).
- Mechanism: Traffic is mapped to VCs using TC labels, controlled by configuration software.
- Flexibility: Mappings may be 1:1, with one TC per VC (e.g., TC0/VC0, TC1/VC1), or many-to-one, with multiple TCs sharing a VC (e.g., TC0-6/VC0, TC7/VC1) for cost/performance trade-offs.
- Default Mapping: TC0 is always mapped to VC0 (hardwired), mandatory for all devices.
3. VC Establishment and Identification
- VC ID: A unique identifier (0-7) assigned to each VC resource within a port, set by configuration software.
- Support: VC0 (with TC0) is mandatory; additional VCs (1-7) are optional. Ports supporting >1 VC must implement VC or MFVC (Multi-Function VC) Capability structures.
- Rules: VC IDs must be unique within a port and match across both sides of a link. VC0 is fixed as the default VC.
- Non-VC Devices: Devices without VC capability must generate requests only with TC0, and must accept and preserve non-TC0 labels in requests/completions; switches without VC capability map all TCs to VC0.
- VC-Capable Devices: Must support configurable TC/VC mappings via capability structures.
5. Implementation Details
- Buffering: Minimum buffering is architecturally defined, but additional buffering is implementation-specific and may vary per VC (e.g., more for VC0 than VC1). Buffers can be reassigned dynamically if fewer VCs are enabled (e.g., VC1 buffers to VC0).
- Switch/MFD: Switches require dedicated VC resources per port. Multi-Function Devices (MFDs) may use MFVC for QoS across functions.
- Configuration: Software scans VC/MFVC Capability registers to set up VCs across links.
6. TC/VC Mapping Examples
- VC0: TC0-7/VC0 (single VC).
- VC0, VC1: TC0-6/VC0, TC7/VC1 (two VCs).
- VC0-VC3: TC0-1/VC0, TC2-4/VC1, TC5-6/VC2, TC7/VC3 (four VCs).
- VC0-VC7: TC0/VC0, ..., TC7/VC7 (eight VCs, 1:1 mapping).
- Mandatory Support: TC0/VC0 is required for all devices.
- Independent Flow Control: Each VC has its own credit pool, managed via DLLPs (Data Link Layer Packets) with VC ID.
- No Ordering: No ordering required between different TCs or VCs, enhancing flexibility.
- Malformed TLPs: Transactions whose TC is not mapped to any enabled VC are treated as Malformed TLPs and discarded.
- Port Independence: Switches and Root Complexes support independent TC/VC mappings per port/RCRB.
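The mapping examples and rules above can be captured as simple lookup tables. The following is a sketch for one port, not a model of the VC Capability registers: the table names, the `MalformedTlp` exception, and both helper functions are hypothetical.

```python
class MalformedTlp(Exception):
    """Raised when a TLP's TC is not mapped to any enabled VC."""


def validate_tc_vc_map(tc_to_vc: dict) -> None:
    """Check the basic rules: IDs in range 0-7 and TC0 fixed to VC0."""
    for tc, vc in tc_to_vc.items():
        if not (0 <= tc <= 7 and 0 <= vc <= 7):
            raise ValueError(f"TC{tc}/VC{vc} is out of range")
    if tc_to_vc.get(0) != 0:
        raise ValueError("TC0 must always map to VC0")


def vc_for_tc(tc: int, tc_to_vc: dict) -> int:
    """Return the VC carrying this TC, or reject the TLP as malformed."""
    if tc not in tc_to_vc:
        raise MalformedTlp(f"TC{tc} is not mapped to an enabled VC")
    return tc_to_vc[tc]


# The four example configurations from the list above.
SINGLE_VC = {tc: 0 for tc in range(8)}
TWO_VCS = {**{tc: 0 for tc in range(7)}, 7: 1}
FOUR_VCS = {0: 0, 1: 0, 2: 1, 3: 1, 4: 1, 5: 2, 6: 2, 7: 3}
EIGHT_VCS = {tc: tc for tc in range(8)}

for mapping in (SINGLE_VC, TWO_VCS, FOUR_VCS, EIGHT_VCS):
    validate_tc_vc_map(mapping)
```

Because each TC appears as a dictionary key at most once, a TC can never map to more than one VC in this representation.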
8. Flow Control Integration
- Point-to-Point: Flow control is link-specific, not end-to-end, tracking buffer space between adjacent devices.
- VC-Specific: Each VC maintains independent credits, conveyed via DLLPs with VC ID, ensuring no inter-VC blocking.
- Transaction Layer (TL) Role: The TL manages flow control credits and gates TLP transmission based on the credits available.
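To illustrate why per-VC credits prevent inter-VC blocking, the sketch below keeps a separate pool of the six credit types for each VC on one link. It is deliberately simplified: it tracks available credits directly rather than the modular counters used by real hardware, and the class and method names are invented for this example.

```python
from dataclasses import dataclass, field

CREDIT_TYPES = ("PH", "PD", "NPH", "NPD", "CplH", "CplD")


@dataclass
class VcCreditPool:
    """Available credits of each type for one VC (simplified model)."""
    available: dict = field(
        default_factory=lambda: dict.fromkeys(CREDIT_TYPES, 0))

    def can_send(self, needed: dict) -> bool:
        return all(self.available[t] >= n for t, n in needed.items())

    def send(self, needed: dict) -> None:
        if not self.can_send(needed):
            raise RuntimeError("insufficient credits; TLP must wait")
        for credit_type, count in needed.items():
            self.available[credit_type] -= count

    def update_fc(self, credit_type: str, released: int) -> None:
        """Model an UpdateFC returning credits for one type."""
        self.available[credit_type] += released


# One link with two enabled VCs, each advertised 1 PH and 4 PD credits.
link = {0: VcCreditPool(), 1: VcCreditPool()}
for pool in link.values():
    pool.available.update(PH=1, PD=4)

write = {"PH": 1, "PD": 4}   # a 64-byte Memory Write
link[0].send(write)          # exhausts VC0's posted credits

vc0_blocked = not link[0].can_send(write)
vc1_ready = link[1].can_send(write)
```

After the first write, VC0 must wait for an UpdateFC while VC1 can still transmit, which is the independence property described above.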