RapidIO

RapidIO - the unified fabric for Performance Critical Computing
Year created: 2000
Width in bits: Port widths of 1, 2, 4, 8, and 16 lanes
Number of devices: Sizes of 256, 65,536, and 4,294,967,296
Speed, per lane (each direction):

  • 1.x: 1.25, 2.5, 3.125 Gbaud
  • 2.x: added 5 and 6.25 Gbaud
  • 3.x: added 10.3125 Gbaud
  • 4.x: added 12.5 and 25.3125 Gbaud
Style: Serial
Hotplugging interface: Yes
External interface: Yes; chip-to-chip, board-to-board (backplane), chassis-to-chassis

The RapidIO architecture is a high-performance, packet-switched interconnect technology. RapidIO supports messaging, read/write, and cache coherency semantics. RapidIO fabrics guarantee in-order packet delivery, enabling power- and area-efficient protocol implementation in hardware. Based on industry-standard electrical specifications such as those for Ethernet, RapidIO can be used as a chip-to-chip, board-to-board, and chassis-to-chassis interconnect. The protocol is marketed as "RapidIO - the unified fabric for Performance Critical Computing"[1] and is used in applications such as data center and HPC, communications infrastructure, industrial automation, and military and aerospace systems that are constrained by at least one of size, weight, and power (SWaP).

History

RapidIO has its roots in energy-efficient, high-performance computing. The protocol was originally designed by Mercury Computer Systems and Motorola (Freescale) as a replacement for Mercury's RACEway proprietary bus and Freescale's PowerPC bus.[2] The RapidIO Trade Association was formed in February 2000, and included telecommunications and storage OEMs as well as FPGA, processor, and switch companies. The protocol was designed to meet the following objectives:

The RapidIO Specification Revision 1.1, released in 2001, defined a wide, parallel bus. This specification did not achieve extensive commercial adoption.

The RapidIO Specification Revision 1.2, released in 2002,[3] defined a serial interconnect based on the XAUI physical layer. Devices based on this specification achieved significant commercial success in wireless baseband,[4] imaging, and military computing.[5]

The RapidIO Specification Revision 2.0, released in 2008,[6] added more port widths (2×, 8×, and 16×) and increased the maximum lane speed to 6.25 GBd / 5 Gbit/s. Revision 2.1 has repeated and expanded the commercial success of the 1.2 specification.[7]
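The two figures quoted for the Revision 2.0 lane speed are related by the line coding. As a rough sketch, assuming 8b/10b encoding (8 data bits per 10 line bits, which is what the 6.25 GBd / 5 Gbit/s pair implies) and ignoring packet and control-symbol overhead, the per-direction data rate of a port can be estimated as follows. The function name is illustrative only.

```python
from fractions import Fraction

# Assumption: 8b/10b line coding, i.e. 8 data bits carried per 10 line bits.
CODING_EFFICIENCY = Fraction(8, 10)


def port_data_rate_gbps(lane_gbaud, lanes):
    """Estimate the per-direction data rate of a port in Gbit/s."""
    # Fractions keep values such as 6.25 * 0.8 exact before converting to float.
    return float(Fraction(str(lane_gbaud)) * CODING_EFFICIENCY * lanes)


if __name__ == "__main__":
    for lanes in (1, 2, 4, 8, 16):
        rate = port_data_rate_gbps(6.25, lanes)
        print(f"{lanes:>2}x port at 6.25 GBd per lane: {rate:g} Gbit/s")
```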

The RapidIO Specification Revision 3.0, released in 2013,[8] has the following changes and improvements compared to the 2.x specifications:

The RapidIO Specification Revision 4.0, released in 2016,[9] has the following changes and improvements compared to the 3.x specifications:

RapidIO used in wireless infrastructure

RapidIO fabrics hold a dominant market share in the global deployment of 3G, 4G, and LTE cellular infrastructure, with millions of RapidIO ports shipped[10] into wireless base stations worldwide. RapidIO fabrics were originally designed to connect different types of processors from different manufacturers in a single system. This flexibility has driven the widespread use of RapidIO in wireless infrastructure equipment, where heterogeneous DSPs, FPGAs, and communication processors must be combined in a tightly coupled system with low latency and high reliability.

RapidIO used in data center / HPC analytics

Data center and HPC analytics systems have been deployed using a RapidIO 2D torus mesh fabric[11] that provides a high-speed, general-purpose interface among the system cartridges for applications that benefit from high-bandwidth, low-latency node-to-node communication. The RapidIO 2D torus unified fabric is routed as a torus ring configuration connecting up to 45 server cartridges, each with 5 Gbit/s per lane connections in each direction to its north, south, east, and west neighbors. This allows the system to serve HPC applications that need efficient localized traffic.

Also, using an open modular data center and compute platform,[12] a heterogeneous HPC system has showcased the low latency attribute of RapidIO to enable real-time analytics.[13] In March 2015 a top-of-rack switch was announced to drive RapidIO into mainstream data center applications.[14]

RapidIO in aerospace

The interconnect, or "bus", is one of the critical technologies in the design and development of spacecraft avionic systems, because it dictates their architecture and level of complexity. A host of existing architectures remain in use because of their maturity, and they are sufficient for the needs and requirements they were designed for. Next-generation missions, however, call for a more capable avionics architecture, well beyond what existing architectures can provide. A viable option for designing and developing these next-generation architectures is to leverage existing commercial protocols capable of accommodating high levels of data transfer.

In 2012, RapidIO was selected by the Next Generation Spacecraft Interconnect Standard (NGSIS) working group to serve as the foundation for standard communication interconnects to be used in spacecraft. The NGSIS is an umbrella standards effort that includes RapidIO Version 3.1 development and a box-level hardware standards effort under VITA 78 called SpaceVPX or High Reliability VPX. The NGSIS requirements committee developed extensive requirements criteria with 47 different elements for the NGSIS interconnect. Independent trade study results by NGSIS member companies demonstrated the superiority of RapidIO over other existing commercial protocols, such as InfiniBand, Fibre Channel, and 10G Ethernet. As a result, the group decided that RapidIO offered the best overall interconnect for the needs of next-generation spacecraft.[15]

RapidIO specification 3.1

The RapidIO Specification Revision 3.1, released in 2014,[16] was developed through a collaboration between the RapidIO Trade Association and NGSIS. Revision 3.1 has the following enhancements compared to the 3.0 specification:

RapidIO specification 4.0

The RapidIO Specification Revision 4.0, released in 2016,[17] has the following changes and improvements compared to the 3.x specifications:

PHY roadmap

The RapidIO roadmap aligns with Ethernet PHY development. RapidIO specifications for 50 GBd and higher links are under investigation.[20]

Terminology

Link Partner
One end of a RapidIO link.
Endpoint
A device that can originate and/or terminate RapidIO packets.
Processing Element
A device that has at least one RapidIO port.
Switch
A device that can route RapidIO packets.

Protocol overview

The RapidIO protocol is defined in a 3-layered specification:

System specifications include:

Physical layer

The RapidIO electrical specifications are based on industry-standard Ethernet and Optical Interconnect Forum standards:

The RapidIO PCS/PMA layer supports two forms of encoding/framing:

Every RapidIO processing element transmits and receives three kinds of information: Packets, control symbols, and an idle sequence.

Packets

Every packet has two values that control the physical layer exchange of that packet. The first is an acknowledge ID (ackID), which is the link-specific, unique, 5-, 6-, or 12-bit value that is used to track packets exchanged on a link. Packets are transmitted with serially increasing ackID values. Because the ackID is specific to a link, the ackID is not covered by CRC, but by protocol. This allows the ackID to change with each link it passes over, while the packet CRC can remain a constant end-to-end integrity check of the packet. When a packet is successfully received, it is acknowledged using the ackID of the packet. A transmitter must retain a packet until it has been successfully acknowledged by the link partner.
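The ackID bookkeeping described above can be modelled in a few lines. This is a minimal sketch, not an implementation of the specification: the class and method names are made up, and it assumes that acknowledgements arrive in transmission order and that one ackID value is always left unused so a wrapped ackID can never collide with a packet that is still outstanding.

```python
from collections import OrderedDict


class LinkTransmitter:
    """Toy model of the transmit side of one link (illustrative names)."""

    def __init__(self, ackid_bits=5):
        self.modulus = 1 << ackid_bits   # ackIDs wrap around at 2**bits
        self.next_ackid = 0
        self.unacked = OrderedDict()     # ackID -> retained packet, oldest first

    def send(self, packet):
        # Assumption: keep one ackID free so outstanding IDs stay unique.
        if len(self.unacked) >= self.modulus - 1:
            raise RuntimeError("no free ackID; wait for acknowledgements")
        ackid = self.next_ackid
        self.unacked[ackid] = packet     # retain until the link partner acks
        self.next_ackid = (ackid + 1) % self.modulus
        return ackid

    def acknowledge(self, ackid):
        # Assumption: acknowledgements arrive in the order packets were sent.
        if not self.unacked:
            raise ValueError("nothing is waiting for an acknowledgement")
        oldest = next(iter(self.unacked))
        if ackid != oldest:
            raise ValueError(f"expected ackID {oldest}, got {ackid}")
        return self.unacked.pop(ackid)   # the retained copy can now be freed
```

With a 5-bit ackID the sketch allows 31 packets in flight; the 32nd `send` raises until an acknowledgement frees a slot, after which the ackID sequence wraps from 31 back to 0.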

The second value is the packet's physical priority. The physical priority is composed of the Virtual Channel (VC) identifier bit, the Priority bits, and the Critical Request Flow (CRF) bit. The VC bit determines if the Priority and CRF bits identify a Virtual Channel from 1 to 8, or are used as the priority within Virtual Channel 0. Virtual Channels are assigned guaranteed minimum bandwidths. Within Virtual Channel 0, packets of higher priority can pass packets of lower priority. Response packets must have a physical priority higher than requests in order to avoid deadlock.

The physical layer contribution to RapidIO packets is a 2-byte header at the beginning of each packet that includes the ackID and physical priority, and a final 2-byte CRC value to check the integrity of the packet. Packets larger than 80 bytes also have an intermediate CRC after the first 80 bytes. With one exception, a packet's CRC value(s) act as an end-to-end integrity check.
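The framing cost described above can be tallied with simple arithmetic. The following is only a sketch: the function name is made up, the intermediate CRC is assumed to be 2 bytes like the final one, and the 80-byte threshold is assumed to apply to the packet length (physical header plus transport and logical content) before any CRC fields are added, which the text does not spell out.

```python
HEADER_BYTES = 2             # physical layer header: ackID and physical priority
CRC_BYTES = 2                # assumed size of each CRC field
INTERMEDIATE_CRC_AFTER = 80  # packets longer than this carry a second CRC


def physical_overhead(upper_layer_bytes):
    """Return (crc_fields, total_bytes_on_link) for one packet.

    Assumption: the length compared against 80 bytes is the physical
    header plus the transport/logical content, not counting CRC fields.
    """
    length = HEADER_BYTES + upper_layer_bytes
    crc_fields = 2 if length > INTERMEDIATE_CRC_AFTER else 1
    return crc_fields, length + crc_fields * CRC_BYTES
```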

Control symbols

RapidIO control symbols can be sent at any time, including within a packet. This gives RapidIO the lowest possible in-band control path latency, enabling the protocol to achieve high throughput with smaller buffers than other protocols.

Control symbols are used to delimit packets (Start of Packet, End of Packet, Stomp), to acknowledge packets (Packet Accepted, Packet Not Accepted), to reset devices (Reset Device, Reset Port), and to distribute events within the RapidIO system (Multicast Event Control Symbol). Control symbols are also used for flow control (Retry, Buffer Status, Virtual Output Queue Backpressure) and for error recovery.

The error recovery procedure is a short exchange of control symbols. When a receiver detects a transmission error in the received data stream, the receiver causes its associated transmitter to send a Packet Not Accepted control symbol. When the link partner receives a Packet Not Accepted control symbol, it stops transmitting new packets and sends a Link Request/Port Status control symbol. The receiver answers with a Link Response control symbol, which indicates the ackID that should be used for the next packet transmitted. Packet transmission then resumes.
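The recovery handshake can be traced step by step. The sketch below is illustrative rather than normative: the function name and data shapes are invented, and it assumes that the transmitter resumes by resending its retained packets starting from the ackID reported in the Link Response.

```python
def recover_from_error(unacked, expected_ackid):
    """Replay the recovery handshake for one link.

    unacked: list of (ackID, packet) pairs the transmitter still retains,
             oldest first.
    expected_ackid: the ackID reported by the receiver's Link Response.
    Returns (trace, resend), where trace lists who sends which control
    symbol and resend lists the retained packets to transmit again.
    """
    trace = [
        ("receiver", "Packet Not Accepted"),
        ("transmitter", "Link Request/Port Status"),
        ("receiver", f"Link Response (next ackID {expected_ackid})"),
    ]
    ackids = [ackid for ackid, _ in unacked]
    if expected_ackid in ackids:
        # Assumption: everything from the expected ackID onward is resent.
        resend = unacked[ackids.index(expected_ackid):]
    else:
        # Nothing retained matches, so only new packets follow.
        resend = []
    return trace, resend
```

For example, if packets with ackIDs 3, 4, and 5 are outstanding and the Link Response reports 4, the sketch resends packets 4 and 5 and treats packet 3 as delivered.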

IDLE sequence

The IDLE sequence is used during link initialization for signal quality optimization. It is also transmitted when the link does not have any control symbols or packets to send.

Transport layer

Every RapidIO endpoint is uniquely identified by a Device Identifier (deviceID). Each RapidIO packet contains two device IDs. The first is the destination ID (destID), which indicates where the packet should be routed. The second is the source ID (srcID), which indicates where the packet originated. When an endpoint receives a RapidIO request packet that requires a response, the response packet is composed by swapping the srcID and destID of the request.
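The response rule amounts to swapping two fields. A minimal sketch, using a made-up `Packet` type that carries only the transport-layer fields discussed here:

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Packet:
    """Hypothetical packet holding only the fields needed for this example."""
    dest_id: int
    src_id: int
    payload: bytes = b""


def make_response(request, payload=b""):
    """Build a response by swapping the request's srcID and destID."""
    return replace(
        request,
        dest_id=request.src_id,
        src_id=request.dest_id,
        payload=payload,
    )


request = Packet(dest_id=0x12, src_id=0x34, payload=b"read")
response = make_response(request, b"data")
```

Here `response` is addressed to device 0x34 and reports 0x12 as its source, so the fabric routes it back to the requester without the responder needing any routing knowledge of its own.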

RapidIO switches use the destID of received packets to determine the output port or ports that should forward the packet. Typically, the destID is used to index into an array of control values. The indexing operation is fast and low cost to implement. RapidIO switches support a standard programming model for the routing table, which simplifies system control.
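The table lookup can be pictured as plain array indexing. The following sketch uses hypothetical names and models only unicast routing to a single output port; it does not reflect the standard routing table register layout. The default of 256 entries corresponds to the smallest device ID space.

```python
class Switch:
    """Toy switch with a destID-indexed routing table (illustrative only)."""

    def __init__(self, num_dest_ids=256, default_port=None):
        # One entry per possible destID; each entry holds an output port.
        self.routes = [default_port] * num_dest_ids

    def set_route(self, dest_id, port):
        self.routes[dest_id] = port

    def output_port(self, dest_id):
        # Routing is a single array index, with no search or hashing.
        return self.routes[dest_id]


switch = Switch(default_port=0)
switch.set_route(0x34, 3)
```

Packets for device 0x34 leave through port 3, while every destID without an explicit entry falls back to port 0.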

The RapidIO transport layer supports any network topology, from simple trees and meshes to n-dimensional hypercubes, multi-dimensional toroids, and more esoteric architectures such as entangled networks.

The RapidIO transport layer enables hardware virtualization (for example, a RapidIO endpoint can support multiple device IDs). Portions of the destination ID of each packet can be used to identify specific pieces of virtual hardware within the endpoint.
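One way to picture this is to treat the low-order bits of the destination ID as a selector for virtual hardware inside the endpoint. The split below is purely hypothetical: the function name and the choice of four selector bits are assumptions made for illustration, not part of the specification.

```python
def split_dest_id(dest_id, virtual_bits=4):
    """Split a destID into (endpoint_base, virtual_device).

    Assumption: the endpoint owns a block of 2**virtual_bits consecutive
    device IDs and uses the low bits to pick a virtual device.
    """
    mask = (1 << virtual_bits) - 1
    return dest_id >> virtual_bits, dest_id & mask
```

Under this assumed layout, destID 0xAB addresses virtual device 0xB inside the endpoint that owns the 0xA0 to 0xAF block.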

Logical layer

The RapidIO logical layer is composed of several specifications, each providing packet formats and protocols for different transaction semantics.

Logical I/O

The logical I/O layer defines packet formats for read, write, write-with-response, and various atomic transactions. Examples of atomic transactions are set, clear, increment, decrement, swap, test-and-swap, and compare-and-swap.

Messaging

The Messaging specification defines Doorbells and Messages. Doorbells communicate a 16-bit event code. Messages transfer up to 4 KiB of data, segmented into up to 16 packets, each with a maximum payload of 256 bytes. Response packets must be sent for each Doorbell and Message request. The response packet status value indicates done, error, or retry. A status of retry requests the originator of the request to send the packet again. The logical level retry response allows multiple senders to access a small number of shared reception resources, leading to high throughput with low power.
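The message size limit follows directly from the segmentation rule: 16 packets of 256 bytes give 4,096 bytes. A short sketch of the segmentation step, with invented names and without modelling packet headers or responses:

```python
MAX_SEGMENTS = 16
MAX_SEGMENT_BYTES = 256
MAX_MESSAGE_BYTES = MAX_SEGMENTS * MAX_SEGMENT_BYTES  # 4096 bytes = 4 KiB


def segment_message(data, segment_bytes=MAX_SEGMENT_BYTES):
    """Split a message into payloads of at most segment_bytes each."""
    if not 0 < segment_bytes <= MAX_SEGMENT_BYTES:
        raise ValueError("segment size must be between 1 and 256 bytes")
    if len(data) > MAX_MESSAGE_BYTES:
        raise ValueError("a message carries at most 4 KiB")
    segments = [
        data[offset:offset + segment_bytes]
        for offset in range(0, len(data), segment_bytes)
    ]
    if len(segments) > MAX_SEGMENTS:
        raise ValueError("a message is limited to 16 packets")
    return segments
```

A 300-byte message becomes two packets of 256 and 44 bytes. Choosing a smaller segment size lowers the largest message that fits, since the 16-packet limit still applies.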

Flow control

The Flow Control specification defines packet formats and protocols for simple XON/XOFF flow control operations. Flow control packets can be originated by switches and endpoints. Reception of an XOFF flow control packet halts transmission of a flow or flows until an XON flow control packet is received or a timeout occurs. Flow Control packets can also be used as a generic mechanism for managing system resources.
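The XON/XOFF behaviour, including the timeout, fits in a small state holder. This sketch uses hypothetical names and abstract time units supplied by the caller; it tracks a single flow and does not model the packet formats.

```python
class FlowGate:
    """Tracks whether one flow may transmit (illustrative names)."""

    def __init__(self, timeout):
        self.timeout = timeout     # how long an XOFF stays in force
        self.xoff_time = None      # time of the last XOFF, or None if open

    def on_xoff(self, now):
        self.xoff_time = now

    def on_xon(self):
        self.xoff_time = None

    def may_transmit(self, now):
        if self.xoff_time is None:
            return True
        if now - self.xoff_time >= self.timeout:
            # The XOFF has expired, so the flow resumes without an XON.
            self.xoff_time = None
            return True
        return False
```

The timeout keeps a flow from stalling forever if the XON packet that would release it is lost.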

CC-NUMA

The Globally Shared Memory specification defines packet formats and protocols for operating a cache coherent shared memory system over a RapidIO network.

Data streaming

The Data Streaming specification supports messaging with different packet formats and semantics than the Messaging specification. Data Streaming packet formats support the transfer of up to 64K of data, segmented over multiple packets. Each transfer is associated with a Class of Service and Stream Identifier, enabling thousands of unique flows between endpoints.

The Data Streaming specification also defines Extended Header flow control packet formats and semantics to manage performance within a client-server system. Each client uses extended header flow control packets to inform the server of the amount of work that could be sent to the server. The server responds with extended header flow control packets that use XON/XOFF, rate, or credit based protocols to control how quickly and how much work the client sends to the server.
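Of the three protocols mentioned, the credit-based variant is the easiest to illustrate. In the sketch below, which uses invented names and does not model the extended header packet formats, the client reports its backlog, the server grants credits, and the client sends only as much work as it holds credits for.

```python
class CreditedClient:
    """Toy client for credit-based flow control (illustrative names)."""

    def __init__(self):
        self.credits = 0
        self.backlog = []

    def queue(self, item):
        self.backlog.append(item)

    def pending(self):
        # The amount of work the client would report to the server.
        return len(self.backlog)

    def grant(self, credits):
        # Called when the server's flow control packet grants more credits.
        self.credits += credits

    def send_ready(self):
        # Send as many queued items as the current credits allow.
        count = min(self.credits, len(self.backlog))
        sent = self.backlog[:count]
        self.backlog = self.backlog[count:]
        self.credits -= count
        return sent
```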

System initialization

Systems with a known topology can be initialized in a system specific manner without affecting interoperability. The RapidIO system initialization specification supports system initialization when system topology is unknown or dynamic. System initialization algorithms support the presence of redundant hosts, so system initialization need not have a single point of failure.

Each system host recursively enumerates the RapidIO fabric, seizing ownership of devices, allocating device IDs to endpoints and updating switch routing tables. When a conflict for ownership occurs, the system host with the larger deviceID wins. The "losing" host releases ownership of its devices and retreats, waiting for the "winning" host. The winning host completes enumeration, including seizing ownership of the losing host. Once enumeration is complete, the winning host releases ownership of the losing host. The losing host then discovers the system by reading the switch routing tables and registers on each endpoint to learn the system configuration. If the winning host does not complete enumeration in a known time period, the losing host determines that the winning host has failed and completes enumeration.
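The outcome of an ownership conflict between two hosts can be summarised as a small decision function. This is a sketch with made-up names that captures only the decision, not the enumeration traffic itself:

```python
def resolve_enumeration(host_a_id, host_b_id, winner_completes_in_time=True):
    """Decide which of two competing hosts enumerates the fabric.

    Returns a dict naming the host that enumerates the system and the
    host, if any, that afterwards discovers the configuration.
    """
    if host_a_id == host_b_id:
        raise ValueError("competing hosts need distinct deviceIDs")
    winner = max(host_a_id, host_b_id)   # the larger deviceID wins
    loser = min(host_a_id, host_b_id)
    if winner_completes_in_time:
        # The loser retreats, then learns the configuration by reading
        # the switch routing tables and endpoint registers.
        return {"enumerates": winner, "discovers": loser}
    # The loser times out, assumes the winner failed, and enumerates.
    return {"enumerates": loser, "discovers": None}
```

Because the loser takes over after the timeout, the failure of either host still leaves the system with one host that completes enumeration.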

System enumeration is supported in Linux by the RapidIO subsystem.

Error management

RapidIO supports high availability, fault tolerant system design, including hot swap. The error conditions that require detection, and standard registers to communicate status and error information, are defined. A configurable isolation mechanism is also defined so that when it is not possible to exchange packets on a link, packets can be discarded to avoid congestion and enable diagnosis and recovery activities. In-band (port-write packet) and out-of-band (interrupt) notification mechanisms are defined.

Form factors

The RapidIO specification does not discuss the subjects of form factors and connectors, leaving this to specific application-focussed communities. RapidIO is supported by the following form factors:

Software

Processor-agnostic RapidIO support is found in the Linux kernel.

Applications

The RapidIO interconnect is used extensively in the following applications:

RapidIO is expanding into supercomputing, server, and storage applications.

Competing protocols

PCI Express is targeted at the host-to-peripheral market, as opposed to embedded systems, and is well suited to host-to-peripheral communication. Unlike RapidIO, PCIe is not optimized for peer-to-peer multiprocessor networks. PCIe does not scale as well in large multiprocessor peer-to-peer systems, because the basic PCIe assumption of a "root complex" creates fault tolerance and system management issues.

Another alternative interconnect technology is Ethernet. Ethernet is a robust approach to linking computers over large geographic areas, where network topology may change unexpectedly, the protocols used are in flux, and link latencies are large. To meet these challenges, systems based on Ethernet require significant amounts of processing power, software and memory throughout the network to implement protocols for flow control, data transfer, and packet routing. RapidIO is optimized for energy efficient, low latency, processor-to-processor communication in fault tolerant embedded systems that span geographic areas of less than one kilometre.

SpaceFibre is a competing technology for space applications.[21]

References

  1. http://www.rapidio.org
  2. Fuller, Sam (27 December 2004). "Preface". RapidIO: The Embedded System Interconnect. John Wiley & Sons Ltd. ISBN 0-470-09291-2. Retrieved 9 October 2014.
  3. "RapidIO Standard Revision 1.2". www.rapidio.org. RapidIO Trade Association. 26 June 2002. Retrieved 9 October 2014.
  4. "Integrated Device Technology 2011 Annual Report" (PDF). www.idt.com. Integrated Device Technology Inc. 6 June 2011. p. 4. Retrieved 9 October 2014.
  5. Jag Bolaria (October 15, 2013). "RapidIO Reaches for the Clouds". www.linleygroup.com. The Linley Group. Retrieved 9 October 2014.
  6. "RapidIO Standard Revision 2.0". www.rapidio.org. RapidIO Trade Association. 23 February 2005. Retrieved 9 October 2014.
  7. "Integrated Device Technology 2014 Annual Report" (PDF). www.idt.com. Integrated Device Technology Inc. 28 May 2014. pp. 5, 35. Retrieved 9 October 2014.
  8. "RapidIO Standard Revision 3.0". www.rapidio.org. RapidIO Trade Association. 10 November 2013. Retrieved 9 October 2014.
  9. "RapidIO Standard Revision 4.0". www.rapidio.org. RapidIO Trade Association. June 2016. Retrieved 15 August 2016.
  10. http://www.rcrwireless.com/20121203/opinion/reader-forum-cloud-radio-access-small-cell-networks-based-rapidio
  11. http://www.hpcwire.com/2014/09/24/paypal-finds-order-chaos-hpc/
  12. http://prodrive-technologies.com/prodrive-technologies-announces-datacenter-hpc-system-dccp-280-rapidio-10-gigabit-ethernet/
  13. http://www.businesswire.com/news/home/20141118005342/en/IDT-Orange-Silicon-Valley-NVIDIA-Accelerate-Computing#.VQqdHuF0Uso
  14. http://prodrive-technologies.com/prodrive-technologies-launches-prsb-760g2-large-rapidio-networks/
  15. Patrick Collier (14 October 2013). "Next Generation Space Interconnect Standard (NGSIS): A Modular Open Standards Approach for High Performance Interconnects for Space" (PDF). Reinventing Space Conference. p. 5. Retrieved 9 October 2014.
  16. "RapidIO Standard Revision 3.1" (PDF). www.rapidio.org. RapidIO Trade Association. 13 October 2014. Retrieved 18 October 2014.
  17. "RapidIO Standard Revision 4.0". www.rapidio.org. RapidIO Trade Association. June 2016. Retrieved 15 August 2016.
  18. "IEEE Standard IEEE Std 802.3bm™-2015 Amendment IEEE Std 802.3™-2012 as amended by IEEE Std 802.3bk™-2013 and IEEE Std 802.3bj™-2014 'Amendment 3: Physical Layer Specifications and Management Parameters for 40 Gb/s and 100 Gb/s Operation over Fiber Optic Cables'". IEEE. February 16, 2015. Retrieved 24 October 2016.
  19. "IEEE Standard IEEE Std 802.3bj-2014 Amendment to IEEE Std 802.3-2012 Physical Layer Specifications and Management Parameters for 100 Gb/s Operation Over Backplanes and Copper Cables". June 12, 2014. Retrieved 24 October 2016.
  20. "RapidIO Roadmap". www.rapidio.com. RapidIO Trade Association. 10 June 2012. p. 4. Retrieved 9 October 2014.
  21. "SpaceFibre Overview" (PDF). STAR-Dundee. Retrieved 21 October 2014.
This article is issued from Wikipedia - version of the 10/28/2016. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.