The ECP also supports the missions of other agencies such as the National Institutes of Health, the National Science Foundation, the National Oceanic and Atmospheric Administration, and the National Aeronautics and Space Administration. Such large-scale systems require interconnections between thousands of computing nodes. Traditional networks offer fixed links, where optical fibers connect electronic packet switches with high-speed point-to-point optical links. Standard topologies for interconnecting computing nodes include 3D and 4D Torus and fat-trees, where the network architecture is hierarchical: at the bottom, in the edge layer, hosts are connected to Top-of-Rack switches. Different parts of the network can be lightly or heavily loaded, so throughput bottlenecks may occur. Hence the need for reconfigurable networks that support bandwidth steering, allocating additional bandwidth to hotspots on demand.

To enable these flexible networks, we combine traditional electronic packet switches with optical switching technologies. In that way, we obtain the benefits of both fabrics to develop a reconfigurable optical network. The topology can be modified dynamically with an optical switch. Finally, routing and forwarding are managed with flow updates, so we can orchestrate the traffic to travel through the newly configured physical links. We do not use traditional routing protocols, but a Software Defined Networking control plane that allows more flexibility in the definition of routing schemes and forwarding rules. In the following subsections we briefly introduce the related work and enabling technologies relevant to the different modules of the test bed implemented in this research. Finally, we discuss the purpose and contribution of our investigation.

All-optical switches, such as optical micro-electro-mechanical systems (MEMS), Mach-Zehnder switches, microring resonators and wavelength routing, are alternatives to electronic switches.
They offer lower power consumption, scalability, fast reconfiguration, low latency and cost savings through silicon photonics integration. However, optical switching technologies have not been commercially deployed in datacenters and HPC systems due to several challenges, including the lack of optical buffers, the scalability limitations posed by physical layer technology, and the design of a scalable control plane that can orchestrate fast reconfiguration operations at large scale to follow the bursty nature of the traffic. Therefore, it is necessary to investigate hybrid switching systems that combine the benefits of traditional electronic devices and optical technologies. Our test bed combines a MEMS optical switch and SDN-enabled electronic packet switches, as described later in Chapter 3.

In the Software Defined Networking approach, the control plane is decoupled from the hardware. Only the data plane remains in the chassis, while the controller is software-based and uses Application Programming Interfaces to send instructions to the hardware on how to handle network traffic. A centralized control plane is useful for managing complex networks. Google's Software Defined Wide Area Network, B4, which connects their data centers globally, is an example of a successful deployment of a commercial Software Defined Network. B4 maximizes the average bandwidth and provides complete management across the entire network. It also enables traffic engineering, which brings further benefits. One is multi-path routing, which allocates resources based on application priority and assigns bandwidth on demand. Google reports a 3× improvement in link utilization efficiency compared to previous standard practices. On average, links run at 70% of their capacity in the long term.
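To make this decoupling concrete, the sketch below shows how a software-based controller can push a forwarding rule to a switch over OpenFlow. It uses the Ryu framework as one example of a Python SDN controller; the switch, ports and priority values are placeholders, and this is not the controller code used in our testbed.

```python
# Minimal sketch: a centralized controller installs a forwarding rule on a
# switch over OpenFlow 1.3 using the Ryu framework. Ports and priorities are
# illustrative only; this is not the controller used in our testbed.
from ryu.base import app_manager
from ryu.controller import ofp_event
from ryu.controller.handler import CONFIG_DISPATCHER, set_ev_cls
from ryu.ofproto import ofproto_v1_3


class StaticPathApp(app_manager.RyuApp):
    OFP_VERSIONS = [ofproto_v1_3.OFP_VERSION]

    @set_ev_cls(ofp_event.EventOFPSwitchFeatures, CONFIG_DISPATCHER)
    def switch_features_handler(self, ev):
        dp = ev.msg.datapath
        ofp, parser = dp.ofproto, dp.ofproto_parser

        # Forward traffic arriving on port 1 out of port 2, e.g. onto the
        # port attached to a newly provisioned optical link.
        match = parser.OFPMatch(in_port=1)
        actions = [parser.OFPActionOutput(2)]
        inst = [parser.OFPInstructionActions(ofp.OFPIT_APPLY_ACTIONS, actions)]
        dp.send_msg(parser.OFPFlowMod(datapath=dp, priority=100,
                                      match=match, instructions=inst))
```

The same mechanism, with higher-priority flow entries, is what allows traffic to be steered from one path to another during a reconfiguration.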
In addition, Google adopted SDN in their data centers when they realized that traditional decentralized routing protocols were not the solution to challenges such as creating routes through a broad, fixed and multi-path network. Management in such systems is easier to achieve if the network is modeled as a single device with multiple ports, instead of a collection of individual switches running routing protocols. LIGHTNESS is one example of a research project that explored and demonstrated the integration of an SDN control plane, virtualization of network resources, and an all-optical dynamic data center network. With a modified version of OpenFlow, a customized controller and the abstraction of the Optical Packet Switch node, which was implemented with an FPGA, they achieved optical packet switching. Furthermore, by assigning priorities to the flows of different applications, Quality of Service can be guaranteed. Real-time monitoring is another feature that was implemented, by collecting statistics from the network agents through OpenFlow. From these case studies we can conclude that SDN is key to controlling and using resources efficiently in complex networks.

TCP/IP is the name commonly used to refer to the set of network protocols behind the internet; it combines the names of the Transmission Control Protocol (TCP) and the Internet Protocol (IP). It is a practical implementation of the theoretical OSI reference model. Due to the complexity of computer networks, organizing the protocols into independent layers with specific purposes makes it easier to implement, provision and debug communication links all around the world. The OSI reference model has seven layers, while TCP/IP condenses them into fewer layers.

So far we have introduced research projects in technologies for reconfigurable optical networks, with applications in data centers and High Performance Computing, focusing specifically on the physical layer. Now we will cover the impact of link reconfiguration on the upper layers. We focus on the transport layer, in particular the TCP congestion control mechanism and the reliable data transfer between hosts when an event occurs in the optical paths between them. A retransmission timer is used to guarantee data delivery when the receiver does not confirm reception of packets. It is defined as the Retransmission Timeout, RTO, in internet standards.
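To make the behavior of this timer concrete, the sketch below reproduces the standard RTO estimator from RFC 6298, which most TCP implementations follow. The RTT samples are illustrative sub-millisecond datacenter values, and the 200 ms lower bound corresponds to the Linux default discussed below.

```python
# A worked sketch of the RTO estimator from RFC 6298. Values are illustrative;
# Linux additionally clamps the result to a 200 ms minimum (TCP_RTO_MIN), which
# is why sub-millisecond datacenter RTTs still yield a comparatively huge
# retransmission timeout.
ALPHA, BETA, K = 1 / 8, 1 / 4, 4           # RFC 6298 constants
CLOCK_G = 0.001                            # timer granularity (s), assumed 1 ms
LINUX_RTO_MIN = 0.200                      # Linux lower bound on the RTO (s)


def update_rto(srtt, rttvar, rtt_sample):
    """Fold one RTT measurement (seconds) into SRTT/RTTVAR and return the RTO."""
    if srtt is None:                       # first measurement
        srtt, rttvar = rtt_sample, rtt_sample / 2
    else:
        rttvar = (1 - BETA) * rttvar + BETA * abs(srtt - rtt_sample)
        srtt = (1 - ALPHA) * srtt + ALPHA * rtt_sample
    rto = srtt + max(CLOCK_G, K * rttvar)
    return srtt, rttvar, max(rto, LINUX_RTO_MIN)


srtt = rttvar = None
for sample in [0.0002, 0.00025, 0.0003]:   # typical sub-millisecond datacenter RTTs
    srtt, rttvar, rto = update_rto(srtt, rttvar, sample)
print(f"SRTT={srtt*1e6:.0f}us, RTO={rto*1e3:.0f}ms")   # the RTO stays pinned at 200 ms
```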
We discuss this timer in our results in Chapter 4. Network traffic in our testbed is generated with iPerf, a tool that can be configured to use TCP or UDP. The former is known for reliable communication among hosts, so we choose that protocol for our experiments (a minimal invocation is sketched below). Nevertheless, TCP was originally designed for WANs, where the Round-Trip Time (RTT) is on the order of hundreds of milliseconds due to the large geographic distances between nodes, two to three orders of magnitude higher than the RTT in datacenters. Such a large RTO in this context impacts latency and throughput. Reducing the RTO might be a solution, but the challenge is that most systems do not have high-resolution timers. Furthermore, lowering the RTO helps to react quickly to packet loss, but it can lead to spurious retransmissions. Researchers have found that optical reconfigurations trigger RTO events. The RTT in datacenters is generally on the sub-millisecond scale, while optical switching can take tens of milliseconds when using MEMS technologies. TCP waits for a certain time and then attempts to send the data again. The default RTO varies across operating systems: it is set to 200 ms in Linux and 300 ms in Windows. A new trend in transport protocols points towards programmable NICs and FPGAs, to deliver the optimized latency that state-of-the-art distributed datacenter applications require. The High Precision Congestion Control and NanoTransport projects present transport protocols and congestion control schemes in dedicated programmable hardware. This topic is out of the scope of this thesis; testing novel transport protocol schemes in reconfigurable optical networks is left as future work.

An optical reconfiguration operation consists of the following steps: decide when to make the reconfiguration, select a new topology based on specific optimization goals, and finally migrate the traffic to the new path. Service halts arise during topology migrations, in particular when a new light path is provisioned dynamically, that is, when the old link is removed and the new one is configured afterwards. To reduce the outage impact, which depends on the link setup time, a make-before-break approach is advised. If the optical channels are placed in the network prior to the reconfiguration, they will be ready to carry traffic instantly. However, this technique involves using additional resources, because more links must be operational simultaneously. Quality of Service and network performance in general are improved if bandwidth is reserved before the traffic transition, since the duration of the disturbance is minimized. Hitless reconfiguration in optical networks was defined in 1996 as the reconfiguration process where not even a single ATM cell is lost. ATM stands for Asynchronous Transfer Mode, a telecommunications standard for data transfer between user-network or network-network nodes, prior to the wide adoption of IP-based networks. Bit Error Rate (BER) and Forward Error Correction (FEC) statistics are useful metrics to monitor the quality of a link in an Optical Transport Network. Specific thresholds must be defined to minimize or prevent packet loss. As long as these metrics are within the desired range, the link is considered optimal.
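For reference, the sketch below shows one way to run the kind of TCP measurement mentioned above, assuming the iperf3 variant of iPerf. The server address is a placeholder, and the exact commands and hosts used in our experiments may differ.

```python
# Minimal sketch of generating and measuring TCP traffic with iperf3 (the
# exact iPerf version and hosts in our experiments may differ). An iperf3
# server must already be running on the destination host: `iperf3 -s`.
import json
import subprocess

SERVER = "10.0.0.2"   # hypothetical address of the receiving compute node

result = subprocess.run(
    ["iperf3", "-c", SERVER, "-t", "30", "-J"],   # 30 s TCP test, JSON output
    capture_output=True, text=True, check=True,
)
report = json.loads(result.stdout)

gbps = report["end"]["sum_received"]["bits_per_second"] / 1e9
retransmits = report["end"]["sum_sent"]["retransmits"]
print(f"throughput: {gbps:.2f} Gbps, TCP retransmissions: {retransmits}")
```

The JSON report also exposes per-interval throughput, which is convenient for spotting the throughput drop and recovery around a reconfiguration event.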
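The make-before-break ordering described above can be summarized in pseudocode. The sketch below is a simplified illustration, not our actual orchestration code; the optical_switch and sdn_controller objects and their methods are hypothetical stand-ins for the optical switch and OpenFlow agents described in Chapter 3.

```python
# Simplified, hypothetical sketch of a make-before-break reconfiguration:
# the new optical path is provisioned and verified before traffic is moved
# and the old path is released. All objects and methods are stand-ins.
def reconfigure_link(optical_switch, sdn_controller, old_path, new_path, flows):
    # 1. Provision the new light path while the old one keeps carrying traffic.
    optical_switch.connect(new_path.ingress_port, new_path.egress_port)

    # 2. Wait for the transceivers on the new path to lock and for BER/FEC
    #    counters to settle within the acceptable thresholds.
    sdn_controller.wait_until_link_up(new_path)

    # 3. Steer the affected flows onto the new path with higher-priority
    #    forwarding rules, so the update takes effect atomically per switch.
    for flow in flows:
        sdn_controller.install_route(flow, new_path, priority=200)

    # 4. Only now remove the old rules and release the old optical circuit.
    for flow in flows:
        sdn_controller.remove_route(flow, old_path)
    optical_switch.disconnect(old_path.ingress_port, old_path.egress_port)
```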
Research in hitless optical reconfiguration has demonstrated switching in less than 1 µs, without deteriorating the real-time BER, using fast tunable lasers. Other studies also implemented testbeds with commercial transceivers, using a minimum frequency step of 0.5 GHz in the channel spacing to keep the BER within the desired range. By adding SDN elements to manage reconfiguration in an AWGR-based optical network via wavelength tuning, the total switching time reached millisecond scale and packet loss was reduced by 50%. For the rest of this thesis, we consider a hitless reconfiguration to be a topology update that causes 0% end-to-end packet loss.

Reconfigurable optical networks have been studied in recent years, in both simulations and testbeds. On the experimental side, we find that SDN-controlled systems are practical for preventing the packet loss and throughput drops caused by light path updates. With SDN in hybrid networks, we can perform the reconfiguration in separate steps: remove the traffic from the ports to be updated, reconfigure the topology with the optical switch, synchronize the transceivers on the new physical routes, and finally route the traffic through the new paths. In terms of reconfiguration latency, we find systems with delays of different orders of magnitude. ProjecTor is another testbed with a similar architecture, with a reconfiguration latency of 12 µs. At the nanosecond level we find Sirius. These projects are mainly prototypes that integrate several building blocks into a reconfigurable optical network. Our testbed was built with commercial solutions such as an EdgeCore electronic packet switch, DWDM transceivers and a Polatis MEMS optical switch, along with Linux and a Python-based SDN controller. If the reader is interested in the state of the art of optical technologies for datacenters and future trends, we recommend the paper “Prospects and Challenges of Photonic Switching in Data Centers and Computing Systems”.

Simulators are tools that researchers use to demonstrate models, algorithms, devices and other elements of networking and computing systems, and to improve their performance for large-scale and high-performance applications. To validate simulation studies, it becomes essential to execute testbed experiments, which is the main focus of this thesis. We describe the testbed in terms of architecture, infrastructure and software tools. First, we discuss the general design of the system, the purpose of each block and how the blocks interact with each other. Second, we discuss how the system is implemented with servers, switches, wires, fibers, racks and other physical devices. Third, we explain the software tools we used to orchestrate our experiments, covering the virtualization, network monitoring and traffic generation tools used in our investigation.

In this chapter we start with the architecture of our testbed. Similar to Emulab, we connected physical and virtual hosts to different networks on top of a single infrastructure. The first is the management network, which lets us connect remotely to our testbed elements from anywhere. It is practical because it allows researchers from our team to handle experiments through the UC Davis VPN and an internet access point, without being wired to the hosts or restricted to the UC Davis LAN. The second is the control network, which orchestrates our experiments and collects statistics from our computing hosts and network nodes. The control and application planes are encompassed here.
The third is the experimental network data plane, a reconfigurable optical network composed of electronic and optical switches with 10 Gbps transceivers and optical fibers connecting our computing nodes, carrying data at rates of up to 10 Gbps. Figure 2.1 illustrates the general architecture of our testbed. The data plane incorporates four electronic packet switches, an optical switch and optical fibers that interconnect four computing nodes.