By Sai Shankar Narasimhan.
Recently, an increasing number of robotic applications have adopted remote assistance through teleoperation, ranging from remote manipulation for surgery to emergency takeover of autonomous vehicles. Teleoperation is even used to control food delivery robots from command centers hundreds of miles away. What has contributed to this trend?
We identify the following two crucial reasons:
Teleoperation has many potential benefits, but practical concerns remain. For example, over public communication networks, stochastic communication delays can lead to violations of key safety properties.
Therefore, in this work, we ask the following key question: How do we ensure safe networked control over wireless networks with stochastic communication delays? This blog post briefly explains our proposed solution and experimental results, including a real-world demonstration of safe networked control using F1/10th cars while transmitting sensor data through our university’s public Wi-Fi network.
To motivate the requirements of a safe networked control system, let us take a look at our proposed solution, shown in the image below.
We implemented our proposed solution on a real-world leader-follower setup. The leader is uncontrolled and can exhibit stochastic motion. The follower is controlled remotely by a teleoperator or using a DNN. The sensor observations are transmitted to the remote server or cloud through a public Wi-Fi connection (step 1). The teleoperator or the DNN generates a control command, which is transmitted back to the follower (step 2). We propose a shield that restricts or modifies an unsafe control command based on its communication delay magnitude (step 3). Finally, the follower executes the “safe” control command (step 4).
To develop a safe networked control system, as shown in the figure above, we require the following:
Typically, Markov Decision Processes (MDPs) are used to model the interaction between a robot and its environment. To extend MDPs to the networked control setting, we developed Delayed Communication Markov Decision Processes (DC-MDPs), which make no conservative assumptions about the delay transitions.
Our key intuition is to augment each state of the basic MDP (the model of the interaction in the absence of delay) with a fixed-length action buffer that holds as many actions as the current delay. This lets us represent system states with arbitrary delays. Additionally, given the delay transition probabilities, we can compute the exact transition probability between the system states at two consecutive time steps, whatever their delays. We refer readers to Section 5.A of the manuscript for further details.
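To make the state augmentation concrete, here is a minimal Python sketch. It is a toy under stated assumptions, not the construction from Section 5.A: the base MDP (`BASE_P`), the delay transition table (`DELAY_P`), the bound `MAX_DELAY`, and the rule that the delay grows by at most one step per time step are all hypothetical choices made for illustration. An augmented state is a triple of the last observed base state, an action buffer padded to a fixed length, and the current delay.

```python
from collections import defaultdict

MAX_DELAY = 2  # assumed upper bound on the delay, in time steps

# Toy base MDP (no delay): the state is a position 0..3.
# "go" advances one cell with probability 0.9; "stay" never moves.
BASE_P = {}
for s in range(4):
    nxt = min(s + 1, 3)
    BASE_P[(s, "stay")] = {s: 1.0}
    BASE_P[(s, "go")] = {nxt: 1.0} if nxt == s else {nxt: 0.9, s: 0.1}

# Assumed delay transition probabilities P(d_next | d).
DELAY_P = {
    0: {0: 0.7, 1: 0.3},
    1: {0: 0.2, 1: 0.5, 2: 0.3},
    2: {1: 0.4, 2: 0.6},
}


def push(dist, action):
    """Push a distribution over base states through one action."""
    out = defaultdict(float)
    for s, p in dist.items():
        for s_next, q in BASE_P[(s, action)].items():
            out[s_next] += p * q
    return dict(out)


def dc_successors(aug_state, action):
    """Distribution over next augmented states (state, buffer, delay)."""
    s, buf, d = aug_state
    # Actions issued since the last observation, plus the new one.
    pending = buf[:d] + (action,)
    out = defaultdict(float)
    for d_next, p_delay in DELAY_P[d].items():
        # The observed state catches up by k steps of the base MDP.
        k = d + 1 - d_next
        dist = {s: 1.0}
        for a in pending[:k]:
            dist = push(dist, a)
        # Keep the d_next unresolved actions, padded to a fixed length.
        new_buf = pending[k:] + (None,) * (MAX_DELAY - d_next)
        for s_next, p in dist.items():
            out[(s_next, new_buf, d_next)] += p_delay * p
    return dict(out)


# Observed position 0 one step ago, "go" already issued, now issue "stay".
successors = dc_successors((0, ("go", None), 1), "stay")
for state, prob in sorted(successors.items(), key=lambda kv: -kv[1]):
    print(state, round(prob, 3))
```

Each successor probability is the product of a delay transition probability and a multi-step base transition probability, which is why the delay model and the base MDP can be specified separately.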
We use Linear Temporal Logic (LTL) to formally represent our desired notion of safety for the system as safety specifications. We refer readers to Section 3 of our manuscript for a detailed discussion of LTL and safety specifications. Given an MDP and a safety specification, the preferred approach to ensure safety is shielding. For any given system state, a shield permits only those actions whose maximum safety probability is 1.0. Because every permitted action leads to a state from which safety can still be guaranteed, the system always remains in a safe state. The literature on shielding is vast and will not be covered in this blog.
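As a rough illustration of the shielding rule, the sketch below computes maximum safety probabilities on a toy car-following model and keeps only the actions that score 1.0. Everything in it is an assumption made for the example: the gap states, the three actions, their transition probabilities, and the use of a fixed number of value-iteration sweeps in place of a proper probabilistic model checker. The safety specification is the simple invariant "the gap never reaches 0".

```python
UNSAFE = {0}                 # gap 0 means a collision
STATES = [0, 1, 2, 3]        # gap between follower and leader
ACTIONS = ["brake", "cruise", "accelerate"]


def transitions(gap, action):
    """Return {next_gap: probability} for the toy model (assumed numbers)."""
    if gap in UNSAFE:
        return {gap: 1.0}    # a collision is absorbing
    if action == "brake":
        outcomes = [(min(gap + 1, 3), 1.0)]
    elif action == "cruise":
        outcomes = [(gap, 0.9), (gap - 1, 0.1)]
    else:  # accelerate
        outcomes = [(gap - 1, 0.9), (max(gap - 2, 0), 0.1)]
    dist = {}
    for nxt, p in outcomes:
        dist[nxt] = dist.get(nxt, 0.0) + p
    return dist


def max_safety_probability(sweeps=50):
    """Q[(s, a)]: best achievable probability of staying safe after taking a in s."""
    value = {s: 0.0 if s in UNSAFE else 1.0 for s in STATES}
    q = {}
    for _ in range(sweeps):
        q = {
            (s, a): sum(p * value[n] for n, p in transitions(s, a).items())
            for s in STATES
            for a in ACTIONS
        }
        value = {
            s: 0.0 if s in UNSAFE else max(q[(s, a)] for a in ACTIONS)
            for s in STATES
        }
    return q


def build_shield(q, threshold=1.0, tol=1e-9):
    """Keep, per safe state, the actions whose safety probability meets the threshold."""
    return {
        s: [a for a in ACTIONS if q[(s, a)] >= threshold - tol]
        for s in STATES
        if s not in UNSAFE
    }


Q = max_safety_probability()
SHIELD = build_shield(Q)
for gap, allowed in SHIELD.items():
    print(f"gap={gap}: {allowed}")
```

In this toy model, the shield becomes more permissive as the gap grows, since riskier actions can no longer cause a collision in one step and braking remains available afterwards.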
With the DC-MDP and a shield for the required safety specification, the naive solution is to apply shielding to the DC-MDP and ensure absolute safety for the networked control system. However, this adversely affects the system’s task performance, because ensuring absolute safety means always planning for the worst case, which is the maximum delay, and that case occurs very rarely. Hence, we propose ε-shields, which trade off safety for task efficiency. We refer the reader to Algorithm 1 in our manuscript for the definition and synthesis of ε-shields. Intuitively, ε is a value between 0 and 1, and increasing ε increases safety. An ε-shield can guarantee any desired safety probability, so the user can accept a slightly lower safety guarantee in exchange for higher task efficiency.
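The trade-off can be sketched as a threshold on per-action safety probabilities. This is only one plausible reading of the intuition above, not Algorithm 1 from the manuscript: we assume an action is permitted when its safety probability is at least ε, and we assume a fallback to the safest action when nothing clears the threshold. The table `SAFETY_PROB`, keyed by a hypothetical (gap, delay) pair, contains made-up numbers chosen purely for illustration.

```python
# Made-up safety probabilities per (gap, delay) state and action.
SAFETY_PROB = {
    (2, 0): {"brake": 1.0, "cruise": 1.0, "accelerate": 0.9},
    (2, 2): {"brake": 1.0, "cruise": 0.95, "accelerate": 0.6},
}


def epsilon_shield(safety_prob, epsilon):
    """Permit actions whose safety probability is at least epsilon (assumed rule)."""
    shield = {}
    for state, probs in safety_prob.items():
        allowed = {a for a, p in probs.items() if p >= epsilon}
        if not allowed:
            # Assumed fallback: keep the safest action so the robot can always act.
            allowed = {max(probs, key=probs.get)}
        shield[state] = allowed
    return shield


def filter_command(shield, safety_prob, state, proposed):
    """Pass a permitted command through; otherwise substitute the safest permitted one."""
    if proposed in shield[state]:
        return proposed
    return max(shield[state], key=safety_prob[state].get)


strict = epsilon_shield(SAFETY_PROB, 1.0)    # absolute safety
relaxed = epsilon_shield(SAFETY_PROB, 0.9)   # accepts a small risk

state = (2, 2)  # gap of 2 observed under the maximum delay
print(filter_command(strict, SAFETY_PROB, state, "cruise"))
print(filter_command(relaxed, SAFETY_PROB, state, "cruise"))
```

With ε = 1.0 the shield overrides "cruise" at the high-delay state, while ε = 0.9 lets it through, which is the kind of efficiency gain a relaxed threshold is meant to buy.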
We test our approach on the following two simulation environments:
Now, we will describe the two main takeaways from our experiments with these two simulation environments.
We show similar results in the real-world demonstration (see the figure below). Without shielding, we observe safety violations caused by the communication delay when transmitting sensor data over our university’s public Wi-Fi network. Additionally, note that the distance between the two cars is smaller with ε-shields (Random Delay), indicating better task efficiency.
Our work, “Safe Networked Robotics with Probabilistic Verification”, provides a novel approach to ensure safety for networked control systems. Through multiple simulation and real-world experiments, we demonstrate how our approach can gracefully trade off safety for efficiency while dealing with stochastic communication delays.