TY - JOUR
T1 - Achieving consensus in distributed software architectures for satellite missions
AU - Carvajal-Godinez, Johan
AU - Guo, Jian
AU - Gill, Eberhard
N1 - Publisher Copyright: Copyright © 2018 by the International Astronautical Federation.
PY - 2018
Y1 - 2018
N2 - Spacecraft buses using distributed software architectures have been adopted in space mission design because of their increased performance and reliability. However, to achieve reliability in highly distributed systems, fault-tolerant mechanisms must be implemented to mitigate the anomalous behavior of components and software processes. One of the most significant challenges in designing distributed software is reaching consensus among processes running in parallel. For instance, consider an attitude determination and control subsystem that is trying to estimate the satellite's current attitude state. For that purpose, it must request measurements from multiple sensors connected to the spacecraft data bus, and it needs to arrange this information in the correct chronological order for proper state estimation. A wrong data sequence can lead to an increased pointing error during satellite operations. Consensus protocols in distributed software architectures mainly rely on voting mechanisms, which work well when the process in charge of decision making (the leader) does not fail. In an execution environment with faulty processes, the system can reach decisions that do not reflect its actual status because of corrupted or missing information. This paper describes and analyzes software consensus scenarios in spacecraft with synchronous and asynchronous data buses. These scenarios include state estimation with networked components and temporal consistency of telemetry packets. The work compares the performance characteristics of different consensus strategies. It takes the Paxos algorithm family as a reference to establish an optimal configuration for achieving consensus in distributed software architectures for satellite systems. The proposed approach presents an agent-based implementation of the Paxos algorithm to reach consensus under intermittent failures, and analyzes scalability issues. The work also proposes the adoption of software design patterns that guarantee consensus on time-critical processes such as attitude determination and control. The results make it possible to define and evaluate software performance and reliability with respect to the criticality of the processes that must achieve consensus. They also facilitate fault detection and recovery capabilities by design, during the software development phase of the satellite. Finally, a set of software design rules is provided that can be used to improve the resilience of a satellite's onboard software.
KW - Consensus Algorithms
KW - JADE
KW - Multi-Agent Systems
KW - Paxos
UR - http://www.scopus.com/inward/record.url?scp=85065303639&partnerID=8YFLogxK
M3 - Conference article
AN - SCOPUS:85065303639
SN - 0074-1795
VL - 2018-October
JO - Proceedings of the International Astronautical Congress, IAC
JF - Proceedings of the International Astronautical Congress, IAC
T2 - 69th International Astronautical Congress: #InvolvingEveryone, IAC 2018
Y2 - 1 October 2018 through 5 October 2018
ER -