For example, assume we have this code:
char data[]; // data to send
int n; // number of bytes
...
send(data, n);
...
receive(data, n);
...
After weaving, it looks like:
...
data[n]=CRC(data, n);
n = n + 1;
send(data, n);
...
receive(data, n);
if (data[n-1] != CRC(data, n-1)) ERROR();
...
Note also that this helps immediate recovery from data-flow errors, not control-flow errors (to detect control-flow errors, signature monitoring can be used).
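The woven check above can be sketched in plain C as follows. This is only an illustration under stated assumptions: the notes do not say which CRC is used, so the 8-bit CRC with polynomial 0x07 is an assumed choice, and the names `crc8`, `crc_append`, and `crc_check` are hypothetical stand-ins for the woven-in code around `send` and `receive`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-8, polynomial 0x07, initial value 0.
 * The polynomial is an assumption; the notes only say "CRC". */
uint8_t crc8(const uint8_t *data, size_t n) {
    uint8_t crc = 0;
    for (size_t i = 0; i < n; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x80) {
                crc = (uint8_t)((crc << 1) ^ 0x07);
            } else {
                crc = (uint8_t)(crc << 1);
            }
        }
    }
    return crc;
}

/* Sender side (hypothetical helper): append the checksum and return the
 * new length. The buffer must have room for at least n + 1 bytes. */
size_t crc_append(uint8_t *data, size_t n) {
    assert(data != NULL);
    data[n] = crc8(data, n);
    return n + 1;
}

/* Receiver side (hypothetical helper): n is the received length including
 * the checksum byte. Returns 1 if the trailing byte matches, 0 otherwise. */
int crc_check(const uint8_t *data, size_t n) {
    assert(data != NULL);
    if (n == 0) {
        return 0; /* nothing received, not even a checksum */
    }
    return data[n - 1] == crc8(data, n - 1);
}
```

A sender would call `n = crc_append(data, n)` just before `send(data, n)`, and a receiver would call `crc_check(data, n)` right after `receive(data, n)` and raise the error if it returns 0.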
An approach to clock synchronization (assuming that all clocks run at the same speed): at the beginning of each cycle (or perhaps more frequently?), the monitor processor broadcasts its clock value to all processors. The message takes some time to travel, so it is delivered at the broadcast clock value plus that travel time.
Each processor adjusts its local clock accordingly when the message arrives. Moreover, in their heartbeat messages, all processors can send their clock values to the monitor as feedback, so that the monitor can apply any control scheme to adjust the clocks better; the monitor then sends an individually corrected value to processor 1, another to processor 2, and so on.
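A minimal sketch of the two clock calculations described above, under explicit assumptions. The exact adjustment formula is not preserved in these notes, so the sketch assumes the simplest reading: a processor sets its clock to the broadcast value plus the travel time, and the monitor estimates a processor's offset from the clock value reported in its heartbeat. The type `ticks_t` and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

typedef int64_t ticks_t; /* clock value in arbitrary ticks (assumed unit) */

/* Processor side: clock value to adopt when the broadcast is delivered.
 * Assumes the travel time is known and non-negative. */
ticks_t adjusted_clock(ticks_t monitor_clock, ticks_t travel_time) {
    assert(travel_time >= 0);
    return monitor_clock + travel_time;
}

/* Monitor side: estimated offset of one processor, given the clock value
 * it reported in a heartbeat that took travel_time to arrive.
 * A positive result means the processor's clock is behind the monitor's. */
ticks_t clock_offset(ticks_t monitor_now, ticks_t reported_clock,
                     ticks_t travel_time) {
    assert(travel_time >= 0);
    return monitor_now - (reported_clock + travel_time);
}
```

The per-processor offset is the kind of feedback a control scheme on the monitor could use to send a different corrected value to each processor.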
After the monitor has received all heartbeats (i.e. it has reset all watchdogs related to the processors), it may send acknowledgments, which also take time to transmit. There are two kinds of ACKs: the monitor can request a rollback to the previous state, or it can simply say okay, perhaps including information for clock synchronization. Therefore, the recovery time after a failure includes the time to collect the heartbeats and the time to transmit the acknowledgment. This is the cost of recovery, and our real-time constraints should tolerate it.
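The two kinds of ACK can be sketched as a small decision function on the monitor. The policy shown here (request a rollback if any heartbeat was missed before its watchdog expired) is an assumption made for illustration; the notes only say that the monitor can request a rollback or say okay. The names `ack_kind` and `choose_ack` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* The two kinds of acknowledgment the monitor can send. */
enum ack_kind { ACK_OK, ACK_ROLLBACK };

/* heartbeat_seen[i] is nonzero if processor i's heartbeat arrived before
 * its watchdog expired. Assumed policy: any missing heartbeat triggers a
 * rollback request; otherwise the monitor acknowledges with "okay". */
enum ack_kind choose_ack(const int *heartbeat_seen, size_t n_procs) {
    assert(heartbeat_seen != NULL);
    for (size_t i = 0; i < n_procs; i++) {
        if (!heartbeat_seen[i]) {
            return ACK_ROLLBACK;
        }
    }
    return ACK_OK;
}
```

An `ACK_OK` message is also a natural place to piggyback the clock synchronization information mentioned above.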