## Efficient Metastability-Containing Gray Code 2-sort

#### Moti Medina

Joint work with Christoph Lenzen

Max Planck Institute For Informatics, Saarbrucken, Germany



**ASYNC 2016** 

#### In The Beginning there was Metastability



- Physics
  - Analog world (voltage) to 0-1 → non 0-1 equilibria = <u>M</u>eta<u>s</u>tability (Ms).
  - [Marino 81] Ms cannot be <u>avoided</u>, detected, or resolved.
  - $\uparrow$  clock rate  $\rightarrow$   $\uparrow$  Pr. Metastability.
  - ↑ clock rate → ↑ #cc until metastability resolves.
  - $\uparrow$  #transistors  $\rightarrow$   $\uparrow$  Pr. Metastability.
  - Common solution: Wait!



- Crossing clock domains
  - Signal arrives out of FF critical segment → Ms.

• We want to eliminate this reason of Ms.

#### Idea: Clock Synchronization 1<sup>st</sup> shot: a naïve "solution"

- 2 clocks
- Master-Slave approach.
- Slave clock measures Master's clock.
  - Analog to Digital conversion = a TDC.
- Slave clock: Unary to Bin trans and update slave's clock.
- Problem: Marino  $\rightarrow$  Metastability.
- Conversion  $\rightarrow$  Arbitrary clock update.
- Problem: Clock synch. fails.



#### 2<sup>nd</sup> shot & Metastability-Containing

- "If you can't beat them, join them" → Contain metastability
- Unary string → analog correction of the oscillator.
  - We didn't lose any precision due to Ms
    = metastable containing (Ms-C)
- Master-slave approach → nice property: bounded phase shift → crossing clock domains doesn't incur metastability.
- Problem: the master clock fails → lost availability of a clock, i.e., not a fault tolerant system.



5-14-12

#### Our Goal



- Can we design a system which is
  - fault-tolerant, and
  - No loss of precision due to metastability?
    - i.e., Containing metastability



- Yes to Fault-Tolerance! The Lynch-Welch [88] Algorithm
- #number of faulty clocks: *f*
- #number of clocks: n = 3f + 1
- Example: f = 2, n = 7
- Algorithm's idea (animation):



Implementation:

- In each clock (repeat forever)
  - Measure phase diff. from all the other clocks using TDCs.
  - **Sort** these measurements.
  - Adjust self clock to  $\frac{diff[f+1]+diff[n-f]}{2}$



#### The System

- Each clock has a n-1 TDCs.
- Each clock has a <u>sorting network</u>.





### Our Time to Digital Converter (TDC) [Fuegger, Lenzen, Polzer 2016]

- (Ideal) TDC Specification:
  - Input: analog signals  $a(t_1)$ ,  $b(t_2)$ ,  $t_2 > t_1$
  - Output: Gray code representation of  $[t_2 t_1]$
- Corollary [Marino]: There is no ideal TDC!
- [FLP2016] : TDC s.t. output contains 1 MS bit,
  - i.e., in unary  $1^*0^* \cup 1^*M0^*$ ,
  - Mapping to Gray-code in the next slide.



• <u>Our question revisited</u>: Can we implement the LW algorithm? We cannot wait...

#### Gray code & Valid (input/output) strings

- 3-bit Reflected Binary Gray Code
- Valid (input/output) strings



- Valid Unary to Valid Gray code is 1-2-1 & one bit is MS in both.
- g[1] = M → g[2:3] is the "biggest" number in two bits Gray code

#### Our Question

- We need to sort!
- [LW88] does not solve the Ms-C issue.
- Sorting in a Ms-C manner + unary strings = exp. size circuits
  - Trivial min/max
  - Who is bigger? 100 or M00 ? 1M0 or 100?



• Challenge: Find Ms-C Polynomial size circuits.



#### Metastability (Ms) model

- Worst case metastability gate propagation.
- Logical masking:

Μ

Cost and Dealy of Combinational circ.:

- Standard definitions.
- Cost = sum of basic gates' costs
- Delay = heaviest path from an input to an output.

+ = OR $\cdot = AND$ 

#### Ms-C 2 – *sort* - Specification

- For *u*, *v* unary strings
  - $\max_{un} \{u, v\} = u + v$
  - $\min_{un} \{u, v\} = u \cdot v$
  - $\rightarrow$  Ms-C
- For *u*, *v* valid Gray code inputs
  - Let  $\hat{u} \leftarrow un(u), \hat{v} \leftarrow un(v)$
  - $\max_{rg}\{u, v\} = rg\left(\max_{un}\{\hat{u}, \hat{v}\}\right)$
  - $\min_{rg}\{u, v\} = rg\left(\min_{un}\{\hat{u}, \hat{v}\}\right)$

A MS-C Gray code 2 - sort(B)combinational circuit is defined as follows:

- Input: *B* bit valid gray code strings *g*, *h*
- Output: B bit valid gray code strings g', h'
- Functionality:
  - $g' \triangleq \max_{rg} \{g, h\}$
  - $h' \triangleq \min_{rg} \{g, h\}$

•  $\rightarrow$  Ms-C

#### Our main result



- We propose an Ms-C circuit of
- Polynomial cost  $O(B^2)$ , and
- Linear depth O(B)
- that computes the 2 sort(B) of two B-bit Gray code valid inputs.

#### Ms-C Multiplexers

- (2:1)-CMUX
  - Problem: output is Ms even if there is no ``choice''.
  - Solution: keep invariant s.t. if a = b then we ``don't care'' about s



• The recursive extension keeps the invariant.





# 2 – *sort(B)* Efficient & Ms-C Design: High Level Idea

- Recursive scheme.
- Directly operates on the Gray code representation.
- Simply branching on the first bit of the code → not Ms-C.



- Observation
  - 1<sup>st</sup> bit Ms→ stable bits are the max number.
- Observation + Ms-C Mux
  - both branching options are the same
  - $\rightarrow$  no additional Ms bits!
- Base case:  $B = 1, u, v \in \{0, 1\}$ 
  - $\max_{rg}{u, v} = u + v$
  - $\min_{rg}\{u, v\} = u \cdot v$

 $\begin{array}{ll} 00 & + = OR \\ 01 & \cdot = AND \end{array}$ 

#### 2 - sort(B) Design & Proof (by example) $\frac{11}{10}$



#### Conclusion

- Allowing for computations to take place **before** metastability is resolved.
- Analog measurement precision is kept.
- Building an asynchronous system without synchronizers.
  - By implementing the LW algorithm





#### Further research & Open questions

- More efficient 2 sort(B).
- log B delay Blog B cost!
- Design a 2 *sort*(*B*) that operates on Redundant Binary encoding .
- Super linear delay lower bound for Ms-C combinational circuits?
- Replace "Waiting" with Ms-C circuits.



