# Voltage Scaling on C-Elements: A Speed, Power and Energy Efficiency Analysis

Matheus Trevisan Moreira, Ney Laert Vilar Calazans

Pontifical Catholic University of Rio Grande do Sul (PUCRS)

Hardware Design Support Group (GAPH) - Faculty of Computer Science (FACIN) – Porto Alegre – Brazil

matheus.moreira@acad.pucrs.br, ney.calazans@pucrs.br

*Abstract* — This work reports an evaluation of the speed, energy consumption, leakage power, and silicon area tradeoffs of three different transistor topologies for C-elements, basic devices for building asynchronous circuits. The evaluation considers the devices operating under supply voltages that vary from the nominal 1 V down to 0.05 V. Analog simulations provide precise measurements, and the obtained results identify the lowest voltage at which each C-element operates correctly. Results suggest that operating at near-threshold voltages provides the best speed-energy and speed-leakage efficiencies. They also point to the van Berkel topology as the most suitable C-element implementation for low voltage operation, as it presents lower power and energy figures as well as higher speed, regardless of the supply voltage.

Keywords— voltage scaling, C-element, low voltage, low power design, leakage power reduction.

# I. INTRODUCTION

The growing demand for battery-powered mobile electronics requires improvements in techniques for low power design. In this context, circuits that operate in near-threshold or even subthreshold regimes have gained the attention of the low power VLSI research community. In fact, according to Radfar et al. [1] and Hanson et al. [2], voltage scaling is the most effective solution to cope with increasing power constraints. However, as the same authors state, the major problem for such circuits is their vulnerability to process, voltage and temperature (PVT) variations. This makes asynchronous design appealing. Unlike synchronous circuits, asynchronous circuits may relax timing assumptions, because they employ no clock signal for controlling event sequencing in the circuit. Instead, in circuits designed with the asynchronous paradigm, event sequencing and control rely on local handshake protocols [4]. This characteristic enables designers to ignore wire and gate delays with no interference in functionality, which makes asynchronous circuits much more robust to PVT variations than synchronous circuits [3]-[5].

The C-element [3] is a basic sequential device that enables different asynchronous templates to be implemented in silicon. Its primary use is to implement event synchronization, and both sequential and combinational asynchronous logic can employ it. In fact, in the authors' design experience, C-elements can account for up to 60% of the total area of typical asynchronous modules [6]. This paper analyzes the electrical behavior of three different CMOS transistor topologies for C-elements when operating at low supply voltage levels. The evaluated implementations are those most common in the literature, the Martin, the Sutherland and the van Berkel C-elements, and are available at the layout level in an in-house 65nm standard-cell library called ASCEnD [7]. The obtained results identify the lowest voltage applicable to C-elements while still maintaining correct functional behavior for a wide range of temperatures. Propagation delay, energy per transition and leakage power measurements indicate the supply voltage that provides the best speed, energy and leakage power tradeoffs. The proposed evaluation also considers silicon area, helping to tune voltage scaling on C-elements in asynchronous circuits, which furthers low power design space exploration.

The rest of this work comprises four sections. Section II presents basic concepts on asynchronous design and the addressed component: the C-element gate. Section III explores the conducted experiments and shows obtained results. Section IV discusses the results and related work, while Section V draws some conclusions and sets directions for future work.

# II. THE C-ELEMENT

Most of the asynchronous design techniques proposed to date require components other than the ordinary logic gates and flip-flops available in current standard cell sets [3]-[5]. These include, e.g., weak-conditioned half-buffers, event forks, and join and merge elements (discussed in detail in [5]). Although most of these may be built from logic gates, this is inefficient and often relies on timing assumptions that are not practical for asynchronous design. A fundamental device that enables building such elements more effectively is the C-element [3]-[5]. The importance of C-elements is that they can enable asynchronous communication. Figure 1(a) depicts the truth table and Figure 1(b) shows a transition diagram for an ordinary 2-input C-element. As Figure 1 shows, its output only switches when all inputs have the same logical value. In other words, when inputs A and B are at logic '0', the output Q goes to logic '0', and when the inputs are at logic '1', Q goes to logic '1'. However, when the inputs differ, the output keeps its previous logic value. The asynchronous state transition diagram of Figure 1(b) represents all valid transitions of the C-element and has vertices containing the values of inputs and output in the order ABQ<sub>i</sub>. It is possible to build and use several alternative similar behaviors, e.g. by individually negating the inputs or the output, increasing the number of inputs, or associating differentiated logic behavior to one or more inputs. This last characteristic produces the so-called asymmetric C-elements, which are discussed, for example, in [3] and [4]. In the rest of this paper, the discussion restricts attention to CMOS implementations of the Figure 1 C-element, as it is the most basic and most used device. Most of the discussion can be extended in a straightforward way to any other particular C-element gate type.
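The behavior described above can be captured in a few lines of Python. This is only a behavioral sketch of the 2-input C-element truth table, not a model of any of the transistor topologies; the function names `c_element` and `run` are chosen here for illustration.

```python
def c_element(a: int, b: int, q_prev: int) -> int:
    """Next output of a 2-input C-element.

    The output follows the inputs when they agree and
    holds its previous value when they differ.
    """
    if a == b:
        return a
    return q_prev


def run(input_pairs, q0=0):
    """Apply a sequence of (A, B) input pairs and collect the outputs."""
    q = q0
    outputs = []
    for a, b in input_pairs:
        q = c_element(a, b, q)
        outputs.append(q)
    return outputs
```

For instance, `run([(0, 0), (0, 1), (1, 1), (1, 0), (0, 0)])` walks through a full up and down cycle of the output, with the two mixed input pairs holding the previous value.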



Figure 1 – Simple 2-input C-element specification: (a) truth table and (b) asynchronous state transition diagram using the order ABQ<sub>i</sub>.

There are many different ways to implement C-elements in CMOS technologies, as discussed in many works such as those described in references [8] to [20]. However, three basic transistor topologies stand as the most accepted and employed in practical circuits: Martin [21], Sutherland [22] and van Berkel [23] C-elements. Therefore, this work approaches these three C-element topologies to evaluate voltage scaling effects over them. Figure 2 shows the associated symbol for the C-element and the transistor level schematic of the three implementations: in Figure 2(a) Martin's, in Figure 2(b) Sutherland's and in Figure 2(c) van Berkel's.



Figure 2 – Three alternative CMOS transistor topologies for C-elements: (a) Martin's, (b) Sutherland's and (c) van Berkel's.

These three implementations are available with different driving strengths (capability of charging/discharging output loads) in an in-house standard-cell library, called ASCEnD, implemented in the 65nm CMOS STMicroelectronics (STM) technology through a specific design flow [7]-[9]. The C-elements of this library are all designed to the layout level and come with timing, power and functional models to support automated design and analysis of asynchronous integrated circuits. Figure 3, for instance, shows the layout of a small drive (X2) version of the Martin, Sutherland and van Berkel C-elements. Also, RC extracted views are available for different fabrication processes, based on the parasitics extracted from the layouts. All the experiments reported in this paper are based on RC extracted views for a typical process.



# III. EXPERIMENTS

### A. Minimum Operating Voltage

The first experiment detected the minimum voltage that can be applied to each C-element without interfering with its correct behavior. The experiment investigated scenarios with varying temperatures and a fixed fan-out of four (FO4) output load. Minimum voltages were estimated by simulating all transition arcs of each C-element (as shown in Figure 1(b)) for each temperature/voltage scenario. When at least one arc does not generate the correct output, or a static state is not able to maintain correct functionality, the scenario is defined as not functional. Also, generated signals must have voltages in well-defined regions, for logic '1' (from 90% to 100% of the power supply) or for logic '0' (from 0% to 10% of the power supply). If a signal presents a voltage level in the undefined region (from 10% to 90%), the scenario is also defined as not functional. In summary, the minimum voltage is defined as the lowest voltage at which the C-elements can operate without jeopardizing their correct logical/electrical behavior. The obtained results are summarized in Figure 4, where six drives are analyzed.
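The pass/fail criterion above can be sketched as follows. The 10%/90% bands come from the text; everything else is an assumption made for illustration: `is_functional` is a hypothetical stand-in for the analog simulation of all arcs at one supply voltage, and the downward scan assumes that a scenario which fails at some voltage also fails at every lower one.

```python
from typing import Callable, Iterable, Optional, Sequence


def logic_level(v: float, vdd: float) -> Optional[int]:
    """Classify a settled voltage using the 10%/90% bands from the text."""
    if v >= 0.9 * vdd:
        return 1
    if v <= 0.1 * vdd:
        return 0
    return None  # undefined region, the scenario counts as not functional


def scenario_is_functional(settled: Sequence[float],
                           expected: Sequence[int],
                           vdd: float) -> bool:
    """True when every settled output matches its expected logic value."""
    return all(logic_level(v, vdd) == e for v, e in zip(settled, expected))


def minimum_voltage(supplies: Iterable[float],
                    is_functional: Callable[[float], bool]) -> Optional[float]:
    """Scan the sweep from high to low and return the last passing supply.

    is_functional is a hypothetical callback standing in for the simulation.
    """
    lowest = None
    for vdd in sorted(supplies, reverse=True):
        if not is_functional(vdd):
            break
        lowest = vdd
    return lowest
```

A real flow would fill `settled` from simulator waveforms for each transition arc and static state; here it is left to the caller.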

(a) Martin's (values in V)

| Drive $\downarrow$ / Temp. $\rightarrow$ | 125°C | 100°C | 75°C | 50°C | 25°C | 0°C  | -25°C | -50°C |
|------------------------------------------|-------|-------|------|------|------|------|-------|-------|
| X2                                       | 0.15  | 0.15  | 0.15 | 0.2  | 0.5  | 0.6  | 0.65  | 0.65  |
| X4                                       | 0.15  | 0.15  | 0.15 | 0.2  | 0.5  | 0.6  | 0.65  | 0.7   |
| X7                                       | 0.15  | 0.15  | 0.15 | 0.2  | 0.45 | 0.55 | 0.6   | 0.65  |
| X9                                       | 0.15  | 0.15  | 0.15 | 0.2  | 0.2  | 0.35 | 0.5   | 0.6   |
| X13                                      | 0.15  | 0.15  | 0.15 | 0.2  | 0.2  | 0.25 | 0.45  | 0.5   |

(b) Sutherland's (values in V)

| Drive $\downarrow$ / Temp. $\rightarrow$ | 125°C | 100°C | 75°C | 50°C | 25°C | 0°C  | -25°C | -50°C |
|------------------------------------------|-------|-------|------|------|------|------|-------|-------|
| X2                                       | 0.15  | 0.15  | 0.15 | 0.15 | 0.2  | 0.2  | 0.25  | 0.25  |
| X4                                       | 0.15  | 0.15  | 0.15 | 0.15 | 0.2  | 0.2  | 0.25  | 0.25  |
| X7                                       | 0.15  | 0.15  | 0.15 | 0.15 | 0.2  | 0.2  | 0.25  | 0.25  |
| X9                                       | 0.15  | 0.15  | 0.15 | 0.15 | 0.2  | 0.2  | 0.25  | 0.25  |
| X13                                      | 0.15  | 0.15  | 0.15 | 0.15 | 0.2  | 0.2  | 0.25  | 0.25  |

(c) van Berkel's (values in V)

| Drive $\downarrow$ / Temp. $\rightarrow$ | 125°C | 100°C | 75°C | 50°C | 25°C | 0°C  | -25°C | -50°C |
|------------------------------------------|-------|-------|------|------|------|------|-------|-------|
| X2                                       | 0.15  | 0.15  | 0.15 | 0.15 | 0.2  | 0.2  | 0.25  | 0.25  |
| X4                                       | 0.15  | 0.15  | 0.15 | 0.15 | 0.2  | 0.2  | 0.25  | 0.25  |
| X7                                       | 0.15  | 0.15  | 0.15 | 0.15 | 0.2  | 0.2  | 0.25  | 0.25  |
| X9                                       | 0.15  | 0.15  | 0.15 | 0.15 | 0.2  | 0.2  | 0.25  | 0.25  |
| X13                                      | 0.2   | 0.15  | 0.15 | 0.15 | 0.2  | 0.2  | 0.25  | 0.25  |

Figure 4 – Minimum voltage for maintaining correct functionality of the three C-elements: (a) Martin's, (b) Sutherland's and (c) van Berkel's.

Clearly, the higher the temperature, the lower the minimum operating voltage. Results suggest that the Sutherland and the van Berkel C-elements are typically preferable for operating with a low supply voltage, as they tolerate lower voltages than Martin's. These results can be explained by analyzing the transistor arrangement of each C-element implementation. Recalling Figure 2(a), in the Martin C-element there is a conflict-solving situation for every output transition. For instance, suppose that inputs A and B are at logic '0'. In this case, transistors P0 and P1 are conducting and transistors N0 and N1 are turned off, generating a direct path that connects internal node nd0 and Vdd. Thus, the output inverter (P2 and N2) is writing logic '0' to output Q and the feedback inverter (P3 and N3) is maintaining the output logic value stable, writing logic '1' to node *nd0* through the direct path to Vdd created by P3. Now, assume that input B switches to logic '1'. At this point, there is no connection from the internal node nd0 to Vdd or Gnd, because P1 and N1 are both cut off. Then, the logic value is kept by the inverter loop formed by P2-N2 and P3-N3. Next, assume that input A also switches to logic '1', making N1 conduct and creating a direct path from node *nd0* to *Gnd*. However, there is also a direct path from node *nd0* to *Vdd* that is still active (through *P3*). There is thus an instantaneous short circuit, caused by the path from Vdd to Gnd through conducting transistors N1, N0 and P3. The C-element operates correctly when the resistance of the path composed by transistors N0 and N1 is smaller than that of the path crossing P3. Similarly, to ensure correct behavior when the output switches to logic '0', the path formed by P0 and P1 must have a smaller resistance than the path crossing N3.
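The conflict condition can be illustrated with a first-order voltage divider. This is a deliberately crude sketch: it treats the keeper transistor and the driving stack as constant resistances, which real transistors are not, and it assumes that the output inverter switches when *nd0* crosses half of the supply. The function names and the resistance values used with them are hypothetical, not taken from the ASCEnD cells.

```python
def nd0_during_conflict(vdd: float, r_keeper_up: float,
                        r_stack_down: float) -> float:
    """Voltage at nd0 while the keeper (P3) pulls up and N0-N1 pull down.

    Simplifying assumption: both paths behave as fixed resistors,
    so nd0 sits at the midpoint of a resistive divider.
    """
    return vdd * r_stack_down / (r_keeper_up + r_stack_down)


def stack_wins(vdd: float, r_keeper_up: float, r_stack_down: float,
               threshold_fraction: float = 0.5) -> bool:
    """True when nd0 falls below the assumed inverter switching point."""
    v_nd0 = nd0_during_conflict(vdd, r_keeper_up, r_stack_down)
    return v_nd0 < threshold_fraction * vdd
```

With the default threshold at half the supply, `stack_wins` reduces to the condition stated in the text: the N0-N1 path must be less resistive than the P3 path. In this simplified model the supply voltage only matters through its effect on the two resistances.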

Typically, transistors P3 and N3 are designed with minimum size, to reduce their interference in the functionality of the C-element, while transistors P0, P1, N0 and N1 are larger, to drive the output inverter. The higher the driving strength, the larger these transistors need to be. The transistors of the Martin C-elements of ASCEnD were designed to guarantee correct behavior at typical voltages. However, at low voltages their behavior is compromised, especially for lower drive implementations. This is because the lower the voltage, the faster the series transistors P0 and P1 or N0 and N1 saturate, which generates a path more resistive than the one composed by P3 or N3. Additionally, the lower the drive, the smaller transistors P0, P1, N0 and N1 are, which worsens the situation. For instance, at typical temperature conditions (25°C), the minimum voltage for X2 and X4 Martin C-elements is 0.5 V, for the X7 it is 0.45 V and for the X9 and X13 it is 0.2 V.

The Sutherland and van Berkel C-elements are not susceptible to the race condition arising in the Martin C-element. This is because when these C-elements are switching their respective outputs, the feedback inverter is cut off. Recall Figure 2(b). For the Sutherland topology, when the output switches to logic '0', transistors P0, P1, P3 and P4 are all turned off, preventing any connection of internal node nd0 to Vdd. Similarly, when the output switches to logic '1', transistors N0, N1, N3 and N4 are turned off, preventing node *nd0* from having a direct path to *Gnd*. As Figure 2(c) shows, the same occurs in the van Berkel C-element with transistors *P0*, *P1*, *P2*, *P3* and *P5* and *N0*, *N1*, *N2*, *N3* and *N5*. In fact, Figure 4 shows that the C-element drive does not have the same effect in Sutherland's and van Berkel's as in Martin's. Thus, the former are better for semi-custom low voltage design.
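The drive dependence noted above can be checked directly against the Figure 4 values. The sketch below transcribes the three tables and computes, for a given temperature, how much the minimum voltage varies across drives. The helper names (`min_voltage`, `worst_case`, `drive_spread`) are introduced here for illustration only.

```python
TEMPS_C = [125, 100, 75, 50, 25, 0, -25, -50]
DRIVES = ["X2", "X4", "X7", "X9", "X13"]

# Minimum operating voltages in volts, transcribed from Figure 4(a).
MARTIN = {
    "X2":  [0.15, 0.15, 0.15, 0.2, 0.5,  0.6,  0.65, 0.65],
    "X4":  [0.15, 0.15, 0.15, 0.2, 0.5,  0.6,  0.65, 0.7],
    "X7":  [0.15, 0.15, 0.15, 0.2, 0.45, 0.55, 0.6,  0.65],
    "X9":  [0.15, 0.15, 0.15, 0.2, 0.2,  0.35, 0.5,  0.6],
    "X13": [0.15, 0.15, 0.15, 0.2, 0.2,  0.25, 0.45, 0.5],
}

# Figure 4(b) and 4(c) share one row for almost every drive.
_COMMON_ROW = [0.15, 0.15, 0.15, 0.15, 0.2, 0.2, 0.25, 0.25]
SUTHERLAND = {d: list(_COMMON_ROW) for d in DRIVES}
VAN_BERKEL = {d: list(_COMMON_ROW) for d in DRIVES}
VAN_BERKEL["X13"][0] = 0.2  # the 125°C entry differs in Figure 4(c)


def min_voltage(table, drive, temp_c):
    """Look up one Figure 4 entry."""
    return table[drive][TEMPS_C.index(temp_c)]


def worst_case(table, drive):
    """Highest minimum voltage over the whole temperature range."""
    return max(table[drive])


def drive_spread(table, temp_c):
    """Difference between the largest and smallest entry across drives."""
    column = [min_voltage(table, d, temp_c) for d in DRIVES]
    return max(column) - min(column)
```

At 25°C the spread across drives is about 0.3 V for Martin's and zero for Sutherland's and van Berkel's, which is the contrast the text describes.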

### B. Energy, Leakage and Speed

Another experiment measured the energy consumption and the propagation delay for each transition arc of the C-elements, for the same scenarios of Section III.A. Also, leakage power was measured for all static states in these scenarios.

Figure 5 shows the measured energy per transition (EPT) for each drive of each C-element implementation, varying the supply voltage and keeping the temperature fixed (25°C). Indeed, all results presented in the remainder of this Section assume a temperature of 25°C. Other temperatures do not change the results qualitatively, only quantitatively, and are therefore omitted. The EPT is the average of the energy consumed by all arcs of each implementation. For the Martin C-element, in drives X2, X4 and X7, the lowest operational voltage presents very high EPT. This is due to the conflict condition. Although the resistance of the paths is balanced well enough to provide correct functionality, the conflict persists for a relatively long period, which leads to excessive energy consumption. Fine-grained optimizations in transistor dimensions could improve the obtained results. However, this is not attractive for semi-custom approaches. By analyzing the charts of Figure 5, it is clear that Sutherland and van Berkel C-elements typically present lower EPT than Martin's for the same drive in all cases, by roughly 20%. This is because these C-elements employ a mechanism for cutting the connection of the internal node *nd0* with *Vdd* or *Gnd* during output transition arcs, as explained in Section III.A. Also, van Berkel's presents EPT values slightly lower than Sutherland's. In this way, the obtained results suggest that van Berkel C-elements can lead to better energy efficiency, regardless of the operating voltage.

Similarly, the measured average leakage power for all static states appears in Figure 6, for each C-element. As the charts show, Martin's implementation is the one that presents the largest leakage power values, while Sutherland and van Berkel C-elements present equivalent results. These results reflect the size of the transistors in each implementation. Because of the conflict during transition arcs, the Martin C-element requires larger transistors, which leads to its excessive leakage power.



Figure 5 – Average EPT for varying voltage supplies for each drive of the three C-elements: (a) Martin's, (b) Sutherland's and (c) van Berkel's.

The average number of giga transitions per second (GTPS) serves here to define the relative speed of the C-elements. These values depend on the average propagation delay of all transition arcs of each C-element. Figure 7 presents the results obtained for the three topologies. As the charts show, the measured GTPS for the Martin and the Sutherland C-elements are similar, while for van Berkel's it is roughly 20% larger. This is due to the arrangement of transistors in the latter. When a van Berkel C-element switches the output, two paths connect the internal node *nd0* to *Vdd* or *Gnd* in parallel (*P0*, *P1*, *P2* and *P3* or *N0*, *N1*, *N2* and *N3*), see Figure 2(c). In the Martin and Sutherland topologies, this connection occurs through a single path (*P0* and *P1* or *N0* and *N1*), see Figure 2(a) and Figure 2(b).
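The two averaged metrics used in this section can be sketched as below. The text states that EPT is the average energy over all arcs; for GTPS it only states a dependence on the average propagation delay, so taking GTPS as the reciprocal of the average delay in nanoseconds is an assumption made here. Function and parameter names are illustrative.

```python
from statistics import mean


def average_ept(arc_energies):
    """Energy per transition: mean energy over all transition arcs."""
    return mean(arc_energies)


def gtps_from_delays(arc_delays_ns):
    """Giga transitions per second from per-arc delays in nanoseconds.

    Assumption: GTPS is the reciprocal of the average arc delay,
    so an average delay of 1 ns corresponds to 1 GTPS.
    """
    return 1.0 / mean(arc_delays_ns)
```

Under this reading, halving the average delay doubles the GTPS figure.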

### C. Speed-Energy, Speed-Leakage and Speed-Area Efficiency

Another perspective of the obtained results shows a fairer comparison of C-elements. Three cost-benefit functions were defined to evaluate speed, leakage, energy and area tradeoffs: speed-energy, speed-leakage and speed-area.
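Written compactly, the three functions defined in the paragraphs that follow are simple ratios. The symbols $\eta_{SE}$, $\eta_{SL}$ and $\eta_{SA}$ are notation introduced here for convenience and do not appear elsewhere in the paper:

$$
\eta_{SE} = \frac{\mathrm{GTPS}}{\mathrm{EPT}}, \qquad
\eta_{SL} = \frac{\mathrm{GTPS}}{\mathrm{LKP}}, \qquad
\eta_{SA} = \frac{\mathrm{GTPS}}{\mathrm{Area}}
$$

In each case a larger value is better: more transitions per second for a given energy per transition, leakage power or silicon area.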

The ratio between the measured GTPS and EPT defines the speed-energy efficiency function. Using this, it is possible to evaluate the speed of the C-elements without overlooking the associated energy consumption. Figure 8 shows the speed-energy efficiency values, in GTPS/EPT, measured for all C-elements. As the charts show, the van Berkel C-element is the one that presents the highest GTPS/EPT values, followed by Sutherland's. The Martin topology presented the worst speed-energy cost-benefit. The van Berkel topology achieves improvements of roughly 82% in the best case, 46% on average and 11% in the worst case when compared to Sutherland's. In comparison to Martin's, these values are 700%, 240% and 75%, respectively.
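A comparison of this kind can be sketched as follows. The ratio itself is taken from the text; how the best, average and worst percentages were obtained is not spelled out, so computing a relative gain per operating point and then summarizing is an assumption. All names are illustrative.

```python
from statistics import mean


def speed_energy_efficiency(gtps, ept):
    """Speed-energy efficiency as the ratio GTPS/EPT."""
    return gtps / ept


def relative_gain(candidate, baseline):
    """Improvement of candidate over baseline, in percent."""
    return 100.0 * (candidate - baseline) / baseline


def summarize_gains(candidate_eff, baseline_eff):
    """Best, average and worst gain over paired operating points.

    Assumption: both sequences list efficiencies for the same
    drive/voltage points, in the same order.
    """
    gains = [relative_gain(c, b) for c, b in zip(candidate_eff, baseline_eff)]
    return max(gains), mean(gains), min(gains)
```

Feeding `summarize_gains` the per-point efficiencies of two topologies yields a triple in the same best/average/worst form used in the text.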


Also, Martin C-elements with lower driving strengths (X2-X7) reach the measured optimum GTPS/EPT when operating at roughly 0.85 V. For higher driving strengths (X9 and X13), the optimum occurs at roughly 0.75 V. In contrast, Sutherland and van Berkel C-elements reach optimum speed-energy efficiency when supplied with 0.55 V, for all driving strengths. Moreover, as the charts in Figure 8 show, the lower the drive of the C-element, the better its speed-energy efficiency. In this way, results suggest that energy optimizations can be achieved by employing low drive C-elements whenever feasible.

The ratio between the measured GTPS and leakage power (LKP) defines the speed-leakage efficiency function. With this function it is possible to evaluate the speed of C-elements without overlooking the associated leakage power. Figure 9 shows the speed-leakage efficiency values, in GTPS/LKP, measured for all C-elements considered here. As the charts show, the van Berkel C-element is again the best, since it presents the highest GTPS/LKP, followed by Sutherland's. The Martin implementation presents the worst speed-leakage cost-benefit. Van Berkel's displays improvements of roughly 51% in the best case, 32% on average and 15% in the worst case when compared to Sutherland's. In comparison to Martin's, these values are 256%, 92% and 28%, respectively.

As Figure 9(a) shows, optimum efficiency for the Martin C-element can be obtained at 0.7 V for lower driving strengths and at 0.6 V for higher driving strengths. Also, as Figure 9(b) and Figure 9(c) show, for Sutherland and van Berkel, optimum efficiency is obtained at 0.55 V. In addition, similarly to the speed-energy efficiency, the lower the drive of the C-element, the better its speed-leakage efficiency, suggesting that improvements in the static power of circuits can be obtained by employing low drive C-elements.
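Locating such an optimum amounts to picking the supply voltage with the highest ratio in a sweep. The sketch below assumes the sweep is available as a mapping from voltage to a (GTPS, leakage power) pair; the function name `optimum_voltage` and this data layout are hypothetical.

```python
def optimum_voltage(sweep):
    """Supply voltage with the highest GTPS/LKP ratio.

    sweep maps each supply voltage to a (gtps, leakage_power) pair.
    Ties resolve to the first maximal entry in iteration order.
    """
    return max(sweep, key=lambda v: sweep[v][0] / sweep[v][1])
```

As a usage example with placeholder numbers that are not measurements, `optimum_voltage({0.4: (1.0, 2.0), 0.6: (3.0, 3.0), 0.8: (4.0, 8.0)})` selects 0.6, since the ratios are 0.5, 1.0 and 0.5.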

The speed-area efficiency function allows evaluating the obtained speed for each implementation without ignoring silicon area. The ratio between GTPS and the total silicon area of each C-element defines this last evaluated function. Figure 10 shows the area of each implementation, according to the information available in the ASCEnD library. As the chart shows, the higher the drive, the larger the required area. Figure 11 shows the obtained speed-area results. As the charts show, small drive Martin C-elements are the ones that present the best GTPS/Area values. This is expected, given their low area, as Figure 10 makes clear. However, although the van Berkel C-element requires more silicon area than the Sutherland C-element in all cases, it still presents the better speed-area efficiency. This is due to the higher GTPS provided by van Berkel's, as Figure 7 shows: the area reduction that the Sutherland topology provides is not as substantial as the GTPS improvement provided by van Berkel's.

# IV. RELATED WORK AND RESULTS DISCUSSION

Several works available in the literature explored the tradeoffs of the addressed C-elements. In [13] and [15], Shams et al. present such a comparison and propose delay and energy consumption models. Elissati et al. conducted a similar study [19], where a self-timed ring serves as a circuit to compare C-element implementations. However, the experiments in these works assume only nominal supply voltage operation.



Yancey and Smith in [18] propose a new C-element implementation, the differential C-element. These authors present some simulation results, suggesting that the implementation is more tolerant to low voltages than the classic Martin, Sutherland and van Berkel C-elements. However, results for voltages other than nominal are just a few and the work does not evaluate speed, energy, leakage power, or area tradeoffs.

The drawback of most previous C-element comparison works is that none of them examined the tradeoffs and effects of voltage scaling on C-element implementations in detail. The contribution of this work is a comprehensive evaluation of the speed, energy, leakage power and area impact of three of the most employed C-elements in practical asynchronous circuits, under varying supply voltage. Indeed, it started as an extension of a previous work [6] that accounted only for nominal supply voltage conditions and did not provide data to evaluate voltage scaling. As Section III shows, the main suggestion here is that the van Berkel topology is the most adequate one for low voltage applications. Although it has slightly worse speed-area efficiency (yet, note that for high drives, it is actually better), it provides the highest speed-energy and speed-leakage efficiency in all cases. Thus, its use for low power applications using voltage scaling techniques is strongly recommended.

In addition, results suggest that the best speed-energy and speed-leakage efficiencies are obtained when operating at voltages near the threshold. These results help designers using C-elements, as they enable another operation mode: best cost-benefit considering speed, energy and leakage. This can be used for coping with many contemporary problems such as battery-based systems, low power budgets and green computing challenges. Finally, as Figure 8 and Figure 9 show, further optimizations can be achieved by employing low drive C-elements, enabling a better design space exploration for low power applications. This occurs because these C-elements present better efficiency than higher drive ones.

# V. CONCLUSIONS

This paper provides a detailed evaluation of the electrical behavior of three of the most common C-element topologies, Martin's, Sutherland's and van Berkel's, for varying voltage supplies. Results suggest that the evaluated C-elements present the best speed, energy and leakage cost-benefits when operating near the threshold voltage. Also, measurements indicate that the most efficient topology for low voltage applications is van Berkel's, as it presents the overall best speed-energy and speed-leakage efficiency with acceptable losses in speed-area efficiency. In this way, we advise designers of low voltage asynchronous circuits to choose the van Berkel topology in general.

Future work includes the analysis of other devices that can be employed in asynchronous circuits' design, such as null convention logic gates [3], differential C-elements, and precharged half-buffers [3] under varying voltage supplies. An evaluation of handshake components [3] under varying voltage supply is under way. These employ C-elements and are essential for QDI design. The goal is to evaluate circuit and system level effects of voltage scaling in QDI circuits. Finally, an evaluation of voltage scaling in C-elements with varying threshold voltages is another relevant future work, since this can lead to further optimization in asynchronous circuit design.

# ACKNOWLEDGEMENTS

This work was partially supported by the CAPES-PROSUP (under grant 11/0455-5) and FAPERGS (under grant 11/1445-0). Authors acknowledge the support of CNPq under grants 310864/2011-9 (N.Calazans) and 142079/2013-8 (M.Moreira).

# REFERENCES

- [1] M. Radfar et al. "Recent Subthreshold Design Techniques." Active and Passive Electronic Components, 2012, 11p.
- [2] S. Hanson et al. "Ultralow-voltage, minimum-energy CMOS." IBM Journal of Research and Development, 50(4-5), 2006, pp. 469-490.
- [3] P. A. Beerel et al. "A Designer's Guide to Asynchronous VLSI." Cambridge University Press, 2010, 337 p.
- [4] J. Sparsø and S. Furber. "Principles of Asynchronous Circuit Design: A Systems Perspective." Kluwer Academic Publishers, 2001, 360 p.
- [5] A. J. Martin and M. Nyström. "Asynchronous Techniques for System-on-Chip Design." Proceedings of the IEEE, 94(6), Jun. 2006, pp. 1089-1120.
- [6] M. T. Moreira, et al. "Impact of C-elements in Asynchronous Circuits." In: ISQED, 2012, pp. 438-444.
- [7] M. T. Moreira and Ney L. V. Calazans. "Design of Standard-Cell Libraries for Asynchronous Circuits with the ASCEnD Flow." In: ISVLSI, 2013.
- [8] M. T. Moreira et al. "A 65nm Standard Cell Set and Flow Dedicated to Automated Asynchronous Circuits Design." In: SoCC, 2011, pp. 99-104.
- [9] M. T. Moreira et al. "Adapting a C-Element Design Flow for Low Power." In: ICECS, 2011, pp. 45-48.
- [10] T.-Y. Wuu and S. B. K. Vrudhula. "A design of a fast and area efficient multi-input Muller C-element." IEEE Transactions on Very Large Scale Integration Systems, 1(2), 1993, pp. 215-219.
- [11] S. L. Lu. "Improved design of CMOS multiple-input Muller-C-elements." Electronics Letters, 29(19), 1993, pp. 1680-1682.
- [12] A. Kondratyev et al. "Basic Gate Implementation of Speed-Independent Circuits." In: 31st DAC, 1994, pp. 56-62.
- [13] M. Shams et al. "A comparison of CMOS implementations of an asynchronous circuits primitive: the C-element." In: ISLPED, 1996, pp. 93-96.
- [14] M. Shams et al. "Optimizing CMOS implementations of the C-element." In: ICCD, 1997, pp. 700-705.
- [15] M. Shams et al. "Modeling and comparing CMOS implementations of the C-element." IEEE Transactions on Very Large Scale Integration Systems, 6(4), 1998, pp. 563-567.
- [16] T.W. Kwan and M. Shams. "Multi-GHz energy-efficient asynchronous pipelined circuits in MOS Current Mode Logic." In: ISCAS, 2004.
- [17] T. W. Kwan and M. Shams. "Design of high-performance power-aware asynchronous pipelined circuits in MOS current-mode logic." In: ASYNC, 2005, pp. 23-32.
- [18] S. Yancey and S. C. Smith. "A differential design for C-elements and NCL gates." In: MWSCAS, 2010, pp. 632-635.
- [19] O. Elissati et al. "Optimizing and Comparing CMOS Implementations of the C-element in 65nm Technology: Self-Timed Ring Case." In: PATMOS, 2010.
- [20] H. K. O. Berge et al. "Muller C-elements based on minority-3 functions for ultra-low voltage supplies." In: DDECS, 2011, pp. 195-200.
- [21] A. J. Martin. "Formal program transformations for VLSI circuit synthesis." In: Formal Development of Programs and Proofs, E. W. Dijkstra, ed., Addison-Wesley, 1989, pp. 59-80.
- [22] I. E. Sutherland. "Micropipelines." Communications of the ACM, 32, Jun. 1989, pp. 720-738.
- [23] K. van Berkel. "Beware the isochronic fork." Integration, the VLSI Journal, 13(2), Jun. 1992, pp. 103-128.