# International Journal of Engineering Technology and Management Sciences 

Website: ijetms.in Special Issue: 1 Volume No. 7 April - 2023

# Optimization of Synchronizers with Comparison 

K.Manjunath ${ }^{1}$, C.Tejaswini ${ }^{2}$, M.Sai Vardhan ${ }^{3}$, V.Rudra Prathap Reddy ${ }^{4}$, A.Sreedhar ${ }^{5}$<br>${ }^{1,2,3,4,5}$Department of ECE, Annamacharya Institute of Technology \& Sciences, Tirupati-517520


#### Abstract

Real-world synchronizer circuit failure probabilities can be calculated using bisection with restarts [1]. Building on that technique, subsequent research [2] demonstrated how time-varying, linear dynamics can be obtained for non-linear synchronizer circuits. Here, we demonstrate how the component-wise contributions to synchronizer performance can be separated in this linear model. This makes it possible to optimize device dimensions automatically to reduce the likelihood of failure. To compare existing designs fairly, we optimize each circuit before comparing them. The component-wise study shows how each device contributes to metastability resolution over the allocated synchronization time, which clarifies the differences between designs.


Keywords: Synchronizer, bisection methods, time-varying linear dynamics, optimization, synchronization time.

## I. INTRODUCTION

Synchronizers are a common component of contemporary integrated circuits. Most chips have many clock domains that often operate at different frequencies to maximize power efficiency and performance. An interface between these domains needs synchronizers. Globally-Asynchronous, Locally-Synchronous (GALS) designs [3]-[5] use asynchronous communication networks to connect synchronous and asynchronous modules, and synchronizers are necessary for signals entering synchronous domains. Synchronizers are also used at the interfaces between off-chip I/O links and on-chip modules. Large chips can contain hundreds or even thousands of synchronizers. To achieve acceptable reliability, each synchronization operation must have a failure probability between $10^{-15}$ and $10^{-25}$, or less.

This paper adds two substantial extensions to [2]. First, we note that the linear model from [2] can be further decomposed into a linear combination of contributions from the individual circuit components. Hence, we can explain how each transistor or logic gate contributes to the "instantaneous gain" of the synchronizer as well as to its overall gain.

(a) Pass gate latch

(b) Jamb Latch

(c) Robust Synchronizer


(d) StrongArm Latch

Fig 1. Latches for synchronizers studied in this paper.
Section IV builds on this and shows how, given the contribution of each transistor to metastability resolution, we can automatically compute the derivative of the failure probability with respect to transistor parameters, in particular device width. This enables automatic optimization of device sizes and comparison of different synchronizer circuits. Using this approach, Section V presents a comparison of a synchronizer constructed from pass-gate latches, one built from jamb latches, the robust synchronizer circuit proposed in [8], and a synchronizer based on the StrongArm latch [9].

## II. RELATED WORK

## II.1. Four Latches

The four sample synchronizer circuits used in this paper are shown in Figure 1 (note: in all schematics, a signal name ending in B, such as dB, denotes the logical negation of the corresponding signal). A pass-gate latch is depicted in Figure 1a. While clk is high, the latch is transparent: data values flow from node d through nodes x and y to the output q. When clk is low, the fwd and bwd inverters form a loop, and q holds the value that d had at the falling edge of clk.

A jamb latch [10], shown in Figure 1b, attempts to improve synchronizer performance by removing the pass-gate from the feedback loop of the two inverters. Instead, each latch in the synchronizer is first put into a pre-determined reset state by asserting a high value on the reset input. The latch is "one catching": if d and clk are both high, node x is pulled down by transistors m1 and m2, and node q goes high as a result. Because NMOS and PMOS devices can vary differently under PVT (process, voltage, and temperature) variation, care must be taken to ensure that the latch functions properly for all device parameters and operating conditions.

The "robust synchronizer" from [8] is depicted in Figure 1c. Transistors m4 through m7 form a metastability filter. Note that m4 and m6 are in cut-off until the voltage at node x is at least one NMOS threshold lower than the voltage at node y. When the cross-coupled inverters, fwd and bwd, are metastable, the voltages at nodes x and y are nearly equal, and nodes xf (i.e., x "filtered") and yf are both high. When the cross-coupled pair exits metastability, either xf or yf drops low while the other remains high, and the resolved value is produced on q. We now examine the latch's metastable behavior in more detail. While the cross-coupled pair is metastable, additional pull-up current is supplied to the fwd and bwd inverters (Section V returns to the transistors involved). As a result, the NMOS transistors in these inverters have a higher gate-to-source voltage. Because transistor transconductance increases with gate-to-source voltage, metastability resolves faster. The average impact on power consumption is minimal, since the additional current is drawn only while the cross-coupled inverters are metastable.

Finally, Figure 1d shows the StrongArm latch from [9]. The latch is essentially a clocked sense amplifier. When clk is low, transistors m9 and m10 precharge nodes x and y high, respectively. On a rising edge of clk, any difference between the complementary input signals, d and dB, results in a difference between the voltages on nodes $w$ and $wB$. As a result, $x$ and $y$ eventually settle into opposite states. The settling of $x$ and $y$ determines the state of the RS latch, which is made up of two NAND gates. This RS latch holds its state even when clk falls again. Hence, the StrongArm latch implements a positive edge-triggered flip-flop. Note that in the other designs in Figure 1, a flip-flop is implemented using two latches.

## II.2. Synchronizer Analysis

The earliest studies of metastability in synchronizers were done by Kinniment \& Edwards [11] and Molnar \& Chaney [12]. Early synchronizer analysis focused on individual latches; examples include Kinniment and Woods [13] and Couranz and Wann [6]. This results in a theoretical model of the form:

$$
\begin{equation*}
P\{\text{fail}\}=T_{w} f_{clk} e^{-t_{s} / \tau} \tag{1}
\end{equation*}
$$

Here $T_{w}$ is the input window, $f_{clk}$ is the clock frequency, $t_{s}$ is the settling time, $\tau$ is the metastability resolution time constant of the bistable element in the latch (such as a pair of cross-coupled inverters), and $P\{\text{fail}\}$ is the probability of a synchronizer failure for a single transition of its input. By locating the bistable element's metastable point, a linear, small-signal circuit model can be derived at that point, and its largest eigenvalue can be used to calculate $\tau$. $T_{w}$ is often described as the range of input transition times that violate the latch's set-up and hold window. In practice, it is an empirical correction term that accounts for everything that happens during synchronization other than the bistable element loitering near its metastable point. Further elaboration of these single-latch models, and their application to several practical synchronizer circuits, is available in the literature.

Real synchronizers are made up of many latches. These circuits are time-varying dynamical systems: the circuit dynamics change with the clock phase. The bisection with restarts algorithm [1], which is used to find metastability failures in synchronizer circuits, is founded on the following procedure (which assumes a low-to-high transition on d near the sampling clock edge):

- Identify an input transition time for which the circuit settles to a low output by a specified sampling time, $t_{crit}$. Call this input $t_{lo}$; note that the synchronizer settles low for any input transition time $> t_{lo}$.
- Identify another input transition time, $t_{hi}$, such that the synchronizer settles high at $t_{crit}$ whenever the input transition time is $< t_{hi}$. Note that $t_{hi} < t_{lo}$.
- Perform a bisection search with $t_{hi}$ and $t_{lo}$ as the endpoints of the search interval to find an input, $t_{fail}$, for which the circuit does not produce a valid logical output.

This bisection model helps with understanding how synchronizers work, but it is difficult to implement with circuit simulators like HSPICE [7] because the numerical computations become unreliable for inputs close to "perfect" metastability, which can lead to inconsistent results. The key observation behind the bisection with restarts technique is that bisection can locate two very similar initial conditions, one of which results in the synchronizer settling low and the other in it settling high. As the simulation proceeds, these two trajectories diverge.

The method finds a restart time, $t_{restart}$, at which the trajectories have diverged sufficiently, and uses the states of the synchronizer on the two trajectories at $t_{restart}$ as initial conditions for a new round of bisection. Each of these rounds is referred to as an epoch.

Time $t_{restart}$ is also chosen to allow the construction of a small-signal linear model along the trajectory. These small-signal models are implicit in [1], where the ratio of the voltage differences between the trajectories at the end and at the start of each epoch is used to estimate the gain of that epoch (a "time-to-voltage" gain for the starting epoch). Multiplying these ratios gives the synchronizer's overall time-to-voltage gain [15], [16]. Assuming that input transitions are uniformly distributed over the clock period, this time-to-voltage gain can be used to determine the synchronizer failure probability.
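The three-step search above can be sketched in a few lines of Python. This is a minimal illustration of plain bisection only, without the restart mechanism of [1]. The latch model `classify` and all of its parameters (`t_meta`, `gain`, `tau`, `v_valid`) are hypothetical stand-ins for a circuit simulation, not values from this paper.

```python
import math

def classify(t_in, t_meta=3e-11, gain=1e9, tau=1e-11, t_crit=1e-9, v_valid=0.5):
    """Toy latch (hypothetical): outcome at t_crit for an input edge at t_in."""
    v0 = gain * (t_meta - t_in)        # initial offset from the metastable point
    v = v0 * math.exp(t_crit / tau)    # the offset grows exponentially until t_crit
    if v >= v_valid:
        return "high"
    if v <= -v_valid:
        return "low"
    return "invalid"

def bisect_failure(t_hi, t_lo, outcome, max_iter=200):
    """Narrow [t_hi, t_lo] around the input time that fails to resolve."""
    assert t_hi < t_lo and outcome(t_hi) == "high" and outcome(t_lo) == "low"
    for _ in range(max_iter):
        mid = 0.5 * (t_hi + t_lo)
        if mid == t_hi or mid == t_lo:  # floating-point resolution exhausted
            break
        result = outcome(mid)
        if result == "high":
            t_hi = mid
        elif result == "low":
            t_lo = mid
        else:
            return mid, mid             # an input with no valid logical output
    return t_hi, t_lo

t_hi, t_lo = bisect_failure(0.0, 1e-10, classify)
print(t_hi, t_lo, t_lo - t_hi)
```

With double-precision arithmetic the interval shrinks to floating-point resolution after a few dozen halvings, which hints at why a single long simulation cannot probe inputs arbitrarily close to metastability.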
The model of [2] serves as the foundation for the current work, so we review its key findings here. A circuit can be modelled by a system of ordinary differential equations (ODEs):

$$
\begin{align*}
\frac{d}{d t} \mathbf{v}(t) & =f(\mathbf{v}(t), t) \\
\mathbf{v}(0) & =\mathbf{v}_{0} \tag{2}
\end{align*}
$$

where $\mathbf{v}(t)$ is the voltage state at time $t$; $f$ is the circuit model, derived using modified nodal analysis; and $\mathbf{v}_{0}$ is the initial voltage state of the circuit. The derivative function $f$ is time-varying in order to model external inputs such as $d$, the signal to be synchronized, and the clock. We write $d(t, t_{in})$ to denote the data input signal; $t_{in}$ is a parameter that gives the time of the input transition. For simplicity, we define a signal transition as occurring when the signal crosses the midpoint between ground and Vdd. Similarly, we write $clk(t, P)$ to denote the value of a clock with period $P$ at time $t$; we assume that the clock edge of interest, the one that can make the synchronizer metastable, occurs at time $t=0$. The method of [2] computes a unit vector, $u(t)$, that describes the "useful" component of the voltage vector at time $t$: the direction in which a change in $\mathbf{v}(t)$ helps resolve metastability by the deadline $t_{crit}$. A synchronizer's time-to-voltage gain is defined as follows:

$$
g(t)=\frac{\partial}{\partial t_{in}} u(t)^{T} \mathbf{v}(t) \tag{3}
$$

It satisfies a linear, time-varying ODE:

$$
\begin{align*}
\frac{d}{d t} g(t) & =\lambda(t) g(t)+\rho(t) \\
\lim _{t \rightarrow-\infty} g(t) & =0 \tag{4}
\end{align*}
$$

Intuitively, $\lambda(t)$ is the instantaneous gain of the synchronizer, which arises from positive feedback in the synchronizer circuit. The function $\rho(t)$ models the initial conversion of the input transition time, $t_{in}$, into the voltage state of the circuit at time $t$. While Equation 4 would be tidier if we could assume $g(0)=0$, the synchronizer state may in fact depend on the behaviour of d slightly before the initial clock edge. In practice, we can assume $g(t)=0$ when $t$ is a few gate delays before the initial clock edge.



Fig 2: $\lambda(t)$ and $\rho(t)$ for a two-flip-flop, pass-gate synchronizer

Figure 2 shows $\rho(t)$ and $\lambda(t)$ for a two-flip-flop synchronizer. Each flip-flop is made up of two pass-gate latches that are transparent on opposite clock polarities. The inhomogeneous term, $\rho(t)$, quickly approaches zero. The key characteristics of the synchronizer are captured by the homogeneous coefficient, $\lambda(t)$. Note how $\lambda(t)$ dips at each clock edge, which reflects metastability "travelling" from one latch to the next. In the following section, we demonstrate how $\lambda(t)$ can be broken down into a sum of contributions from the various parts of the synchronizer. This decomposition serves as the foundation for the optimization strategy presented in Section IV and is a powerful tool for understanding the reasons behind synchronizer behavior.
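To make the roles of the homogeneous coefficient and the inhomogeneous term concrete, the gain ODE $\frac{d}{dt} g = \lambda(t) g + \rho(t)$ can be integrated numerically. The sketch below uses a fixed-step fourth-order Runge-Kutta scheme. The coefficient functions are hypothetical (a constant $\lambda$ and a decaying $\rho$ in normalized time units), chosen only because they admit a closed-form answer to check against; they are not taken from Figure 2.

```python
import math

def integrate_gain(lam, rho, t0, t1, steps, g0=0.0):
    """RK4 integration of dg/dt = lam(t) * g + rho(t) from t0 to t1."""
    h = (t1 - t0) / steps
    g = g0
    for i in range(steps):
        t = t0 + i * h
        k1 = lam(t) * g + rho(t)
        k2 = lam(t + h / 2) * (g + h * k1 / 2) + rho(t + h / 2)
        k3 = lam(t + h / 2) * (g + h * k2 / 2) + rho(t + h / 2)
        k4 = lam(t + h) * (g + h * k3) + rho(t + h)
        g += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return g

# Hypothetical coefficients in normalized time units
lam = lambda t: 2.0                    # constant instantaneous gain
rho = lambda t: math.exp(-10.0 * t)    # input coupling that dies out quickly

g_end = integrate_gain(lam, rho, 0.0, 2.0, 2000)
# Closed form for these coefficients: g(t) = (e^{2t} - e^{-10t}) / 12
exact = (math.exp(4.0) - math.exp(-20.0)) / 12.0
print(g_end, exact)
```

Once $\rho(t)$ has decayed, the solution grows purely exponentially at rate $\lambda$, which is the behaviour the text attributes to the homogeneous coefficient.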

## III. ELEMENT-WISE METASTABILITY ANALYSIS

The main observation of our work is that, when modified nodal analysis (the most popular approach to circuit analysis) is used, the time derivative of the voltage state and the Jacobian of this derivative can both be written as simple sums of contributions from the circuit components. We start with a few preliminaries on the formulation of the circuit model. Section III.1 sketches the key parts of the derivation of $g(t)$ from [2] that we build upon in our element-wise analysis; a reader who wants more details is referred to [2]. We then show how $g(t)$ can be expressed in an element-wise form, and show an example using the pass-gate latch.

## III.1. Preliminaries

As written in Equation 2, a circuit's time evolution can be expressed as $\frac{d}{dt} \mathbf{v}=f(\mathbf{v}, t)$. This work considers circuits with MOSFET transistors, capacitors, and either constant or time-varying voltage sources (e.g., for clk, d, and Vdd). We write $I_{fet}(\mathbf{v}, t)$ for the current flowing from the transistors into each node of the circuit. The function $I_{fet}(\mathbf{v}, t)$ depends on $t$ because some circuit nodes are connected to voltage sources that are not included in the state vector $\mathbf{v}$; their values are calculated from the value of $t$. The capacitance matrix of the circuit is $C(\mathbf{v}, t)$. For $i \neq j$, $C(\mathbf{v}, t)_{i, j}$ is the capacitance between nodes $i$ and $j$ at time $t$, and $C(\mathbf{v}, t)_{i, i}$ is the total capacitance connected to node $i$. $C$ is a function of $\mathbf{v}$ because the inter-terminal capacitances of MOSFETs are non-linear functions of the terminal voltages, and it depends on $t$ to account for nodes that are connected to voltage sources. According to Kirchhoff's current law, the current flowing from each node into the capacitors is $C(\mathbf{v}, t) \frac{d}{dt} \mathbf{v}(t)$, which must equal $I_{fet}(\mathbf{v}, t)$. Thus,

$$
\begin{equation*}
\frac{d}{d t} \mathbf{v}(t)=C(\mathbf{v}, t)^{-1} I_{fet}(\mathbf{v}, t) \tag{5}
\end{equation*}
$$

Let $f(\mathbf{v}, t)=C(\mathbf{v}, t)^{-1} I_{fet}(\mathbf{v}, t)$ as described above.
The Jacobian operator for $f$ plays a central role in the analysis:

$$
\begin{equation*}
\operatorname{Jac}(f, \mathbf{v}, t)_{i, j}=\frac{\partial}{\partial \mathbf{v}_{j}} f(\mathbf{v}, t)_{i} \tag{6}
\end{equation*}
$$

We write $J_{f}(t)$ as shorthand for $\operatorname{Jac}(f, \mathbf{v}(t), t)$.
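As a small illustration of the Jacobian in Equation 6, the sketch below estimates it by central differences for a two-node toy model of a cross-coupled inverter pair and then forms the quadratic form $u^{T} J u$ along the differential direction. The model `f`, its `gain` and `rc` parameters, and the choice of `u` are assumptions made for illustration; the paper's own implementation uses automatic differentiation rather than finite differences.

```python
import math

def f(v, gain=5.0, rc=1e-11):
    """Toy cross-coupled inverter pair (hypothetical), metastable at v = (0, 0)."""
    return [(-v[0] - gain * math.tanh(v[1])) / rc,
            (-v[1] - gain * math.tanh(v[0])) / rc]

def jacobian(func, v, h=1e-6):
    """Central-difference estimate of J[i][j] = d func(v)[i] / d v[j]."""
    n = len(v)
    J = [[0.0] * n for _ in range(n)]
    for j in range(n):
        vp = list(v)
        vm = list(v)
        vp[j] += h
        vm[j] -= h
        fp, fm = func(vp), func(vm)
        for i in range(n):
            J[i][j] = (fp[i] - fm[i]) / (2 * h)
    return J

J = jacobian(f, [0.0, 0.0])
u = [1 / math.sqrt(2), -1 / math.sqrt(2)]    # differential direction (unit vector)
Ju = [sum(J[i][j] * u[j] for j in range(2)) for i in range(2)]
lam = sum(u[i] * Ju[i] for i in range(2))    # u^T J u
print(lam, 1 / lam)
```

For this toy model the result is $(\text{gain}-1)/rc$, so its reciprocal plays the role of a resolution time constant for the pair.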
We now summarize the key quantities for computing $\lambda(t)$ and $g(t)$:

- $S(t_{1}, t_{2})$: how a change in the circuit state at time $t_{1}$ affects the circuit's state at time $t_{2}$. $S(t_{1}, t_{2})$ is an $n \times n$ matrix.
- $t_{crit}$: the time at which the synchronizer output must be well-defined and settled.
- $t_{eola}$: the time of the "end of linear analysis". We choose $t_{eola}$ to satisfy:
   1. At time $t_{eola}$, the settle-low and settle-high trajectories are sufficiently separated that numerical integration of the circuit model correctly resolves their outcomes for times $t \geq t_{eola}$, and
   2. These trajectories are close enough that small-signal linear analysis is valid for $t \leq t_{eola}$.
- $\tilde{u}$: a unit vector corresponding to "useful" separation of trajectories at time $t_{eola}$. Typically, we use the difference between the inverter outputs of the final (or next-to-last) cross-coupled inverter pair.
- $u(t)$: the pre-image of $\tilde{u}$ at time $t$ for $t \leq t_{eola}$; also a unit vector. Intuitively, $u(t)$ describes what part of $\beta(t)$ is "useful" for synchronizer resolution at time $t_{eola}$ (and thus at $t_{crit}$):

$$
u(t)=\left(S\left(t, t_{eola}\right)^{-1} \tilde{u}\right)\left\|S\left(t, t_{eola}\right)^{-1} \tilde{u}\right\|^{-1}
$$

- $\lambda(t)$ and $\rho(t)$: we now have the ingredients for the simple model for $g(t)$:

$$
\begin{align*}
\lambda(t) & =u(t)^{T} J_{f}(t) u(t) \\
\rho(t) & =u(t)^{T} \frac{\partial}{\partial t_{in}} f(\mathbf{v}, t) \\
g(t) & =u(t)^{T} \beta(t) \\
\frac{d}{d t} g(t) & =\lambda(t) g(t)+\rho(t) \tag{7}
\end{align*}
$$

Here $\beta(t)$ denotes $\partial \mathbf{v}(t) / \partial t_{in}$, so that $g(t)$ agrees with the time-to-voltage gain defined in Section II.2.

While $\rho(t)$ is essential near the initial clock edge to establish the initial imbalance of the synchronizer, for practical synchronizers $\rho(t)$ rapidly approaches 0 after that clock edge (see Figure 2). Thus, most of our attention will be on decomposing $\lambda(t)$.


## III.2. Computing Failure Probabilities

While [2] offered a thorough framework for analysing synchronizer behaviour, it lacked a straightforward method for calculating the probability of synchronizer failure. In this section, we derive Equation 1 from Equation 7. The failure probability is

$$
P\{\text{failure}\}=\frac{2 \Delta V_{eola} f_{clk}}{g\left(t_{eola}\right)} \tag{8}
$$

Consider the simple case of a single-latch synchronizer. Figure 3 shows $g(t)$ for a single pass-gate latch. We observe that the slope of $\log g(t)$ is nearly constant during the settling time and use a least-squares regression to obtain $g_{0}$ and $\lambda_{0}$ such that:

$$
g(t) \approx g_{0} e^{\lambda_{0} t} \tag{9}
$$

In a graph, $g_{0}$ is the value of $g(t)$ obtained by extending the line of best fit back to $t=0$. Let $\mathbf{v}^{lo}(t)$ (resp. $\mathbf{v}^{hi}(t)$) denote the trajectories that settle low (resp. high) just in time, and let $\Delta \mathbf{v}(t)$ denote the separation between them. In the linear region for $\log g(t)$, we get

$$
\Delta \mathbf{v}(t) \approx 2 \Delta V_{eola} e^{\lambda_{0}\left(t-t_{eola}\right)} \tag{10}
$$

Evaluating this separation at the settling time $t_{s}$, we define

$$
\begin{align*}
\tau & =1 / \lambda_{0} \\
\Delta V_{crit} & =2 \Delta V_{eola} e^{\lambda_{0}\left(t_{s}-t_{eola}\right)} \\
T_{w} & =\Delta V_{crit} / g_{0} \tag{11}
\end{align*}
$$

Equation 1 is produced by substituting Equations 9 and 11 into Equation 8.

Fig. 3: Pass-gate latch gain and line-of-best-fit
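The fitting step can be illustrated with a short Python sketch: a least-squares line through $(t, \log g(t))$ gives $g_{0}$ and $\lambda_{0}$, from which $\tau$, $T_{w}$ and the failure probability follow. The data here are synthetic, and every numerical value (`dv_eola`, `t_eola`, `t_s`, the gain samples) is an assumption chosen for illustration. The sketch also assumes that $\Delta V_{crit}$ is the trajectory separation extrapolated to the settling time $t_{s}$, which is what makes the single-latch formula and the gain-based formula agree.

```python
import math

def fit_log_gain(times, gains):
    """Least-squares line through (t, log g): returns (g0, lambda0)."""
    n = len(times)
    ys = [math.log(g) for g in gains]
    t_mean = sum(times) / n
    y_mean = sum(ys) / n
    sxx = sum((t - t_mean) ** 2 for t in times)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times, ys))
    lam0 = sxy / sxx
    g0 = math.exp(y_mean - lam0 * t_mean)   # intercept of the fitted line at t = 0
    return g0, lam0

# Synthetic, hypothetical data: g(t) = 4e10 * exp(t / 10 ps), sampled 100..500 ps
times = [i * 10e-12 for i in range(10, 51)]
gains = [4e10 * math.exp(t / 10e-12) for t in times]
g0, lam0 = fit_log_gain(times, gains)

dv_eola, f_clk = 0.05, 1.6e9        # assumed separation (V) and clock rate (Hz)
t_eola, t_s = 480e-12, 500e-12      # assumed end of linear analysis, settling time
tau = 1 / lam0
dv_crit = 2 * dv_eola * math.exp(lam0 * (t_s - t_eola))
t_w = dv_crit / g0
p_single_latch = t_w * f_clk * math.exp(-t_s / tau)               # T_w f_clk e^{-t_s/tau}
p_from_gain = 2 * dv_eola * f_clk / (g0 * math.exp(lam0 * t_eola))  # 2 dV f_clk / g(t_eola)
print(tau, t_w, p_single_latch, p_from_gain)
```

The two probabilities printed at the end coincide up to rounding, which is the content of the substitution described above.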

## III.3. Element-wise $\lambda(t)$ and $g(t)$

From Equation 5, $J_{f}(t)$ is the Jacobian of $C^{-1} I_{fet}$, but we need to be careful about the notation: $C(\mathbf{v}, t)$ is an $n \times n$ matrix; therefore

$$
\begin{aligned}
J_{f}(t)= & -C(\mathbf{v}(t), t)^{-1} \operatorname{Jac}(C, \mathbf{v}(t), t)\, C(\mathbf{v}(t), t)^{-1} I_{fet}(\mathbf{v}(t), t) \\
& +C(\mathbf{v}(t), t)^{-1} \operatorname{Jac}\left(I_{fet}, \mathbf{v}(t), t\right)
\end{aligned}
$$

$\operatorname{Jac}(C, \mathbf{v}, t)$ is an $n \times n \times n$ tensor. To make the calculation explicit, let

$$
J_{C}(\mathbf{v}, t, j)=\frac{\partial}{\partial \mathbf{v}_{j}} C(\mathbf{v}, t)
$$

Then $-C(\mathbf{v}(t), t)^{-1} \operatorname{Jac}(C, \mathbf{v}(t), t)\, C(\mathbf{v}(t), t)^{-1} I_{fet}(\mathbf{v}(t), t)$ is the $n \times n$ matrix whose $j^{th}$ column is

$$
-C(\mathbf{v}(t), t)^{-1} J_{C}(\mathbf{v}(t), t, j)\, C(\mathbf{v}(t), t)^{-1} I_{fet}(\mathbf{v}(t), t)
$$

As with the transistor currents, the capacitance matrix is a sum of per-device contributions. We write $C_{d}$ for the contribution of a device $d$ and $C_{D}$ for that of a set of devices $D$, and define the per-device quantities $\lambda_{d}(t)$, $\lambda_{D}(t)$, $g_{d}(t)$ and $g_{D}(t)$ accordingly.

Recall the model for $g(t)$ (Equation 7):

$$
\begin{align*}
\lambda(t) & =u(t)^{T} J_{f}(t) u(t) \\
\rho(t) & =u(t)^{T} \frac{\partial}{\partial t_{in}} f(\mathbf{v}, t) \\
g(t) & =u(t)^{T} \beta(t) \\
\frac{d}{d t} g(t) & =\lambda(t) g(t)+\rho(t) \tag{7}
\end{align*}
$$

Observing that $\rho(t)$ converges to 0 after the initial clock edge, we choose $t_{homo}$ so that $|\rho(t)| \ll|\lambda(t) g(t)|$ for $t \geq t_{homo}$. From Equation 7, we then get $\frac{d}{dt} g(t) \approx \lambda(t) g(t)$ for $t \geq t_{homo}$, so $\log g(t)$ grows as the time integral of $\lambda(t)$.

The function $\lambda_{d}(t)$ describes device $d$'s contribution to the resolution of metastability at time $t$. $\lambda_{D}(t)$ gives the effect of a gate or other circuit module when $D$ is chosen as its constituent devices. The functions $g_{d}(t)$ and $g_{D}(t)$ describe the cumulative contribution of a device or module. Here is a straightforward example to demonstrate this; Section IV expands on the per-device analysis for automatic device sizing.

Figure 4 shows the contribution of pass-gate pg1 of stage 1, which connects stage 0 and stage 1 of a pass-gate flip-flop. The signal q0 is the output of inverter out from Figure 1a; pg1 and x1 are the pass-gate and node x of the following pass-gate latch in the chain.

Fig. 4: Pass-gate synchronizer with offset coupling inverters

## III.4. Active Pass-Gates

Now it's time for a short story. You can skip to Section IV if you don't enjoy stories. According to some synchronizer lore, if a synchronizer had latches with differing metastable voltages, the first latch would have to exit metastability before the second latch became opaque for metastability to pass from one latch to the next. According to this folklore, the latch state is then "exiting" rather than "trapped at metastability," which should lower the likelihood of failure.
Of course, we knew this was wrong. The gain-bandwidth product of the cross-coupled inverters is highest at or near their balance point. If the first latch has to exit metastability early in order to make the next latch metastable, gain is lost. We set up a test case by using coupling inverters (out in Figure 1a) with a different P:N sizing ratio than the cross-coupled pair. Imagine our horror when this resulted in a slight reduction in the failure probability.


Fig. 5.: $g(t)$ for a pass-gate synchronizer, with and without offset
The plot of $\lambda(t)$ and the voltage waveforms confirmed that the latch before the offset inverter does indeed exit metastability early (see the plots for $q$ in the top panel of Figure 4). As seen in the middle panel of Figure 4, the pg1 pass-gates favored the design with an offset in the inverter thresholds (the sizing with matched P:N ratios is called "bench" in the figure). It turns out that the pass-gates conduct only weakly when their sources are at the metastable voltage of the synchronizer, because Vgs is so near the threshold voltage, and the body effect makes things worse. Swinging the source voltages closer to ground makes the NMOS transistors more effective. We were even more surprised to see a small window of time in which the pass-gate contributed positively to the instantaneous gain: it was functioning as an amplifier! Yes, it was. The source of the NMOS device was moving towards ground, and the gate was almost at Vdd. The NMOS transistor operates as a common-gate amplifier when the voltage on q is well below clk.

There are further anomalies beyond the surprise amplifier. Figure 5 shows $g(t)$ for the two designs. The coupling inverter exhibits a lower Miller capacitance in the offset case because it has a smaller gain, which causes the gain to increase more quickly at first for the offset design. The pass-gate pg1 in the second latch, as previously mentioned, causes the upward spike. We have not pinpointed the precise cause of the downward spike that occurs in the cross-coupled pair of the second latch. Together, these effects gave the synchronizer with the offset a small advantage. This experiment served as a reminder of how often transistors in synchronizers operate very close to cut-off. Unless these situations are identified, the performance of the synchronizer may differ significantly from what is expected.


## IV. AUTOMATIC TRANSISTOR SIZING

This section presents a gradient-descent method for optimising transistor widths to reduce the failure probability. The basic strategy is straightforward: for each transistor, determine how changing its width affects the failure probability; adjust the transistor widths to lower the failure probability; and repeat until the improvements fall below a user-specified threshold. More precisely, we (I) create the ODE model for the circuit with w, a vector of transistor widths, as a parameter; (II) calculate the gradient of $P\{\text{fail}\}$ with respect to w; and (III) take gradient steps until the improvement threshold is reached. Three issues arise with this approach: (1) $P\{\text{fail}\}$ tends to depend exponentially on the parameters, which undermines the linear approximations that underlie gradient-descent optimization; (2) Jacobian matrices are used throughout the analysis, and taking their derivatives with respect to a vector results in a large number of tensor operations; and (3) the optimization needs correctness constraints to avoid producing designs that are dangerously close to being wrong.
Table I: Failure probabilities of the optimized synchronizers (discussed in Section V)

| Design | $P\{\text{fail}\}$ | Design | $P\{\text{fail}\}$ |
| :--- | :--- | :--- | :--- |
| Pass-gate | $2.1 \mathrm{e}-56$ | StrongArm | $1.5 \mathrm{e}-42$ |
| Jamb, 20\% | $3.1 \mathrm{e}-56$ | Robust, 20\% | $3.8 \mathrm{e}-73$ |
| Jamb, 60\% | $1.6 \mathrm{e}-15$ | Robust, 60\% | $2.3 \mathrm{e}-71$ |

To avoid the exponential sensitivities of $P\{\text{fail}\}$, we formulate the problem as one of maximizing $\log g(t_{eola})$. Rewriting Equation 15, we get

$$
\log g(t) \approx \log g\left(t_{homo}\right)+\sum_{d \in \mathscr{D}} \int_{z=t_{homo}}^{t} \lambda_{d}(z) d z
$$
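The decomposition of $\log g(t)$ into per-device integrals can be checked numerically. The sketch below integrates three made-up contributions $\lambda_{d}(t)$ with the trapezoid rule and confirms that their sum matches the integral of the total $\lambda(t)$. The device names, the functions, and the value assumed for $\log g(t_{homo})$ are all hypothetical and in normalized units.

```python
import math

def trapezoid(fn, a, b, steps=2000):
    """Composite trapezoid rule for the integral of fn over [a, b]."""
    h = (b - a) / steps
    total = 0.5 * (fn(a) + fn(b))
    for i in range(1, steps):
        total += fn(a + i * h)
    return total * h

# Hypothetical per-device contributions lambda_d(t), in normalized units
lambda_d = {
    "fwd_nmos": lambda t: 3.0,
    "bwd_nmos": lambda t: 2.5,
    "pass_gate": lambda t: -1.0 * math.exp(-t),   # a load whose effect fades
}

t_homo, t_end = 0.0, 4.0
log_g_homo = math.log(1e-3)                       # assumed log g(t_homo)

# Cumulative contribution of each device over [t_homo, t_end]
per_device = {d: trapezoid(fn, t_homo, t_end) for d, fn in lambda_d.items()}
log_g_end = log_g_homo + sum(per_device.values())

# Cross-check: integrate the total lambda(t) directly
total_lambda = lambda t: sum(fn(t) for fn in lambda_d.values())
log_g_direct = log_g_homo + trapezoid(total_lambda, t_homo, t_end)
print(per_device, log_g_end, log_g_direct)
```

The per-device integrals are what make the comparison between designs interpretable: a device with a negative integral costs gain over the settling window, and one with a positive integral adds to it.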

More challenging are the tensors that appear when computing the derivatives of Jacobian matrices with respect to the width vector. Our implementation uses the automatic differentiation (AD) package from INTLAB [17] for MATLAB. Although INTLAB offers good support for operations on vectors and matrices, we had to write our own code for tensor operations. Because our code builds on the INTLAB core, it does not depend on the specifics of the device models used or the circuit being studied. A more serious drawback is that INTLAB uses operator overloading to implement the forward AD algorithm. This slows the code, and we believe a backward algorithm could eliminate the slowdown. Some work has been done on backward AD for ODEs [18], [19]; however, [18] does not provide some of the operations we need, including Hessians, and [19] examines AD for higher-order derivatives but does not provide implementations.

As a final set of design constraints, we added correctness requirements for the jamb latch and the robust synchronizer, as well as a minimum transistor width and a maximum total width. In both designs, a pull-down path (such as m1 and m2 in Figures 1b and 1c) must be able to overpower the pull-up in inverter bwd. The same applies to transistor m3 and the fwd inverter. At each optimization step, the optimizer computes the minimum safe widths for these transistors, adds a user-specified margin, and adjusts the widths if needed to ensure a correct design.
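The overall loop (gradient step, then constraint enforcement, repeated until the improvement is below a threshold) can be sketched as follows. This is an illustration under strong assumptions: the objective `log_gain` is a made-up two-transistor stand-in for $\log g(t_{eola})$, the gradient uses finite differences instead of automatic differentiation, and `enforce` models only the minimum-width and total-width constraints, not the pull-down correctness margins.

```python
def log_gain(w, c_fixed=1.0, t_settle=10.0):
    """Hypothetical objective: w[0] provides gain, w[1] only adds load."""
    return t_settle * w[0] / (c_fixed + w[0] + w[1])

def gradient(fn, w, h=1e-6):
    """Central-difference gradient (stand-in for automatic differentiation)."""
    grad = []
    for i in range(len(w)):
        wp = list(w)
        wm = list(w)
        wp[i] += h
        wm[i] -= h
        grad.append((fn(wp) - fn(wm)) / (2 * h))
    return grad

def enforce(w, w_min, budget):
    """Clip to the minimum width, then shrink the excess to meet the budget."""
    w = [max(x, w_min) for x in w]
    excess = sum(w) - budget
    if excess > 0:
        slack = [x - w_min for x in w]
        total_slack = sum(slack)
        w = [x - excess * s / total_slack for x, s in zip(w, slack)]
    return w

def optimize(fn, w, w_min, budget, step=0.1, tol=1e-9, max_iter=1000):
    """Gradient ascent on fn, stopping when the improvement drops below tol."""
    w = enforce(w, w_min, budget)
    for _ in range(max_iter):
        g = gradient(fn, w)
        w_new = enforce([x + step * gi for x, gi in zip(w, g)], w_min, budget)
        if fn(w_new) - fn(w) < tol:
            break
        w = w_new
    return w

w_min = 0.2
budget = 4 * 2 * w_min       # four times the number of devices times w_min
w_opt = optimize(log_gain, [0.5, 0.5], w_min, budget)
print(w_opt, log_gain(w_opt))
```

In this toy problem the loading device is driven to the minimum width and the remaining budget goes to the device that provides gain, which mirrors the kind of outcome reported for the coupling inverters in Section V.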

## V. SYNCHRONIZER COMPARISON

We applied the analysis methods outlined in the preceding sections. Our transistor models are a simplified version of the EKV model [21] with parameters fitted to the PTM [22] 45 nm model. The capacitance model accounts for nonlinear capacitances between all terminal pairs of a MOSFET device, so Miller capacitances and other nonlinear capacitances between signal nodes are included in our circuit models. Our models do not account for interconnect capacitance; as a result, our reported figures are for synchronizers that run nearly twice as fast as they would with interconnect included. This holds for all of the designs, so the relative performance comparisons should not be much affected. Every synchronizer consists of two flip-flops. In the StrongArm synchronizer, each StrongArm latch implements a positive edge-triggered flip-flop; in the other designs, each flip-flop is made up of two latches, with the master updated when the clock is low and the slave updated when the clock is high. To produce designs that could plausibly be used in a real cell library, all latches of each synchronizer were optimised to have the same transistor dimensions. On the other hand, because we wanted to evaluate the circuit designs rather than the specifics of a particular set of design rules, we allowed the optimizer to choose transistor widths from a continuous range. We set a minimum width of 200 nm for each transistor and a total width budget of four times the number of transistors times the minimum width. All designs were tested at a 1.6 GHz clock frequency, with $t_{crit}$ set to 120 ps after the second rising clock edge.

The results are summarised in Table I. For the jamb latch and the robust synchronizer to function, the NMOS pull-down networks must be able to overpower a PMOS pull-up. We optimised designs with a $20 \%$ margin, which means the NMOS network can sink $20 \%$ more current than the minimum required, and with a $60 \%$ margin to allow for PVT variation. The robust synchronizer [8] comes out on top. According to the failure probability in Table I, the robust synchronizer's $\tau$ should be around 4 ps, which is plausible given the 20.7 ps figure reported in [8] for a 180 nm implementation. Remarkably, the jamb latch [10], on which the robust synchronizer is based, performs quite poorly in the 45 nm technology used in this paper.
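As a rough plausibility check of the time constant quoted above, the single-latch formula $P\{\text{fail}\}=T_{w} f_{clk} e^{-t_{s}/\tau}$ can be solved for $\tau$. The sketch below does this for the robust synchronizer's 20% entry in Table I. Two inputs are assumptions rather than values from the paper: the settling time is taken as one clock period plus 120 ps, and the input window `t_w` is set to 20 ps purely for illustration.

```python
import math

def tau_from_failure_probability(p_fail, t_w, f_clk, t_s):
    """Solve P{fail} = T_w * f_clk * exp(-t_s / tau) for tau."""
    return t_s / math.log(t_w * f_clk / p_fail)

f_clk = 1.6e9                   # clock frequency from the text
t_s = 1 / f_clk + 120e-12       # assumed: one clock period plus 120 ps
t_w = 20e-12                    # assumed input window, not from the paper
tau = tau_from_failure_probability(3.8e-73, t_w, f_clk, t_s)
print(tau)
```

Under these assumptions the result is roughly 4.6 ps. Because $T_{w}$ enters only through a logarithm, varying it between 1 ps and 100 ps changes the estimate by only a few percent.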
Below, we go over each design in greater depth. We used the pass-gate latch as our starting point because it is a straightforward design that is familiar to most designers. The optimizer reduced all transistors in the inter-stage coupling inverters and pg to their smallest size. On page 3, the transistors are a little bit bigger. We were surprised to find that pg1 slightly increases the gain of the synchronizer as metastability moves between latches, as discussed in Section III.4.

The jamb latch delivered both an anticipated outcome and a surprise. The anticipated outcome was that, in a 45 nm (or smaller) technology, the NMOS pull-down transistors would need to be quite large: because of velocity saturation, NMOS devices do not have a significant advantage over their PMOS counterparts. The large widths of the NMOS transistors are the main reason why the jamb latch design performs poorly. The surprise was that there are two different ways for metastability to move between latches, depending on the $\mathrm{P}: \mathrm{N}$ ratio of the output inverter of each latch. In the "typical" case, when stage $i$ is metastable, its output inverter drives the d input of stage $i+1$ to a value that is too low to switch the next stage. Metastability propagates when stage $i$ resolves so as to raise d just as the clock for stage $i+1$ goes low. In the alternative case, when stage $i$ is metastable, the output of its output inverter is high enough to flip stage $i+1$. Metastability propagates when stage $i$ resolves so as to drive d low shortly after the clock for stage $i+1$ goes high. As a result, stage $i+1$ experiences a runt pull-down pulse, which leads to metastability.
The robust synchronizer, like the jamb latch, has two modes of metastability propagation. A more intriguing finding is that the optimizer uses minimum-width pull-ups in the fwd and bwd inverters. In the absence of metastability, those pull-ups guarantee correct logical operation. The larger transistors ma and m9 provide substantially higher pull-up current when resolving metastability, since their gates are grounded by boost B. The inverters' small pull-ups allow the pull-down networks, m1, m2, and m3, to be relatively compact. Although it is not mentioned in the publication, the transistor sizes used in [8] suggest that the authors were aware of this characteristic. We were pleased to discover an undocumented design feature.
It was initially unexpected that the StrongARM design performed poorly, because the StrongARM is renowned for performing well and using little power in logic circuits. Given its sense-amp approach, it appears well suited for resolving metastability. A closer look reveals that it has additional loads on both sides of the cross-coupled pair as well as additional series transistors in its pull-down paths, both of which degrade its performance.

## V. CONCLUSIONS

We have presented metastability analysis methods that quantify the contribution of each transistor and logic element of a synchronizer to the overall performance. In particular, the time-varying models from [2] can be decomposed into a sum of contributions from each component. We have implemented the algorithm and applied it to real synchronizer designs. Our techniques take into account both the small-signal linear behaviour of latches in metastability and the non-linear circuit behaviour that occurs when metastability moves from one latch to the next.
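The decomposition idea can be sketched in a scalar setting. Assume, purely for illustration, that the small-signal metastable voltage difference obeys $\dot{v} = \lambda(t)\, v$ with $\lambda(t) = \sum_k \lambda_k(t)$, one term per component. The log-gain $\ln(v(T)/v(0)) = \int_0^T \lambda(t)\, dt$ then splits into one integral per component. The component names and rate functions below are hypothetical and are not the models from [2].

```python
import math

def trapezoid(f, t0, t1, n=2000):
    """Composite trapezoid rule for the integral of f over [t0, t1]."""
    h = (t1 - t0) / n
    total = 0.5 * (f(t0) + f(t1))
    for i in range(1, n):
        total += f(t0 + i * h)
    return total * h

# Hypothetical per-component gain rates (1/ps); names are illustrative.
components = {
    "cross_coupled_inverters": lambda t: 0.25,
    "pass_gate": lambda t: 0.05 * math.exp(-((t - 50.0) / 10.0) ** 2),
    "output_load": lambda t: -0.03,
}

T = 100.0  # synchronization time in ps (assumed)

# Integrate each component separately, then the summed rate directly.
contrib = {name: trapezoid(rate, 0.0, T) for name, rate in components.items()}
total_rate = lambda t: sum(rate(t) for rate in components.values())
total_log_gain = trapezoid(total_rate, 0.0, T)

for name, value in contrib.items():
    print(f"{name:24s} contributes {value:+.3f} to the log-gain")
print(f"sum of contributions = {sum(contrib.values()):.3f}, "
      f"direct integral = {total_log_gain:.3f}")
```

Because integration is linear, the per-component integrals add up to the total log-gain, so each device can be credited with (or blamed for) its share of the metastability resolution over the synchronization time.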
We showed how the pass-gates in a straightforward synchronizer can function as active amplifiers that aid metastability resolution. We also showed that, depending on the transistor sizes in the latch circuit, the well-known jamb latch can convey metastability between stages in two qualitatively different ways.
Automatic device sizing is made possible by the ability to account for the contribution of each transistor to metastability resolution. This allowed us to optimize four well-known synchronizer circuit topologies and compare the designs. As a result, we discovered design characteristics that, to the best of our knowledge, have not been published. For instance, we found that the passive PMOS pull-ups of Zhou's robust synchronizer [8] result in transistor sizing that significantly increases the design's tolerance to PVT variation, particularly NMOS-vs-PMOS skew. Zhou's paper emphasizes tolerance to low operating voltages but does not mention the advantage of tolerating PVT variation.
Further research is needed in a number of areas. First, we want to develop reverse-mode automatic differentiation (AD) algorithms that work with numerical integration. The forward-mode technique is currently the bottleneck for optimization, and a reverse-mode computation should be able to reduce the time it takes to determine optimal transistor sizes by an order of magnitude. Currently, a full optimization run can take up to a day. We emphasize that our method successfully applies gradient descent to a numerically integrated result, where the integrand, the objective function, and the gradient were all obtained using AD. Although our current implementation shows that this deep integration of scientific and symbolic computation is feasible, a faster optimizer would allow for more experiments. The wagging synchronizer [23], the pseudo-NMOS synchronizer [24], and the voltage-boosted synchronizer [25] are all intriguing candidates for further study.
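Gradient descent on a numerically integrated objective, with derivatives supplied by forward-mode AD, can be sketched with dual numbers. Everything in this example is an assumption made for illustration: the `Dual` class, the toy gain rate `rate(t, w)`, the single width parameter `w`, and the step size are not the authors' implementation. The objective is the negative log-gain $-\int_0^T \lambda(t, w)\, dt$.

```python
import math

class Dual:
    """Minimal forward-mode AD value: val + der * eps, with eps**2 == 0."""
    def __init__(self, val, der=0.0):
        self.val, self.der = val, der
    @staticmethod
    def lift(x):
        return x if isinstance(x, Dual) else Dual(float(x))
    def __add__(self, other):
        o = Dual.lift(other)
        return Dual(self.val + o.val, self.der + o.der)
    __radd__ = __add__
    def __neg__(self):
        return Dual(-self.val, -self.der)
    def __sub__(self, other):
        return self + (-Dual.lift(other))
    def __rsub__(self, other):
        return Dual.lift(other) - self
    def __mul__(self, other):
        o = Dual.lift(other)
        return Dual(self.val * o.val, self.der * o.val + self.val * o.der)
    __rmul__ = __mul__
    def __truediv__(self, other):
        o = Dual.lift(other)
        return Dual(self.val / o.val,
                    (self.der * o.val - self.val * o.der) / (o.val * o.val))
    def __rtruediv__(self, other):
        return Dual.lift(other) / self

def rate(t, w):
    """Toy gain rate: wider devices add drive but also load (peak at w = 1)."""
    envelope = 1.0 - math.exp(-t / 5.0)   # plain float, depends on t only
    return envelope * w / (1.0 + w * w)

def objective(w, T=50.0, n=500):
    """Negative log-gain, integrated with the composite trapezoid rule."""
    h = T / n
    total = 0.5 * (rate(0.0, w) + rate(T, w))
    for i in range(1, n):
        total = total + rate(i * h, w)
    return -(total * h)

w = 0.3                                   # initial (hypothetical) width
for _ in range(200):
    out = objective(Dual(w, 1.0))         # seed dw/dw = 1
    w -= 0.02 * out.der                   # gradient-descent step
print(f"optimized w = {w:.4f}, log-gain = {-objective(w):.4f}")
```

Forward mode needs one pass through the integration per parameter, which is why its cost grows with the number of transistor widths. A reverse-mode computation obtains the gradient with respect to all parameters from a single backward pass, which is the motivation for the speed-up anticipated above.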

## REFERENCES

[1] S. Yang and M. R. Greenstreet, "Computing synchronizer failure probabilities," in Proceedings of the 13th Design, Automation and Test, Europe Conference, Apr. 2007, pp. 1361-1366. [Online]. Available: http://dl.acm.org/citation.cfm?id=1266366.1266663
[2] J. Reiher, M. Greenstreet, and I. Jones, "Explaining metastability in real synchronizers," in Proceedings of the 24th International Symposium on Asynchronous Circuits and Systems, May 2018, pp. 59-67. [Online]. Available: http://dx.doi.org/10.1109/ASYNC.2018.00024
[3] D. M. Chapiro, "Globally-asynchronous, locally-synchronous systems," Ph.D. dissertation, Department of Computer Science, Stanford University, Oct. 1984, Tech. Report STAN-CS-84-1026.
[4] P. Teehan, M. Greenstreet, and G. Lemieux, "A survey and taxonomy of GALS design styles," IEEE Design and Test, vol. 24, no. 5, pp. 418-428, 2007.
[5] Y. Thonnart, P. Vivet, and F. Clermidy, "A fully-asynchronous low-power framework for GALS NoC integration," in Proceedings of Design, Automation and Test in Europe (DATE), Mar. 2010, pp. 33-38.
[6] G. R. Couranz and D. F. Wann, "Theoretical and experimental behavior of synchronizers operating in the metastable region," IEEE Transactions on Computers, vol. C-24, no. 6, pp. 604-616, June 1975. [Online]. Available: http://dx.doi.org/10.1109/T-C.1975.224273
[7] Synopsys, Inc., "HSPICE: the gold standard for accurate circuit simulation," web page, 2017. [Online]. Available: https://www.synopsys.com/verification/ams-verification/circuit-simulation/hspice.html
[8] J. Zhou, D. Kinniment et al., "A robust synchronizer," in IEEE Computer Society Annual Symposium on Emerging VLSI Technologies and Architectures (ISVLSI), Mar. 2006. [Online]. Available: https://ieeexplore.ieee.org/abstract/document/1602487/
[9] D. W. Dobberpuhl, "Circuits and technology for Digital's StrongARM and ALPHA microprocessors," in Proceedings of the 17th Conference on Advanced Research in VLSI, Sep. 1997, pp. 2-11.
[10] C. Dike and E. Burton, "Miller and noise effects in a synchronizing flip-flop," IEEE Journal of Solid-State Circuits, vol. 34, no. 6, pp. 849-855, 1999.
[11] D. J. Kinniment and D. B. Edwards, "Circuit technology in a large-scale computer system," in Proceedings of the Conference on Computers - Systems and Technology, Oct. 1972, pp. 441-450. [Online]. Available: http://www.async.org.uk/David.Kinniment/Research/papers/system1972.PDF
[12] C. Molnar and T. Chaney, "Anomalous behavior of synchronizer and arbiter circuits," IEEE Transactions on Computers, vol. 22, no. 4, pp. 421-422, 1973. [Online]. Available: http://doi.ieeecomputersociety.org/10.1109/T-C.1973.223730
[13] D. Kinniment and J. Woods, "Synchronisation and arbitration circuits in digital systems," Proceedings of the IEE, vol. 123, no. 10, pp. 961-966, Oct. 1976. [Online]. Available: http://www.async.org.uk/David.Kinniment/Research/papers/IEE1976.pdf
[14] D. J. Kinniment, A. Bystrov, and A. V. Yakovlev, "Synchronization circuit performance," IEEE Journal of Solid-State Circuits, vol. 37, no. 2, pp. 202-209, 2002.
[15] M. Abas, A. Bystrov, D. J. Kinniment, O. V. Maevsky, G. Russell, and A. V. Yakovlev, "Time difference amplifier," Electronics Letters, vol. 38, no. 23, pp. 1437-1438, Nov. 2002. [Online]. Available: http://dx.doi.org/10.1049/el:20020961
[16] N. M. Alahmadi, G. Russell, and A. Yakovlev, "Time difference amplifier design with improved performance parameters," Electronics Letters, vol. 48, no. 10, pp. 562-563, May 2012. [Online]. Available: http://dx.doi.org/10.1049/el.2011.3330
[17] S. Rump, "INTLAB - INTerval LABoratory," in Developments in Reliable Computing, T. Csendes, Ed. Dordrecht: Kluwer Academic Publishers, 1999, pp. 77-104, http://www.ti3.tu-harburg.de/rump/.
[18] B. Carpenter, M. D. Hoffman, M. A. Brubaker, D. Lee, P. Li, and M. Betancourt, "The Stan math library: Reverse-mode automatic differentiation in C++," arXiv:1509.07164, 2015. [Online]. Available: https://arxiv.org/abs/1509.07164
[19] M. Wang, "High order reverse mode of automatic differentiation," Ph.D. dissertation, Purdue University, Jan. 2017.
[20] C. Yan, "Reachability analysis-based circuit-level formal verification," Ph.D. dissertation, The University of British Columbia, 2011.
[21] C. Enz, "An MOS transistor model for RFIC design valid in all regions of operation," IEEE Transactions on Microwave Theory and Techniques, vol. 50, no. 1, pp. 342-359, Jan. 2002. [Online]. Available: https://ieeexplore.ieee.org/abstract/document/981286
[22] Y. Cao, "PTM: predictive technology model," http://ptm.asu.edu, 2008.
[23] M. Alshaikh, D. Kinniment, and A. Yakovlev, "A synchronizer design based on wagging," in 2010 International Conference on Microelectronics, Dec. 2010, pp. 415-418.
[24] S. Yang, I. W. Jones, and M. R. Greenstreet, "Synchronizer performance in deep sub-micron technology," in Proceedings of the 17th International Symposium on Asynchronous Circuits and Systems, 2011, pp. 33-42. [Online]. Available: http://dx.doi.org/10.1109/ASYNC.2011.19
[25] Y. Li, P. I-J. Chuang, A. Kennings, and M. Sachdev, "Voltage-boosted synchronizers," in Proceedings of the 25th Edition on Great Lakes Symposium on VLSI. New York, NY, USA: ACM, 2015, pp. 307-312. [Online]. Available: https://doi.org/10.1145/2742060.2742075

