[1]\fnmJian \surWu

[1]\orgdivSchool of Electronic Engineering, \orgnameBeijing University of Posts and Telecommunications, \orgaddress\cityBeijing, \postcode100876, \countryChina

Integrated photonics modular arithmetic processor

\fnmYuepeng \surWu [email protected] \fnmHongxiang \surGuo [email protected] \fnmBowen \surZhang [email protected] \fnmJifang \surQiu [email protected] \fnmZhisheng \surYang [email protected] [email protected] *

Abstract

Integrated photonics computing has emerged as a promising approach to overcome the limitations of electronic processors in the post-Moore era, capitalizing on the superiority of photonic systems. However, present integrated photonics computing systems face challenges in achieving high-precision calculations, consequently limiting their potential applications, and their heavy reliance on analog-to-digital (AD) and digital-to-analog (DA) conversion interfaces undermines their performance. Here we propose an innovative photonic computing architecture featuring scalable calculation precision and a novel photonic conversion interface. By leveraging Residue Number System (RNS) theory, the high-precision calculation is decomposed into multiple low-precision modular arithmetic operations executed through optical phase manipulation. Those operations directly interact with the digital system via our proposed optical digital-to-phase converter (ODPC) and phase-to-digital converter (OPDC). Through experimental demonstrations, we showcase a calculation precision of 9 bits and verify the feasibility of the ODPC/OPDC photonic interface. This approach paves the path towards liberating photonic computing from the constraints imposed by limited precision and AD/DA converters.

keywords:

Optical Computing, Integrated Photonics, Modular Arithmetic

1 Introduction

Since the breakdown of Dennard scaling of integrated circuits around 2006, making individual computing units faster and more energy-efficient has become increasingly challenging [1]. This has led to current processors being limited to clock frequencies of a few GHz and commonly experiencing low utilization due to restricted power dissipation [2]. These challenges are the direct consequence of the fundamental physics problems that CMOS transistor technology faces, which results in computing performance being a major bottleneck for many potential applications [3]. Therefore, many efforts have been made to improve the conventional transistor-based computing paradigm with an alternative platform. One of the promising approaches is integrated photonic computing, which enables operation at unprecedented bandwidths [4]. To fully harness the superiority of the photonic platform and construct a competitive computing system, the primary challenge is to find a natural and efficient combination between the arithmetic method and controllable physical effect in integrated photonics, just like the combination between Boolean algebra and transistors (as illustrated in Fig. 1a).

Currently, most published integrated photonics computing research utilizes intensity as the controllable physical quantity, i.e., “calculation quantity”, and adopts arithmetic methods that mimic their electronic counterparts. Some researches aimed at replacing transistor logic gates in the electric Boolean digital system with photonic ones [5, 6]. However, optical intensity signals suffer the loss from both the material absorption and the intrinsic property of non-reciprocal Boolean operation [7] , which makes it challenging to be cascaded to form a photonic arithmetic unit with sufficiently high bit-width. On the other hand, more researches adopted the paradigm of neuromorphic computing [8, 9, 10], which aims to implement matrix-vector multiplication [11, 12, 13], convolution operation [14, 15, 16], or neural network model inference [17, 18, 19, 20] on integrated optical devices with ultra-fast calculation speed. However, the analog intensity signal is susceptible to noise, which limits their calculation precision due to relatively poor signal-to-noise ratio (SNR) [21]. The calculation precision of these published optical neuromorphic systems cannot meet the general requirements [22, 23], and there is no efficient method to improve their calculation precision as compared to digital computation systems [24]. Consequently, its demonstrated workloads are severely restricted to tasks of simplistic functionality, undermining the potential of integrated photonics computing for broader application domains. Additionally, to ensure compatibility with existing digital devices, an optical neuromorphic system operating at ultra-high bandwidth requires high-precision analog-to-digital converters (ADCs) and digital-to-analog converters (DACs) that can match its speed. This requirement poses significant energy consumption challenges [25, 26] while also imposing formidable limitations on the performance of the integrated photonics computing system. These “precision challenge” and “converter challenge” have hindered the practical implementation of the intensity-based methods and have necessitated the pursuit of alternative calculation quantity and arithmetic methods for integrated photonics computing.

Phase is another controllable physical quantity in integrated photonics apart from the intensity and has not yet been fully explored in the field of photonic computing, which can be controlled more efficiently by integrated phase modulator (PM) [27, 28] and exhibiting intrinsic periodicity. That feature of the optical phase aligns naturally with the modular arithmetic in which numbers are folded by a specific modulus [29]. For modulus arithmetic, leveraging a non-traditional numerical representation format, i.e., residue number system (RNS), allows for the decomposition of high-precision computations into multiple parallel low-precision operations [30]. The correspondence between the modular arithmetic and the optical phase provides a foundation for leveraging the properties of modular arithmetic in the efficient extension of calculation precision in integrated photonic computing systems. Here, we propose an integrated phase-based photonics modular arithmetic processor (PPMAP) architecture aimed at addressing the precision and converter challenges in photonics computing. By taking advantage of the unique characteristics of phase, its components, PPMAP units, enable accurate execution of basic modular arithmetic operations and support multi-operand operations that are challenging to implement in electronic processors. The precision of this architecture can be effectively enhanced using the RNS theory. In a proof-of-concept experiment, the three-unit PPMAP achieved 9-bit calculation precision, which is unprecedented for intensity-based photonic computing architecture. Moreover, based on the multi-operand characteristics of PPMAP architecture and our previous work on phase-shifted optical quantizer[31, 32], we demonstrate the accurate loading and extraction of information into and from phase domain with the seamlessly integrated photonics conversion interfaces, which make advancement in elimination the constraints of AD/DA converters. Further simulations indicate that PPMAP can achieve comparable reliability to electronic processors under reachable SNR conditions.

Refer to caption — Figure 1: (a) General structure of emerging photonic computing. For a controllable physical phenomenon, once a “controlling quantity” is specified, the “calculation quantity” evolves rapidly based on a specific physical relationship. By exploiting this relationship and its inherent correspondence between a specific arithmetic method, efficient calculations can be achieved through the loading and extraction of digital information via converters. (b) Schematic diagram of the PPMAP architecture. (c) The mapping between operands and phase when take modulo-12 as an example, along with a demonstration of basic modular arithmetic performed based on this mapping.

2 Results

2.1 PPMAP architecture

The basic structure of the PPMAP unit is depicted in Fig.1b. A beam splitter (BS) evenly splits the power of an incident monochromatic laser into two beams, which propagate through two separate paths. Each path is equipped with a set of cascaded phase modulators (PMs). Finally, the phase difference between the signals in the two paths is extracted by a phase-to-digital converter (PDC) linked at the end. Since both beams originate from the same laser source, in the absence of a voltage bias on the modulator, they exhibit a stable static phase difference denoted as $\varphi_{0}$ . When a voltage is applied to one of the modulators, it introduces a dynamic phase difference denoted as $\Delta\varphi$ . In the case that simultaneously applies voltage to the PMs, the overall dynamic phase difference is the sum of their individual contributions.

To perform modular arithmetic operations under modulus $m$ , we assign $m$ discrete phase points from the analog quantity $\Delta\varphi\in[0,2\pi)$ to represent the $m$ elements in the set $\{0,\ 1,\ 2,\ldots,m-1\}$ . These points should include the phase of 0 as the identity element and be uniformly spaced with an interval of $2\pi/m$ to satisfy the closure property, which ensures that the result of the addition operation remains within the same set [29]. Consequently, the integers in modulo $m$ are mapped to the discrete dynamic phase differences $\Delta\varphi_{\mathbb{Z}}=\{0,2\pi/m,4\pi/m,...,2\pi-2\pi/m\}$ . This mapping allows for modular addition or subtraction operations by loading operands into the corresponding phase modulators on the same or different paths, as shown in Fig.1c, which illustrates an example of $m=12$ . To perform a modular addition operation with operands “10” and “7”, followed by subtraction with “3”, we regulate two PMs on the same path and one PM on the other path to induce phase changes of $5\pi/3$ , $7\pi/6$ , and $\pi/2$ , respectively. The closure property allows us to derive the results once we identify the specific discrete phase points from $\Delta\varphi_{\mathbb{Z}}$ . The PDC then determines the resulting phase $\Delta\varphi=11\pi/6$ , corresponding to the calculation result of ” $|10+7-3|_{12}=2$ ”. With that efficient implementation of modular addition, we can implement modular multiplication with the same structure through the method of index-sum multiplication that performs modular multiplication by simply a modular addition and table lookup operation [33].

Additionally, by leveraging the theory of residue number system (RNS), which represents the integer operand, denoted as $X$ , by its remainders of a set of pairwise coprime moduli $\{m_{1},...,m_{N}\}$ , i.e., the RNS format:

X_{\rm RNS}=X\%\{m_{1},\dots,m_{N}\}=\{|X|_{m_{1}},\dots,|X|_{m_{N}}\}

(1)

, the original operation with high bit-width operands can be broken into multiple parallel modular arithmetic operations with much lower bit-width operands and different moduli, as long as the result is within the range $M=\prod_{i=1}^{N}m_{i}$ . By orchestrating multiple PPMAP units with different moduli according to the RNS theory, as shown in Fig.1b, we construct the PPMAP architecture with bit-width of $\lfloor\log_{2}\prod_{i=1}^{N}m_{i}\rfloor$ , thereby realizing calculation with scalable precision.

2.2 High-precision integrated photonic calculation

Fig.2a illustrates the integrated photonics chip utilized for the principle verification experiments, which was fabricated on a commercially available 180-nm SOI platform via the multi-project wafer (MPW) process and corresponds to the fundamental structure depicted in Fig.1b. The continuous wave (CW) laser is coupled into the integrated waveguide through a grating coupler and is evenly separated via a subsequent $1\times 2$ multimode interferometer (MMI). Those two laser beams then respectively traverse two adjacent waveguides where four thermal PMs load operands into the produced phase difference. The magnitude of the phase difference induced by a specific voltage value applied to the PM can be precisely controlled by the method described in AppendixA. Then, the two laser beams with a phase difference carrying the calculation result of those operands enter into an MMI-structured optical PDC (OPDC) to identify and quantify phase information. Fig.2b displays a micrograph of the OPDC with five output channels where the two input laser beams take place interference based on the self-imaging effect [34]. That makes the optical power at each output channel capture different profiles of the phase information $\Delta\varphi$ . The relationship between the phase changes induced by varied control voltages applied to a modulator, PM0, (as depicted in Fig.2c) and the optical power of the five channels is shown in Fig.2d. Performing threshold decision on the optical power signal of a channel using comparators divides the 360-degree phase range into multiple phase intervals. This division occurs due to a change in the binary codewords (0s and 1s) generated by that channel. By performing decision on the optical power of each channel, the specific phase interval to which the current phase belongs can be determined according to the obtained Gray code digital output [32]. For accurately discriminating the information encoded in the phase, we introduce an additional static phase bias $\varphi_{b}$ and optimize it along with the decision thresholds (see AppendixB).

Fig.2c-f demonstrates the experimental process of the loading and extracting phase information for the case of $m=10$ . The voltages corresponding to different discrete phases in $\varphi_{b}+\Delta\varphi_{\mathbb{Z}}$ are applied to the modulator in varying time slots. By recovering the phase value from the multiple-channel optical power signals in Fig.2e, it can be observed that the phase is controlled to be in the middle of the expected phase intervals, as shown in Fig.2f. Furthermore, based on the digital results of the threshold decision to the optical power of multiple channels, the specific discrete phase and its corresponding integer value for each time slot are discerned. The methods of phase information loading and extraction described above are applicable to different moduli. This flexibility enables the instantiation the PPMAP units corresponding to different moduli using a single integrated photonic device.

We further evaluated the practicality of PPMAP architecture by experimentally implementing a 9-bit three-operand addition operation using moduli $\{10,9,7\}$ . During the experiment, the same physical PPMAP unit depicted in Fig2 performed all the calculations involving different moduli, i.e., implementing three PPMAP units utilized one device via time division multiplexing. Moreover, the optimization of $\varphi_{b}$ and $\vec{d}$ for each modulus was performed beforehand. We selected seven sets of independent operand samples for this demonstration and performed $A+B+C$ operation on each set separately. The bit width of the binary operands is $\left\lfloor\log_{2}\left(10\times 9\times 7\right)\right\rfloor=9$ bit. After converting into RNS format, the original multi-operand addition operation transformed into three corresponding modular arithmetic operations with different moduli of $\{10,9,7\}$ .

For each specific modulus, the operations on different sample sets perform consecutively at varying time slots. For implementing the operation of a single set in the phase domain, we simultaneously apply the voltages corresponding to the converted operands to multiple PMs. As shown in Fig.3a, we apply a constant voltage to PM0 to assign it the responsibility of providing the bias phase $\varphi_{b}$ . PM1, PM2, and PM3 are also assigned to accurately map operands A, B, and C to their respective phase change amounts. The mapping method in the case of $m=9$ is illustrated in Fig.3b as an example: since PM2 and PM0 locate on identical waveguides and $\varphi_{b}$ has been provided by PM0, we directly control the voltage of PM1 to induce a phase change of $\Delta\varphi_{X}=X\cdot 2\pi/m$ for operand $X$ . On the other hand, since PM1 and PM3 are located on different waveguides than PM0, it is necessary to control the voltage to generate a phase change equal to $2\pi-\Delta\varphi_{X}$ when mapping X to phase $\Delta\varphi_{X}$ . The applied voltage signals are shown in Fig.3c, and their consequent calculation results in the phase domain are illustrated in Fig.3d. For a specific time slot, the phase values fall within the phase interval corresponding to the exact computed results, performing the desired calculations in the phase domain. Fig.3e displays the integer signal generated by the OPDC, which infers the precise extraction of the computation result in RNS format from the phase domain.

The RNS result can be mapped into the binary format using the reverse conversion, such as the Chinese Remainder Theorem (CRT) algorithm as demonstrated in Fig.3. Calculations performed in RNS format agree with binary ones as long as the final result does not exceed $M=630$ . It is worth mentioning that the relatively complex reverse conversion process is unessential. It can be avoided by carefully selecting the workload and employing advanced residue interaction methods [30]. Moreover, recent research in computer architecture verifies that, including memory hierarchy, all computer parts can efficiently run within the RNS format [35].

2.3 Optical Digital-to-Phase Converters

To truly resolve the “converter challenge”, in addition to the OPDC, dedicated electronic-photonic DAC modules that can convert digital operands into the analog phase domain are crucial. Since a DAC essentially performs a weighted summation with fixed weight on a digital operand, denoted as $X$ , and concurrently transfers the result into analog quantity $\Psi$ :

X=\left(x_{L-1}x_{L-2}\ldots x_{0}\right)_{2}\rightarrow\Psi_{X}\left(\sum_{i=0}^{L-1}2^{i}x_{i}\right)

(2)

where $x_{i}=0\text{ or }1$ and $L$ represents the bit-width of $X$ , we can leverage the just-demonstrated multi-operand feature of PPMAP architecture to implement the optical digital-to-phase converter (ODPC).

An ODPC comprises a collection of PMs positioned along the paths of the PPMAP unit. It receives the digital signal of one operand from the parallel bus. In the basic design of the ODPC, it is equipped with $L$ PMs that are respectively controlled by each binary digit $x_{i}$ in the $X$ . Each PM in the ODPC is engineered to exhibit a specific phase response to the voltage amplitude of the digital signal. In particular, when the phase response of the $i^{th}$ PM is designed as $\Delta\varphi_{i}=2\pi|2^{i}|_{m}/m$ , the OPDC enables the direct transformation of the $L$ synchronously applied digital signal into the resulting summation in the phase domain:

\Delta\varphi_{X}=\sum_{i=0}^{L-1}x_{i}\Delta\varphi_{i}=\sum_{i=0}^{L-1}x_{i}|2^{i}|_{m}\frac{2\pi}{m}=\frac{2\pi}{m}\sum_{i=0}^{L-1}|2^{i}x_{i}|_{m}=\frac{2\pi}{m}|X|_{m}

(3)

When the digital operand is in RNS format, satisfying $X<m$ , the OPDC produces the phase value $\Delta\varphi=2\pi X/m$ , performing the role of a DAC in the phase domain. Furthermore, when the digital operand $X$ exceeds the modulus $m$ , the ODPC performs the digital-to-analog conversion and the format conversion simultaneously.

The experimental demonstration of a 4-bit ODPC along with an OPDC, as shown in Fig.4, validated the successful operation of this photonic conversion interface. In this experiment, digital operands representing 4-bit binary codes ranging from 0 to 15 were sequentially transmitted to the corresponding phase modulators PM0, PM1, PM2, and PM3. With $m=7$ , the phase modulators PM1, PM2, and PM3 generated phase changes of 0 or $2\pi|2^{i}|_{7}/7$ based on the values of the digital signals (0 or 1) at each bit position $i$ , while PM0 generated phase changes of $\varphi_{b}$ or $\varphi_{b}+2\pi|2^{i}|_{7}/7$ , providing an extra bias phase simultaneously. The experimental results show that the OPDC successfully performed the modulo-7 remainder calculation on the 4-bit binary operands and accurately transferred the results to analog phase values.

In the generalized ODPC design, low-bit-width DACs are utilized to achieve a larger overall bit width with the given number of PMs (see Appendix C). As integrated intensity modulators, such as microring resonators, exhibit poor linearity, achieving accurate mapping of operands with a given precision into transmissivities typically requires the utilization of higher precision DACs [36]. In contrast, the favorable linearity of phase modulation and the characteristics of phase summation enable the decomposition of high-bit-width digital-to-analog conversion tasks into parallel low-bit-width tasks, which can significantly reduce system costs and delays [26] and endow the PPMAP for more efficient processing capabilities.

3 Discussion

The PPMAP architecture offers superior calculation precision compared to intensity-based photonics computing methods. While intensity-based methods heavily rely on the system’s SNR for calculation precision, PPMAP leverages the RNS theory to distribute the calculation precision across multiple physical implementations. This decomposition substantially reduces the dependence on SNR and enables a relatively high calculation precision. Section 2.2 has demonstrated the superior calculation precision of PPMAP (9 bits) compared to intensity-based methods. Now, the focus is on further analyzing the impact of noise on both approaches by conducting simulations to compare their performance in a given 15-bit multiplication scenario.

In the simulations, the specific-setted PPMAP contains four units corresponding to the chosen moduli set of $\{7,11,19,23\}$ to implement the 15-bit index-sum multiplication. Each PPMAP unit is equipped with OPDC with $\{3,5,9,11\}$ channels where additive Gaussian white noise applies to its optical power signal. Under different SNR conditions, 40,000 pairs of randomly generated integers were selected as the operands for the multiplication operation. The simulation results presented in Fig.5a indicate that the relative error and relative frequency of errors in PPMAP decrease significantly with increasing SNR. At an SNR of 31 dB, error-free 15-bit calculation was achieved in all 40,000 samples. On the other hand, the intensity-based method’s improvement in relative error with respect to SNR is less prominent, and it falls far short of achieving an 8-bit calculation precision.

Furthermore, to quantify the reliability of the PPMAP architecture, a probability model (Equation 4) was developed (detailed in AppendixD):

{\rm P}=1-\prod_{i=1}^{N}\prod_{j=1}^{K_{i}}\frac{1}{2}\left[1+{\rm erf}(\frac{\sqrt{3}\sin\left[\frac{\pi}{K_{i}}(j-\frac{1}{2})\right]}{2\sqrt{2}\cdot 10^{-\frac{{\rm SNR}}{20}}})\right]

(4)

As shown in Fig.5b, the predicted probabilities of system errors at different SNR levels aligned well with the simulated relative frequency of errors. With the addition of error correction mechanisms, such as Triple Modular Redundancy (TMR) [37], the error probability can be reduced to impressively low levels. For example, with an SNR of approximately 34 dB and employing TMR, the system can achieve an error probability as low as $3{\rm P}^{2}-2{\rm P}^{3}\approx\text{3e-20}$ [38], indicating a mean time between failure (MTBF) [39] of around ten years when continuously operating at a frequency of 100 GHz. This level of reliability is comparable with commercial electronic processors [40], which makes PPMAP a promising candidate for performing high-precision and reliable computation tasks.

4 Conclusion

We have proposed PPMAP, an integrated photonic computing architecture that performs modular arithmetic based on the optical phase and RNS. The architecture is capable of addition, subtraction, and multiplication and supports multi-operand operations. PPMAP can achieve relatively high precision, and a 9-bit calculation was experimentally demonstrated. Further simulation extends its calculation precision to 15 bits and indicates that its reliability can approach the level of commercial processors under the SNR condition of 34 dB when using error correction mechanisms. Moreover, PPMAP seamlessly integrates our proposed ODPC and OPDC, which holds the potential to overcome the long-standing limitations imposed by electronic AD/DA interfaces in photonic computing, and a proof-of-concept experiment demonstrated the feasibility of this photonic conversion interface. Further optimization of PPMAP’s functional module needs to be explored to enhance the overall system performance.

\bmhead

Supplementary information

The supplementary is located in the Appendix of this article.

\bmhead

Acknowledgments

Appendix A Phase Recovery from Multi-Channel Optical Power Signals

To recover the phase difference between two paths, $\varphi\triangleq\Delta\varphi+\varphi_{0}$ , from multi-channel optical power signals $\vec{I}$ , it is necessary to have knowledge of the corresponding multi-channel optical power states at different values of $\varphi$ beforehand. That requires utilizing voltage as an intermediate quantity and obtaining two mappings:

1.

Mapping between the applied voltage to PM and resulting $\vec{I}$
2.

Mapping between the voltage and its produced $\varphi$

An experimental procedure was conducted to access the relationship between $\vec{I}$ and $\varphi$ . The voltage applied to PM0 was varied from 0V to 10V with a voltage increment of 0.01V. Then the first mapping is obtained by recording the multi-channel optical power values at different voltage levels. The optical power for the $k$ -th channel and produced phase variance at voltage index $i$ are denoted as $I_{k,i}$ and $\Delta\varphi_{i}$ . Moreover, the establishment of the second mapping is based on the prior knowledge of the photonics device. Theoretically, the variation of output optical power in $K$ -channels optical quantizer follows a cosine function with respect to the phase [31] :

	$\displaystyle I_{k,i}$	$\displaystyle=\frac{P_{\rm chip}}{K}\cdot\left(1+\cos\varphi^{k}_{i}\right)=\frac{P_{\rm chip}}{K}\cdot\left[1+\cos(\Delta\varphi_{i}+\varphi_{0}^{k})\right],$		(5)
	$\displaystyle\varphi_{0}^{k}$	$\displaystyle=\begin{aligned} \left\{\begin{aligned} &\varphi_{0}-\frac{k}{2}\cdot\frac{2\pi}{K}&\text{for even }k\\ &\varphi_{0}+\frac{k-1}{2}\cdot\frac{2\pi}{K}&\text{for odd }k\\ \end{aligned}\right.\end{aligned}$		(5)

, where $P_{\rm chip}$ is the power of the input laser coupled to the chip. The multiple-channel signals are analyzed individually. Firstly, The signal from the $k$ -th channel, denoted as $I_{k}$ , is scaled to [-1,1] to conform to the input range of the arccosine function:

\widehat{I}_{k}=2\times\frac{I_{k}}{\max(I_{k})}-1

(6)

As the arccosine function is not monotonically changing with respect to the phase $\Delta\varphi$ , the relationship of $\varphi_{i}^{k}$ is estimated based on the absolute value of the variation of the arccos function:

\varphi_{i}^{k}=\left\{\begin{aligned} &\varphi_{i}^{k^{\prime}}\triangleq{\rm arccos}\ \widehat{I}_{k,i}&i=0\\ &\varphi_{i-1}^{k}+|\varphi_{i+1}^{k^{\prime}}-\varphi_{i}^{k^{\prime}}|&i\neq 0\end{aligned}\right.

(7)

As that accumulation-based approach introduces errors that accumulate over time, a further process is employed to enhance the precision. This process utilizes the ratio of normalized intensity signals with adjacent voltages, denoted as $R_{k,i}=\widehat{I}_{k,i+1}/\widehat{I}_{k,i}$ :

\varphi_{i}^{k}=\arctan\left[\frac{\cos\left(\delta_{i}^{k}\right)-R_{k,i}}{\sin\left(\delta_{i}^{k}\right)}\right]+N\pi

(8)

where $\delta_{i}^{k}=\varphi_{i+1}^{k}-\varphi_{i}^{k}$ is the finite difference of phase with respect to voltage, which along with the integer value $N$ is determined by the previous estimation obtained through Eq.7. This process provides finer estimates within the $\pi$ interval based on the coarse-grained results provided in the previous step. Given the availability of multiple channels, the final step involves an averaging of the phase relationships obtained from each channel :

\varphi=\varphi_{0}+\overline{\Delta\varphi_{i}}=\varphi_{0}+\frac{1}{K}\sum_{k=1}^{K}\varphi_{i}^{k}-\varphi_{0}^{k}

(9)

With the obtained mapping between $\vec{I}$ and $\varphi$ , the phase at a given moment can be determined from the measured optical power. Expressly, the current $\varphi$ is specified as the phase value corresponding to the previously recorded optical power with the minimum Euclidean norm compared to the currently measured optical power. It is worth noting that in Fig.2f, Fig.3d, and Fig.4, there are waveform spikes observed in the obtained phase signals at slot boundaries. These spikes result in significant deviations in 1 to 2 sampling points. These deviations are due to synchronization issues related to the multi-channel voltage sources and multi-channel optical detectors in our experimental setup.

Appendix B OPDC Threshold Optimization

Given the multi-channel optical power signals $\vec{I}(\varphi)=\left[I_{1}(\varphi),I_{2}(\varphi),\dots,I_{K}(\varphi)\right]$ , which were obtained from the experimental procedure in AppendixA, and the thresholds of comparators for the $K$ channels $\vec{t}=\left[t_{1},\dots t_{K}\right]$ . The intersection of $t_{k}$ with its corresponding $I_{k}(\varphi)$ determines two boundary points in the 360-degree phase range. The sorted boundaries of all channels is denoted as $\vec{p}_{{\rm expt}}=\vec{p}_{{\rm expt}}(\vec{I},\vec{t})$ , which represents the actual phase boundary points. And in the ideal scenario, the phase boundaries $\vec{p}_{{\rm ideal}}$ should shifted by half interval $\frac{2\pi}{2m}$ to the discrete phase points $\varphi_{0}^{{}^{\prime}}+\Delta\varphi_{\mathbb{Z}}$ , where the total static phase $\varphi_{0}^{{}^{\prime}}$ is adjusted by applied an extra bias phase $\varphi_{b}$ . Thus the ideal phase boundaries can be described as $p_{ideal}^{i}=\varphi_{0}+\varphi_{b}+i\cdot\frac{2\pi}{m}+\frac{\pi}{m}$ .

To accurately extract the phase information and reduce the probability of incorrect quantization operation, an optimization procedure was performed to find the optimal values of $\varphi_{b}$ and $\vec{t}$ that minimize the discrepancy between $\vec{p}_{{\rm ideal}}(\varphi_{b})$ and $\vec{p}_{{\rm expt}}(\vec{I},\vec{t})$ . This optimization aims to enhance the robustness of OPDCs by placing the most error-prone phase point in a less vulnerable position. The optimization problem can be formulated as follows:

\left[\varphi_{b}^{\ast},\vec{t^{\ast}}\right]_{\vec{I}}=\arg\min_{\varphi_{b},\vec{t}}\ \max_{i}\left\lvert p_{{\rm ideal}}^{i}(\varphi_{b})-p_{{\rm expt}}^{i}(\vec{I},\vec{t})\right\rvert

(10)

The optimization objective is to minimize the loss $\triangleq\lVert\vec{p}_{{\rm ideal}}(\varphi_{b})-\vec{p}_{{\rm expt}}(\vec{I},\vec{t})\rVert_{\infty}$ . Solving this optimization problem using gradient-based methods can be challenging. Therefore, a heuristic optimization method, the differential evolution (DE) algorithm [44] is employed. The DE solver provides multiple trial values for $\vec{t}$ at each iteration. A search algorithm was performed for each trial threshold to determine the minimized loss under a specific search range of $\varphi_{b}$ with a given precision. Then those minimum values were passed back to the DE solver for further exploration in the next iteration.

In the practical experiments, the DE solver had a population size of 20 and was run for 40 iterations. Fig.6 presented the iteration process of threshold optimization for $m=7$ and $m=10$ in the case of 5-channels OPDC. When modulus $m=2K$ , the intervals tend to be evenly divided. In our previous works [31], we adopted a uniform threshold approach, where the thresholds for all channels were set to half of the normalized power. This approach equally divides the phase range $2\pi$ into $2K$ intervals in the ideal case. However, as shown in Fig.6a, in real scenarios where manufacturing variations exist, the optimized threshold approach performs better than the uniform threshold approach. Moreover, the optimized threshold approach can adopt different moduli with the same optical device. When modulus $m<2K$ (Fig.6b), there is a need to merge the remaining phase intervals to unify them into uniformly divided intervals of $m$ . This merging process involves mapping multiple different codewords to the same integer.

Fig.7a,b displays the optimized results for $m=9$ and $m=7$ . It is worth noting that for $m=7$ , it is possible to utilize information from 5 optical power signals to quantify the phase information. However, signals from only channels 1, 3, 4, and 5 were employed to reduce the length of the codewords in our experimental setup. The experimental results demonstrate that this approach still yields satisfactory outcomes. Moreover, in Fig.7c, we attach the multi-channel optical power signal of the experiment in Section2.3, which uses the optimized parameters corresponding to figure7b. The time-domain waveforms and the digital results of threshold decision confirm the effectiveness of the optimization method.

Appendix C Generalized Design of ODPC

The incorporation of high linearity phase modulators, e.g., lithium niobate modulators [45], enhances the capabilities of the ODPC by allowing it to be combined with low-bit-width DACs, thus achieving a larger overall bit width in the limitation of given the number of PMs. To accomplish this, the $L$ -bit digital operand is partitioned into G groups corresponding to G DACs, where the $i$ -th DAC has a bit width of $W_{i}$ (with $\Sigma_{i=1}^{G}W_{i}=L$ ). The phase response of the $i$ -th PM to the least significant bit (LSB) voltage of the $i$ -th DAC is set to $\Delta\varphi_{i}^{{}^{\prime}}=2\pi|2^{w_{i}}|{m}/m$ , where $w_{i}=\Sigma_{j=1}^{i-1}W_{j}$ represents the bit position of the LSB of the i-th DAC in the original L-bit operand. By mapping the digital operand $X$ using the DACs and the corresponding PMs, the resulting $\Delta\varphi_{X}$ is corresponding in Eq.3:

\Delta\varphi_{X}=\sum_{i=1}^{G}\sum_{j=0}^{W_{i}-1}2^{j}x_{j}\Delta\varphi_{i}^{{}^{\prime}}=\frac{2\pi}{m}\sum_{i=1}^{G}\sum_{j=0}^{W_{i}-1}2^{w_{i}}|2^{j}x_{j}|_{m}=\frac{2\pi}{m}\sum_{j=0}^{L-1}|2^{j}x_{j}|_{m}=\frac{2\pi}{m}|X|_{m}

(11)

The basic design illustrated in Fig.4 is the specific case in which $W_{i}$ is limited to 1. The PPMAP unit, which incorporates the generalized ODPCs, is depicted in Fig.8a.

For a given electro-optic coefficient of the electro-optic material in the modulator and a given modulus, $m$ , the objective of the grouping strategy is to minimize the magnitude of the modulation phase, which contributes to reduceing the size and power consumption of the modulator [45].

According to Eq.11, the LSB phase response of the $i^{{\rm th}}$ modulator for the DAC should be $2\pi|2^{w_{i}}|_{m}/m$ . To achieve this, we need to choose a grouping method for the L-bit digital signal such that $|2^{w_{i}}|_{m}$ is as close to zero as possible.

Being close to zero implies being in proximity to either 0 or $m$ because a digital value $|2^{i}|_{m}$ close to m can be achieved by modulating the phase of the digital value $m-|2^{i}|_{m}$ on the opposite waveguide. That is achieved by subtracting a small number from a large number in the modulo operation. Fig.8b illustrates examples of bit grouping for moduli of 15, 19, 21, 30, 31, and 33, where the maximum modulation phase of any PM, $(2^{W_{i}}-1)\times\min\left(|2^{i}|_{m},m-|2^{i}|_{m}\right)$ , is designed to around $2\pi$ . We have observed that the bit weights $|2^{i}|_{m}$ can be close to zero by employing an appropriate grouping strategy. And for some moduli, $|2^{i}|_{m}$ exhibits periodicity. The specific analysis for the ODAC design depends on the bit width and performance requirements of the DAC, the characteristics of modulators, and the specific modulus.

Appendix D Analysis of Calculation Precision and Reliability

In Fig5a, the relative error (RE) is defined as the average absolute value to error over the dynamic range. The RE of a calculation with $n$ -bit precision is defined as the $n$ -bit relative quantization error of the calculation result:

{\rm RE}_{n}=\min_{a\in\mathbb{R}}\frac{\int_{a}^{\delta+a}|x|\cdot\frac{1}{\delta}\,dx}{\delta\cdot(2^{{n}}-1)}=\frac{1}{2^{{n+2}}-4}\approx 2^{-(n+2)}

(12)

And the theoretical RE of the intensity-based method under Gaussian noise with uniform intensity is given by:

{\rm RE}_{I}=\frac{\int_{-\infty}^{\infty}|x|f(x)\,dx}{A_{\max}}=\frac{2\int_{0}^{\infty}x\cdot\frac{1}{\sigma\sqrt{2\pi}}e^{\frac{x^{2}}{2\sigma^{2}}}\,dx}{A_{\max}}=\frac{2}{\sqrt{2\pi}}(\frac{\sigma}{A_{\max}})

(13)

where $f(x)$ represents the probability density function of the normal distribution and $A_{\max}$ is the maximum signal amplitude. The ratio of $\sigma$ and $A_{\max}$ is determined by its relationship with SNR and peak-to-average power ratio (PAPR):

{\rm SNR}=20\log_{10}(\frac{A_{\max}}{\sigma})-{\rm PAPR}

(14)

Consequently, the relationships of RE between SNR as well as the calculation precision expressed in the form of the bit length, $n$ , is inferred:

\log_{10}{\rm RE}_{I}=\log_{10}(\frac{2}{\sqrt{2\pi}})-\frac{{\rm PAPR}}{20}-\frac{{\rm SNR}}{20}=-(n+2)\cdot\log_{10}2

(15)

The logarithm of the RE decreases linearly with SNR and $n$ . Assuming the calculation results follow the uniform distribution, it has a PAPR of 4.77 dB (i.e., $A_{\max}=\sqrt{3}A_{{\rm rms}}$ ), Eq.15 can be simplified to:

{\rm SNR}\approx 6.02\cdot n+5.31

(16)

This equation shows that as $n$ increases, the required SNR increases linearly with a slope of 6.02, corresponding to the numerical simulation shown in Fig5a. It is worth noting that the constant term in the equation may differ depending on the specific metrics of calculation error used. However, the coefficient of the linear term remains similar. For example, in the reference [22], they directly employ ${\rm SNR}=20\log_{10}2^{n}=6.02\cdot n$ , which aligns with the derived relationship in Eq.16. That linear term indicates that as the calculation precision increases, the required SNR will reach an unattainable level and imply an exponential increase in energy consumption [21].

For the PPMAP architecture, the probability of the occurrence of errors, denoted as ${\rm P}$ , is equal to 1 minus that of the PPMAP system successfully operating, which means that there should be no errors in any of the $N$ PPMAP units. Moreover, the probability of the $i$ -th PPMAP unit not encountering errors, denoted as ${\rm P}^{{\rm unit}}_{i}$ , is determined by accurately achieving threshold decisions for the output optical power on all the $K_{i}$ channels. The success rate of the $j$ -th channel is represented as ${\rm P}^{\rm chan}_{i,j}$ .

\displaystyle{\rm P}=1-\prod_{i=1}^{N}{\rm P}_{i}^{{\rm unit}}=1-\prod_{i=1}^{N}\prod_{j=1}^{K_{i}}{\rm P}_{i,j}^{{\rm chan}}

(17)

In the simulation, each chosen modulus, i.e., 7, 11, 19, 23, is prime, denoted as $p$ . In this case, an index-sum modular multiplication operation is implemented via a modular addition operation with moduli $p-1$ [46], equal to twice the designed $K_{i}$ in each PPMAP unit. That makes for all discrete phase points in $\Delta\varphi_{\mathbb{Z}}$ , the relative distance, $\left\lvert d_{j}\right\rvert/A_{\max}$ , between their corresponding ideal output optical power and decision threshold in all $K_{i}$ channels are given by [32].

\left\{\frac{\left\lvert d_{j}\right\rvert}{A_{\max}}\bigg{|}\ j\in\mathbb{Z},1\leq j\leq K_{i}\right\}=\left\{\frac{1}{2}\sin\left[\frac{\pi}{K_{i}}(j^{{}^{\prime}}-\frac{1}{2})\right]\bigg{|}\ j^{{}^{\prime}}\in\mathbb{Z},1\leq j^{{}^{\prime}}\leq K_{i}\right\}

(18)

Considering the relative variance noise $A_{\max}/\sigma=\sqrt{3}A_{{\rm rms}}/\sigma=\sqrt{3}\cdot 10^{\frac{{\rm SNR}}{20}}$ , the success rate for the $j^{{}^{\prime}}$ -th channel in the $i$ -th unit:

{\rm P}_{i,j^{{}^{\prime}}}^{{\rm chan}}={\Phi}\left(\frac{\sqrt{3}}{2}\sin\left[\frac{(j^{{}^{\prime}}-\frac{1}{2})\pi}{K_{i}}\right]\cdot 10^{\frac{{\rm SNR}}{20}}\right)

(19)

where ${\Phi}(x)=[1+$ erf $(x/\sqrt{2})]/2$ , i.e., the cumulative distribution function of standard Gaussian distribution $\mathcal{N}(0,1)$ . Then considering the commutative law of the continued multiplication in Eq.17, $\Pi_{j=1}^{K_{i}}{\rm P}_{i,j}^{{\rm chan}}=\Pi_{j^{{}^{\prime}}=1}^{K_{i}}{\rm P}_{i,j^{{}^{\prime}}}^{{\rm chan}}$ , the error model described in Eq.4 can be derived.

The deviation between the simulated relative frequency and the probability model around 31 dB in Fig.5 is since only a few samples out of the 40,000 experienced errors near an SNR of 31 dB, which does not satisfy the prerequisites of the law of large numbers. Apart from this specific range, the relative frequency obtained in our simulations aligns well with our probability model, confirming our model’s accuracy.

Appendix E The Expansibility of OPDC

Although all the above experiment is carried out on the same PPMAP chip with five-channel OPDC, the number of the output channel in OPDC is adjustable. To demonstrate the expansibility of OPDC, we design and fabricate an OPDC module with nine output channels. Fig.9 shows its structural rending, micrograph, and test results. Taking the modulus equal to 18 as an example, the deviation between the experimental and ideal boundaries is relatively small, indicating that the expanded OPDCs can reliably extract the calculation result from phase information.

Besides horizontal expansion by adding the number of output channels, OPDC can be vertically expanded with a cascaded structure [47], which possesses relatively low wavelength sensitivity. Furthermore, the OPDC fabricated on a thin-film lithium niobate platform exhibits the same performance on phase identifying [48], indicating the potential for monolithic integration of the overall PPMAP system on lithium niobate process platform. It should be emphasized that the Gray code generated by OPDC requires an additional codeword conversion unit to transform into the natural binary code for subsequent calculations. However, specific-designed photonic structure [41, 42, 43, 20] has the potential to implement the fixed codeword conversion task.

Appendix F Experimental Method

We employed a Tunable Semiconductor Laser (Santec TSL 570) to provide the incident CW laser at the wavelength of 1535 nm, and a Multi-port Optical Power Meter (Santec MPM 210) to measure the multi-channel optical power signals. After the electrical packaging of the integrated photonic chip, metal wires connect the electrodes on the chip to the FPC (Flexible Printed Circuit) adapter board. We employed a multi-channel programmable voltage source (nicslab XDAC-40MUB-R4G8) to control the applied voltages on the electrodes. The experimental equipment and the host computer were connected via GPIB interface and Ethernet cables, which facilitated the transmission of control signals and the acquisition of experimental data. In-house software was employed to perform real-time monitoring of the optical power signals from each port, which allowed evaluate system stability and monitor the polarization state of the incident light for controlling the coupling efficiency.

References

\bibcommenthead
Hennessy and Patterson [2018] Hennessy, J., Patterson, D.: A new golden age for computer architecture: domain-specific hardware/software co-design, enhanced. In: ACM/IEEE 45th Annual International Symposium on Computer Architecture (ISCA) (2018)
Esmaeilzadeh et al. [2011] Esmaeilzadeh, H., Blem, E., St. Amant, R., Sankaralingam, K., Burger, D.: Dark silicon and the end of multicore scaling. In: Proceedings of the 38th Annual International Symposium on Computer Architecture, pp. 365–376 (2011)
Thompson et al. [2022] Thompson, N.C., Ge, S., Manso, G.F.: The importance of (exponentially more) computing power. arXiv preprint arXiv:2206.14007 (2022)
Capmany and Novak [2007] Capmany, J., Novak, D.: Microwave photonics combines two worlds. Nature photonics 1(6), 319 (2007)
Yang et al. [2018] Yang, H., Khayrudinov, V., Dhaka, V., Jiang, H., Autere, A., Lipsanen, H., Sun, Z., Jussila, H.: Nanowire network–based multifunctional all-optical logic gates. Science advances 4(7), 7954 (2018)
Caballero et al. [2022] Caballero, L.P., Povinelli, M.L., Ramirez, J.C., Guimarães, P.S., Neto, O.P.V.: Photonic crystal integrated logic gates and circuits. Optics Express 30(2), 1976–1993 (2022)
Ying et al. [2020] Ying, Z., Feng, C., Zhao, Z., Dhar, S., Dalir, H., Gu, J., Cheng, Y., Soref, R., Pan, D.Z., Chen, R.T.: Electronic-photonic arithmetic logic unit for high-speed computing. Nature communications 11(1), 1–9 (2020)
Sebastian et al. [2020] Sebastian, A., Le Gallo, M., Khaddam-Aljameh, R., Eleftheriou, E.: Memory devices and applications for in-memory computing. Nature nanotechnology 15(7), 529–544 (2020)
Shafiee et al. [2016] Shafiee, A., Nag, A., Muralimanohar, N., Balasubramonian, R., Strachan, J.P., Hu, M., Williams, R.S., Srikumar, V.: Isaac: A convolutional neural network accelerator with in-situ analog arithmetic in crossbars. In: 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA), pp. 14–26 (2016)
Mehonic and Kenyon [2022] Mehonic, A., Kenyon, A.J.: Brain-inspired computing needs a master plan. Nature 604(7905), 255–260 (2022)
Nahmias et al. [2019] Nahmias, M.A., De Lima, T.F., Tait, A.N., Peng, H.-T., Shastri, B.J., Prucnal, P.R.: Photonic multiply-accumulate operations for neural networks. IEEE Journal of Selected Topics in Quantum Electronics 26(1), 1–18 (2019)
Filipovich et al. [2022] Filipovich, M.J., Guo, Z., Al-Qadasi, M., Marquez, B.A., Morison, H.D., Sorger, V.J., Prucnal, P.R., Shekhar, S., Shastri, B.J.: Silicon photonic architecture for training deep neural networks with direct feedback alignment. Optica 9(12), 1323–1332 (2022)
Pai et al. [2023] Pai, S., Park, T., Ball, M., Penkovsky, B., Dubrovsky, M., Abebe, N., Milanizadeh, M., Morichetti, F., Melloni, A., Fan, S., et al.: Experimental evaluation of digitally verifiable photonic computing for blockchain and cryptocurrency. Optica 10(5), 552–560 (2023)
Meng et al. [2023] Meng, X., Zhang, G., Shi, N., Li, G., Azaña, J., Capmany, J., Yao, J., Shen, Y., Li, W., Zhu, N., et al.: Compact optical convolution processing unit based on multimode interference. Nature Communications 14(1), 3000 (2023)
Zhou et al. [2023] Zhou, W., Dong, B., Farmakidis, N., Li, X., Youngblood, N., Huang, K., He, Y., David Wright, C., Pernice, W.H., Bhaskaran, H.: In-memory photonic dot-product engine with electrically programmable weight banks. Nature Communications 14(1), 2887 (2023)
Xu et al. [2022] Xu, S., Wang, J., Yi, S., Zou, W.: High-order tensor flow processing using integrated photonic circuits. Nature Communications 13(1), 7970 (2022)
Ashtiani et al. [2022] Ashtiani, F., Geers, A.J., Aflatouni, F.: An on-chip photonic deep neural network for image classification. Nature, 1–6 (2022)
Shen et al. [2017] Shen, Y., Harris, N.C., Skirlo, S., Prabhu, M., Baehr-Jones, T., Hochberg, M., Sun, X., Zhao, S., Larochelle, H., Englund, D., et al.: Deep learning with coherent nanophotonic circuits. Nature photonics 11(7), 441–446 (2017)
Ashtiani et al. [2022] Ashtiani, F., Geers, A.J., Aflatouni, F.: An on-chip photonic deep neural network for image classification. Nature 606(7914), 501–506 (2022)
Fu et al. [2023] Fu, T., Zang, Y., Huang, Y., Du, Z., Huang, H., Hu, C., Chen, M., Yang, S., Chen, H.: Photonic machine learning with on-chip diffractive optics. Nature Communications 14(1), 70 (2023)
Tait [2022] Tait, A.N.: Quantifying power in silicon photonic neural networks. Physical Review Applied 17(5), 054029 (2022)
Xu et al. [2021] Xu, X., Tan, M., Corcoran, B., Wu, J., Boes, A., Nguyen, T.G., Chu, S.T., Little, B.E., Hicks, D.G., Morandotti, R., et al.: 11 tops photonic convolutional accelerator for optical neural networks. Nature 589(7840), 44–51 (2021)
Krishnamoorthi [2018] Krishnamoorthi, R.: Quantizing deep convolutional networks for efficient inference: A whitepaper. arXiv preprint arXiv:1806.08342 (2018)
Garg et al. [2022] Garg, S., Lou, J., Jain, A., Guo, Z., Shastri, B.J., Nahmias, M.: Dynamic precision analog computing for neural networks. IEEE Journal of Selected Topics in Quantum Electronics 29(2: Optical Computing), 1–12 (2022)
[25] Murmann, B.: ADC Performance Survey 1997-2023. [Online]. Available: https://github.com/bmurmann/ADC-survey
Morales Chacón et al. [2022] Morales Chacón, O., Wikner, J.J., Svensson, C., Siek, L., Alvandpour, A.: Analysis of energy consumption bounds in cmos current-steering digital-to-analog converters. Analog Integrated Circuits and Signal Processing 111(3), 339–351 (2022)
Dong et al. [2022] Dong, M., Clark, G., Leenheer, A.J., Zimmermann, M., Dominguez, D., Menssen, A.J., Heim, D., Gilbert, G., Englund, D., Eichenfield, M.: High-speed programmable photonic circuits in a cryogenically compatible, visible–near-infrared 200 mm cmos architecture. Nature Photonics 16(1), 59–65 (2022)
Kharel et al. [2021] Kharel, P., Reimer, C., Luke, K., He, L., Zhang, M.: Breaking voltage–bandwidth limits in integrated lithium niobate modulators using micro-structured electrodes. Optica 8(3), 357–363 (2021)
Nicholson [2012] Nicholson, W.K.: Introduction to Abstract Algebra. John Wiley & Sons, ??? (2012)
Omondi and Premkumar [2007] Omondi, A.R., Premkumar, A.B.: Residue Number Systems: Theory and Implementation vol. 2. World Scientific, ??? (2007)
Liu et al. [2021] Liu, C., Qiu, J., Tian, Y., Tao, R., Liu, Y., He, Y., Zhang, B., Li, Y., Wu, J.: Experimental demonstration of an optical quantizer with enob of 3.31 bit by using a cascaded step-size mmi. Optics Express 29(2), 2555–2563 (2021)
Tian et al. [2018] Tian, Y., Qiu, J., Huang, Z., Qiao, Y., Dong, Z., Wu, J.: On-chip integratable all-optical quantizer using cascaded step-size mmi. Optics Express 26(3), 2453–2461 (2018)
Deng et al. [2021] Deng, B., Srikanth, S., Jain, A., Conte, T.M., DeBenedictis, E., Cook, J.: Scalable energy-efficient microarchitectures with computational error tolerance via redundant residue number systems. IEEE Transactions on Computers 71(3), 613–627 (2021)
Soldano and Pennings [1995] Soldano, L.B., Pennings, E.C.: Optical multi-mode interference devices based on self-imaging: principles and applications. Journal of lightwave technology 13(4), 615–627 (1995)
Srikanth et al. [2018] Srikanth, S., Rabbat, P.G., Hein, E.R., Deng, B., Conte, T.M., DeBenedictis, E., Cook, J., Frank, M.P.: Memory system design for ultra low power, computationally error resilient processor microarchitectures. In: 2018 IEEE International Symposium on High Performance Computer Architecture (HPCA), pp. 696–709 (2018). IEEE
Zhang et al. [2022] Zhang, W., Huang, C., Peng, H.-T., Bilodeau, S., Jha, A., Blow, E., Lima, T.F., Shastri, B.J., Prucnal, P.: Silicon microring synapses enable photonic deep learning beyond 9-bit precision. Optica 9(5), 579–584 (2022) https://doi.org/10.1364/OPTICA.446100
Hochschild et al. [2021] Hochschild, P.H., Turner, P., Mogul, J.C., Govindaraju, R., Ranganathan, P., Culler, D.E., Vahdat, A.: Cores that don’t count. In: Proceedings of the Workshop on Hot Topics in Operating Systems, pp. 9–16 (2021)
Lyons and Vanderkulk [1962] Lyons, R.E., Vanderkulk, W.: The use of triple-modular redundancy to improve computer reliability. IBM journal of research and development 6(2), 200–209 (1962)
Lienig et al. [2017] Lienig, J., Bruemmer, H., Lienig, J., Bruemmer, H.: Reliability analysis. Fundamentals of electronic systems design, 45–73 (2017)
Sari and Psarakis [2011] Sari, A., Psarakis, M.: Scrubbing-based seu mitigation approach for systems-on-programmable-chips. In: 2011 International Conference on Field-Programmable Technology, pp. 1–8 (2011). IEEE
Khoram et al. [2019] Khoram, E., Chen, A., Liu, D., Ying, L., Wang, Q., Yuan, M., Yu, Z.: Nanophotonic media for artificial neural inference. Photonics Research 7(8), 823–827 (2019)
Zhu et al. [2022] Zhu, H., Zou, J., Zhang, H., Shi, Y., Luo, S., Wang, N., Cai, H., Wan, L., Wang, B., Jiang, X., et al.: Space-efficient optical computing with an integrated chip diffractive neural network. Nature communications 13(1), 1044 (2022)
Mohammadi Estakhri et al. [2019] Mohammadi Estakhri, N., Edwards, B., Engheta, N.: Inverse-designed metastructures that solve equations. Science 363(6433), 1333–1338 (2019)
Ahmad et al. [2022] Ahmad, M.F., Isa, N.A.M., Lim, W.H., Ang, K.M.: Differential evolution: A recent review based on state-of-the-art works. Alexandria Engineering Journal 61(5), 3831–3872 (2022)
Zhu et al. [2021] Zhu, D., Shao, L., Yu, M., Cheng, R., Desiatov, B., Xin, C.J., Hu, Y., Holzgrafe, J., Ghosh, S., Shams-Ansari, A., Puma, E., Sinclair, N., Reimer, C., Zhang, M., Lončar, M.: Integrated photonics on thin-film lithium niobate. Adv. Opt. Photon. 13(2), 242–352 (2021) https://doi.org/10.1364/AOP.411024
Ramírez et al. [2003] Ramírez, J., Meyer-Bäse, U., Taylor, F., García, A., Lloris, A.: Design and implementation of high-performance rns wavelet processors using custom ic technologies. Journal of VLSI signal processing systems for signal, image and video technology 34, 227–237 (2003)
Tian et al. [2023] Tian, Y., Kang, Z., He, J., Zheng, Z., Qiu, J., Wu, J., Zhang, X.: Cascaded all-optical quantization employing step-size mmi and shape-optimized power splitter. Optics & Laser Technology 158, 108820 (2023)
Tu et al. [2023] Tu, D., Huang, X., Yu, H., Yin, Y., Yu, Z., Wei, Z., Li, Z.: Photonic sampled and quantized analog-to-digital converters on thin-film lithium niobate platform. Optics Express 31(2), 1931–1942 (2023)