Towards a Low-SWaP 1024-beam Digital Array: A 32-beam Sub-System at 5.8 GHz
Millimeter wave communications require multibeam beamforming in order to utilize wireless channels that suffer from obstructions, path loss, and multi-path effects. Digital multibeam beamforming has maximum degrees of freedom compared to analog phased arrays. However, circuit complexity and power consumption are important constraints for digital multibeam systems. A low-complexity digital computing architecture is proposed for a multiplication-free 32-point linear transform that approximates multiple simultaneous RF beams similar to a discrete Fourier transform (DFT). Arithmetic complexity due to multiplication is reduced from the FFT complexity of $\mathcal{O}(N \log N)$ for DFT realizations down to zero, thus yielding a 46% and 55% reduction in chip area and dynamic power consumption, respectively, for the case considered. The paper describes the proposed 32-point DFT approximation targeting 1024 beams using a 2D array, and shows the multiplierless approximation and its mapping to a 32-beam sub-system consisting of 5.8 GHz antennas that can be used for generating 1024 digital beams without multiplications. Real-time beam computation is achieved using a Xilinx FPGA at 120 MHz bandwidth per beam. Theoretical beam performance is compared with measured RF patterns from both a fixed-point FFT and the proposed multiplier-free algorithm, and the two are in good agreement.
Keywords
Approximate beamforming, multibeams, digital arrays.
1 Introduction
The efficient formation of far-field antenna patterns simultaneously across a multitude of directions is crucially important for wireless communications, radio astronomy, imaging, radar, and electronic warfare. Multibeam beamforming has usually been achieved in the microwave domain using analog techniques (e.g., Rotman lenses [37] and Butler/Nolan matrices [31, 17]). Emerging mmW systems are considering hybrid multibeam beamforming due to its power efficiency and excellent performance for a reasonably small number of antenna elements and user streams [45, 44]. Although digital beamforming requires the control of each individual antenna element in an antenna array, it is promising for the future due to its many inherent advantages, which include [18]: i) maximum flexibility/reconfigurability; ii) easy system updates and support for new beamforming algorithms as they emerge; iii) precise control of both the gain and phase of individual antenna elements, thus giving better control of the beams; iv) maximum degrees of freedom from a given array; and v) reduced maintenance and calibration requirements.
Element-wise digital beamforming requires a dedicated receiver (or transmitter, in transmit mode) for each antenna element, which is usually a uniformly spaced linear or rectangular array of antennas. Multibeams can be generated by expanding the concept of a phased-array to multiple simultaneous directions by using the fact that each direction of propagation of a carrier wave is associated with two spatial frequencies across the two orthogonal coordinate axes of a rectangular array aperture. Multiple beam digital beamforming is desired at the lowest possible energy consumption for a given bandwidth, supply voltage, and technology node, which leads to domain-specific architectures that are optimized for low complexity and power consumption.
Here we propose approximate computing-based algorithms and computing architectures that achieve quasi-orthogonal RF beams without using any digital multiplier circuits. The multiplierless nature of the digital computing architectures allows low chip area/size, weight, and power consumption (SWaP) and avoids the need for digital multipliers, which have high circuit complexity (transistor count) and power consumption. This is likely to become more critical as wireless systems move to sub-terahertz frequencies and much wider channel bandwidths than used currently [35]. Algorithms that are multiplierless thus lead to substantially reduced SWaP in real-time digital silicon implementations [7, 35].
Multibeam beamforming on linear/rectangular apertures is important for exploiting multi-directional channels in massive-MIMO systems, for example in 5G mm-wave wireless networks. Such systems rely on the combination of beamforming with MIMO theory [21, 36, 12, 45, 35, 44] and as frequencies move to THz ranges, the need for providing thousands of simultaneous beams will emerge due to the small size of the wavelength and physical antenna aperture. Recent work has described different phased array architectures targeting 5G applications [46, 27, 25, 26, 11, 39, 29]. However, most of the literature has been concentrated either on hybrid beamforming systems or fully-analog architectures due to the prohibitive processing complexity of fully-digital beamforming. Other important applications for microwave and mm-wave digital multibeam beamformers include emerging defense applications such as space-based low earth orbit communications, mesh networks between micro satellites, space-based Internet distribution to densely populated areas, and multi-domain mosaic warfare where reliable high-speed wireless connectivity is needed across multiple platforms (air, space, land, and sea), as well as for emerging terahertz imaging systems [35]. The demands of high-capacity wireless networks for such applications can be significantly more difficult to meet than commercial 5G standards since modern military platforms can travel at hypersonic speeds across hundreds of kilometers. Such demanding wireless channel conditions necessitate beamforming gain across wide bandwidths and narrow angles of propagation (i.e., sharp beams) to both thwart detection and also benefit from beamforming gain. 
Furthermore, 5G networks will eventually require digital beamforming to reduce the overhead associated with the current 3GPP beam search time in the 5G frame structure. A great reduction in beam pointing time can be obtained by simultaneously searching the environment for the best pointing angle, but this is not yet supported in the 5G 3GPP standard [3].
Some recent work has focused on achieving element-wise fully-digital beamforming. The paper in [24] presents a low-power 8-element digital beamforming prototype based on bit-stream processing. The design uses a low-resolution architecture that replaces multipliers with multiplexers. This multiplexer-based architecture achieves lower power and smaller area than conventional digital beamformers, but the design is limited to a 20 MHz bandwidth with only two simultaneous output beams. Another recent paper [23] reports a 16-element 4-beam digital beamformer targeting large scale MIMO for 5G communications systems. It uses a multiplexer-based approach similar to that in [24], with an interleaved architecture to support a 100 MHz bandwidth. The work in [32, 49] also reports experimental verification of fully digital multibeam beamforming schemes targeting MIMO-based 5G implementations.
The paper [30] presents a spatial DFT-based digital multibeam beamforming implementation scheme for satellite communications. The authors' earlier work in [4] describes a low-complexity algorithm using the spatial DFT-based approach for generating 16 simultaneous beams using a ULA. The work in [4] uses a 16-point DFT approximation to generate the simultaneous beams and presents the measured beams of its fully digital implementation targeting 5G MIMO applications. This paper describes a novel low-complexity algorithm for realizing a massive number of simultaneous sharp beams, which are vitally important in coping with the rapid increases in path loss expected in future mmW/sub-THz wireless systems. In particular, we propose a low-SWaP approach to generating 1024 beams using a $32 \times 32$ aperture and ultra-low-complexity digital VLSI hardware. We propose a 32-beam subsystem based on a novel 32-point DFT approximation as the building block of such a system. The proposed 32-beam subsystem has been implemented at 5.8 GHz and the digital beams have been measured and compared with those from exact-DFT-based beams. The measured beams have been used to derive the beam patterns of the corresponding rectangular aperture by assuming identical element patterns in all directions.
The remainder of the paper is structured as follows. Section 2 describes the role of the DFT for spatial filtering in multibeam transceivers. Section 3 describes a 32-point approximate DFT algorithm with zero multiplicative complexity. Section 4 describes an experimental realization of a 32-element receive array that can implement the proposed approximate DFT algorithm for digital beamforming. Experimental results from this array are presented in Section 5. Finally, Section 6 concludes the paper.
2 Role of the DFT in Multibeam Transceivers
Multibeam beamforming on an () linearly spaced rectangular array can be achieved by uniformly sampling the spatial frequency domains to define a set of far-field plane-waves having spatial frequencies determined by setting where and . For this analysis we consider so that the same proposed -point transform can be used row-wise and column-wise in the rectangular aperture for generating two-dimensional (2D) beams. The spatial frequency points correspond to beams pointing at unique angle pairs indexed by . The corresponding spatially-sampled time-continuous plane waves at the terminals of the array elements can be expressed in a Fourier basis as where is the unmodulated carrier frequency, is a constant that sets the signal power, and is the complex modulated information component of the signal. Note that we assume that the bandwidth of is much smaller than , since our analysis is only valid for narrowband signals for which the so-called spatial-wideband effect is negligible [28, 47].
In receiver mode, the plane-waves present at the antennas are sampled in the spatial domain, amplified and filtered, down-converted to baseband (or an intermediate frequency (IF)), and finally digitized by an analog-to-digital converter (ADC) present at each array location. The digitized signal at each location is complex, i.e., has in-phase ($I$) and quadrature ($Q$) components. For example, down-conversion using a quadrature mixer (which is modeled as multiplication by ) leaves the spatial frequency components intact. As a result, the spatial spectrum of the wave remains localized at . The creation of a sharp RF beam for extracting directional information for a particular plane-wave therefore involves the application of a 2D spatial bandpass filter having the sharpest possible selectivity centered on a particular frequency pair in the spatial frequency domain. From filter theory, it is known that the discrete Fourier transform (DFT) realizes a filterbank of finite impulse response (FIR) filters with sharp bandpass responses that take the well-known sinc-like response shape; the peak stopband magnitude for this shape has an asymptote of approximately $-13.26$ dB for increasing filter order $N$. Therefore, to simultaneously receive an array of signals, the multibeam beamformer must compute the 2D DFT spatially across the - dimensions of the array.
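The filterbank view of the DFT can be checked numerically. The following is a minimal sketch, not the authors' code: it treats one DFT bin as an $N$-tap FIR filter, sweeps the spatial frequency, and reports the largest response outside the main lobe. The function names and the grid resolution are illustrative assumptions.

```python
import cmath
import math

def dft_bin_response(n_taps, k, omega):
    """Normalized magnitude response of DFT bin k at spatial frequency omega (rad/sample)."""
    centre = 2 * math.pi * k / n_taps
    acc = sum(cmath.exp(1j * n * (omega - centre)) for n in range(n_taps))
    return abs(acc) / n_taps  # main-lobe peak equals 1

def peak_sidelobe_db(n_taps, k=0, grid=4096):
    """Largest response outside the main lobe (first nulls at +/- 2*pi/N), in dB."""
    centre = 2 * math.pi * k / n_taps
    worst = 0.0
    for i in range(grid):
        omega = 2 * math.pi * i / grid
        # wrapped distance between omega and the main-lobe centre
        dist = abs((omega - centre + math.pi) % (2 * math.pi) - math.pi)
        if dist >= 2 * math.pi / n_taps:
            worst = max(worst, dft_bin_response(n_taps, k, omega))
    return 20 * math.log10(worst)

if __name__ == "__main__":
    print(f"32-point DFT bin, peak side lobe: {peak_sidelobe_db(32):.2f} dB")
```

For a 32-point transform the peak side lobe sits close to the asymptotic value for a rectangular (unweighted) aperture, and every bin shares the same shape, shifted in spatial frequency.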
For transmit applications, the waves to be transmitted at simultaneous multiple directions are applied to the inputs of the 2D inverse DFT (IDFT), with the corresponding IDFT outputs being converted to analog using digital-to-analog converters (DACs), filtered, up-converted to the desired carrier frequency, and amplified before being applied to the input terminals of the transmit array. Thus, the computation of the 2D spatial DFT/IDFT for each new sample of the digital baseband signal is a straightforward technique for achieving a large number of simultaneous RF beams in both receive and transmit modes.
Fig. 1(a) shows the digital beamforming architecture for an $N \times N$ uniform rectangular aperture (URA) that generates $N^2$ beams. The block diagram of an $N$-element uniform linear array (ULA) subsystem that acts as a building block for the URA is shown in Fig. 1(b). The direct computation of the DFT of an $N$-point vector of input values is defined as the matrix-vector multiplication $\mathbf{X} = \mathbf{F}_N \mathbf{x}$, where $\mathbf{x}$ and $\mathbf{X}$ are both $N$-point complex column vectors and $\mathbf{F}_N$ is the $N \times N$ complex matrix known as the DFT matrix [6]. The direct matrix-vector multiplication requires a number of complex arithmetic operations in $\mathcal{O}(N^2)$, where $\mathcal{O}(\cdot)$ is big-O notation [20, p. 429]. However, because of the symmetries of the DFT matrix, it is possible to compute the matrix-vector product with fewer than $N^2$ complex arithmetic operations. Algorithms that exploit the DFT matrix structure are called fast Fourier transforms (FFTs). FFTs are a well-explored class of fast algorithms based on sparse matrix factorizations, which ultimately reduce the arithmetic complexity of computing $\mathbf{X}$ to $\mathcal{O}(N \log N)$. The complexity reduction from $\mathcal{O}(N^2)$ to $\mathcal{O}(N \log N)$ is substantial as $N$ grows large, which explains the importance of using FFTs in place of the direct DFT computation.
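The role of the linear-array transform as a building block for the rectangular aperture follows from the separability of the 2D DFT. The sketch below, with illustrative function names that are not taken from the paper, applies a 1D transform along every row and then along every column and checks the result against the double sum. It uses an exact DFT for the 1D step; any other $N$-point linear transform could be substituted in the same structure.

```python
import cmath
import math

def dft(x):
    """Direct N-point DFT of a sequence of complex samples."""
    n_pts = len(x)
    return [sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_pts) for n in range(n_pts))
            for k in range(n_pts)]

def dft2_row_column(a):
    """2D DFT of a square array: 1D DFT along every row, then along every column."""
    size = len(a)
    row_pass = [dft(row) for row in a]
    col_pass = [dft([row_pass[r][c] for r in range(size)]) for c in range(size)]
    # col_pass[c][k] holds output (k, c); transpose back to [row][column] order
    return [[col_pass[c][k] for c in range(size)] for k in range(size)]

def dft2_direct(a):
    """Reference 2D DFT evaluated straight from the double sum."""
    size = len(a)
    return [[sum(a[r][c] * cmath.exp(-2j * math.pi * (k * r + l * c) / size)
                 for r in range(size) for c in range(size))
             for l in range(size)] for k in range(size)]

def transform_count(size):
    """(number of 1D transforms, number of 2D output beams) for a size x size aperture."""
    return 2 * size, size * size
```

Under this structure a $32 \times 32$ aperture needs 64 instances of the 32-point transform to form 1024 beams.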
If we consider a single-beam conventional digital beamforming scenario, it would need $3N$ real multiplications and $5N + 2(N-1) = 7N-2$ addition operations per beam. Considering this complexity, if $N$ arbitrary beams are needed, then such a system would demand $3N^2$ real multiplications and $N(7N-2)$ real additions. The use of FFTs brings this complexity down to the order of $\mathcal{O}(N \log N)$. Using the proposed approach described in Section 3, we can obtain 32 FFT-like one-dimensional (1D) beams with no multiplications and with only 348 addition operations. We believe that such a large number of beams is required for applications like millimeter wave 5G communications, as described in the introduction of the paper. This method can be overkill for applications that require only a few beams. But for applications that do need many simultaneous beams, the proposed method provides an attractive solution with much lower power consumption and area in VLSI implementations.
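The operation counts for conventional weight-and-sum beamforming can be tallied with a few lines of arithmetic. This is a sketch under the assumption that each complex weight multiplication is implemented with 3 real multiplications and 5 real additions, and that summing $N$ complex products costs $2(N-1)$ real additions; the function names are illustrative.

```python
def single_beam_cost(n_elements):
    """(real multiplications, real additions) for one weight-and-sum beam."""
    mults = 3 * n_elements                          # 3 real mults per complex weight
    adds = 5 * n_elements + 2 * (n_elements - 1)    # weight adds plus complex summation
    return mults, adds

def multibeam_cost(n_elements, n_beams):
    """Cost of forming n_beams independent beams from the same n_elements inputs."""
    mults, adds = single_beam_cost(n_elements)
    return n_beams * mults, n_beams * adds

if __name__ == "__main__":
    N = 32
    print("one beam :", single_beam_cost(N))
    print("32 beams :", multibeam_cost(N, N))
```

For $N = 32$ this gives 96 multiplications and 222 additions per beam, and 3072 multiplications and 7104 additions for 32 independent beams.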
3 A 32-point DFT Approximation and Fast Algorithm for RF Beamforming
3.1 Approximate Computing
The implementation of FFT/DFT in fixed-point digital hardware always leads to errors in representing the twiddle factors [19] which are mostly irrational. Therefore, the implementation of an FFT in physical systems is not perfect and is always an approximation. If an approximation can be used with better overall performance than that which results from the error sources of the system, then it makes sense to adopt such an approximation with the guarantee that it does not cause other impairments.
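The twiddle-factor representation error mentioned above can be illustrated with a short sketch. It assumes a simple round-to-nearest fixed-point format with a chosen number of fractional bits and ignores saturation at the ends of the range; neither the format nor the function names are taken from the paper.

```python
import cmath
import math

def quantize(value, frac_bits):
    """Round a real number to the nearest multiple of 2**-frac_bits (no saturation)."""
    scale = 1 << frac_bits
    return round(value * scale) / scale

def worst_twiddle_error(n_points, frac_bits):
    """Largest |W - Q(W)| over the twiddle factors W = exp(-2j*pi*k/N), k = 0..N-1."""
    worst = 0.0
    for k in range(n_points):
        w = cmath.exp(-2j * math.pi * k / n_points)
        w_q = complex(quantize(w.real, frac_bits), quantize(w.imag, frac_bits))
        worst = max(worst, abs(w - w_q))
    return worst

if __name__ == "__main__":
    for bits in (4, 9, 12):
        print(bits, "fractional bits -> worst error", worst_twiddle_error(32, bits))
```

Each component is off by at most half a quantization step, so the complex error is bounded by $\sqrt{2}\cdot 2^{-(b+1)}$ for $b$ fractional bits. The error is never exactly zero for a 32-point transform because most of its twiddle factors have irrational real and imaginary parts.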
The difficulty in proposing DFT approximations for larger sequences lies in the hardness of deriving efficient fast algorithms for the generated approximations [13], simply because the approximate transforms may not preserve the symmetries and mathematical properties that exist in the exact DFT matrix.
3.2 32-point Approximate DFT (ADFT)
Here we describe a 32-point approximate DFT matrix for which the matrix-vector multiplication operation can be computed without multipliers. Let be the set . Let be the set of complex matrices such that the real and the imaginary parts are defined over the set . The approximate transform can be found according to a multi-criterion optimization considering the search space represented by the parametrized mapping below:
and objective functions given by the following selected matrix-based metrics: (i) Frobenius norm of the matrix difference, (ii) total error energy, (iii) average percent absolute error, and (iv) orthogonality deviation. The optimal solution for the DFT approximation can be found by the determination of the Pareto efficient solution set, which is the set of non-dominant solutions [16] using with steps of .
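The selection of Pareto-efficient (non-dominated) candidates can be sketched independently of the specific metrics. In the example below each candidate is represented by a tuple of objective values to be minimized; the tuples are hypothetical placeholders, not values from the paper, and the function names are illustrative.

```python
def dominates(a, b):
    """True if a is no worse than b in every objective and strictly better in at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(candidates):
    """Return the candidates that are not dominated by any other candidate."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]

if __name__ == "__main__":
    # Hypothetical (error metric, orthogonality deviation) pairs for four candidates
    scores = [(1.0, 4.0), (2.0, 2.0), (3.0, 3.0), (4.0, 1.0)]
    print(pareto_front(scores))
```

A candidate survives only if no other candidate matches or beats it on every metric at once, so the front generally contains several matrices, and a further criterion (such as the smallest total error energy) is needed to pick one.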
The optimal matrix resulting from the above optimization problem is given by
(1) |
(2) |
(3) |
(4) |
(5) |
Among the efficient solutions, the matrix exhibits the smallest total error energy of approximately . The Frobenius norm of the matrix error per matrix element , where is the Frobenius norm. This measurement is 54.9% lower than the error per element of the DFT approximation described in [43, 14, 4] and can be regarded as acceptable.
Fig. 2 shows a comparison of the frequency responses of all the bins for the proposed 32-point approximate DFT (ADFT) and the DFT. The shapes and locations of the main beams are almost identical to those of the exact DFT. The relative errors of the magnitude response of each filter are largely confined to the stopbands away from the main lobe (i.e., deep side lobes), and are generally below the dB level. Fig. 2(c) shows the magnitude error plot of the filter bank responses of the proposed DFT approximation. The plot in Fig. 2(c) is computed by evaluating the difference of the magnitude responses of the approximate and exact DFT transforms for each filter (i.e., DFT/ADFT bin). The plots in Fig. 3 show the bins in Fig. 2(c) that have the highest magnitude error. All other bins have a magnitude error that is smaller than dB. The deviations of the filter bank responses from those of the DFT filter bank arise because the filter coefficients are not ideal, having been approximated by small integers. Thus, the performance level is mainly set by the size of the optimization search space.
It is also noted that the proposed approximation, due to its numerical structure, would not directly work with conventional windowing functions. However, these functions can be modified to achieve the desired windowing performance. The idea of the proposed transform is to generate multiple beams simultaneously, and would serve the applications that need simultaneous multiple look-directions with the sharpest beams.

3.3 Fast Algorithm for Computing the 32-point ADFT
A fast algorithm for computing the approximate transform in (1) to be used in place of usual FFTs can be derived by means of sparse matrix factorization in a decimation-in-frequency approach [6]. The matrix transform can be factorized as follows:
(6) |
where for are sparse matrices (factorization stages). The non-zero matrix elements of each matrix are given in the Appendix. The matrix factorization in (6) is not unique (i.e., can admit multiple different factorizations), unlike the factorization of a composite integer [8]. The number of stages (i.e., sparse matrices) in the matrix factorization depends on the factorization method employed. The number of stages is not important as long as the overall number of elementary arithmetic operations in the factorized form is lower than that of the direct non-factorized form of the matrix-vector product.
Notice that the entries of the sparse matrices only contain the elements from the set which imply trivial arithmetic operations.
Method | Real additions | Real multiplications |
---|---|---|
Radix-2 FFT [6, p. 76] | 408 | 88 |
Split-Radix FFT [15] | 388 | 68 |
Winograd FFT [48] | 388 | 68 |
Direct Computation | 1984 | 0 |
Fast Algorithm | 348 | 0 |

Metric | Duhamel algorithm | ADFT | Change |
---|---|---|---|
Area, $A$ (mm²) | 0.856 | 0.465 | 46% |
Critical path delay, $T$ (ns) | 1.73 | 0.86 | 50% |
Frequency (GHz) | 0.58 | 1.16 | 100% |
$AT$ (mm²·ns) | 1.481 | 0.400 | 73% |
$AT^2$ (mm²·ns²) | 2.562 | 0.344 | 86% |
Dynamic power (mW/GHz) | 1303 | 580 | 55% |
Largest side-lobe level (dB) | 2.23 |

Given the fast algorithm in (6), the computational complexity associated with computing can be quantified. Let us consider the complex input signals, which correspond to the I and Q outputs of the received signal from the array to the digital processor, and evaluate the arithmetic complexity in terms of real operations. The arithmetic cost of each matrix in each factorization stage of (6) is evaluated as described in [6]. Because the coefficients of the real and imaginary parts of for are also in , only additions are required; multiplications and bit-shifting operations are absent. The additive cost is based on the number of nonzero elements in the rows of each matrix, as detailed in [6]. Therefore, the matrices , , and require 60 real additions each; the matrices , , and require 28 real additions each; and the matrix requires 24 real additions. The only complex matrix in the factorization, , requires 60 real additions. In total, the transform requires $3 \times 60 + 3 \times 28 + 24 + 60 = 348$ real additions. Table 1 shows the real multiplicative and additive costs associated with several well-known FFT algorithms compared with the proposed algorithm. Table 1 also shows that the additive complexity achieved through the proposed fast algorithm (348 real additions) is about 82% lower than that of the direct computation of (1984 real additions).
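The addition counts can be reproduced with a small sketch. It assumes the usual counting rule for a matrix with entries in $\{0, \pm 1\}$ applied to a complex vector: a row with $m$ nonzero entries costs $m - 1$ complex additions, i.e., $2(m-1)$ real additions. The 4-point butterfly stage is a hypothetical example, not one of the paper's factors, and the per-stage costs are the values reported in the text.

```python
def real_additions(matrix, complex_input=True):
    """Real additions needed to apply a matrix with entries in {0, +1, -1} to a vector."""
    total = 0
    for row in matrix:
        nonzeros = sum(1 for entry in row if entry != 0)
        total += max(nonzeros - 1, 0)   # a row with m nonzeros needs m - 1 additions
    return 2 * total if complex_input else total

# Hypothetical 4-point butterfly stage, used only to illustrate the counting rule
BUTTERFLY_4 = [
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [1, 0, -1, 0],
    [0, 1, 0, -1],
]

# Per-stage real-addition costs as reported in the text: three stages of 60,
# three of 28, one of 24, and one complex stage of 60
REPORTED_STAGE_COSTS = [60, 60, 60, 28, 28, 28, 24, 60]

def savings_percent(fast_cost, direct_cost):
    """Relative reduction of fast_cost with respect to direct_cost, in percent."""
    return 100.0 * (1.0 - fast_cost / direct_cost)

if __name__ == "__main__":
    dense_32 = [[1] * 32 for _ in range(32)]
    direct = real_additions(dense_32)   # fully dense 32x32 matrix
    fast = sum(REPORTED_STAGE_COSTS)
    print(real_additions(BUTTERFLY_4), fast, direct, round(savings_percent(fast, direct), 1))
```

Counting a fully dense $32 \times 32$ matrix with the same rule gives the 1984 additions listed for direct computation in Table 1, so the factorized form saves roughly 82% of the additions.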
3.4 Hardware Metrics of the Proposed ADFT Realization
The 32-point ADFT fast algorithm in (6) was realized as a digital core and synthesized using 45 nm CMOS free-PDK standard cells [1]. For comparison purposes, a 32-point FFT core based on the Duhamel algorithm was implemented digitally and synthesized using the same technology. Both the approximate and fixed-point exact FFT digital cores assume inputs of 8-bit word length. The fixed-point exact FFT core was designed with 10-bit twiddle factors [19], which maintain a precision of in the phasing coefficients. The multiplications throughout the signal paths were handled such that they preserve at least the coefficient precision. Table 2 compares the following metrics for the two implementations: chip area $A$, critical path delay $T$, maximum clock frequency, area-time $AT$, area-time-squared $AT^2$, frequency- and voltage-normalized dynamic power consumption, and maximum side-lobe level. It can be seen that the proposed ADFT algorithm consumes 46% less area than the reference FFT-based design, while achieving a 50% drop in critical path delay. It is also noted that the metrics $AT$ and $AT^2$ are reduced by 73% and 86%, respectively, where the $AT$ metric is important when area/cost is the primary concern and $AT^2$ is critical when speed performance is crucial. Note that the speed values mentioned in Table 2 are only based on synthesis results, i.e., they do not consider layout effects that slow down the performance of physical implementations. However, such effects will be present in both designs, so the relative improvements in the $AT$ and $AT^2$ metrics are expected to remain valid. The compromise is an dB increase in sidelobe level, which we assume to be tolerable in most RF beamforming applications where unwanted signals (jammers) can fall on larger sidelobes.
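The derived metrics in Table 2 follow directly from the synthesized area and critical path delay. The sketch below recomputes them from the tabulated area, delay, and power values; the dictionary keys and function names are illustrative choices.

```python
DESIGNS = {
    "fft":  {"area_mm2": 0.856, "delay_ns": 1.73, "power_mw_per_ghz": 1303.0},
    "adft": {"area_mm2": 0.465, "delay_ns": 0.86, "power_mw_per_ghz": 580.0},
}

def derived_metrics(design):
    """Maximum clock frequency, AT, and AT^2 from area (mm^2) and critical path delay (ns)."""
    area, delay = design["area_mm2"], design["delay_ns"]
    return {
        "fmax_ghz": 1.0 / delay,      # 1 / ns = GHz
        "at": area * delay,           # mm^2 * ns
        "at2": area * delay * delay,  # mm^2 * ns^2
    }

def reduction_percent(reference, proposed):
    """Relative reduction of the proposed value with respect to the reference, in percent."""
    return 100.0 * (1.0 - proposed / reference)

if __name__ == "__main__":
    fft, adft = derived_metrics(DESIGNS["fft"]), derived_metrics(DESIGNS["adft"])
    for key in ("at", "at2"):
        print(key, round(fft[key], 3), round(adft[key], 3),
              round(reduction_percent(fft[key], adft[key]), 1))
```

The recomputed $AT$ and $AT^2$ values agree with the table to three decimal places, and the reductions come out at roughly 73% and between 86% and 87%, respectively.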
3.5 -Beam Beamforming Architectures for ULAs and URAs
Fig. 1 shows the top-level hardware architectures for realizing $N$ and $N^2$ simultaneous orthogonal beams for an $N$-element ULA and an $N \times N$ aperture, respectively, using an $N$-point spatial DFT digital core as the basic signal processing block. The front-end is shown as a direct-conversion receiver chain followed by analog-to-digital conversion for digital beamforming. The digitized data can be converted to complex ($I$-$Q$) form using a Hilbert transform. This can also be achieved by using a quadrature mixer in the RF chain, with the luxury of going to baseband directly at the cost of twice the number of ADCs.

The numerically simulated array factors resulting from a 32-element spatially Nyquist sampled ULA are given in Fig. 4(i). Fig. 4(i-b) shows the beams generated using the proposed 32-point ADFT algorithm and Fig. 4(i-a) shows the corresponding beams of the exact algorithm, with (i-c) showing the error magnitude between them. Fig. 4(ii) shows three simulated example beams out of the 1024 beams generated by the proposed ADFT algorithm when it is applied to a $32 \times 32$-element URA. The first and second columns of Fig. 4(ii) show the beams corresponding to the exact and approximate DFT, respectively; the third and fourth columns of Fig. 4(ii) show the errors between the two algorithms in the elevation and azimuthal planes, respectively, which are small enough to be ignored for most microwave and mm-wave beamforming applications.
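The pointing directions of the DFT beams of a uniform linear array can be sketched as follows. The example assumes half-wavelength (spatially Nyquist) element spacing and one particular sign and bin-ordering convention, in which bin $k$ is matched to a plane wave with $\sin\theta_k = k\lambda/(Nd)$; both the convention and the function name are assumptions made for illustration.

```python
import math

def beam_angles_deg(n_elements, spacing_wavelengths=0.5):
    """Pointing angle (degrees from broadside) of each DFT bin of a uniform linear array."""
    angles = []
    for k in range(n_elements):
        # map bins N/2 .. N-1 to negative spatial frequencies
        signed_k = k if k < n_elements // 2 else k - n_elements
        sin_theta = signed_k / (n_elements * spacing_wavelengths)
        if abs(sin_theta) <= 1.0:
            angles.append(math.degrees(math.asin(sin_theta)))
        else:
            angles.append(None)  # bin falls outside the visible region
    return angles

if __name__ == "__main__":
    for k, angle in enumerate(beam_angles_deg(32)):
        print(k, None if angle is None else round(angle, 2))
```

With half-wavelength spacing all 32 bins fall inside the visible region, and the beams are uniformly spaced in $\sin\theta$ rather than in angle, so they crowd near broadside and spread out toward endfire.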
4 A 32-Beam ULA-based Multibeam Beamformer
The system architecture used for verifying the proposed low-complexity multibeam beamforming algorithm is shown in Fig. 5(a). This section explains the system design.
4.1 5.8 GHz Front-End Design
Frequency | 5.8 GHz |
---|---|
Substrate | Rogers RO4350B |
Dielectric constant | 3.66 |
Substrate thickness | 0.508 mm |
Conductor (copper) thickness | 35 µm |
The RF front-end of the receive-mode beamformer is constructed by integrating a 32-element ULA at 5.8 GHz with 32 direct conversion RF receiver chains (on PCB) as shown in Fig. 5(b). The inter-element spacing of the array was set to , which is mm at 5.8 GHz.

Each antenna element of the ULA was designed as a vertical sub-array of patch antennas that employs passive beamforming at RF in the orthogonal (vertical) plane. This design improves the gain in the vertical plane, thus simplifying array factor measurements in the azimuthal plane. In principle, such analog beamforming is independent of the 1D/2D spatial FFT-based beamforming algorithms discussed in this paper. The sub-array is designed by feeding antenna elements in series along a uniform transmission line, and performing a parametric sweep to improve impedance matching and performance [38]. Note that such analog beamforming does not affect the performance of the beamforming algorithm under consideration, because the latter operates in the azimuthal plane. The specifications of the antenna array design are summarized in Table 3.
The antenna outputs are directly fed into 32 heterodyne receivers designed on FR-4 PCBs using surface mount devices. The LO signals for each receiver are provided through a centralized LO scheme that consists of a 32-output power divider network connected to a low-phase-noise oscillator. The first stage of each receiver consists of a low-noise amplifier (LNA) that provides 16 dB gain at 5.8 GHz with a noise figure of 2.4 dB. The amplified signal is band-pass filtered within the frequency range 4.7-6 GHz, which helps to reject out-of-band interference and noise. The band-limited amplified signal is then passed through a mixer and low-pass filter to produce a downconverted low-IF input. The 32 downconverted low-IF signals are further amplified by 30 dB and then digitized in parallel using two ADC16x250-8 ADC cards (16 single-ended input channels, 8-bit, up to 250 MS/s per channel) [9]. The in-band gain and noise figure of the entire receiver are estimated to be 38.6 dB and 2.9 dB, respectively; the latter is dominated by the LNA.
The ADCs used a sample clock of 200 MHz for real-time hardware experiments. The same clock was also routed to the digital circuits implemented on the FPGAs. The clock frequency was chosen to be smaller than the maximum allowed for the digital design, which is limited by the critical path delay (CPD) and thus denoted by .
4.2 Digital Back-End
Digital processing of sampled signals was performed using the reconfigurable open architecture computing hardware (ROACH-2) platform [10] designed by the Collaboration for Astronomy Signal Processing and Electronics Research (CASPER). ROACH-2 is based on a Xilinx Virtex-6 FPGA chip; it also includes an integrated on-board processor that handles communications and control functions with the FPGA. The platform has 2 ZDOK interfaces (each supporting 42 pins) that connect high-bandwidth input/output (I/O) to the FPGA. The ADC16x250-8 ADC cards mentioned previously were designed to be compatible with the ROACH-2 hardware and can be accessed using CASPER-supplied software routines. These routines, which are available at [2], also allow the ADCs to be calibrated.
The overall architecture of the digital beamforming test-setup is shown in Fig. 5(a). The digital design consists of four main subsystems: (i) a digital calibration stage; (ii) an IQ decomposition FIR filter that implements the Hilbert transform [33]; (iii) the 32-point DFT/ADFT algorithm implementation; and (iv) an energy calculation subsystem for facilitating real-time measurements on each output beam. The exact DFT core was designed using 10-bit precision twiddle factors which provide a good compromise between circuit size and maximum operating frequency.
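The I/Q decomposition stage can be illustrated with a generic FIR Hilbert transformer. The sketch below is not the filter used in the system: the tap count, the Hamming window, and the function names are assumptions chosen for illustration. The in-phase path is the input delayed by the filter's group delay, and the quadrature path is the filter output.

```python
import math

def hilbert_taps(num_taps=31):
    """Odd-length FIR Hilbert transformer taps, Hamming-windowed (illustrative design)."""
    mid = num_taps // 2
    taps = []
    for n in range(num_taps):
        m = n - mid
        ideal = 0.0 if m % 2 == 0 else 2.0 / (math.pi * m)   # ideal response, odd taps only
        window = 0.54 - 0.46 * math.cos(2 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    return taps

def iq_decompose(samples, taps):
    """Return (I, Q): I is the delayed input, Q is its FIR Hilbert transform."""
    mid = len(taps) // 2
    i_out, q_out = [], []
    for t in range(len(taps) - 1, len(samples)):
        q_out.append(sum(taps[j] * samples[t - j] for j in range(len(taps))))
        i_out.append(samples[t - mid])   # matched delay for the in-phase path
    return i_out, q_out

if __name__ == "__main__":
    freq = 10e6 / 200e6   # 10 MHz IF sampled at 200 MS/s, in cycles per sample
    x = [math.cos(2 * math.pi * freq * n) for n in range(200)]
    i_path, q_path = iq_decompose(x, hilbert_taps())
    print(i_path[0], q_path[0])
```

For a real cosine input the quadrature path approximates the corresponding sine, so $I + jQ$ behaves as a complex exponential at the IF. The accuracy near DC and near half the sample rate depends on the tap count, which is why the example frequency is placed well inside the filter's passband.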

5 Experimental Results
This section describes experimental results obtained from the 32-element ULA, including antenna characterization and beam measurements.
5.1 Antenna Array Characterization
The performance of the array can be characterized using S-parameters [34], which were measured using a commercial vector network analyzer. For example, the return loss of the th antenna, which represents the amount of power reflected from it, is given by . The measured return loss for , namely , is shown in Fig. 5(c). The proposed patch antenna resonates at a frequency of 5.9 GHz with an excellent return loss of dB.
Mutual coupling is another important issue during the design of an antenna array. In an array, the fields radiated by individual elements tend to interact with each other, thus causing interchange of energy [5]. Mutual coupling describes the energy absorbed by one antenna when a nearby antenna is operating, and depends upon many factors including antenna design parameters, inter-element spacing, and the direction of arrival (DOA) of the wave [40]. It can also be measured using S-parameters. Specifically, $S_{ij}$ is a measure of coupling between ports $i$ and $j$. We measured the mutual coupling between an antenna element and its nearest neighbors using $S_{ij}$ measurements. In particular, we measured $S_{16,14}$, $S_{16,15}$, $S_{16,17}$, and $S_{16,18}$, which characterize the coupling between port 16 and its near neighbors (ports 14, 15, 17, and 18). The results are shown in Fig. 5(c) versus frequency: the values at 5.8 GHz are relatively low and given by dB, dB, dB, and dB. As expected, mutual coupling decreases with inter-element separation.
5.2 Calibration
Calibration of the RF array system is vital for obtaining optimal beamforming performance. Calibration was performed in two stages. The first stage was performed on the ADCs, and used open source routines that have already been developed for the same hardware by members of the CASPER group [2]. The second stage focused on digitally removing the effects of mismatches in the microwave front-end. Relative gain and phase mismatches of the IF outputs for each chain were calculated with respect to a reference chain using an input reference carrier at 5.86 GHz. Since the overall system is narrowband, the recorded gain and phase values were directly used to equalize the gain and phase of the sampled IF inputs. This was achieved by adding a complex multiplier after the digital Hilbert transform in each channel.
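The second calibration stage amounts to one complex correction coefficient per channel. The following sketch assumes that each chain's response to the reference carrier has been captured as a single complex phasor; the function names and the example mismatch values are hypothetical.

```python
import cmath

def calibration_weights(measured_phasors, ref_index=0):
    """Complex weights that equalize every chain to the reference chain's gain and phase."""
    reference = measured_phasors[ref_index]
    return [reference / phasor for phasor in measured_phasors]

def apply_calibration(samples, weights):
    """Multiply each channel's complex (I/Q) sample by its calibration weight."""
    return [s * w for s, w in zip(samples, weights)]

if __name__ == "__main__":
    # Hypothetical responses of three chains to the same calibration carrier
    measured = [1.0 + 0.0j, 0.8 * cmath.exp(0.3j), 1.2 * cmath.exp(-0.5j)]
    weights = calibration_weights(measured)
    print(apply_calibration(measured, weights))
```

A single coefficient per channel is sufficient only because the system is narrowband; a wideband system would need a frequency-dependent correction.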

5.3 Beam Measurements
As shown in Fig. 5(b), the entire 5.8 GHz 32-element digital array was placed in an anechoic chamber for measuring the received beam patterns ([22] shows a short real-time demo of the total system). Power patterns were measured by sending a continuous wave (CW) signal at GHz. The LO signal frequency determines the IF . The measurements in this paper were generated by setting GHz, thus resulting in an IF of 10 MHz, and digitizing the down-converted outputs at MHz. Fig. 6 shows the measured beams from the real-time experimental setup for both the exact and approximate algorithms along with the corresponding simulated curves. The measurement was conducted by using digital integrators at each FFT/ADFT bin output to calculate the received energy for a fixed amount of time.
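The per-bin energy measurement can be sketched in simulation. The example below assumes an ideal, calibrated 32-element array receiving a CW plane wave whose spatial frequency coincides with one DFT bin, and it uses the exact DFT for the beamformer; the function names and the number of snapshots are illustrative.

```python
import cmath
import math

N = 32                                   # array elements and transform size
IF_CYCLES_PER_SAMPLE = 10e6 / 200e6      # 10 MHz IF sampled at 200 MS/s

# Exact DFT coefficients; an approximate transform could be substituted here
TWIDDLES = [[cmath.exp(-2j * math.pi * k * n / N) for n in range(N)] for k in range(N)]

def snapshot(t, bin_index):
    """Complex (I/Q) array samples at sample time t for a wave aligned with one bin."""
    carrier = cmath.exp(2j * math.pi * IF_CYCLES_PER_SAMPLE * t)
    return [carrier * cmath.exp(2j * math.pi * bin_index * n / N) for n in range(N)]

def beam_energies(bin_index, num_snapshots=64):
    """Integrate |beam output|^2 over a fixed number of snapshots for every bin."""
    energy = [0.0] * N
    for t in range(num_snapshots):
        x = snapshot(t, bin_index)
        for k in range(N):
            y = sum(w * s for w, s in zip(TWIDDLES[k], x))
            energy[k] += abs(y) ** 2
    return energy

if __name__ == "__main__":
    energies = beam_energies(5)
    print("strongest bin:", energies.index(max(energies)))
```

Sweeping the direction of arrival and recording one bin's integrated energy at each angle traces out that beam's power pattern, which is the quantity compared against simulation.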
The measured array factor of the beams depends strongly on the geometry of the measurement setup. Ideally, the transmitter and the receiver should be placed far enough apart for waves incident on the receiver array to be approximated as plane waves. The numerical simulation in Fig. 7 shows how the measured array factor deviates from this ideal depending on the geometry of the test setup. Based on the standard rules [41, p. 42], the transmitter and receiver should have a separation exceeding 20 m at 5.8 GHz in order for the receiver aperture to be in the far field. However, such a large separation was not achievable within our test facility. In particular, the beams were measured in an open parking deck with a transmitter-receiver separation of approximately 7 m. For this reason, the measured beams shown in Fig. 6 have been compared with numerically-simulated beams that account for both the finite transmitter-receiver separation and the actual element pattern.
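The order of magnitude of the required separation can be checked with the common far-field (Fraunhofer) criterion $R \geq 2D^2/\lambda$. The sketch below assumes that this is the rule being applied and that the aperture length $D$ is 32 elements at half-wavelength spacing; both are assumptions, since the exact aperture dimension is not restated here.

```python
SPEED_OF_LIGHT = 299_792_458.0  # m/s

def fraunhofer_distance(aperture_m, freq_hz):
    """Far-field distance 2*D^2/lambda for an aperture of length D at the given frequency."""
    wavelength = SPEED_OF_LIGHT / freq_hz
    return 2.0 * aperture_m ** 2 / wavelength

if __name__ == "__main__":
    freq = 5.8e9
    wavelength = SPEED_OF_LIGHT / freq
    aperture = 32 * wavelength / 2      # ASSUMPTION: 32 elements at half-wavelength pitch
    print(f"wavelength = {wavelength * 1e3:.2f} mm")
    print(f"far-field distance = {fraunhofer_distance(aperture, freq):.1f} m")
```

Under these assumptions the criterion gives a distance of a few tens of metres, well beyond the roughly 7 m available, which is why the comparison curves must model the finite separation explicitly.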

Fig. 6 shows that the measured beam patterns for the two algorithms closely follow each other for all bins. The measured beams also follow the expected patterns quite well in the vicinity of the main beam. For both algorithms, the measured plots have higher side-lobe levels in the deeper stop bands than the simulated ones. We believe that this degradation in stop-band performance is mainly due to post-calibration errors of the system; these are dominated by the performance of the analog front-ends in the receiver. Measurement errors, including the fact that the tests were not performed in an anechoic environment, also lead to deviations from the expected patterns.
The 2D array factor of each beam arising from the proposed linear transform can be expressed as,
(7) |
which may be rearranged to
(8) | ||||
(9) |
where , , , and are elevation and azimuthal angles, respectively. and denote the inter-element spacing in and directions. The relationship in (9) can be used to compute the 2D beam responses corresponding to a 2D URA consisting of 32 linear arrays, each with the measured responses shown in Fig. 6. In particular, the term denotes the array factor of the th beam in the 32-element linear array subsystem. The measured 1D beam patterns were thus used in place of to synthesize the corresponding 2D beam patterns from a 2D aperture. Fig. 8 shows the 2D beam patterns obtained using the measured ULA beam measurements for the same beams shown in Fig. 4 assuming .
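The separable synthesis described above can be sketched in Python. The notation is an assumption: `u` and `v` denote direction cosines along the two array axes, the spacing is taken as half a wavelength, and an ideal 32-point DFT beam stands in for the measured 1D patterns. The 2D pattern of beam (k, l) is formed as the outer product of two 1D patterns, and `direct_2d` evaluates the full double sum over the 2D aperture at a single direction as a cross-check.

```python
import numpy as np

N = 32
SPACING_WL = 0.5  # assumed inter-element spacing in wavelengths

def ula_pattern(k, u):
    """Normalized |array factor| of DFT beam k versus direction cosine u."""
    n = np.arange(N)
    weights = np.exp(-2j * np.pi * k * n / N)
    steering = np.exp(2j * np.pi * SPACING_WL * np.outer(u, n))
    return np.abs(steering @ weights) / N

def synthesize_2d(pattern_x, pattern_y):
    """2D pattern as the product of two 1D patterns (separable aperture)."""
    return np.outer(pattern_x, pattern_y)

def direct_2d(k, l, u0, v0):
    """Brute-force 2D array factor magnitude at one direction (u0, v0)."""
    n = np.arange(N)
    wx = np.exp(-2j * np.pi * k * n / N) * np.exp(2j * np.pi * SPACING_WL * u0 * n)
    wy = np.exp(-2j * np.pi * l * n / N) * np.exp(2j * np.pi * SPACING_WL * v0 * n)
    return np.abs(np.sum(np.outer(wx, wy))) / N**2

u = np.linspace(-1.0, 1.0, 201)
pattern_2d = synthesize_2d(ula_pattern(4, u), ula_pattern(8, u))
peak_index = np.unravel_index(np.argmax(pattern_2d), pattern_2d.shape)
```

Replacing the output of `ula_pattern` with sampled measured patterns gives the same kind of synthesis as in the text; only grid points with u² + v² ≤ 1 correspond to physical directions.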
6 Conclusion
A large number of simultaneous beams has become an essential requirement for emerging mm-wave-based 5G systems. Moreover, future communications applications, such as space-based Internet services, demand an ultra-high number of beams. An square antenna array aperture can generate up to orthogonal simultaneous beams by using the 2D -point spatial DFT. The upper bound of the multiplicative complexity associated with such processing using FFT algorithms is . This paper has discussed a low-complexity digital beamforming architecture for generating 1024 simultaneous RF beams using a 32-point DFT approximation that completely eliminates multiplication operations. The proposed ADFT algorithm consumes 46% less area than the reference FFT-based design, while achieving a 50% drop in critical path delay. The VLSI metrics and for the proposed algorithm are reduced by 73% and 86%, respectively. We have validated the proposed approach on a fully functional 32-element digital 1D receive array that operates at 5.8 GHz. This design will serve as the main subsystem for future implementations of a 2D rectangular aperture that could generate 1024 simultaneous RF beams with significantly lower SWaP in VLSI implementations. The 1D array uses 32 parallel ADCs for sampling the antenna outputs and the ADFT (implemented on a Xilinx FPGA) for computing 32 RF beams in real time. The measured RF beams show a per-beam bandwidth of 120 MHz when all 32 beams are realized in real time, with only marginal ( dB) degradation in beam performance compared to a control experiment based on the Duhamel FFT core.
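The quoted percentages can be cross-checked with simple arithmetic. The sketch below assumes that the two VLSI metrics are the area-time products AT and AT², with T taken as the critical path delay; this reading is an assumption, and the variable names are illustrative.

```python
# Ratios of the proposed design to the reference FFT design, from the quoted reductions.
area_ratio = 1.0 - 0.46    # 46% less area
delay_ratio = 1.0 - 0.50   # 50% shorter critical path delay

at_ratio = area_ratio * delay_ratio          # area-time product AT
at2_ratio = area_ratio * delay_ratio ** 2    # area-time-squared product AT^2

at_reduction_pct = 100.0 * (1.0 - at_ratio)
at2_reduction_pct = 100.0 * (1.0 - at2_ratio)
```

Under this assumption the reductions come out to 73% for AT and about 86.5% for AT², in line with the 73% and 86% figures quoted above.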
Acknowledgment
The authors are grateful to the CASPER community for thoughtful discussions and advice through their e-mail discussion group. This work would not have been possible without the extensive contributions to the open source ROACH-2 system development by multiple contributors in the radio-astronomy instrumentation community. One of the authors (RJC) thanks an anonymous reviewer and Mr. L. Portella for the identification of a typo in the matrix equations from Section 3.2.
Appendix A Factorization Terms of
The matrix factors , , from (6) are given below:
(10) |
(11) |
(12) |
(13) |
(14) |
(15) |
(16) |
(17) |
References
- [1] 45nm free process design kit, 2009. Available: https://www.eda.ncsu.edu/wiki/FreePDK45:Contents.
- [2] Calibrating ADC16x250-8 ADCs. University of Berkeley, Department of Astronomy, 2014 (accessed March, 2016). Available: http://w.astro.berkeley.edu/davidm/gems/.
- [3] 3GPP the mobile broadband standard, July 2018 (accessed on October 2018). Available: http://www.3gpp.org/release-15.
- [4] V. Ariyarathna, D. F. G. Coelho, S. Pulipati, F. M. Bayer, V. S. Dimitrov, R. J. Cintra, and A. Madanayake, Multibeam digital array receiver using a 16-point multiplierless DFT approximation, IEEE Transactions on Antennas & Propagation, 67 (2019), pp. 925–933.
- [5] C. A. Balanis, Antenna theory: analysis and design, John Wiley & Sons, 2016.
- [6] R. E. Blahut, Fast Algorithms for Digital Signal Processing, Cambridge University Press, 2010.
- [7] D. R. Bull and D. H. Horrocks, Primitive operator digital filters, IEE Proceedings G - Circuits, Devices and Systems, 138 (1991), pp. 401–412.
- [8] D. Burton, Elementary Number Theory, McGraw-Hill Education, 7th ed., 2010.
- [9] CASPER, ADC16x250-8 coax rev daughter ADC cards. Available: https://casper.berkeley.edu/wiki/ADC16x250-8_coax_rev_2.
- [10] CASPER, Reconfigurable open architecture computing hardware (ROACH-2) FPGA platform, Oct. 2013 (accessed February, 2018). Available: https://casper.berkeley.edu/wiki/ROACH-2Revision2.
- [11] R. Chen, H. Xu, C. Li, L. Zhu, and J. Li, Hybrid beamforming for broadband millimeter wave massive MIMO systems, in 2018 IEEE 87th Vehicular Technology Conference (VTC Spring), June 2018, pp. 1–5.
- [12] D. Choudhury, 5G wireless and millimeter wave technology evolution: An overview, in 2015 IEEE MTT-S International Microwave Symposium, May 2015, pp. 1–4.
- [13] R. J. Cintra, An integer approximation method for discrete sinusoidal transforms, Circuits, Systems, and Signal Processing, 30 (2011), pp. 1481–1501.
- [14] V. A. Coutinho, V. Ariyarathna, D. F. G. Coelho, R. J. Cintra, and A. Madanayake, An 8-beam 2.4 GHz digital array receiver based on a fast multiplierless spatial DFT approximation, in Proceedings of International Microwave Symposium, 2018.
- [15] P. Duhamel and H. Hollmann, Split radix FFT algorithm, Electronics Letters, 20 (1984), pp. 14–16.
- [16] M. Ehrgott, Multicriteria Optimization, Springer, 2 ed., 2005.
- [17] N. J. G. Fonseca, Printed S-band 4 × 4 Nolen matrix for multiple beam antenna applications, IEEE Transactions on Antennas and Propagation, 57 (2009), pp. 1673–1678.
- [18] C. Fulton, M. Yeary, D. Thompson, J. Lake, and A. Mitchell, Digital phased arrays: Challenges and opportunities, Proceedings of the IEEE, 104 (2016), pp. 487–503.
- [19] W. M. Gentleman and G. Sande, Fast Fourier transforms: For fun and profit, in Proceedings of the November 7-10, 1966, Fall Joint Computer Conference, AFIPS ’66 (Fall), New York, NY, USA, 1966, ACM, pp. 563–578.
- [20] R. Graham, D. Knuth, and O. Patashnik, Concrete Mathematics: A Foundation for Computer Science, Addison-Wesley, 1994.
- [21] W. Hong, Z. H. Jiang, C. Yu, J. Zhou, P. Chen, Z. Yu, H. Zhang, B. Yang, X. Pang, M. Jiang, Y. Cheng, M. K. T. Al-Nuaimi, Y. Zhang, J. Chen, and S. He, Multibeam antenna technologies for 5G wireless communications, IEEE Transactions on Antennas and Propagation, 65 (2017), pp. 6231–6249.
- [22] IEEE.tv, Ted tours the 2018 Brooklyn 5G summit expo, Apr. 2018 (accessed September 2018). Available: https://ieeetv.ieee.org/ieeetv-specials/ted-tours-the-brooklyn-5g-summit-expo-floor-3.
- [23] S. Jang, J. Jeong, R. Lu, and M. P. Flynn, A 16-element 4-beam 1 GHz IF 100 MHz bandwidth interleaved bit stream digital beamformer in 40 nm CMOS, IEEE Journal of Solid-State Circuits, 53 (2018), pp. 1302–1312.
- [24] J. Jeong, N. Collins, and M. P. Flynn, A 260 MHz IF sampling bit-stream processing digital beamformer with an integrated array of continuous-time band-pass modulators, IEEE Journal of Solid-State Circuits, 51 (2016), pp. 1168–1176.
- [25] M. K. Khattak, C. Lee, D. Han, and S. Kahng, Flat Rotman lens for 5G beamforming antenna, in 2016 IEEE 5th Asia-Pacific Conference on Antennas and Propagation (APCAP), July 2016, pp. 205–206.
- [26] K. Kibaroglu, M. Sayginer, and G. M. Rebeiz, A 28 GHz transceiver chip for 5G beamforming data links in SiGe BiCMOS, in 2017 IEEE Bipolar/BiCMOS Circuits and Technology Meeting (BCTM), Oct 2017, pp. 74–77.
- [27] K. Kibaroglu, M. Sayginer, and G. M. Rebeiz, An ultra low-cost 32-element 28 GHz phased-array transceiver with 41 dBm EIRP and 1.0–1.6 Gbps 16-QAM link at 300 meters, in 2017 IEEE Radio Frequency Integrated Circuits Symposium (RFIC), June 2017, pp. 73–76.
- [28] W. Liu and S. Weiss, Wideband Beamforming: Concepts and Techniques, Wireless Communications and Mobile Computing, Wiley, 2010.
- [29] J. Lota, S. Sun, T. S. Rappaport, and A. Demosthenous, 5G uniform linear arrays with beamforming and spatial multiplexing at 28, 37, 64, and 71 GHz for outdoor urban communication: A two-level approach, IEEE Transactions on Vehicular Technology, 66 (2017), pp. 9972–9985.
- [30] R. Miura, T. Tanaka, I. Chiba, A. Horie, and Y. Karasawa, Beamforming experiment with a DBF multibeam antenna in a mobile satellite environment, IEEE Transactions on Antennas and Propagation, 45 (1997), pp. 707–714.
- [31] H. Moody, The systematic design of the Butler matrix, IEEE Transactions on Antennas and Propagation, 12 (1964), pp. 786–788.
- [32] T. Okuyama, S. Suyama, J. Mashino, S. Yoshioka, Y. Okumura, K. Yamazaki, D. Nose, and Y. Maruta, Experimental evaluation of digital beamforming for 5G multi-site massive MIMO, in 2017 20th International Symposium on Wireless Personal Multimedia Communications (WPMC), Dec 2017, pp. 476–480.
- [33] A. V. Oppenheim and R. W. Schafer, Discrete-Time Signal Processing, Prentice Hall, 3 ed., 2009.
- [34] D. M. Pozar, Microwave engineering, John Wiley & Sons, 2009.
- [35] T. S. Rappaport, Y. Xing, O. Kanhere, S. Ju, A. Madanayake, S. Mandal, A. Alkhateeb, and G. C. Trichopoulos, Wireless communications and applications above 100 GHz: Opportunities and challenges for 6G and beyond, IEEE Access (accepted), (2019).
- [36] T. S. Rappaport, Y. Xing, G. R. MacCartney, A. F. Molisch, E. Mellios, and J. Zhang, Overview of millimeter wave communications for fifth-generation (5G) wireless networks—with a focus on propagation models, IEEE Transactions on Antennas and Propagation, 65 (2017), pp. 6213–6230.
- [37] W. Rotman and R. Turner, Wide-angle microwave lens for line source applications, IEEE Transactions on Antennas and Propagation, 11 (1963), pp. 623–632.
- [38] R. A. Sainati, CAD of microstrip antennas for wireless applications, Artech House, Inc., 1996.
- [39] M. Sayginer and G. M. Rebeiz, An eight-element 2-16-GHz programmable phased array receiver with one, two, or four simultaneous beams in SiGe BiCMOS, IEEE Transactions on Microwave Theory and Techniques, 64 (2016), pp. 4585–4597.
- [40] H. Singh, H. Sneha, and R. Jha, Mutual coupling in phased arrays: A review, International Journal of Antennas and Propagation, 2013 (2013).
- [41] W. L. Stutzman and G. A. Thiele, Antenna Theory and Design, Wiley, 2012. ISBN 9780470576649.
- [42] D. Suárez, Aproximações para a transformada discreta de Fourier e aplicações em deteção e estimação, Master’s thesis, Universidade Federal de Pernambuco, 2015.
- [43] D. Suarez, R. J. Cintra, F. M. Bayer, A. Sengupta, S. Kulasekera, and A. Madanayake, Multi-beam RF aperture using multiplierless FFT approximation, Electronics Letters, 50 (2014), pp. 1788–1790.
- [44] S. Sun, T. S. Rappaport, M. Shafi, P. Tang, J. Zhang, and P. J. Smith, Propagation models and performance evaluation for 5G millimeter-wave bands, IEEE Transactions on Vehicular Technology, 67 (2018), pp. 8422–8439.
- [45] S. Sun, T. S. Rappaport, M. Shafi, and H. Tataria, Analytical framework of hybrid beamforming in multi-cell millimeter-wave systems, IEEE Transactions on Wireless Communications, (2018), pp. 1–1.
- [46] S. Sun, T. S. Rappaport, and M. Shafi, Hybrid beamforming for 5G millimeter-wave multi-cell networks, in IEEE Conference on Computer Communications Workshops (INFOCOM WKSHPS), April 2018, pp. 589–596.
- [47] B. Wang, F. Gao, S. Jin, H. Lin, G. Y. Li, S. Sun, and T. S. Rappaport, Spatial-wideband effect in massive MIMO with application in mmWave systems, IEEE Communications Magazine, (2018), pp. 1–8.
- [48] S. Winograd, Arithmetic Complexity of Computations, CBMS-NSF Regional Conference Series in Applied Mathematics, 1980.
- [49] P. Xingdong, H. Wei, Y. Tianyang, and L. Linsheng, Design and implementation of an active multibeam antenna system with 64 RF channels and 256 antenna elements for massive MIMO application in 5G wireless communications, China Communications, 11 (2014), pp. 16–23.