
DNA Barcodes using a Cylindrical Nanopore

Swarnadeep Seth and Aniket Bhattacharya (author to whom correspondence should be addressed; [email protected])
Department of Physics, University of Central Florida, Orlando, Florida 32816-2385, USA
Abstract

We report an accurate method to determine DNA barcodes from the dwell time measurements of protein tags (barcodes) along the DNA backbone. The method uses Brownian dynamics simulation of a model DNA together with a recursive theoretical scheme that improves the measurements to almost 100% accuracy. The heavier protein tags along the DNA backbone introduce a large speed variation in the chain that can be understood using non-equilibrium tension propagation theory. Starting from an initial rough characterization of velocities into “fast” (nucleotides) and “slow” (protein tags) domains, we introduce a physically motivated interpolation scheme that enables us to determine the barcode velocities rather accurately. Our theoretical analysis of the motion of the DNA through a cylindrical nanopore opens up the possibility of its experimental realization and carries over to multi-nanopore devices used for barcoding.

A DNA barcode consists of a short strand of DNA sequence taken from a targeted gene such as COI or cox I (Cytochrome C Oxidase 1) [1], a mitochondrial gene present in animals. The unique combination of nucleotide bases in a barcode allows us to distinguish one species from another. As an alternative to traditional taxonomic identification methods, DNA barcoding provides a reliable framework to categorize a wide variety of specimens obtained from the natural environment. Though researchers relied on DNA sequencing techniques for the identification of unknown species for a long time, in 2003 Hebert et al. [2] proposed barcoding of the mitochondrial gene (COI) region to classify cryptic species [3] from the entire animal population. Since then, several studies have shown the potential applications of barcoding in conserving biodiversity [4], estimating phyletic diversity, identifying disease vectors [5], authenticating herbal products [6], unambiguously labeling food products [7, 8], and protecting endangered species [4]. Traditional sequencing methods based on chemical analysis are widely used in the biological community to determine the barcodes. Nanopore-based sequencing methods [9] are being explored in a dual nanopore system for cost-effective, high-throughput, chemical-free, and real-time barcode generation.

Figure 1: (a) Schematic of a model dsDNA captured in a cylindrical nanopore of diameter $d=2\sigma$ and thickness $t_{pore}$, where $\sigma$ is the diameter of each monomer (purple beads). Protein tags (barcodes) of the same diameter but of different colors (only three are shown here) are interspersed along the dsDNA backbone. Opposite but unequal forces $\vec{f}_{U}$ and $\vec{f}_{D}$ are applied to straighten the dsDNA as it translocates in the direction of the net bias $\pm|\Delta\vec{f}_{UD}|=\pm|\vec{f}_{U}-\vec{f}_{D}|$ through the nanopore. (b) Positions of the protein tags along the contour length of the model dsDNA of length $L=1024\sigma$, which represents an actual dsDNA of 48500 base pairs. The locations of the tags are listed in Table-I.

The possibility of determining DNA barcodes has been demonstrated in a dual nanopore device by scanning a captured dsDNA multiple times under a net periodic bias applied across the two pores [9, 10, 11, 12]. Theoretical and simulation studies have also been reported in the context of a double nanopore system [13, 14, 15]. In this article, we investigate a similar strategy in silico in a cylindrical nanopore and demonstrate that a cylindrical nanopore can have a competitive advantage over a dual nanopore system. By studying a model dsDNA with barcodes using Brownian dynamics, we establish an important result: because the dwell time and speed of the barcodes (“tags”) differ markedly from those of the nucleotide segments (“monomers”), the current blockade time information alone is not enough and leads to an inevitable underestimation of the distance between the barcodes. Furthermore, using the ideas of tension propagation theory [16, 17], we demonstrate that information about the fast-moving nucleotides in between the barcodes, which is not easily accessible experimentally, is a key element in resolving the underestimation. We suggest how to obtain this information experimentally and provide a physically motivated “two-step” interpolation scheme for an accurate determination of barcodes, even when the separation of (unknown) tags has a broad distribution.

Table 1: Tag positions along the dsDNA
Tag #       $T_1$  $T_2$  $T_3$  $T_4$  $T_5$  $T_6$  $T_7$  $T_8$
Position    154    369    379    399    614    625    696    901
Separation  154    215    10     20     215    11     71     205


• The Model System: Our in silico coarse-grained (CG) model of a dsDNA consists of 1024 beads, 8 of which are barcodes placed at the locations shown in Fig. 1 and Table-I. The model is motivated by an experimental study by Zhang et al. on a 48500 bp long dsDNA with 75 bp long protein tags at random locations along the chain [10, 11, 12] using a dual nanopore device. Here we explore whether a cylindrical nanopore with applied biases at each end can resolve the barcodes with similar or better accuracy. We purposely choose the positions of the 8 barcodes (Table-I) to study how disparate distances among the barcodes affect their measurement. The tags $T_2$, $T_3$, and $T_4$ are closely spaced and form a group. Likewise, another group consisting of $T_5$ and $T_6$ is placed in closer proximity to $T_7$. The tags $T_1$ and $T_8$ are further apart from the rest of the tags. The general scheme of the BD simulation strategy for a translocating homopolymer under alternate bias has been discussed in our recent publications [13, 14] and in Appendix A.
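As a quick consistency check of the tag layout described above, the following Python sketch recomputes the separations in Table-I and the contour distances of each tag from $T_5$ (the reference tag used later in Table-II). The tag positions are copied from Table-I; the function names are illustrative and not part of the original simulation code.

```python
# Tag positions (bead indices along the 1024-bead chain), copied from Table-I.
TAG_POSITIONS = {
    "T1": 154, "T2": 369, "T3": 379, "T4": 399,
    "T5": 614, "T6": 625, "T7": 696, "T8": 901,
}


def separations(positions):
    """Distance of each tag from the previous tag (first tag: from bead 0)."""
    labels = list(positions)
    values = [positions[label] for label in labels]
    previous = [0] + values[:-1]
    return {label: v - p for label, v, p in zip(labels, values, previous)}


def relative_to(positions, reference="T5"):
    """Absolute contour distance (in beads) of every tag from the reference tag."""
    ref = positions[reference]
    return {label: abs(v - ref) for label, v in positions.items()}


print(separations(TAG_POSITIONS))
print(relative_to(TAG_POSITIONS))
```

The small separations (10, 20, 11) identify the two closely spaced groups, while $T_1$ and $T_8$ sit far from the reference tag.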

In this article, tags are introduced by choosing the mass and friction coefficient at the tag locations to be different from those of the rest of the monomers along the chain. This requires a modification of the BD algorithm, as discussed in Appendix A. The protein tags used in the experiments [10, 11, 12] translate to about three monomers in the simulation. The heavier and extended tags introduce a larger viscous drag. Instead of explicitly putting side-chains at the tag locations, we made the mass and the friction coefficient of the tags 3 times larger. We find this enough to resolve the distance between the tags. Two forces $\vec{f}_{U}$ and $\vec{f}_{D}$, applied at each end of the cylinder in opposite directions, keep the DNA straight inside the channel and allow translocation in the direction of the net bias (please see Fig. 1 and Fig. 2).


• Barcodes from repeated scanning: As could potentially be done in a nanopore experiment, we switch the differential bias once the first or the last tag ($T_1$ or $T_8$) has translocated through the nanopore during an upward ($D\rightarrow U$) or downward ($U\rightarrow D$) translocation, while the end segment is still inside the pore (please see Fig. 2), so that the DNA remains captured in the cylindrical pore and the barcodes are scanned multiple times.

Figure 2: Demonstration of the epoch when the bias voltage is flipped. (a) The last barcode is yet to translocate in the downward direction when the net bias $\Delta\vec{f}_{DU}=\vec{f}_{D}-\vec{f}_{U}>0$. (b) The situation at a later time, when all the barcodes have finally crossed the cylindrical pore during downward translocation with a portion of the end segment still remaining inside the pore. At this point the bias is flipped to an upward bias $\Delta\vec{f}_{UD}=\vec{f}_{U}-\vec{f}_{D}>0$, and translocation now occurs in the upward direction. In this way, the DNA remains captured at all times during repeated scans.

Figure 3: (a) Demonstration of the calculation of the wait time for $T_7$, which is the monomer with index $m=696$. The dwell velocity is then calculated using Eqn. 2. (b) Demonstration of the calculation of the tag time delay $\tau_{78}^{U\rightarrow D}=t_{i}^{U\rightarrow D}(8)-t_{i}^{U\rightarrow D}(7)$ for tags $T_7$ and $T_8$ while they are moving downward. Please note that the similar quantity for upward translocation, $\tau_{87}^{D\rightarrow U}=t_{i}^{D\rightarrow U}(7)-t_{i}^{D\rightarrow U}(8)$, is in general not equal to $\tau_{78}^{U\rightarrow D}$, as there is no symmetry of the tags along the chain.

The question we ask is: can we recover the actual barcode locations from these scanning measurements, so that the method can be applied to determine unknown barcodes? We monitor two important quantities: the dwell time of each monomer and the time delay between the arrival of two successive monomers at the pore, as demonstrated in Fig. 3 and explained below. For each upward or downward scan we measure the dwell time of the monomer $m$ as follows:

$$W^{U\rightarrow D}(m)=t_{f}^{U\rightarrow D}(m)-t_{i}^{U\rightarrow D}(m), \tag{1a}$$
$$W^{D\rightarrow U}(m)=t_{f}^{D\rightarrow U}(m)-t_{i}^{D\rightarrow U}(m). \tag{1b}$$

Here $t_{i}^{U\rightarrow D}(m)$ and $t_{f}^{U\rightarrow D}(m)$ are the arrival and exit times of the monomer with index $m$, as further demonstrated in Fig. 3(a). The corresponding dwell velocities $v_{dwell}^{U\rightarrow D}(m)$ and $v_{dwell}^{D\rightarrow U}(m)$ for the $m^{th}$ bead (either a monomer or a tag) along the channel axis (please see Fig. 3(a)) can be obtained as follows:

$$v_{dwell}^{U\rightarrow D}(m)=t_{pore}/W^{U\rightarrow D}(m), \tag{2a}$$
$$v_{dwell}^{D\rightarrow U}(m)=t_{pore}/W^{D\rightarrow U}(m). \tag{2b}$$

In an actual experiment one can measure only the dwell times of the tags, which correspond to the current blockade times; the dwell velocities of the tags then follow from Eqn. 2.
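The dwell time and dwell velocity definitions in Eqns. 1 and 2 can be sketched in a few lines of Python. This is a minimal illustration, not the authors' analysis code: the function names and the numerical values in the example (pore thickness, arrival and exit times, all in reduced simulation units) are assumptions chosen only to show the arithmetic.

```python
def dwell_time(t_arrive, t_exit):
    """Eqn. 1: W(m) = t_f(m) - t_i(m) for one bead in one scan direction."""
    if t_exit <= t_arrive:
        raise ValueError("exit time must be later than arrival time")
    return t_exit - t_arrive


def dwell_velocity(t_arrive, t_exit, t_pore):
    """Eqn. 2: v_dwell(m) = t_pore / W(m), with t_pore the pore thickness."""
    return t_pore / dwell_time(t_arrive, t_exit)


# Hypothetical bead: enters the pore at t = 10, leaves at t = 18, pore thickness 4.
w = dwell_time(10.0, 18.0)
v = dwell_velocity(10.0, 18.0, t_pore=4.0)
print(w, v)
```

The same two functions apply to both scan directions; only the recorded arrival and exit times differ.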

• Non-uniformity of the dwell velocity: The presence of tags with heavier mass ($m_{tag}=3m_{bulk}$) and larger solvent friction ($\gamma_{tag}=3\gamma_{bulk}$) introduces a large variation in the dwell time and hence in the dwell velocities of the DNA beads and tags (see Fig. 4). In general, there is no up-down symmetry for the dwell time or velocity, as the tags are not located symmetrically along the chain backbone. The physical quantities are therefore averaged over $U\rightarrow D$ and $D\rightarrow U$ translocation data. The average dwell velocity $\bar{v}_{dwell}(m)=\frac{1}{2}\left[v_{dwell}^{U\rightarrow D}(m)+v_{dwell}^{D\rightarrow U}(m)\right]$ clearly shows two different velocity envelopes, with the tags residing on the lower envelope. Fig. 4 shows that the dwell velocities of the tags (green circles) are significantly lower than the velocities of the nucleotides in between the tags, which leads to an underestimation of the barcode distances, as explained later. We further notice that increasing the pore width resolves the barcodes better.

Figure 4: Dwell velocity of the monomers in a cylindrical nanopore system. $\triangledown$ and $\vartriangle$ represent downward and upward translocation, respectively, and $\circ$ represents the average over both directions. Filled triangles and circles correspond to tag dwell velocities.

• Barcode estimation using a cylindrical nanopore setup: If the dsDNA with barcodes were a rigid rod, then one could obtain the barcode distances $d_{mn}^{U\rightarrow D}$ and $d_{nm}^{D\rightarrow U}$ between tags $T_{m}$ and $T_{n}$ from the following equations (shown for downward translocation only):

$$d_{mn}^{U\rightarrow D}=v_{mn}^{U\rightarrow D}\times\tau_{mn}^{U\rightarrow D}, \quad {\rm where} \tag{3a}$$
$$v_{mn}^{U\rightarrow D}=\frac{1}{2}\left[v_{dwell}^{U\rightarrow D}(m)+v_{dwell}^{U\rightarrow D}(n)\right], \tag{3b}$$
$$\tau_{mn}^{U\rightarrow D}=t_{i}^{U\rightarrow D}(n)-t_{i}^{U\rightarrow D}(m). \tag{3c}$$

Here $\tau_{mn}^{U\rightarrow D}$ is the time delay between the arrivals of $T_{m}$ and $T_{n}$ for downward translocation (please see Fig. 3(b), which explains the special case $m=7$ and $n=8$). Similar equations can be obtained by interchanging $D$ with $U$ and $m$ with $n$. In other words, Eqn. 3 gives the shortest distance and not necessarily the contour length (the actual distance) between the tags. However, this is the only data accessible through experiments, and it is likely to underestimate the barcode distances. Fig. 5(a) shows the data for 300 scans. The averages with error bars are shown in the 3rd column of Table-II. Except for $T_{6}$, these measurements grossly underestimate the actual positions, with large error bars.
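The rigid-rod estimate of Eqn. 3 amounts to multiplying the mean of two tag dwell velocities by the delay between their arrivals. A minimal Python sketch is given below; the function name and the input numbers are hypothetical and only illustrate the formula.

```python
def rigid_rod_distance(v_dwell_m, v_dwell_n, t_arrive_m, t_arrive_n):
    """Barcode distance between tags T_m and T_n from Eqn. 3 (one scan direction)."""
    v_mn = 0.5 * (v_dwell_m + v_dwell_n)  # Eqn. 3b: mean of the two tag dwell velocities
    tau_mn = t_arrive_n - t_arrive_m      # Eqn. 3c: delay between the two arrivals
    return v_mn * tau_mn                  # Eqn. 3a


# Hypothetical values in reduced units: two tags with dwell velocities 0.25 and 0.5
# arriving at the pore at t = 100 and t = 300.
d_mn = rigid_rod_distance(0.25, 0.5, 100.0, 300.0)
print(d_mn)
```

Because only the slow tag velocities enter, any faster motion of the monomers between the two tags is ignored, which is the origin of the underestimation discussed in the text.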

Figure 5: Barcodes generated using different methods. In each graph, the colored symbols/lines refer to the corresponding colors of the barcodes $T_{1}$, $T_{2}$, $T_{3}$, $T_{4}$, $T_{5}$, $T_{6}$, $T_{7}$, and $T_{8}$ respectively. The open and filled symbols represent barcodes for $U\rightarrow D$ and $D\rightarrow U$ translocation using (a) Eqn. 3, (c) method 1, and (e) method 2. In (b), (d) and (f) the solid lines refer to the actual locations of the barcodes and the dashed lines correspond to the averages from (a), (c) and (e) respectively. The improved accuracy of the latter two methods is readily visible in (d) and (f), where the simulation and the actual data are almost indistinguishable.
Table 2: Barcodes from various methods
Tag     Relative distance   Barcode         Barcode         Barcode
label   w.r.t. $T_5$        (Eqn. 3)        (Method-I)      (Method-II)
                            $\times$        $\checkmark$    $\checkmark$
$T_1$   460                 373 $\pm$ 122   459 $\pm$ 59    460 $\pm$ 43
$T_2$   245                 197 $\pm$ 67    250 $\pm$ 39    250 $\pm$ 32
$T_3$   235                 183 $\pm$ 63    237 $\pm$ 38    237 $\pm$ 32
$T_4$   215                 167 $\pm$ 54    211 $\pm$ 35    211 $\pm$ 30
$T_5$   0                   0               0               0
$T_6$   11                  11 $\pm$ 3      14 $\pm$ 4      11 $\pm$ 3
$T_7$   82                  68 $\pm$ 23     86 $\pm$ 23     86 $\pm$ 21
$T_8$   287                 230 $\pm$ 73    287 $\pm$ 65    287 $\pm$ 73

• Tension Propagation (TP) theory explains the source of the discrepancy and provides a solution: Unlike a rigid rod, tension propagation governs the semi-flexible chain’s motion in the presence of an external bias. In TP theory and its implementation in Brownian dynamics, the motion of the subchain on the cis side decouples into two domains [16, 17]. In the vicinity of the pore, the tension front affects the motion directly, while the second domain remains unperturbed, beyond the reach of the TP front. In our case, after the tag $T_{m}$ translocates through the pore, the subsequent monomers are dragged into the pore quickly by the tension front, analogous to the uncoiling of a rope pulled from one end. This sudden faster motion continues to grow and reaches its maximum when the tension front hits the subsequent tag $T_{m\pm 1}$, which has larger inertia and viscous drag. At this time (called the tension propagation time [18]) the faster motion of the monomers begins to taper down to the velocity of the tag $T_{m\pm 1}$. This process continues from one segment to the next. Fig. 6 shows an example of how the segment connecting $T_{7}$ and $T_{8}$ has a non-monotonic velocity under the influence of the tension front.

Figure 6: Tension propagation (TP) through the chain backbone connecting $T_{7}$ and $T_{8}$. (a) A sudden fast movement of monomers right after $T_{7}$’s passage through the pore. Due to the TP front’s influence (yellow blob region), subsequent monomers are sucked into the pore quickly. (b) The TP front finally reaches $T_{8}$, leading to a slower translocation speed due to the tag’s large inertia and higher viscous drag.

The contour lengths of these faster-moving segments in between two barcodes are not accounted for in Eqn. 3. The experimental protocols are limited to extracting barcode information through Eqn. 3 (measuring the current blockade time) and are therefore likely to underestimate the barcode distances, unless the data are corrected to account for the faster-moving monomers in between two tags.

• How to determine the barcodes correctly? Fig. 1(b) and the $3^{rd}$ column of Table-II, when looked at closely, provide clues to the solution of the underestimated tag distances. We note that the locations of the isolated tags (such as $T_{1}$ and $T_{8}$) far from $T_{5}$ have larger error bars, while $T_{6}$, which is adjacent to $T_{5}$, has the correct distance from Eqn. 3. This is simply because in the latter case the contour length between $T_{5}$ and $T_{6}$ is almost equal to the shortest distance. Evidently, the error bars increase with increased separation.

To compare the barcodes obtained from Eqn. 3 with the actual contour length (see the $2^{nd}$ column of Table-II) between tag pairs, we invoke Flory theory to determine the scaling exponent $\nu$ [19], which reveals the behavior of the segments under translocation. The heatmap in Fig. 7 confirms that when the separation between the tag pairs is small compared to the DNA length, the connecting segment behaves like a rigid rod ($\nu>0.6$). For the isolated tags, in contrast, $\nu<0.6$ suggests that the barcode distances are shorter than the respective contour lengths. This clarifies the reason behind the barcode underestimation for the tags which are spaced apart, while accurate barcodes are obtained for tags located in groups.
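The text does not spell out how $\nu$ is extracted for each tag pair. One simple possibility, shown here purely as an assumption, is to invert the Flory scaling relation $R \simeq \langle b_l\rangle N^{\nu}$ for a segment of $N$ bonds with end-to-end distance $R$. The Python sketch below implements this inversion; the function name is hypothetical, and the default bond length 0.971 is the average value quoted in Appendix A.

```python
import math


def effective_flory_exponent(end_to_end, n_bonds, bond_length=0.971):
    """Solve R = b * N**nu for nu (assumed estimator, not taken from the paper)."""
    if n_bonds < 2:
        raise ValueError("need at least two bonds to define an exponent")
    return math.log(end_to_end / bond_length) / math.log(n_bonds)


# A fully stretched 100-bond segment (R = N * b) gives nu = 1 (rod-like),
# while an ideal random coil (R = b * sqrt(N)) gives nu = 0.5.
print(effective_flory_exponent(0.971 * 100, 100))
print(effective_flory_exponent(0.971 * 10, 100))
```

With such an estimator, segments with $\nu$ close to 1 are nearly straight, so their shortest distance is close to the contour length, whereas smaller $\nu$ signals a segment whose end-to-end distance falls short of its contour length.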

Figure 7: Flory exponent ($\nu$) for the segment connecting a tag pair, represented as a two-dimensional heatmap array on a color scale ranging from blue to white.

Within the experimental set up we suggest the following two methods which will account for the larger velocities of the monomers.

Method 1 - Barcode from known end-to-end tag distance: In order to measure the barcode distances accurately, one thus needs the velocity of the entire chain. If the distance between $T_{1}$ and $T_{8}$ is $d_{18}\simeq L$, then the velocity of the segment $d_{18}$ will approximately account for the average velocity of the entire chain $v_{chain}$ and correct the problem, as demonstrated next. First we estimate the velocity of the chain

$$v_{chain}^{U\rightarrow D}\approx v_{18}^{U\rightarrow D}=d_{18}/\tau_{18}^{U\rightarrow D}, \tag{4}$$

assuming we know $d_{18}$, where $\tau_{18}^{U\rightarrow D}$ is the time delay of arrival at the pore between $T_{1}$ and $T_{8}$ for ${U\rightarrow D}$ translocation. We then estimate the barcode distance $d_{mn}^{U\rightarrow D}$ between tags $T_{m}$ and $T_{n}$ as

$$d_{mn}^{U\rightarrow D}=v_{18}^{U\rightarrow D}\times\tau_{mn}^{U\rightarrow D}. \tag{5}$$

In a similar fashion one can calculate $d_{mn}^{D\rightarrow U}$ using $v_{chain}^{D\rightarrow U}$ and $\tau_{mn}^{D\rightarrow U}$. How do we know $d_{18}$? One can use $d_{18}\approx L_{\rm scan}$ and $v_{chain}\approx\bar{v}_{\rm scan}$ from Eqn. 6, where $\bar{v}_{\rm scan}$ is the average velocity of the scanned length $L_{\rm scan}$ from repeated scanning, as discussed in the next paragraph. This method is effective for estimating widely spaced barcodes, but it overestimates the barcode distance if multiple barcodes are close by, as is evident in Fig. 5(d) and the $4^{th}$ column of Table-II. Thus, we know how to obtain barcode distances accurately when they are close by (from Eqn. 3) and for large separations (Eqn. 5). We now apply the physics behind these two schemes to derive an interpolation scheme that will work for all separations among the barcodes.
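Method 1 (Eqns. 4 and 5) replaces the tag dwell velocities by a single chain velocity obtained from the known $T_{1}$ to $T_{8}$ distance. A minimal Python sketch follows; the function names are illustrative, $d_{18}=747$ follows from the positions in Table-I (901 - 154), and the time delays in the example are hypothetical numbers in reduced units.

```python
def chain_velocity(d_18, tau_18):
    """Eqn. 4: v_chain ~ v_18 = d_18 / tau_18 for one scan direction."""
    if tau_18 <= 0:
        raise ValueError("time delay must be positive")
    return d_18 / tau_18


def method1_distance(d_18, tau_18, tau_mn):
    """Eqn. 5: d_mn = v_18 * tau_mn."""
    return chain_velocity(d_18, tau_18) * tau_mn


# d_18 = 901 - 154 = 747 from Table-I; the delays below are made-up inputs.
d_mn = method1_distance(d_18=747.0, tau_18=1494.0, tau_mn=430.0)
print(d_mn)
```

Since the same velocity is applied to every tag pair, closely spaced tags, which actually move together at the slower tag velocity, are assigned too large a distance, in line with the overestimation noted above.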

Method 2 - Barcode using two-step method: The average scan time $\bar{\tau}_{\rm scan}$ for the entire chain (which can be measured experimentally) is a better way to estimate the average velocity of the chain. $L_{\rm scan}$ is the length of the dsDNA segment that gets scanned while the chain remains captured inside the nanopore; it is bounded by the theoretical maximum beyond which the dsDNA would escape from the nanopore, thus $L\approx L_{\rm scan}$. For example, in our simulation the scanning length is $L_{\rm scan}=0.804L$. We denote the average scan velocity as

$$\bar{v}_{\rm scan}=\frac{1}{N_{\rm scan}}\sum_{i=1}^{N_{\rm scan}}L_{\rm scan}/\tau_{\rm scan}(i), \tag{6}$$

where $\tau_{\rm scan}(i)$ is the scan time for the $i^{th}$ event, and $N_{\rm scan}=300$. To proceed further, we use our established result that the monomers of the dsDNA segments in between the tags move with velocity $\bar{v}_{\rm scan}$, while the tags move with their respective dwell velocities $v_{dwell}^{U\rightarrow D}$ and $v_{dwell}^{D\rightarrow U}$ (Eqn. 2). We then calculate the segment velocity between two tags by taking the weighted average of the velocities of the tags and of the DNA segment in between, as follows.

First, we estimate the approximate number of monomers $N_{mn}=d_{mn}^{U\rightarrow D}/\langle b_{l}\rangle$ ($\langle b_{l}\rangle$ is the bond length) by considering the tag velocities only, using Eqn. 3. We then calculate the segment velocity accurately by incorporating weighted velocity contributions from both the tags and the monomers between the tags:

$$v_{weight}^{U\rightarrow D}=\frac{1}{N_{mn}}\Big[v_{dwell}^{U\rightarrow D}(m)+v_{dwell}^{U\rightarrow D}(n)+(N_{mn}-2)\,\bar{v}_{\rm scan}\Big]. \tag{7}$$

The barcodes are finally estimated by multiplying the calculated 2-step velocity in Eqn. 7 above by the tag time delay as

$$d_{mn}^{U\rightarrow D}=v_{weight}^{U\rightarrow D}\times\tau_{mn}^{U\rightarrow D} \tag{8}$$

for $U\rightarrow D$ translocation, and repeating the procedure for $D\rightarrow U$ translocation. This 2-step method accurately captures the distance between the barcodes whether the two tags are in proximity or spaced apart from each other. Table-II and Fig. 5 summarize our main results and claims.
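The two-step procedure of Eqns. 6 to 8 can be sketched as follows. This is an illustrative Python sketch rather than the authors' code: the function names are assumptions, the default bond length 0.971 is the average value quoted in Appendix A, and the guard for $N_{mn}\le 2$ is a choice made here because the weights in Eqn. 7 are not meaningful in that case.

```python
def mean_scan_velocity(l_scan, scan_times):
    """Eqn. 6: average of L_scan / tau_scan(i) over all recorded scans."""
    return sum(l_scan / tau for tau in scan_times) / len(scan_times)


def two_step_distance(v_dwell_m, v_dwell_n, tau_mn, v_scan, bond_length=0.971):
    """Barcode distance between tags T_m and T_n from Eqns. 3, 7 and 8."""
    # Step 1: rough distance from the tag velocities only (Eqn. 3),
    # converted into an approximate number of beads N_mn.
    d_rough = 0.5 * (v_dwell_m + v_dwell_n) * tau_mn
    n_mn = d_rough / bond_length
    if n_mn <= 2.0:
        # Guard added for this sketch; not discussed in the text.
        raise ValueError("Eqn. 7 needs more than two beads between the tags")
    # Step 2: weighted velocity of two tags plus (N_mn - 2) monomers (Eqn. 7),
    # then the corrected distance (Eqn. 8).
    v_weight = (v_dwell_m + v_dwell_n + (n_mn - 2.0) * v_scan) / n_mn
    return v_weight * tau_mn


# Hypothetical inputs in reduced units.
v_scan = mean_scan_velocity(800.0, [1600.0, 1600.0])
print(two_step_distance(0.25, 0.25, tau_mn=400.0, v_scan=v_scan, bond_length=1.0))
```

For small $N_{mn}$ the tag velocities dominate and the result stays close to Eqn. 3, while for large $N_{mn}$ the weighted velocity approaches $\bar{v}_{\rm scan}$, which is how the scheme interpolates between the two limits.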

• Summary & Future work: Motivated by recent experiments, we have designed a barcode determination experiment in silico in a cylindrical nanopore using the Brownian dynamics scheme on a model dsDNA with known locations of the barcodes. We have carefully chosen the locations of the barcodes so that the separations among the barcodes span a broad distribution. We discover that if we use only the dwell time data of the barcodes from multiple scans of the dsDNA to calculate the average velocities of the tags, then the method underestimates the barcode distances for tags that are further apart. Our simulation guides us to conclude that the source of this underestimation lies in neglecting the information contained in the faster-moving DNA segments in between any two tags. We use non-equilibrium tension propagation theory to explain the non-monotonic velocity of the chain segments, where the barcodes lie at the lower bound of the velocity envelope, as shown in Fig. 4. The emerging picture readily shows how to rectify this error by introducing an interpolation scheme that works well for barcodes at all separations, which we validate using simulation data. We suggest how to implement the scheme in an experimental setup. It is important to note that the interpolation scheme, based on the concept of TP theory, is quite general, and we have ample evidence that it will work in a double nanopore system as well.

• Conflicts of interest: The authors declare no competing financial interest.

• Acknowledgements: The research at UCF has been supported by grant number 1R21HG011236-01 from the National Human Genome Research Institute at the National Institutes of Health. All computations were carried out at UCF’s high performance computing platform STOKES.

Appendix A The Model and Brownian dynamics simulation

Our BD scheme is implemented on a bead-spring model of a polymer with the monomers interacting via an excluded volume (EV) potential, a Finite Extension Nonlinear Elastic (FENE) spring potential, and a bond-bending potential enabling variation of the chain persistence length $\ell_{p}$ (Fig. A1). The model, originally introduced for a fully flexible chain by Grest and Kremer [20], has been studied quite extensively by many groups using both Monte Carlo (MC) and various molecular dynamics (MD) methods [21]. Recently we have generalized the model for a semi-flexible chain and studied both equilibrium and dynamic properties [18, 22, 23], as well as the compression dynamics of a model dsDNA inside a nanochannel [24, 25]. The mutual EV interaction between any two monomers is given by the truncated Lennard-Jones (LJ) potential with a cut-off radius $2^{1/6}\sigma$:

$$U_{LJ}(r_{ij})=\begin{cases}4\epsilon\left[\left(\frac{\sigma}{r_{ij}}\right)^{12}-\left(\frac{\sigma}{r_{ij}}\right)^{6}\right]+\epsilon, & \text{for } r_{ij}<2^{1/6}\sigma\\ 0, & \text{otherwise}\end{cases} \tag{9}$$

where $\sigma$ is the effective diameter of a monomer and $\epsilon$ is the interaction strength.

Figure A1: (a) Illustration of the monomers interacting via the LJ and FENE potentials. The three-body bending potential is calculated using the angle $\theta_{i}$ between two adjacent bond vectors $\vec{b}_{i}$ and $\vec{b}_{i+1}$. (b) The interaction potential between two consecutive monomers is given by the green line for a separation distance $r$ in units of $\sigma$. The blue diamonds denote the LJ potential with a cutoff radius $2^{1/6}\sigma$ and the magenta circles correspond to the FENE potential with a spring constant $\kappa_{F}=30.0\epsilon/\sigma^{2}$. (c) A cylindrical nanopore of diameter $2\sigma$ is drilled into a material of thickness $t_{pore}$. The walls consist of purely repulsive LJ particles.

To mimic the connectivity between two adjacent monomers, a finite-extensible-nonlinear-elastic (FENE) potential

$$U_{FENE}(r_{ij})=-\frac{1}{2}\kappa_{F}R_{0}^{2}\ln\left[1-\left(\frac{r_{ij}}{R_{0}}\right)^{2}\right] \tag{10}$$

is used with the maximum bond-stretching length $R_{0}=1.5\sigma$ and spring constant $\kappa_{F}=30\epsilon/\sigma^{2}$. Here, $r_{ij}=|\vec{r}_{i}-\vec{r}_{j}|$ is the separation distance between two adjacent monomers $i$ and $j=i\pm 1$ located at $\vec{r}_{i}$ and $\vec{r}_{j}$ respectively. Along with these two potentials, we introduce a bending potential

$$U_{bend}(\theta_{i})=\kappa\left(1-\cos\theta_{i}\right) \tag{11}$$

with bending rigidity $\kappa$. In three dimensions, for $\kappa\neq 0$, the persistence length $\ell_{p}$ of the chain is related to $\kappa$ via [26]

$$\ell_{p}=\frac{\kappa}{k_{B}T}, \tag{12}$$

where $k_{B}$ is the Boltzmann constant and $T$ is the temperature. Here $\theta_{i}$ is the bond angle between two subsequent bond vectors $\vec{b}_{i}=\vec{r}_{i+1}-\vec{r}_{i}$ and $\vec{b}_{i-1}=\vec{r}_{i}-\vec{r}_{i-1}$. A cylindrical nanopore of diameter $2\sigma$ is drilled through a solid material of thickness $t_{pore}$, whose walls consist of immobile and purely repulsive LJ particles. Our model of the DNA polymer consists of $1016$ monomer beads along with $8$ heavier tags ($T_{1}$ - $T_{8}$) located at positions $154, 369, 379, 399, 614, 625, 696$, and $901$ respectively (please refer to Fig. 1 and Table-I in the main article). A recent study by Zhang et al. on a 48512 bp long dsDNA uses 75 bp long protein tags as barcodes [10]. In the simulation, we purposely choose the mass of a tag ($m_{tag}$) to be three times that of a normal monomer to replicate the tags used in the experiments. We proportionally increase the solvent friction of the tags, $\Gamma_{tag}=3\Gamma_{i}$. We use Brownian dynamics to solve the equation of motion of a monomer $i$ having mass $m_{i}$ and solvent friction $\Gamma_{i}$:

$$m_{i}\ddot{\vec{r}}_{i}=-\vec{\nabla}_{i}\left[U_{LJ}+U_{FENE}+U_{bend}+U_{wall}\right]-\Gamma_{i}\vec{v}_{i}+\eta_{i} \tag{13}$$

where $\Gamma_{i}=0.7\sqrt{m_{i}\epsilon^{2}/\sigma^{2}}$ is the frictional coefficient arising from the solvent-monomer interaction. For a tag, $m_{tag}=3m_{i}$ and $\Gamma_{tag}=2.1\sqrt{m_{i}\epsilon^{2}/\sigma^{2}}$. The Gaussian white noise $\eta_{i}$ arising from thermal fluctuations is delta correlated and expressed as $\langle\eta_{i}(t)\cdot\eta_{j}(t^{\prime})\rangle=2dk_{B}T\Gamma\delta_{ij}\delta(t-t^{\prime})$ with $d=3$ in three dimensions. We express length and energy in units of $\sigma$ and $\epsilon$ respectively, such that $k_{B}T/\epsilon=1.0$. The parameters of the FENE potential in Eq. (10) are set to $\kappa_{F}=30\epsilon/\sigma^{2}$ and $R_{0}=1.5\sigma$. The numerical integration of Eq. (13) is implemented using the algorithm introduced by van Gunsteren and Berendsen [27]. Our previous experience with BD simulation suggests that for a time step $\Delta t=0.01$ these parameter values produce stable trajectories over a very long period of time and do not lead to unphysical crossing of a bond by a monomer [22, 23]. The average bond length stabilizes at $\langle b_{l}\rangle=0.971\pm 0.001\sigma$ with negligible fluctuation regardless of the chain size and rigidity [22]. Hence we relate the polymer’s contour length $L$ and the number of monomers $N$ as $L=(N-1)\langle b_{l}\rangle$.
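For readers who want to experiment with the interaction model, the potentials of Eqs. (9) to (11) and the contour-length relation can be written down directly in reduced units ($\sigma=\epsilon=1$). The Python sketch below is only an illustration of these formulas, not the simulation code used for the results; the function names are assumptions, and the wall potential and the integrator of Ref. [27] are not included.

```python
import math

SIGMA = 1.0                        # monomer diameter (unit of length)
EPSILON = 1.0                      # LJ interaction strength (unit of energy)
KAPPA_F = 30.0                     # FENE spring constant, in epsilon / sigma^2
R0 = 1.5                           # maximum bond extension, in sigma
R_CUT = 2.0 ** (1.0 / 6.0) * SIGMA  # LJ cut-off radius


def u_lj(r):
    """Truncated and shifted LJ potential, Eq. (9): purely repulsive."""
    if r >= R_CUT:
        return 0.0
    sr6 = (SIGMA / r) ** 6
    return 4.0 * EPSILON * (sr6 * sr6 - sr6) + EPSILON


def u_fene(r):
    """FENE bond potential, Eq. (10); diverges as r approaches R0."""
    if r >= R0:
        raise ValueError("bond length must be smaller than R0")
    return -0.5 * KAPPA_F * R0 ** 2 * math.log(1.0 - (r / R0) ** 2)


def u_bend(theta, kappa):
    """Bending potential, Eq. (11), for bond angle theta (radians)."""
    return kappa * (1.0 - math.cos(theta))


def contour_length(n_monomers, mean_bond_length=0.971):
    """L = (N - 1) <b_l>, using the average bond length quoted in the text."""
    return (n_monomers - 1) * mean_bond_length


print(u_lj(1.0), u_fene(0.971), contour_length(1024))
```

The sum `u_lj(r) + u_fene(r)` gives the bonded pair potential, whose minimum lies close to the quoted average bond length.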

References

  • [1] Hebert, P. D. N.; Ratnasingham, S.; de Waard, J. R. Barcoding Animal Life: Cytochrome c Oxidase Subunit 1 Divergences among Closely Related Species. Proc. R. Soc. Lond. B 2003, 270, 96.
  • [2] Hebert, P. D. N.; Cywinska, A.; Ball, S. L.; deWaard, J. R. Biological Identifications through DNA Barcodes. Proc. R. Soc. Lond. B 2003, 270 (1512), 313-321.
  • [3] Hebert, P. D. N.; Penton, E. H.; Burns, J. M.; Janzen, D. H.; Hallwachs, W. Ten Species in One: DNA Barcoding Reveals Cryptic Species in the Neotropical Skipper Butterfly Astraptes Fulgerator. Proceedings of the National Academy of Sciences 2004, 101 (41), 14812-14817.
  • [4] Vernooy, R.; Haribabu, E.; Muller, M. R.; Vogel, J. H.; Hebert, P. D. N.; Schindel, D. E.; Shimura, J.; Singer, G. A. C. Barcoding Life to Conserve Biological Diversity: Beyond the Taxonomic Imperative. PLoS Biol 2010, 8 (7), e1000417.
  • [5] Besansky, N. J.; Severson, D. W.; Ferdig, M. T. DNA Barcoding of Parasites and Invertebrate Disease Vectors: What You Don’t Know Can Hurt You. Trends in Parasitology 2003, 19 (12), 545-546.
  • [6] Techen, N.; Parveen, I.; Pan, Z.; Khan, I. A. DNA Barcoding of Medicinal Plant Material for Identification. Current Opinion in Biotechnology 2014, 25, 103-110.
  • [7] Xiong, X.; Yuan, F.; Huang, M.; Lu, L.; Xiong, X.; Wen, J. DNA Barcoding Revealed Mislabeling and Potential Health Concerns with Roasted Fish Products Sold across China. 2019, 82 (7), 1200-1209.
  • [8] Wong, E. H.-K.; Hanner, R. H. DNA Barcoding Detects Market Substitution in North American Seafood. Food Research International 2008, 41 (8), 828-837.
  • [9] Pud, S.; Chao, S.-H.; Belkin, M.; Verschueren, D.; Huijben, T.; van Engelenburg, C.; Dekker, C.; Aksimentiev, A. Mechanical Trapping of DNA in a Double-Nanopore System. Nano Lett. 2016, 16 (12), 8021-8028.
  • [10] Zhang, Y.; Liu, X.; Zhao, Y.; Yu, J.-K.; Reisner, W.; Dunbar, W. B. Single Molecule DNA Resensing Using a Two-Pore Device. Small 2018, 14 (47), 1801890.
  • [11] Liu, X.; Zhang, Y.; Nagel, R.; Reisner, W.; Dunbar, W. B. Controlling DNA Tug-of-War in a Dual Nanopore Device. Small 2019, 15 (30), 1901704.
  • [12] Liu, X.; Zimny, P.; Zhang, Y.; Rana, A.; Nagel, R.; Reisner, W.; Dunbar, W. B. Flossing DNA in a Dual Nanopore Device. Small 2020, 16 (3), 1905379.
  • [13] Bhattacharya, A.; Seth, S. Tug of War in a Double-Nanopore System. Phys. Rev. E 2020, 101 (5).
  • [14] Seth, S.; Bhattacharya, A. Polymer Escape through a Three Dimensional Double-Nanopore System. J. Chem. Phys. 2020, 153 (10), 104901.
  • [15] Choudhary, A.; Joshi, H.; Chou, H.-Y.; Sarthak, K.; Wilson, J.; Maffeo, C.; Aksimentiev, A. High-Fidelity Capture, Threading, and Infinite-Depth Sequencing of Single DNA Molecules with a Double-Nanopore System. ACS Nano 2020, 14 (11), 15566-15576.
  • [16] Sakaue, T. Nonequilibrium Dynamics of Polymer Translocation and Straightening. Phys. Rev. E 2007, 76 (2).
  • [17] Ikonen, T.; Bhattacharya, A.; Ala-Nissila, T.; Sung, W. Influence of Non-Universal Effects on Dynamical Scaling in Driven Polymer Translocation. The Journal of Chemical Physics 2012, 137 (8), 085101.
  • [18] Adhikari, R.; Bhattacharya, A. Driven Translocation of a Semi-Flexible Chain through a Nanopore: A Brownian Dynamics Simulation Study in Two Dimensions. The Journal of Chemical Physics 2013, 138 (20), 204909.
  • [19] Rubinstein, M.; Colby, R. H. Polymer physics. Oxford: Oxford University Press 2003.
  • [20] Grest, G. S.; Kremer, K. Molecular Dynamics Simulation for Polymers in the Presence of a Heat Bath. Phys. Rev. A 1986, 33 (5), 3628-3631.
  • [21] Binder, K. Monte Carlo and Molecular Dynamics Simulations in Polymer Science; Oxford University Press, 1995, Chap. 2.
  • [22] Huang, A.; Bhattacharya, A.; Binder, K. Conformations, Transverse Fluctuations, and Crossover Dynamics of a Semi-Flexible Chain in Two Dimensions. The Journal of Chemical Physics 2014, 140 (21), 214902.
  • [23] Huang, A.; Adhikari, R.; Bhattacharya, A.; Binder, K. Universal Monomer Dynamics of a Two-Dimensional Semi-Flexible Chain. EPL 2014, 105 (1), 18002.
  • [24] Huang, A.; Reisner, W.; Bhattacharya, A. Dynamics of DNA Squeezed Inside a Nanochannel via a Sliding Gasket. Polymers 2016, 8 (10), 352.
  • [25] Bernier, S.; Huang, A.; Reisner, W.; Bhattacharya, A. Evolution of Nested Folding States in Compression of a Strongly Confined Semiflexible Chain. Macromolecules 2018, 51 (11), 4012-4022.
  • [26] Landau, L. D.; Lifshitz, E. M. Statistical Physics; Pergamon Press, 1981.
  • [27] van Gunsteren, W. F.; Berendsen, H. J. C. Algorithms for Brownian Dynamics. Molecular Physics 1982, 45 (3), 637-647.