Constant Acceleration Flow

Dogyun Park
Korea University
[email protected]
&Sojin Lee
Korea University
sojin

\_

[email protected]
&Sihyeon Kim
Korea University
sh

\_

[email protected]
&Taehoon Lee
Korea University
[email protected]
&Youngjoon Hong^∗
KAIST
[email protected]
&Hyunwoo J. Kim
Korea University
[email protected] Corresponding authors.

Abstract

Rectified flow and reflow procedures have significantly advanced fast generation by progressively straightening ordinary differential equation (ODE) flows. They operate under the assumption that image and noise pairs, known as couplings, can be approximated by straight trajectories with constant velocity. However, we observe that modeling with constant velocity and using reflow procedures have limitations in accurately learning straight trajectories between pairs, resulting in suboptimal performance in few-step generation. To address these limitations, we introduce Constant Acceleration Flow (CAF), a novel framework based on a simple constant acceleration equation. CAF introduces acceleration as an additional learnable variable, allowing for more expressive and accurate estimation of the ODE flow. Moreover, we propose two techniques to further improve estimation accuracy: initial velocity conditioning for the acceleration model and a reflow process for the initial velocity. Our comprehensive studies on toy datasets, CIFAR-10, and ImageNet 64×64 demonstrate that CAF outperforms state-of-the-art baselines for one-step generation. We also show that CAF dramatically improves few-step coupling preservation and inversion over Rectified flow. Code is available at https://github.com/mlvlab/CAF.

1 Introduction

Diffusion models [1, 2] learn the probability flow between a target data distribution and a simple Gaussian distribution through an iterative process. Starting from Gaussian noise, they gradually denoise to approximate the target distribution via a series of learned local transformations. Due to their superior generative capabilities compared to other models such as GANs and VAEs, diffusion models have become the go-to choice for high-quality image generation. However, their multi-step generation process entails slow generation and imposes a significant computational burden. To address this issue, two main approaches have been proposed: distillation models [3, 4, 5, 6, 7, 8, 9] and methods that simplify the flow trajectories [10, 11, 12, 13, 14] to achieve fewer-step generation. An example of the latter is rectified flow [10, 13, 11], which focuses on straightening ordinary differential equation (ODE) trajectories. Through repeated applications of the rectification process, called reflow, the trajectories become progressively straighter by addressing the flow crossing problem. Straighter flows reduce discretization errors, enabling fewer steps in the numerical solution and, thus, faster generation.

Rectified flow [10, 13] defines the straight ODE flow over time $t$ with a drift force $\mathbf{v}$ , where each sample $\mathbf{x}_{t}$ transforms from $\mathbf{x}_{0}\sim\pi_{0}$ to $\mathbf{x}_{1}\sim\pi_{1}$ under a constant velocity $v=\mathbf{x}_{1}-\mathbf{x}_{0}$ . It approximates the underlying velocity $\mathbf{v}$ with a neural network $\mathbf{v}_{\theta}$ . Then, it iteratively applies the reflow process to avoid flow crossing by rewiring the flow and building deterministic data coupling. However, constant velocity modeling may limit the expressiveness needed for approximating complex couplings between $\pi_{0}$ and $\pi_{1}$ . This results in sampling trajectories that fail to converge optimally to the target distribution. Moreover, the interpolation paths after the reflow may still intersect—a phenomenon known as flow crossing—which leads to curved rectified flows because the model estimates different targets for the same input. As illustrated in Fig. 1(a), instead of following the intended path from $\mathbf{x}_{0}^{1}$ to $\mathbf{x}_{1}^{1}$ , a sampling trajectory from Rectified flow erroneously diverts towards $\mathbf{x}_{1}^{2}$ due to the flow crossing. Such flow crossing can make the accurate learning of straight ODE trajectories more challenging.

In this paper, we introduce the Constant Acceleration Flow (CAF), a novel ODE framework based on a constant acceleration equation, as outlined in (4). Our CAF generalizes Rectified flow by introducing acceleration as an additional learnable variable. This constant acceleration modeling offers the ability to control flow characteristics by manipulating the acceleration magnitude and enables a direct closed-form solution of the ODE, supporting precise and efficient sampling in just a few steps. Additionally, we propose two strategies to address the flow crossing problem. The first one is initial velocity conditioning (IVC) for the acceleration model, and the second one is to employ reflow to enhance the learning of initial velocity. Fig. 1(b) presents that CAF, with the proposed strategies, can accurately predict the ground-truth path from $\mathbf{x}_{0}^{1}$ to $\mathbf{x}_{1}^{1}$ , even when flow crossing occurs. Through extensive experiments, from toy datasets to real-world image generation on CIFAR-10 [15] and ImageNet 64 $\times$ 64, we demonstrate that our CAF exhibits superior performance over Rectified flow and state-of-the-art baselines. Notably, CAF achieves superior Fréchet Inception Distance (FID) scores on CIFAR-10 and ImageNet 64 $\times$ 64 in conditional settings, recording FIDs of 1.39 and 1.69, respectively, thereby surpassing recent strong methods. Moreover, we show that CAF provides more accurate flow estimation than Rectified flow by assessing the ‘straightness’ and ‘coupling preservation’ of the learned ODE flow. CAF is also capable of few-step inversion, making it effective for real-world applications such as box inpainting.

To summarize, our contributions are as follows:

•

We propose Constant Acceleration Flow (CAF), a novel ODE framework that integrates acceleration as a controllable variable, enhancing the precision of ODE flow estimation compared to the constant velocity framework.
•

We propose two strategies to address the flow crossing problem: initial velocity conditioning for the acceleration model and a reflow procedure to improve initial velocity learning. These strategies ensure a more accurate trajectory estimation even in the presence of flow crossings.
•

Through extensive experiments on synthetic and real datasets, CAF demonstrates remarkable performance, especially achieving the superior FID on CIFAR-10 and ImageNet 64 $\times$ 64 over strong baselines. We also demonstrate that CAF learns more accurate flow than Rectified flow by assessing the straightness, coupling preservation, and inversion.

2 Related work

Generative models.

Learning generative models involves finding a nonlinear transformation between two distributions, typically denoted as $\pi_{0}$ and $\pi_{1}$ , where $\pi_{0}$ is a simple distribution like a Gaussian, and $\pi_{1}$ is the complex data distribution. Various approaches have been developed to achieve this transformation. For example, variational autoencoders (VAE) [16, 17] optimize the Evidence Lower Bound (ELBO) to learn a nonlinear mapping from the latent space distribution $\pi_{0}$ to the data distribution $\pi_{1}$ . Normalizing flows [18, 19, 20] construct a series of invertible and differentiable mappings to transform $\pi_{0}$ into $\pi_{1}$ . Similarly, GANs [21, 22, 23, 24, 25] earn a generator that transforms $\pi_{0}$ into $\pi_{1}$ through an adversarial process involving a discriminator. These models typically perform a one-step generation from $\pi_{0}$ to $\pi_{1}$ . In contrast, diffusion models [2, 26, 27, 28, 29, 30] propose learning the probability flow between the two distributions through an iterative process. This iterative process ensures stability and precision, as the model incrementally learns to reverse a diffusion process that adds noise to data. Diffusion models have demonstrated superior performance across various domains, including images [31, 12, 32, 33], 3D [34, 35, 36, 37], and video [38, 39, 40].

Few-step diffusion models

Addressing the slow generation speed of diffusion models has become a major focus in recent research: Distillation methods [3, 4, 5, 6, 7, 8, 9] seek to optimize the inference steps of pre-trained diffusion models by amortizing the integration of ODE flow. Consistency models [6, 8, 7] train a model to map any point on the pre-trained diffusion trajectory back to the data distribution, enabling fast generation. Rectified flow [10, 13, 11] is another direction, which focuses on straightening ODE trajectories under a constant velocity field. By straightening the flow and reducing path complexity, it allows for fast generation through efficient and accurate numerical solutions with fewer Euler steps. Recent methods such as AGM [41] also introduce acceleration modeling based on Stochastic Optimal Control (SOC) theory instead of relying solely on velocity. However, AGM predicts time-varying acceleration, which still requires multiple iterative steps to solve the differential equations. In contrast, our proposed CAF ODE assumes that the acceleration term is constant with respect to time. Therefore, there is no need to iteratively solve complex time-dependent differential equations. This simplification allows for a direct closed-form solution that supports efficient and accurate sampling in just a few steps.

3 Preliminary

Rectified flow [10, 13] is an ordinary differential equation-based framework for learning a mapping between two distributions $\pi_{0}$ and $\pi_{1}$ . Typically, in image generation, $\pi_{0}$ is a simple tractable distribution, e.g., the standard normal distribution, defined in the latent space and $\pi_{1}$ is the image distribution. Given empirical observations of $\mathbf{x}_{0}\sim\pi_{0}$ and $\mathbf{x}_{1}\sim\pi_{1}$ over time $t\in[0,1]$ , a flow is defined as

\frac{d\mathbf{x}_{t}}{dt}=\mathbf{v}(\mathbf{x}_{t},t),

(1)

where $\mathbf{x}_{t}=\mathcal{I}(\mathbf{x}_{0},\mathbf{x}_{1},t)$ is a time-differentiable interpolation between $\mathbf{x}_{0}$ and $\mathbf{x}_{1}$ , and $\mathbf{v}:\mathbb{R}^{d}\times[0,1]\rightarrow\mathbb{R}^{d}$ is a velocity field defined on data-time domain. Rectified flow learns the velocity field $\mathbf{v}$ with a neural network $\mathbf{v}_{\theta}$ by minimizing the following mean square objective:

\min_{\theta}\mathbb{E}_{\mathbf{x}_{0},\mathbf{x}_{1}\sim\gamma,t\sim p(t)}\left[\left\|\mathbf{v}(\mathbf{x}_{t},t)-\mathbf{v}_{\theta}(\mathbf{x}_{t},t)\right\|^{2}\right],

(2)

where $\gamma$ represents a coupling of ( $\pi_{0},\pi_{1}$ ) and $p(t)$ is a time distribution defined on $[0,1]$ . The choice of interpolation $\mathcal{I}$ leads to various algorithms, such as Rectified flow [10], ADM [30], EDM [29], and LDM [42]. Specifically, Rectified flow proposes a simple linear interpolation between $\mathbf{x}_{0}$ and $\mathbf{x}_{1}$ as $\mathbf{x}_{t}=(1-t)\mathbf{x}_{0}+t\mathbf{x}_{1}$ , which induces the velocity field $\mathbf{v}$ in the direction of ( $\mathbf{x}_{1}-\mathbf{x}_{0})$ , i.e., $\mathbf{v}(\mathbf{x}_{t},t)=\mathbf{x}_{1}-\mathbf{x}_{0}$ . This means the Rectified flow transports $\mathbf{x}_{0}$ to $\mathbf{x}_{1}$ along a straight trajectory with a constant velocity. After training $\mathbf{v}_{\theta}$ , we can generate a sample $\mathbf{x}_{1}$ using off-the-shelf ODE solvers $\Phi$ , such as the Euler method:

\mathbf{x}_{t+\Delta t}=\mathbf{x}_{t}+\Delta t\cdot\mathbf{v}_{\theta}(\mathbf{x}_{t},t),\quad t\in\{0,\Delta t,\dots,(N-1)\cdot\Delta t\},

(3)

where $\Delta t=\frac{1}{N}$ and $N$ is the total number of steps. To achieve faster generation with fewer steps without sacrificing accuracy, it is crucial to learn a straight ODE flow. Straight ODE flow minimize numerical errors encountered by the ODE solver.

Reflow and flow crossing.

The trajectories of interpolants $\mathbf{x}_{t}$ may intersect—a phenomenon known as flow crossing—due to stochastic coupling between $\pi_{0}$ and $\pi_{1}$ (e.g., random pairing of $\mathbf{x}_{0}$ and $\mathbf{x}_{1}$ ). These intersections introduce approximation errors in the neural network, leading to curved sampling trajectories [10]. Our toy experiment, illustrated in Fig. 1(a), clearly demonstrates this issue: the simulated sampling trajectories become curved due to flow crossing, rendering one-step simulation inaccurate. To address this problem, Rectified flow [10] introduces a reflow procedure. This procedure iteratively straightens the trajectories by reconstructing a more deterministic and direct pairing of $\mathbf{x}_{0}$ and $\mathbf{x}_{1}$ without altering the marginal distributions. Specifically, the reflow procedure involves generating a new coupling $\gamma$ of $(\mathbf{x}_{0},\mathbf{x}_{1}=\Phi(\mathbf{x}_{0};\mathbf{v}_{\theta}^{k}))$ using a pre-trained Rectified flow model $\mathbf{v}_{\theta}^{k}$ , where $k$ denotes the iteration of the reflow procedure, and $\Phi(\mathbf{x}_{0};\mathbf{v}_{\theta}^{k})=\mathbf{x}_{0}+\int_{0}^{1}\mathbf{v}_{\theta}^{k}(\mathbf{x}_{t},t)dt$ . By iteratively refining the coupling and the velocity field, the reflow procedure reduces flow crossing, resulting in straighter trajectories and improved accuracy in fewer steps.

4 Method

We aim to develop a generative model based on the ODE framework that enables faster generation without compromising quality. To achieve this, we propose a novel approach called Constant Acceleration Flow (CAF). Specifically, CAF formulates an ODE trajectory that transports $\mathbf{x}_{t}$ with a constant acceleration, offering a more expressive and precise estimation of the ODE flow compared to constant velocity models. Additionally, we propose two novel techniques that address the problem of flow crossing: 1) initial velocity conditioning and 2) reflow procedure for learning initial velocity. The overall training pipeline is presented in Alg. 1.

4.1 Constant Acceleration Flow

We propose a novel ODE framework based on the constant acceleration equation, which is driven by the empirical observations $\mathbf{x}_{0}\sim\pi_{0}$ and $\mathbf{x}_{1}\sim\pi_{1}$ over time $t\in[0,1]$ as:

d\mathbf{x}_{t}=\mathbf{v}(\mathbf{x}_{0},0)dt+\mathbf{a}(\mathbf{x}_{t},t)tdt,

(4)

where $\mathbf{v}:\mathbb{R}^{d}\times[0]\rightarrow\mathbb{R}^{d}$ is the initial velocity field and $\mathbf{a}:\mathbb{R}^{d}\times[0,1]\rightarrow\mathbb{R}^{d}$ is the acceleration field. We abbreviate time variable $t$ for notation simplicity, i.e., $\mathbf{v}(\mathbf{x}_{0},0)=\mathbf{v}(\mathbf{x}_{0})$ , $\mathbf{a}(\mathbf{x}_{t},t)=\mathbf{a}(\mathbf{x}_{t})$ . By integrating both sides of (4) with respect to $t$ and assuming a constant acceleration field, i.e., $\mathbf{a}(\mathbf{x}_{t_{1}})=\mathbf{a}(\mathbf{x}_{t_{2}}),\forall t_{1},t_{2}\in[0,1]$ , we derive the following equation:

\mathbf{x}_{t}=\mathbf{x}_{0}+\mathbf{v}(\mathbf{x}_{0})t+\frac{1}{2}\mathbf{a}(\mathbf{x}_{t})t^{2}.

(5)

Given the initial velocity field $\mathbf{v}$ , the acceleration field $\mathbf{a}$ can be derived as

\mathbf{a}(\mathbf{x}_{t})=2(\mathbf{x}_{1}-\mathbf{x}_{0})-2\mathbf{v}(\mathbf{x}_{0}),

(6)

by setting $t=1$ and the constant acceleration assumption. Then, we propose a time-differentiable interpolation $\mathcal{I}$ as:

\mathbf{x}_{t}=\mathcal{I}(\mathbf{x}_{0},\mathbf{x}_{1},t,\mathbf{v}(\mathbf{x}_{0}))=(1-t^{2})\mathbf{x}_{0}+t^{2}\mathbf{x}_{1}+\mathbf{v}(\mathbf{x}_{0})(t-t^{2}),

(7)

by substituting (6) to (5). Using this result, we can easily simulate an intermediate sample $\mathbf{x}_{t}$ on our CAF ODE trajectory.

Learning initial velocity field.

Selecting an appropriate initial velocity field is crucial, as different initial velocities lead to distinct flow dynamics. Here, we define the initial velocity field as a scaled displacement vector between $\mathbf{x}_{1}$ and $\mathbf{x}_{0}$ :

\mathbf{v}(\mathbf{x}_{0})=h(\mathbf{x}_{1}-\mathbf{x}_{0}),

(8)

where $h\in\mathbb{R}$ is a hyperparameter that adjusts the scale of the initial velocity. This configuration enables straight ODE trajectories between distributions $\pi_{0}$ and $\pi_{1}$ , similar to those in Rectified flow. However, varying $h$ changes the flow characteristics: 1) $h=1$ simulates constant velocity flows, 2) $h<1$ leads to a model with a positive acceleration, and 3) $h>1$ results in a negative acceleration, as illustrated in Fig. 3. Empirically, we observe that the negative acceleration model is more effective for image sampling, possibly due to its ability to finely tune step sizes near data distribution.

The initial velocity field is learned using a neural network $\mathbf{v}_{\theta}$ , which is optimized by minimizing the distance metric $d(\cdot,\cdot)$ between the target and estimated velocities as

\min_{\theta}\mathbb{E}_{\mathbf{x}_{0},\mathbf{x}_{1}\sim\gamma,t\sim p(t),\mathbf{x}_{t}\sim\mathcal{I}}\left[d(\mathbf{v}(\mathbf{x}_{0}),\mathbf{v}_{\theta}(\mathbf{x}_{t}))\right],

(9)

where $p(t)$ is a time distribution defined on $[0,1]$ . Note that our velocity model learns target initial velocity defined at $t=0$ . This differs from Rectified flow, which learns target velocity field defined over $t\in[0,1]$ .

Learning acceleration field.

Under the assumption of constant acceleration, the acceleration field is derived from (6) as

\mathbf{a}(\mathbf{x}_{t})=2(\mathbf{x}_{1}-\mathbf{x}_{0})-2\mathbf{v}(\mathbf{x}_{0}).

(10)

We learn the acceleration field using a neural network $\mathbf{a}_{\phi}$ by minimizing the distance metric $d(\cdot,\cdot)$ as:

\min_{\phi}\mathbb{E}_{\mathbf{x}_{0},\mathbf{x}_{1}\sim\gamma,t\sim p(t),\mathbf{x}_{t}\sim\mathcal{I}}\left[d(\mathbf{a}(\mathbf{x}_{t}),\mathbf{a}_{\phi}(\mathbf{x}_{t}))\right].

(11)

In Sec. C, we theoretically show that CAF ODE preserves the marginal data distribution.

4.2 Addressing flow crossing

Rectified flow addresses the issue of flow crossing by a reflow procedure. However, even after the procedure, trajectories may still intersect each other. Such intersections hinder learning straight ODE trajectories, as demonstrated in Fig. 1(a). Similarly, our acceleration model also encounters the flow crossing problem. This leads to inaccurate estimation, as the model struggles to predict estimation on these intersections correctly. To further address the flow crossing, we propose two techniques.

Initial velocity conditioning (IVC).

We propose conditioning the estimated initial velocity $\hat{\mathbf{v}}_{\theta}=\mathbf{v}(\mathbf{x}_{0})$ as the input of the acceleration model, i.e., $\mathbf{a}_{\phi}(\mathbf{x}_{t},\hat{\mathbf{v}}_{\theta})$ . This approach provides the acceleration model with auxiliary information on the flow direction, enhancing its capability to distinguish correct estimations and mitigate ambiguity at the intersections of trajectories, as illustrated in Fig. 1. Our IVC circumvents the non-intersecting condition required in Rectified flow (see Theorem 3.6 in [10]), which is a key assumption for achieving a straight coupling $\gamma$ . By reducing the ambiguity arising from intersections, CAF can learn straight trajectories with less constrained couplings, which is quantitatively assessed in Tab. 5.

To incorporate IVC into learning the acceleration model, we reformulate (11) as:

\min_{\phi}\mathbb{E}_{\mathbf{x}_{0},\mathbf{x}_{1}\sim\gamma,t\sim p(t),\mathbf{x}_{t}\sim\mathcal{I}}\left[d\left(\text{sg}[\mathbf{a}(\mathbf{x}_{t})],\mathbf{a}_{\phi}(\mathbf{x}_{t},\hat{\mathbf{v}}_{\theta})\right)\right].

(12)

where $\text{sg}[\cdot]$ indicates stop-gradient operation. Since our velocity model learns to predict the initial velocity (see (9)), we ensure that the model can handle both forward and reverse CAF ODEs, which start from $\mathbf{x}_{0}$ and $\mathbf{x}_{1}$ , respectively. Thus, our acceleration model can generalize across different flow directions, enabling inversion as demonstrated in Sec. B.2.

Algorithm 1 Training process of Constant Acceleration Flow

1:deterministic coupling

\gamma

, initial velocity scale

h

\mathbf{v}_{\theta},\mathbf{a}_{\phi}

2:while not converge do

\mathbf{x}_{0},\mathbf{x}_{1}\sim\gamma

t\sim\text{Unif}([0,1])

\mathbf{v}(\mathbf{x}_{0})=h(\mathbf{x}_{1}-\mathbf{x}_{0})

\triangleright

Target initial velocity

\mathbf{x}_{t}=\mathcal{I}(\mathbf{x}_{0},\mathbf{x}_{1},t,\mathbf{v}(\mathbf{x}_{0}))

\triangleright

Eq. (7)

\mathcal{L}_{\text{vel}}=d(\mathbf{v}(\mathbf{x}_{0}),\mathbf{v}_{\theta}(\mathbf{x}_{t}))

\theta\leftarrow\theta-\nabla\mathcal{L}_{\text{vel}}

\triangleright

update

\theta

using SGD with gradient

8:end while

9:while not converge do

10:

\mathbf{x}_{0},\mathbf{x}_{1}\sim\gamma

t\sim\text{Unif}([0,1]),\hat{\mathbf{v}}_{\theta}=\mathbf{v}_{\theta}(\mathbf{x}_{0})

11:

\mathbf{a}(\mathbf{x}_{t})=2(\mathbf{x}_{1}-\mathbf{x}_{0})-2\hat{\mathbf{v}}_{\theta}

\triangleright

Target acceleration

12:

\mathbf{x}_{t}=\mathcal{I}(\mathbf{x}_{0},\mathbf{x}_{1},t,\hat{\mathbf{v}}_{\theta})

\triangleright

Eq. (7)

13:

\mathcal{L}_{\text{acc}}=d(\text{sg}[\mathbf{a}(\mathbf{x}_{t})],\mathbf{a}_{\phi}(\mathbf{x}_{t},\hat{\mathbf{v}}_{\theta}))

14:

\phi\leftarrow\phi-\nabla\mathcal{L}_{\text{acc}}

\triangleright

update

\phi

using SGD with gradient

15:end while

16:return

\mathbf{v}_{\theta},\mathbf{a}_{\phi}

Reflow for initial velocity.

It is also important to improve the accuracy of the initial velocity model. Following [10], we address the inaccuracy caused by stochastic pairing of $\mathbf{x}_{0}$ and $\mathbf{x}_{1}$ by employing a pre-trained generative model $\psi$ , which constructs a more deterministic coupling $\gamma$ of $\mathbf{x}_{0}$ and $\mathbf{x}_{1}$ . We subsequently use this new coupling $\gamma$ to train the initial velocity and acceleration models.

4.3 Sampling

After training the initial velocity and acceleration models, we generate samples using the CAF ODE introduced in (4). The discrete sampling process is given by:

\mathbf{x}_{t+\Delta t}=\mathbf{x}_{t}+\Delta t\cdot\mathbf{v}_{\theta}(\mathbf{x}_{0})+t^{\prime}\cdot\Delta t\cdot\mathbf{a}_{\phi}(\mathbf{x}_{t},t,\mathbf{v}_{\theta}(\mathbf{x}_{0})),

(13)

where $N$ is the total number of steps, $\Delta t=\frac{1}{N}$ , $t=i\cdot\Delta t$ , and $t^{\prime}=\frac{(2i+1)}{2}\cdot\Delta t$ where $i\in\{0,...,N-1\}$ (See Alg. 2). We adopt $t^{\prime}$ since it empirically improves accuracy, especially in the small $N$ regime. Notably, when $N=1$ (one-step generation), $t\textquoteright$ simplifies to $\frac{1}{2}$ , leading to the closed-form solution in (5). See Alg. 3 for inversion algorithm.

Algorithm 2 Sampling process of Constant Acceleration Flow

1:velocity model

\mathbf{v}_{\theta}

, acceleration model

\mathbf{a}_{\phi}

, sampling steps

N

\pi_{0}

\mathbf{x}_{0}\sim\pi_{0}

\hat{\mathbf{v}}_{\theta}\leftarrow\mathbf{v}_{\theta}(\mathbf{x}_{0})

4:for

i=0

{N-1}

t\leftarrow\frac{i}{N}

t^{\prime}\leftarrow\frac{2i+1}{2N}

\hat{\mathbf{a}}_{\phi}\leftarrow\mathbf{a}_{\phi}(\mathbf{x}_{t},\mathbf{v}_{\theta})

\mathbf{x}_{t+\frac{1}{N}}\leftarrow\mathbf{x}_{t}+\frac{1}{N}\hat{\mathbf{v}}_{\theta}+\frac{t^{\prime}}{N}\hat{\mathbf{a}}_{\phi}

9:end for

10:return

\mathbf{x}_{1}

5 Experiment

We evaluate the proposed Constant Acceleration Flow (CAF) across various scenarios, including both synthetic and real-world datasets. In Sec. 5.1, our investigation begins with a simple two-dimensional synthetic dataset, where we compare the performance of Rectified flow and CAF to clearly demonstrate the effectiveness of our model. Next, we extend our experiments to real-world image datasets, specifically CIFAR-10 (32×32) and ImageNet (64×64), in Sec. 5.2. These experiments highlight CAF’s ability to generate high-quality images with a single sampling step. Furthermore, we conduct an in-depth analysis of CAF through evaluations of coupling preservation, straightness, inversion tasks, and an ablation study in Sec. 5.3.

5.1 Synthetic experiments

We demonstrate the advantages of the Constant Acceleration Flow (CAF) over the constant velocity flow model, Rectified Flow [10], through synthetic experiments. For the neural networks, we use multilayer perceptrons (MLPs) with five hidden layers and 128 units per layer. Initially, we train 1-Rectified flow on 2D synthetic data to establish a deterministic coupling. We then train both CAF and 2-Rectified flow. For CAF, we incorporate the initial velocity into the acceleration model by concatenating it with the input, ensuring that the model capacities of both CAF and 2-Rectified flow remain comparable. We set $d$ as $l_{2}$ distance. Fig. 2 presents samples generated from CAF in one step and from 2-Rectified flow in two steps. Our CAF more accurately approximates the target distribution $\pi_{1}$ than 2-Rectified flow. In particular, CAF with $h=2$ (negative acceleration) learns the most accurate distribution. In contrast, 2-Rectified flow frequently generates samples that significantly deviate from $\pi_{1}$ , indicating its difficulty in accurately estimating straight ODE trajectories. This experiment shows that reflowing alone may not overcome the flow crossing problem, leading to poor estimations, whereas our proposed acceleration modeling and IVC effectively address this issue. Moreover, Fig. 3 shows sampling trajectories from CAF trained with different hyperparameters $h$ . It clearly demonstrates that $h$ controls the flow dynamics as we intended: $h>1$ indicates negative acceleration, $h=1$ represents constant velocity, and $h<1$ corresponds to positive acceleration flows. Additional synthetic examples are provided in Fig. 6.

5.2 Real-data experiments

To further validate the effectiveness of our approach, we train CAF on real-world image datasets, specifically CIFAR-10 at 32×32 resolution and ImageNet at 64×64 resolution. To create a deterministic coupling $\gamma$ , we utilize the pre-trained EDM models [29] and adopt the U-Net architecture of ADM [30] for the initial velocity and acceleration models. In the acceleration model, we double the input dimension of first layer to concatenate the initial velocity to the input $\mathbf{x}_{t}$ of the acceleration model, which marginally increases the total number of parameters. We set $h=1.5$ and $d$ as LPIPS-Huber loss [43] for all real-data experiments.

GAN Models
Model	$N$	Unconditional	Conditional
Model	$N$	FID $\downarrow$	FID $\downarrow$
BigGAN [22]	1	8.51	-
StyleGAN-Ada [23]	1	2.92	2.42
StyleGAN-XL [24]	1	-	1.85
Diffusion/Consistency Models
Score SDE [1]	2000	2.20	-
DDPM [2]	1000	3.17	-
VDM [27]	1000	7.41	-
LSGM [28]	138	2.10	-
DDIM [26]	10	13.36	-
EDM [29]	35	2.01	1.82
EDM [29]	5	37.75	35.54
CT [6]	2	5.83	-
CT [6]	1	8.70	-
Diffusion/Consistency Models – Distillation
Diff-Instruct [9]	1	4.53	-
DMD [44]	1	3.77	-
DFNO [5]	1	3.78	-
TRACT [45]	1	3.78	-
KD [46]	1	9.36	-
CD [6]	2	2.93	-
CD [6]	1	3.55	-
CTM [7]	2	1.87	1.63
CTM [7]	1	1.98	1.73
Rectified Flow Models
2-Rectified Flow [10]	2	7.89	3.74
2-Rectified Flow [10]	1	11.81	6.88
2-Rectified Flow + Distill [10]	1	4.84	-
\cdashline1-5 CAF (Ours)	1	4.81	2.68
CAF + GAN (Ours)	1	1.48	1.39

Model	$N$	FID $\downarrow$	IS $\uparrow$	Rec $\uparrow$
GAN Models
BigGAN-deep [22]	1	4.06	-	0.48
StyleGAN-XL [24]	1	2.09	82.35	0.52
Diffusion/Consistency Models
DDIM [26]	50	13.7	-	0.56
DDIM [26]	10	18.3	-	0.49
DDPM [2]	250	11.0	-	0.58
iDDPM [47]	250	2.92	-	0.62
ADM [30]	250	2.07	-	0.63
EDM [29]	79	2.44	48.88	0.67
EDM [29]	5	55.3	-	-
DPM-solver [48]	20	3.42	-	-
DPM-solver [48]	10	7.93	-	-
DEIS [49]	20	3.10	-	-
DEIS [49]	10	6.65	-	-
CT [6]	2	11.1	-	0.56
CT [6]	1	13.0	-	0.47
Diffusion/Consistency Models – Distillation
Diff-Instruct [9]	1	5.57	-	-
DMD [44]	1	2.62	-	-
TRACT [45]	1	7.43	-	-
DFNO [5]	1	7.83	-	0.61
PD [3]	1	15.39	-	0.62
CD [6]	2	4.70	-	0.64
CD [6]	1	6.20	40.08	0.57
CTM [7]	2	1.73	64.29	0.57
CTM [7]	1	1.92	70.38	0.57
Rectified Flow Models
CAF (Ours)	1	6.52	37.45	0.62
CAF + GAN (Ours)	1	1.69	62.03	0.64

Metric	2-Rectified Flow	CAF (ours)
LPIPS $\downarrow$	0.092	0.041
PSNR $\uparrow$	29.79	33.16

Dataset	2-Rectified Flow	CAF (ours)
2D	0.065	0.058
CIFAR-10	0.043	0.034

	$\displaystyle\mathbf{x}_{1}$	$\displaystyle=\mathbf{x}_{0}+\int_{0}^{1}\mathbf{v}(\mathbf{x}_{0})+\mathbf{a}(\mathbf{x}_{t})\cdot tdt=\mathbf{x}_{0}+\mathbf{v}(\mathbf{x}_{0})+\int_{0}^{1}\mathbf{a}(\mathbf{x}_{t})\cdot tdt$		(17)
		$\displaystyle=\mathbf{x}_{0}+\mathbf{v}(\mathbf{x}_{0})+\mathbf{a}(\mathbf{x}_{t})\int_{0}^{1}tdt=\mathbf{x}_{0}+\mathbf{v}(\mathbf{x}_{0})+\frac{1}{2}\mathbf{a}(\mathbf{x}_{t})$		(18)

Model	NFE	FID $\downarrow$
CM	18	13.16
CTM	-	N/A
EDM	-	N/A
2-RF	12	16.41
CAF (Ours)	12	10.39
CAF (+GAN) (Ours)	12	10.91

Model	Acceleration	Closed-form solution	Reflow for velocity	FID on CIFAR-10 $\downarrow$
AGM [41]	Time-varying	No	No	11.88 ( $N=5$ )
AGM (enhanced ver.)	Time-varying	No	Yes	15.23 ( $N=5$ )
CAF (Ours)	Constant	Yes	Yes	4.81 ( $N=1$ )

	$\displaystyle\mathbf{z}_{t}$	$\displaystyle=\mathbf{z}_{0}+\int_{0}^{t}\mathbf{v}^{\mathbf{x}}(\mathbf{z}_{0},0)+\mathbf{a}^{\mathbf{x}}(\mathbf{z}_{t},t)\cdot tdt$		(23)
		$\displaystyle=\mathbf{z}_{0}+\int_{0}^{t}\mathbf{v}^{\mathbf{x}}(\mathbf{z}_{t},t)dt.$		(24)

Constant Acceleration Flow

Abstract

1 Introduction

2 Related work

Generative models.

Few-step diffusion models

3 Preliminary

Reflow and flow crossing.

4 Method

4.1 Constant Acceleration Flow

Learning initial velocity field.

Learning acceleration field.

4.2 Addressing flow crossing

Initial velocity conditioning (IVC).

Reflow for initial velocity.

4.3 Sampling

5 Experiment

5.1 Synthetic experiments

5.2 Real-data experiments

Distillation.

CIFAR-10.

ImageNet.

5.3 Analysis

Coupling preservation.

Flow straightness.

Inversion

Ablation study.

6 Conclusion

Acknowledgement

References

Appendix A Implementation details

Appendix B Additional results

B.1 Additional qualitative results

2D toy dataset.

Real-world dataset.

B.2 Real-world applications

B.3 Comparison with previous acceleration modeling literatures

Appendix C Marginal preserving property of Constant Acceleration Flow

Definition C.1.

Definition C.2.

Theorem 1.

Appendix D Limitation and Broader impacts

D.1 Limitations

D.2 Broader Impacts