
Further author information: (Send correspondence to Tsukasa Fukusato)
Zhengyu Huang: E-mail: [email protected]
Haoran Xie: E-mail: [email protected]
Tsukasa Fukusato: E-mail: [email protected], Telephone: +81-3-5841-4109

Interactive 3D Character Modeling from 2D Orthogonal Drawings with Annotations

Zhengyu Huang, Japan Advanced Institute of Science and Technology, Ishikawa, Japan; Haoran Xie, Japan Advanced Institute of Science and Technology, Ishikawa, Japan; Tsukasa Fukusato, The University of Tokyo, Tokyo, Japan
Abstract

We propose an interactive 3D character modeling approach from orthographic drawings (e.g., front and side views) based on 2D-space annotations. First, the system builds partial correspondences between the input drawings and generates a base mesh with sweeping splines according to edge information in the 2D images. Next, users annotate the desired parts on the input drawings (e.g., the eyes and mouth) using two types of strokes, called addition and erosion, and the system re-optimizes the shape of the base mesh. By repeating these 2D-space operations (i.e., revising and modifying the annotations), users can design a desired character model. To validate the efficiency and quality of our system, we compared the generated results with those of state-of-the-art methods.

Keywords:
Interactive modeling, sketch-based modeling, user interface

1 Introduction

In the animation and game industries, when modeling new 3D characters (or objects), artists first draw orthographic views of them. However, manually converting 2D drawings into 3D models is cumbersome and time-consuming, because 3D modeling with specialized tools (e.g., Maya and 3ds Max) requires professional knowledge, and those user interfaces are not as intuitive as 2D drawing.

Although several sketch-based modeling methods have been proposed for 3D content creation [1], representing the characteristics of character drawings remains a challenging issue: there is a gap between the generated results and professional 3D models. To solve this issue, we propose a user interface to easily and efficiently design such characteristics on 3D shapes with the help of 2D annotations. Leveraging orthogonal views, our system can faithfully reconstruct 3D models from drawings. The main contribution of this paper is a novel user-friendly workflow for designing 3D models from 2D drawings with sketch-like annotations, which eliminates the need for complex 3D operations.

Figure 1: Overview of the proposed system.

2 Related Work

Sketching is a form of artistic expression that is highly abstracted from the real world and has been used in various graphics applications, such as normal map editing [2], flow design [3], and portrait drawing [4]. As input for 3D modeling, however, 2D sketches are ambiguous in free-form drawing. To address this issue, Teddy [5] was proposed as one of the earliest free-form sketch modeling user interfaces. Once 3D viewpoints are determined by users, smooth surfaces are generated by interpolating the curves extracted from users' sketches. Several interpolation functions for sketch modeling [6, 7, 8] were proposed to improve the results. However, these approaches tend to produce over-smoothed surfaces. Some approaches, such as BendSketch [9], offer a solution for this issue.

Another popular approach to sketch-based modeling is data-driven learning. By analyzing a massive number of sketch-model pairs, methods of this type can generate an accurate 3D model from a user's simple sketch. Smirnov et al. [10] applied Coons patches to learn shape surfaces, but their method is limited to generating smooth shapes. Sketch2CAD [11] allows users to create objects incrementally with sketches, which are interpreted as CAD instructions by convolutional neural networks. SimpModeling provides a sketching system for animalmorphic head modeling, which can generate the details of a head from sketches by pixel-aligned implicit learning.

SketchModeling [12] is the work most relevant to ours; it also attempts to reconstruct character models from multi-view sketch images, though it is a fully automatic approach. With an encoder-decoder U-Net architecture, SketchModeling predicts depth maps and corresponding normal maps from the input sketches, optimizes a point cloud by merging these views, and obtains a complete 3D model. Inspired by structured annotations [13], a generalized-cylinder-based modeling method for a single view, we propose a user interface to easily and efficiently design characteristics on 3D shapes with fewer types of annotations by leveraging the orthogonal views.

3 User Interface

In this section, we describe how users interact with the proposed two-stage user interface (see Fig. 2) to model a character with annotations. The tool bar consists of four parts: (1) Annotation mode, including, from left to right, a local mode (adding alignment annotations and marking edges/background for the corresponding alignment annotation), adding addition annotations (B), and adding erosion annotations (E); (2) View mode, including the 2D front view (V1), the 2D side view (V2), the 3D view (V3D), and a selected-annotation-only mode; (3) Rendering mode for annotations, including drawing as segments, as a curve, or as the generated cylinder; (4) Other options, including relocating a selected annotation from V1 to V2 (or V2 to V1), and lock/unlock buttons that indicate whether the epipolar constraint from the other view is adopted as a reference when relocating.
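To make this grouping concrete, the sketch below organizes the modes as C++ enumerations. All names are ours for illustration; the paper does not publish its code, so this is an assumed structure, not the actual implementation.

    // Hypothetical mode definitions mirroring the four tool-bar groups.
    // Names are illustrative; the actual implementation may differ.
    enum class AnnotationMode { Alignment, EdgeMarking, Addition, Erosion };
    enum class ViewMode       { Front2D, Side2D, View3D, SelectedOnly };
    enum class RenderMode     { Segments, Curve, GeneratedCylinder };

    struct ToolbarState {
        AnnotationMode annotation = AnnotationMode::Alignment;
        ViewMode       view       = ViewMode::Front2D;
        RenderMode     render     = RenderMode::Segments;
        bool           epipolarLocked = true;  // lock/unlock: whether the other
                                               // view's epipolar line is used
                                               // as a reference when relocating
    };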

3.1 Annotation Tool

Since the input 2D orthogonal drawings often do not provide complete information for 3D modeling, our system allows the user to freely draw annotations that are not limited by edge information. The user can draw brief strokes in either the front view or the side view by inserting key points on the canvas with mouse clicks, and each stroke can be labelled as alignment, addition, or erosion. Then, the system automatically generates corresponding strokes in the other view with the same label. The user is allowed to edit strokes in editing mode to obtain correct 3D coordinates of the strokes in the 3D view. With the eraser tool, the user clicks on a stroke, and the system deletes the selected stroke. Moreover, the undo shortcut ("Z" key) deletes the last stroke from the stroke list. Note that our system can also load or export the user-drawn annotations by pressing the "L" (load) or "S" (save) key on the keyboard.
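A minimal C++ sketch of the stroke storage implied above (ordered key points, a label per stroke, and a stroke list supporting the "Z" undo); all type and field names are our assumptions:

    #include <vector>

    struct Point2D { float x, y; };

    enum class StrokeLabel { Alignment, Addition, Erosion };
    enum class ViewId { Front, Side };

    struct Stroke {
        StrokeLabel label;
        ViewId view;                      // view the stroke was drawn in
        std::vector<Point2D> keyPoints;   // ordered key points from mouse clicks
    };

    struct StrokeList {
        std::vector<Stroke> strokes;

        void add(const Stroke& s) { strokes.push_back(s); }

        // "Z" key: remove the most recently drawn stroke.
        void undo() { if (!strokes.empty()) strokes.pop_back(); }
    };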

Figure 2: User interface of the proposed system. A user models each part of the character by annotating both views and viewing the results in the 3D view.

3.2 Editing Tool

To facilitate repeated modification and to find the appropriate correspondence between annotations in the two views, the system provides an editing function. In 2D-view mode (V1 or V2), any vertex position of the corresponding annotation can be modified by selecting any visible curve. In editing mode, the system allows the user to generate epipolar constraints using the corresponding annotation in the other view, reducing the 2D editing of a vertex to 1D editing.
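Since the two orthogonal views share the y-axis (see Section 4.1), the epipolar line for an edited vertex is horizontal, so constrained editing can be as simple as pinning the y coordinate. A minimal sketch, assuming the Point2D type from the earlier snippet:

    // Minimal sketch of epipolar-constrained vertex editing (assumption:
    // orthogonal views sharing the y-axis, so the epipolar line is horizontal).
    Point2D applyEpipolarConstraint(Point2D edited, const Point2D& otherView) {
        // Only the x coordinate remains free (1D editing); y is pinned to the
        // y of the corresponding vertex in the other view.
        edited.y = otherView.y;
        return edited;
    }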

4 Overview

Figure 1 shows an overview of our sketch-based character modeling system. After a character design is input, users draw corresponding annotations one by one in both the front and side views under the epipolar constraint. Users can also add, by sketching, extra edge information that is invisible in the original images for each part of the model, with the help of our region-based boundary extraction. Once the user's annotations are completed, the corresponding lines from the alignment annotations (blue strokes) are extracted as hard constraints, and the lines marked with the other two annotation types are excluded, in order to revise the relationships between the two views of the sketches and to calculate more precise 3D point coordinates for the base mesh. After the base mesh is determined, point clouds are sampled from the edge information according to the addition annotations (orange strokes) and erosion annotations (green strokes) as constraints, and optimization-based surface fitting is conducted to generate a smooth surface.

4.1 Alignment-Based Global Modeling (Base-Mesh Generation)

Figure 3: Generalized cylinders. Candidate boundaries that would be projected into the 2D views are colorized in the right column. The three types of boundaries are mutually orthogonal.

In the first step, generalized cylinders are generated as a base mesh according to the corresponding alignment annotations in the two views. Each part of the character $P$ is modelled separately. Here, each annotation $A$ is stored as a series of $n$ ordered key points $\{\bm{p}_{0},\bm{p}_{1},\ldots,\bm{p}_{n-1}\}$, belonging to a single part and represented as a Hermite curve. In this system, alignment annotations are mainly used to represent the skeleton or center of gravity of the modeled parts and, in some cases, to correct the 3D position of a specific curve.
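For illustration, the following snippet evaluates a cubic Hermite segment between consecutive key points. The paper does not specify how tangents are chosen, so the finite-difference (Catmull-Rom-style) tangents below are an assumption; Point2D is the type from the earlier sketch.

    #include <vector>

    // Cubic Hermite evaluation between key points p[i] and p[i+1], u in [0,1].
    // Tangents are central differences (Catmull-Rom style) -- an assumption,
    // since the paper does not state its tangent rule.
    Point2D hermite(const std::vector<Point2D>& p, int i, float u) {
        auto tangent = [&](int k) -> Point2D {
            int a = (k > 0) ? k - 1 : k;
            int b = (k + 1 < (int)p.size()) ? k + 1 : k;
            return { (p[b].x - p[a].x) * 0.5f, (p[b].y - p[a].y) * 0.5f };
        };
        Point2D m0 = tangent(i), m1 = tangent(i + 1);
        // Standard Hermite basis functions.
        float h00 = 2*u*u*u - 3*u*u + 1, h10 = u*u*u - 2*u*u + u;
        float h01 = -2*u*u*u + 3*u*u,    h11 = u*u*u - u*u;
        return { h00*p[i].x + h10*m0.x + h01*p[i+1].x + h11*m1.x,
                 h00*p[i].y + h10*m0.y + h01*p[i+1].y + h11*m1.y };
    }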

Primitives. A typical generalized cylinder is shown in Fig. 3, consisting of a skeleton curve $\bm{s}(t)$ $(t\in[0,1])$ and a cross-section radial distance function $\bm{r}(t)$. Here, the different types of candidate boundaries that may be projected onto the two views are marked with different colors.

Camera model and epipolar constraint. In $\mathbb{R}^{3}$, the extrinsic parameters of a camera for 3D reconstruction can be denoted by a translation vector $\bm{T}$ and a $3\times 3$ rotation matrix $\bm{R}=[\bm{R}_{0},\bm{R}_{1},\bm{R}_{2}]^{T}$, and $f$ denotes the focal length of the camera. Given a set of 3D points $P$, the set of points projected onto the front view is $Q_{1}$, and the set of points projected onto the side view is $Q_{2}$. If $\bm{p}\in P$, $\bm{q}_{1}\in Q_{1}$, and $\bm{q}_{2}\in Q_{2}$ correspond to each other, and the camera parameters of the two views are $(\bm{R}_{1},\bm{T}_{1},f_{1})$ and $(\bm{R}_{2},\bm{T}_{2},f_{2})$, respectively, then $\bm{q}_{1}$ and $\bm{q}_{2}$ can be expressed with the following equations:

\left\{\begin{aligned}
\bm{q}_{1x} &= f_{1}\frac{\bm{R}_{10}\cdot\bm{p}+T_{1x}}{\bm{R}_{12}\cdot\bm{p}+T_{1z}}\\
\bm{q}_{1y} &= f_{1}\frac{\bm{R}_{11}\cdot\bm{p}+T_{1y}}{\bm{R}_{12}\cdot\bm{p}+T_{1z}}\\
\bm{q}_{2x} &= f_{2}\frac{\bm{R}_{20}\cdot\bm{p}+T_{2x}}{\bm{R}_{22}\cdot\bm{p}+T_{2z}}\\
\bm{q}_{2y} &= f_{2}\frac{\bm{R}_{21}\cdot\bm{p}+T_{2y}}{\bm{R}_{22}\cdot\bm{p}+T_{2z}}
\end{aligned}\right. \qquad (1)

Since the two viewing directions are orthogonal to each other and to the y-axis, $\bm{q}_{1}$ and $\bm{q}_{2}$ lie exactly on orthogonal planes. We also have $f_{1}=f_{2}$, $T_{1z}=T_{2z}=0$, and $T_{1y}=T_{2y}$. Then, the epipolar constraint derived from Equation (1) can be simplified as:

\bm{q}_{1y}=\bm{q}_{2y} \qquad (2)

Thus, the corresponding 3D positions can be calculated correctly. For instance, suppose a point $\bm{p}$ has coordinates $(x_{1},y_{1})$ in the front view and $(x_{2},y_{2})$ in the side view. Since $y_{1}=y_{2}$ by the epipolar constraint, its 3D coordinates are $(x_{1},y_{1},-x_{2})$. Although we use only this special case, the epipolar constraint can be naturally extended to reduce the workload of multi-view alignment.
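In code, this special case reduces triangulation to a coordinate shuffle. The sketch below is our own illustration (averaging a small y-discrepancy between hand-drawn views is our choice, not a rule stated in the paper); Point2D is the type from the earlier sketch.

    struct Point3D { float x, y, z; };

    // Reconstruct a 3D point from corresponding front-view and side-view
    // points under the orthogonal-view epipolar constraint (Equation 2).
    Point3D triangulate(Point2D front, Point2D side) {
        // Ideally front.y == side.y; averaging tolerates small drawing error
        // (our assumption, not specified in the paper).
        float y = 0.5f * (front.y + side.y);
        return { front.x, y, -side.x };
    }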

Base mesh generation. Our system uses the edge information of the image and the relative positions of the alignment annotations to automatically calculate the preliminary boundary of the 2D generalized cylinder in each view. For any point on the skeleton curve $\bm{s}(t)$ of an alignment annotation at $t=t^{\prime}$, the nearest intersection points, in two directions, between its perpendicular line and the edges can be found as an initial boundary $\bm{b}(t)$ of the cross-section $\bm{r}(t)$. If $E$ denotes the set of world coordinates converted from the edge pixels in the input images and the function $D(E,\bm{b}(t))$ denotes the distance between $E$ and $\bm{b}(t)$, the formula can be described as follows:

\left\{\begin{aligned}
\bm{b}(t) &= \mathop{\mathrm{arg\,min}}\, D(E,\bm{b}(t))\\
D(E,\bm{b}(t)) &= \min_{\bm{e}\in E}\left\{\left\|\bm{b}(t)+\bm{s}(t)-\bm{e}\right\|_{2}+\left\|\bm{b}(t)\right\|_{2}\right\}
\end{aligned}\right. \qquad (3)

Note that the mesh generated by this method tends to fail for edges that are close to parallel. The main reason is that the mesh must be discretized when it is generated, so the parameter $t$ of $\bm{s}(t)$ is sampled at a fixed interval. This weakness is overcome in the following local refinement step.
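A direct, brute-force transcription of Equation (3) restricts the candidate boundary point to the perpendicular line at $\bm{s}(t^{\prime})$ and scans the edge set; the discrete step and search radius below are our assumptions, and Point2D is the type from the earlier sketch.

    #include <algorithm>
    #include <cmath>
    #include <limits>
    #include <vector>

    // Brute-force search for the initial boundary offset b(t') = d * n along
    // a unit perpendicular direction n of the skeleton at s(t') (Equation 3).
    float findBoundaryOffset(Point2D s, Point2D n,               // n: unit perpendicular
                             const std::vector<Point2D>& edges,  // edge pixels (world coords)
                             float dMax, float step = 0.5f) {
        float bestD = 0.0f, bestCost = std::numeric_limits<float>::max();
        for (float d = step; d <= dMax; d += step) {
            Point2D b{ s.x + d * n.x, s.y + d * n.y };           // candidate boundary point
            float nearest = std::numeric_limits<float>::max();
            for (const Point2D& e : edges) {                     // distance to closest edge
                float dx = b.x - e.x, dy = b.y - e.y;
                nearest = std::min(nearest, std::sqrt(dx*dx + dy*dy));
            }
            float cost = nearest + d;                            // D(E, b) in Equation (3)
            if (cost < bestCost) { bestCost = cost; bestD = d; }
        }
        return bestD;                                            // offset along n
    }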

4.2 Local Refinement with Annotation Constraints in Two Orthogonal Views

The next step is refining the base mesh with optimization. Here, we introduce addition annotations and erosion annotations, as shown in Fig. 1, to realize this objective: addition annotations $A_{a,t}$ define boundaries for cross-sections along a skeleton curve, while erosion annotations $A_{e,t}$ modify the shapes of the end-caps of generalized cylinders. Note that both annotations take effect only when they are attached to an alignment annotation specified by the user.

These annotations are converted to $\bm{b}(t)_{k}$, denoting a boundary function of type $k$ ($k\in K$). $K$ is the set of boundary types; in our case, $K=\{0,1,2\}$, denoting the blue curve (cross-section contour), the orange curve, and the green curve in the right column of Fig. 3, respectively. All reconstructed parts of the character should minimize the errors between the visible contours and the annotation constraints when projected back to the 2D views. The objective function $F(E,A)$ in this step can be summarized as:

\left\{\begin{aligned}
B &= \mathop{\mathrm{arg\,min}}\, F(E,A)\\
F(E,A) &= \sum_{\bm{b}(t)\in A,\ k\in K} D(E,\bm{b}(t)_{k})
\end{aligned}\right. \qquad (4)

where $B$ is the set of boundaries extracted from the annotations; the edges $E$ and the function $D(E,\bm{b}(t))$ are the same as those described in Equation (3). Once $B$ is determined, the base mesh from the global step is refined as follows:

Cross-section modeling perpendicular to the skeleton direction of the generalized cylinder. When $\bm{b}(t)_{0}\notin B$, there are at most four constraint points for each cross-section. In this case, the system fits an ellipse by regarding these constraint points as its poles. In particular, a single constraint point in a cross-section means that the cross-section is a circle.
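As an illustration of this pole-based fit, the sketch below takes the constraint points expressed in a local cross-section plane (axes $u$ and $v$; this parameterization is our assumption, not the paper's) and derives the two semi-axes, degenerating to a circle when only one constraint point exists; Point2D is the type from the earlier sketch.

    #include <algorithm>
    #include <cmath>
    #include <vector>

    // Fit an axis-aligned ellipse in the local cross-section plane (u, v)
    // from up to four constraint points treated as poles (Section 4.2).
    struct Ellipse { float ru, rv; };  // semi-axes along u and v

    Ellipse fitCrossSection(const std::vector<Point2D>& poles) {
        float ru = 0.0f, rv = 0.0f;
        for (const Point2D& p : poles) {
            ru = std::max(ru, std::fabs(p.x));   // extent along u
            rv = std::max(rv, std::fabs(p.y));   // extent along v
        }
        if (poles.size() == 1) {                 // one constraint point: a circle
            float r = std::max(ru, rv);
            return { r, r };
        }
        return { ru, rv };
    }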

If there exists $t=t^{\prime}$ such that $\bm{b}(t^{\prime})_{0}\in B$, the cross-sections between these addition-annotation boundaries are calculated with cubic B-spline interpolation. Once the cross-sections along the skeleton direction of the generalized cylinder have been calculated, the side surface of the generalized cylinder matching the input and user-defined boundaries can be generated.

End-caps of the generalized cylinder. If there is an erosion annotation for a generalized cylinder, the end-cap surfaces are deformed with Laplacian-based editing [14] according to the annotation's shape. Otherwise, the surfaces at the ends of the generalized cylinder are planar.

The main idea of this step is to use the two kinds of annotations to obtain constraints for a generalized cylinder. Therefore, this step not only performs boundary refinement but also improves the effectiveness of boundary classification, which in turn improves modeling efficiency and accuracy.

5 Results and Discussion

Figure 4: The models created in the comparison study with SketchModeling [12] and our proposed system. (a) Input images. (b) SketchModeling results (251 mins and 221 mins). (c) Our results (8 parts in 6 mins and 6 parts in 8 mins).

In our implementation, the system was programmed in C++ as a real-time drawing application on the Windows 10 platform. A workstation with an Intel Core i5-8400 (2.80 GHz), an NVIDIA RTX 2070 GPU, and 16 GB of RAM was used as the testing environment. Figure 4 shows 3D modeling results of our system compared with SketchModeling [12]. Even after hours of computation, neither the final point cloud nor the mesh of the first-row character matches the input sketch well, which means this learning-based method failed to predict the positions and depth maps of some parts of the characters. In contrast, the models generated with our system were more faithful to the original design drawings and took less time, by modeling several independent parts.

Unlike template-based methods, the proposed system enables users to make character models without careful parameter tuning. By comparing our results with those of state-of-the-art methods, we verified that the proposed system can improve the quality of 3D character models with simpler but more intuitive operations. The current system focuses mainly on the 3D shape reconstruction process, so texture mapping and complex structure modeling are good topics for future research.

Acknowledgements.
This research was supported by the Kayamori Foundation of Informational Science Advancement, JSPS KAKENHI JP20K19845, and JP19K20316.

References

  • [1] Bhattacharjee, S. and Chaudhuri, P., "A survey on sketch based content creation: from the desktop to virtual and augmented reality," Computer Graphics Forum 39(2), 757–780 (2020).
  • [2] He, Y., Xie, H., Zhang, C., Yang, X., and Miyata, K., "Sketch-based normal map generation with geometric sampling," in [International Workshop on Advanced Imaging Technology (IWAIT) 2021], 11766, 261–266, International Society for Optics and Photonics, SPIE (2021).
  • [3] Hu, Z., Xie, H., Fukusato, T., Sato, T., and Igarashi, T., "Sketch2VF: Sketch-based flow design with conditional generative adversarial network," Computer Animation and Virtual Worlds 30(3-4), e1889 (2019).
  • [4] Huang, Z., Peng, Y., Hibino, T., Zhao, C., Xie, H., Fukusato, T., and Miyata, K., "dualFace: Two-stage drawing guidance for freehand portrait sketching," Computational Visual Media 8, 63–77 (2022).
  • [5] Igarashi, T., Matsuoka, S., and Tanaka, H., "Teddy: A sketching interface for 3D freeform design," in [Proceedings of the 26th Annual Conference on Computer Graphics and Interactive Techniques (SIGGRAPH)], 409–416, ACM (1999).
  • [6] Karpenko, O. A. and Hughes, J. F., "Implementation details of SmoothSketch: 3D free-form shapes from complex sketches," in [ACM SIGGRAPH 2006 Sketches], 51–es, ACM (2006).
  • [7] Joshi, P. and Carr, N. A., "Repoussé: Automatic inflation of 2D artwork," in [Proceedings of the Eurographics Workshop on Sketch-Based Interfaces and Modeling (SBIM)], 49–55, Eurographics Association (2008).
  • [8] Bernhardt, A., Pihuit, A., Cani, M., and Barthe, L., "Matisse: Painting 2D regions for modeling free-form shapes," in [Proceedings of the Fifth Eurographics Conference on Sketch-Based Interfaces and Modeling (SBIM)], 57–64, Eurographics Association (2008).
  • [9] Li, C., Pan, H., Liu, Y., Tong, X., Sheffer, A., and Wang, W., "BendSketch: Modeling freeform surfaces through 2D sketching," ACM Transactions on Graphics 36(4), 125:1–125:14 (2017).
  • [10] Smirnov, D., Bessmeltsev, M., and Solomon, J., "Learning manifold patch-based representations of man-made shapes," in [International Conference on Learning Representations (ICLR)], 1–24 (2021).
  • [11] Li, C., Pan, H., Bousseau, A., and Mitra, N. J., "Sketch2CAD: Sequential CAD modeling by sketching in context," ACM Transactions on Graphics 39(6), 164:1–164:14 (2020).
  • [12] Lun, Z., Gadelha, M., Kalogerakis, E., Maji, S., and Wang, R., "3D shape reconstruction from sketches via multi-view convolutional networks," in [Proceedings of the International Conference on 3D Vision (3DV)], 67–77, IEEE (2017).
  • [13] Gingold, Y. I., Igarashi, T., and Zorin, D., "Structured annotations for 2D-to-3D modeling," ACM Transactions on Graphics 28(5), 148 (2009).
  • [14] Au, O. K., Tai, C., Liu, L., and Fu, H., "Dual Laplacian editing for meshes," IEEE Transactions on Visualization and Computer Graphics 12(3), 386–395 (2006).