
Textured As-Is BIM via GIS-informed Point Cloud Segmentation

Mohamed Said Helmy Alabassy

*Affiliation: Bauhaus University Weimar, Faculty of Civil and Environmental Engineering, Chair of Construction Chemistry and Polymer Materials

Email: [email protected]

This manuscript is an extended version of the following conference paper: Krischler, Judith; Alabassy, Mohamed Said Helmy; Koch, Christian (2023). BIM Integration for Automated Identification of Relevant Geo-Context Information via Point Cloud Segmentation. 30th International Workshop of the European Group for Intelligent Computing in Engineering (EG-ICE). Date: 04.07. - 07.07.2023, London, UK.

Abstract

Creating as-is models from scratch is to this day still a time- and money-consuming task due to its high manual effort. Projects, especially those with a large spatial extent, could therefore profit from automating the creation of semantically rich 3D geometries from surveying data such as Point Cloud Data (PCD). Automation can be achieved using machine and deep learning models for object recognition and semantic segmentation of PCD. As PCD does not usually include more than the mere position and RGB colour values of points, tapping into semantically enriched Geographic Information System (GIS) data can enhance the process of creating meaningful as-is models. This paper presents a methodology, an implementation framework and a proof of concept for the automated generation of GIS-informed and BIM-ready as-is Building Information Models (BIM) for railway projects. The results show a high potential for cost savings and reveal the untapped resources of freely accessible GIS data.

Keywords Semantic Segmentation; Point Cloud Segmentation; As-Is Model; GIS-informed BIM; Scan2BIM; Railway Alignment; IFC

Highlights

  • Recreating horizontal and vertical alignments automatically from semantically segmented railway trackbeds in a GIS-compatible data format (i.e., GeoJSON).

  • Developing a pipeline to automatically generate UV-mapped, coloured, textured meshes of irregular 3D shapes from vertex-coloured point clouds, with an embedded texture atlas, using IFC.

  • Showcasing the potential for cost savings in early planning phases of railway projects by relying on free and open GIS data. Automating the creation of alignments in IFC, combined with access to geographic databases within BIM, could provide further insights relevant to railway planning activities by querying classes or identifying multi- or bi-temporal differences in the vicinity of railway lines.

  • Demonstrating the technical feasibility of progressive UV-mapped textured meshes compatible with the IFC data format and visualising their information correctly within currently available free and open-source BIM software. This should reduce the loss of semantic information exchanged between CityGML 3 and IFC and benefit other BIM-related applications, such as information modelling of damages.

1 Introduction

1.1 Motivation

Due to the ailing state of the German railways, an increased focus has recently been directed towards digitisation in order to maintain, upgrade and expand the existing railway network and prepare it for the future [1, 2]. Digitising the existing network is a very costly and time-consuming endeavour, especially in Germany, where over 33,000 km of railway exist, with almost 2,000 km of railway tracks, 200 bridges and 18,000 points and crossings renovated and upgraded in 2023 alone [3]. The problem is even more pronounced in the early stages of larger construction projects, which depend on status-quo data (e.g., plans, drawings, or surveys) that is often incomplete, exists in non-digital formats or is missing altogether [4]. Nonetheless, this data still needs to be processed for integration into inter-compatible working methods based on Building Information Modelling (BIM) [5, 6, 7].

The time frame of such processes from a project's initiation to the final handover often spans decades, as evident from the German Reunification Railway Transport Projects number 8 and 9 for the expansion and new construction of the routes Nuremberg–Erfurt–Halle/Leipzig–Berlin and Leipzig–Dresden, respectively [8]. Throughout this period, the acquisition of up-to-date surveying data of the location is crucial to ensure correct planning and informed decision-making. This holds especially for the very inceptive phase of a project, when there are often little to no allocated funds to support the decisions that need to be made, which in turn vary with shifting requirements, demand, possible project variants, extents, costs, regulations, permits, etc. [9].

A patchy data basis, however incomplete it might be, could be improved upon with registered surveying patches for regions of interest using commercial aerial Light Detection and Ranging (LiDAR) scanning services, photorealistic 3D tiles from the latest Google Maps 3D immersive view [10] or similar alternatives. Yet such surveying services are often very expensive, not always available, difficult to acquire at remote locations and not guaranteed to stay up to date as the status quo on the ground changes. A more practical and less costly approach would therefore draw on the openly accessible geographic and location information provided free of charge by online web mapping platforms and Geographic Information Systems (GIS). The open geodata published by official surveying authorities could be an economical alternative source of a relatively up-to-date data basis that can be processed and utilised to generate as-is BIM models.

The latest strides in the research field of intelligent recognition of Point Cloud Data (PCD) have produced state-of-the-art systems for autonomous driving, navigation, flight planning for drones, and segmentation of aerial scans, which could be utilised for the semantic segmentation of point clouds generated from available geodata. This would help classify objects of interest that are unlabelled in the raw data or difficult to attain otherwise, and derive further semantic information that assists in identifying interdependencies and relations between labelled classes and project requirements defined by official regulations, public authorities, planners and other concerned stakeholders.

Furthermore, the variety of data and information formats, compounded by each actor being limited to incompatible software and hardware, emphasises the benefits of commonly adopting freely and widely available open data formats that are information-lossless, convertible and inter-compatible with the technical hardware and software resources of all involved stakeholders. The early phase always requires extensive data exchange between various stakeholders, which highlights the importance of adopting big open BIM to facilitate interoperability and reduce the incompatibility of the different data formats used separately by each party involved.

Reconstructing the status quo from freely available PCD (i.e., Scan2BIM) into relatively accurate, BIM-ready as-is railway models suffices to propel information-based decision-making forward in early planning phases immensely at minimum cost. A hybrid approach that incorporates not only PCD but also geospatial and a mix of other freely and publicly accessible data may help overcome common obstacles arising from an insufficient data basis.

1.2 Research Question

Hence, this study considers the applicability of state-of-the-art methods available today to find practical solutions to the main question of how to optimally combine the wealth of information available in open geodata, including the German official topographic and property cadastre information systems (i.e., photos, point clouds, property and land-use shapes and attributes), through semantic segmentation of point clouds. The focus lies on the railway infrastructure domain, with BIM models in the open IFC format, which paves the way for a very economical workflow to reconstruct an as-is information model well suited for BIM-based planning in early project phases. The study focuses on transferring knowledge attained from semantic PCD segmentation, via surface-reconstructed geometry and derived semantics and properties, into big open BIM. We hypothesise that the inclusion of GIS data may help overcome technical and financial limitations of an insufficient data basis. This research further expands on previous work by the authors [11]. For this purpose, we rely solely on freely available data, use only open-source tools and convert our results into the open BIM IFC data format.

1.3 Main Contributions

A modified version of the deep neural network (DNN) architecture 2DPASS has been used to train a model capable of semantically segmenting buildings, vegetation, water bodies, roads, and railway trackbeds. In the training phase, the used architecture relies both on 3D features from a dataset of processed and semi-automatically annotated LiDAR scans and on 2D features from coloured orthophotos. The original dataset is acquired from freely available LiDAR scans, digital orthophotos and ATKIS shape files from the geoportal websites of the Free States of Thuringia and Saxony. The authors of this study attempt to make the most use of free and open data sources, software, and information formats for pre- and post-processing of point cloud data, semantic segmentation, and post-processing of the inferred segmentation results with 3D reconstruction and modelling of railway-related contexts.

The contribution of this work links to formerly published literature about Scan2BIM approaches in railway [12, 13, 14, 15, 16]. The proof of concept for our proposals is demonstrated on two case studies. The first case study demonstrates the potential for cost savings in early planning phases of railway projects by estimating horizontal and vertical alignments from the inferred semantic segmentation of the predicted railway trackbeds in a GIS-compatible data format (i.e., GeoJSON), which could later be used as input for automatically generating an IfcAlignment or querying multi- or bi-temporal differences along a buffer zone of the alignment.

The second case study demonstrates the texturing potential of the resulting IFC files. It builds on former studies by [17, 18, 19] and expands them further, as no attempt has been found in the body of literature so far to generate UV-mapped textured meshes of irregular 3D shapes in IFC automatically; the aforementioned studies successfully embedded a texture image onto a 2D flat surface in 3D space with manual UV-mapping.

2 Related Works

2.1 Point Cloud Segmentation

Point cloud acquisition has become more ubiquitous due to advancements in latest depth and light capturing sensors, that could be mass-produced within a wide range of price, quality and application spanning from LiDAR scans in engineering surveying to those fitted on drones or the miniaturised variants in smartphones and extended reality (XR) gadgets. This wide adoption has driven progress on semantic segmentation of point clouds to develop highly sophisticated Deep Learning (DL) models able to classify large amounts of points automatically on graphical and computer processing units for a wide range of applications ranging from early phase planning of projects, scene understanding for autonomous driving to construction monitoring and inspection, XR applications and beyond.

Texture-utilising methods for point cloud segmentation comprise several distinct approaches. One focuses on fusing representations from points, voxels, and/or projection images within different branches of the network's architecture. Tang et al. combined point-wise MLPs in each sparse convolution block to learn a point-voxel representation [20] and relied on Neural Architecture Search to reach an optimally efficient design. Xu et al. proposed a range-point-voxel fusion network (i.e., RPVNet) to utilise information from all three representations [21]. However, texture features and semantics from photos were underutilised or discarded altogether by focusing only on colourless LiDAR point clouds.

Another approach fuses inputs from multiple sensors, leveraging benefits of both camera and LiDAR [22, 23, 24]. El Madawy et al. converted RGB images to a polar-grid mapping representation and designed early and mid-level fusion architectures [22]. PointPainting exploited the segmentation logits of images [25] and projected them into LiDAR space by spherical [26] or bird's-eye projection [27] to improve LiDAR network performance, whereas Zhuang et al. exploited a collaborative fusion of the two modalities in camera coordinates [28]. Yet paired multi-modality data requires multi-sensor inputs in both training and inference, which is more computationally intensive and often unavailable in practical applications.

A third approach involves knowledge distillation, a technique used to transfer knowledge learnt by a large teacher network into a compressed, smaller student [29]. Further improvements to knowledge transfer were achieved using different methods of matching feature representations [30, 31]. For instance, aligning attention maps [32] and Jacobian matrices [33] were independently applied. This technique has been applied successfully to transfer priors across different modalities by using additional images in the training phase and to improve performance at inference [34, 35, 36, 37, 38], for instance by inflating kernels of 2D convolutions into 3D [21], introducing a 2D pixel-to-point assisted pre-training [39] or utilising a teacher-student framework [40]. Furthermore, 2DPASS leveraged auxiliary modal fusion to transfer 2D knowledge through multi-scale fusion-to-single knowledge distillation, which took care of the modal-specific knowledge and demonstrated generality, flexibility and effectiveness when compared to other fusion-based methods [41].

2.2 Scan2BIM in Railway Infrastructure

The so-called Scan2BIM approach denotes the (automated) acquisition of PCD and the subsequent creation of BIM models from it [42]. Many works deal with the topic of applying Scan2BIM to railway infrastructure. This subsection presents an overview of the related research.

Ariyachandra et al. [13] described a method for creating large-scale geometric information models (GIM) of rail infrastructure using automated segmentation of point cloud data (PCD). The geometry of the segmented rails and track bed was then reconstructed, resulting in IFC files of the respective objects. The authors distinguished between a distinct segmentation of rails and of track beds. In their earlier publications [43], they focused on the identification of railway masts of double-track railways from LiDAR PCD by taking railway design rules strongly into account and thereby renouncing neighbourhood structures, scanning geometry and the intensity of input data. After removing the adjacent vegetation, a track corridor remains from which the masts, which are parallel to the global Z-axis, were then extracted. The masts were then differentiated from other pole-like objects by defining inner and outer boxes around the masts to identify outliers. The proposed method reached high detection and precision rates and eventually included the 3D mast models in IFC.

Yang [44] extracted rail beds and rails from mobile laser scanning (MLS) point clouds. They were able to detect different rail segments (where one alignment ends and the next begins, e.g., at points) and reached an overall detection accuracy of more than 95% in completeness and quality and 99.78% in correctness. The applied method used cross-sections and so-called “scanning lines” to identify the ballast bed of the rail; with the help of the specific geometric angle of the ballast bed, it was able to identify the course of the railway and to identify objects that did not belong to the railway itself. To avoid scanning the whole point cloud, only the areas of interest were segmented in the point cloud. In order to quantify the accuracy of the result, rail lines were manually digitised as lines and compared, with a buffer zone of 15 cm, to the railway lines extracted from the point cloud.

Cheng et al. [45] used coloured point clouds (range accuracy ±2 mm) in order to first segment them into (single-track) railway bed and remaining points. The tunnel cross-sections were extracted and their parameters derived. Ten different classes were identified, including rail head and railway bed. To estimate the horizontal and vertical alignments of the railway, a method of 3D local cylindrical neighbourhoods was used. The presented method for classifying the point cloud reached close to 100% accuracy. The authors managed to create a parametric BIM model from the derived objects, yet it was not converted into an open product data model.

Eickeler and Borrmann [14] presented a three-stage concept for the creation of a railway-specific dataset that allows a higher grade of automation in railway asset detection. The paper focused on videos and images, not on PCD derived from LiDAR scans. The three stages consisted of 1. Simple Class Annotation (SCA), 2. classification by domain knowledge and 3. creation of a full asset model (optimisation).

Cserép et al. [15] used coloured LiDAR PCD from a vehicle-mounted scanner (mobile mapping system; 60 km/h). The PCD was fragmented and then used for cable and rail recognition. Contour detection was used to remove vegetation within the PCD along the railway tracks, and the railway was identified using contour finding and the Hough transformation. The paper compared three different algorithms for detecting the railways. First, the trackbed was detected taking into account the point density and the most common height in a defined area. Second, the PCD was reduced to a 2D digital elevation model (image) and the 2D Hough transformation was used to identify rail pairs from the resulting 2D projected PCD. With this methodology of creating as-is models, it was hard to detect cables placed underneath each other, but the standardised track gauge could be taken into account. The 3D line of the railway was detected using the Hough transformation. Third, a region-growing algorithm along the trajectory of the identified cables was used, with the trajectory of the power cables calculated with the Random Sample Consensus (RANSAC) algorithm. Comparing all three approaches, the region-growing algorithm showed the best accuracy for the cable objects and operated the fastest.

The work of Soilán et al. [46] included an automated approach to derive a precise railway alignment (centreline between two tracks) from LiDAR scans in order to create IFC alignment objects from it. The authors used LAS files (without colour) created by ground-vehicle-mounted LiDAR sensors (2,000 points per square metre) to extract railway alignments and expressed them as IFC files (the xBIM toolkit for IFC 4x1 was used). The point cloud was segmented in the direction of the trajectory using a distance filter in order to minimise the processing effort, and an intensity-height filter was applied to remove non-rail points. The extracted railway line was stored as polylines, and the centreline between the rails was used to create the IfcAlignment instance forming an IFC file. The validation was carried out by comparing the generated alignment with the reference points used for the construction.

Reconstructing 3D railway geometries from surveying data often depends on highly detailed point clouds, which can even serve as a basis for the extraction of overhead cables. All of the presented works required highly detailed point clouds and often created their datasets for the sole purpose of reconstructing highly detailed geometries. Some of the reviewed works eventually converted the 3D reconstructions to the open IFC data model. None of the reviewed works used GIS data to support the semantic segmentation process.

2.3 Texturing and Semantic Enrichment

Hüthwohl et al. successfully demonstrated the applicability of mapping a 2D texture to a 2D plane for a beam face of a 3D IFC model [17], using an image as a 2D texture in 3D space. This was done in the context of classifying damages in Building Information Modelling (BIM), exploring the suitability of the standardised and open IFC format to include inspection and damage information modelling for RC bridges within an open bridge management system [47, 18]. A prototype software was developed for that purpose on top of the “Gygax construction IT research platform for 2D & 3D” [48], which relied on the IFCEngine DLL, the Helix Toolkit and SharpDX to let the shader render the texture, linked to an image, as a UV-mapped (i.e., S, T in IFC) triangulated mesh consisting of a rectangle split into two triangles.

The authors highlighted several disadvantages of this approach: problems with estimating the correct location and orientation of the image plane in the IFC model, distortions resulting from taking close-up photos, and the increase in data size caused by embedding the texture into the IFC model. There is furthermore a risk of corrupting the IFC file during data exchange when an external Uniform Resource Identifier (URI) of the image is embedded in the IFC model without sharing the attached texture, or when the texture image is moved to a different location.

Several pilot attempts at formalising compatible texture generation and integrating it within the entities of the IFC schema have been discussed in grey literature and have shown the potential of BlenderBIM's capability to handle texture maps [49]. However, no attempt has been found in the body of literature so far to generate UV-mapped textured meshes of irregular 3D shapes in IFC automatically, mainly due to the complex set of challenges involved, such as aliasing problems arising from inadequate sampling of all colour frequency values for texels through interpolation of texture attributes projected onto an intermediary, arbitrarily parameterised surface. Mipmapping based on Level of Development (LoD) requirements could be a solution in this case, yet it would require generating at least three texture maps of different resolutions based on the required triangle area of the triangulated faces defined for each LoD of 3 and upwards.
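As a minimal sketch of this mipmapping step, the function below generates progressively halved texture levels with a plain 2×2 box filter in NumPy; the function name, the box filter and the three-level default are illustrative assumptions, not prescribed by the IFC schema.

```python
import numpy as np

def build_mipmaps(texture: np.ndarray, levels: int = 3) -> list:
    """Return `levels` progressively halved copies of a texture.

    Each level averages 2x2 texel blocks (box filter), a simple guard
    against aliasing when a lower-resolution map suffices for a LoD.
    Assumes the input side lengths are divisible by 2**(levels - 1).
    """
    maps = [texture.astype(np.float64)]
    for _ in range(levels - 1):
        t = maps[-1]
        h, w = t.shape[0] // 2, t.shape[1] // 2
        # Group texels into 2x2 blocks and average per channel.
        t = t.reshape(h, 2, w, 2, -1).mean(axis=(1, 3))
        maps.append(t)
    return [m.astype(texture.dtype) for m in maps]
```

A mesh exporter could then pick the level whose resolution matches the triangle area required for the target LoD.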

2.4 Research Objectives

  • Examining the potential for cost savings by using freely available GIS data in early planning phases of railway projects, by estimating horizontal and vertical alignments from the inferred semantic segmentation of the predicted railway trackbeds in a GIS-compatible data format (i.e., GeoJSON) that could later be used as input for automatically generating an IfcAlignment or querying multi- or bi-temporal differences along a buffer zone of the alignment.

  • Examining the feasibility and compatibility of progressive meshing and surface mapping from vertex-coloured point clouds to generate UV-mapped textured meshes of complex irregular 3D shapes in IFC automatically.

3 Methods

This section describes a methodology of using GIS-informed point cloud segmentation for an automated attempt of creating as-is models for early planning phases. Openly accessible Point Cloud Data (i.e., LiDAR), coloured orthophotos and 2D GIS data serve as a basis of the dataset for training the point cloud segmentation model.

Figure 1: Simplified methodology of textured as-is modelling from GIS-informed semantic segmentation of PCDs.

3.1 Input Data

Usually, LiDAR PCD consists of the x, y, z coordinates of points, CRS projection information and little to no semantic classification. Often, there are also no colours such as RGB values assigned to the points. Furthermore, PCD is often documented in tabular or textual formats, such as .csv, .xyz or the widely used open format LAS, which may include some semantic classification depending on the age and technical specifications of the scanning device and its support for the latest LAS format version. The targeted PCD converted from a LiDAR scan in this study usually comes from surveying flights and is, unlike data from other kinds of acquisition, recorded from above, which involves specific limitations that need to be addressed and will be discussed in section 4.

3.2 Preprocessing and Semi-automated Annotation

In the first step, the orthophotos are used to colour the PCD in the case of missing RGB values of the respective points. Orthophotos usually result from surveying flights and are typically provided as georeferenced, grid-based image data, such as (Geo)TIFF.
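A minimal sketch of this colourisation step is given below, assuming a north-up affine geotransform without rotation terms; a real pipeline would read the raster and point cloud with libraries such as rasterio and laspy, and all names here are illustrative.

```python
import numpy as np

def colourise_points(xyz, image, origin, pixel_size):
    """Sample RGB values from a georeferenced orthophoto onto points.

    xyz        : (N, 3) point coordinates in the image's CRS
    image      : (H, W, 3) orthophoto, row 0 = northernmost row
    origin     : (x_min, y_max) of the top-left image corner
    pixel_size : ground sampling distance in CRS units
    """
    # World coordinates -> pixel indices via the inverse geotransform.
    cols = ((xyz[:, 0] - origin[0]) / pixel_size).astype(int)
    rows = ((origin[1] - xyz[:, 1]) / pixel_size).astype(int)
    # Clamp to the image extent so edge points still receive a colour.
    rows = np.clip(rows, 0, image.shape[0] - 1)
    cols = np.clip(cols, 0, image.shape[1] - 1)
    return image[rows, cols]
```

The returned per-point colours can be written back into the RGB fields of the LAS file.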

The GIS data, together with its classification of GIS features, can be derived from querying GIS databases such as OpenStreetMap or from GIS data provided by public institutions. It is important to identify the necessary label and later object classes in advance in order to decide which object classes should be represented in the subsequent as-is model. The used GIS data usually consists of 2D polygons, sometimes containing attributed information about the height of the real object, for example in the case of buildings. The labelling data that comes with the 2D GIS features, such as the classification into “road”, “vegetation area”, “railways”, etc., is used in a next step to create annotation masks. Especially when using PCD originating from aerial scans, points with a higher z-value can cover other points of a different class, such as treetops covering a street when viewed from above. The GIS data helps to form coherently connected segmentation areas, while nearest-neighbour algorithms are included to avoid false labelling. More information regarding this can be found in section 4.
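The mask-based labelling with a nearest-neighbour fallback can be sketched as follows; the ray-casting containment test and the centroid-distance fallback are illustrative simplifications of the actual GIS operations, and all names are hypothetical.

```python
import numpy as np

def point_in_polygon(pt, poly):
    """Even-odd ray casting test for a single 2D point and polygon."""
    x, y = pt
    inside = False
    n = len(poly)
    for i in range(n):
        (x1, y1), (x2, y2) = poly[i], poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal ray
            if x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
                inside = not inside
    return inside

def label_points(points_xy, polygons, labels):
    """Assign each 2D point the label of the polygon containing it;
    points outside every polygon fall back to the label of the nearest
    polygon centroid, mimicking the nearest-neighbour fallback."""
    centroids = np.array([np.mean(p, axis=0) for p in polygons])
    out = []
    for pt in points_xy:
        hit = next((labels[i] for i, poly in enumerate(polygons)
                    if point_in_polygon(pt, poly)), None)
        if hit is None:
            hit = labels[int(np.argmin(np.linalg.norm(centroids - pt, axis=1)))]
        out.append(hit)
    return out
```

In practice, vectorised spatial joins (e.g. via GeoPandas) replace this per-point loop.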

3.3 Point Cloud Segmentation

The annotation masks can now be used in an 80:20 training-to-validation dataset split within the Deep Learning (DL) model, namely 2DPASS [41]. The trained model can then be applied via inference to selected case study data that is formerly unknown to the model. The outcome is an inferred segmentation based on a thresholded probability map. The resulting segmentation can be segregated into preassigned classes to reduce processing time and resources, and the respective point clouds can then be treated individually.
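The dataset split and the thresholded probability map can be sketched as follows; the function names, the fixed seed and the 0.5 threshold are illustrative assumptions rather than values taken from the 2DPASS implementation.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def split_tiles(tile_ids, train_fraction=0.8):
    """Shuffle annotated tiles and split them into 80:20 training and
    validation subsets."""
    ids = rng.permutation(tile_ids)
    cut = int(len(ids) * train_fraction)
    return ids[:cut], ids[cut:]

def segment_from_probs(probs, threshold=0.5, ignore_label=-1):
    """Turn per-point class probabilities into labels: take the argmax,
    but mark points whose best probability falls below the threshold
    as `ignore_label` (a thresholded probability map)."""
    labels = probs.argmax(axis=1)
    labels[probs.max(axis=1) < threshold] = ignore_label
    return labels
```

Points flagged with the ignore label can then be excluded from the subsequent class-wise reconstruction.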

3.4 3D Reconstruction and Texturing

Using the results from the previous step, the post-processing steps can be executed. This includes 3D mesh reconstruction from the class-specific point clouds with semantic information coming from the initial GIS data. Depending on the class, the result leads to 3D meshes of buildings in LoD2 or higher, depending on the required Level of Information Need (LoIN), within a given coordinate reference system. The final meshes are first instanced as individual objects and then converted to a suitable data model, such as IFC, and textured according to the orthophotos.
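As a toy stand-in for the class-wise surface reconstruction, the sketch below triangulates a regular 2.5D height grid rasterised from one class's points; a real pipeline would apply e.g. Poisson or Delaunay reconstruction to the raw point cloud, so this is only meant to make the geometry of the meshing step concrete.

```python
import numpy as np

def grid_mesh(heights):
    """Triangulate a regular 2.5D height grid into a mesh.

    heights : (H, W) elevation samples rasterised from one class's
    point cloud. Returns (vertices, triangles); each grid cell is
    split into two triangles.
    """
    h, w = heights.shape
    ys, xs = np.mgrid[0:h, 0:w]
    vertices = np.column_stack([xs.ravel(), ys.ravel(), heights.ravel()])
    tris = []
    for r in range(h - 1):
        for c in range(w - 1):
            i = r * w + c
            tris.append((i, i + 1, i + w))          # upper-left triangle
            tris.append((i + 1, i + w + 1, i + w))  # lower-right triangle
    return vertices, np.array(tris)
```

The resulting vertex and triangle arrays map directly onto a triangulated face set when exporting to IFC.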

4 Implementation

The developed pipeline for this study was implemented mainly in Python, with minor portions realised through Python bindings to packages in C++ or through background processes. Figure 2 showcases the steps of the workflow, starting with downloading the original data directly from the geoportals of the States of Thuringia and Saxony, consisting of coloured orthophotos, LiDAR scans, cadastre masks and CityGML models of the cities of Erfurt, Jena and Weimar in the State of Thuringia as well as Dresden and Leipzig in the State of Saxony, and annotating them semi-automatically before proof-checking manually. Table 1 elaborates the various data types and formats included in the acquired dataset.

Table 1: Freely available input data used within the implementation and case-study.
Data Format Description
Digital Surface Model (DSM) *.laz 3D PCD, uncoloured
Digital Elevation Model (DEM) *.laz 3D PCD, uncoloured
Digital Orthophotos (DOP) *.tiff 2D images, coloured
Cadastral Maps (ALKIS) *.shp 2D vector data
Topographical Maps (ATKIS) *.shp 2D vector data
Buildings, LoD2 (CityGML1) *.gml 3D city model data

4.1 Preprocessing and Annotation of Input Data

The base dataset from LiDAR scans is provided in the LAS data format. It consists mainly of three layers: ground, above ground and outliers (i.e., Ground, 20 and Unclassified in Thuringian LAS files). Table 2 details the default classes available in the LAS format. Relying solely on the cadastre masks without further preprocessing was found insufficient to correctly annotate points of specific classes, such as points on external walls and corners of buildings, roads and railways, which may lie outside the perimeter of the shape instances that cannot catch their fine details. Therefore, to automate the annotation of the other classes, each tile is processed with Connected Components Extraction and Density-Based Spatial Clustering of Applications with Noise (DBSCAN) to cluster the point cloud; the initial clustering is then used for set operations with overlaid masks from the relevant ATKIS shapes, which are geometrically manipulated to create buffer layers for annotation with the overlapping label extracted from their attribute tables.
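A minimal brute-force version of the DBSCAN step can be sketched as follows; it is quadratic in the number of points and only meant to illustrate the core-point expansion logic, not to replace indexed implementations such as CloudCompare's octree-based clustering or scikit-learn's DBSCAN.

```python
import numpy as np

def dbscan(points, eps=1.0, min_pts=3):
    """Minimal brute-force DBSCAN: returns one cluster id per point
    (-1 = noise)."""
    n = len(points)
    # Pairwise distances and epsilon-neighbourhoods (includes self).
    dist = np.linalg.norm(points[:, None] - points[None, :], axis=2)
    neighbours = [np.flatnonzero(dist[i] <= eps) for i in range(n)]
    labels = np.full(n, -1)
    cluster = 0
    for i in range(n):
        if labels[i] != -1 or len(neighbours[i]) < min_pts:
            continue  # already assigned, or not a core point
        # Grow a new cluster from core point i by iterative expansion.
        labels[i] = cluster
        queue = list(neighbours[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster
                if len(neighbours[j]) >= min_pts:
                    queue.extend(neighbours[j])
        cluster += 1
    return labels
```

Each resulting cluster can then be intersected with the buffered ATKIS masks in the subsequent set operations.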

Table 2: Values and meanings of default LAS classes in LiDAR scans of dataset.
LAS Class Meaning LAS Class Meaning
0 Created, Never Classified 10 Rail
1 Unclassified 11 Road Surface
2 Ground 12 Reserved
3 Low Vegetation 13 Wire Guard (Shield)
4 Medium Vegetation 14 Wire Conductor (Phase)
5 High Vegetation 15 Transmission Tower
6 Building 16 Wire Structure Connector (Insulator)
7 Low Point (Noise) 17 Bridge Deck
8 Reserved 18 High Noise
9 Water >18 User Defined
Figure 2: Detailed process map elaborating the methods used.

The connected components labelling with CloudCompare operates by propagating a front on the surface implicitly represented by the point cloud with the Fast Marching algorithm, making it dependent on scalar distance values associated with each point and neighbouring entities within an octree structure; this is followed by smoothing via the Gaussian distance field gradient [50]. A sample of the dataset's original PCD and orthophotos and the underlying processed clustering interim results for the semi-automated annotation are showcased in fig. 3.

Figure 3: Overview of the original dataset tiles of LiDAR scans, Orthophotos and ATKIS mask and their initial pre-processing steps; (a) original LiDAR scan of a tile in Jena, Thuringia, (b) The tile’s relevant orthophoto, (c) Colourised LAS file based on the sampling of colour from the orthophoto onto the vertices of the point cloud, (d) Connected Component Clustering of the LAS classes, (e) Labelled Gaussian mixture model fit to the orthophoto’s pixels as features in RGB colour space, (f) ATKIS shape masks overlaid onto the orthophoto.

The cadastral shape files follow a very consistent and detailed data schema that is managed by the Working Committee of the Surveying Authorities of the States of the Federal Republic of Germany (AdV) and disseminated through the Documentation on the Modelling of Geoinformation of Official Surveying and Mapping (GeoInfoDok; German: Dokumentation zur Modellierung der Geoinformationen des amtlichen Vermessungswesens) [51]. The adopted and currently used version 7.1.2 was used to query the relevant attributes from the transport mode attributes for rail and road and thus facilitate the automatic annotation of the dataset.

The data structure is designed to store and organise GIS data within the so-called AAA schema to facilitate managing and querying spatial data [51]. AAA stands for AFIS/ALKIS/ATKIS, which refer to the official reference-point network, the cadastre data and the topographical maps respectively. The geographical information contains the actual spatial data, including geometric representations of geographic features such as points, lines and polygons, which are stored in the Shapefile format (i.e., .shp). The associated attributes or properties of the geographic features are stored as key-value pairs, with information about the keys and the values' units derived from the GeoInfoDok. The geodetic Coordinate Reference System (CRS) is ETRS89 in UTM zones 32N and 33N (i.e., EPSG 25832 and 25833) for the States of Thuringia and Saxony respectively; the CRS name, authority and parameters are stored alongside the geometries. Since the number of attributes queried was very limited, no attempt was made to use prior knowledge about indexing the data to optimise for efficient spatial queries. For the roadway axis object "AX_Fahrbahnachse", the attributes listed in table 3 were queried.

Table 3: Relevant roadway attributes extracted from the shape of the roadway axis object "AX_Fahrbahnachse" for labelling PCD. The first column contains the attribute label, a three-letter abbreviation of the attribute name in German. The second column provides the English translation and the third shows the data type of the values as strings, integers or string enumerators (i.e., str, int, str enum) respectively.
Label Meaning Data Type
NAM Name str
BEZ Label str
BRF Lane Width str
FSZ Lane Number int
WDM Classification str enum
FTR Roadway Separation str enum
STS Road Key str
IBD International Significance str enum
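The attribute query described above can be sketched in plain Python. This is a minimal illustration that assumes the shapefile records have already been loaded into dictionaries (e.g., via geopandas.read_file(...).to_dict("records")); the sample record and its values are hypothetical.

```python
# Minimal sketch of the attribute query on "AX_Fahrbahnachse" records (Table 3).
# Assumes the shapefile rows were already loaded into dictionaries; the sample
# values below are hypothetical and not taken from the actual dataset.

ROAD_FIELDS = ["NAM", "BEZ", "BRF", "FSZ", "WDM", "FTR", "STS", "IBD"]

def query_road_attributes(records):
    """Keep only the Table-3 keys of each record, dropping empty values."""
    return [
        {key: rec.get(key) for key in ROAD_FIELDS if rec.get(key) is not None}
        for rec in records
    ]

sample = [{"NAM": "B7", "BRF": "7.5", "FSZ": 2, "WDM": "1301", "geometry": None}]
print(query_road_attributes(sample)[0])
```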

For railways, the attributes of the railway shape objects "AX_Bahnstrecke" and "AX_Bahnverkehr" were queried for the key-value pairs listed in table 4. The attribute values derived from the ATKIS masks are then used for set operations on the clustered above-ground LAS layer, with an empirically derived threshold of >65% cluster overlap to automate the class labelling while balancing precision and computing speed, in order to extract the vegetation, elevated road, bridge and railway classes. For overlaps below this threshold, a further step sorts for the nearest label from the cluster centre that annotates the largest sample of cluster points within a buffer zone of 0.75x the mean width in 2D space [52]. The following classes could be identified using the aforementioned approach:

  1. Vegetation: high and low.

  2. Road: traffic roads with their pavements, footpaths and parking lots.

  3. Railway: trackbeds including sleepers and ballast; subclasses for rails, posts and cables could be extracted, but were eventually merged into one class for balancing reasons.

  4. Crossings: bridges over land, water bodies or railways as well as tunnels, assigned to either the road or the railway class.

  5. Buildings.

  6. Miscellaneous: "undefined" for found objects that fit none of the aforementioned categories, like a sparse bird flock in Jena, vehicles in Erfurt, ships and boats in Dresden, transmission towers in Weimar, etc.
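The overlap-threshold labelling rule described above can be sketched as follows. This is a minimal illustration that assumes the per-class intersection areas between a cluster footprint and the ATKIS masks have already been computed (e.g., with Shapely); the area values are illustrative.

```python
# Minimal sketch of the overlap-based labelling rule (threshold from the text: >65%).
# `intersections` maps an ATKIS class label to the overlap area between the
# cluster footprint and that class mask; areas here are illustrative values.

OVERLAP_THRESHOLD = 0.65

def label_cluster(cluster_area, intersections):
    """Assign the ATKIS label whose mask covers more than 65% of the cluster."""
    if cluster_area <= 0:
        return "undefined"
    best_label, best_area = max(intersections.items(), key=lambda kv: kv[1],
                                default=("undefined", 0.0))
    if best_area / cluster_area > OVERLAP_THRESHOLD:
        return best_label
    # below the threshold, the paper falls back to a nearest-label buffer search
    return "undefined"

print(label_cluster(100.0, {"railway": 80.0, "road": 5.0}))   # railway
print(label_cluster(100.0, {"railway": 40.0, "road": 30.0}))  # undefined
```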

Table 4: Relevant railway attributes extracted from the shape of the railway axes shape objects “AX_Bahnstrecke” and “AX_Bahnverkehr” for labelling PCD.
Label Meaning Data Type
BKT Rail Category str enum
ELK Electrification str enum
GLS Number of Tracks str
NRB Rail Route Number str
SPW Track Gauge str enum
ZUS Condition str enum

For the generation of 2D masks from orthophotos, two methods have been tested: (1) using overlay set operations with geometries derived from the ATKIS shapes on synthetically generated, connected-components-labelled images, which originate from a Gaussian mixture model fit to the pixels' RGB values of the original orthophotos as features in colour space, and (2) trials of instance segmentation via inference with zero-shot generalisation from the Segment Anything Model [53]. Especially in cases of pale vegetation and buildings whose rooftops contrast weakly with their surroundings, method (2) produced better image clusters for annotation.

4.2 Training and Testing of Point Cloud Segmentation DNN Framework

The processed dataset included 313 grid cells of 1x1 kilometres from the cities of Erfurt, Jena and Weimar in Thuringia and 236 cells of 2x2 kilometres from the cities of Leipzig and Dresden in Saxony, giving a total of 1,257 cells after quad-splitting the Saxon grid cells to 1x1 kilometres. Every cell was further quad-split, with the number of points per block set to 4,096 for training with limited GPU resources. The following classes were annotated: buildings, vegetation, water bodies, roads, railway trackbeds and a miscellaneous class for none of the aforementioned categories.
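The quad-splitting and fixed-size block sampling can be sketched as follows; a minimal illustration with hypothetical helper names, not the project's actual preprocessing code.

```python
# Sketch of quad-splitting a cell into quadrant blocks and sampling each block
# to a fixed point count (4,096 in the paper). Helper names are hypothetical.
import random

def quad_split(points, x_mid, y_mid):
    """Split a cell's (x, y, z) points into four quadrant blocks around (x_mid, y_mid)."""
    blocks = {"sw": [], "se": [], "nw": [], "ne": []}
    for x, y, z in points:
        key = ("n" if y >= y_mid else "s") + ("e" if x >= x_mid else "w")
        blocks[key].append((x, y, z))
    return blocks

def sample_block(block, n_points=4096, seed=0):
    """Down- or up-sample a block to a fixed point count for the network input."""
    rng = random.Random(seed)
    if len(block) >= n_points:
        return rng.sample(block, n_points)
    # pad sparse blocks by resampling with replacement
    return block + [rng.choice(block) for _ in range(n_points - len(block))]
```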

The custom dataset had to be manipulated and converted through an intermediary MMDetection3D format to provide a description of the 3D boxes, with 600 epochs used for training the modified 2DPASS architecture. The best trained model achieved a semantic segmentation result of 71.48% Mean Intersection over Union (MIoU). Further sensitivity analyses to optimise the training hyper-parameters are needed, and an increase of the number of points per block to 8,192 should be investigated to determine whether the segmentation can yield a higher value of the evaluation metric.

4.3 Semantic Enrichment for BIM

This step takes the segmentation results along with the colourised PCD sampled from orthophotos to construct 3D primitive shapes and enrich them with context generated from the derived segmentation and related GIS data, or to mesh 3D surfaces with texture, with a focus on IFC compatibility for integration into the information model.

Table 5: Freely available segmentation output data used within the implementation and their suggested relevant shape representation and entity in IFC.
Segmentation Class LAS Class IFC Product IFC Entity
Ground Terrain 2 Tri/PolygonalFaceSet IfcSite
Overground 20 - IfcProxy
Vegetation 2, 3, 4, 5, 20 FacetedBrep IfcProxy
Buildings 0, 1, 6, 20 Tri/PolygonalFaceSet IfcBuilding
Water 2, 9 Tri/PolygonalFaceSet IfcProxy
Roads 2, 11 Tri/PolygonalFaceSet IfcPavement
Railway Body 2, 10 FacetedBrep IfcProxy
Track Alignment 2, 10 PositioningElement IfcAlignment
Miscellaneous 0, 1, 2 Tri/PolygonalFaceSet IfcProxy
Unclassified 0, 1 FacetedBrep IfcProxy

4.3.1 Constructing IFC-Alignment Objects

In order to create not only geometrically but also semantically proper IfcAlignment entities, the segment type needs to be assigned to the respective alignment points originating from the semantic segmentation. Hence, it is not sufficient to assign only the affiliation 'alignment'; the segment type, such as line, curve or clothoid, also needs to be assigned to every point.

In an approach similar to Lin et al. [54], the prediction mask's boundaries containing railway tracks and connecting bridges could be isolated in a projected 2.5D raster using the Python packages GeoPandas, Shapely, Rasterio and CloudCompare in a background process. This served as the foreground for estimating the centreline of the tracks through a normalised Euclidean distance transformation and local maxima extraction with a maximum search radius equal to the minimum nearest-neighbour value. In case of branching or merging of the elongated stripes, cleaning for noise and further processing steps are necessary in order to predict instances properly.

The stripes can be split by skeletonising the boundary shape of the foreground and applying Hit-or-Miss morphological operations to identify junctions and split points, separating the skeleton into individual segments. Any of the available line decimation algorithms, such as Ramer-Douglas-Peucker or Visvalingam-Whyatt, can then estimate a simplified polyline of the split skeleton segments, similar to the approach implemented in [55], which can be smoothed using polynomial curve fitting. The estimated smoothed curve points can be used to identify their closest points in the actual 2.5D raster and to retrieve their original height values via their identifiers.
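As an illustration of the line decimation step, a minimal pure-Python version of the Ramer-Douglas-Peucker algorithm might look like this (the actual implementation in [55] may differ):

```python
# Minimal Ramer-Douglas-Peucker polyline simplification for 2D points.
import math

def rdp(points, epsilon):
    """Recursively drop points closer than epsilon to the chord of the polyline."""
    if len(points) < 3:
        return list(points)
    (x1, y1), (x2, y2) = points[0], points[-1]
    dx, dy = x2 - x1, y2 - y1
    norm = math.hypot(dx, dy) or 1.0
    # find the interior point farthest from the chord
    index, d_max = 0, 0.0
    for i in range(1, len(points) - 1):
        px, py = points[i]
        d = abs(dy * (px - x1) - dx * (py - y1)) / norm
        if d > d_max:
            index, d_max = i, d
    if d_max > epsilon:
        left = rdp(points[:index + 1], epsilon)
        right = rdp(points[index:], epsilon)
        return left[:-1] + right
    return [points[0], points[-1]]

# A noisy, nearly straight skeleton branch collapses to its endpoints:
print(rdp([(0, 0), (1, 0.05), (2, -0.04), (3, 0)], epsilon=0.1))  # [(0, 0), (3, 0)]
```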

In the IFC Rail domain, the construction of railway alignments requires many different entities due to the parametric nature of railway planning. Within the IFC 4.3.x specification, horizontal alignments can be composed of four different segment types: lines, circular arcs, linear curvatures such as clothoids, or nonlinear curvatures such as Helmert curves [56]. All horizontal segments are, among other parameters, defined by their StartPoint (a Cartesian point of x- and y-coordinates), a StartDirection (expressed as an angle) and their SegmentLength, which leads to the next so-called station of the alignment. In case of arcs and clothoids, further parameters, such as the radius, are necessary to construct the geometric elements. Those elements form the horizontal alignment in a plane. To also represent the course of the height, the construction of an IfcGradientCurve is necessary. The final alignment must always include both the vertical and the horizontal alignment.

The segmented points of the railway track do not yet contain the information about their segment affiliation. Therefore, fitting algorithms are used to identify the different segment types. Once the coordinates can be assigned to a segment type (e.g., IfcLine, IfcClothoid and IfcCurveSegment), the IfcAlignment objects can be constructed. In the end, not all coordinates are necessary to define a segment: once the start and end points of a segment are identified, the other points assigned to it can be used to derive the remaining parameters, such as the radius, the tangent direction or the segment length.
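Deriving the basic horizontal parameters from points assigned to a straight segment can be sketched as below; this is a minimal illustration of the StartPoint/StartDirection/SegmentLength concept, not the full IfcAlignment construction.

```python
# Sketch: derive IfcAlignment-style parameters for a straight horizontal segment
# from points that have already been assigned to it.
import math

def line_segment_parameters(points):
    """Return StartPoint, StartDirection (radians) and SegmentLength
    from the first and last point of a fitted straight segment."""
    (x1, y1), (x2, y2) = points[0], points[-1]
    return {
        "StartPoint": (x1, y1),
        "StartDirection": math.atan2(y2 - y1, x2 - x1),
        "SegmentLength": math.hypot(x2 - x1, y2 - y1),
    }

params = line_segment_parameters([(0.0, 0.0), (3.0, 4.0)])
print(params["SegmentLength"])  # 5.0
```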

To estimate the segment types and their parameters, a 2.5D raster of the segmented railway-track inference can be generated. This raster can be thresholded into a binary image and further processed with a morphological skeletonisation step to extract the medial axis of the railway track instances. Depending on the skeletonisation method used, further processing steps to clean and smooth the axes may be required. Afterwards, the Hough transformation can be used on the skeletonised 2.5D raster to detect straight lines and circular segments. The remaining parts of the medial axes can be fitted with clothoids via G¹ Hermite interpolation, interpolating between end points and/or junctions detected using the binary Hit-or-Miss transform, with tangent directions derived from the oriented gradient map of the Euclidean distance transform.
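As a simple algebraic alternative for checking arc candidates found by the Hough transform, a Kåsa least-squares circle fit can be sketched in pure Python; this particular fitting method is an illustrative assumption, not necessarily the one used in the implementation.

```python
# Kåsa least-squares circle fit: with the circle written as
# x^2 + y^2 = 2ax + 2by + c, the residual is linear in (a, b, c),
# so the normal equations form a 3x3 linear system.

def det3(m):
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
          - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
          + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def solve3(A, b):
    """Solve a 3x3 linear system by Cramer's rule."""
    d = det3(A)
    return [det3([[b[i] if k == j else A[i][k] for k in range(3)]
                  for i in range(3)]) / d for j in range(3)]

def fit_circle(points):
    """Return (centre, radius) of the least-squares circle through 2D points."""
    n = len(points)
    sx = sum(x for x, _ in points); sy = sum(y for _, y in points)
    sxx = sum(x * x for x, _ in points); syy = sum(y * y for _, y in points)
    sxy = sum(x * y for x, y in points)
    sxz = sum(x * (x * x + y * y) for x, y in points)
    syz = sum(y * (x * x + y * y) for x, y in points)
    A = [[2 * sxx, 2 * sxy, sx],
         [2 * sxy, 2 * syy, sy],
         [2 * sx,  2 * sy,  n]]
    a, b, c = solve3(A, [sxz, syz, sxx + syy])
    return (a, b), (c + a * a + b * b) ** 0.5

centre, radius = fit_circle([(6.0, 2.0), (1.0, 7.0), (-4.0, 2.0), (1.0, -3.0)])
print(centre, radius)  # centre (1, 2), radius 5 for these exact points
```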

4.3.2 Colour Sampling and Texturing

As a single colour value alone is sometimes insufficient to model realistic appearances or provide information, mapping a texture onto a 2D or 3D surface, much like gift wrapping, is a long-established process in the animation and graphics fields. It is, however, rarely adopted in BIM-related applications, except in the research-specific context of damage modelling, on which most of the published case studies have been focused. The different spaces considered when unwrapping and mapping a texture to a 3D or 2D surface in 3D are shown in fig. 4. The texture can be mapped to the object space directly by defining a parametric representation of a surface (e.g., a sphere) to map (U, V) to (S, T) within the interval [0, 1].

Refer to caption
Figure 4: Process for texture mapping through parameterised intermediary geometry.

By defining the geometric representation of the shape mathematically, the (U, V) of each vertex can be matched to its corresponding (S, T). This can be achieved by directly defining the mathematical equations of a surface, by manually unwrapping a 3D shape into a 2D surface using specialised UV-mapping and editing software, or by projecting the texture via an intermediate geometry (e.g., a cylinder) when a discrete or symbolic definition of the shape cannot easily be formulated, especially for free-form, complicated 3D shapes. Each of the aforementioned options has its own advantages and limitations depending on the parameterisation constraints and method used, as well as on the texture pattern itself.
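As a concrete example of a parametric intermediary geometry, mapping a vertex through a sphere parameterisation to texture coordinates in [0, 1]² can be sketched as:

```python
# Spherical UV mapping: longitude -> u, colatitude -> v, both in [0, 1].
import math

def spherical_uv(x, y, z):
    """Map a vertex on (or radially projected to) a sphere to (u, v)."""
    r = math.sqrt(x * x + y * y + z * z) or 1.0
    u = 0.5 + math.atan2(y, x) / (2.0 * math.pi)          # longitude
    v = math.acos(max(-1.0, min(1.0, z / r))) / math.pi   # colatitude
    return u, v

print(spherical_uv(1.0, 0.0, 0.0))  # (0.5, 0.5): equator on the prime meridian
```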

A Python script has been developed to generate a UV-mapped, textured 2-manifold mesh. The Trimesh library is used to create a mesh from the point cloud via Poisson surface reconstruction. Downsampling and normal estimation are performed using Open3D [57], and vertices with low density are removed from the mesh to reduce its size. A texture atlas is generated automatically for UV-mapping, where texture coordinates at edges are generated into a collection of parameterised texture maps compiled together into an atlas [58, 59, 60] as a Portable Network Graphics (PNG) image. The UV-mapped mesh is regenerated using PyMeshLab with the vertex colouring information transferred to a texture and saved as an .obj file linked to the texture atlas (a .png image) defined through a material .mtl file. Figure 5 displays the process of reconstructing a 3D textured mesh from the building instance vertices of the Church of Our Lady (i.e., Frauenkirche).

Refer to caption
(a)
Refer to caption
(b)
Refer to caption
(c)
Refer to caption
(d)
Refer to caption
(e)
Refer to caption
(f)
Figure 5: Overview of the 3D reconstruction process of a building instance; (a) 3D view of a point cloud instance of the Frauenkirche in Dresden, (b) Randomly coloured clusters of the point cloud instance via mean shift clustering algorithm, (c) Final triangle meshed surface of the point cloud after iterative subsampling, ball pivoting surface reconstruction and triple axial dust accumulation simulation finalised with a screened Poisson surface reconstruction, (d) 3D view of the intermediary parametrised geometry to the UV-mapped mesh surface in texture space, (e) Final texture atlas of the triangle meshed surface of the point cloud as an RGB colour space image in PNG format. (f) IFC textured model viewed with BlenderBIM.

The use of the IfcIndexedTriangleTextureMap entity to model a texture on a triangulated mesh represented as an IfcTriangulatedFaceSet has been successfully attempted a few times. It shows great potential for modelling texture-relevant information in IFC, from its initial concept of information modelling for damages in bridges to other infrastructure domains. However, the inclusion of coloured textures has not been widely adopted due to the difficulties of mapping textures to 3D models automatically and visualising them correctly in IFC viewers.

The Blender software with the BlenderBIM add-on proved in this study to be the most suitable for this task. A future expansion of the IFC schema could include new entities like IfcIndexedPolygonalTextureMap to provide the means to map 2D textures onto quad meshes or onto an IfcPolygonalFaceSet [61]. Table 5 lists the possible segmentation results via inference and the corresponding IFC entities suitable for their integration into BIM.

5 Case Study

Grid cell number "33410_5656_2_sn" in the city of Dresden, Saxony, from the test set is used for inference. Figure 6 showcases the output for vegetation, buildings and railway trackbeds. A colourised LiDAR scan of the input, sampled from its related orthophoto, is used to add colour to the segmented vertices.

Refer to caption
(a)
Refer to caption
(b)
Refer to caption
(c)
Refer to caption
(d)
Refer to caption
(e)
Figure 6: Point cloud semantic segmentation; (a) Colourised Aerial LiDAR scan, (b) Segmented vegetation via inference followed by connected component clustering, (c) Segmented buildings via inference followed by connected component clustering, (d) Segmented railway trackbeds via inference followed by connected component clustering, (e) Sampled colours to the clustered instances of segmented buildings.

5.1 Recreating Horizontal and Vertical Alignment

The first use case demonstrates roughly estimating the alignment of the segmented railway trackbed. A 2.5D raster of 5000x5000 pixels resolution is generated to facilitate the estimation through classical image-processing methods, as shown in fig. 7. First, the image is thresholded into a binary image and cleaned from noise and thin elements with morphological opening and closing. Second, the Euclidean distance transform (EDT) is calculated to estimate the orientation and magnitude of the gradients of the EDT map.
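The distance-to-background computation can be illustrated with a simple 4-connected BFS approximation of the EDT; the implementation presumably uses a true Euclidean transform (e.g., scipy.ndimage.distance_transform_edt), and this sketch assumes the foreground is bordered by background pixels.

```python
# 4-connected BFS approximation of the distance-to-background map used to
# locate the track centreline ridge. grid: 1 = foreground, 0 = background.
# Assumes at least one background pixel exists (e.g., the raster border).
from collections import deque

def bfs_distance_transform(grid):
    h, w = len(grid), len(grid[0])
    dist = [[0 if grid[r][c] == 0 else None for c in range(w)] for r in range(h)]
    queue = deque((r, c) for r in range(h) for c in range(w) if grid[r][c] == 0)
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < h and 0 <= nc < w and dist[nr][nc] is None:
                dist[nr][nc] = dist[r][c] + 1
                queue.append((nr, nc))
    return dist

# The distance is maximal along the middle of an elongated stripe:
stripe = [[0] * 5, [0, 1, 1, 1, 0], [0, 1, 1, 1, 0], [0, 1, 1, 1, 0], [0] * 5]
print(bfs_distance_transform(stripe)[2][2])  # 2
```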

Refer to caption
(a)
Refer to caption
(b)
Refer to caption
(c)
Refer to caption
(d)
Refer to caption
(e)
Refer to caption
(f)
Refer to caption
(g)
Refer to caption
(h)
Figure 7: Fitting alignment segments to track-bed points. Sub-figures (e) and (f) have been drastically downscaled with cubic interpolation and manipulated by increasing the exposure to allow for better visibility of single pixel thick morphological skeleton in the manuscript; (a) 2.5D Raster image, (b) Binary raster image after opening and closing processing steps, (c) Euclidean Distance Transform map of the binary raster, (d) HSV colour representation of the orientation capped at 180 degrees and normalised magnitude of gradients to the euclidean distance transform map, (e) Skeletonised morphology of the binary image, (f) Labelled segments of the split skeleton assigned to random colours using minimum spanning tree of branching clusters to identify junctions [62], (g) A zoomed-in detail of the red bordered region marked in (f) for demonstration, (h) Labelled nodes of the Minimum Spanning Tree for the skeleton pixel branches with a junction node correctly identified at label 50.

Third, the gradient orientation is used as a prior guess of the vector normal to the tangent direction along the medial axis. Fourth, the axis of the foreground itself is extracted using morphological skeletonisation and further processed by smoothing and cleaning of noise and small features below 2 metres (i.e., ≈5 pixels). Fifth, the skeleton is split into separately labelled entities by detecting junction and end points, removing the junctions temporarily to disconnect the branches of the skeleton from each other to allow for connected component labelling (CCL), and then reassembling them with their connected labels. Sixth, the line segments are extracted using the probabilistic Hough transform.
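The junction detection in the fifth step can be illustrated with a simple neighbour-count heuristic on a one-pixel-wide skeleton; note that, unlike the Hit-or-Miss kernels used here, this heuristic may also flag pixels directly adjacent to a true junction, which would need clustering afterwards.

```python
# Heuristic junction detection on a 1-px skeleton: a foreground pixel with
# three or more 8-connected foreground neighbours is a junction candidate
# (endpoints have exactly one neighbour, branch pixels have two).

def skeleton_junctions(skel):
    h, w = len(skel), len(skel[0])
    junctions = []
    for r in range(h):
        for c in range(w):
            if not skel[r][c]:
                continue
            nbrs = sum(skel[nr][nc]
                       for nr in range(max(0, r - 1), min(h, r + 2))
                       for nc in range(max(0, c - 1), min(w, c + 2))
                       if (nr, nc) != (r, c))
            if nbrs >= 3:
                junctions.append((r, c))
    return junctions

cross = [[0, 0, 1, 0, 0],
         [0, 0, 1, 0, 0],
         [1, 1, 1, 1, 1],
         [0, 0, 1, 0, 0],
         [0, 0, 1, 0, 0]]
print((2, 2) in skeleton_junctions(cross))  # True
```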

Refer to caption
(a)
Refer to caption
(b)
Refer to caption
(c)
Refer to caption
(d)
Refer to caption
(e)
Figure 8: Fitting alignment segments to track-bed points; (a) Detected line segments using probabilistic Hough Transform, (b) Best detected Hough circle(s) within the interval [450, 10,000] with 5 metres step, (c) Fitted points to clothoid segments from the decimated remaining curves of the skeletonised image via Visvalingam-Whyatt algorithm (d) Final fitted clothoidal, circular and line segments in red, green and blue respectively, (e) Overlay of the fitted curves and line segments for the recreated railway alignment onto the colourised LiDAR scan viewed in QGIS software.

Seventh, circular segments of the skeleton are similarly estimated within a search interval of [450, 10,000] metres with a 5-metre step. The circular segments are extracted from the full circles with a Boolean bitwise AND operation on the original skeleton, then converted into polar coordinates through the end points of every segment to allow for ordered retrieval of each curve's pixel indices from its branch in the minimum spanning tree. Eighth, the remaining skeleton parts are decimated through the Visvalingam-Whyatt algorithm and fitted with clothoids via G¹ Hermite interpolation based on the end points and the estimated tangent vector conformed normal to the gradient vector.

Finally, the parameterised line, circular and clothoidal segments are modelled as LineStrings in a geodataframe and exported into the GeoJSON format, to be later used as input for automatically generating the vertical and horizontal alignments in IFC using IfcOpenShell.
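The GeoJSON export step can be sketched without geopandas using only the standard library; the property names and the legacy "crs" member here are illustrative, not the actual schema used.

```python
# Sketch: serialise fitted alignment segments as GeoJSON LineString features.
# Property names are illustrative; the "crs" member follows the legacy
# (pre-RFC 7946) convention still written by some GIS tools.
import json

def segments_to_geojson(segments, crs_epsg=25833):
    features = [{
        "type": "Feature",
        "geometry": {"type": "LineString", "coordinates": seg["coords"]},
        "properties": {"segment_type": seg["type"]},
    } for seg in segments]
    return json.dumps({
        "type": "FeatureCollection",
        "crs": {"type": "name",
                "properties": {"name": f"urn:ogc:def:crs:EPSG::{crs_epsg}"}},
        "features": features,
    })

doc = segments_to_geojson([{"type": "line", "coords": [[0, 0], [10, 0]]}])
print(json.loads(doc)["features"][0]["geometry"]["type"])  # LineString
```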

5.2 BIM of Progressive Textured Meshes

The second use case is demonstrated on the Church of Our Lady (Frauenkirche), found at the lower-right corner of the cell in fig. 9, to show the possibility of integrating 3D coloured texture into IFC after meshing and automated UV-mapping of the vertices' colours to a texture atlas using Xatlas. IfcOpenShell, BlenderBIM and FreeCAD are used to parse the mesh, the texture atlas and the UV-map from the Wavefront OBJ format into their relevant IFC entities to generate a simple test project that contains the meshed instance as an IfcBuilding, with the local path to the texture PNG image linked into the IFC file through a URI. Figure 9 shows the IFC model visualised in BlenderBIM alongside the embedded unwrapped UV-map of the model shown in the UV-editor window on the left. The UML diagram elaborates the relations defined between the various entities of the textured building.

Refer to caption
(a)
Refer to caption
(b)
Figure 9: A simplified as-is textured IFC model with colours sampled directly from PCD; (a) The textured IFC model on the right and the UV-mapping window on the left, displaying the texture atlas (IfcImageTexture) automatically mapped through an IfcSurfaceStyleWithTextures to the shape representation of the triangle-meshed surface (IfcPolygonalFaceSet) of the IfcBuilding instance, displayed in BlenderBIM, (b) UML diagram demonstrating the use of IfcIndexedPolygonalTextureMap to provide the texture and texture coordinates for IfcPolygonalFaceSet from a texture atlas.

6 Discussion

6.1 Evaluation of Recreated Alignment

The difference between the recreated horizontal alignment (s) and the provided shape from ATKIS (t) is quantified using the Root Mean Square Deviation (RMSD) and the coefficient of variation of the RMSD (CV), calculated using eqs. 1 and 2. Over a total length of 3870.444 metres, the RMSD and the CV thereof, normalised by the mean value of the measured Euclidean distances (δi), are calculated to be 12.506 metres and ±1.143 respectively.

RMSD(s,t)=\sqrt{\cfrac{\sum_{i=1}^{n}\left[(s_{ix}-t_{ix})^{2}+(s_{iy}-t_{iy})^{2}\right]}{n}} (1)
CV(RMSD)=\cfrac{RMSD}{\tilde{\delta}_{i}} (2)
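A small sketch of eqs. 1 and 2 in Python, using the mean of the point-wise Euclidean distances as the normaliser, as stated in the text:

```python
# Eq. (1) and Eq. (2) for matched pairs of recreated (s) and reference (t) points.
import math

def rmsd(s, t):
    """Planar root mean square deviation between matched 2D point lists."""
    n = len(s)
    return math.sqrt(sum((sx - tx) ** 2 + (sy - ty) ** 2
                         for (sx, sy), (tx, ty) in zip(s, t)) / n)

def cv_rmsd(s, t):
    """RMSD normalised by the mean point-wise Euclidean distance."""
    dists = [math.hypot(sx - tx, sy - ty) for (sx, sy), (tx, ty) in zip(s, t)]
    return rmsd(s, t) / (sum(dists) / len(dists))

# Two identical offsets of 5 m give RMSD = 5 and CV = 1:
print(rmsd([(0, 0), (0, 0)], [(3, 4), (3, 4)]))  # 5.0
```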

While the RMSD of the recreated alignment in our presented workflow is higher than results already published in the literature [13, 15], it still has minimal impact on the precision of subsequent outputs when deployed for early planning applications. Furthermore, it provides a comprehensible pipeline to define nodes, edges, dependent features, labels and initialisation weights from the CSR graph of the alignments, which lays the groundwork for designing a more refined regression solution based on a Graph Neural Network (GNN) framework, to be addressed in future work, to derive recreated vertical and horizontal alignments optimally.

Refer to caption
(a)
Refer to caption
(b)
Figure 10: Evaluation of the recreated railway alignments; (a) Overlay of the original ATKIS shapes and the fitted curves and line segments of the recreated railway alignment, in green and red respectively, onto the colourised LiDAR scan viewed in the QGIS software, (b) A plot of the original and recreated alignments after recalculating the constructed curves' vertices for the same x-coordinates.
Refer to caption
(a)
Refer to caption
(b)
Refer to caption
(c)
Refer to caption
(d)
Refer to caption
(e)
Figure 11: Comparison of queried buildings within the vicinity of the recreated railway alignment and the original ATKIS shape using buffer regions of 100 metres; (a) Segmented building PCD displayed onto the colourised LiDAR scan viewed in QGIS, (b) Overlay of the buffer zones of the fitted recreated alignment and the original, in red and green respectively, (c) Queried building instances intersecting the buffer zone of the original alignment, in green, (d) Queried building instances intersecting the buffer zone of the recreated alignment, in red, (e) Overlay of the queried building instances within 100 m proximity of the railway alignments from (c) and (d), resulting in a minor difference of around 4.32%.

6.2 Limitations

6.2.1 Mindful Decision-Making on Technical Feasibility and Applicability

When striving for very accurate results, as many studies do, the inputs must at least meet the targeted standard deviation of the results. When using less accurate data, e.g., in terms of point cloud density, it must be taken into account that certain details or specific object classes (e.g., overhead wires) cannot be accurately reconstructed, or are impossible to reconstruct through meshing at all. The selection of object classes to be segmented must therefore be made carefully.

6.2.2 Accuracy of Colour Sampling from Orthophotos

Colour value sampling from orthographic photos onto PCD depends on the quality of the photos used and may vary between scans. Orthographic photos from aerial scans do not provide the most accurate colour sampling for vertical elements and planes (e.g., external walls and facades of buildings), as these are technically unfavourable to capture from above.

6.2.3 Inherent Limitations of Flyover LiDAR Scans

Related to the aforementioned limitation, a well-known fact of reconstruction from scanning is that the results can only be considered accurate for the built environment that is not subterranean. When creating as-is models solely from data from flyovers and/or overground assessments, the bottom edge of the resulting meshes will only represent an assumption and will not take into account subterranean buildings or infrastructure [63].

When using PCD and orthographic photos from aerial LiDAR scans, the recorded sets of points differ from those scanned at ground level, e.g., by mobile mapping systems that can be mounted on (rail) vehicles. Rails, overhead wires, signals and other parts of the built environment might be covered by vegetation, adjacent buildings, building components or shadows in aerial records. The authors aimed at reducing this effect by including GIS data to fill in the gaps; nonetheless, this aspect needs to be taken into account and implies that the trained neural network model might not be suitable for all kinds of PCD, but rather for those of aerial origin. The neural network has not yet been tested on PCD of other natures. A possible future improvement could be the combination of aerial and ground-level PCDs to reach a well-balanced dataset and to study the effects this might have on the results.

6.2.4 Texture Visualisation in IFC Viewer

The surface reconstruction of segmented PCD into triangle meshes results in a drastic increase in the size of the generated shape entities, which can leap into the gigabyte range if fine meshes and high-resolution texture maps are used. Most shaders and geometry kernels of the available IFC viewers may not be able to process these on average daily-use computers due to the limits of disk space, free cache memory and graphics processing capacity.

6.2.5 Balancing Meshing Complexity with LoD Requirements

How accurately the reconstructed meshes resemble reality strongly correlates with the accuracy and quality of the input data. Besides that, high geometric and semantic detail comes at the cost of an increase in size and amount of information. Therefore, balancing the information and LoD needs of as-is models is crucial.

6.2.6 Advantages and Disadvantages of Texture Embedding in IFC

The embedding of texture paths as Uniform Resource Identifiers (URIs), whether relative or absolute, brings the risk of corrupting the IFC model if the textures are mislocated or the models are exchanged without their linked attachments. However, entities like IfcBlobTexture or IfcPixelTexture in the IFC schema provide a workaround by embedding the texture maps directly in the model, albeit at the cost of a larger model size.

6.3 Potentials

The demonstrated framework automates laborious manual as-is modelling, relying solely on openly available point cloud and GIS data. Especially in early or pre-planning phases of infrastructure projects, as-is models hold the potential to support decision-making processes without spending money on costly surveying tasks, leaving more time and budget for continuative thorough analyses. Combined with already established methods of quantitative difference estimation in bi-temporal scans [64], relevant queries about changes in the railway context can be collected and further processed to gain relatively accurate and up-to-date knowledge at minimal additional cost, provided that the technical expertise is available. The industry standard IFC makes the results accessible both for further planning based on the created as-is models and for textured visualisation and documentation in an open format with a suitable data model. Public providers of cadastral and surveying data could, on the same data base they provide now, offer detailed, classified 3D models of the (built) environment to the public. With this, they would enable many possible use cases in the context of infrastructure planning and at the same time relieve the public purse of avoidable expenses.

7 Conclusions

This paper demonstrated a framework for GIS-informed point cloud segmentation resulting in the creation of BIM-ready as-is models of the built environment in railway infrastructure projects. GIS data was used to create classified masks of coloured point cloud data, including classes such as vegetation, rail and buildings. The segmented point clouds were then used to train a deep learning model, namely 2DPASS, to automatically segment PCD. From the resulting sets of points, class-specific 3D meshes were created, optimised and converted to the openBIM format IFC. The conversion included the proper usage of IFC classes, the enrichment with properties originating from the GIS data and, finally, the texturing of the IFC files with aerial images. All data used during this process originated either from public surveying data, such as [65], or from openly accessible GIS databanks such as OpenStreetMap.

Future work should consider combining other types of input data, such as 2D plans, with PCD in order to create as-is models. This would fill accuracy gaps and reduce assumptions associated with the use of PCD, such as uncertainties regarding underground structures. The result of such an approach would, furthermore, reflect the current practice of infrastructure planning more realistically by taking the best available data into account, and could contribute to upgrading datasets by contextualising and enriching them.

Funding

This research received no external funding.

Data Availability

All original GIS data used for the purpose of this study are freely accessible to download and use from the official websites of the Geoportal of the German Free States of Thuringia and Saxony respectively via the Data licence Germany – attribution – Version 2.0 (dl-de/by-2-0)

Acknowledgements

Many thanks to Ms. Judith Krischler and Prof. Christian Koch for their invaluable contribution and expertise in the original EG-ICE 2023 conference paper that greatly assisted this research. Furthermore, we thank Mr. Sergei Rogozin for his Python programming contribution as a student assistant to help retrieve attributes from ATKIS shape files and create buffered labels for annotating some road and rail subclasses in the dataset.

Conflicts of Interest

The author declares that no conflict of interest exists that could affect, influence or jeopardise the integrity of the research reported in this study.

Abbreviations

The following abbreviations are used in this manuscript:
AEC  Architecture, Engineering and Construction
ATKIS  The Official Topographic-Cartographic Information System (German: Amtliches Topographisch-Kartographisches Informationssystem)
BIM  Building Information Modelling
CCL  Connected Component Labelling
CNN  Convolutional Neural Network
CRS  Coordinate Reference System
DBSCAN  Density-Based Spatial Clustering of Applications with Noise
DL  Deep Learning
DNN  Deep Neural Network
EDT  Euclidean Distance Transform
GIS  Geographic Information Systems
IDM  Information Delivery Manual
IFC  Industry Foundation Classes
LiDAR  Light Detection and Ranging
LoD  Level of Development
LoIN  Level of Information Need
MIoU  Mean Intersection over Union
PCD  Point Cloud Data
PNG  Portable Network Graphics
RANSAC  Random Sample Consensus
SPH  Service Phase
UML  Unified Modelling Language
URI  Uniform Resource Identifier
XR  Extended Reality

References

  • [1] DB InfraGO AG. InfraGO-Zustandsbericht: Netz und Personenbahnhöfe.
  • [2] DB InfraGO AG. Netzzustandsbericht Fahrweg 2022: DB InfraGO AG – Geschäftsbereich Fahrweg.
  • [3] Deutsche Bahn AG. Integrierter Zwischenbericht Januar – Juni 2023, 2023.
  • [4] DB AG. BIM strategy: Implementation of Building Information Modelling (BIM) in the infrastructure division at Deutsche Bahn AG.
  • [5] Michaela Renzel, Markus Schäfer, and Swantje Stürze. Inge – eine arbeitshilfe zur digitalisierung von inbetriebnahmegenehmigungsverfahren. In Christian Hofstadler and Christoph Motzko, editors, Agile Digitalisierung im Baubetrieb, pages 251–266. Springer Fachmedien Wiesbaden, Wiesbaden, 2021.
  • [6] Christian Hofstadler and Christoph Motzko, editors. Agile Digitalisierung im Baubetrieb. Springer Fachmedien Wiesbaden, Wiesbaden, 2021.
  • [7] DB Engineering & Consulting. Geodaten und digitaler Zwilling erweitern BIM-Horizont.
  • [8] Bundesministerium für Digitales und Verkehr. Sachstandsbericht Verkehrsprojekte Deutsche Einheit.
  • [9] Shervin Haghsheno, Gerhard Satzger, Svenja Lauble, and Michael Vössing, editors. Künstliche Intelligenz im Bauwesen: Grundlagen und Anwendungsfälle. Springer Vieweg, Wiesbaden, 2024.
  • [10] Google Maps Platform documentation - Map Tiles API: Photorealistic 3D Tiles, 2024.
  • [11] Judith Krischler, Mohamed Said Helmy Alabassy, and Christian Koch. Bim integration for automated identification of relevant geo-context information via point cloud segmentation. In Tim Broyd, Haijiang Li, and Qiuchen Lu, editors, The 30th International Workshop of the European Group for Intelligent Computing in Engineering (EG-ICE), 2023.
  • [12] M. Soilán, A. Nóvoa, A. Sánchez-Rodríguez, B. Riveiro, and P. Arias. Semantic segmentation of point clouds with pointnet and kpconv architectures applied to railway tunnels. ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, pages 281–288, 2020.
  • [13] Mahendrini Fernando Ariyachandra and Ioannis Brilakis. Automated generation of railway track geometric digital twins (railgdt) from airborne lidar data. In Jimmy Abualdenien, André Borrmann, Lucian-Constantin Ungureanu, and Timo Hartmann, editors, EG-ICE 2021 Workshop on Intelligent Computing in Engineering. Universitätsverlag der TU Berlin.
  • [14] Felix Eickeler and André Borrmann. Building a balanced and well-rounded dataset for railway asset detection. In Jimmy Abualdenien, André Borrmann, Lucian-Constantin Ungureanu, and Timo Hartmann, editors, EG-ICE 2021 Workshop on Intelligent Computing in Engineering. Universitätsverlag der TU Berlin.
  • [15] M. Cserép, A. Demján, F. Mayer, B. Tábori, and P. Hudoba. Effective railroad fragmentation and infrastructure recognition based on dense lidar point clouds. ISPRS Annals of the Photogrammetry, Remote Sensing and Spatial Information Sciences, pages 103–109, 2022.
  • [16] Mahendrini Fernando Ariyachandra and Ioannis Brilakis. Leveraging railway topology to automatically generate track geometric information models from airborne lidar data. Automation in Construction, 155, 2023.
  • [17] Philipp Hüthwohl, Ioannis Brilakis, André Borrmann, and Rafael Sacks. Integrating rc bridge defect information into bim models. Journal of Computing in Civil Engineering, 32(3), 2018.
  • [18] Rafael Sacks, Amir Kedar, André Borrmann, Ling Ma, Ioannis Brilakis, Philipp Hüthwohl, Simon Daum, Uri Kattel, Raz Yosef, Thomas Liebich, Burcu Esen Barutcu, and Sergej Muhic. Seebridge as next generation bridge inspection: Overview, information delivery manual and model view definition. Automation in Construction, 90:134–145, 2018.
  • [19] Mathias Artus, Mohamed Said Helmy Alabassy, and Christian Koch. A bim based framework for damage segmentation, modeling, and visualization using ifc. Applied Sciences, 12(6):2772, 2022.
  • [20] Haotian Tang, Zhijian Liu, Shengyu Zhao, Yujun Lin, Ji Lin, Hanrui Wang, and Song Han. Searching efficient 3d architectures with sparse point-voxel convolution.
  • [21] Jianyun Xu, Ruixiang Zhang, Jian Dou, Yushi Zhu, Jie Sun, and Shiliang Pu. Rpvnet: A deep and efficient range-point-voxel fusion network for lidar point cloud segmentation.
  • [22] Khaled El Madawy, Hazem Rashed, Ahmad El Sallab, Omar Nasr, Hanan Kamel, and Senthil Yogamani. Rgb and lidar fusion based 3d semantic segmentation for autonomous driving.
  • [23] Georg Krispel, Michael Opitz, Georg Waltner, Horst Possegger, and Horst Bischof. Fuseseg: Lidar point cloud segmentation fusing multi-modal data. In 2020 IEEE Winter Conference on Applications of Computer Vision (WACV), pages 1863–1872. IEEE, 2020.
  • [24] Gregory P. Meyer, Jake Charland, Darshan Hegde, Ankit Laddha, and Carlos Vallespi-Gonzalez. Sensor fusion for joint 3d object detection and semantic segmentation.
  • [25] Sourabh Vora, Alex H. Lang, Bassam Helou, and Oscar Beijbom. Pointpainting: Sequential fusion for 3d object detection.
  • [26] Andres Milioto, Ignacio Vizzo, Jens Behley, and Cyrill Stachniss. Rangenet++: Fast and accurate lidar semantic segmentation. In 2019 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pages 4213–4220. IEEE, 2019.
  • [27] Yuhui Yuan, Lang Huang, Jianyuan Guo, Chao Zhang, Xilin Chen, and Jingdong Wang. Ocnet: Object context network for scene parsing.
  • [28] Zhuangwei Zhuang, Rong Li, Kui Jia, Qicheng Wang, Yuanqing Li, and Mingkui Tan. Perception-aware multi-sensor fusion for 3d lidar semantic segmentation.
  • [29] Geoffrey Hinton, Oriol Vinyals, and Jeff Dean. Distilling the knowledge in a neural network.
  • [30] Lei Jimmy Ba and Rich Caruana. Do deep nets really need to be deep?
  • [31] Guobin Chen, Wongun Choi, Xiang Yu, Tony Han, and Manmohan Chandraker. Learning efficient object detection models with knowledge distillation. In I. Guyon, U. Von Luxburg, S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett, editors, Advances in Neural Information Processing Systems, volume 30. Curran Associates, Inc, 2017.
  • [32] Sergey Zagoruyko and Nikos Komodakis. Paying more attention to attention: Improving the performance of convolutional neural networks via attention transfer.
  • [33] Suraj Srinivas and Francois Fleuret. Knowledge transfer with jacobian matching.
  • [34] Saurabh Gupta, Judy Hoffman, and Jitendra Malik. Cross modal distillation for supervision transfer.
  • [35] Lichen Wang, Jiaxiang Wu, Shao-Lun Huang, Lizhong Zheng, Xiangxiang Xu, Lin Zhang, and Junzhou Huang. An efficient approach to informative feature extraction from multimodal data.
  • [36] Shanxin Yuan, Bjorn Stenger, and Tae-Kyun Kim. Rgb-based 3d hand pose estimation via privileged learning with depth images.
  • [37] Zhengzhe Liu, Xiaojuan Qi, and Chi-Wing Fu. 3d-to-2d distillation for indoor scene parsing.
  • [38] Long Zhao, Xi Peng, Yuxiao Chen, Mubbasir Kapadia, and Dimitris N. Metaxas. Knowledge as priors: Cross-modal knowledge generalization for datasets without superior knowledge.
  • [39] Yueh-Cheng Liu, Yu-Kai Huang, Hung-Yueh Chiang, Hung-Ting Su, Zhe-Yu Liu, Chin-Tang Chen, Ching-Yu Tseng, and Winston H. Hsu. Learning from 2d: Contrastive pixel-to-point knowledge transfer for 3d pretraining.
  • [40] Zhihao Yuan, Xu Yan, Yinghong Liao, Yao Guo, Guanbin Li, Zhen Li, and Shuguang Cui. X-trans2cap: Cross-modal knowledge transfer using transformer for 3d dense captioning.
  • [41] Xu Yan, Jiantao Gao, Chaoda Zheng, Chao Zheng, Ruimao Zhang, Shuguang Cui, and Zhen Li. 2dpass: 2d priors assisted semantic segmentation on lidar point clouds. In European Conference on Computer Vision (ECCV), pages 677–695, 2022.
  • [42] André Borrmann, Markus König, Christian Koch, and Jakob Beetz, editors. Building Information Modeling. VDI-Buch. Springer Fachmedien Wiesbaden, Wiesbaden, 2021.
  • [43] M. R. M. F. Ariyachandra and Ioannis Brilakis. Detection of railway masts in airborne lidar data. Journal of Construction Engineering and Management, 146(9), 2020.
  • [44] Bisheng Yang and Lina Fang. Automated extraction of 3-d railway tracks from mobile laser scanning point clouds. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing, 7(12):4750–4761, 2014.
  • [45] Yun-Jian Cheng, Wen-Ge Qiu, and Dong-Ya Duan. Automatic creation of as-is building information model from single-track railway tunnel point clouds. Automation in Construction, 106:102911, 2019.
  • [46] Mario Soilán, Andrea Nóvoa, Ana Sánchez-Rodríguez, Andrés Justo, and Belén Riveiro. Fully automated methodology for the delineation of railway lanes and the generation of ifc alignment models using 3d point cloud data. Automation in Construction, page 103684, 2021.
  • [47] Rafael Sacks, Amir Kedar, André Borrmann, Ling Ma, Dominic Singer, and Uri Kattel. Seebridge information delivery manual (idm) for next generation bridge inspection. In Anoop Sattineni, Salman Azhar, and Daniel Castro, editors, Proceedings of the 33rd International Symposium on Automation and Robotics in Construction (ISARC), Proceedings of the International Symposium on Automation and Robotics in Construction (IAARC), pages 826–834. International Association for Automation and Robotics in Construction (IAARC), 2016.
  • [48] P. Hüthwohl, I. Armeni, H. Fathi, and I. Brilakis. Gygax construction it research platform for 2d & 3d, 2017.
  • [49] Dion Moult and Michalis Kamburelis. Understanding how textures and shaders work in IFC, 2022.
  • [50] Daniel Girardeau-Montaut. Détection de changement sur des données géométriques tridimensionnelles. Theses, LTCI - Laboratoire Traitement et Communication de l’Information, 2006.
  • [51] Clemens Portele. Objektartenkataloge zur geoinfodok: Aaa-anwendungsschema (7.1.2): Reference version 7.1, 2022.
  • [52] P. Glira, K. Ölsböck, T. Kadiofsky, M. Schörghuber, J. Weichselbaum, C. Zinner, and L. Fel. Photogrammetric 3d mobile mapping of rail tracks. ISPRS Journal of Photogrammetry and Remote Sensing, 183:352–362, 2022.
  • [53] Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alexander C. Berg, Wan-Yen Lo, Piotr Dollár, and Ross Girshick. Segment anything.
  • [54] Xiangguo Lin, Jixian Zhang, Zhengjun Liu, and Jing Shen. Semi-automatic road tracking by template matching and distance transform. In 2009 Joint Urban Remote Sensing Event, pages 1–7. IEEE, 2009.
  • [55] Mohamed Said Helmy Alabassy. Automated approach for building information modelling of crack damages via image segmentation and image-based 3d reconstruction. In Michael Disser, André Hoffmann, Luisa Kuhn, Patrick Scheich, and Institut für Numerische Methoden und Informatik im Bauwesen, TU Darmstadt, editors, 32. Forum Bauinformatik 2021, pages 123–130. Institut für Numerische Methoden und Informatik im Bauwesen, TU Darmstadt.
  • [56] buildingSMART. Ifc 4.3.2.0 documentation, 2023.
  • [57] In IEEE Visualization, 2002. VIS 2002. IEEE, 2002.
  • [58] Bruno Lévy, Sylvain Petitjean, Nicolas Ray, and Jérome Maillot. Least squares conformal maps for automatic texture atlas generation. ACM Transactions on Graphics, 21(3):362–371, 2002.
  • [59] O. Sorkine, D. Cohen-Or, R. Goldenthal, and D. Lischinski. Bounded-distortion piecewise mesh parameterization. In IEEE Visualization, 2002. VIS 2002, pages 355–362. IEEE, 2002.
  • [60] Pedro V. Sander, John Snyder, Steven J. Gortler, and Hugues Hoppe. Texture mapping progressive meshes. In Lynn Pocock, editor, Proceedings of the 28th annual conference on Computer graphics and interactive techniques, pages 409–416, New York, NY, USA, 2001. ACM.
  • [61] IfcIndexedPolygonalTextureMap resource definition data schema (IFC4X3_ADD2): IFC 4.3.2.20240128 (IFC4X3_ADD2) development documentation, 2023.
  • [62] Juan Nunez-Iglesias, Adam J. Blanch, Oliver Looker, Matthew W. Dixon, and Leann Tilley. A new python library to analyse skeleton images confirms malaria parasite remodelling of the red blood cell membrane skeleton. PeerJ, 6:e4312, 2018.
  • [63] Rui Zhang, Yichao Wu, Wei Jin, and Xiaoman Meng. Deep-learning-based point cloud semantic segmentation: A survey. Electronics, 12(17):3642, 2023.
  • [64] Mike R. James, Stuart Robson, and Mark W. Smith. 3–d uncertainty–based topographic change detection with structure–from–motion photogrammetry: precision maps for ground control and directly georeferenced surveys. Earth Surface Processes and Landforms, 42(12):1769–1788, 2017.
  • [65] Thüringer Landesamt für Bodenmanagement und Geoinformation. Offene geodaten, 2023.