FloorGenT: Generative Vector Graphic Model of Floor Plans for Robotics

Ludvig Ericson, Patric Jensfelt

Floor plans are the basis of reasoning in and communicating about indoor environments. In this paper, we show that by modelling floor plans as sequences of line segments seen from a particular point of view, recent advances in autoregressive sequence modelling can be leveraged to model and predict floor plans. The line segments are canonicalized and translated to a sequence of tokens, and an attention-based neural network is used to fit a one-step distribution over next tokens. We fit the network to sequences derived from a set of large-scale floor plans, and demonstrate the capabilities of the model in four scenarios: novel floor plan generation, completion of partially observed floor plans, generation of floor plans from simulated sensor data, and finally, the applicability of a floor plan model in predicting the shortest distance with partial knowledge of the environment.

Figure 1: Samples from a Trained Model. No cherry picking.

Randomly picked samples from a FloorGenT network trained on the KTH floor plan dataset, drawn with nucleus sampling (p = 90%). Blue is sampled model output. Left: unconditioned novel samples. Middle: partial sequence completion samples conditioned on the segments shown in red (the first 25 segments of randomly selected test sequences, i.e., data novel to the network). Right: partial image conditioned samples with the input image shown in red (a rasterization of the first 25 segments of randomly selected test sequences).
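For readers unfamiliar with nucleus (top-p) sampling, the following is a minimal sketch of the general technique, not the authors' implementation. It keeps the smallest set of most-probable tokens whose total probability reaches p and samples only from that set. The function names and the toy distribution are assumptions made for illustration.

```python
import random

def nucleus_indices(probs, p=0.9):
    """Return the smallest set of highest-probability indices whose mass reaches p."""
    # Visit token indices from most to least probable.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    nucleus, mass = [], 0.0
    for i in order:
        nucleus.append(i)
        mass += probs[i]
        if mass >= p:
            break
    return nucleus

def nucleus_sample(probs, p=0.9, rng=random):
    """Draw one index from probs, restricted to the nucleus."""
    nucleus = nucleus_indices(probs, p)
    # random.choices renormalizes the weights, so only the nucleus is reachable.
    return rng.choices(nucleus, weights=[probs[i] for i in nucleus], k=1)[0]

# Toy next-token distribution over four tokens (made-up numbers).
toy_probs = [0.5, 0.3, 0.15, 0.05]
token = nucleus_sample(toy_probs, p=0.9)
```

With p = 0.9 and the toy distribution above, the least likely token falls outside the nucleus and is never drawn, which is how the method trims the low-probability tail.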

Figure 2: Schematic Data Flow.

Overview of data flow in FloorGenT. The input is a possibly empty sequence of tokens t. Each token is embedded as a sum of three discrete embedding vectors, and the embedded sequence is the input to the first self-attention layer. For the image models, the embedded input image is the input to the cross-attention layers. When sampling, the next token is repeatedly drawn from the next-token distribution and appended to the end of the token sequence, which is then fed back into the network.
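The sampling loop described in this caption can be sketched as follows. This is an illustrative outline only: `next_token_probs` stands in for the trained network, and the stop token, the length cap, and the toy model are hypothetical and not taken from the paper.

```python
import random

STOP = 0  # hypothetical end-of-sequence token id (an assumption)

def sample_sequence(next_token_probs, prefix=(), max_len=64, rng=random):
    """Autoregressively extend prefix, one token at a time."""
    tokens = list(prefix)  # may be empty (unconditioned generation)
    while len(tokens) < max_len:
        # One-step distribution over the next token, given everything so far.
        probs = next_token_probs(tokens)
        tok = rng.choices(range(len(probs)), weights=probs, k=1)[0]
        if tok == STOP:
            break
        # Feed the drawn token back in at the end of the sequence.
        tokens.append(tok)
    return tokens

def toy_model(tokens):
    """Stand-in for the network: emits 1, 2, 3 in turn, then STOP."""
    probs = [0.0] * 4
    probs[len(tokens) + 1 if len(tokens) < 3 else STOP] = 1.0
    return probs

generated = sample_sequence(toy_model)              # start from nothing
completed = sample_sequence(toy_model, prefix=[1])  # complete a partial sequence
```

Passing a non-empty `prefix` corresponds to the partial-sequence completion setting, while an empty one corresponds to unconditioned generation.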

Figure 3: Data Representation.

An example of a token sequence and its corresponding drawing in the shape of an L. Note that in practice, the line segments would be sorted by their distance to some origin coordinate as described in the paper.
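To make the representation concrete, here is a rough sketch of turning an L-shaped outline into a canonically ordered token sequence. Several details are assumptions rather than the paper's definitions: the L coordinates are made up, segment distance is taken as the distance from the origin to the nearest point on the segment, ties are broken by coordinates, and each segment is flattened to four integer coordinate tokens.

```python
import math

def point_segment_distance(p, a, b):
    """Euclidean distance from point p to the closest point on segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:  # degenerate segment
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its extent.
    t = max(0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def canonicalize(segments, origin=(0, 0)):
    """Order endpoints within each segment, then sort segments by distance to origin."""
    ordered = [tuple(sorted(seg)) for seg in segments]
    return sorted(ordered, key=lambda s: (point_segment_distance(origin, *s), s))

def to_tokens(segments):
    """Flatten segments to x1, y1, x2, y2 coordinate tokens (assumed layout)."""
    return [c for (a, b) in segments for c in (*a, *b)]

# A made-up L-shaped outline on an integer grid.
l_shape = [((0, 0), (2, 0)), ((2, 0), (2, 1)), ((2, 1), (1, 1)),
           ((1, 1), (1, 3)), ((1, 3), (0, 3)), ((0, 3), (0, 0))]
tokens = to_tokens(canonicalize(l_shape))
```

Because endpoints and segments are both put in a fixed order, the same drawing yields the same token sequence regardless of the order or direction in which its segments were originally listed.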