Recent advances in neural models have shown great results for virtual try-on (VTO) problems, where a 3D representation of a garment is deformed to fit a target body shape. However, current solutions are limited to a single garment layer, and cannot address the combinatorial complexity of mixing different garments. Motivated by this limitation, we investigate the use of neural fields for mix-and-match VTO, and identify and solve a fundamental challenge that existing neural-field methods cannot address: the interaction between layered neural fields. To this end, we propose a neural model that untangles layered neural fields to represent collision-free garment surfaces. The key ingredient is a neural untangling projection operator that works directly on the layered neural fields, not on explicit surface representations. Algorithms to resolve object-object interaction are inherently limited by the use of explicit geometric representations, and we show how methods that work directly on neural implicit representations could bring a change of paradigm and open the door to radically different approaches.
@inproceedings{santesteban2021ulnefs,
  title = {{ULNeF}: Untangled Layered Neural Fields for Mix-and-Match Virtual Try-On},
  author = {Igor Santesteban and Miguel A. Otaduy and Nils Thuerey and Dan Casas},
  booktitle = {Advances in Neural Information Processing Systems (NeurIPS)},
  year = {2022}
}
State-of-the-art 3D virtual try-on (VTO) methods are limited to a single garment or a predefined outfit, but in real life we combine many clothes to create different outfits. Unfortunately, existing garment-specific neural processing solutions cannot address the combinatorial complexity of mix-and-match VTO.
To address this limitation, we introduce Untangled Layered Neural Fields (ULNeFs), which solves multi-object interaction using implicit representations of the objects. We represent multiple possibly colliding objects (e.g., multiple garments) using a layered variant of neural fields, and we design an algorithm that untangles these layered neural fields to represent collision-free objects.
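To make the layered-field idea concrete, here is a minimal sketch of querying a layered field at a 3D point. The class name and the analytic sphere SDFs are illustrative placeholders standing in for the trained per-garment networks, not the paper's actual representation:

```python
import numpy as np

class LayeredField:
    """Minimal sketch of a layered implicit field: N per-garment
    implicit functions queried jointly at a 3D point.  Each layer is
    a placeholder analytic SDF (a concentric sphere) standing in for
    a trained neural field."""

    def __init__(self, radii):
        # Shell radii ordered innermost -> outermost around the origin.
        self.radii = list(radii)

    def query(self, x):
        """Return the stacked field values (f_1(x), ..., f_N(x)).
        Negative means x lies beneath (inside) that layer's surface."""
        d = np.linalg.norm(np.asarray(x, dtype=float))
        return np.array([d - r for r in self.radii])
```

For example, a point at distance 1.1 from the origin sits outside an inner layer of radius 1.0 but beneath an outer layer of radius 1.2, so the query returns one positive and one negative value.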
Encoding garments with implicit representations is challenging because garments are open surfaces. This complicates any untangling operation, since inside/outside queries are no longer well defined. We address this problem by learning two fields: a signed distance field f(x) that represents the garment surface and provides a notion of inside/outside, and a covariant field h(x) that models the volume near the openings through which other garments can pass without producing tangled configurations. Using these fields, we can detect whether a point x is in a tangled configuration.
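A simplified pointwise detection check can be sketched as follows. The function name, the `in_opening` booleans (a stand-in for thresholding h(x)), and the monotone-ordering test are illustrative assumptions, not the paper's exact formulation:

```python
def is_tangled(field_values, in_opening, eps=0.0):
    """Flag a tangled configuration at a single query point x.

    field_values : signed distances f_i(x), ordered innermost ->
                   outermost; negative means x lies beneath layer i.
    in_opening   : booleans derived from h_i(x); True if x is in the
                   volume near layer i's openings, where other layers
                   may legitimately pass through.

    A sufficient condition for collision-free layers is that the field
    values decrease monotonically from inner to outer; a violation (an
    outer layer rising above an inner one) signals tangling, unless
    either layer's opening region covers x.
    """
    n = len(field_values)
    for i in range(n):
        for j in range(i + 1, n):
            if field_values[j] > field_values[i] + eps:
                if not (in_opening[i] or in_opening[j]):
                    return True
    return False
```

Intuitively, a point on the outer garment's surface (f_outer(x) = 0) that lies beneath the inner garment (f_inner(x) < 0) violates the ordering and is flagged, unless it falls in an opening region.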
Thanks to the implicit surface representation, untangling can be formulated as a local operation (e.g., solving an optimization problem) on the field values at positions x. Our main contribution is a neural model that learns this untangling operation, which we train on random sets of field values. Importantly, this neural projection operator is trained only once for an arbitrary combination of N surfaces and, once trained, it naturally generalizes to any garment or implicit surface.
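As a concrete, non-learned baseline of this projection idea (not the paper's trained operator), one can project the field values at a point onto the ordered set f_1 >= f_2 >= ... >= f_N in the least-squares sense. Under the monotone-ordering assumption above, this is isotonic regression, solvable with the pool-adjacent-violators algorithm:

```python
def untangle_project(field_values):
    """Project layered field values at a point x onto the untangled
    set {f_1 >= f_2 >= ... >= f_N} (innermost -> outermost),
    minimizing the squared change.  This is isotonic regression via
    pool-adjacent-violators -- a hand-crafted stand-in for a learned
    neural projection operator.
    """
    # Negate so we solve a standard non-decreasing isotonic fit.
    v = [-float(x) for x in field_values]
    blocks = []  # each block: [sum, count]
    for val in v:
        blocks.append([val, 1])
        # Merge adjacent blocks while their means violate the ordering.
        while (len(blocks) > 1
               and blocks[-2][0] / blocks[-2][1] > blocks[-1][0] / blocks[-1][1]):
            s, c = blocks.pop()
            blocks[-1][0] += s
            blocks[-1][1] += c
    out = []
    for s, c in blocks:
        out.extend([s / c] * c)
    return [-x for x in out]
```

For two layers with values (-0.01, 0.0), the ordering is violated and both values are pulled to their mean, -0.005; already-ordered inputs pass through unchanged.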
ULNeF is able to output order-dependent results, depending on how the layers are sorted. Below we show a collection of outfits generated with ULNeF by mixing 3, 4, and 5 garments in various configurations.
The work was funded in part by the European Research Council (ERC Consolidator Grant no. 772738 TouchDesign) and the Spanish Ministry of Science (RTI2018-098694-B-I00 VizLearning).