Computer-generated imagery is now ubiquitous in our society, spanning fields such as games and movies, architecture, engineering, and virtual prototyping, while also helping create novel ones such as computational materials. With the increase in computational power and the improvement of acquisition techniques, the field has undergone a paradigm shift towards data-driven techniques, which has yielded an unprecedented level of realism in visual appearance.
Unfortunately, this shift brings a series of problems: First, there is a disconnect between the mathematical representation of the data and any meaningful parameters that humans understand; the captured data is machine-friendly, but not human-friendly. Second, the many different acquisition systems lead to heterogeneous formats and very large datasets. And third, real-world appearance functions are usually nonlinear and high-dimensional. As a result, visual appearance datasets are increasingly unfit for editing operations, which limits the creative process for scientists, engineers, artists, and practitioners in general. There is an immense gap between the complexity, realism, and richness of the captured data, and the flexibility with which such data can be edited. This line of research plans to bridge this gap, putting the user at its core. Achieving our goals will finally allow real-world captured datasets to reach their true potential in many aspects of society.
Abstract: We introduce text2fabric, a novel dataset that links free-text descriptions to various fabric materials. The dataset comprises 15,000 natural language descriptions associated with 3,000 corresponding images of fabric materials. Traditionally, material descriptions come in the form of tags or keywords, which limits their expressivity, requires pre-existing knowledge of the appropriate vocabulary, and ultimately leads to a fragmented description system. We therefore study the use of free text as a more natural way to describe material appearance, taking fabrics as our use case, since they are common items that non-experts deal with regularly. Based on an analysis of the dataset, we identify a compact lexicon, a set of attributes, and a key structure that emerge from the descriptions. This allows us to accurately understand how people describe fabrics and to draw directions for generalization to other types of materials. We also show that our dataset enables specializing large vision-language models such as CLIP, creating a meaningful latent space for fabric appearance, and significantly improving applications such as fine-grained material retrieval and automatic captioning.
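For illustration, a minimal retrieval sketch along these lines, assuming OpenAI's off-the-shelf clip package and placeholder image files (a text2fabric-specialized checkpoint would be loaded the same way):

```python
# Minimal sketch: fine-grained fabric retrieval with CLIP embeddings.
# Assumes the `clip` package (pip install git+https://github.com/openai/CLIP)
# and a small set of fabric images; file names are placeholders.
import torch
import clip
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

image_paths = ["fabric_001.jpg", "fabric_002.jpg"]  # placeholder dataset
images = torch.stack([preprocess(Image.open(p)) for p in image_paths]).to(device)

query = "a soft, slightly shiny satin with fine wrinkles"
text = clip.tokenize([query]).to(device)

with torch.no_grad():
    img_feat = model.encode_image(images)
    txt_feat = model.encode_text(text)
    img_feat = img_feat / img_feat.norm(dim=-1, keepdim=True)
    txt_feat = txt_feat / txt_feat.norm(dim=-1, keepdim=True)
    scores = (img_feat @ txt_feat.T).squeeze(1)  # cosine similarities

ranking = scores.argsort(descending=True)
print([image_paths[i] for i in ranking])  # best-matching fabrics first
```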
Abstract: Intuitively editing the appearance of materials from a single image is a challenging task, given the complexity of the interactions between light and matter and the ambiguity of human perception. This problem has traditionally been addressed by estimating additional factors of the scene, such as geometry or illumination, thus solving an inverse rendering problem and tying the final quality of the results to the quality of these estimations. We present a single-image appearance editing framework that allows us to intuitively modify the material appearance of an object by increasing or decreasing high-level perceptual attributes describing that appearance (e.g., glossy or metallic). Our framework takes as input an in-the-wild image of a single object, where geometry, material, and illumination are not controlled, and inverse rendering is not required. We rely on generative models and devise a novel architecture with Selective Transfer Unit (STU) cells that preserve the high-frequency details of the input image in the edited one. To train our framework, we leverage a dataset of paired synthetic images rendered with physically based algorithms, together with the corresponding crowd-sourced ratings of high-level perceptual attributes. We show that our material editing framework outperforms the state of the art, and showcase its applicability on synthetic images, in-the-wild real-world photographs, and video sequences.
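The STU cells themselves are detailed in the paper; as a rough, hedged illustration of the general idea of a GRU-style gated skip connection, a PyTorch sketch (layer sizes and structure are illustrative, not the authors' exact architecture):

```python
# Hedged sketch of a GRU-style selective-transfer cell for skip connections.
# All names and dimensions are hypothetical placeholders.
import torch
import torch.nn as nn

class STUCell(nn.Module):
    """Selectively transfers encoder features, modulated by a hidden state
    that carries the requested attribute edit."""
    def __init__(self, channels):
        super().__init__()
        self.reset = nn.Conv2d(2 * channels, channels, 3, padding=1)
        self.update = nn.Conv2d(2 * channels, channels, 3, padding=1)
        self.candidate = nn.Conv2d(2 * channels, channels, 3, padding=1)

    def forward(self, skip, hidden):
        x = torch.cat([skip, hidden], dim=1)
        r = torch.sigmoid(self.reset(x))      # which details to keep
        z = torch.sigmoid(self.update(x))     # how much to overwrite
        h = torch.tanh(self.candidate(torch.cat([skip, r * hidden], dim=1)))
        return (1 - z) * hidden + z * h       # transferred skip features

stu = STUCell(64)
out = stu(torch.randn(1, 64, 32, 32), torch.randn(1, 64, 32, 32))
```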
Abstract: Our visual perception of the world is strongly influenced by material appearance. Humans can easily recognize and discriminate materials, despite the influence of confounding factors such as illumination or surface geometry on their final appearance. However, understanding material appearance and perceived properties such as glossiness remains challenging. Recent literature has shown how unsupervised generative neural networks can spontaneously learn perceptually meaningful latent representations from renderings of simple stimuli (bumpy surfaces), and cluster them according to glossiness despite receiving no explicit information about it. Furthermore, those representations correlate better with human perception of gloss than the physical parameters of the materials, suggesting that our brains may decipher glossiness by learning the statistical structure of images. In this work, we analyze the performance of such unsupervised learning models on a wider variety of complex real-world images, including realistic object geometries, real environment maps, and measured materials. We train a PixelVAE generative network in an unsupervised manner on a dataset containing three different geometries under three different illuminations, using more than 300 materials. We study the latent representations found by our model, which receives no prior knowledge about the stimuli. Our results show that the model clusters the stimuli hierarchically, suggesting that geometry could be the most relevant appearance factor, followed by illumination. This differs from previous experiments using abstract bumpy surfaces, where the role of geometry was less prominent due to the randomness of the bumps. Finally, we analyze how our (unsupervised) learned latent representations correlate with human ratings of gloss perception, showing a reasonable organization despite the complex interactions with geometry and illumination. In conclusion, our results suggest that unsupervised learning representations may help us understand human visual perception of material appearance even in the presence of complex stimuli.
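A minimal sketch of this kind of latent-space analysis, assuming per-image codes from a trained PixelVAE and mean human gloss ratings; both arrays are random placeholders here:

```python
# Hierarchical clustering of latent codes, and correlation of a latent
# direction with human gloss ratings. `latents` would come from a trained
# (unsupervised) PixelVAE; random data stands in for it below.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster
from scipy.stats import spearmanr

rng = np.random.default_rng(0)
latents = rng.normal(size=(300, 16))       # placeholder: 300 stimuli, 16-D codes
gloss_ratings = rng.uniform(size=300)      # placeholder human gloss ratings

# Do stimuli group hierarchically (e.g., by geometry, then illumination)?
Z = linkage(latents, method="ward")
clusters = fcluster(Z, t=3, criterion="maxclust")

# Correlate one latent dimension with perceived gloss.
rho, p = spearmanr(latents[:, 0], gloss_ratings)
print(f"cluster sizes: {np.bincount(clusters)[1:]}, rho={rho:.2f} (p={p:.3f})")
```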
Abstract: Simulating light transport in biological tissues is a longstanding challenge, given their complex multilayered structure. In biology, one of the most remarkable and well-studied examples of such tissues is the scales that cover the skin of reptiles, which combine photonic structures and pigmentation. This is, however, a largely ignored problem in computer graphics. In this work, we propose a multilayered appearance model based on the anatomy of snake skin. Some snakes are known for their striking, highly iridescent scales resulting from light interference. We model snake skin as a two-layered reflectance function: the top layer is a thin film producing a specular iridescent reflection, while the bottom layer is a highly absorbing diffuse layer, resulting in a dark diffuse appearance that maximizes the iridescent color of the skin. We demonstrate our layered material on a wide range of appearances, and show that our model is able to qualitatively match the appearance of snake skin.
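As a rough illustration of the two-layer idea, a sketch combining a single thin-film interference term (the standard Airy summation) with a dark diffuse base; the refractive indices, film thickness, and albedo below are made-up values, not measured snake-skin data:

```python
# Thin-film iridescent specular term over a highly absorbing diffuse base.
# Single-film Airy interference, s-polarization only, for brevity.
import numpy as np

def fresnel_r(n_i, n_t, cos_i, cos_t):
    # s-polarized amplitude reflection coefficient
    return (n_i * cos_i - n_t * cos_t) / (n_i * cos_i + n_t * cos_t)

def thin_film_reflectance(lam_nm, d_nm, n_film=1.55, n_base=1.40, cos_i=1.0):
    sin_i = np.sqrt(1.0 - cos_i**2)
    cos_f = np.sqrt(1.0 - (sin_i / n_film) ** 2)   # refraction into the film
    cos_b = np.sqrt(1.0 - (sin_i / n_base) ** 2)
    r01 = fresnel_r(1.0, n_film, cos_i, cos_f)
    r12 = fresnel_r(n_film, n_base, cos_f, cos_b)
    delta = 4.0 * np.pi * n_film * d_nm * cos_f / lam_nm  # interference phase
    r = (r01 + r12 * np.exp(1j * delta)) / (1.0 + r01 * r12 * np.exp(1j * delta))
    return np.abs(r) ** 2

lam = np.linspace(380, 780, 5)                  # visible wavelengths (nm)
spec = thin_film_reflectance(lam, d_nm=450.0)   # iridescent specular term
diffuse = 0.02                                  # dark, highly absorbing base
print(spec + diffuse)                           # wavelength-dependent reflectance
```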
Abstract: A good match of material appearance between real-world objects and their digital on-screen representations is critical for many applications such as fabrication, design, and e-commerce. However, faithful appearance reproduction is challenging, especially for complex phenomena such as gloss. In most cases, the view-dependent nature of gloss and the range of luminance values required for reproducing glossy materials exceed the current capabilities of display devices. As a result, appearance reproduction poses significant problems even with accurately rendered images. This paper studies the gap between the gloss perceived from real-world objects and their digital counterparts. Based on our psychophysical experiments on a wide range of 3D printed samples and their corresponding photographs, we derive insights into the influence of geometry, illumination, and the display's brightness, and measure the change in gloss appearance due to display limitations. Our evaluation experiments demonstrate that using our prediction to correct material parameters in a rendering system improves the match of gloss appearance between real objects and their visualization on a display device.
Abstract: Despite advances in display technology, many existing applications rely on psychophysical datasets of human perception gathered using older, sometimes outdated displays. As a result, there exists the underlying assumption that such measurements can be carried over to the new viewing conditions of more modern technology. We have conducted a series of psychophysical experiments to explore contrast sensitivity using a state-of-the-art HDR display, taking into account not only the spatial frequency and luminance of the stimuli but also their surrounding luminance levels. From our data, we have derived a novel surround-aware contrast sensitivity function (CSF), which predicts human contrast sensitivity more accurately. We additionally provide a practical version that retains the benefits of our full model, while enabling easy backward compatibility and consistently producing good results across many existing applications that make use of CSF models. We show examples of effective HDR video compression using a transfer function derived from our CSF, tone-mapping, and improved accuracy in visual difference prediction.
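For context, a classic frequency-only CSF (Mannos & Sakrison, 1974) can be evaluated as below; the surround-aware CSF proposed here additionally conditions on stimulus and surround luminance, and its exact form is not reproduced in this sketch:

```python
# Classic contrast sensitivity function of Mannos & Sakrison (1974),
# used purely as an illustration of what a CSF model computes.
import numpy as np

def csf_mannos_sakrison(f_cpd):
    """Contrast sensitivity vs. spatial frequency (cycles per degree)."""
    return 2.6 * (0.0192 + 0.114 * f_cpd) * np.exp(-(0.114 * f_cpd) ** 1.1)

freqs = np.array([0.5, 1, 2, 4, 8, 16, 32])        # cycles per degree
print(np.round(csf_mannos_sakrison(freqs), 2))     # sensitivity peaks mid-band
```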
Abstract: Single-image appearance editing is a challenging task, traditionally requiring the estimation of additional scene properties such as geometry or illumination. Moreover, the exact interaction of light, shape, and material reflectance that elicits a given perceptual impression is still not well understood. We present an image-based editing method that allows us to modify the material appearance of an object by increasing or decreasing high-level perceptual attributes, using a single image as input. Our framework relies on a two-step generative network, where the first step drives the change in appearance and the second produces an image with high-frequency details. For training, we augment an existing material appearance dataset with perceptual judgements of high-level attributes, collected through crowd-sourced experiments, and build upon training strategies that circumvent the cumbersome need for original-edited image pairs. We demonstrate the editing capabilities of our framework on a variety of inputs, both synthetic and real, using two common perceptual attributes (Glossy and Metallic), and validate the perception of appearance in our edited images through a user study.
Abstract: Translucent materials are ubiquitous in the real world, from organic materials such as food or human skin, to synthetic materials like plastic or rubber. While multiple models for translucent materials exist, understanding how we perceive translucent appearance, and how it is affected by illumination and geometry, remains an open problem. In this work, we analyze how well human observers estimate the density of translucent objects for static and dynamic illumination scenarios. Interestingly, our results suggest that dynamic illumination may not be critical to assess the nature of translucent materials.
Abstract: Material appearance hinges not only on material reflectance properties but also on surface geometry and illumination. The unlimited number of potential combinations between these factors makes understanding and predicting material appearance a very challenging task. In this work, we collect a large-scale dataset of perceptual ratings of appearance attributes with more than 215,680 responses for 42,120 distinct combinations of material, shape, and illumination. The goal of this dataset is twofold. First, we analyze for the first time the effects of illumination and geometry in material perception across such a large collection of varied appearances. We connect our findings to those of the literature, discussing how previous knowledge generalizes across very diverse materials, shapes, and illuminations. Second, we use the collected dataset to train a deep learning architecture for predicting perceptual attributes that correlate with human judgments. We demonstrate the consistent and robust behavior of our predictor in various challenging scenarios, which, for the first time, enables estimating perceived material attributes from general 2D images. Since our predictor relies on the final appearance in an image, it can compare appearance properties across different geometries and illumination conditions. Finally, we demonstrate several applications that use our predictor, including appearance reproduction using 3D printing, BRDF editing by integrating our predictor in a differentiable renderer, illumination design, and material recommendations for scene design.
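A hedged sketch of the BRDF-editing application: optimizing material parameters so that a frozen attribute predictor reports a target rating. Both the renderer and the predictor below are trivial differentiable stand-ins, not the paper's models:

```python
# Optimize BRDF parameters against a frozen perceptual-attribute predictor,
# with gradients flowing through a (placeholder) differentiable renderer.
import torch

def differentiable_render(brdf_params):
    # Placeholder differentiable renderer: maps BRDF params to an image.
    return torch.sigmoid(brdf_params.view(1, 3, 1, 1).expand(1, 3, 64, 64))

def attribute_predictor(image):
    # Placeholder frozen predictor: image -> perceived "glossiness" in [0, 1].
    return image.mean().unsqueeze(0)

brdf_params = torch.zeros(3, requires_grad=True)
target_gloss = torch.tensor([0.8])
opt = torch.optim.Adam([brdf_params], lr=0.05)

for step in range(200):
    opt.zero_grad()
    img = differentiable_render(brdf_params)
    loss = (attribute_predictor(img) - target_gloss).pow(2).sum()
    loss.backward()   # gradients reach the BRDF through the renderer
    opt.step()
```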
Abstract: We present a single-image, data-driven method to automatically relight images with full-body humans in them. Our framework is based on a realistic scene decomposition leveraging precomputed radiance transfer (PRT) and spherical harmonics (SH) lighting. In contrast to previous work, we lift the assumption of Lambertian materials and explicitly model diffuse and specular reflectance in our data. Moreover, we introduce an additional light-dependent residual term that accounts for errors in the PRT-based image reconstruction. We propose a new deep learning architecture, tailored to the decomposition performed in PRT, that is trained using a combination of L1, logarithmic, and rendering losses. Our model outperforms the state of the art for full-body human relighting on both synthetic images and photographs.
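A minimal sketch of such a combined training objective; the weights and the exact formulation are illustrative, not the paper's:

```python
# Combined L1 + logarithmic + rendering loss, as described above.
# `pred_render`/`target_render` stand for PRT-based re-renderings.
import torch
import torch.nn.functional as F

def relighting_loss(pred, target, pred_render, target_render,
                    w_l1=1.0, w_log=1.0, w_render=1.0):
    l1 = F.l1_loss(pred, target)
    # log term: compresses high dynamic range before comparison
    log_l1 = F.l1_loss(torch.log1p(pred.clamp(min=0)),
                       torch.log1p(target.clamp(min=0)))
    render = F.l1_loss(pred_render, target_render)
    return w_l1 * l1 + w_log * log_l1 + w_render * render
```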
Abstract: Painters are masters in replicating the visual appearance of materials. While the perception of material appearance is not yet fully understood, painters seem to have acquired an implicit understanding of the key visual cues that we need to accurately perceive material properties. In this study, we directly compare the perception of material properties in paintings and in renderings, by collecting professional realistic paintings of rendered materials. From both types of images, we collect human judgments of material properties and compute a variety of image features that are known to reflect material properties. Our study reveals that, despite important visual differences between the two types of depiction, material properties in paintings and renderings are perceived very similarly and are linked to the same image features. This suggests that we use similar visual cues independently of the medium, and that the presence of such cues is sufficient to provide a faithful perception of material appearance.
Abstract: Observing and recognizing materials is a fundamental part of our daily life. Under typical viewing conditions, we are capable of effortlessly identifying the objects that surround us and recognizing the materials they are made of. Nevertheless, understanding the underlying perceptual processes that take place to accurately discern the visual properties of an object is a long-standing problem. In this work, we perform a comprehensive and systematic analysis of how the interplay of geometry, illumination, and their spatial frequencies affect human performance on material recognition tasks. We carry out large-scale behavioral experiments where participants are asked to recognize different reference materials among a pool of candidate samples. In the different experiments, we carefully sample the information in the frequency domain of the stimuli. From our analysis, we find significant first-order interactions between the geometry and the illumination, of both the reference and the candidates. In addition, we observe that simple image statistics and higher-order image histograms do not correlate with human performance, therefore, we perform a high-level comparison of highly non-linear statistics by training a deep neural network on material recognition tasks. Our results show that such models can accurately classify materials, which suggests that they are capable of defining a meaningful representation of material appearance from labeled proximal image data. Last, we find preliminary evidence that these highly non-linear models and humans may use similar high-level factors for material recognition tasks.
Abstract: Establishing a robust measure of material similarity that correlates well with human perception is a long-standing problem. A recent work presented a deep learning model trained to produce a feature space that aligns with human perception by gathering human subjective measures; the resulting metric outperforms existing objective ones. In this work, we aim to understand whether this increased performance is a result of using human perceptual data, or is due to the nature of the features learnt by deep learning models. We train similar networks with objective measures (BRDF similarity or a classification task) and show that these networks can predict human judgements as well, suggesting that the non-linear features learnt by convolutional networks might be a key to modeling material perception.
Abstract: We present a model to measure the similarity in appearance between different materials, which correlates with human similarity judgments. We first create a database of 9,000 rendered images depicting objects with varying materials, shape and illumination. We then gather data on perceived similarity from crowdsourced experiments; our analysis of over 114,840 answers suggests that indeed a shared perception of appearance similarity exists. We feed this data to a deep learning architecture with a novel loss function, which learns a feature space for materials that correlates with such perceived appearance similarity. Our evaluation shows that our model outperforms existing metrics. Last, we demonstrate several applications enabled by our metric, including appearance-based search for material suggestions, database visualization, clustering and summarization, and gamut mapping.
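A hedged sketch of learning such an embedding from crowdsourced triplet judgments ("which of B, C looks more like A?"), using a standard triplet margin loss as a stand-in for the paper's loss function:

```python
# Learn a feature space where distances track perceived material similarity.
# Random tensors stand in for the rendered-image dataset.
import torch
import torch.nn as nn
import torchvision.models as models

encoder = models.resnet18(weights=None)
encoder.fc = nn.Linear(encoder.fc.in_features, 128)  # 128-D feature space
criterion = nn.TripletMarginLoss(margin=0.2)

# anchor: reference image; positive: judged more similar; negative: less so
anchor = torch.randn(8, 3, 224, 224)
positive = torch.randn(8, 3, 224, 224)
negative = torch.randn(8, 3, 224, 224)

loss = criterion(encoder(anchor), encoder(positive), encoder(negative))
loss.backward()  # pulls perceptually similar materials together
```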
Abstract: We analyze the effect of motion in the perception of material appearance. First, we create a set of stimuli containing 72 realistic materials, rendered with varying degrees of linear motion blur. Then we launch a large-scale study on Mechanical Turk to rate a given set of perceptual attributes, such as brightness, roughness, or the perceived strength of reflections. Our statistical analysis shows that certain attributes undergo a significant change, altering appearance perception under motion. In addition, we further investigate the perception of brightness, for the particular cases of rubber and plastic materials. We create new stimuli, with ten different luminance levels and seven motion degrees. We launch a new user study to retrieve their perceived brightness. From the users' judgements, we build two-dimensional maps showing how perceived brightness varies as a function of the luminance and motion of the material.
Abstract: Accurately modeling how light interacts with cloth is challenging, due to the volumetric nature of cloth appearance and its multiscale structure, where microstructures play a major role in the overall appearance at higher scales. Recently, significant effort has been devoted to developing better microscopic models of cloth structure, which have allowed rendering fabrics with unprecedented fidelity. However, these highly detailed representations still make severe simplifications regarding the scattering by the individual fibers forming the cloth, ignoring the impact of the fibers' shape and failing to establish connections between the fibers' appearance and their optical and fabrication parameters. In this work we focus on the scattering of individual cloth fibers; we introduce a physically based scattering model for fibers based on their low-level optical and geometric properties, relying on the extensive textile literature for accurate data. We demonstrate that scattering from cloth fibers exhibits much more complexity than current fiber models capture, showing important differences between cloth types, even in averaged conditions arising from distant views. Our model can be plugged into any framework for cloth rendering, matches scattering measurements from real yarns, and is based on actual parameters used in the textile industry, allowing a predictive, bottom-up definition of cloth appearance.
Abstract: Reproducing the appearance of real-world materials using current printing technology is problematic. The reduced number of available inks defines the printer's limited gamut, creating distortions in the printed appearance that are hard to control. Gamut mapping refers to the process of bringing an out-of-gamut material appearance into the printer's gamut, while minimizing such distortions as much as possible. We present a novel two-step gamut mapping algorithm that allows users to specify which perceptual attribute of the original material they want to preserve (such as brightness or roughness). In the first step, we work in the low-dimensional intuitive appearance space recently proposed by Serrano et al., and adjust achromatic reflectance via an objective function that strives to preserve certain attributes. From this intermediate representation, we then perform an image-based optimization including color information, to bring the BRDF into gamut. We show, both objectively and through a user study, that our method yields superior results compared to the state of the art, with the additional advantage that the user can specify which visual attributes need to be preserved. Moreover, we show how this approach can also be used for attribute-preserving material editing.
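An illustrative sketch of the first, attribute-preserving step posed as a bounded optimization; the attribute functional and the gamut bounds below are placeholders, not the paper's models:

```python
# Find in-gamut parameters whose predicted attribute (e.g., brightness)
# stays close to that of the original, out-of-gamut material.
import numpy as np
from scipy.optimize import minimize

def predict_attribute(params):
    # Placeholder for a perceptual-attribute functional (e.g., brightness).
    return params @ np.array([0.6, 0.3, 0.1])

original = np.array([1.4, 0.9, 0.7])        # out-of-gamut appearance params
target_attr = predict_attribute(original)
gamut_bounds = [(0.0, 1.0)] * 3             # printer gamut, per parameter

res = minimize(lambda p: (predict_attribute(p) - target_attr) ** 2
                         + 0.1 * np.sum((p - original) ** 2),
               x0=np.clip(original, 0, 1), bounds=gamut_bounds)
print(res.x)  # in-gamut parameters preserving the chosen attribute
```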
Abstract: During the last few years, many different techniques for measuring material appearance have arisen. These advances have allowed the creation of large public datasets, and new methods for editing the BRDFs of captured appearance have been proposed. However, these methods lack intuitiveness and are hard to use for novice users. To overcome these limitations, Serrano et al. recently proposed an intuitive space for editing captured appearance. They use a representation of the BRDF based on a combination of principal components (PCA) to reduce dimensionality, and then map these components to perceptual attributes. This PCA representation is biased towards specular materials and fails to represent very diffuse BRDFs, therefore producing unpleasant artifacts when editing. In this paper, we build on top of their work and propose to use two separate PCA bases for representing specular and diffuse BRDFs, and map each of these bases to the perceptual attributes. This allows us to avoid artifacts when editing towards diffuse BRDFs. We then propose a new method for effectively navigating between both bases while editing, based on a new measure of the specularity of measured materials. Finally, we integrate our proposed method into an intuitive BRDF editing framework and show how some of the limitations of the previous model are overcome with our representation. Moreover, our new measure of specularity can be applied to any measured BRDF, as it is not limited to MERL BRDFs.
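A sketch of the two-basis idea: project a BRDF onto separate specular and diffuse PCA bases and blend the reconstructions by an estimated specularity score in [0, 1]. The bases and the score below are random placeholders, not the paper's data:

```python
# Blend reconstructions from a specular and a diffuse PCA basis.
import numpy as np

rng = np.random.default_rng(1)
D = 1000                                   # flattened BRDF dimensionality
basis_spec = rng.normal(size=(D, 5))       # placeholder specular PCA basis
basis_diff = rng.normal(size=(D, 5))       # placeholder diffuse PCA basis
brdf = rng.normal(size=D)                  # placeholder measured BRDF

def project(brdf, basis):
    # Least-squares projection onto the span of the basis.
    coeffs, *_ = np.linalg.lstsq(basis, brdf, rcond=None)
    return basis @ coeffs

specularity = 0.3                          # estimated specularity measure
recon = specularity * project(brdf, basis_spec) \
        + (1 - specularity) * project(brdf, basis_diff)
```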
Abstract: Many different techniques for measuring material appearance have been proposed in the last few years. These have produced large public datasets, which have been used for accurate, data-driven appearance modeling. However, although these datasets have allowed us to reach an unprecedented level of realism in visual appearance, editing the captured data remains a challenge. In this paper, we present an intuitive control space for predictable editing of captured BRDF data, which allows for artistic creation of plausible novel material appearances, bypassing the difficulty of acquiring novel samples. We first synthesize novel materials, extending the existing MERL dataset up to 400 mathematically valid BRDFs. We then design a large-scale experiment, gathering 56,000 subjective ratings on the high-level perceptual attributes that best describe our extended dataset of materials. Using these ratings, we build and train networks of radial basis functions to act as functionals mapping the perceptual attributes to an underlying PCA-based representation of BRDFs. We show that our functionals are excellent predictors of the perceived attributes of appearance. Our control space enables many applications, including intuitive material editing of a wide range of visual properties, guidance for gamut mapping, analysis of the correlation between perceptual attributes, or novel appearance similarity metrics. Moreover, our methodology can be used to derive functionals applicable to classic analytic BRDF representations. We release our code and dataset publicly, in order to support and encourage further research in this direction.
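A minimal sketch of the attribute-to-BRDF functionals using radial basis function interpolation; the ratings and PCA coefficients below are random placeholders for the crowdsourced data and the BRDF representation:

```python
# Fit RBF functionals mapping perceptual-attribute ratings to a PCA-based
# BRDF representation, then edit by perturbing one attribute.
import numpy as np
from scipy.interpolate import RBFInterpolator

rng = np.random.default_rng(2)
ratings = rng.uniform(size=(400, 14))      # e.g., 14 perceptual attributes
pca_coeffs = rng.normal(size=(400, 5))     # first 5 PCA components per BRDF

functional = RBFInterpolator(ratings, pca_coeffs, kernel="thin_plate_spline")

edited = ratings[0].copy()
edited[0] += 0.2                           # e.g., increase the "glossy" rating
new_coeffs = functional(edited[None])      # predicted edited BRDF coefficients
```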