Abstract

Saliency prediction in 360º video plays an important role in modeling visual attention, and can be leveraged for content creation, compression techniques, or quality assessment methods, among other applications. Visual attention in immersive environments depends not only on visual input, but also on inputs from other sensory modalities, primarily audio. Despite this, only a minority of saliency prediction models have incorporated auditory inputs, and much remains to be explored about what auditory information is relevant and how to integrate it into the prediction. In this work, we propose an audiovisual saliency model for 360º video content, AViSal360. Our model integrates both spatialized and semantic audio information, together with visual inputs. We perform exhaustive comparisons to demonstrate both the actual relevance of auditory information in saliency prediction, and the superior performance of our model when compared to previous approaches.

Results

The qualitative comparisons between our model, AViSal360, and the three main state-of-the-art approaches can be found in our web-based browser for the D-SAV360 dataset.

Code

You can find the code and model for AViSal360 in our GitHub repository.

BibTeX (coming soon)

Related Work

  • 2022: SST-Sal: A spherical spatio-temporal approach for saliency prediction in 360º videos
  • @article{bernal2022sst, title={SST-Sal: A spherical spatio-temporal approach for saliency prediction in $360^{\circ}$ videos}, author={Bernal-Berdun, Edurne and Martin, Daniel and Gutierrez, Diego and Masia, Belen}, journal={Computers \& Graphics}, volume={106}, pages={200--209}, year={2022}, publisher={Elsevier} }
  • 2022: ScanGAN360: A Generative Model of Realistic Scanpaths for 360 Images
  • @article{martin2022scangan360, title={ScanGAN360: A Generative Model of Realistic Scanpaths for 360 Images}, author={Martin, Daniel and Serrano, Ana and Bergman, Alexander W and Wetzstein, Gordon and Masia, Belen}, journal={IEEE Transactions on Visualization \& Computer Graphics}, number={01}, pages={1--1}, year={2022}, publisher={IEEE Computer Society} }
  • 2020: Panoramic convolutions for 360º single-image saliency prediction
  • @inproceedings{martin20saliency, author={Martin, Daniel and Serrano, Ana and Masia, Belen}, title={Panoramic convolutions for $360^{\circ}$ single-image saliency prediction}, booktitle={CVPR Workshop on Computer Vision for Augmented and Virtual Reality}, year={2020} }


This work has been supported by grant PID2022-141539NB-I00, funded by MICIU/AEI/10.13039/501100011033 and by ERDF, EU.