Gaze-contingent auditory displays for improved spatial attention in virtual reality

DOI:
Authors:
Journal title: ACM Transactions on Computer-Human Interaction
Article number: 19
Number of pages: 38
Abstract: Virtual reality simulations of group social interactions are important for many applications, including the virtual treatment of social phobias, crowd and group simulation, collaborative virtual environments (VEs), and entertainment. In such scenarios, when compared to the real world, audio cues are often impoverished. As a result, users cannot rely on subtle spatial audio-visual cues that guide attention and enable effective social interactions in real-world situations. We explored whether gaze-contingent audio enhancement techniques, driven by inferring audio-visual attention in virtual displays, could be used to enable effective communication in cluttered audio VEs. In all of our experiments, we hypothesized that visual attention could be used as a tool to modulate the quality and intensity of sounds from multiple sources, efficiently and naturally selecting spatial sound sources. For this purpose, we built a gaze-contingent display (GCD) that tracked a user's gaze in real time and modified the volume of the speakers' voices contingent on the current region of overt attention. We compared six different techniques for sound modulation with a base condition providing no attentional modulation of sound. The techniques were compared in terms of source recognition and preference in a set of user studies. Overall, we observed that users liked the ability to control the sounds with their eyes. They felt that a rapid change in attenuation with attention, but not the elimination of competing sounds (partial rather than absolute selection), was most natural. In conclusion, audio GCDs offer potential for simulating rich, natural social and other interactions in VEs. They should be considered for improving both performance and fidelity in applications related to social behaviour scenarios, or when the user needs to work with multiple audio sources of information.
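The abstract does not specify the modulation function the GCD used, but the core idea it describes, attenuating competing sources as a rapid function of their angular distance from gaze while keeping them audible (partial rather than absolute selection), can be sketched as follows. This is a minimal illustrative example, not the authors' implementation; the function name, cone width, attenuation floor, and falloff shape are all assumptions for illustration.

```python
import math

def gaze_contingent_gains(gaze_dir, source_dirs, focus_deg=15.0, floor=0.3):
    """Compute a volume gain per sound source from its angular distance to
    the gaze direction. Sources inside the focus cone keep full volume;
    others fall off rapidly but only down to a non-zero floor, so competing
    sounds are attenuated rather than eliminated (partial selection).
    Directions are assumed to be unit-length 3-vectors.
    """
    gains = []
    for d in source_dirs:
        # Angle between gaze and source direction, in degrees.
        cos_a = max(-1.0, min(1.0, sum(g * s for g, s in zip(gaze_dir, d))))
        angle = math.degrees(math.acos(cos_a))
        if angle <= focus_deg:
            gains.append(1.0)
        else:
            # Rapid exponential falloff toward the floor, never below it.
            gains.append(floor + (1.0 - floor) *
                         math.exp(-(angle - focus_deg) / focus_deg))
    return gains
```

In a real-time loop this would be re-evaluated on every eye-tracker sample, with each returned gain applied to the corresponding speaker's audio stream.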
Publication date:
Publisher: Association for Computing Machinery
Affiliation: Aerospace; National Research Council Canada
Peer reviewed: Yes
NPARC number: 23002401
Record identifier: 332b2654-5599-4c64-bbe3-f379dc631e24
Record created: 2017-10-27
Record modified: 2017-10-27