Anhong Guo wearing a light blue shirt standing in front of a glass wall with drawings.

We design, develop, study, and deploy human-AI intelligent interactive systems to enhance the accessibility of the real world and the digital world. We empower people to interact with technology through usage, customization, and creation. By doing so, we highlight the unique needs of people (e.g., people with disabilities) and the importance of designing for a long-tail of needs, so that technologies can best support them.

📢📢 I am recruiting PhD students to join the UMich Human-AI Lab, focusing on AI & Accessibility and multimodal AR systems. If you are interested in the type of work we do (e.g., ProgramAlly, WorldScribe, SoundShift, BrushLens), please apply!

Anhong Guo is an Assistant Professor in Computer Science & Engineering at the University of Michigan, also affiliated with the School of Information. His research is at the intersection of human-computer interaction and artificial intelligence, which combines human and machine intelligence to create interactive systems for accessibility, collaboration, and beyond. His research has won best paper and artifact awards at top conferences including CHI, UIST, ASSETS, and MobileHCI. He is a Google Research Scholar, a Forbes' 30 Under 30 Scientist, an inaugural Snap Inc. Research Fellow, and a Swartz Innovation Fellow for Entrepreneurship. Anhong holds a Ph.D. in Human-Computer Interaction from Carnegie Mellon University, a Master’s in HCI from Georgia Tech, and a Bachelor's in Electronic Information Engineering from BUPT. He has also worked in the Ability and Intelligent User Experiences groups in Microsoft Research, the HCI group of Snap Research, the Accessibility Engineering team at Google, and the Mobile Innovation Center of SAP America.

News

Oct 12, 2024
🏆 Our ISWC 2014 paper on evaluating order picking methods was awarded the 10-Year Impact Award!
Oct 8, 2024
🏆 Excited to receive a Google Academic Research Award for the WorldScribe project!
Oct 1, 2024
🏆 WorldScribe awarded Best Paper Award at UIST 2024!
Jun 29, 2024
📃 ProgramAlly for end-user DIY of visual assistive technology accepted to UIST 2024
Jun 29, 2024
📃 WorldScribe for context-aware live visual descriptions accepted to UIST 2024
Jun 29, 2024
📃 VRCopilot for human-AI co-creation of 3D layouts in VR accepted to UIST 2024
Jun 27, 2024
📃 EditScribe for non-visual image editing using LLM accepted to ASSETS 2024
Jun 27, 2024
📃 Our work on Audio Description Customization accepted to ASSETS 2024
Jun 21, 2024
🎉 Congratulations to 🎓 Dr. Lei Zhang on his wonderful Ph.D. defense! Lei will join NJIT Informatics as an Assistant Professor after a one-term postdoc at Princeton.
Jun 5, 2024
☕ The Human-AI Lab (HAIL) was at VISIONS 2024 hosted by the Ann Arbor District Library
Apr 3, 2024

Publications

ProgramAlly is an end-user programming tool for creating visual information filtering programs. The figure shows a diagram of how programs are run. For the example program 'find NUMBER on BUS', the app first looks for buses in the frame. Then, if a bus is found, the image is cropped to just contain the bus. In the cropped frame, text detection will be used to look for numbers.
Jaylin Herskovitz, Andi Xu, Rahaf Alharbi, Anhong Guo
UIST 2024
pdf · ACM DL · arXiv · video · 30s preview · website
WorldScribe is a system that generates automated live real-world visual descriptions that are customizable and adaptive to users' contexts.
Ruei-Che Chang, Yuxuan Liu, Anhong Guo
Best Paper Award · UIST 2024
Scaffolded creation in VRCopilot: Users can create wireframes by drawing on the floor while speaking, in addition to automatically generated wireframes; they can then turn the wireframes into specific furniture.
Lei Zhang, Jin Pan, Jacob Gettig, Steve Oney, Anhong Guo
UIST 2024
pdf · ACM DL · arXiv · video · 30s preview
EditScribe enables non-visual image editing using natural language verification loops powered by large multimodal models.
Ruei-Che Chang, Yuxuan Liu, Lotus Zhang, Anhong Guo
ASSETS 2024
pdf · ACM DL · arXiv · video
The CustomAD interface consists of a video player (left), which allows users to play, pause, and seek the video, and a customization pane (right), where users can customize the properties of audio descriptions (ADs). The customization properties are grouped into content settings and presentation settings. Content customization adjusts the script's content as users change the ADs' length and emphasis. In presentation customization, users can adjust the speed, voice, tone, gender, and grammatical syntax of the ADs to change how they are read out. Users can also toggle the ADs on and off.
Rosiana Natalie, Ruei-Che Chang, Smitha Sheshadri, Anhong Guo, Kotaro Hara
ASSETS 2024
pdf · ACM DL · arXiv · video
We present SoundShift, a concept to manipulate sounds for improving mixed-reality awareness. SoundShift is situated on the auditory Reality-Virtuality Continuum, with full transparency and noise cancellation as its two ends, and comprises six sound manipulators: Transparency Shift, Envelope Shift, Position Shift, Style Shift, Time Shift, and Sound Append.
Ruei-Che Chang, Chia-Sheng Hung, Bing-Yu Chen, Dhruv Jain, Anhong Guo
DIS 2024
pdf · ACM DL · arXiv · video
We present InteractOut, a suite of implicit input manipulation techniques that slightly inhibit the natural execution of common user gestures on smartphones, such as taps and swipes. These input manipulation techniques introduce interaction costs and decrease the smoothness of smartphone interaction to nudge users towards reducing usage.
Tao Lu, Hongxiao Zheng, Tianying Zhang, Xuhai "Orson" Xu, Anhong Guo
CHI 2024
pdf · ACM DL · arXiv · video · 30s preview · press release
BrushLens is a phone case that recognizes screen elements and uses actuators to touch the screen on behalf of the user. The figure shows a user using BrushLens to "brush" across the touchscreen; the actuator touches the screen only when it is precisely on top of the button.
Chen Liang, Yasha Iravantchi, Thomas Krolikowski, Ruijie Geng, Alanson Sample, Anhong Guo
UIST 2023
An illustration of VRGit. A History Graph (HG) that represents non-linear version history is anchored on the user’s left arm, where each node is a 3D miniature of that version. Inside each miniature, objects are highlighted using color coding if they are changed compared to the previous version. Mini avatars are anchored in the HG to represent which version users are in. Users can also create portals to monitor other users’ first-person views. A shared history visualization facilitates group discussion by anchoring the HG on a surface and allowing users to preview a version and reuse objects collaboratively.
Lei Zhang, Ashutosh Agrawal, Steve Oney, Anhong Guo
CHI 2023
pdf · ACM DL · full video · 30s preview · talk
As an overview of XSpace's components, this is a sketch of a person using an AR application in their living room. There is a virtual avatar on a virtual chair, showing the mesh crop and overlay method. There is shared virtual content around the space.
Jaylin Herskovitz, Yi Fei Cheng, Anhong Guo, Alanson Sample, Michael Nebeling
ISS 2022
pdf · ACM DL · video · talk · code
OmniScribe makes 360° videos accessible. A describer is using the OmniScribe web authoring interface, which helps them better understand the 360° content to author standard audio descriptions and create immersive labels, so that blind or visually impaired (BVI) people can interact with 360° content immersively using smartphones and headphones. The picture shows a man wearing wireless headphones, holding a mouse in his right hand to operate a computer and using OmniScribe for audio description. The OmniScribe interface is on the computer screen.
Ruei-Che Chang, Chao-Hsien Ting, Chia-Sheng Hung, Wan-Chen Lee, Liang-Jin Chen, Yu-Tzu Chao, Bing-Yu Chen, Anhong Guo
UIST 2022
CollabAlly system overview: (i) Extracts visual cue information from the Google Docs HTML DOM tree, including collaborators, cursors, comments, highlighted text, and text differences; (ii) Parses this data into a readable format and displays it in three pop-up dialog boxes on demand: (a) enhanced collaborator announcements, (b) comment tracking and navigation, (c) real-time and asynchronous text changes; (iii) Conveys contextual information using audio features, including spatial audio and voice fonts. The figure shows a Google Docs page and three CollabAlly dialog boxes. The annotations also include different visual cues (collaborator, cursor, comment, highlighted text, and text changes) and audio features (spatial audio and voice fonts).
Cheuk Yin Phipson Lee*, Zhuohao Zhang*, Jaylin Herskovitz, JooYoung Seo, Anhong Guo
Honorable Mention · CHI 2022
pdf · ACM DL · full video · 30s preview · talk · code
TutorialLens records the finger location relative to device marker location in 3D for authoring mode, and then reproduces that based on detected device marker location in the access mode (shown in picture).
Junhan Kong, Dena Sabha, Jeffrey P. Bigham, Amy Pavel, Anhong Guo
SUI 2021
pdf · ACM DL · supplemental · video
A decision tree to predict the instances on which the Face API gender classifier is likely to make a mistake based on meta-attributes of the data. Each node of the tree is labeled by the counts of correct and wrong instances belonging to the clusters. Nodes are colored to represent the relative error rate, green shades for lower error rates and red shades for higher error rates.
Solon Barocas, Anhong Guo, Ece Kamar, Jacquelyn Krones, Meredith Ringel Morris, Jennifer Wortman Vaughan, W. Duncan Wadsworth, Hanna Wallach
AIES 2021
pdf · ACM DL · arXiv · poster · slides · talk
Leila, a black, non-binary person with a filtering face mask walks down a neighborhood street with one hand in their pocket and the other hand on their cane. They have a short mohawk and are wearing a jacket, shorts, tennis shoes, and glasses.
Cynthia L. Bennett, Cole Gleason, Morgan Klaus Scheuerman, Jeffrey P. Bigham, Anhong Guo, Alexandra To
Honorable Mention · CHI 2021
pdf · ACM DL · slides · talk
Two example prototypes for making AR apps accessible. A: Foundational Accessibility. Screenshot of a virtual chair with a VoiceOver target around it; a speech bubble shows the app announcing "Back of chair with blue cushion". B: Scanning. Screenshot of an AR grid overlaid on a coffee table. Speech bubbles show the app announcing "Found a new horizontal surface" and "Scanned 2 surfaces totaling 2.3 square meters".
Jaylin Herskovitz, Jason Wu, Samuel White, Amy Pavel, Gabriel Reyes, Anhong Guo, Jeffrey P. Bigham
ASSETS 2020
pdf · ACM DL · video · slides · talk
Image of a person in a wheelchair in front of a swing gate
Shaun K. Kane, Anhong Guo, Meredith Ringel Morris
Best Paper Nominee · ASSETS 2020
pdf · ACM DL · slides · talk
Screenshot of a tweet by @CDCgov from April 1, 2020 3:55pm: Actions to reduce spread of the virus, such as social distancing, are key to #FlattenTheCurve. 2 of 3 (original tweet link: https://twitter.com/CDCgov/status/1245439600472084486) The tweet contains an image of the common public health infographic about “flattening the curve”, but the tweet did not include alt text for the image. The image shows an example of a common flatten the curve info-graphic. A tall peak indicates the height of the pandemic if left unchecked, and a shorter spread out curve depicts the effects of social distancing efforts.
Cole Gleason*, Stephanie Valencia*, Lynn Kirabo, Jason Wu, Anhong Guo, Elizabeth J. Carter, Jeffrey P. Bigham, Cynthia L. Bennett+, Amy Pavel+
ASSETS 2020
pdf · ACM DL · supplemental · slides · talk
The user is holding the phone in landscape mode with one hand, and aiming the camera towards a touchscreen coffee machine. The user’s other hand is wearing a fingercap exploring on the screen. The StateLens iOS app is providing audio guidance to the user.
Anhong Guo, Junhan Kong, Michael Rivera, Frank F. Xu, Jeffrey P. Bigham
UIST 2019
pdf · ACM DL · arXiv · full video · 30s preview · slides · talk · UIST live talk
Two Blocks users are collaboratively creating a table in augmented reality.
Anhong Guo, Ilter Canberk, Hannah Murphy, Andrés Monroy-Hernández, Rajan Vaish
Ubicomp 2019
pdf · ACM DL · arXiv · video
On the left: a screenshot of the Android App Drawer taken using X-Ray. On the right: a user holding the phone showing the same screenshot in the X-Ray image viewer. The TalkBack cursor is visible.
Sujeath Pareddy, Anhong Guo, Jeffrey P. Bigham
Best Artifact Award · ASSETS 2019
pdf · ACM DL · video · code
A word cloud composed of words including AI fairness, accessibility, artificial intelligence, inclusion, and bias.
Anhong Guo, Ece Kamar, Jennifer Wortman Vaughan, Hanna Wallach, Meredith Ringel Morris
ASSETS 2019 AI Fairness Workshop
Three images in the VizWiz-Priv dataset, including an image of a wall of photos containing faces, an image of a credit card, and an image of a pregnancy test. The private information regions in the images are highlighted and inpainted.
Danna Gurari, Qing Li, Chi Lin, Yinan Zhao, Anhong Guo, Abigale Stangl, Jeffrey P. Bigham
CVPR 2019
pdf · supplemental · CVF · vizwiz.org · poster
Interaction scenario of Minuet: after returning home, the user points at the Roomba and then the dirty area to ask Roomba to clean it up.
Runchang Kang, Anhong Guo, Gierad Laput, Yang Li, Xiang 'Anthony' Chen
SUI 2019
pdf · ACM DL · video
An example question sensor created in Zensors++ asking 'Is someone using a printer?' with a bounding box focusing on the printer area.
Anhong Guo, Anuraag Jain, Shomiron Ghose, Gierad Laput, Chris Harrison, Jeffrey P. Bigham
Ubicomp 2018
pdf · ACM DL · video · slides
A table with many boxes covered with white paper showing text such as glasses, butter, jam, etc. A user is holding and targeting his phone at one object, while touching the object with the other hand. This is showcasing the window cursor interaction technique that supports non-visual attention to items within a complex visual scene, in which the user moves the device itself to scan the scene and receives information about what is in the center of the image.
Anhong Guo, Saige McVea, Xu Wang, Patrick Clary, Ken Goldman, Yang Li, Yu Zhong, Jeffrey P. Bigham
ASSETS 2018
pdf · ACM DL · video · slides
Example printed overlays and legends generated by Facade. (a)-(d) demonstrate the different material combinations we tested in the design iterations (NinjaFlex with Braille, Flex+PLA Braille label, Flex+PLA Braille cover, and Flex+PLA embossed letter cover). Facade users can choose to print a legend for the abbreviations (e).
Anhong Guo, Jeffrey P. Bigham
IEEE Pervasive Computing 17(2), 2018, Maker Tech column
Distribution of the first six words for all questions in the VizWiz dataset. The innermost ring represents the first word and each subsequent ring represents a subsequent word. The arc size is proportional to the number of questions with that word/phrase.
Danna Gurari, Qing Li, Abigale J. Stangl, Anhong Guo, Chi Lin, Kristen Grauman, Jiebo Luo, Jeffrey P. Bigham
Spotlight Presentation · CVPR 2018
pdf · arXiv · website · poster · video · tech review
A user accessing the microwave augmented with tactile overlays generated by Facade.
Anhong Guo, Jeeeun Kim, Xiang 'Anthony' Chen, Tom Yeh, Scott E. Hudson, Jennifer Mankoff, Jeffrey P. Bigham
CHI 2017
pdf · ACM DL · full video · 30s preview · talk
A user holding a 3D-printed cup holder augmented with a flexible ring generated using our flexible buffers technique.
Jeeeun Kim, Anhong Guo, Tom Yeh, Scott E. Hudson, Jennifer Mankoff
DIS 2017
pdf · ACM DL · video
The user is holding the phone in portrait mode with one hand, and aiming the camera towards an inaccessible microwave control panel. The user’s other hand is exploring on the panel. The VizLens iOS app is providing audio feedback and guidance to the user.
Anhong Guo, Xiang 'Anthony' Chen, Haoran Qi, Samuel White, Suman Ghosh, Chieko Asakawa, Jeffrey P. Bigham
UIST 2016
pdf · ACM DL · full video · 30s preview · talk
Two tilt-based interaction techniques for enabling no-touch, wrist-only interactions on smartwatches. Left: AnglePoint, which directly maps the position of a virtual pointer to the tilt angle of the smartwatch. Right: ObjectPoint, which objectifies the underlying virtual pointer as an object imbued with a physics model.
Anhong Guo, Tim Paek
Honorable Mention · MobileHCI 2016
pdf · ACM DL · video
A user working on a document on a smartwatch using WearWrite, by leveraging a crowd to help translate their ideas into text.
Michael Nebeling, Alexandra To, Anhong Guo, Adrian A. de Freitas, Jaime Teevan, Steven P. Dow, Jeffrey P. Bigham
CHI 2016
pdf · ACM DL · full video · 30s preview · talk
A system architecture diagram of an order picking system augmented with weight checking error detection.
Xiaolong Wu, Malcolm Haynes, Anhong Guo, Thad Starner
ISWC 2016
pdf · ACM DL · video
Four BeyondTouch interaction techniques, including tapping on a phone in the pocket, tapping on the back of a phone while holding it with two hands, tapping and sliding on the back of the phone while holding it with one hand, as well as tapping and sliding next to the phone on the table to control the device.
Cheng Zhang, Anhong Guo, Dingtian Zhang, Yang Li, Caleb Southern, Rosa I. Arriaga, Gregory D. Abowd
TiiS 6(2), 2016
pdf · ACM DL
Pick-by-head-up display system using a Google Glass with an opaque display to show the pick order instructions.
Anhong Guo, Xiaolong Wu, Zhengyang Shen, Thad Starner, Hannes Baumann, Scott Gilliland
Computer 48(6), 2015
Four BeyondTouch interaction techniques, including tapping on a phone in the pocket, tapping on the back of a phone while holding it with two hands, tapping and sliding on the back of the phone while holding it with one hand, as well as tapping and sliding next to the phone on the table to control the device.
Cheng Zhang, Anhong Guo, Dingtian Zhang, Caleb Southern, Rosa I. Arriaga, Gregory D. Abowd
IUI 2015
pdf · ACM DL · video
Image of a set of order picking bins with LED displays and buttons.
Xiaolong Wu, Malcolm Haynes, Yixin Zhang, Ziyi Jiang, Zhengyang Shen, Anhong Guo, Thad Starner, Scott Gilliland
ISWC 2015
pdf · ACM DL
Image of the experimental setup, including 24 pick bins (on two shelving units with four rows and three columns each) and three order bins on the right. An example pick list is annotated with superimposed labels.
Anhong Guo, Shashank Raghu, Xuwen Xie, Saad Ismail, Xiaohui Luo, Joseph Simoneau, Scott Gilliland, Hannes Baumann, Caleb Southern, Thad Starner
10-Year Impact Award · Honorable Mention · ISWC 2014
pdf · ACM DL · video · talk