Anhong Guo

My research develops Personal Assistive Technology — a new class of assistive systems that deeply adapts to users’ individual abilities, physical contexts, and intents, to deliver just the right information, in the right modality, at the right time.

Anhong Guo is a Morris Wellman Assistant Professor in Computer Science & Engineering at the University of Michigan, also affiliated with the School of Information. His research is at the intersection of HCI and AI, which leverages the synergy between human and machine intelligence to create interactive systems for accessibility, collaboration, and beyond. His research has received best paper, honorable mention, and artifact awards at CHI, UIST, ASSETS, and MobileHCI, and the 10-year impact award at ISWC on wearable technologies for warehouse order picking. He is a recipient of the NSF CAREER award, a Forbes' 30 Under 30 Scientist, a Google Research Scholar, an inaugural Snap Inc. Research Fellow, and a Swartz Innovation Fellow for Entrepreneurship. Anhong holds a Ph.D. in Human-Computer Interaction from Carnegie Mellon University, a Master’s in HCI from Georgia Tech, and a Bachelor's in Electronic Information Engineering from BUPT. He has also worked in the Ability and Intelligent User Experiences groups in Microsoft Research, the HCI group of Snap Research, and the Accessibility Engineering team at Google.

News

Sep 19, 2025

🏆 Honored to be named the Morris Wellman Endowed Assistant Professor

Aug 5, 2025

💰 Excited to receive an NSF award with Danna Gurari on Collaborative Research: HCC: Small: Accounting for Focus Ambiguity in Visual Questions

Jun 6, 2025

🏆 Excited to receive the NSF CAREER award on DIY personal assistive technology!

May 23, 2025

💰 Excited to receive Samsung START program funding with Ke Sun and Kang G. Shin to develop efficient wearable memory augmentation systems

May 6, 2025

🎉 Congratulations to Jaylin Herskovitz, who will join Tufts CS as an Assistant Professor in Jan 2026

Apr 18, 2025

☕ We will welcome three PhD students to join us this Fall: Yuxuan Liu, Ellie Seehorn, and Muzhe Wu

Mar 26, 2025

🎉 Congratulations to Ruei-Che Chang on receiving the Apple Scholars in AI/ML PhD fellowship

Mar 3, 2025

☕ New talk: Personal Assistive Technology at the Stanford HCI Seminar

Feb 12, 2025

💰 Excited to be part of the ARPA-H Michigan team led by Jason Corso to advance AI intelligent task guidance to upskill medical professionals for rural healthcare

Feb 10, 2025

🎉 Congratulations to Jaylin Herskovitz on receiving the Susan Lipschutz Award for Rackham graduate students

Oct 23, 2024

📃 Ellie and Jaylin wrote a nice article about our lab, published in the ACM Crossroads magazine

Oct 12, 2024

🏆 Our ISWC 2014 paper on evaluating order picking methods was awarded the 10-Year Impact Award

Oct 8, 2024

🏆 Excited to receive a Google Academic Research Award on project WorldScribe

Oct 1, 2024

🏆 WorldScribe, led by Ruei-Che Chang, was awarded the Best Paper Award at UIST 2024

Jul 14, 2024

💰 Excited to receive an NSF award SCH: Multimodal Techniques to Enhance Intra- and Post-operative Learning and Coordination between Attending and Resident Surgeons with PIs Xu Wang and Vitaliy Popov

Jun 21, 2024

🎉 Congratulations to 🎓 Dr. Lei Zhang on his wonderful Ph.D. defense! Lei will join NJIT Informatics as an Assistant Professor after a one-term postdoc at Princeton

Jun 5, 2024

☕ The Human-AI Lab (HAIL) was at VISIONS 2024 hosted by the Ann Arbor District Library

Publications

An illustration showing a photo of a fluffy orange-and-brown tabby cat cuddling with a light cream-colored dog on the grass. Below the photo, there is a question, 'Is there a cat in the image?' with an arrow pointing to both a human with low vision and a VLM agent. Both the human with low vision and the VLM agent respond, 'I can see some animal shapes, but I can’t tell if there is a cat,' indicating shared visual limitations.

Not There Yet: Evaluating Vision Language Models in Simulating the Visual Perception of People with Low Vision

Rosiana Natalie, Wenqian Xu, Ruei-Che Chang, Rada Mihalcea, Anhong Guo

Preprint

pdf · arXiv

A figure showing an overview of the HandProxy system in 3 stages. The user starts by giving a speech command (first stage), which is used and interpreted by the HandProxy system to control a virtual proxy hand (second stage), and the proxy hand will then interact within the virtual environment to perform hand interactions on behalf of the user (third stage).

HandProxy: Expanding the Affordances of Speech Interfaces in Immersive Environments with a Virtual Proxy Hand

Chen Liang, Yuxuan Liu, Martez Mott, Anhong Guo

IMWUT/Ubicomp 2025

pdf · ACM DL · arXiv · video · press release

A blind user interacts with an AI through a live video feed on their phone. The user asks, 'Can you read the label for me?' The AI responds, 'Could you angle it slightly to reduce the glare?' After adjusting the product, the user asks, 'Is this better?' and the AI replies, 'Yes, that's much better,' then proceeds to read the label. The image illustrates how AI assists blind users by guiding them to capture clearer visuals for reading text.

Probing the Gaps in ChatGPT Live Video Chat for Real-World Assistance for People who are Blind or Visually Impaired

Ruei-Che Chang, Rosiana Natalie, Wenqian Xu, Jovan Zheng Feng Yap, Anhong Guo

ASSETS 2025

pdf · ACM DL · arXiv

A schematic overview of a blind or low-vision (BLV) user interacting with A11yShape to create 3-D models. Panel A shows the BLV user. Panel B illustrates the A11yShape interface, which includes three components: Code Editor panel, AI Assistant Panel providing textual descriptions, and Model Panel showing a hierarchical structure and 3-D renderings. A cross-representation highlighting mechanism links elements across these panels to support non-visual navigation in the user interface. Panel C displays examples of 3-D models created by the user.

A11yShape: AI-Assisted 3-D Modeling for Blind and Low-Vision Programmers

Zhuohao (Jerry) Zhang, Haichang Li, Chun Meng Yu, Faraz Faruqi, Junan Xie, Gene S-H Kim, Mingming Fan, Angus G. Forbes, Jacob O. Wobbrock, Anhong Guo, Liang He

ASSETS 2025

pdf · ACM DL · arXiv · video · press release

DeckFlow is an infinite canvas for creating multimodal content. In this case, the user drags a Goal Card from the Hand, which generates an Action Card connected to several Text Cards representing the decomposed specification. The Action Card spawns multiple Text Cards containing the constructed prompts, and images are generated using them so the user can explore the generative space. In a subsequent iteration of the task, the user moves some of them into a Cluster, and uses one as input to the Action Card.

DeckFlow: Iterative Specification on a Multimodal Generative Canvas

Gregory Croisdale, Emily Huang, John Joon Young Chung, Anhong Guo, Xu Wang, Austin Z. Henley, Cyrus Omar

VL/HCC 2025

pdf · arXiv

A student is learning to solve the Rubik's Cube using Rubikon, an intelligent tutoring system for Rubik’s Cube learning. It is supported by an ArUco marker AR setup that enables physical task reconfiguration.

Rubikon: Intelligent Tutoring for Rubik's Cube Learning Through AR-enabled Physical Task Reconfiguration

Haocheng Ren*, Muzhe Wu*, Gregory Croisdale, Anhong Guo, Xu Wang

DIS 2025

pdf · ACM DL · arXiv · video · 30s preview

Human-AI Lab members standing in front of our booth at the VISIONS 2024 event hosted by the Ann Arbor District Library (AADL).

VISIONS of Accessibility: Human-AI Lab (HAIL), University of Michigan

Ellie Seehorn, Jaylin Herskovitz

XRDS: Crossroads, The ACM Magazine for Students (Fall 2024)

pdf · ACM DL · VISIONS 2024 event

ProgramAlly is an end-user programming tool for creating visual information filtering programs. The figure shows a diagram of how programs are run. For the example program 'find NUMBER on BUS', the app first looks for buses in the frame. Then, if a bus is found, the image is cropped to just contain the bus. In the cropped frame, text detection will be used to look for numbers.

ProgramAlly: Creating Custom Visual Access Programs via Multi-Modal End-User Programming

Jaylin Herskovitz, Andi Xu, Rahaf Alharbi, Anhong Guo

UIST 2024

pdf · ACM DL · arXiv · video · 30s preview · talk · website

WorldScribe is a system that generates automated live real-world visual descriptions that are customizable and adaptive to users' contexts.

WorldScribe: Towards Context-Aware Live Visual Descriptions

Ruei-Che Chang, Yuxuan Liu, Anhong Guo

Best Paper Award · UIST 2024

pdf · ACM DL · arXiv · video · 30s preview · talk · press release · worldscribe.org

Scaffolded creation in VRCopilot: Users can create wireframes by drawing on the floor while speaking, in addition to automatically generated wireframes; they can then turn the wireframes into specific furniture.

VRCopilot: Authoring 3D Layouts with Generative AI Models in VR

Lei Zhang, Jin Pan, Jacob Gettig, Steve Oney, Anhong Guo

UIST 2024

pdf · ACM DL · arXiv · video · 30s preview · talk

EditScribe enables non-visual image editing using natural language verification loops powered by large multimodal models.

EditScribe: Non-Visual Image Editing with Natural Language Verification Loops

Ruei-Che Chang, Yuxuan Liu, Lotus Zhang, Anhong Guo

ASSETS 2024

pdf · ACM DL · arXiv · video

The CustomAD interface consists of a video player (left), which allows users to play, pause, and seek the video, and a customization pane (right) where users can customize the properties of ADs. The customization properties are grouped into content settings and presentation settings. Content customization adjusts the script's content as users change the ADs' length and emphasis. In presentation customization, users can adjust the speed, voice, tone, gender, and grammatical syntax of the ADs to change how they are read out. Users can also toggle the ADs on and off.

Audio Description Customization

Rosiana Natalie, Ruei-Che Chang, Smitha Sheshadri, Anhong Guo, Kotaro Hara

ASSETS 2024

pdf · ACM DL · arXiv · video

We present SoundShift, a concept to manipulate sounds for improving mixed-reality awareness. SoundShift situates in the auditory Reality-Virtuality Continuum with full transparency and noise cancellation as two ends, and comprises six sound manipulators: Transparency Shift, Envelope Shift, Position Shift, Style Shift, Time Shift, and Sound Append.

SoundShift: Exploring Sound Manipulations for Accessible Mixed-Reality Awareness

Ruei-Che Chang, Chia-Sheng Hung, Bing-Yu Chen, Dhruv Jain, Anhong Guo

DIS 2024

pdf · ACM DL · arXiv · video

App icon of ImageExplorer, which shows a hand with the index finger touching an image.

ImageExplorer Deployment: Understanding Text-Based and Touch-Based Image Exploration in the Wild

Andi Xu, Minyu Cai, Dier Hou, Ruei-Che Chang, Anhong Guo

W4A 2024

pdf · ACM DL · talk · ImageExplorer app

We present InteractOut, a suite of implicit input manipulation techniques that slightly inhibit the natural execution of common user gestures on smartphones, such as taps and swipes. These input manipulation techniques introduce interaction costs and decrease the smoothness of smartphone interaction to nudge users towards reducing usage.

InteractOut: Leveraging Interaction Proxies as Input Manipulation Strategies for Reducing Smartphone Overuse

Tao Lu, Hongxiao Zheng, Tianying Zhang, Xuhai "Orson" Xu, Anhong Guo

CHI 2024

pdf · ACM DL · arXiv · video · 30s preview · press release

BrushLens is a phone case with actuators that recognizes screen elements and uses the actuators to touch the screen on behalf of the user. The figure shows a user using BrushLens to "brush" on the touchscreen; the actuator touches the screen when it is precisely on top of the button.

BrushLens: Hardware Interaction Proxies for Accessible Touchscreen Interface Actuation

Chen Liang, Yasha Iravantchi, Thomas Krolikowski, Ruijie Geng, Alanson Sample, Anhong Guo

UIST 2023

pdf · ACM DL · video · 30s preview · press release

AI-based assistive technologies tend to assume 'universal' needs of BVI people and are thus one-size-fits-all, rather than accounting for unique differences and desires. This figure shows an example where a blind user uses Seeing AI to read a letter while trying to find the sender's name, Rebecca. However, the app reads out many other unnecessary and distracting details.

Hacking, Switching, Combining: Understanding and Supporting DIY Assistive Technology Design by Blind People

Jaylin Herskovitz, Andi Xu, Rahaf Alharbi, Anhong Guo

CHI 2023

pdf · ACM DL · full video · 30s preview · talk · dataset

An illustration of VRGit. A History Graph (HG) that represents non-linear version history is anchored on the user’s left arm, where each node is a 3D miniature of that version. Inside each miniature, objects are highlighted using color coding if they are changed compared to the previous version. Mini avatars are anchored in the HG to represent which version users are in. Users can also create portals to monitor other users’ first-person views. A shared history visualization facilitates group discussion by anchoring the HG on a surface and allowing users to preview a version and reuse objects collaboratively.

VRGit: A Version Control System for Collaborative Content Creation in Virtual Reality

Lei Zhang, Ashutosh Agrawal, Steve Oney, Anhong Guo

CHI 2023

pdf · ACM DL · full video · 30s preview · talk

This figure illustrates the scenario of a human support robot designed to perform tasks for the elderly or those with mobility impairments. The human is looking for his book, The Future of Ideas, which is the top book in the stack located on his desk, next to a computer monitor. However, 'The Future of Ideas' may be ambiguous, as the phrase is also displayed on the computer monitor. Using deferred inference, the robot may ask the human to reframe their query, 'Top Book', at which point the robot is confident about what the user is referring to.

Human-Centered Deferred Inference: Measuring User Interactions and Setting Deferral Criteria for Human-AI Teams

Stephan J. Lemmer, Anhong Guo, Jason J. Corso

IUI 2023

pdf · ACM DL · code

XSpace: An Augmented Reality Toolkit for Enabling Spatially-Aware Distributed Collaboration

Jaylin Herskovitz, Yi Fei Cheng, Anhong Guo, Alanson Sample, Michael Nebeling

ISS 2022

pdf · ACM DL · video · talk · code

OmniScribe makes 360° videos accessible. A describer is using the OmniScribe web authoring interface, which helps them better understand the 360° content to author standard audio descriptions and create immersive labels, in order to enable BVI people to interact with 360° content immersively using smartphones and headphones. The picture shows a man wearing wireless headphones, holding a mouse in his right hand to operate a computer and use OmniScribe for audio description. On the computer is the OmniScribe interface.

OmniScribe: Authoring Immersive Audio Descriptions for 360° Videos

Ruei-Che Chang, Chao-Hsien Ting, Chia-Sheng Hung, Wan-Chen Lee, Liang-Jin Chen, Yu-Tzu Chao, Bing-Yu Chen, Anhong Guo

UIST 2022

pdf · ACM DL · video · 30s preview · omniscribe.org

Screenshot of the four step CustomizAR process: Step 1: Select Object, Step 2: Select Design, Step 3: Adjust & Measure, and Step 4: Finished. Currently showing Step 3: Adjust & Measure, where the user roughly indicates the measurement location. The selected design of the bicycle water bottle cage with zip-tie attachment is displayed on the top left corner. The measurement guide is in the center of the screen. On the bottom, it's showing that bottleDiameter is the parameter that's currently being measured, as well as controls and instructions for taking a photo.

CustomizAR: Facilitating Interactive Exploration and Measurement of Adaptive 3D Designs

Chen Liang, Anhong Guo, Jeeeun Kim

DIS 2022

pdf · ACM DL · 30s preview · talk

CollabAlly system overview: (i) Extracts visual cue information from Google Doc's HTML DOM tree, including collaborators, cursors, comments, highlighted text, and text differences; (ii) Parses this data into a readable format and displays it in three pop-up dialog boxes on-demand: (a) enhanced collaborator announcements, (b) comment tracking and navigation, (c) real-time and asynchronous text changes; (iii) Conveys contextual information using audio features, including spatial audio and voice fonts. The figure shows a Google doc page and 3 interfaces of CollabAlly dialog boxes. The annotations also include different visual cues (collaborator, cursor, comment, highlighted text, and text changes) and audio features (spatial audio and voice fonts).

CollabAlly: Accessible Collaboration Awareness in Document Editing

Cheuk Yin Phipson Lee*, Zhuohao Zhang*, Jaylin Herskovitz, JooYoung Seo, Anhong Guo

Honorable Mention · CHI 2022

pdf · ACM DL · full video · 30s preview · talk · code

Left: An image of a bed with a white outline around it. A speech bubble reads: Bed. Double tap to explore. Right: The same image of the bed, now with a variety of rectangular bounding boxes around the white pillow and blue sheets. Two speech bubbles read: A white pillow and The bed is blue.

ImageExplorer: Multi-Layered Touch Exploration to Encourage Skepticism Towards Imperfect AI-Generated Image Captions

Jaewook Lee, Jaylin Herskovitz, Yi-Hao Peng, Anhong Guo

CHI 2022

pdf · ACM DL · full video · 30s preview · talk

TutorialLens records the finger location relative to device marker location in 3D for authoring mode, and then reproduces that based on detected device marker location in the access mode (shown in picture).

TutorialLens: Authoring Interactive Augmented Reality Tutorials Through Narration and Demonstration

Junhan Kong, Dena Sabha, Jeffrey P. Bigham, Amy Pavel, Anhong Guo

SUI 2021

pdf · ACM DL · supplemental · video

A decision tree to predict the instances on which the Face API gender classifier is likely to make a mistake based on meta-attributes of the data. Each node of the tree is labeled by the counts of correct and wrong instances belonging to the clusters. Nodes are colored to represent the relative error rate, green shades for lower error rates and red shades for higher error rates.

Designing Disaggregated Evaluations of AI Systems: Choices, Considerations, and Tradeoffs

Solon Barocas, Anhong Guo, Ece Kamar, Jacquelyn Krones, Meredith Ringel Morris, Jennifer Wortman Vaughan, W. Duncan Wadsworth, Hanna Wallach

AIES 2021

pdf · ACM DL · arXiv · poster · slides · talk

Leila, a black, non-binary person with a filtering face mask walks down a neighborhood street with one hand in their pocket and the other hand on their cane. They have a short mohawk and are wearing a jacket, shorts, tennis shoes, and glasses.

"It's Complicated": Negotiating Accessibility and (Mis)Representation in Image Descriptions of Race, Gender, and Disability

Cynthia L. Bennett, Cole Gleason, Morgan Klaus Scheuerman, Jeffrey P. Bigham, Anhong Guo, Alexandra To

Honorable Mention · CHI 2021

pdf · ACM DL · slides · talk

Image: "Walking in neighborhood with face mask", by Disabled And Here, licensed under CC BY 4.0

Two example prototypes for making AR apps accessible. A: Foundational Accessibility. Screenshot of a virtual chair with a voice over target around it, a speech bubble shows the app announcing "Back of chair with blue cushion". B: Scanning. Screenshot of AR grid overlaid on a coffee table. Speech bubbles show the app announcing "Found a new horizontal surface" and "Scanned 2 surfaces totaling 2.3 square meters".

Making Mobile Augmented Reality Applications Accessible

Jaylin Herskovitz, Jason Wu, Samuel White, Amy Pavel, Gabriel Reyes, Anhong Guo, Jeffrey P. Bigham

ASSETS 2020

pdf · ACM DL · video · slides · talk

Image of a person in a wheelchair in front of a swing gate

Sense and Accessibility: Understanding People with Physical Disabilities’ Experiences with Sensing Systems

Shaun K. Kane, Anhong Guo, Meredith Ringel Morris

Best Paper Nominee · ASSETS 2020

pdf · ACM DL · slides · talk

Image: "Swing Gate - SL 931 - Belgium", by Automatic Systems, licensed under CC BY 2.0

Screenshot of a tweet by @CDCgov from April 1, 2020 3:55pm: Actions to reduce spread of the virus, such as social distancing, are key to #FlattenTheCurve. 2 of 3 (original tweet link: https://twitter.com/CDCgov/status/1245439600472084486) The tweet contains an image of the common public health infographic about “flattening the curve”, but the tweet did not include alt text for the image. The image shows an example of a common flatten the curve info-graphic. A tall peak indicates the height of the pandemic if left unchecked, and a shorter spread out curve depicts the effects of social distancing efforts.

Disability and the COVID-19 Pandemic: Using Twitter to Understand Accessibility during Rapid Societal Transition

Cole Gleason*, Stephanie Valencia*, Lynn Kirabo, Jason Wu, Anhong Guo, Elizabeth J. Carter, Jeffrey P. Bigham, Cynthia L. Bennett⁺, Amy Pavel⁺

ASSETS 2020

pdf · ACM DL · supplemental · slides · talk

The user is holding the phone in landscape mode with one hand, and aiming the camera towards a touchscreen coffee machine. The user’s other hand is wearing a fingercap exploring on the screen. The StateLens iOS app is providing audio guidance to the user.

StateLens: A Reverse Engineering Solution for Making Existing Dynamic Touchscreens Accessible

Anhong Guo, Junhan Kong, Michael Rivera, Frank F. Xu, Jeffrey P. Bigham

UIST 2019

pdf · ACM DL · arXiv · full video · 30s preview · slides · talk · UIST live talk

Two Blocks users are collaboratively creating a table in augmented reality.

Blocks: Collaborative and Persistent Augmented Reality Experiences

Anhong Guo, Ilter Canberk, Hannah Murphy, Andrés Monroy-Hernández, Rajan Vaish

Ubicomp 2019

pdf · ACM DL · arXiv · video

On the left: a screenshot of the Android App Drawer taken using X-Ray. On the right: a user holding the phone showing the same screenshot in the X-Ray image viewer. The TalkBack cursor is visible.

X-Ray: Screenshot Accessibility via Embedded Metadata

Sujeath Pareddy, Anhong Guo, Jeffrey P. Bigham

Best Artifact Award · ASSETS 2019

pdf · ACM DL · video · code

A word cloud composed of words including AI fairness, accessibility, artificial intelligence, inclusion, and bias.

Toward Fairness in AI for People with Disabilities: A Research Roadmap

Anhong Guo, Ece Kamar, Jennifer Wortman Vaughan, Hanna Wallach, Meredith Ringel Morris

ASSETS 2019 AI Fairness Workshop

pdf · SIGACCESS Newsletter · arXiv · slides · talk

Three images in the VizWiz-Priv dataset, including an image of a wall of photos containing faces, an image of a credit card, and an image of a pregnancy test. The private information regions in the images are highlighted and inpainted.

VizWiz-Priv: A Dataset for Recognizing the Presence and Purpose of Private Visual Information in Images Taken by Blind People

Danna Gurari, Qing Li, Chi Lin, Yinan Zhao, Anhong Guo, Abigale Stangl, Jeffrey P. Bigham

CVPR 2019

pdf · supplemental · CVF · vizwiz.org · poster

Interaction scenario of Minuet: after returning home, the user points at the Roomba and then the dirty area to ask Roomba to clean it up.

Minuet: Multimodal Interaction with an Internet of Things

Runchang Kang, Anhong Guo, Gierad Laput, Yang Li, Xiang 'Anthony' Chen

SUI 2019

pdf · ACM DL · video

An example question sensor created in Zensors++ asking 'Is someone using a printer?' with a bounding box focusing on the printer area.

Crowd-AI Camera Sensing in the Real World

Anhong Guo, Anuraag Jain, Shomiron Ghose, Gierad Laput, Chris Harrison, Jeffrey P. Bigham

Ubicomp 2018

pdf · ACM DL · video · slides

A table with many boxes covered with white paper showing text such as glasses, butter, jam, etc. A user is holding and targeting his phone at one object, while touching the object with the other hand. This is showcasing the window cursor interaction technique that supports non-visual attention to items within a complex visual scene, in which the user moves the device itself to scan the scene and receives information about what is in the center of the image.

Investigating Cursor-based Interactions to Support Non-Visual Exploration in the Real World

Anhong Guo, Saige McVea, Xu Wang, Patrick Clary, Ken Goldman, Yang Li, Yu Zhong, Jeffrey P. Bigham

ASSETS 2018

pdf · ACM DL · video · slides

Example printed overlays and legends generated by Facade. (a)-(d) demonstrate the different material combinations we tested in the design iterations (NinjaFlex with Braille, Flex+PLA Braille label, Flex+PLA Braille cover, and Flex+PLA embossed letter cover). Facade users can choose to print a legend for the abbreviations (e).

Making Everyday Interfaces Accessible: Tactile Overlays by and for Blind People

Anhong Guo, Jeffrey P. Bigham

IEEE Pervasive Computing 17(2), 2018, Maker Tech column

pdf · IEEE Xplore

Distribution of the first six words for all questions in the VizWiz dataset. The innermost ring represents the first word and each subsequent ring represents a subsequent word. The arc size is proportional to the number of questions with that word/phrase.

VizWiz Grand Challenge: Answering Visual Questions from Blind People

Danna Gurari, Qing Li, Abigale J. Stangl, Anhong Guo, Chi Lin, Kristen Grauman, Jiebo Luo, Jeffrey P. Bigham

Spotlight Presentation · CVPR 2018

pdf · arXiv · website · poster · video · tech review

A user accessing the microwave augmented with tactile overlays generated by Facade.

Facade: Auto-generating Tactile Interfaces to Appliances

Anhong Guo, Jeeeun Kim, Xiang 'Anthony' Chen, Tom Yeh, Scott E. Hudson, Jennifer Mankoff, Jeffrey P. Bigham

CHI 2017

pdf · ACM DL · full video · 30s preview · talk

A user holding a 3D-printed cup holder augmented with a flexible ring generated using our flexible buffers technique.

Understanding Uncertainty in Measurement and Accommodating its Impact in 3D Modeling and Printing

Jeeeun Kim, Anhong Guo, Tom Yeh, Scott E. Hudson, Jennifer Mankoff

DIS 2017

pdf · ACM DL · video

The user is holding the phone in portrait mode with one hand, and aiming the camera towards an inaccessible microwave control panel. The user’s other hand is exploring on the panel. The VizLens iOS app is providing audio feedback and guidance to the user.

VizLens: A Robust and Interactive Screen Reader for Interfaces in the Real World

Anhong Guo, Xiang 'Anthony' Chen, Haoran Qi, Samuel White, Suman Ghosh, Chieko Asakawa, Jeffrey P. Bigham

UIST 2016

pdf · ACM DL · full video · 30s preview · talk

Two tilt-based interaction techniques for enabling no-touch, wrist-only interactions on smartwatches. Left: AnglePoint, which directly maps the position of a virtual pointer to the tilt angle of the smartwatch. Right: ObjectPoint, which objectifies the underlying virtual pointer as an object imbued with a physics model.

Exploring Tilt for No-Touch, Wrist-Only Interactions on Smartwatches

Anhong Guo, Tim Paek

Honorable Mention · MobileHCI 2016

pdf · ACM DL · video

A user working on a document on a smartwatch using WearWrite, by leveraging a crowd to help translate their ideas into text.

WearWrite: Crowd-Assisted Writing from Smartwatches

Michael Nebeling, Alexandra To, Anhong Guo, Adrian A. de Freitas, Jaime Teevan, Steven P. Dow, Jeffrey P. Bigham

CHI 2016

pdf · ACM DL · full video · 30s preview · talk

A system architecture diagram of an order picking system augmented with weight checking error detection.

A Comparison of Order Picking Methods Augmented with Weight Checking Error Detection

Xiaolong Wu, Malcolm Haynes, Anhong Guo, Thad Starner

ISWC 2016

pdf · ACM DL · video

Four BeyondTouch interaction techniques, including tapping on a phone in the pocket, tapping on the back of a phone while holding it with two hands, tapping and sliding on the back of the phone while holding it with one hand, as well as tapping and sliding next to the phone on the table to control the device.

Beyond the Touchscreen: An Exploration of Extending Interactions on Commodity Smartphones

Cheng Zhang, Anhong Guo, Dingtian Zhang, Yang Li, Caleb Southern, Rosa I. Arriaga, Gregory D. Abowd

TiiS 6(2), 2016

pdf · ACM DL

Using CapAuth, a user can place their hand on a smartphone touchscreen for authentication. Left: initial state of CapAuth showing a handprint guide with grey background. Right: user touching the touchscreen and successfully authenticated, showing a green background.

CapAuth: Identifying and Differentiating User Handprints on Commodity Capacitive Touchscreens

Anhong Guo, Robert Xiao, Chris Harrison

ITS 2015

pdf · ACM DL · video · talk

Pick-by-head-up display system using a Google Glass with an opaque display to show the pick order instructions.

Order Picking with Head-Up Displays

Anhong Guo, Xiaolong Wu, Zhengyang Shen, Thad Starner, Hannes Baumann, Scott Gilliland

Computer 48(6), 2015

pdf · IEEE Xplore · video 1 · video 2

BeyondTouch: Extending the Input Language with Built-in Sensors on Commodity Smartphones

Cheng Zhang, Anhong Guo, Dingtian Zhang, Caleb Southern, Rosa I. Arriaga, Gregory D. Abowd

IUI 2015

pdf · ACM DL · video

Image of a set of order picking bins with LED displays and buttons.

Comparing Order Picking Assisted by Head-Up Display versus Pick-by-Light with Explicit Pick Confirmation

Xiaolong Wu, Malcolm Haynes, Yixin Zhang, Ziyi Jiang, Zhengyang Shen, Anhong Guo, Thad Starner, Scott Gilliland

ISWC 2015

pdf · ACM DL

Image of the experimental setup, including 24 pick bins (on two shelving units with four rows and three columns each) and three order bins on the right. An example pick list is annotated with superimposed labels.

A Comparison of Order Picking Assisted by Head-Up Display (HUD), Cart-Mounted Display (CMD), Light, and Paper Pick List

Anhong Guo, Shashank Raghu, Xuwen Xie, Saad Ismail, Xiaohui Luo, Joseph Simoneau, Scott Gilliland, Hannes Baumann, Caleb Southern, Thad Starner

10-Year Impact Award · Honorable Mention · ISWC 2014

pdf · ACM DL · video · talk

Morris Wellman Assistant Professor

Computer Science & Engineering

College of Engineering

University of Michigan

Curriculum Vitae (CV)

Google Scholar

Email: anhong@umich.edu

Twitter: @AnhongGuo

YouTube Channel

Teaching

Fall 2025

EECS 493: User Interface Development

Winter 2025

EECS 594: Human-AI Interaction & Systems

Apps Released

VizLens

ImageExplorer

Talks & Events

Jan 21, 2025

UM AI Seminar

Feb 7, 2025

Stanford

Feb 26, 2025

UM HCC Seminar

Mar 3, 2025

Northwestern

Mar 5, 2025

UChicago

Jun 11, 2025

Meta Reality Labs

Jun 13, 2025

Microsoft Research

Jun 16, 2025

Apple Research

Jun 17, 2025

Adobe Research

People

Funding

Our research is generously supported by: the National Science Foundation (NSF), the Advanced Research Projects Agency for Health (ARPA-H), and industry partners Google, Apple, Samsung, Adobe, and Snap.

People

PhD Students (mentoring plan)

Postdoc

PhD Graduates

Master's and Undergrad Students