
RESEARCH PROJECTS


Skiing, Fast and Slow: Evaluation of Time Distortion for VR Ski Training

Takashi Matsumoto, Erwin Wu, and Hideki Koike.  (AHs 2022)

Virtual reality-based sports simulators are now widely developed, making training in a virtual environment possible. One advantage of virtual training is customizability: the user can be augmented with extra cues or placed in different physical environments, which is difficult to realize in real training. In this paper, we study the effect of time distortion on alpine ski training to find out how modifying the temporal scale can affect sports training. We conducted two experiments to investigate how fast/slow and static/dynamic time-distortion-based training, respectively, impact user performance.
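The paper does not ship code, but the core idea of static versus dynamic time distortion can be sketched as a mapping from wall-clock time to simulation time. A minimal Python sketch, assuming a hypothetical speed-dependent slowdown rule (the actual conditions in the paper may differ):

def step_sim_time(sim_t, real_dt, factor):
    """Static distortion: simulation time advances at one fixed rate
    (factor < 1 is slow motion, factor > 1 is fast motion)."""
    return sim_t + real_dt * factor

def step_sim_time_dynamic(sim_t, real_dt, skier_speed, v_ref=10.0):
    """Dynamic distortion (hypothetical rule, not the paper's): slow
    the simulation down as the skier speeds up, so fast passages are
    easier to follow during training."""
    factor = min(1.0, v_ref / max(skier_speed, 1e-6))
    return sim_t + real_dt * factor

# Example: three 90 Hz VR frames at increasing skier speeds (m/s).
sim_t = 0.0
for speed in (5.0, 15.0, 25.0):
    sim_t = step_sim_time_dynamic(sim_t, 1 / 90, speed)
print(f"simulation time after 3 frames: {sim_t:.4f} s")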
Video
Poster
Code

SkiSim: A Comprehensive Study on Full Body Motion Capture and Real-Time Feedback in VR Ski Training

Jana Hoffard, Xuan Zhang, Erwin Wu et al.  (AHs 2022)

Ski training has many restrictions, such as environmental requirements and the constraint of indirect, cyclic feedback given by coaches. Although several researchers have introduced systems to overcome these limitations, there is no comprehensive study on which body parts trainees concentrate on most and which feedback is best suited for intuitive skier training. We conducted two user studies to determine which body parts users concentrate on while using a ski simulator and whether more informative non-visual feedback on those body parts can compete with state-of-the-art visual feedback.
Video
Poster
Code

CV-Based Analysis for Microscopic Gauze Suturing Training

Mikihito Matsuura, Shio Miyafuji, Erwin Wu et al.  (AHs 2021)

This paper proposes a microscopic suture support system aimed at reducing the time neurosurgeons need to practice microscopic suturing. The system detects instruments in real time from the microscope camera video and provides immediate analysis. We introduce a sequential image dataset in which the surgery phases, as well as the bounding boxes of surgical instruments, are annotated. A YOLOv4 network fine-tuned on the proposed dataset achieves an accuracy of approximately 94%. We also propose a tool that estimates the phase of each suturing practice from the tracking data using a dynamic-programming-based algorithm, with 83% accuracy.
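The paper's dynamic-programming phase estimator is not reproduced here, but the general technique can be sketched: given per-frame scores for a fixed, ordered set of suturing phases (derived, e.g., from the instrument tracking data), find the monotone frame-to-phase assignment that maximizes the total score. The NumPy sketch below is a generic version of that idea with placeholder inputs, not the authors' implementation:

import numpy as np

def segment_phases(frame_scores):
    """Assign each frame to one of K ordered phases so the phase index
    never decreases over time, maximizing the summed per-frame scores.
    frame_scores: (T, K) array; any start/end phase is allowed here."""
    T, K = frame_scores.shape
    dp = np.full((T, K), -np.inf)
    back = np.zeros((T, K), dtype=int)
    dp[0] = frame_scores[0]
    for t in range(1, T):
        for k in range(K):
            best_prev, back[t, k] = dp[t - 1, k], k          # stay in phase k
            if k > 0 and dp[t - 1, k - 1] > best_prev:
                best_prev, back[t, k] = dp[t - 1, k - 1], k - 1  # advance
            dp[t, k] = best_prev + frame_scores[t, k]
    path = np.zeros(T, dtype=int)                            # backtrack
    path[-1] = int(np.argmax(dp[-1]))
    for t in range(T - 1, 0, -1):
        path[t - 1] = back[t, path[t]]
    return path

# e.g. segment_phases(np.random.rand(120, 4)) -> per-frame phase indices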
Video
Poster
Code

SPinPong - Virtual Reality Table Tennis Skill Acquisition using Visual, Haptic and Temporal Cues

Erwin Wu, Mitsuki Piekenbrock et al.  (IEEE VR 2021, TVCG)

In this paper, we show how to design an intuitive training system for acquiring table tennis spin-shot skills using different cues in virtual reality (VR). In an initial study comparing real-world training with VR training, we showed the effect of VR training and obtained insights about augmentation for training spin shots. We then improved the training system by adding three new conditions using different visualizations and temporal distortions, as well as a haptic racket for realistic feedback.
Poster

BackHandPose: 3D Hand Pose Estimation for a Wrist-worn Camera via Dorsum Deformation Network

Erwin Wu, Ye Yuan et al.   (UIST 2020)

We propose a vision-based 3D hand pose estimation framework using a wrist-worn camera. The main challenge is the oblique angle of the wrist-worn camera, which makes the fingers scarcely visible, so a network must instead observe deformations on the back of the hand. We introduce DorsalNet, a two-stream convolutional neural network that regresses finger joint angles from spatio-temporal features of the dorsal hand region (the movement of bones, muscles, and tendons).
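DorsalNet's exact layers are specified in the paper; the PyTorch sketch below only illustrates the general two-stream pattern it builds on: a spatial stream on the dorsal-hand image, a temporal stream on frame differences, and a fused head that regresses joint angles. All layer sizes and the 20-angle output are illustrative assumptions:

import torch
import torch.nn as nn

class TwoStreamRegressor(nn.Module):
    """Illustrative two-stream CNN (not the published DorsalNet): a
    spatial stream sees the dorsal-hand frame, a temporal stream sees
    the frame difference; fused features regress finger joint angles."""
    def __init__(self, n_angles=20):
        super().__init__()
        def make_stream():
            return nn.Sequential(
                nn.Conv2d(1, 16, 5, stride=2, padding=2), nn.ReLU(),
                nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(4), nn.Flatten(),
            )
        self.spatial = make_stream()
        self.temporal = make_stream()
        self.head = nn.Sequential(
            nn.Linear(2 * 32 * 4 * 4, 128), nn.ReLU(),
            nn.Linear(128, n_angles),   # joint angles (count assumed)
        )

    def forward(self, frame, frame_diff):
        fused = torch.cat([self.spatial(frame),
                           self.temporal(frame_diff)], dim=1)
        return self.head(fused)

model = TwoStreamRegressor()
angles = model(torch.randn(1, 1, 64, 64), torch.randn(1, 1, 64, 64))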
Poster

FuturePong: Real-time Table Tennis Trajectory Forecasting using Pose Prediction Network

Erwin Wu, Hideki Koike    (CHI 2020)

This work introduces a real-time table tennis forecasting system using a long short-term pose prediction network.
Our system predicts the landing point of a serve before the ball is even hit, using the player's previous and current motions captured with a single RGB camera. In the precision evaluation, our system shows acceptable accuracy, with a maximum error of 8.9 cm.
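As a rough illustration of the prediction pipeline, the PyTorch sketch below feeds a sequence of 2D body keypoints into an LSTM and regresses a 2D landing point on the table. The keypoint count, layer sizes, and output parameterization are assumptions, not the published network:

import torch
import torch.nn as nn

class LandingPointPredictor(nn.Module):
    """Illustrative sketch (not the published network): an LSTM reads a
    sequence of 2D body keypoints and regresses the ball's landing
    point on the table, before the serve is actually hit."""
    def __init__(self, n_keypoints=17, hidden=128):
        super().__init__()
        self.lstm = nn.LSTM(input_size=n_keypoints * 2,
                            hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, 2)   # (x, y) on the table plane

    def forward(self, poses):              # poses: (batch, frames, kp*2)
        _, (h, _) = self.lstm(poses)
        return self.head(h[-1])            # one landing point per sequence

model = LandingPointPredictor()
xy = model(torch.randn(1, 30, 34))         # 30 frames of 17 keypoints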
Code

How to VizSki: Visualizing Captured Skier Motion in a VR Ski Training Simulator

Erwin Wu, Florian Perteneder, et al.   (ACM VRCAI 2019)

We propose a virtual reality ski training system based on an indoor ski simulator. For training, we captured the motion of professional athletes and replay it to users to help them improve their skills. In two studies, we explore the utility of visual cues for helping users effectively learn the motion patterns of the pro skier.
Code

Opisthenar: Hand Poses and Finger Tapping Recognition by Observing Back of Hand Using Embedded Wrist Camera

Hui-Shyong Yeo, Erwin Wu, Juyoung Lee, et al. (ACM UIST'19)

We introduce a vision-based technique to recognize static hand poses and dynamic finger-tapping gestures from a view of the back of the hand. Our approach employs a camera on the wrist to observe small movements and changes in the tendons, skin, and bones on the back of the hand. We train deep neural networks to recognize these gestures and, in a real-time user test, achieved high accuracies of 89.4% (static poses) and 67.5% (dynamic gestures).
Poster

FuturePose - Mixed Reality Martial Arts Training Using Real-Time 3D Human Pose Forecasting With a RGB Camera

Erwin Wu, Hideki Koike  (IEEE WACV 2019)

We propose a novel mixed reality martial arts training system using deep-learning-based real-time human pose forecasting. Our training system is based on 3D pose estimation using a recurrent-residual neural network with input from an RGB camera. We evaluated the performance of our system when predicting 15 frames ahead in a 30-fps video (0.5 s of forecasting); its accuracy even outperforms some methods that use depth cameras or fabric-based technologies.
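One common way to realize a recurrent-residual forecaster is to let a recurrent cell predict a per-joint displacement that is added to the previous pose, rolled out for the desired horizon. The sketch below follows that pattern with assumed sizes; it is not the paper's exact architecture:

import torch
import torch.nn as nn

class ResidualPoseForecaster(nn.Module):
    """Illustrative recurrent-residual forecaster (sizes and details
    are assumptions, not the paper's network): encode observed 3D
    poses with an LSTM cell, then roll out future frames by predicting
    a displacement that is added to the previous pose."""
    def __init__(self, n_joints=15, hidden=256):
        super().__init__()
        self.cell = nn.LSTMCell(n_joints * 3, hidden)
        self.delta = nn.Linear(hidden, n_joints * 3)

    def forward(self, observed, horizon=15):   # observed: (B, T, joints*3)
        h = observed.new_zeros(observed.size(0), self.cell.hidden_size)
        c = h.clone()
        for t in range(observed.size(1)):      # encode observed motion
            h, c = self.cell(observed[:, t], (h, c))
        pose, future = observed[:, -1], []
        for _ in range(horizon):               # residual rollout
            h, c = self.cell(pose, (h, c))
            pose = pose + self.delta(h)        # pose_{t+1} = pose_t + delta
            future.append(pose)
        return torch.stack(future, dim=1)      # (B, horizon, joints*3)

model = ResidualPoseForecaster()
pred = model(torch.randn(2, 30, 45))           # 15 frames = 0.5 s at 30 fps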
Code

OmniGlobe: An Interactive I/O System For Symmetric 360-Degree Video Communication

Zhengqing Li, Shio Miyafuji, Erwin Wu, et al.  (ACM DIS 2019)

To solve the problem of a narrow field of view during video communication, we introduce OmniGlobe, a novel symmetric full-360° video communication system that incorporates an omnidirectional camera, a full spherical display, and several visual and interactive techniques. Our system is effective in reducing the inconvenience of observing the remote environment and increases remote-space awareness and the user's gaze awareness to support remote collaboration.
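Feeding an omnidirectional camera to a spherical display requires a mapping between equirectangular video pixels and directions on the sphere. The sketch below shows the standard conversion; OmniGlobe's actual calibration and rendering pipeline are more involved:

import numpy as np

def equirect_to_direction(u, v, width, height):
    """Map an equirectangular pixel (u, v) to a unit direction on the
    sphere: longitude spans [-pi, pi], latitude spans [-pi/2, pi/2].
    (Standard conversion, not OmniGlobe's exact calibration.)"""
    lon = (u / width - 0.5) * 2.0 * np.pi
    lat = (0.5 - v / height) * np.pi
    return np.array([
        np.cos(lat) * np.sin(lon),   # x
        np.sin(lat),                 # y (up)
        np.cos(lat) * np.cos(lon),   # z (forward)
    ])

d = equirect_to_direction(960, 540, 1920, 1080)  # image center -> forward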
Code

BitoBody: Real-time human contact detection and dynamic projection system

Erwin Wu, Mitski Piekenbrock, Hideki Koike   (ACM AH 2019)

This is a novel human body contact detection and projection system with a dynamic mesh collider. Motion capture cameras and generated 3D human models are used to detect contact between users' bodies. For precise detection, we developed an algorithm that divides body meshes into small polygon pieces and performs collision detection on them. The maximum deviation of the damage projection is about 7.9 cm under a 240-fps OptiTrack mocap system.
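The system's precise contact detection rests on subdividing body meshes and testing the small pieces against each other. As a simplified illustration (not the paper's algorithm), the sketch below broad-phase-tests axis-aligned bounding boxes of individual triangles between two bodies; the surviving candidate pairs would then go through an exact triangle-triangle test:

import numpy as np

def triangle_aabbs(vertices, faces):
    """Axis-aligned bounding box of every triangle in a body mesh.
    vertices: (V, 3) positions, faces: (F, 3) vertex indices."""
    tris = vertices[faces]                      # (F, 3, 3)
    return tris.min(axis=1), tris.max(axis=1)   # per-triangle lo, hi

def candidate_contacts(mesh_a, mesh_b, margin=0.01):
    """Broad-phase contact test between two bodies: report triangle
    pairs whose boxes overlap within `margin` meters. An exact
    triangle-triangle test would confirm each candidate pair."""
    lo_a, hi_a = triangle_aabbs(*mesh_a)
    lo_b, hi_b = triangle_aabbs(*mesh_b)
    overlap = np.all((lo_a[:, None] <= hi_b[None] + margin) &
                     (lo_b[None] <= hi_a[:, None] + margin), axis=-1)
    return np.argwhere(overlap)                 # (n_pairs, 2) indices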
Poster
Code
To see other research from our lab, please visit the home page of Koike Lab!