Physical Therapy Background
Physical therapists currently provide significant assistance to patients recovering from accidents and injuries. Typically, a physical therapist will work with the patient twice a week, conducting a 45-minute session each time. During this session, the physical therapist assesses the individual patient's range of motion and strength. The physical therapist will then outline a preferred set of exercises in order for the patient to gain further control of various muscle groups. Normally, the physical therapist will provide an exercise regimen for the patient to undergo while the physical therapist is not present. It is quite common for the physical therapist to demonstrate the exercises to the patient and coach the patient in their execution to ensure that the patient understands them properly, and then request that the patient carry out each exercise a number of times each day. For example, the patient might be instructed to move an arm through a certain range of movement for ten repetitions at three separate times during the day. During the next visit, the physical therapist will assess whether the patient's range of motion has improved and prescribe progressively more advanced exercise regimens in order to restore full mobility.
Physical Therapy Instruction
When a physical therapist (or assistant) is with a patient, the patient is observed for quality and quantity of therapeutic exercise. The patient is frequently assigned exercises to perform on her own. The instructions consist primarily of paper sketches showing each exercise step, sometimes with patient-specific modifications noted in the sketch. Enhancements to these instructions can include two-dimensional photos or videos of a model performing the assigned exercise. Patient-specific instructions are still noted separately because the model represents an “ideal” or “generic” standard, rather than an exercise specifically tailored for the individual patient. Unfortunately, the physical therapist does not currently have a way to easily monitor the exercises that the patient performs on her own.
Exercise or Fitness Instruction
When a person learns an exercise, he seeks to imitate an athlete or a model performing the exercise via a picture, a video, or live in a group or private exercise class. A live instructor may be able to provide real-time feedback to the student, possibly making unique modifications for that person. Automated exercise feedback attempts to duplicate real-time feedback with various apparatuses, but the goal is still an “ideal” or “generic” standard.
Video Games
Since Microsoft Corporation released the Kinect™ motion sensing input device in November 2010, numerous software games have used the Kinect™ to analyze a user's movements in three dimensions and compare them to a pre-defined goal. One example is the Dance Central™ music video game by Harmonix Music Systems (See U.S. Pat. Pub. No. 2012/0143358). Another example is the Your Shape: Fitness Evolved system by Ubisoft Entertainment. Other examples are themed by sport, reflecting ball movement in tennis, for example, based on the movement of the arm during the swing. In each case the goals are pre-determined, and comparisons, normalized for varying body sizes, are made to an “ideal” standard.
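The size-normalized comparison to an “ideal” standard described above can be sketched as follows. This is a minimal illustration only, not the method used by any of the games named here: the joint names, the choice of torso length as the scale factor, and the functions normalize and pose_error are all assumptions.

```python
import math
from typing import Dict, Tuple

Vec3 = Tuple[float, float, float]
Pose = Dict[str, Vec3]  # hypothetical joint name -> (x, y, z) position


def normalize(pose: Pose) -> Pose:
    """Move the hip to the origin and scale the pose so the torso has length 1.

    Assumes the pose contains "hip_center" and "shoulder_center" joints.
    """
    hip = pose["hip_center"]
    torso = math.dist(hip, pose["shoulder_center"])
    if torso == 0:
        raise ValueError("degenerate pose: torso length is zero")
    return {
        name: tuple((c - h) / torso for c, h in zip(point, hip))
        for name, point in pose.items()
    }


def pose_error(user: Pose, ideal: Pose) -> float:
    """Mean distance between corresponding joints after normalization."""
    u, i = normalize(user), normalize(ideal)
    shared = u.keys() & i.keys()
    return sum(math.dist(u[j], i[j]) for j in shared) / len(shared)


# Hypothetical "ideal" pose, and the same pose from a larger user standing elsewhere.
ideal_pose = {
    "hip_center": (0.0, 0.0, 0.0),
    "shoulder_center": (0.0, 0.5, 0.0),
    "hand_right": (0.4, 0.5, 0.0),
}
larger_user = {
    name: (1.2 * x + 1.0, 1.2 * y + 0.1, 1.2 * z + 2.0)
    for name, (x, y, z) in ideal_pose.items()
}
```

Because both poses are rescaled by their own torso length, the larger user matches the ideal pose despite the difference in body size and position. The goal itself remains generic, which is the limitation noted above.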
Other examples of video game technology that preceded the Kinect are the Nintendo Wii™ Remote (See U.S. Pat. No. 7,927,210), the Nintendo Wii Fit™ Balance Board (See U.S. Pat. No. 8,152,640), and the Sony PlayStation Move™ (See EP 2356545). As shown in the referenced patents, rehabilitation video games have been created based on this technology. Video games based on the Nintendo or Sony gaming systems require the user to either hold a sensor or stand on a Balance Board sensor.
Sensor-Based Systems
A sensor-based system is illustrated schematically in FIG. 1A. This system has at least one sensor array 110 with a color sensor camera 111, an infrared (IR) emitter 112, a depth sensor 113, an audio sensor 114, and a processor (not shown). In one embodiment, the RGB camera 111 delivers a three-color (Red, Green, Blue) image stream. The infrared emitter 112 combines with the infrared depth sensor 113 to deliver a depth stream. These data streams provide a computer system 130 the ability to recognize objects in the camera's field of view in three dimensions. The multi-microphone audio sensor 114 parses voices and sound input, while simultaneously extracting and nullifying ambient noise, delivering an audio stream. A processor with commercially-available, proprietary software can coordinate these input streams.
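As a rough illustration of how a depth stream can yield positions in three dimensions, the sketch below back-projects a single depth pixel into camera coordinates using a pinhole camera model. The model, the function name, and the intrinsic parameters (fx, fy, cx, cy) are assumptions made for illustration; they are not taken from the sensor array's software, which may use a different calibration.

```python
from typing import Tuple


def depth_pixel_to_point(
    u: float, v: float, depth_m: float,
    fx: float, fy: float, cx: float, cy: float,
) -> Tuple[float, float, float]:
    """Convert a depth pixel (u, v) with depth in meters to an (x, y, z) point.

    fx, fy are focal lengths in pixels and (cx, cy) is the principal point.
    The result is in the camera frame, with y following the image row
    direction and z pointing away from the sensor.
    """
    if depth_m <= 0:
        # Assumption: non-positive depth marks a pixel with no valid reading.
        raise ValueError("no valid depth reading for this pixel")
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


# Hypothetical intrinsics for a 640x480 depth image.
FX, FY, CX, CY = 580.0, 580.0, 320.0, 240.0
center_point = depth_pixel_to_point(320, 240, 2.0, FX, FY, CX, CY)
```

A pixel at the principal point maps to a point directly in front of the sensor at the measured depth; pixels farther from the center map to points proportionally farther off-axis.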
FIG. 1A also shows a display 120 and a computer system 130, which may or may not be combined in a single unit such as an all-in-one PC, a laptop, or a tablet. A television connected to the computer system could also serve as the display 120. Physical connectivity is provided by an internal or external system bus such as a USB cable 101 or an HDMI cable 102. Software running on the computer system 130 comprises system software and drivers 132, a Natural User Interface (NUI) Application Programming Interface (API) 133, and one or more application(s) 134. A sophisticated, commercially-available NUI software library and related tools help developers use the rich form of natural input coming from a sensor array to react to real-world events.
Kinect Environment
Microsoft's Kinect™ is a peripheral device that connects as an external interface to Microsoft's Xbox 360™ or to Microsoft Windows™ computers. The Kinect™ and the associated programmed computer or Xbox sense, recognize, and utilize the user's anthropomorphic form so the user can interact with software and media content without the need for a separate controller. A three-dimensional user interface enabled by Kinect™ hardware is disclosed in U.S. Pat. No. 8,106,421, and a general interactive video display system is disclosed in U.S. Pat. No. 7,348,963.
Microsoft provides a proprietary software layer (e.g., U.S. Pat. No. 8,213,080) to realize the Kinect's capabilities. Developers can alternatively use Microsoft's Kinect Software Development Kit (SDK), or various open source software libraries. The Microsoft Kinect SDK will generally be used in the present description.
When the Microsoft Kinect SDK is used with a Kinect for Windows sensor array 110, the computer system 130 should comply with Microsoft's system requirements for hardware 131 and system software 132. Details of the Microsoft Kinect SDK environment are referenced on the Microsoft website, http://msdn.microsoft.com/en-us/library/jj131023.aspx (webpage visited on Nov. 16, 2012), and can be summarized as follows:
Hardware 131: 32 bit (x86) or 64 bit (x64) dual-core 2.66-GHz or faster processor; 2 GB RAM; dedicated USB 2.0 bus; graphics card that supports DirectX 9.0c; Microsoft Kinect for Windows sensor.
System software and drivers 132: Microsoft Windows 7 or 8, including the APIs for audio, speech, and media; DirectX end-user runtimes (June 2010); Kinect microphone array and DirectX Media Object (DMO); audio and video streaming controls (color, depth, and skeleton); device enumeration functions that enable more than one Kinect.
Kinect NUI API 133: Skeleton tracking, audio, color and depth imaging.
Application(s) 134 may require or benefit from additional hardware 131, such as local data storage, audio output devices, or Internet connectivity. Additional system software and drivers 132 may be required to support this hardware. An application 134 may also employ other features supported by system software, such as the Windows Graphical User Interface (GUI) libraries.
Skeletal Tracking
A sensor-based system can provide a framework for determining positional information of a user's body, capturing motion for purposes of analysis. Various systems exist for capturing motion through sensors. For example, a system combining a camera with a depth sensor can be used to determine positional information about the user's body in three dimensions and produce a skeleton model. In other systems, transducers attached to the user's body are used to detect the positions of the user's limbs and produce a skeleton model. Other systems use infrared pointing devices, other motion tracking peripherals, or multiple cameras to enhance positional information in three dimensions.
As used herein, the terms “joint”, “bone”, and “skeleton” are intended to have the meaning that one of skill in the art of motion capture and animation would ascribe to them. For example, a skeleton can comprise bones, but the number of bones and their positions are a function of the motion capture equipment and software libraries, and may differ from the number and positions of bones that an anatomist or physician would recognize in a human skeleton. Similarly, a joint can be the distal endpoint of a single bone (e.g., a fingertip or the head), and need not necessarily be at a point where two bones come together.
As schematically illustrated in FIG. 1B, a typical skeletal model 150 is a collection of joints 151 through 170, and lines representing the bones connecting the joints. The model's output is a data structure that includes coordinates describing the location of each joint, together with the lines that connect the joints to represent the bones of a human body. An example of skeletal model representation and generation can be found in US Patent Application Publication US 2010/0197399 to Geiss.
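A skeletal model of this kind, joints with coordinates plus bones connecting pairs of joints, might be represented as in the sketch below. The Skeleton class, its method names, and the joint names are hypothetical and are not drawn from the Geiss publication or any particular library. The joint_angle helper is included only to suggest how such a structure could support a range-of-motion measurement.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]


@dataclass
class Skeleton:
    joints: Dict[str, Vec3]        # joint name -> (x, y, z) location
    bones: List[Tuple[str, str]]   # each bone connects two named joints

    def bone_length(self, bone: Tuple[str, str]) -> float:
        """Distance between the two joints at the ends of a bone."""
        a, b = bone
        return math.dist(self.joints[a], self.joints[b])

    def joint_angle(self, first: str, middle: str, last: str) -> float:
        """Angle in degrees at `middle` between the bones to `first` and `last`."""
        m = self.joints[middle]
        v1 = [a - b for a, b in zip(self.joints[first], m)]
        v2 = [a - b for a, b in zip(self.joints[last], m)]
        n1, n2 = math.hypot(*v1), math.hypot(*v2)
        if n1 == 0 or n2 == 0:
            raise ValueError("zero-length bone; angle is undefined")
        cos_theta = sum(a * b for a, b in zip(v1, v2)) / (n1 * n2)
        # Clamp to guard against rounding slightly outside [-1, 1].
        return math.degrees(math.acos(max(-1.0, min(1.0, cos_theta))))


# A hypothetical three-joint arm with the elbow bent at a right angle.
arm = Skeleton(
    joints={
        "shoulder_right": (0.0, 0.0, 0.0),
        "elbow_right": (0.3, 0.0, 0.0),
        "wrist_right": (0.3, 0.25, 0.0),
    },
    bones=[("shoulder_right", "elbow_right"), ("elbow_right", "wrist_right")],
)
elbow_angle = arm.joint_angle("shoulder_right", "elbow_right", "wrist_right")
```

Storing bones as pairs of joint names keeps the structure independent of how many joints a given sensor or library reports.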
The Microsoft Kinect NUI API 133 includes skeletal tracking as described in US 2012/0162065 to Tossell, et al. In addition, it provides bone orientation in two forms. One is an absolute orientation in Kinect camera coordinates using the depth stream to provide raw data. The second is a hierarchical rotation, based on a bone relationship as defined in the skeleton joint structure. In the reference http://msdn.microsoft.com/en-us/library/hh973073.aspx (webpage visited on Nov. 16, 2012), Microsoft provides a detailed explanation of joint orientation and hierarchical rotation for developers of avatar animation, the stated target audience.
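The relationship between the two forms of bone orientation can be sketched with plain rotation matrices. The code below does not use or reproduce the Kinect NUI API; the function names are hypothetical, and it assumes the common animation convention that a bone's absolute rotation equals its parent's absolute rotation multiplied by the bone's own hierarchical (parent-relative) rotation.

```python
import math
from typing import List

Matrix = List[List[float]]  # 3x3 rotation matrix as nested lists


def rot_z(degrees: float) -> Matrix:
    """Rotation about the z axis by the given angle."""
    c, s = math.cos(math.radians(degrees)), math.sin(math.radians(degrees))
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transpose(m: Matrix) -> Matrix:
    return [list(row) for row in zip(*m)]


def absolute_from_hierarchical(chain: List[Matrix]) -> List[Matrix]:
    """Accumulate parent-relative rotations from the root bone outward."""
    current = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    absolute = []
    for local in chain:
        current = matmul(current, local)
        absolute.append(current)
    return absolute


def hierarchical_from_absolute(parent_abs: Matrix, child_abs: Matrix) -> Matrix:
    """Recover a bone's parent-relative rotation from two absolute rotations.

    The inverse of a rotation matrix is its transpose.
    """
    return matmul(transpose(parent_abs), child_abs)


# Hypothetical two-bone chain: upper arm rotated 30 degrees relative to the
# root, forearm rotated a further 60 degrees relative to the upper arm.
upper_abs, forearm_abs = absolute_from_hierarchical([rot_z(30), rot_z(60)])
forearm_local = hierarchical_from_absolute(upper_abs, forearm_abs)
```

In this example the forearm's absolute rotation is the combined 90 degrees, while its hierarchical rotation is the 60 degrees it turns relative to the upper arm.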
Comparison to High-End Sensor Technology
The Kinect sensor is sufficiently accurate to use in physical rehabilitation. Researchers at the University of Southern California (USC) recently published a comparison of physical rehabilitation using the Kinect versus the substantially more expensive NaturalPoint OptiTrack™ optical system. Physical rehabilitation software is not currently commercially available for either system; the USC researchers wrote custom software specifically to compare External Rotation tracking on both systems. “External Rotation” is a common therapeutic exercise for injured shoulders, well known to physical therapists. The authors noted that the expensive, highly accurate OptiTrack system requires many reflective sensors to be attached to the patient's body, limiting the mobility and comfort of patients. The experimental results “showed that the Kinect can achieve competitive motion tracking performance as OptiTrack and provide ‘pervasive’ accessibility” for patients. See Chien-Yen Chang et al., “Towards Pervasive Physical Rehabilitation Using Microsoft Kinect,” International Conference on Pervasive Computing Technologies for Healthcare (Pervasive Health), San Diego, Calif., USA, May 2012.
There is an unfilled need in physical therapy for an improved device which can both monitor an exercise that is specifically designed for an individual patient, and provide enhanced real-time feedback to the patient. Better feedback could result in better patient compliance and better outcomes, measured in healing time and reduced re-injuries, which could reduce healthcare costs and enable physical therapists to treat more patients.