Archive for November, 2006

Look, ma, I am using my hands!

Monday, November 20th, 2006

Make mouse selections and move objects around on screen with your hand. Hand motions are tracked with the CamShift algorithm using color histograms. A fun 2-hour project. :) Check out the result:
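For the curious, the step CamShift repeats every frame is a mean shift: move the search window to the centroid of the histogram back-projection (a per-pixel skin-color probability map) under it. Below is a minimal pure-Python sketch of just that step; all names and sizes are illustrative, and the real project presumably uses OpenCV's `cvCamShift`, which additionally adapts the window size and orientation:

```python
# Mean shift over a back-projection map: prob[y][x] is a skin-color
# probability in [0, 1].  Illustrative sketch, not the project's code.

def mean_shift(prob, x, y, w, h, iters=20):
    """Move an (x, y, w, h) window to the centroid of the probability
    mass under it until it stops moving."""
    rows, cols = len(prob), len(prob[0])
    for _ in range(iters):
        m00 = m10 = m01 = 0.0   # zeroth and first image moments
        for j in range(max(0, y), min(rows, y + h)):
            for i in range(max(0, x), min(cols, x + w)):
                p = prob[j][i]
                m00 += p
                m10 += p * i
                m01 += p * j
        if m00 == 0:            # no probability mass under the window
            break
        # Re-center the window on the centroid (m10/m00, m01/m00).
        nx = int(round(m10 / m00 - w / 2))
        ny = int(round(m01 / m00 - h / 2))
        if (nx, ny) == (x, y):  # converged
            break
        x, y = nx, ny
    return x, y

# A 3x3 blob of high probability at (12, 12): the window locks onto it.
prob = [[0.0] * 20 for _ in range(20)]
for j in range(12, 15):
    for i in range(12, 15):
        prob[j][i] = 1.0
print(mean_shift(prob, 8, 8, 6, 6))  # window re-centers over the blob
```

CamShift then re-estimates the window size from the moments before the next frame; that part is omitted here.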


YouTube link to: A hand-mouse interface

Motivation for the ASSIST project

Friday, November 17th, 2006

The United States is about to experience the greatest demographic change in its history. Most of this change will occur over the next 30 years, as 77 million baby boomers cease to work and pay payroll taxes and instead start to retire and collect benefits.

The problems arising from this change in demographics include spiraling health care costs and shortages of trained nurses and doctors (Zarit, 1998). The percentage of elders in the population will increase dramatically as the first segment of the baby boomer cohort turns 65 in 2011 (Hobbs, 2002). Many of these elders will want to remain at home, but with aging come higher rates of functional and cognitive deficits (Gist, 2004) that, in many cases, result in limitations in at least one activity of daily living (Kassner, 2006). Demographics for veterans follow similar patterns: there were 9.8 million veterans over 65 in the 2000 census (Older Americans 2000 Update, 2006), and the number aged 85 and older will almost triple from 510,000 to 1.3 million by 2010 (Atizado, 2004). As the Veterans Administration (VA) shifts the locus of care to the community, gaps already exist in veterans' access to non-institutional care. At the same time, veterans are more likely to be older, disabled, and have lower incomes than the general population (Atizado, 2004).

[Figure: overtaxing of the traditional “mainframe” approach to medical services]

It is likely that technologies for the home can help relieve the inevitable stress on the medical infrastructure and extend the period of time that elders can live independently. First and foremost, these technologies include assistance with the activities of daily living (ADL): technologies that enhance safety and security, assist with daily medical compliance, and help with client calendars and daily chores such as household cleaning and grocery shopping. Second, elders are susceptible to isolation as they become less mobile (Pin et al., 2005; Michael et al., 1999). Devices that facilitate communication and social relationships between peers, families, and the surrounding community can help these clients remain socially connected. Third, the dearth of trained physicians and nurses, together with a diminished capacity to travel independently, means that more of this population must receive regular medical checkups in their homes from physicians who make virtual house calls rather than in a centralized facility. Technologies in the home that create an appropriate interface between the medical industry and the elderly client can help make efficient use of the medical infrastructure and improve the frequency of care and oversight.

ASSIST project video phone application

Sunday, November 12th, 2006

The video phone application is built on top of existing commercial packages that handle the low-level video encoding/decoding and transmission. This let us focus our design attention on the interface issues involved rather than reinventing the wheel. The design paradigm we chose to follow can be summarized as:

Simplicity, clarity and modality

  • First, the interface has to be simple. Cluttering the interface with extra features that will not be used would only cause confusion and intimidation. Menus and buttons that are not relevant to the video phone functionality are hidden from the user.
  • Second, the interface needs to be clear to an elder. The fonts employed by most instant-messenger systems are generally too small. Our design uses a large font size as well as large picture IDs for contacts, so that they can be easily identified and selected by the user.
  • Finally, we use different input modalities to overcome elders’ varied physical deficiencies. For instance, for people with difficulty using a mouse as a pointing device, a touch screen is overlaid on the monitor to allow pointing directly with the hand. For people with motion disabilities, an alternative input modality can be offered so that they can use their voices to control the application.

[VideoPhone screenshots 1–3]

The video phone application can be used whenever communication with the outside world is required. For instance, elders can use it to engage in face-to-face conversations with family and friends to reduce the sense of isolation and related symptoms such as depression. In a crisis such as a fall, the video phone application is integrated with the fall detection system so that an emergency call is placed directly. The user can specify a priority list a priori in case one of the contacts, such as the user’s neighbor, close relative, or doctor, is absent at the time of the emergency; 911 can be listed as a last resort. The system scans through the list and places calls until someone is reached. This paper will discuss extensions to these applications as a mobile manipulator is integrated into our system.
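The escalation behavior described above is simple to state precisely. Here is a sketch, with hypothetical contact names and a hypothetical `dial()` callback standing in for the real telephony layer:

```python
# Sketch of the priority-list escalation: walk the user's list and stop
# at the first contact who answers.  Names and dial() are hypothetical.

def place_emergency_call(contacts, dial):
    """Try contacts in priority order; return the first one reached,
    or None if nobody (including the last resort) answered."""
    for contact in contacts:
        if dial(contact):          # True means the call was answered
            return contact
    return None

# Example: the neighbor and relative are away, so the doctor is reached.
available = {"doctor", "911"}
priority = ["neighbor", "relative", "doctor", "911"]
print(place_emergency_call(priority, dial=lambda c: c in available))
```

In the real system, `dial()` would place a video call and time out if unanswered before the scan moves on.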

Face Recognition

Thursday, November 2nd, 2006

This article contains two pieces of Face Recognition code:

HMM Face Recognition

This is my adaptation of the HMM face recognition algorithm described in the paper “Face recognition using an embedded HMM” (1999). The original source was found on the Yahoo OpenCV discussion group. This adapted version streamlines the process of training and testing the algorithm.


An HMM approach for face recognition: Hidden Markov Models (HMMs) have been successfully used for speech and action recognition, where the data to be modeled is one-dimensional. Although attempts to use these one-dimensional HMMs for face recognition have been moderately successful, images are two-dimensional (2-D). Since 2-D HMMs are too complex for real-time face recognition, in this paper we present a new approach for face recognition using an embedded HMM and compare this approach to the eigenface method for face recognition and to other HMM-based methods. Specifically, an embedded HMM has equal or better performance than previous methods, with reduced computational complexity.
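The key trick of the embedded HMM is turning the 2-D image into 1-D observation sequences: overlapping blocks are sampled top-to-bottom for the outer (“super”) states and left-to-right within each strip for the embedded states. A minimal sketch of that sampling step follows; block size and overlap are made up, and real implementations also compress each block (e.g. with a 2-D DCT), which is omitted here:

```python
# Illustrative sketch of flattening a 2-D face image into the 1-D
# observation sequences an embedded HMM consumes: one sequence of
# overlapping blocks per horizontal strip, scanned left to right.

def block_sequences(image, bh=4, bw=4, overlap=2):
    """Return one observation sequence (a list of flattened bh x bw
    blocks) per horizontal strip of the image."""
    rows, cols = len(image), len(image[0])
    seqs = []
    for top in range(0, rows - bh + 1, bh - overlap):      # super-states
        strip = []
        for left in range(0, cols - bw + 1, bw - overlap):  # embedded states
            block = [image[top + j][left + i]
                     for j in range(bh) for i in range(bw)]
            strip.append(block)
        seqs.append(strip)
    return seqs

# An 8x8 image yields 3 strips of 3 overlapping 4x4 blocks each.
img = [[r * 8 + c for c in range(8)] for r in range(8)]
print(len(block_sequences(img)))  # 3
```

Each strip's block sequence is then scored by one left-to-right HMM, and the strip scores are combined by the outer top-to-bottom HMM.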

Download and Compile

Download and unzip into the C: drive root directory. The unpacked directory structure should look like:

++FaceRecognition (Core source for HMM face recognition)
++database (faces database)
++FindFaces (generate training images from video sequence)
++FormatConvert (convert sample images into proper pgm format for training)
++testimages (test images after training)

The face recognition project requires Visual Studio .NET to compile. A Linux implementation will be available on request.

Face Recognition
Batch-train the face database by running FaceRecognition with no arguments. (Note: run FaceRecognition in the Visual Studio debug environment, otherwise the program will crash due to a memory bug.)
After training, you may test the result using the images in the testimages directory, a pre-recorded video, or a live video cam stream; see the parameters below for the different test options. When you run the program in test mode, three windows pop up: the “Video” window shows the live cam/test image, the “ID” window displays the recognition result, and the “search” window displays the clipped-out face when testing from a live cam stream.

Can either use a live video camera or a static input image for face recognition.
Syntax: FaceRecognition [choice [input_image_file_name] | --help]
choice=1: recognize face from static input image,
input_image_filename REQUIRED
choice=2: recognize face from pre-recorded video sequence
choice=3: recognize face from LIVE cam
no argument: FaceRecognition will run in batch training mode
--help: display this help message

Adding more people into the face database

  • Record a video sequence of the person sitting in front of the camera, first looking straight ahead, then moving the head slowly from side to side. This makes the training result invariant to head-orientation changes.
  • Run the FindFaces program to generate a sequence of training face images from the recorded video sequence.
  • Run the FormatConvert program to convert the training images into the proper PGM format (they need to be in the grayscale P5 format).
  • Place the resulting training images into the database directory.
  • Run FaceRecognition without arguments to train, WAIT 5 sec, and DONE!

Ravela Face Algorithm
The C++ implementation of the Ravela Face Algorithm can be found on the LPR wiki (LPR wiki access required).

Related paper: