|
When I talked about gesture being an important input type and demonstrated it through http://www.youtube.com/watch?v=tbA55znyGXA[^], we had some very interesting discussions on the subject.
I thought, why not try something more complicated than just taking control of the mouse? So I built a handwriting recognition system, wrote a small math parser, wrote a library for mapping hand positions to gesture points, and built a simple prototype called Gescal.
http://www.youtube.com/watch?v=-gFh7kJX4_w[^]
This time, however, I also logged the recognition accuracy of the SDK to see how feasible the system is. From this initial version, it looks like, if we give gesture a real shot, it might well prove to be one of the most preferred inputs in the coming future.
As suggested during my last video in the series, this time I have tried to keep the display quality high and the audio clean.
Please share your thoughts. Please don't go into recognition accuracy; consider that the application was developed in only a week and still has a long way to go, with several rounds of testing and case studies. I would appreciate your input on what can be improved and how the system could be made better, and I also invite your discussion.
Thank You
|
|
|
|
|
"Talk to the hand" maths input?
Sounds good to me.
I wanna be a eunuchs developer! Pass me a bread knife!
|
|
|
|
|
Or maybe "Talk through Hands"?
|
|
|
|
|
I'd love to see this as an article. How are you managing mapping the gestures/geonodes to physical space, for instance?
|
|
|
|
|
Quote: I'd love to see this as an article. How are you managing mapping the gestures/geonodes to physical space, for instance?
Pete, actually I am writing a series of articles to elaborate every minute detail of my various work with the SDK.
As far as mapping is concerned, I have mapped the 320x240 camera space to the actual screen width and height (resolution independent), which I found out using some PInvoke calls.
I have reduced the frame rate to 10fps to synchronize human hand movement smoothly with the mouse.
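In essence the mapping is a simple linear scale from camera space to screen space. Something like this (a Python sketch just for illustration; the real code is C# and gets the screen dimensions via PInvoke, e.g. a GetSystemMetrics call):

```python
def camera_to_screen(cx, cy, screen_w, screen_h, cam_w=320, cam_h=240):
    """Linearly scale a point in 320x240 camera space to screen space.

    screen_w/screen_h are assumed to come from the platform (e.g. a
    GetSystemMetrics PInvoke call), so the mapping stays resolution
    independent.
    """
    return int(cx * screen_w / cam_w), int(cy * screen_h / cam_h)

# The camera centre lands on the screen centre:
print(camera_to_screen(160, 120, 1920, 1080))  # (960, 540)
```

Every hand position reported by the camera then maps directly to a cursor position, regardless of the monitor's resolution.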
|
|
|
|
|
Ah hah. Sounds good - I'd been playing about with the world space to simulate zoom in/out, that's been fun.
|
|
|
|
|
I should have added that if you need someone to proofread, I'd be more than happy to take a look and offer advice if necessary.
|
|
|
|
|
|
Sweet. I'll have a look at these tonight - right now I've got images panning left and right and zooming in and out based on hand movements. The voice recognition has real problems with my accent so I'm putting a training module in place to let the application learn what the different commands are with my voice. If I get time, I'll put this together with facial recognition so that different people can maintain their own dictionaries.
Based on what I saw of your earlier YouTube videos I'd say that you have to be in with a great chance. Good luck my friend.
|
|
|
|
|
Way back in 2005 I developed a desktop client for Yahoo which was purely voice driven. I used dictation mode and simple parsing to embed commands, like "START START" to start composing, "RIGHT RIGHT" for forward navigation, and so on. It started with system boot and synced received mails, and with the proper command it started reading out new mails. It was developed with some SAPI 4.x version, but it worked great in XP and still does. This SDK's voice recognition module, however, really is not smooth; even after a good bit of training, it is not even close to the 2005 SAPI. So I dropped voice from my agenda.
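The parsing trick was just to treat a word dictated twice in a row as a command. Roughly like this (a Python sketch for illustration only; the action names here are made up, and the real client was built on SAPI):

```python
# Hypothetical command table: "START START" and "RIGHT RIGHT" are the
# real command phrases, but these action names are illustrative.
COMMANDS = {"START": "start_compose", "RIGHT": "navigate_forward"}

def parse_command(transcript):
    """Scan a dictated transcript for a command word spoken twice
    in a row, e.g. 'START START' -> start composing."""
    tokens = transcript.upper().split()
    for a, b in zip(tokens, tokens[1:]):
        if a == b and a in COMMANDS:
            return COMMANDS[a]
    return None  # plain dictation, no command embedded
```

Doubling the word keeps ordinary dictation from accidentally triggering commands, since people rarely dictate the same word twice in a row.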
I also tried a mixed-mode module, Face + Hand, sharing the session between two separate threads, one for hand gesture and one for face. It reduced the frame rate to about 4fps per module. Are you able to work with both face and hand together at a good speed?
I made about 11 apps because I loved the concept of perception and wanted to check the integration with every possible usage: Ultrabook, mathematics, modelling, kids, security.
------------------------------------------------
Right now I am working on one of the most complicated pieces of work I have always wanted to do.
I take the depth image first. I have written a simple module that converts this depth map image to a point cloud [x y z]. Then I capture the RGB image, so I know the color value of each point. When you hold something close to the camera, its model is extracted by thresholding the Z axis (distance from the camera). I have developed a custom 3-d pattern, which is a Model+Texture descriptor.
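The extraction step boils down to a distance threshold over the depth map. A rough Python sketch of the idea (illustrative only; the real module reads the SDK's depth and RGB streams, and x/y here are raw pixel coordinates, not calibrated world units):

```python
def depth_to_colored_cloud(depth, rgb, z_max):
    """Turn a depth map plus an aligned RGB image into a colored
    point cloud [x, y, z, color], keeping only points closer to
    the camera than z_max (the thresholding step)."""
    cloud = []
    for y, row in enumerate(depth):
        for x, z in enumerate(row):
            if 0 < z < z_max:  # foreground: the object held up close
                cloud.append((x, y, z, rgb[y][x]))
    return cloud
```

Everything that survives the threshold is the held-up object; the background simply falls away, and each surviving point already carries its texture color.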
I am building a library of these models. The user can then create a 3-d scene by importing the models, changing their sizes, drawing with their hands, and so on. So there is no need to spend an hour with Blender to create a model of a stone: get a stone, hold it before the app, and your model and texture are ready. It is rather more complicated than I thought it would be. But we small firm runners thrive on our desire to learn and do fancy new things. That separates us and gives us jobs that bigger companies with monkey employees fail to deliver.
I will keep in touch with you, Pete. You, Chris, Sacha, Dave, you guys are our role models.
|
|
|
|
|
I'm throttling the events I get out of the gesture LoopFrames using Rx to bring it down to about 10FPS, as it's fighting with voice for context. I will be fine-tuning it as I go on. At first I was raw-processing the gestures, ignoring the pipeline, but the throughput was so fast that it overwhelmed the XAML renderer.
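The throttling is essentially time-based sampling. In spirit it is something like this (a Python sketch of the idea only; the real code uses Rx operators over the gesture frame stream):

```python
import time

class FrameThrottle:
    """Pass at most one frame per interval and drop the rest,
    similar in spirit to sampling a stream with Rx."""

    def __init__(self, fps=10, clock=time.monotonic):
        self.interval = 1.0 / fps
        self.clock = clock      # injectable for testing
        self._last = None

    def allow(self):
        now = self.clock()
        if self._last is None or now - self._last >= self.interval:
            self._last = now
            return True
        return False  # frame dropped
```

Each incoming gesture frame asks allow(); only frames that return True reach the renderer, capping the UI work at roughly 10FPS no matter how fast the camera pipeline pushes frames.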
|
|
|
|
|
I don't know if I can pick one that I like the most. The FaceAnalytics was entertaining, but HeadRacing was fun and Ges3Draw is a topic that's close to my heart. You should be proud of yourself for your work - one package would be good, but 6 is outstanding.
|
|
|
|
|
Quote: You should be proud of yourself for your work
I am, Pete. I have a wife and a kid and a start-up (which is still a start-up after 10 years). I lost my parents when I was less than 2 months old and was brought up in an orphanage. I had to work 8 hours a day in my engineering days to raise money for my studies. I resigned from a job after 15 days at work as there was no creativity. I have no relatives and not many friends, so I have made peace with my life. For the last 10 years, all I do is work 16 hours a day. I find that creating new things (robotics, Pi, Arduino, ECG, .NET, image processing) and continuing to learn is a better way to keep my sadness aside than drinking it away.
Not everybody's life is the same. No complaints, no sorrow. Learning is the best way of forgetting what I did not have in life. It felt so good when you appreciated the work. Thanks, Pete. Actually I built 11 prototypes.
This morning (my morning starts at 1 AM) I finished off the modelling. So I actually extracted the point cloud from the RGB depth data and am able to render that model in a 3d coordinate system. I want to develop it into a great modelling tool, so you don't have to spend two hours with Blender to model a stone. Get a stone, hold it before my prototype, and it extracts the 3d model. Now you rotate it in all directions, and I get different 3d models of different views of the same object. I then extract all the iso-surfaces from all the views and connect them through a mesh to get a single 3d model and its texture, ready to use with XNA and others. So if you want a car in a gaming project, buy a fancy toy car, extract its model, and import it into your game.
I may not be able to complete it entirely by the 20th, but nonetheless, it is fun.
|
|
|
|
|
Hello Pete. How is it going with your work? Are you working with images? I built a complete set of imaging functions in WPF for ImageGrassy: RGB, gray, writing on images, converting between BitmapImage and GDI. Do let me know if you want any code on image processing (though I feel stupid offering the Pete O' code help, I would rather be stupid than be of no use).
Your blog is awesome, man. You are some programmer. It will take me another 5 years to reach the programming level you are at right now.
|
|
|
|
|
Don't put yourself down my friend. You have some serious coding chops. Thanks for the offer but I am okay with imaging functionality. The contest started last night so I have only just started the app proper. I like what you have done with your code. I never even considered the idea of controlling the mouse. That was a seriously cool idea.
|
|
|
|
|
Hello Pete. How are you doing? I am working with 2-D augmented reality in perceptual computing now. I have used Dave Karr's card engine to build an augmented-reality Solitaire where I put the depth map right in the scene and perform the play.
http://www.youtube.com/watch?v=wCOjuPdBooI[^]
I also completed the preliminary prototype GesModello: holding an object in front of the camera to extract its 3d geometry.
http://www.youtube.com/watch?v=9WXYnDI6Nws[^]
I also worked with Sacha's YouTube client. It has come out as a very fun, nice little app.
http://www.youtube.com/watch?v=osvg38RZ4Pg[^]
When you get time, have a look at them and do let me know what you feel about the work.
Right now I am working on another theme in XNA. I call it GesPark. It is a sort of fun-with-gesture app: a 2-D park scene with water, clouds, trees, apples on the trees, and fish in the ponds.
1. I can hide the sun, and slowly the scene turns into an evening scene. Stars become visible, and I can increase the number of stars with a finger movement.
2. I can hit the clouds and it starts raining.
3. I can hit the tree to make the apples fall down.
4. Waving hands fast creates a storm.
5. There are balloons. If I cut a string with a finger movement, the balloon flies away. I can pinch a flying balloon to make flowers come out of it.
6. I can draw different particles.
7. I can wave a hand in the pond water; the water texture effect increases the flow and the fish start jumping.
Actually I want to build something for my 21-month-old kid. He loves spending time with my 6-year-old Acer laptop and does funny things. I thought I would tell you about this.
I am also working with volumetric rendering to create a 3-d skin model from medical image sequences like CT scans.
I have put all my clients on hold till the 20th.
|
|
|
|
|
|
Nice. Have you managed to get your submissions in OK? There seem to be a lot of unhappy people having problems with the submission process.
|
|
|
|
|
Fortunately, Intel seems to be valuing my work. A lady named Rebecca from the competition team is in constant touch with me. I had actually missed a couple of YouTube links in two submissions; she mailed me personally and got it corrected from the back end.
As I see it, I am probably the craziest of the lot, and as it looks now I will account for more than 50% of the entries. So far it is 14 out of 27 valid submissions, and I am on my way to submitting another 3, so it will be 17 total submissions.
|
|
|
|
|
Good luck. I'm rooting for you.
|
|
|
|
|
Thanks Pete.
I just finished my 45 days of madness and 45 days of coding fun, wrapping it up with a work for my kid and for all those kids hidden in our hearts.
Please do watch this. Please.
http://youtu.be/AiRZ3spA0Z4[^]
And please tell me what you feel about the concept.
|
|
|
|
|
Looks very nice. Does your child enjoy it?
|
|
|
|
|
For the last 5 hours all he has been doing is sitting in front of my Ultrabook park and waving his hand. Prize or no prize, one day of his happiness is worth many such sleepless nights.
|
|
|
|
|
That's great. You have a winner when that happens.
Are you going to post the app on the AppUp store? I would, if I were you.
|
|
|
|
|
Intel had a precondition that all the apps must be made freely available, so I guess they will be releasing them once the contest is over. Actually, I tested all the apps with AppUp in mind.
Even if they do not, I will share this entire work on CodeProject anyhow.
You know what, Pete, today out of nowhere I received a development kit from Leap Motion. I had never applied. I do not know how they got my address in India, or how they sent one without any mail or anything.
The guys have created great hardware and software. Even my wife is calling me with a wave signal. Ha ha.
How about you? What kind of work are you into? It must be special. I did all sorts of things with the SDK but failed to get face and fingers working together. Can you tell me why I am unable to get 20fps for both face and fingers?
I have taken one session and am querying it from two background processes concurrently, but it is so slow. Can you shed some light on this? If I can manage face and hand together, I can build some more of the things I had in mind.
|
|
|
|