Updated GSoC Proposal

Updated GSoC Proposal

Aditya Bhatt
Hi,

@Alex : This is my GSoC proposal for this project. What do you say?

@Marcel : Many thanks for the feedback. I've updated my proposal and made it more specific.

> Mention on what test data this is based?

We've been testing on our own pictures with frontal faces, and the detection is good. Using more cascades in tandem gives better detection accuracy. For profile and side faces, it is simply a matter of applying more cascades. This increases the accuracy but decreases the speed, and I'll be working on that. There is also a tradeoff between image size and detection quality: I can decrease the image size to dramatically increase detection speed, but accuracy goes down. I have several possible solutions in mind; they will come in subsequent commits.
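As a rough illustration of running several cascades in tandem, here is a minimal sketch. It is not libface code: the `cascades` callables, the `overlap` threshold, and the overlap-based merge are all assumptions made for the example. Each cascade is any function that returns `(x, y, width, height)` boxes, and a box is dropped when it overlaps an already accepted one too strongly. Every extra cascade is another full pass over the image, which is where the speed cost comes from.

```python
from typing import Callable, List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two boxes (0.0 when they do not overlap)."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    ix = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    iy = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = ix * iy
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0


def detect_with_cascades(
    image: object,
    cascades: List[Callable[[object], List[Box]]],  # hypothetical detector callables
    overlap: float = 0.5,                           # assumed duplicate threshold
) -> List[Box]:
    """Run every cascade on the image and keep one box per detected face."""
    merged: List[Box] = []
    for cascade in cascades:
        for box in cascade(image):
            # Keep the box only if no already accepted box covers the same face.
            if all(iou(box, kept) < overlap for kept in merged):
                merged.append(box)
    return merged
```

With a frontal and a profile cascade that both fire on the same face, only the first box for that face survives, while faces found by just one cascade are added to the result.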

> Make clear you're talking about digikam code here.
> (This refers to the current preview widget in digikam. We are (I am) unsure at
> the moment about its technological future, but that's not your problem for
> this proposal.)
 
Ok, I haven't mentioned the Qt4 porting of the preview widget this time. That would have been too much work anyway, I guess.

> There was talk about a Eigen/Fisherfaces and GSL library integration? If
> that's going to be a major chunk of work, mention it.

Ok, mentioned.
 
> You can also mention Nepomuk here, at least as possible integration. PIMO is
> the relevant ontology:
> http://dev.nepomuk.semanticdesktop.org/wiki/PimoOntology

Mentioned now, as optional work.

So here's my new proposal :

----------------------------------------
Project: Face Recognition
----------------------------------------

Name : Aditya Bhatt

E-mail : [hidden email]

Freenode/IRC : Adityab

Location : Ahmedabad, India

Proposal Title: Automatic face detection, recognition, and tagging in digiKam

--------------------------------------
Motivation
--------------------------------------

digiKam has always been my photo management program of choice. When coupled with KIPI, it becomes something akin to the Swiss Army knife of photo management suites.
One of the most requested features for digiKam has been automatic tagging of photographs with the names of the people in them, by detecting faces and recognizing who's who.
In KDE's Bugzilla, this has long been the digiKam feature request with the most votes, and it is the 26th most wanted feature in all of KDE.

digiKam will benefit a lot from this. Picasa's Linux version does not have face recognition, nor does F-Spot. So digiKam will, as far as I know, be the first Linux suite to have this feature.
This feature will be very useful for searching and organizing. After everything is implemented in digiKam, Nepomuk/Akonadi integration can be done.

-------------------------------------
Implementation
-------------------------------------

I've already been working with Alex Jironkin, the mentor for this project, on libface, the library that will be used to provide the biometric functionality.
libface is currently hosted on sourceforge : http://sourceforge.net/projects/libface/

The summer work will involve working on both finishing libface and digiKam.

Our interface for libface is almost ready and, from what I learned from Marcel, needs only a minimal amount of polishing related to accepting image data. The face detection part works quite well for images with frontal faces.
The detection has been tested using our respective personal albums.

Apart from improving the eigenfaces part for face recognition, I intend to work on the fisherfaces algorithm - which is known to provide better results under pose and lighting variation than eigenfaces.
I will also decide on a suitable face database to accompany libface before GSoC coding starts, so that better testing can be done and results can be benchmarked and compared by us and future developers.

The fisherfaces work will require the GNU Scientific Library (GSL). I'm getting familiar with fisherfaces, and have some MATLAB code for it. I will port some of it to C++ and incorporate it into libface.
Alex has already done some work on this in libface.
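For reference, the difference between the two methods can be stated compactly. This is the standard textbook formulation, not a description of what libface currently implements; the symbols (N training images x_k, c people, class means mu_i, overall mean mu) are introduced here only for the illustration. Eigenfaces picks the projection that maximizes total scatter, while fisherfaces maximizes the ratio of between-person scatter to within-person scatter, which is why it copes better with lighting and pose changes of the same person.

```latex
% Eigenfaces (PCA): maximize total scatter
S_T = \sum_{k=1}^{N} (x_k - \mu)(x_k - \mu)^{T}, \qquad
W_{\mathrm{eig}} = \arg\max_{W} \left| W^{T} S_T W \right|

% Fisherfaces (Fisher's linear discriminant): maximize between-class
% scatter relative to within-class scatter
S_B = \sum_{i=1}^{c} N_i \, (\mu_i - \mu)(\mu_i - \mu)^{T}, \qquad
S_W = \sum_{i=1}^{c} \sum_{x_k \in X_i} (x_k - \mu_i)(x_k - \mu_i)^{T}

W_{\mathrm{fld}} = \arg\max_{W}
  \frac{\left| W^{T} S_B W \right|}{\left| W^{T} S_W W \right|}
```

Here N_i is the number of training images of person i and X_i is that person's set of images. Since S_W is usually singular for face images (far more pixels than training images), the usual approach is to apply PCA first and then the discriminant step; the result has at most c - 1 useful projection directions.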

I intend to implement a face tagging widget - this will involve modifying the nepomuk-peopletag project, and might take a while.

The face tagging widget will be merged into digiKam. The format of the tags to be generated will be decided after some discussions. The face detector will decide the region to be tagged, and the recognition part will decide the contents of the tag.
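Since the tag format is still to be decided, the following is only a sketch of the split between the two parts: the detector supplies the region and the recognizer fills in the name. The `FaceTag` class, its field names, and the choice of storing the region as fractions of the image size are all hypothetical and not an agreed digiKam format. Relative coordinates are used here only because they stay valid when the image is scaled.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class FaceTag:
    """Hypothetical face tag: a region from the detector, a name from the recognizer."""
    x: float       # left edge, as a fraction of image width
    y: float       # top edge, as a fraction of image height
    width: float   # region width, as a fraction of image width
    height: float  # region height, as a fraction of image height
    name: Optional[str] = None  # None until recognized or entered by the user

    def to_pixels(self, image_size: Tuple[int, int]) -> Tuple[int, int, int, int]:
        """Convert the relative region to (x, y, width, height) in pixels."""
        img_w, img_h = image_size
        return (round(self.x * img_w), round(self.y * img_h),
                round(self.width * img_w), round(self.height * img_h))


def from_detection(box: Tuple[int, int, int, int],
                   image_size: Tuple[int, int]) -> FaceTag:
    """Build an unnamed tag from a pixel box reported by the face detector."""
    x, y, w, h = box
    img_w, img_h = image_size
    return FaceTag(x / img_w, y / img_h, w / img_w, h / img_h)
```

A detected region starts out with `name` unset; the recognition step (or the user, through the tagging widget) assigns it later.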

Next, I'll work on storing tags in the image metadata and digiKam's database. I'll need help from the digiKam core people on this.

This is what I intend to do as part of GSoC.

If there is still time, I'd like to start on Nepomuk integration using PIMO. Since I'm not familiar with Nepomuk, I'll need lots of help from Marcel and the Nepomuk people for that.

------------------------------------
Tentative Schedule
------------------------------------

First half of May:
Get familiar with the digiKam team, learn about digiKam's databases, how tags are organized, and start writing some working code for a plugin.
Work on libface meanwhile - Start porting Fisherfaces snippets from MATLAB to C++. Get familiar with GSL.

Mid-May to Mid-June:
Finish Fisherfaces. Create a working version of a people-tagging widget.

Mid-June to Beginning of July:
Make the people-tagging widget work nicely with libface. This may include a voting system.

Start of July to July-End:
Start integrating the tagging widget and libface combo into digiKam. This might involve changes to how tag input is accepted from the user. Work more on the voting system for training.

August:
Clean up the code, write documentation on usage of the new feature, and kill bugs :) Maybe start on Akonadi/Nepomuk work.

-----------------------------------
About Me
-----------------------------------

I'm a second-year engineering student currently doing my Bachelor's in Information and Communication Technology.
I've been using Linux since about 1999, when I was eight or nine. I used X11 interfaces, and what I think was KDE at that time. I used GNOME and Windows from 2003 to 2006, then went back to KDE. I loved and still love KDE because of its immense configurability. The new explosion of incoming developers and artists into KDE is wonderful, and I love the way new features are being rapidly integrated into KDE SC.

My primary fields of interest are Image Processing, Pattern Recognition, Cryptography, and Mathematical Computing.

I'm quite familiar with C and C++.

I speak on FOSS topics in my university's Open-Source Society, and also about emerging computing trends in our IEEE student branch's TechTalks.

A few links:
My bitbucket: http://bitbucket.org/aditya_bhatt/
My Blog : http://adityabhatt.wordpress.com
Libface : http://sourceforge.net/projects/libface/

_______________________________________________
Digikam-devel mailing list
[hidden email]
https://mail.kde.org/mailman/listinfo/digikam-devel

Re: Updated GSoC Proposal

Bugzilla from kare.sars@iki.fi
On Tuesday 23 March 2010, Aditya Bhatt wrote:

> Hi,
>
> @Alex : This is my GSoC proposal for this project. What do you say?
>
> @Marcel : Many thanks for the feedback. I've updated my proposal and made
> it more specific.
>
>  Mention on what test data this is based?
>
>
> We've been testing our own pictures with frontal faces, and the detection
> is good. More cascades used in tandem == better detection accuracy. For
> profile and side faces, it is a trivial matter of applying more cascades.
> This increases the accuracy, but decreases the speed, I'll be working on
> that. It involves a tradeoff between image size and good detection. I can
> decrease image size to dramatically increase detection speed, but accuracy
> goes down. I can think of a lot of solutions for this, they will come in
> the subsequent commits.
>

Just a comment about the image size tradeoff.

I have just updated the automatic image selection on previews in Skanlite
(actually libksane). What I did was to first do a rough auto selection on an
image resized to ~100 * 150 pixels and then refine the selections on the
full-sized preview. It improved the speed dramatically and did not decrease
the accuracy. (Actually, it removed a lot of false positives.)
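The two-pass idea can be sketched as follows. This is not libksane's code, only an illustration of the coordinate bookkeeping: a box found on the small image is mapped back to the full-sized image and padded a little, so that the second, more careful pass only has to look inside that region. The function names and the `margin` parameter are made up for the example, and the small image is assumed to be a plain resize of the full one.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # (x, y, width, height)
Size = Tuple[int, int]           # (width, height)


def ceil_div(a: int, b: int) -> int:
    """Integer division rounding up."""
    return -(-a // b)


def to_full_resolution(box: Box, coarse_size: Size, full_size: Size,
                       margin: float = 0.25) -> Box:
    """Map a box from the downscaled image to the full image, padded and clamped.

    margin is the padding on each side, as a fraction of the box size
    (an assumed value, chosen only for this sketch).
    """
    x, y, w, h = box
    cw, ch = coarse_size
    fw, fh = full_size
    pad_w, pad_h = int(w * margin), int(h * margin)
    # Scale with integer arithmetic; round outwards so nothing is cut off.
    left = max(0, (x - pad_w) * fw // cw)
    top = max(0, (y - pad_h) * fh // ch)
    right = min(fw, ceil_div((x + w + pad_w) * fw, cw))
    bottom = min(fh, ceil_div((y + h + pad_h) * fh, ch))
    return (left, top, right - left, bottom - top)
```

The refinement pass would then run only on the returned regions of the full-sized image instead of on the whole image.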

Just an idea if you needed more :)

--
Kåre

Re: Updated GSoC Proposal

Michael G. Hansen
In reply to this post by Aditya Bhatt
On 03/23/2010 10:11 AM, Aditya Bhatt wrote:
> The face tagging widget will be merged into digiKam. The format of the tags
> to be generated will be decided after some discussions. The face detector
> will decide the region to be tagged, and the recognition part will decide
> the contents of the tag.

During last year's coding sprint, I investigated existing formats to
highlight regions. You can find the document here:

http://websvn.kde.org/trunk/extragear/graphics/digikam/project/ImageAnnotation.odt

Feel free to add to it once you get SVN access ;-)

As for resizing of the images: This would risk making faces too small if
there are lots of people in the image, wouldn't it? Consider a group
photo, which may be 3000x2000 pixels, but in which the individual faces
are only 100x100 pixels. But that is probably a detail that can be worked
out later.
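To put rough numbers on this concern, here is a small sketch. The 20-pixel minimum detection window is an assumption made for the example; the real limit depends on the cascade that is used. The function names are also made up for the illustration.

```python
import math
from typing import Tuple


def face_size_after_resize(face_px: int, full_width: int, target_width: int) -> float:
    """Size of a face (in pixels) after the image is scaled to target_width."""
    return face_px * target_width / full_width


def smallest_usable_size(full_size: Tuple[int, int], face_px: int,
                         min_window_px: int = 20) -> Tuple[int, int]:
    """Smallest image size at which a face of face_px still fills the window.

    min_window_px is the assumed smallest face the detector can find.
    """
    # Never scale up: if faces are already too small, keep the full size.
    factor = max(1.0, face_px / min_window_px)
    width, height = full_size
    return (math.ceil(width / factor), math.ceil(height / factor))
```

For the group photo above, `smallest_usable_size((3000, 2000), 100)` gives 600x400, so under this assumption the image can be shrunk by a factor of 5 at most. Shrinking it to 150 pixels wide would leave each face only 5 pixels across, well below what a detector could find.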

Michael
