Words match in digiKam search

Words match in digiKam search

Jean-François Rabasse

Hello,

I have some questions about the way digiKam performs string
comparisons when searching for keywords.

1. The digiKam handbook says "searches are case insensitive".
Well, that doesn't seem to be the case in my environment (digiKam 1.2.0,
Linux OpenSuSE 11.3 with KDE 4) with non-US-ASCII characters.

I have several pictures with captions containing the word "église"
(for non French readers, église is a church). E.g. picture 1 is
titled "Petite église rouge", and picture 2 is titled "Église à
vendre". (With an uppercase É because all my titles are
capitalised.)

Now, if I type (in the words search input field) the word "église",
I get picture 1 but not picture 2. And if I type "Église", I get
picture 2 but not picture 1. Clearly a case sensitive match,
concerning only the ISO Latin characters (typing "éGLISE" finds
picture 1 too, and "ÉGLISE" finds picture 2).
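
The pattern described above is what one would expect from a comparison that folds case for the ASCII letters only. A minimal Python sketch of that hypothesis follows; the helper names are made up for illustration, and this is not digiKam's actual code:

```python
import string

# Translation table that lowercases only the 26 ASCII letters A-Z.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def ascii_only_fold(text: str) -> str:
    """Hypothetical helper: lowercase A-Z, leave every other character alone."""
    return text.translate(_ASCII_LOWER)


def matches(query: str, caption: str, fold) -> bool:
    """Substring search after applying the given case-folding function."""
    return fold(query) in fold(caption)


captions = ["Petite église rouge", "Église à vendre"]

for query in ["église", "Église", "éGLISE", "ÉGLISE"]:
    ascii_hits = [c for c in captions if matches(query, c, ascii_only_fold)]
    unicode_hits = [c for c in captions if matches(query, c, str.casefold)]
    print(f"{query!r}: ASCII-only -> {ascii_hits}, Unicode-aware -> {unicode_hits}")
```

With the ASCII-only fold, "éGLISE" behaves like "église" and "ÉGLISE" behaves like "Église", as in the report; with the Unicode-aware `str.casefold`, every query finds both captions.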

I suspect it could be a problem with the encoding of non-ASCII characters.
My desktop environment is set to French, ISO-8859-1 charset.
How does digiKam encode its text strings internally? UTF-8, maybe?
That could explain a mismatch.

Has anyone already seen such an issue, and what would be the best
way to work around it? Is there a way to tell digiKam about the user input
encoding (us-ascii, iso-8859-x, utf-8)? Or is it hidden by the
X11/Qt/KDE layers?

2. Another question related to string matching, just because I'm
curious :-)
In the digiKam configuration window, under "Miscellaneous",
there's a configuration parameter named "String comparison type",
with a selection menu offering two options, Natural or Normal.

I couldn't find anything about that in the handbook. What is a
"Normal" comparison type, and what is a "Natural" comparison type?

Thanks in advance,
Jean-François

_______________________________________________
Digikam-users mailing list
[hidden email]
https://mail.kde.org/mailman/listinfo/digikam-users

Re: Words match in digiKam search

Gilles Caulier-4
2011/8/28 Jean-François Rabasse <[hidden email]>:

> 2. Another question related to string matching, just because I'm
> curious :-)
> In the digiKam configuration window, under "Miscellaneous",
> there's a configuration parameter named "String comparison type",
> with a selection menu offering two options, Natural or Normal.
>
> I couldn't find anything about that in the handbook. What is a
> "Normal" comparison type, and what is a "Natural" comparison type?
>

SHIFT+F1 over the drop-down menu doesn't help ?

Gilles Caulier

Re: Words match in digiKam search

Jean-François Rabasse

Hello,

On Sun, 28 Aug 2011, Gilles Caulier wrote:

> SHIFT+F1 over the drop-down menu doesn't help ?
>
> Gilles Caulier

Well, it doesn't help a lot :-(
Shift-F1 says: "Not defined. There is no "What's This" assigned to this
widget".
And the comparison options menu, Natural vs. Normal, has a tooltip,
but it is not very readable. The tooltip text starts with "Sets the way in
which strings are compared inside digiKam, etc.", but it seems to be a long
text and I can't get it displayed correctly. I can see that the text embeds
some XHTML formatting, e.g. "<br/>" to trigger a line break. These markup
tags are just displayed as raw text, not processed, so the whole tooltip text
appears on one long line, truncated at the right edge of my screen.
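
For readers who cannot see the tooltip either: the thread does not quote its full text, so the following is only a sketch that assumes "Natural" has its usual meaning (digit runs compared by numeric value) and "Normal" means a plain character-by-character comparison. The function name `natural_key` is made up for the example.

```python
import re


def natural_key(text: str):
    """Split into digit and non-digit runs so that numbers compare by value."""
    return [int(part) if part.isdecimal() else part.lower()
            for part in re.split(r"(\d+)", text)]


names = ["img10.jpg", "img2.jpg", "img1.jpg"]
print(sorted(names))                   # ['img1.jpg', 'img10.jpg', 'img2.jpg']
print(sorted(names, key=natural_key))  # ['img1.jpg', 'img2.jpg', 'img10.jpg']

# Album names that are dates of different lengths sort differently too.
dates = ["201006", "20090523"]
print(sorted(dates))                   # ['20090523', '201006']
print(sorted(dates, key=natural_key))  # ['201006', '20090523']
```

Under these assumptions, the plain comparison keeps date-like names in chronological order, while the natural one treats each name as a single number.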

But this was just a matter of curiosity.
My major problem is the case-insensitive search when words contain
non-US-ASCII characters.
It could be a problem with my graphical interface setup, or not.
Could digiKam users who write in languages other than English confirm,
or not, that case-insensitive search works well with their country's charset?
If yes, I'll investigate my X11/Qt/KDE configuration.
If not, I'll just do my searches twice, typing the lowercase version
then the uppercase version of words containing Latin-1 characters :-)

Thanks in advance,
Jean-François

Re: Words match in digiKam search

Rinus
Hi Jean-François,

Here is the tooltip text as displayed in my digiKam 2.1.0:
http://dl.dropbox.com/u/30489651/Schermafdruk-8.png

Regards,
Rinus

Re: Words match in digiKam search

Rinus
In reply to this post by Jean-François Rabasse
In my setup it works completely case-insensitively: searching for either Église or église finds both pictures.
digiKam 2.1.0
Ubuntu 11.04
If you want other details about my setup, ask, and tell me where to find the information you need.

Best,
Rinus

Re: Words match in digiKam search

Jean-François Rabasse

Hello,

On Sun, 28 Aug 2011, sleepless wrote:

> Hi Jean-François,
>
> Here is the tooltip text as displayed in my digiKam 2.1.0:
> http://dl.dropbox.com/u/30489651/Schermafdruk-8.png

Many thanks, now I can read the whole message without having to
buy a 25000-pixel-wide screen :-)

On Sun, 28 Aug 2011, sleepless wrote:

> In my setup it works completely case-insensitively: searching for either
> Église or église finds both pictures.
> digiKam 2.1.0
> Ubuntu 11.04
> If you want other details about my setup, ask, and tell me where to find
> the information you need.

Very interesting. So this suggests my problem lies outside digiKam's
comparison code.
Anyway, I suspect an insane mixture of ISO encoding and Unicode.
Metadata stored in image files by digiKam (at least the ones with
keywords, xmp.dc.description and xmp.digikam.tagslist) seems to be
UTF-8 encoded. It's probably the same in digikam.db, and my
keyboard input is ISO Latin!
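
To make the suspected mismatch concrete, here is a small Python sketch of how the same word looks in the two encodings. It only illustrates the byte-level difference; whether digiKam ever compares raw bytes from different encodings is an assumption the thread does not confirm.

```python
word = "église"

utf8_bytes = word.encode("utf-8")      # what UTF-8 storage would hold
latin1_bytes = word.encode("latin-1")  # what an ISO-8859-1 source would send

print(utf8_bytes)    # b'\xc3\xa9glise'
print(latin1_bytes)  # b'\xe9glise'

# Same text, different bytes: a byte-wise comparison fails.
print(utf8_bytes == latin1_bytes)  # False

# Decoding each side with its own charset yields identical text again.
print(utf8_bytes.decode("utf-8") == latin1_bytes.decode("latin-1"))  # True

# Decoding UTF-8 bytes as Latin-1 gives the classic garbled form.
print(utf8_bytes.decode("latin-1"))  # Ã©glise
```

Note that such a mismatch would break the match in both cases alike, so on its own it would not explain why "église" is found while "Église" is not.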

Investigation in progress... :)

Thanks for all,
Jean-François

Re: Words match in digiKam search

Rinus
On 29-08-11 20:42, Jean-François Rabasse wrote:

> Anyway, I suspect an insane mixture of ISO encoding and Unicode.
> Metadata stored in image files by digiKam (at least the ones with
> keywords, xmp.dc.description and xmp.digikam.tagslist) seems to be
> UTF-8 encoded. It's probably the same in digikam.db, and my
> keyboard input is ISO Latin!

What your keyboard produces should be highly configurable these days:
System -> Preferences -> Keyboard.
Beyond that there are tools like xbindkeys, but that's complicated, I think,
and may be of no use to you.
But have a look here: /home/jansen/Bureaublad/Schermafdruk-10.png

Rinus

Re: Words match in digiKam search

Rinus
On 29-08-11 22:09, sleepless wrote:

> But have a look here: /home/jansen/Bureaublad/Schermafdruk-10.png

This will work a lot better:
http://dl.dropbox.com/u/30489651/Schermafdruk-10.png

Rinus

Re: Words match in digiKam search

guenter
In reply to this post by Jean-François Rabasse
On 28.08.2011 00:02, Jean-François Rabasse wrote:

>
> Hello,
>
> I have some questions about the way digiKam performs strings
> comparisons when searching for keywords.
>
> 1. The digiKam handbook says "searches are case insensitive".
> Well, doesn't seem to be the case in my environment (digiKam 1.2.0,
> Linux OpenSuSE 11.3 with KDE 4) with non US ASCII characters.
>
> I have several pictures with captions containing the word "église"
> (for non French readers, église is a church). E.g. picture 1 is
> titled "Petite église rouge", and picture 2 is titled "Église à
> vendre". (With an uppercase É because all my titles are
> capitalised.)
>
> Now, if I type (in the words search input field) the word "église",
> I get picture 1 but not picture 2. And if I type "Église", I get
> picture 2 but not picture 1. Clearly a case sensitive match,
> concerning only the ISO Latin characters (typing "éGLISE" finds
> picture 1 too, and "ÉGLISE" finds picture 2).
[snip]

Hi,

Same here: "Über" != "über".
Maybe it is a problem with the string-comparison library in use?
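
One candidate worth checking, offered only as a guess: SQLite's built-in LIKE operator folds case for ASCII letters only unless the ICU extension is compiled in. The sketch below assumes the search ends up as an SQLite LIKE query, which nothing in this thread confirms; the `casefold` SQL function is registered here purely for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def like(value: str, pattern: str) -> int:
    """Evaluate `value LIKE pattern` with SQLite's built-in operator."""
    return conn.execute("SELECT ? LIKE ?", (value, pattern)).fetchone()[0]


# ASCII letters are matched regardless of case.
print("UBER LIKE uber:", like("UBER", "uber"))

# On a default build (no ICU), this non-ASCII pair is expected not to match.
print("Über LIKE über:", like("Über", "über"))

# Illustrative workaround: fold both sides with a Unicode-aware function.
conn.create_function("casefold", 1, lambda s: s.casefold())
folded = conn.execute(
    "SELECT casefold(?) LIKE casefold(?)", ("Über", "über")
).fetchone()[0]
print("casefold(Über) LIKE casefold(über):", folded)
```

If the second line prints 0 on your system, the SQLite library there behaves the way Jean-François and I observe in digiKam; different results on different machines could then come down to how the library was built.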

digikam --version:
Qt: 4.7.3
KDE Development Platform: 4.7.1 (4.7.1)
digiKam: 2.0.0

Greetings,
   Günter
