Hello, Digikam developers,
I am an active and happy user of Digikam. One feature I like and use is the ability to save image comments to EXIF and IPTC tags. This way, image comments entered in Digikam are also usable in other applications I use under Windows, e.g. Google Picasa, IrfanView or ACDSee.

Unfortunately, I face a problem when comments are not in English. I am a Russian speaker and naturally want to annotate my pictures in Russian. However, Digikam will only save ASCII characters in the IPTC comment. This may be in line with the IPTC standard (although I'm not sure: as far as I can see, the standard says that the caption consists of "graphic characters", where "graphic characters" are defined as "characters that have visual representation"), but in any case Picasa and ACDSee happily read and write non-ASCII characters in the IPTC caption. These Windows applications of course use the Windows character set, CP1251 in my case, which Digikam won't read. In the end, there is no interoperability between Digikam and the Windows world as far as Russian image captions are concerned.

I understand and share the concern about following the IPTC standard, but interoperability with popular image manipulation programs is equally important to me, and I don't really see IPTC insisting on ASCII.

To solve my problem I have come up with small patches for Digikam and kipi-plugins, which I would like to offer to the Digikam community for review and comment. This is what they do:

1. Add an option, "IPTC Encoding", to the Metadata page of the Configure Digikam dialog. By default it is ASCII, and Digikam's current behavior is preserved.
2. If the option is set to a non-ASCII encoding, Digikam will read and write IPTC tags in this encoding. The Metadataedit KIPI plugin will do the same.
3. The setting is stored in kdeglobals (rather than digikamrc), since it's used not only by Digikam itself, but also by the Digikam kioslaves and by any application that loads the Metadataedit KIPI plugin.

The patches are made against the Digikam 0.9.0 and kipi-plugins 0.1.3 sources.

I welcome any feedback about my patches and hope to see them in Digikam one day.

Thanks,
Leonid

_______________________________________________
Digikam-devel mailing list
[hidden email]
https://mail.kde.org/mailman/listinfo/digikam-devel

Attachments: digikam-0.9.0-iptc-encoding-lz.patch (9K), kipi-plugins-0.1.3-iptc-encoding-lz.patch (1K)

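The behaviour described in the three points of the message above can be illustrated with a minimal Python sketch. It is not taken from the attached patches (which target Digikam's C++ code); the function names are hypothetical, and replacing unencodable characters with "?" in ASCII mode is an assumption made only for this illustration.

```python
def encode_iptc_text(text: str, encoding: str = "ascii") -> bytes:
    """Encode a caption for storage in an IPTC text dataset.

    The default "ascii" stands for the existing behaviour. What happens to
    non-ASCII characters in that mode is an assumption here: they become "?".
    """
    return text.encode(encoding, errors="replace")


def decode_iptc_text(raw: bytes, encoding: str = "ascii") -> str:
    """Decode caption bytes read from a file using the configured encoding."""
    return raw.decode(encoding, errors="replace")


caption = "Привет"  # "Hello" in Russian

ascii_bytes = encode_iptc_text(caption)             # option left at its default
cp1251_bytes = encode_iptc_text(caption, "cp1251")  # option set to CP1251

print(ascii_bytes)                                # the Cyrillic letters are lost
print(decode_iptc_text(cp1251_bytes, "cp1251"))   # round-trips to the original
```

With the option set to CP1251, the bytes written are the same single-byte values a Windows application using that code page would write, which is the interoperability the message asks for.
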
Hi Leonid,
First of all, thanks for your help. All contributions are very much appreciated.

I know about the IPTC character encoding problem. See this bug report:

http://bugs.kde.org/show_bug.cgi?id=132244

The solution is described in that report.

About your patch: you should never build a source code patch against a stable release, but always against the current implementation from svn. See:

http://www.digikam.org/?q=contrib

With the current implementation I cannot use your patch directly, because I have created a new shared library named libkexiv2, which is an Exiv2 interface for digiKam and kipi-plugins. It removes all the duplicate code in the Digikam::DMetadata and KipiPlugins::Exiv2Iface classes.

libkexiv2 lives in the same place as kipi-plugins in svn:

http://websvn.kde.org/trunk/extragear/libs/libkexiv2

To apply the solution explained in bug report #132244, libkexiv2 needs to be patched.

The other parts of your big patch sound good (widgets, settings, etc.). Of course, I need to test it in depth (:=)))

Please revise your patch.

Thanks in advance for your help.

Regards

Gilles Caulier

On Tuesday 6 February 2007 22:02, Leonid Zeitlin wrote:
> [...]

Hi Gilles,
Thanks for your reply. I didn't realize the code had been refactored since the 0.9.0 release. I will get the latest from SVN and adapt my patch to the new code.

Regarding the discussion in bug #132244: I tried setting Iptc.Envelope.CharacterSet with the exiv2 command-line utility and then saving Iptc.Application2.Caption in UTF-8, as described there. I found that both Photoshop and IrfanView didn't decode the UTF-8 and showed it as is (unreadable), while Picasa simply didn't recognize the presence of a caption at all. I also saw that Picasa doesn't set the Iptc.Envelope.CharacterSet tag. Therefore I think my approach is orthogonal to what is discussed there and would still be a good feature.

I will get back to you once I have updated the patch to the latest code.

Thanks,
Leonid

On 2/6/07, Caulier Gilles <[hidden email]> wrote:
> [...]

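The "unreadable" result reported for Photoshop and IrfanView is what one would expect if UTF-8 bytes are interpreted as is in a single-byte Windows code page. The following Python sketch only simulates that effect; it assumes the viewer treats the bytes as CP1251 and does not reproduce any of the applications mentioned.

```python
caption = "Привет"  # "Hello" in Russian

# What gets stored when the caption is saved as UTF-8:
utf8_bytes = caption.encode("utf-8")

# What a reader shows if it takes those bytes "as is" in the Windows
# code page (CP1251 assumed here) instead of decoding UTF-8:
shown = utf8_bytes.decode("cp1251", errors="replace")

print(len(caption), len(utf8_bytes))  # each Cyrillic letter takes two bytes in UTF-8
print(shown == caption)               # the displayed text no longer matches
print(utf8_bytes.decode("utf-8") == caption)  # a UTF-8 aware reader recovers it
```
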
On Wednesday 7 February 2007 12:37, you wrote:
> Hi Gilles,
> Thanks for your reply. I didn't realize the code was refactored since
> 0.9.0 release. I will get the latest from SVN and adapt my patch to
> the new code.
>
> Regarding the discussion in bug #132244. I tried setting
> Iptc.Envelope.CharacterSet with exiv2 command-line utility and then
> saving Iptc.Application2.Caption in UTF8 as described there.

Yes, this tag needs to be set according to the charset encoding used. I recommend providing two charsets (I have not yet checked your patch):

- ASCII
- UTF-8

With the latter, all languages will be supported.

> I've
> found that both Photoshop and IrfanView didn't decode UTF and showed
> it as is (unreadable), while Picasa simply didn't recognize the
> presence of caption at all. I also saw that Picasa doesn't set
> Iptc.Envelope.CharacterSet tag.

Yes, I suspected this problem after reading some web sites on the subject.

> Therefore I think my approach is
> orthogonal to what is discussed there and still would be a good
> feature.

Yes, it looks fine to me, but the envelope tag needs to be set accordingly.

> I will get back to you once I update the patch to the latest code.

Fine with me. Please post it to bugzilla entry #132244. That is better than the mailing list (which limits attachment size)...

Gilles

Ok, I will post to bugzilla.
One note about setting Iptc.Envelope.CharacterSet, though. I am afraid CP1251 is not even registered in the "International Registry for Coded Character Sets", and probably some other charsets that could be used by KDE aren't either. Therefore setting this tag is not always possible.

Thanks,
Leonid

On 2/7/07, Caulier Gilles <[hidden email]> wrote:
> [...]

On Wednesday 7 February 2007 15:15, Leonid Zeitlin wrote:
> Ok, I will post to bugzilla.
>
> One note about setting Iptc.Envelope.CharacterSet though. I am afraid
> CP1251 is not even registered in the "International Registry for Coded
> Character Sets", and probably some other charsets that could be used
> by KDE aren't either. Therefore setting this tags is not always
> possible.

Use UTF-8 instead. See B.K.O #132244: Andreas, the author of the Exiv2 library, has posted a link with the solution for the IPTC envelope tag value.

On the page

http://www.annocpan.org/~BETTELLI/Image-MetaData-JPEG-0.15/lib/Image/MetaData/JPEG/TagLists.pod

in the section "IPTC data (Editorial information and envelope record)", you can read:

« ... 4) This dataset selects a character set, for use in character oriented datasets in records 2-6, according to the "International Register of Coded Character Sets" (ISO/IEC 2022 and ISO/IEC 2375, see for instance L<http://www.itscj.ipsj.or.jp/ISO-IR/>), and typically consist of the escape control character followed by one or more graphic characters. For instance, "\033/A" refers to ISO-8859-1 (latin-1) and "\033%G" refers to UTF-8 (a Unicode encoding). ... »

I propose offering one option in the digiKam setup and in the MetadataEdit kipi-plugin to set the charset encoding to ASCII or UTF-8.

Gilles

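The escape sequences quoted above can be written out as bytes. The Python sketch below is only an illustration: the table and the function name are hypothetical, the UTF-8 entry is the ESC % G sequence from the excerpt, and the ISO-8859-1 entry is copied from the excerpt without further verification. A charset with no entry, such as CP1251, simply has no value to put in Iptc.Envelope.CharacterSet, which is the limitation raised earlier in the thread.

```python
ESC = b"\x1b"  # the escape control character, written "\033" in the excerpt

# Hypothetical lookup table: encoding name -> envelope escape sequence.
ENVELOPE_ESCAPES = {
    "utf-8": ESC + b"%G",       # "\033%G" in the excerpt
    "iso-8859-1": ESC + b"/A",  # "\033/A" as given in the excerpt, not verified here
}


def envelope_charset_for(encoding: str):
    """Return the escape sequence for `encoding`, or None if the table has none."""
    return ENVELOPE_ESCAPES.get(encoding.lower())


print(envelope_charset_for("UTF-8"))   # three bytes: ESC, '%', 'G'
print(envelope_charset_for("cp1251"))  # None: nothing to write into the tag
```
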
On 2/8/07, Caulier Gilles <[hidden email]> wrote:
> Use UTF-8 instead.
[...]
>
> I propose to give one option in digiKam setup and MetadataEdit kipi-plugin to
> set the charset encoding to ASCII or UTF-8.
>
> Gilles

Hi Gilles,
Using UTF-8 would allow Digikam to save IPTC comments in international languages, but it won't achieve interoperability with Picasa/IrfanView/Photoshop etc., and that was my primary goal. Those Windows programs, as far as I can see from my testing, write and display IPTC comments as is, without any encoding/decoding. Therefore, for interoperability with them, comments should be in whatever encoding Windows is using, and that is not UTF-8 (for Russian it is going to be CP1251).

Thanks,
Leonid

On Thursday 8 February 2007 13:26, you wrote:
> Hi Gilles,
> Using UTF-8 would allow Digikam to save IPTC comments in international
> languages, but it won't achieve interoperability with
> Picasa/IrfanView/Photoshop etc., and that was my primary goal. Those
> Windows programs, as I can see from my testing, write and display IPTC
> comments as is, without any encoding/decoding. Therefore for
> interoperability with them, comments should be in whatever encoding
> Windows is using, and that is not UTF-8 (for Russian language it's
> going to be CP1251).

Do you mean that Picasa/IrfanView/Photoshop support an IPTC charset like CP1251 but not UTF-8?

But you said before that CP1251 is not even registered in the "International Registry for Coded Character Sets" (:=)))... It makes no sense for commercial photo applications not to follow at least the specifications defined in the standards...

Which version of Photoshop do you use?

Gilles

On 2/8/07, Caulier Gilles <[hidden email]> wrote:
> Do you mean that Picasa/IrfanView/Photoshop support an IPTC charset like CP1251
> but not UTF-8?
>
> But you said before that CP1251 is not even registered in
> the "International Registry for Coded Character Sets" (:=)))... It makes no
> sense for commercial photo applications not to follow at least the
> specifications defined in the standards...
>
> Which version of Photoshop do you use?
>
> Gilles

Gilles,
I am trying to say that these Windows applications do not attempt to recode the IPTC values in any way. They take them as is, i.e. as being in whatever charset Windows is using. For Russian, Windows uses the code page 1251 charset. And yes, Microsoft doesn't care to register its code pages with ISO. I guess this applies to all Windows code pages, not only the Russian (Cyrillic) one.

I use Photoshop version 7.

Thanks,
Leonid

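The two positions in this exchange (honour the envelope tag, or take the bytes as is in the local Windows code page) can be combined on the reading side. The Python sketch below is one possible arrangement and an assumption of this note, not something either participant proposed in this form; the function name and the `local_encoding` parameter are hypothetical.

```python
from typing import Optional

UTF8_ESCAPE = b"\x1b%G"  # envelope value announcing UTF-8 (ESC % G)


def decode_caption(raw: bytes,
                   envelope_charset: Optional[bytes],
                   local_encoding: str = "cp1251") -> str:
    """Decode caption bytes, preferring the envelope tag when it is present.

    If the file announces UTF-8, decode as UTF-8. Otherwise assume the bytes
    are in the user-configured local encoding, as files written by the
    Windows applications discussed above apparently are.
    """
    if envelope_charset == UTF8_ESCAPE:
        return raw.decode("utf-8", errors="replace")
    return raw.decode(local_encoding, errors="replace")


text = "Привет"

# File written without an envelope tag, bytes in CP1251:
print(decode_caption(text.encode("cp1251"), None))

# File written with the UTF-8 envelope tag set:
print(decode_caption(text.encode("utf-8"), UTF8_ESCAPE))
```

Both calls recover the original caption. A file whose bytes are in some other code page and which carries no envelope tag would still be decoded wrongly, which is why the local encoding has to remain a user setting.
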
Hi Gilles,
It looks like IptcWidget in Digikam SVN is not yet converted to use libkexiv2. Is it going to be updated?

Thanks,
Leonid

On 2/9/07, Caulier Gilles <[hidden email]> wrote:
> On Friday 9 February 2007 14:26, you wrote:
> > Hi Gilles,
> > It looks like IptcWidget in Digikam SVN is not yet converted to use
> > libkexiv2. Is it going to be updated?
>
> Yes, of course, but I need to add more methods to libkexiv2 for that,
> especially to extract a list of tags from the metadata.
>
> It's not simple, and I'm busy with other parts at the moment. It is still on my
> TODO list, unless you want to do it, of course (:=)))
>
> Gilles

Hi Gilles,
I see. Well, at the moment I don't feel I'm up to this task :-). But I am going to try to apply my encoding patch at the libkexiv2 level. If by the time I have done that I feel comfortable with libkexiv2 and exiv2, I may even try approaching IptcWidget.

Thanks,
Leonid