------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.

http://bugs.kde.org/show_bug.cgi?id=120241
           Summary: utf8 display and edit
           Product: digikam
           Version: unspecified
          Platform: SuSE RPMs
        OS/Version: Linux
            Status: UNCONFIRMED
          Severity: normal
          Priority: NOR
         Component: general
        AssignedTo: digikam-devel kde org
        ReportedBy: jdd dodin org

Version:            (using KDE KDE 3.4.2)
Installed from:     SuSE RPMs
OS:                 Linux

I use SUSE Linux OSS 10.0. This distribution uses UTF-8 (LANG=fr_FR.UTF8 in my case).

I usually write comments into the Exif data, and all goes well: the comments are written and Konqueror displays them.

But if I use Konqueror to copy a photo from one folder to another, digiKam no longer displays the UTF-8 characters correctly (it displays ValÃ©rie in place of Valérie), even in the comment editing utility. Worse, when exporting to HTML the same ValÃ©rie is exported :-(

<div align="center">ValÃ©rie et Virginie</div>
_______________________________________________
Digikam-devel mailing list
[hidden email]
https://mail.kde.org/mailman/listinfo/digikam-devel
------- Additional Comments From jdd dodin org 2006-01-17 11:33 -------
In 0.8.1 (svn, compiled locally), inside digiKam:

* right-click copy/paste: the UTF-8 is crippled
* mouse click and Shift, then copy: the UTF-8 is _not_ crippled

jdd
------- Additional Comments From mikmach wp pl 2006-01-17 13:26 -------
> I use suse linux OSS 10.0. This distribution uses utf8 LANG=fr_FR.UTF8
> in my case. I use to write comments in exifs.

Confirming the problem for trunk: Mandriva 2005LE, whole KDE from the SVN 3.5 branch (digiKam from trunk, 0.9svn), LANG=pl_PL (encoding iso-8859-2).

When copying images with comments from one album to another, non-latin1 letters are broken. It looks like the UTF-8 bytes are displayed directly as iso-8859-2.
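The symptom reported so far is consistent with UTF-8 bytes being read as a single-byte charset and then encoded to UTF-8 a second time. The following is a minimal standalone sketch of that mechanism, not digiKam code. The function name latin1ToUtf8 is hypothetical, and the sketch assumes ISO-8859-1 as the wrongly applied charset; with iso-8859-2, as in the pl_PL setup, the same bytes would show up as different glyphs.

```cpp
#include <string>

// Hypothetical helper (not part of digiKam): re-encode a byte string as UTF-8
// while treating every input byte as one ISO-8859-1 (Latin-1) character.
// Applied to text that is already UTF-8, this reproduces the double-encoding
// damage: each byte of a multi-byte sequence becomes its own character.
std::string latin1ToUtf8(const std::string &bytes)
{
    std::string out;
    for (unsigned char c : bytes)
    {
        if (c < 0x80)
        {
            // Plain ASCII is identical in Latin-1 and UTF-8.
            out += static_cast<char>(c);
        }
        else
        {
            // U+0080..U+00FF need two bytes in UTF-8.
            out += static_cast<char>(0xC0 | (c >> 6));   // lead byte
            out += static_cast<char>(0x80 | (c & 0x3F)); // continuation byte
        }
    }
    return out;
}
```

Feeding it the UTF-8 bytes of "Valérie" (`"Val\xC3\xA9rie"`) yields `"Val\xC3\x83\xC2\xA9rie"`, which a UTF-8 display renders as "ValÃ©rie".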
------- Additional Comments From sebastian.roeder uni-bielefeld de 2006-03-28 17:44 -------
I cannot confirm your UTF-8 problems here on Gentoo Linux with digiKam 0.9.0 SVN and LANG=de_DE.UTF-8.

However, I found another strange behaviour while trying to reproduce it: after I copied the image to another dir/album, with Konqueror or inside digiKam, the comment was completely lost (although I had checked "embedding the comments in exif").

I will have a look at it in the next few days, because Gilles is currently changing the internals of digiKam that deal with Exif data. Maybe my problem is related.

Can you please provide an example image?
------- Additional Comments From caulier.gilles free fr 2006-03-28 18:29 -------
Yes Sebastian, let me finish removing the libKExif dependency from the digiKam core, and then we will tackle this problem using the trunk branch.

There are still some hours of work left to use Exiv2 instead of libKExif throughout the digiKam core. I think I will complete this task this week.

Remind me next week (:=)))...

Gilles Caulier
------- Additional Comments From caulier.gilles free fr 2006-04-03 16:27 -------
The core metadata class is now updated. Please try again using the trunk svn branch implementation.

Thanks in advance

Gilles Caulier
------- Additional Comments From mikmach wp pl 2006-04-04 00:31 -------
Works for me. All old broken comments are now shown properly, and moving images between albums no longer destroys comments. (.9svn)
caulier.gilles free fr changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |ach mpe mpg de

------- Additional Comments From caulier.gilles free fr 2006-04-04 13:21 -------
*** Bug 98462 has been marked as a duplicate of this bug. ***
------- Additional Comments From marcel.wiesweg gmx de 2006-05-05 23:17 -------
SVN commit 537807 by mwiesweg:

Unicode support for JFIF and EXIF comments:
- use UTF8 for JFIF comment
- use Unicode (UCS-2) to write JPEG UserComment,
  support charset specification when reading the UserComment
- add convertCommentValue method to DMetaData

Using UTF8 for JFIF is simple and easy and should work.
The UCS-2 support needs testing
(and a decision if we always want to write Unicode,
or a way to find out when we need to and when we can as well write ASCII)

CCBUG: 120241 114211

 M  +84 -28    dmetadata/dmetadata.cpp
 M  +3 -0      dmetadata/dmetadata.h
 M  +11 -3     widgets/metadata/exifwidget.cpp


--- trunk/extragear/graphics/digikam/libs/dmetadata/dmetadata.cpp #537806:537807
@@ -28,6 +28,7 @@
 // Qt includes.

 #include <qfile.h>
+#include <qtextcodec.h>
 #include <qwmatrix.h>

 // KDE includes.
@@ -635,43 +636,46 @@
 QString DMetadata::getImageComment() const
 {
     try
-    {
+    {
+        if (d->filePath.isEmpty())
+            return QString();
+
         // In first we trying to get image comments, outside of Exif and IPTC.

-        QString comments(d->imageComments.c_str());
-
+        QString comments = QString::fromUtf8(d->imageComments.c_str());
+
         if (!comments.isEmpty())
-            return comments;
-
-        // In second, we trying to get Exif comments
-
+            return comments;
+
+        // In second, we trying to get Exif comments
+
         if (!d->exifMetadata.empty())
         {
             Exiv2::ExifKey key("Exif.Photo.UserComment");
             Exiv2::ExifData exifData(d->exifMetadata);
             Exiv2::ExifData::iterator it = exifData.findKey(key);
-
+
             if (it != exifData.end())
             {
-                QString ExifComment(it->toString().c_str());
-
-                if (!ExifComment.isEmpty())
-                    return ExifComment;
+                QString exifComment = convertCommentValue(*it);
+
+                if (!exifComment.isEmpty())
+                    return exifComment;
             }
         }
-
-        // In third, we trying to get IPTC comments
-
+
+        // In third, we trying to get IPTC comments
+
         if (!d->iptcMetadata.empty())
         {
             Exiv2::IptcKey key("Iptc.Application2.Caption");
             Exiv2::IptcData iptcData(d->iptcMetadata);
             Exiv2::IptcData::iterator it = iptcData.findKey(key);
-
+
             if (it != iptcData.end())
             {
-                QString IptcComment(it->toString().c_str());
-
+                QString IptcComment = QString::fromLatin1(it->toString().c_str());
+
                 if (!IptcComment.isEmpty())
                     return IptcComment;
             }
@@ -683,15 +687,15 @@
         kdDebug() << "Cannot get Image comments using Exiv2 ("
                   << QString::fromLocal8Bit(e.what().c_str())
                   << ")" << endl;
-    }
-
+    }
+
     return QString();
 }

 bool DMetadata::setImageComment(const QString& comment)
 {
     try
-    {
+    {
         if (comment.isEmpty())
             return false;

@@ -699,13 +703,21 @@

         // In first we trying to set image comments, outside of Exif and IPTC.

-        const std::string str(comment.latin1());
+        const std::string str(comment.utf8());
         d->imageComments = str;

         // In Second we write comments into Exif.
-
-        d->exifMetadata["Exif.Photo.UserComment"] = comment.latin1();
-
+
+        // Be aware that we are dealing with a UCS-2 string.
+        // Null termination means \0\0, strlen does not work,
+        // do not use any const-char*-only methods,
+        // pass a std::string and not a const char * to ExifDatum::operator=().
+        const unsigned short *ucs2 = comment.ucs2();
+        std::string exifComment("charset=\"Unicode\" ");
+        exifComment.append((const char*)ucs2, sizeof(unsigned short) * comment.length());
+        d->exifMetadata["Exif.Photo.UserComment"] = exifComment;
+        //d->exifMetadata["Exif.Photo.UserComment"] = comment.latin1();
+
         // In Third we write comments into Iptc. Note that Caption IPTC tag is limited to 2000 char.

         setImageProgramId();
@@ -713,7 +725,7 @@
         QString commentIptc = comment;
         commentIptc.truncate(2000);
         d->iptcMetadata["Iptc.Application2.Caption"] = commentIptc.latin1();
-
+
         return true;
     }
     catch( Exiv2::Error &e )
@@ -721,11 +733,55 @@
         kdDebug() << "Cannot set Comment into image using Exiv2 ("
                   << QString::fromLocal8Bit(e.what().c_str())
                   << ")" << endl;
-    }
-
+    }
+
     return false;
 }

+QString DMetadata::convertCommentValue(const Exiv2::Exifdatum &exifDatum)
+{
+    std::string comment = exifDatum.toString();
+    std::string charset;
+
+    // libexiv2 will prepend "charset=\"SomeCharset\" " if charset is specified
+    // Before conversion to QString, we must know the charset, so we stay with std::string for a while
+    if (comment.length() > 8 && comment.substr(0, 8) == "charset=")
+    {
+        // the prepended charset specification is followed by a blank
+        std::string::size_type pos = comment.find_first_of(' ');
+        if (pos != std::string::npos)
+        {
+            // extract string between the = and the blank
+            charset = comment.substr(8, pos-8);
+            // get the rest of the string after the charset specification
+            comment = comment.substr(pos+1);
+        }
+    }
+
+    if (charset == "\"Unicode\"")
+    {
+        // QString expects a null-terminated UCS-2 string.
+        // Is it already null terminated? In any case, add termination for safety.
+        comment += "\0\0";
+        return QString::fromUcs2((unsigned short *)comment.data());
+    }
+    else if (charset == "\"Jis\"")
+    {
+        QTextCodec *codec = QTextCodec::codecForName("JIS7");
+        return codec->toUnicode(comment.c_str());
+    }
+    else if (charset == "\"Ascii\"")
+    {
+        return QString::fromLatin1(comment.c_str());
+    }
+    else
+    {
+        // or from local8bit ??
+        return QString::fromLatin1(comment.c_str());
+    }
+}
+
+
 /*
 Iptc.Application2.Urgency <==> digiKam Rating links:

--- trunk/extragear/graphics/digikam/libs/dmetadata/dmetadata.h #537806:537807
@@ -30,6 +30,7 @@
 // Exiv2 includes.

 #include <exiv2/types.hpp>
+#include <exiv2/exif.hpp>

 // Local includes.

@@ -104,6 +105,8 @@

     PhotoInfoContainer getPhotographInformations() const;

+    static QString convertCommentValue(const Exiv2::Exifdatum &comment);
+
 private:

     DImg::FORMAT fileFormat(const QString& filePath);
--- trunk/extragear/graphics/digikam/libs/widgets/metadata/exifwidget.cpp #537806:537807
@@ -155,9 +155,17 @@
             QString key = QString::fromLocal8Bit(md->key().c_str());

             // Decode the tag value with a user friendly output.
-            std::ostringstream os;
-            os << *md;
-            QString tagValue = QString::fromLocal8Bit(os.str().c_str());
+            QString tagValue;
+            if (key == "Exif.Photo.UserComment")
+            {
+                tagValue = DMetadata::convertCommentValue(*md);
+            }
+            else
+            {
+                std::ostringstream os;
+                os << *md;
+                tagValue = QString::fromLocal8Bit(os.str().c_str());
+            }
             tagValue.replace("\n", " ");

             // We apply a filter to get only standard Exif tags, not maker notes.
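The commit above relies on a `charset="Name" ` prefix in front of the UserComment payload and adds a two-byte terminator for UCS-2. The following standalone sketch (standard library only, no Qt or Exiv2) illustrates both steps. The names splitCharsetPrefix and appendUcs2Terminator are hypothetical and do not exist in digiKam. One point worth checking against the committed code: with std::string, `+= "\0\0"` takes the const char* overload, which stops at the first NUL and therefore appends nothing.

```cpp
#include <string>
#include <utility>

// Hypothetical helper: split a value of the form `charset="Name" payload`
// into (charset name without quotes, payload). If no such prefix is found,
// the charset is empty and the whole value is returned as the payload.
std::pair<std::string, std::string> splitCharsetPrefix(const std::string &value)
{
    const std::string prefix = "charset=";
    if (value.compare(0, prefix.size(), prefix) != 0)
        return std::make_pair(std::string(), value);

    // The charset specification is followed by a single blank.
    const std::string::size_type blank = value.find(' ');
    if (blank == std::string::npos)
        return std::make_pair(std::string(), value);

    std::string charset = value.substr(prefix.size(), blank - prefix.size());

    // Strip the surrounding double quotes, if present.
    if (charset.size() >= 2 && charset[0] == '"' && charset[charset.size() - 1] == '"')
        charset = charset.substr(1, charset.size() - 2);

    // substr() copies by length, so embedded NUL bytes in a UCS-2 payload survive.
    return std::make_pair(charset, value.substr(blank + 1));
}

// Hypothetical helper: append a two-byte UCS-2 terminator.
// The count/char overload is used because `s += "\0\0"` appends zero bytes.
void appendUcs2Terminator(std::string &s)
{
    s.append(2, '\0');
}
```

For example, `splitCharsetPrefix("charset=\"Ascii\" hello world")` returns the pair ("Ascii", "hello world"), and a value without the prefix comes back unchanged with an empty charset.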
------- Additional Comments From caulier.gilles free fr 2006-05-05 23:50 -------
Marcel, have you found any documentation about the encoding of JFIF comments?

Also, regarding the decision whether we always want to write Unicode or ASCII, I propose adding a QCheckBox option to the metadata setup dialog page. I think Unicode should always be enabled by default. What is your view?

Gilles
------- Additional Comments From sochor mathan cz 2006-05-09 01:03 -------
Created an attachment (id=15985)
 --> (http://bugs.kde.org/attachment.cgi?id=15985&action=view)
fixed caption encoding when loading from jpeg exif
------- Additional Comments From caulier.gilles free fr 2006-05-09 01:16 -------
SVN commit 538809 by cgilles:

digikam from stable : fix JFIF comments section encoding extraction to respect UTF8

CCMAIL: digikam-devel kde org
CCBUGS: 120241

 M  +1 -1      jpegmetadata.cpp


--- branches/stable/extragear/graphics/digikam/libs/jpegutils/jpegmetadata.cpp #538808:538809
@@ -118,7 +118,7 @@
                 continue;
             }

-            comments = QString::fromAscii((const char*)marker->data,
+            comments = QString::fromUtf8((const char*)marker->data,
                                           marker->data_length);
         }
         else if (marker->marker == M_EXIF)
------- Additional Comments From marcel.wiesweg gmx de 2006-05-21 17:32 -------
SVN commit 543272 by mwiesweg:

Add some autodetection magic for charset support
- DMetadata::detectEncodingAndDecode will check if a given string
  is in UTF8. If not, it will leave it to QTextCodec to decide if
  the local charset or latin1 will be used
- use detectEncodingAndDecode when reading the JFIF comment
  and for Exif comments with undefined encoding
- When writing the Exif comment, use UCS-2 only when necessary.
  Check with QTextCodec::canEncode if plain latin1 is enough.

I have tested this successfully with some Arabian and cyrillic characters.
But please test this with some more pictures. UTF-8 should be no problem,
but the local8Bit vs. latin1 decision may be.

CCBUGS: 120241, 114211

 M  +75 -15    dmetadata.cpp
 M  +3 -0      dmetadata.h


--- trunk/extragear/graphics/digikam/libs/dmetadata/dmetadata.cpp #543271:543272
@@ -33,7 +33,9 @@

 // KDE includes.

+#include <kapplication.h>
 #include <kdebug.h>
+#include <kstringhandler.h>
 #include <ktempfile.h>

 // Exiv2 includes.
@@ -714,7 +716,7 @@

         // In first we trying to get image comments, outside of Exif and IPTC.

-        QString comments = QString::fromUtf8(d->imageComments.c_str());
+        QString comments = detectEncodingAndDecode(d->imageComments);

         if (!comments.isEmpty())
             return comments;
@@ -780,18 +782,32 @@

         // In Second we write comments into Exif.

-        // Be aware that we are dealing with a UCS-2 string.
-        // Null termination means \0\0, strlen does not work,
-        // do not use any const-char*-only methods,
-        // pass a std::string and not a const char * to ExifDatum::operator=().
-        const unsigned short *ucs2 = comment.ucs2();
-        std::string exifComment("charset=\"Unicode\" ");
-        exifComment.append((const char*)ucs2, sizeof(unsigned short) * comment.length());
-        d->exifMetadata["Exif.Photo.UserComment"] = exifComment;
-        //d->exifMetadata["Exif.Photo.UserComment"] = comment.latin1();
+        // Write as Unicode only when necessary.
+        QTextCodec *latin1Codec = QTextCodec::codecForName("iso8859-1");
+        if (latin1Codec->canEncode(comment))
+        {
+            // write as ASCII
+            std::string exifComment("charset=\"Ascii\" ");
+            exifComment += comment.latin1();
+            d->exifMetadata["Exif.Photo.UserComment"] = exifComment;
+        }
+        else
+        {
+            // write as Unicode (UCS-2)

-        // In Third we write comments into Iptc. Note that Caption IPTC tag is limited to 2000 char.
+            // Be aware that we are dealing with a UCS-2 string.
+            // Null termination means \0\0, strlen does not work,
+            // do not use any const-char*-only methods,
+            // pass a std::string and not a const char * to ExifDatum::operator=().
+            const unsigned short *ucs2 = comment.ucs2();
+            std::string exifComment("charset=\"Unicode\" ");
+            exifComment.append((const char*)ucs2, sizeof(unsigned short) * comment.length());
+            d->exifMetadata["Exif.Photo.UserComment"] = exifComment;
+        }

+        // In Third we write comments into Iptc.
+        // Note that Caption IPTC tag is limited to 2000 char and ASCII charset.
+
         QString commentIptc = comment;
         commentIptc.truncate(2000);
         d->iptcMetadata["Iptc.Application2.Caption"] = commentIptc.latin1();
@@ -815,7 +831,7 @@
     {
         std::string comment = exifDatum.toString();
         std::string charset;
-
+
         // libexiv2 will prepend "charset=\"SomeCharset\" " if charset is specified
         // Before conversion to QString, we must know the charset, so we stay with std::string for a while
         if (comment.length() > 8 && comment.substr(0, 8) == "charset=")
@@ -830,7 +846,7 @@
                 comment = comment.substr(pos+1);
             }
         }
-
+
         if (charset == "\"Unicode\"")
         {
             // QString expects a null-terminated UCS-2 string.
@@ -849,8 +865,7 @@
         }
         else
         {
-            // or from local8bit ??
-            return QString::fromLatin1(comment.c_str());
+            return detectEncodingAndDecode(comment);
         }
     }
     catch( Exiv2::Error &e )
@@ -863,6 +878,51 @@

     return QString();
 }

+QString DMetadata::detectEncodingAndDecode(const std::string &value)
+{
+    // For charset autodetection, we could use sophisticated code
+    // (Mozilla chardet, KHTML's autodetection, QTextCodec::codecForContent),
+    // but that is probably too much.
+    // We check for UTF8, Local encoding and ASCII.
+
+    if (value.empty())
+        return QString();
+
+#if KDE_IS_VERSION(3,2,0)
+    if (KStringHandler::isUtf8(value.c_str()))
+    {
+        return QString::fromUtf8(value.c_str());
+    }
+#else
+    // anyone who is still running KDE 3.0 or 3.1 is missing so many features
+    // that he will have to accept this missing feature.
+    return QString::fromUtf8(value.c_str());
+#endif
+
+    // Utf8 has a pretty unique byte pattern.
+    // Thats not true for ASCII, it is not possible
+    // to reliably autodetect different ISO-8859 charsets.
+    // We try if QTextCodec can decide here, otherwise we use Latin1.
+    // Or use local8Bit as default?
+
+    // load QTextCodecs
+    QTextCodec *latin1Codec = QTextCodec::codecForName("iso8859-1");
+    //QTextCodec *utf8Codec = QTextCodec::codecForName("utf8");
+    QTextCodec *localCodec = QTextCodec::codecForLocale();
+
+    // make heuristic match
+    int latin1Score = latin1Codec->heuristicContentMatch(value.c_str(), value.length());
+    int localScore = localCodec->heuristicContentMatch(value.c_str(), value.length());
+
+    // convert string:
+    // Use whatever has the larger score, local or ASCII
+    if (localScore >= 0 && localScore >= latin1Score)
+        return localCodec->toUnicode(value.c_str(), value.length());
+    else
+        return QString::fromLatin1(value.c_str());
+}
+
+
 /*
 Iptc.Application2.Urgency <==> digiKam Rating links:

--- trunk/extragear/graphics/digikam/libs/dmetadata/dmetadata.h #543271:543272
@@ -21,6 +21,8 @@
 #ifndef DMETADATA_H
 #define DMETADATA_H

+#include <string>
+
 // QT includes.

 #include <qcstring.h>
@@ -108,6 +110,7 @@
     PhotoInfoContainer getPhotographInformations() const;

     static QString convertCommentValue(const Exiv2::Exifdatum &comment);
+    static QString detectEncodingAndDecode(const std::string &value);

 private:

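The autodetection in this commit rests on the observation that UTF-8 has a distinctive byte pattern. As a rough illustration, here is a standalone sketch with no Qt or KDE dependency: it checks whether a byte string is well-formed UTF-8 and otherwise falls back to Latin-1. Both function names are hypothetical. The validity check is a stand-in for KStringHandler::isUtf8, whose exact rules may differ, and the sketch leaves out the QTextCodec step that weighs the local charset against latin1.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for a UTF-8 check: returns true if `s` is well-formed
// UTF-8, rejecting overlong forms, surrogates and code points above U+10FFFF.
bool isValidUtf8(const std::string &s)
{
    const std::size_t n = s.size();
    std::size_t i = 0;
    while (i < n)
    {
        const unsigned char c = static_cast<unsigned char>(s[i]);
        std::size_t extra;   // number of continuation bytes expected
        unsigned long cp;    // code point being assembled
        unsigned long minCp; // smallest code point allowed for this length

        if (c < 0x80)                { extra = 0; cp = c;        minCp = 0; }
        else if ((c & 0xE0) == 0xC0) { extra = 1; cp = c & 0x1F; minCp = 0x80; }
        else if ((c & 0xF0) == 0xE0) { extra = 2; cp = c & 0x0F; minCp = 0x800; }
        else if ((c & 0xF8) == 0xF0) { extra = 3; cp = c & 0x07; minCp = 0x10000; }
        else return false;           // stray continuation byte or invalid lead

        if (extra > n - i - 1)
            return false;            // sequence is cut off at the end

        for (std::size_t k = 1; k <= extra; ++k)
        {
            const unsigned char cc = static_cast<unsigned char>(s[i + k]);
            if ((cc & 0xC0) != 0x80)
                return false;        // not a continuation byte
            cp = (cp << 6) | (cc & 0x3F);
        }

        if (cp < minCp || cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return false;

        i += extra + 1;
    }
    return true;
}

// Hypothetical helper: return the comment as UTF-8. Valid UTF-8 is kept as is;
// anything else is assumed to be Latin-1 and converted byte by byte.
std::string decodeCommentToUtf8(const std::string &raw)
{
    if (isValidUtf8(raw))
        return raw;

    std::string out;
    for (unsigned char c : raw)
    {
        if (c < 0x80)
        {
            out += static_cast<char>(c);
        }
        else
        {
            out += static_cast<char>(0xC0 | (c >> 6));
            out += static_cast<char>(0x80 | (c & 0x3F));
        }
    }
    return out;
}
```

Pure ASCII passes the UTF-8 check and is returned untouched, so only strings with invalid byte sequences reach the Latin-1 fallback. That fallback is also the weak point, because different ISO-8859 charsets cannot be told apart by byte pattern alone, as the code comment in the commit already notes.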
marcel.wiesweg gmx de changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|UNCONFIRMED                 |RESOLVED
         Resolution|                            |FIXED

------- Additional Comments From marcel.wiesweg gmx de 2006-05-22 20:42 -------
We now have support for reading, autodetecting and writing comments as UTF8.
Closing this bug.