Hi,
I did Extras > Management… (Extras > Wartung… in German) in my digiKam 4.9 installation from the PPA, and then started a database sync from the files to the database.

It takes hours (after 4 h it has currently reached 14%) and it shows a high disk write rate: 14 MB/s. Why does it write so much, and where? The database size does not change much (in those 4 hours it has grown by 1 MB) and the image files do not seem to be touched (as expected when doing a sync files->database). If it just changes the database contents at 14 MB/s, couldn't it do that in memory and then dump the result into the file only once?

So where does it write to at 14 MB/s?

Thanks for clarification,
Joram

_______________________________________________
Digikam-users mailing list
[hidden email]
https://mail.kde.org/mailman/listinfo/digikam-users
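One way to check which process is responsible for such a write rate on Linux is to sample the kernel's per-process I/O counters in /proc/<pid>/io twice and take the difference. The following is only a minimal sketch: the helper names are made up for illustration, the sample values are invented to mirror the 14 MB/s from the question, and the counters say how much a process wrote, not to which file (listing the open files of the process, for example with `lsof -p <pid>`, would be a separate step).

```python
def parse_proc_io(text):
    """Parse the 'key: value' lines of /proc/<pid>/io into a dict of ints."""
    counters = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            counters[key.strip()] = int(value)
    return counters


def read_proc_io(pid):
    """Read the I/O counters of one process (Linux only, needs permission)."""
    with open(f"/proc/{pid}/io") as handle:
        return parse_proc_io(handle.read())


def write_rate_mb_per_s(before, after, seconds):
    """Average write rate between two samples, in MB/s (1 MB = 10**6 bytes)."""
    return (after["write_bytes"] - before["write_bytes"]) / seconds / 1_000_000


# Invented sample data: two snapshots taken 10 seconds apart.
SAMPLE_BEFORE = "rchar: 1000\nwchar: 2000\nread_bytes: 4096\nwrite_bytes: 0\n"
SAMPLE_AFTER = "rchar: 9000\nwchar: 9000\nread_bytes: 8192\nwrite_bytes: 140000000\n"

rate = write_rate_mb_per_s(
    parse_proc_io(SAMPLE_BEFORE), parse_proc_io(SAMPLE_AFTER), seconds=10
)
print(f"{rate:.1f} MB/s")
```

On a live system, `read_proc_io(pid)` would be called twice with the PID of the digiKam process, with a `time.sleep()` in between.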
Noeck, I think database sync means that you sync the metadata in your database to your files. So it is going through all your pictures and writing the data into them. That may be why there is so much disk activity. But I am not sure about it.

Hope that helps!
In reply to this post by Noeck
Henrique Santos Fernandes wrote:
> So it is going through all your pictures and writing the data into them.
> That may be why there is so much disk activity.

Thanks for your reply. I chose "From image metadata to database" and not "From database to image metadata", so according to my understanding, only the database changes. And indeed the image files were not modified (according to the file system).

In the end, the mentioned database sync took 25 h with a constant write rate of 12 - 15 MB/s, which naively calculated sums up to 1.2 TB (!). The database only grew from 89 MB to 91 MB.

My guess is that the database is updated for each file separately and completely, no matter whether it contained changes or not, such that all this written data replaces the previous content. This would explain why no additional space was used (free space on disk was only reduced by a few MB).

The thumbnails-digikam.db was not updated at the same time (no time stamp change). It is 950 MB in size.

What puzzles me is that the sum of all images is only 200 GB (50k files), so much less than the data written to disk.

Thanks for all further insights,
Joram
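The "naively calculated" total can be reproduced with a few lines. This is just the arithmetic from the message above (rate times duration, decimal units), plus the resulting average per image file; the function name is an illustrative choice, not anything from digiKam.

```python
def total_written_tb(rate_mb_per_s, hours):
    """Total data written in TB, using decimal units (1 TB = 1,000,000 MB)."""
    seconds = hours * 3600
    return rate_mb_per_s * seconds / 1_000_000


HOURS = 25          # duration of the sync, from the message
FILES = 50_000      # number of image files, from the message
COLLECTION_TB = 0.2 # total size of all images (200 GB), from the message

low = total_written_tb(12, HOURS)     # lower bound of the observed rate
high = total_written_tb(15, HOURS)    # upper bound of the observed rate
mid = total_written_tb(13.5, HOURS)   # midpoint of 12 - 15 MB/s

print(f"written: {low:.2f} - {high:.2f} TB (midpoint {mid:.3f} TB)")
print(f"ratio to collection size: {mid / COLLECTION_TB:.1f}x")
print(f"average written per file: {mid * 1_000_000 / FILES:.1f} MB")
```

At the midpoint rate this gives about 1.2 TB in total, roughly six times the size of the whole collection, or on the order of 24 MB written per image file.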
In reply to this post by Noeck
Hi Joram,
digiKam normally uses SQLite, which is a single-file database. Each change has to be written to disk immediately to prevent data loss (and sometimes the database file has to be read back after each change). If you have a lot of changes to make, this can take a lot of time. Once you have started this sync, you have to wait until it finishes.

In later use there are normally only a few changes, and for that this type of database works well.

Johannes
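The cost of writing every change durably can be illustrated with Python's built-in sqlite3 module. This is only a sketch of the general SQLite behaviour described above, not of how digiKam itself accesses its database: the table `meta` and the function `insert_rows` are hypothetical, and the actual timings depend heavily on the drive and file system.

```python
import os
import sqlite3
import tempfile
import time


def insert_rows(db_path, rows, commit_each):
    """Insert rows, committing after every row or once at the end.

    Returns (row count in the table, elapsed seconds for the inserts).
    """
    conn = sqlite3.connect(db_path)
    try:
        # Hypothetical table, not digiKam's schema.
        conn.execute("CREATE TABLE IF NOT EXISTS meta (name TEXT, rating INTEGER)")
        conn.commit()
        start = time.perf_counter()
        for row in rows:
            conn.execute("INSERT INTO meta VALUES (?, ?)", row)
            if commit_each:
                conn.commit()  # every row becomes its own durable transaction
        conn.commit()  # commits the single large transaction if commit_each is False
        elapsed = time.perf_counter() - start
        count = conn.execute("SELECT COUNT(*) FROM meta").fetchone()[0]
    finally:
        conn.close()
    return count, elapsed


if __name__ == "__main__":
    rows = [(f"img_{i}.jpg", i % 6) for i in range(200)]
    with tempfile.TemporaryDirectory() as folder:
        per_row = insert_rows(os.path.join(folder, "per_row.db"), rows, True)
        batched = insert_rows(os.path.join(folder, "batched.db"), rows, False)
    print(f"commit per row: {per_row[0]} rows in {per_row[1]:.3f} s")
    print(f"one commit:     {batched[0]} rows in {batched[1]:.3f} s")
```

Both runs end with the same table contents; the difference is only in how often the file and its journal are flushed to disk, which is why the final database size can stay almost constant while the amount of data written is large.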
In reply to this post by Noeck
2015-05-10 23:54 GMT+02:00 Noeck <[hidden email]>:
> Thanks for your reply. I chose "From image metadata to database" and not
> "From database to image metadata", so according to my understanding,
> only the database changes. And indeed the image files were not modified
> (according to the file system).

Yes, exactly, but... the image metadata must be read to re-populate the DB. This is delegated to the Exiv2 shared library.

> In the end, the mentioned database sync took 25h with a constant write
> rate of 12 - 15 MB/s. Which naively calculated sums up to 1.2 TB (!).

I don't understand this value. Are you sure that 15 MB/s is for writing and not reading?

> The database only grew from 89 MB to 91 MB.

This value is correct.

> My guess was that the database is updated for each file separately and
> completely, no matter whether it contained changes or not.

Yes.

> Such that all this amount of written data replaces the previous content,
> this would explain why no additional space was used (free space on disk
> only reduced by a few MBs).

Yes.

> The thumbnails-digikam.db was not updated at the same time (no time
> stamp change). It has 950 MB.

The size can be correct, as it stores the image data with PGF wavelet compression (before, following the desktop.org paper, we used PNG, which easily inflates the size).

> What puzzles me is that the sum of all images is only 200 GB (50k files)
> so much less than the data written to disk.

The problem may be related to a bug where a wrong album list was passed to the maintenance tools. It was discovered recently and fixed in 4.10.0.

Q: did you use the multicore option in the Maintenance dialog?

Gilles Caulier
Hi Gilles,
thanks for your reply and for confirming my guesses. Please find answers to your questions inline.

>> I chose "From image metadata to database" and not
>> "From database to image metadata", so according to my understanding,
>> only the database changes. And indeed the image files were not modified
>> (according to the file system).
>
> yes exactly, but... image metadata must be read to re-populate DB.
> It's delegate to Exiv2 shared library.

OK. But that means reading, not writing, and the system monitor showed it as written.

>> In the end, the mentioned database sync took 25h with a constant write
>> rate of 12 - 15 MB/s. Which naively calculated sums up to 1.2 TB (!).
>
> I don't understand this value. Are you sure that 15 MB/s is for
> writing and not reading ?

Yes. This is what astonishes me.

>> What puzzles me is that the sum of all images is only 200 GB (50k files)
>> so much less than the data written to disk.
>
> Possible problem can be relevant of a bug about wrong albums list
> passed to maintenance tools, discovered recently and fixed in 4.10.0
>
> Q : did you use multicore option in Maintenance dialog ?

No, this was done without the multicore option, so there might be some gain here. However, it seems pretty much I/O bound.

And I chose all albums and all tags (no selection here). I was wondering whether that scans all files twice?

Is 4.10 already in the PPA for Ubuntu? I could test it then.

Cheers,
Joram
2015-05-13 19:18 GMT+02:00 Noeck <[hidden email]>:
> And I chose all albums and all tags (no selection here). I was wondering
> if that scans all files twice?

Yes, I suspect this. See this entry in Bugzilla:

https://bugs.kde.org/show_bug.cgi?id=342791

This affects the Thumbnail Generator, Quality Sorter, and Fingerprints Generator.

We must check whether other maintenance tools are affected (in your case the DB synchronizer).

Gilles Caulier
This is a copy of a private mail from another developer who has tried to reproduce the problem:

> The metadatasynchronizer processed TAlbums correctly. I have moved my
> DB to the SSD for testing. My approximately 20000 images then require
> about 6 minutes. One should of course not select tags, because otherwise
> everything is synchronized twice.

Gilles Caulier
Dear Gilles,
I tested it once again with these settings: all albums (+ checked), all tags (not checked), multicore, and only the sync from files to the database (no other maintenance tools). The difference to the last time is that "all tags" is not checked and multicore is selected. I am now using digiKam 4.10.0 under KDE 4.13.3.

It is now faster and shows these numbers on average:

load: 4 on four cores
CPU: 25%, 45% wait
disk: read 25 MB/s, write 12 MB/s

It froze about 20 times (and stopped writing and reading) but resumed after about 10-30 seconds each time. At 72% it stopped and a message popped up 3 times within 5 min saying »Der Prozess für das Protokoll digikamtags wurde unerwartet beendet.« which roughly translates as: »The process for the protocol digikamtags was terminated unexpectedly.«

The total time was now 27:19; the final message was something like »All processes terminated successfully«.

The write rate is probably just the maximum speed of the drive. The database did not change this time (exact same number of bytes), which makes sense as there are no new images and no new information in the existing images.

This is now much more consistent with my expectations. Considering the dead times summing up to about 15 min and the fact that my collection is twice the size of the mentioned 20000, this is quite close to the other developer's result.

One more question: my database was created with digiKam 2.5 and later used with 3.5 and 4.2. The switch to 4.9 was very recent. Might it be that it had to be rewritten drastically for that reason?

tl;dr: Much faster in the 2nd run and with 4.10. Might an old database be the reason?

Cheers,
Joram

PS @Gilles and all involved developers: Thank you very much for this great program! I have been using it for years now and I am very happy with it.
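The comparison with the other developer's run can be made explicit as a throughput estimate. This is a rough sketch under two assumptions that are not stated outright in the thread: that "27:19" means minutes:seconds, and that the collection is taken as 40000 images ("twice the size of the mentioned 20000").

```python
def images_per_second(images, seconds):
    """Average processing throughput."""
    return images / seconds


# Second run: assumed to be 27 min 19 s in total, about 15 min of which were freezes.
total_s = 27 * 60 + 19
dead_s = 15 * 60
active_s = total_s - dead_s

mine = images_per_second(40_000, active_s)  # assumed collection size
other = images_per_second(20_000, 6 * 60)   # other developer: 20000 images in 6 min

print(f"active time: {active_s} s")
print(f"second run:  {mine:.1f} images/s")
print(f"other run:   {other:.1f} images/s")
```

Under these assumptions both runs come out at roughly 55 images per second, which supports the statement that the second run is close to the other developer's result.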