View Issue Details
|ID||Project||Category||View Status||Date Submitted||Last Update|
|0002078||SkyChart||1-Software||public||19-02-07 06:50||19-02-21 15:59|
|Reporter||Sasa||Assigned To||Patrick Chevalley|
|Platform||Linux||OS||Ubuntu 64-bit||OS Version||18.10|
|Product Version||4.1 SVN|
|Target Version||4.2||Fixed in Version||4.1 SVN|
|Summary||0002078: Slow picture initialization when DSO is present - Ubuntu raises "SkyChart is not responding"|
sudo apt-get install skychart-data-stars skychart-data-dso skychart-data-pictures
When the ./skychart directory is deleted and CdC is started, the initialization process begins; however, when it starts to process the pictures it appears to freeze. It is actually working and updating the progress bar, but extremely slowly and at large intervals (compared to the default catalog). Ubuntu also complains that CdC is not responding...
Since this was tested on a brand new and fast PC, I would suggest updating the progress bar and executing Application.ProcessMessages based on a time interval rather than on a percentage. Or perhaps make it threaded, which is overkill IMO, as it happens only once.
|Tags||No tags attached.|
Screenshot from 2019-02-07 06-12-40.png (69,381 bytes)
This does not need a fast processor or multi-threading, because the limitation is the disk access needed to read the header of every FITS file.
For example, on my computer it takes about 30 seconds when run from a 5400rpm disk, about 5 seconds from an SSD, and less than 3 seconds when run a second time with all the files in the memory cache.
But you are right, there were not enough ProcessMessages calls, only 25 for the whole process.
I have increased that to 250, which should be enough to avoid this message.
Sorry, I have to reopen this.
Something is really wrong here: I have a standard 5400rpm HDD and a very fast PC, but this process lasts 15 minutes. There are 9701 items to process.
It may be an SQLite version issue, an issue with the internal cache, or something else. I have no idea, since this is a really fast PC.
Is it possible with SQLite to build a temporary database in memory and simply flush it all to the file when done? That would be really fast.
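The "build in RAM, flush once" idea suggested above can be sketched with SQLite's online backup API. This is a minimal illustration using Python's stdlib sqlite3 module (the project itself is Pascal); the table and column names here are made up for the example, not SkyChart's real schema:

```python
import os
import sqlite3
import tempfile

# Fill an in-memory database, then copy it to the target file in one
# sequential pass via the SQLite backup API, instead of paying journal
# sync costs for every insert on a slow disk.
mem = sqlite3.connect(":memory:")
mem.execute("CREATE TABLE pictures (objectname TEXT, ra REAL, de REAL)")
mem.executemany("INSERT INTO pictures VALUES (?, ?, ?)",
                [("M31", 10.684, 41.269), ("M42", 83.822, -5.391)])
mem.commit()

path = os.path.join(tempfile.mkdtemp(), "pictures.db")
disk = sqlite3.connect(path)
mem.backup(disk)  # one sequential write of the whole database to the file
disk.commit()

rows = disk.execute("SELECT COUNT(*) FROM pictures").fetchone()[0]
print(rows)  # -> 2
```

The equivalent in C is sqlite3_backup_init / sqlite3_backup_step / sqlite3_backup_finish; whether the passql wrapper exposes it is a separate question.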
OK, I see the problem. For all my testing, my SQLite db stayed in my $HOME on the SSD.
If I move the db to the 5400rpm disk it is very slow, as you describe. I did not realize how much SQLite benefits from the SSD.
There is already a "begin transaction" before the insertion, but this does not help.
This needs more testing.
Interesting, the following change makes it finish in a few seconds on the HDD:
In cu_database.pas, insert these two lines between lines 1517 and 1518:
n := 0;
db.Query('PRAGMA journal_mode = MEMORY'); //NEW
db.Query('PRAGMA synchronous = OFF'); //NEW
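The effect of these two PRAGMAs can be sketched with Python's stdlib sqlite3 module (illustrative only; the project code is Pascal). journal_mode = MEMORY keeps the rollback journal in RAM, and synchronous = OFF skips the fsync after each commit; both trade crash durability for far fewer disk seeks, which is what hurts on a 5400rpm HDD:

```python
import os
import sqlite3
import tempfile

# A file-backed database is needed: PRAGMAs that control the journal file
# are meaningless for a pure in-memory database.
path = os.path.join(tempfile.mkdtemp(), "test.db")
db = sqlite3.connect(path)

# journal_mode is a query-style PRAGMA: it returns the mode now in effect.
mode = db.execute("PRAGMA journal_mode = MEMORY").fetchone()[0]
db.execute("PRAGMA synchronous = OFF")  # no fsync after each transaction

print(mode)  # -> "memory"
db.close()
```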
Yes, but the real time is about 1 min. If I delete the ./skychart dir and try again, it takes a few seconds because of the internal cache. The real speedup is 15x, and that may also be usable for another issue I submitted a while ago.
However, I'm not sure when the journal file was introduced, maybe 3.4.x, I do not really remember. And I'm not sure whether you supply a specific SQLite library per platform or rely on the OS repository package dependency during installation.
A second consideration may be the memory used on some specific platforms.
In any case, another speedup is to avoid plain SQL commands and use the binding methods. In this case, that is irrelevant.
In general, for Linux and PC, this solution is more than suitable, with a 15x speedup.
Yes, I will move these instructions to just after opening the database, so this also increases performance for the asteroid and other operations.
But I still want to investigate a bit more, because normally the StartTransaction function sends a BEGIN to SQLite, which should prevent it from doing the sync I/O for every insert.
Memory journaling was added in 3.6.5, and the version I include for Windows is 3.7.2. On Ubuntu I am at 3.24.0. I will take the occasion to refresh the Windows version anyway.
I'm referring specifically to the journal file and when it was introduced, since I'm not certain how the SQLite wrapper behaves if that PRAGMA is not supported by a specific library...
Another thing that should be taken into consideration is the SQLite cache size. But I do not know the details, as I do not use the latest SQLite versions. Some older, specific bug-free versions of the library were radically faster for some applications than the current ones.
I think I understand how these SQLite transactions work; it is not the same as with MySQL or Oracle. Because each query is unitary, it does not take into account a "BEGIN TRANSACTION" sent in a previous query.
So the complete transaction must be put in a single string, starting with BEGIN; and ending with COMMIT;, with all the INSERTs between them.
Done this way, it takes only one second to insert all the rows at the end, when the progress bar is at 100%. There is no need for the PRAGMAs and I have removed them.
Now all the slow part covered by the progress bar is reading the FITS files and computing the coordinates, without any SQLite operation.
I have also updated the SQLite Windows DLL to the latest version, 3.27.
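The single-string BEGIN/COMMIT approach described above can be sketched with Python's stdlib sqlite3 module (illustrative only; values and schema are simplified). All INSERTs are concatenated between BEGIN; and COMMIT; and handed to SQLite in one call, so the journal is synced once for the whole batch instead of once per row:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE cdc_fits (objectname TEXT, ra REAL, de REAL)")

# Build one script: BEGIN, all the inserts, COMMIT.
stmts = ["BEGIN;"]
for name, ra, de in [("M31", 10.684, 41.269), ("M42", 83.822, -5.391)]:
    stmts.append(f"INSERT INTO cdc_fits VALUES ('{name}', {ra}, {de});")
stmts.append("COMMIT;")

# executescript sends the whole string at once, like a single sqlite3_exec.
db.executescript("\n".join(stmts))

count = db.execute("SELECT COUNT(*) FROM cdc_fits").fetchone()[0]
print(count)  # -> 2
```

Building the statement by string concatenation also means any quote in the data must be escaped by hand, which is one reason the binding approach discussed later is preferable.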
Performance is the same as with the previous method. However, I think I have found the main problem...
While reading the code carefully, I noticed that the Query method is used. This is incorrect; the Execute method should be used. Then the transaction mechanism works fine, adding each line between START and COMMIT while sending each line separately. I have reverted to the old transaction methods.
And once again, the real execution time is about 1 minute. If I delete the ./skychart dir and test again, it takes a few seconds due to the previous internal cache.
I have also added an Application.ProcessMessages every second, which may be set even under 1s if needed. The old code may be removed or corrected accordingly.
I have attached the whole method for testing.
Update.txt (4,793 bytes)
procedure TCDCdb.ScanImagesDirectory(ImagePath: string; ProgressCat: Tlabel; ProgressBar: TProgressBar);
var
  c, f: tsearchrec;
  i, j, n, p: integer;
  catdir, objn, fname, cmd: string;
  dummyfile: boolean;
  ra, de, w, h, r: double;
  ms: QWord; //NEW
begin
  try
    if DirectoryExists(ImagePath) then
    begin
      try
        if DB.Active then
        begin
          ProgressCat.Caption := '';
          ProgressBar.position := 0;
          ms := GetTickCount64; //NEW
          j := findfirst(slash(ImagePath) + '*', faDirectory, c);
          while j = 0 do
          begin
            if ((c.attr and faDirectory) <> 0) and (c.Name <> '.') and (c.Name <> '..') then
            begin
              catdir := slash(ImagePath) + c.Name;
              ProgressCat.Caption := c.Name;
              ProgressBar.position := 0;
              Application.ProcessMessages;
              DB.UnLockTables;
              DB.starttransaction;
              cmd := 'delete from cdc_fits where catalogname="' + uppercase(c.Name) + '"';
              DB.query(cmd);
              DB.commit;
              i := findfirst(slash(catdir) + '*.*', 0, f);
              n := 1;
              while i = 0 do
              begin
                Inc(n);
                i := findnext(f);
              end;
              ProgressBar.min := 0;
              ProgressBar.max := n;
              if (ProgressBar.Max > 250) then
                ProgressBar.Step := ProgressBar.Max div 250
              else
                ProgressBar.Step := 1;
              i := findfirst(slash(catdir) + '*.*', 0, f);
              n := 0;
              db.StartTransaction; // USE THIS
              while i = 0 do
              begin
                if f.Name = 'README.TXT' then
                begin
                  i := findnext(f);
                  continue;
                end;
                Inc(n);
                if (n mod ProgressBar.step) = 0 then
                begin
                  ProgressBar.stepit;
                  Application.ProcessMessages;
                end;
                // NEW statement
                if GetTickCount64 - ms > 1000 then
                begin
                  Application.ProcessMessages;
                  ms := GetTickCount64;
                end;
                dummyfile := uppercase((extractfileext(f.Name))) = '.NIL';
                if dummyfile then
                begin
                  ra := 99 + random(999999999999999);
                  de := 99 + random(999999999999999);
                  w := 0;
                  h := 0;
                  r := 0;
                  fname := slash(catdir) + f.Name;
                end
                else
                begin
                  // this is much faster than cdcwcs_getinfo
                  FFits.FileName := slash(catdir) + f.Name;
                  ra := FFits.Center_RA;
                  de := FFits.Center_DE;
                  w := FFits.Img_Width;
                  h := FFits.img_Height;
                  r := FFits.Rotation;
                  fname := FFits.FileName;
                end;
                if FFits.header.valid or dummyfile then
                begin
                  objn := extractfilename(f.Name);
                  p := pos(extractfileext(objn), objn);
                  objn := copy(objn, 1, p - 1);
                  objn := uppercase(stringreplace(objn, blank, '', [rfReplaceAll]));
                  cmd := 'INSERT INTO cdc_fits (filename,catalogname,objectname,ra,de,width,height,rotation) VALUES (' +
                    '"' + stringreplace(fname, '\', '\\', [rfReplaceAll]) + '"' +
                    ',"' + uppercase(c.Name) + '"' +
                    ',"' + uppercase(objn) + '"' +
                    ',"' + formatfloat(f5, ra) + '"' +
                    ',"' + formatfloat(f5, de) + '"' +
                    ',"' + formatfloat(f5, w) + '"' +
                    ',"' + formatfloat(f5, h) + '"' +
                    ',"' + formatfloat(f5, r) + '"' + ')';
                  // A separator in the SQL command is only needed
                  // if a group of commands is sent at once
                  DB.Execute(cmd); // CORRECT WAY, Query method was the problem
                  //writetrace(Format(rsDBInsertFail, [f.Name, DB.ErrorMessage]));
                end
                else
                  writetrace(Format(rsInvalidFITSF, [f.Name]));
                i := findnext(f);
              end;
              DB.commit;
              findclose(f);
            end;
            j := findnext(c);
          end;
          DB.flush('tables');
        end;
      finally
        findclose(c);
      end;
    end
    else
    begin
      ProgressCat.Caption := 'Directory not found!';
    end;
  except
  end;
end;
And BTW, this Pascal library does not support SQLite's binding facility, which would avoid re-parsing the SQL command for every row. For instance:
cmd := 'INSERT INTO cdc_fits (filename,catalogname,objectname,ra,de,width,height,rotation) ' +
       'VALUES (?,?,?,?,?,?,?,?);';
This would be sent with sqlite3_prepare, then in the loop sqlite3_bind*, sqlite3_step and sqlite3_reset. That would be the fastest possible way, suitable for very slow computers as well.
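The prepare/bind/step/reset cycle described above can be sketched with Python's stdlib sqlite3 module (illustrative only; schema simplified): executemany prepares the statement once and then only binds new values for each row, which is exactly that cycle under the hood.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE cdc_fits (filename TEXT, objectname TEXT, ra REAL)")

rows = [("m31.fits", "M31", 10.684),
        ("m42.fits", "M42", 83.822)]

# One prepared statement, one transaction around the whole batch:
# the SQL text is parsed once, each row only binds parameters.
with db:
    db.executemany("INSERT INTO cdc_fits VALUES (?, ?, ?)", rows)

n = db.execute("SELECT COUNT(*) FROM cdc_fits").fetchone()[0]
print(n)  # -> 2
```

Binding also removes the need to escape quotes and backslashes in the data by hand, unlike the string-concatenated INSERT in the attached procedure.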
Your method with Execute does not work; no data is actually inserted in the database.
The 1 minute execution time is not related to SQLite but to the time needed to read the headers of the 9700 FITS files. Try commenting out all the SQL instructions; you will see it takes almost the same time.
You can purge the disk cache between tests with the command (as root): echo 3 > /proc/sys/vm/drop_caches
If you print the time before and after the DB.query(cmd) in my last version, you see it takes only 1 or 2 seconds.
The passql library is old and not maintained; it probably has issues and does not correctly use the features that have been introduced in SQLite over the last 15 years. I used it at a time when there was no other alternative in Lazarus.
Now the native Lazarus DB components are good and it is worth changing to them.
I suggest we keep it as is for version 4.2, and after the release take the occasion of removing MySQL in 4.4 to change to a better SQL component.
Yes, you are right, the Execute method is not implemented correctly in the library. Give me some time to look into it and correct it.
BTW, I made my own wrapper a long time ago; it works fine even today, and this transaction mode should work fine as implemented, and very fast. I do not have 10K records, but more than 100K with many more columns...
I have implemented the Execute method with a plain sqlite3_exec; however, there is very high latency, as it uses a journal file on the HDD, and again we have to use a PRAGMA to create it in memory. The library itself uses quite a bit of LOCK/UNLOCK mechanism by default, and that may also trigger creation of the journal file...
With an older, fast version of SQLite (I have looked at old code), a transaction worked fast up to some 10K inserts, then slowed down progressively; thus, when there were more records to insert, I separated them into blocks of 10K per transaction, but everything was executed line by line and it was extremely fast.
The best performance is to keep it as you created it already (it is under 10K anyway, which should not be an issue nowadays). Perhaps only use a TStringList to avoid constant memory reallocation when concatenating strings.
However there is another way...
From 3.7.11, it is possible to insert multiple rows with one line only:
INSERT INTO 'name' ('c1', 'c2') VALUES
Which I tested and it works fine. However, I'm not certain does MySQL support it and which version.
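The multi-row VALUES form mentioned above can be demonstrated with Python's stdlib sqlite3 module (illustrative table name and values):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE t (c1 TEXT, c2 INTEGER)")

# One statement, several rows (supported by SQLite since 3.7.11):
# one parse and one insert operation for the whole list.
db.execute("INSERT INTO t (c1, c2) VALUES ('a', 1), ('b', 2), ('c', 3)")
db.commit()

n = db.execute("SELECT COUNT(*) FROM t").fetchone()[0]
print(n)  # -> 3
```

Note that SQLite has a compile-time limit on the length of a single SQL statement (SQLITE_MAX_SQL_LENGTH), so very large batches may still need to be split into chunks.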
At end, since this code should work with MySQL until 4.4, perhaps BEGIN and COMMIT are not recognized as start and end of transaction.
Yes, I have not tested my change with MySQL and it is not certain it works.
I will test, and if it does not work I will revert this change and leave it as is for 4.2.
Then for 4.4 we can work on a clean new implementation for SQLite only.
In that case, all I need are bindings for the SQLite functions, with some utility functions so as not to mess with PChar everywhere.
I really prefer to work with Prepare, Execute, Fetch, ... if possible.
I can see you reverted the changes. However, a global variable indicating which database engine is in use could be introduced here, and then engine-specific code/procedures could be used. The benefit of this speedup for SQLite is so massive that it is worth keeping and using for as long as both engines are supported.
OK, I have made the change again, but only for SQLite:
A big problem with this method is that if a single insert fails, all the changes are rolled back.
This is why I do not want to generalize it to other inserts. But for this one the data are the same for all users and it is well tested, so this should not be a problem.
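The rollback concern described above can be illustrated with Python's stdlib sqlite3 module (illustrative schema): everything between BEGIN and COMMIT is atomic, so one failing INSERT discards the rows already inserted in the same transaction.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE t (id INTEGER PRIMARY KEY)")

try:
    with db:  # one transaction: commits on success, rolls back on error
        db.execute("INSERT INTO t VALUES (1)")
        db.execute("INSERT INTO t VALUES (1)")  # duplicate key -> error
except sqlite3.IntegrityError:
    pass  # the whole transaction was rolled back, including the first row

n = db.execute("SELECT COUNT(*) FROM t").fetchone()[0]
print(n)  # -> 0: the successful first insert was discarded too
```

This is why the batched transaction is safe here only because the picture data is identical for all users and well tested; with arbitrary user data, per-row error handling would be needed.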
|19-02-07 06:50||Sasa||New Issue|
|19-02-07 14:18||Sasa||File Added: Screenshot from 2019-02-07 06-12-40.png|
|19-02-07 18:17||Patrick Chevalley||Assigned To||=> Patrick Chevalley|
|19-02-07 18:17||Patrick Chevalley||Status||new => resolved|
|19-02-07 18:17||Patrick Chevalley||Resolution||open => fixed|
|19-02-07 18:17||Patrick Chevalley||Fixed in Version||=> 4.1 SVN|
|19-02-07 18:17||Patrick Chevalley||Target Version||=> 4.2|
|19-02-07 18:17||Patrick Chevalley||Note Added: 0005355|
|19-02-07 23:20||Sasa||Status||resolved => new|
|19-02-07 23:20||Sasa||Resolution||fixed => reopened|
|19-02-07 23:20||Sasa||Note Added: 0005358|
|19-02-07 23:22||Sasa||Note Edited: 0005358||View Revisions|
|19-02-07 23:57||Patrick Chevalley||Note Added: 0005359|
|19-02-07 23:57||Patrick Chevalley||Status||new => assigned|
|19-02-08 00:15||Patrick Chevalley||Note Added: 0005360|
|19-02-08 01:36||Sasa||Note Added: 0005361|
|19-02-08 09:37||Patrick Chevalley||Note Added: 0005363|
|19-02-08 09:59||Sasa||Note Added: 0005364|
|19-02-08 10:21||Sasa||Note Edited: 0005364||View Revisions|
|19-02-08 17:33||Sasa||Note Edited: 0005358||View Revisions|
|19-02-08 23:43||Patrick Chevalley||Note Added: 0005366|
|19-02-09 08:08||Sasa||File Added: Update.txt|
|19-02-09 08:08||Sasa||Note Added: 0005367|
|19-02-09 08:20||Sasa||Note Added: 0005368|
|19-02-09 08:25||Sasa||Note Edited: 0005367||View Revisions|
|19-02-09 08:26||Sasa||Note Edited: 0005368||View Revisions|
|19-02-09 08:27||Sasa||Note Edited: 0005368||View Revisions|
|19-02-09 08:29||Sasa||Note Edited: 0005367||View Revisions|
|19-02-09 09:48||Patrick Chevalley||Note Added: 0005371|
|19-02-09 12:44||Sasa||Note Added: 0005377|
|19-02-10 08:40||Sasa||Note Added: 0005379|
|19-02-10 08:42||Sasa||Note Edited: 0005379||View Revisions|
|19-02-10 14:13||Patrick Chevalley||Note Added: 0005380|
|19-02-20 07:21||Sasa||Note Added: 0005424|
|19-02-20 07:40||Sasa||Note Edited: 0005424||View Revisions|
|19-02-20 07:40||Sasa||Note Edited: 0005424||View Revisions|
|19-02-21 15:59||Patrick Chevalley||Status||assigned => resolved|
|19-02-21 15:59||Patrick Chevalley||Resolution||reopened => fixed|
|19-02-21 15:59||Patrick Chevalley||Note Added: 0005430|