Requests to SQLite3 databases do not wait for the lock, causing errors on simultaneous access #206

Closed
opened 1 year ago by ChristopherCormier · 10 comments

This can be trivially reproduced by trying to submit two posts at the same time to a board, but it can happen commonly out in the wild on any board that has a high likelihood of two people trying to submit something at once.

The largest issue here is that, from a user's perspective, the post looks like it should've succeeded but it doesn't, and they lose everything that they've typed.

"Silently" isn't entirely true, as it does produce three warnings that are visible for a small fraction of a second in the same screen where it updates the thread and index, though it goes by so quickly that it'd be impossible for a user to understand it, and unintelligible even if they did: https://i.imgur.com/iUtTiuq.png

Though not thoroughly tested, this problem should also reasonably affect any and all database operations when using SQLite3, preventing anything from happening concurrently. Notably, accessing the management panel at the same time as someone is making a post causes an error in the management panel as well. I'm suspecting that the reverse is also true, that accessing the management panel would block someone's post if it was timed right.

My basic impression is that the optimal way to fix this would be to raise the SQLite busy timeout value so that other requests will properly wait on the lock to close, "queueing" for it instead of immediately erroring out. Let me know if I'm misunderstanding anything.

This can be trivially reproduced by trying to submit two posts at the same time to a board, but it can happen commonly out in the wild on any board that has a high likelihood of two people trying to submit something at once. The largest issue here is that, from a user's perspective, the post looks like it should've succeeded but it doesn't, and they lose everything that they've typed. "Silently" isn't *entirely* true, as it does produce three warnings that are visible for a small fraction of a second in the same screen where it updates the thread and index, though it goes by so quickly that it'd be impossible for a user to understand it, and unintelligible even if they did: https://i.imgur.com/iUtTiuq.png Though not thoroughly tested, this problem should also reasonably affect any and all database operations when using SQLite3, preventing anything from happening concurrently. Notably, accessing the management panel at the same time as someone is making a post causes an error in the management panel as well. I'm suspecting that the reverse is also true, that accessing the management panel would block someone's post if it was timed right. My basic impression is that the optimal way to fix this would be to raise the SQLite busy timeout value so that other requests will properly wait on the lock to close, "queueing" for it instead of immediately erroring out. Let me know if I'm misunderstanding anything.

Small follow-up: I think I've been able to reproduce this simply from deleting two posts separately in relatively quick succession (7-10 seconds between). There's a decently powerful server backing my instance with no other activity outside of testing. It seems like the lock is lasting longer than is reasonable, especially with the tiny size of the database and the triviality of the database operation. Is it waiting on something else before it closes?

Small follow-up: I think I've been able to reproduce this simply from deleting two posts separately in relatively quick succession (7-10 seconds between). There's a decently powerful server backing my instance with no other activity outside of testing. It seems like the lock is lasting longer than is reasonable, especially with the tiny size of the database and the triviality of the database operation. Is it waiting on something else before it closes?
tslocum added the
bug
label 1 year ago
tslocum closed this issue 1 year ago
Owner

Thanks for reporting this.

Thanks for reporting this.

Reopening for now, as it seems like this has been happening again when my board's at peak hours of activity. Bewilderingly, this includes moments when no posts are even being made, but the board is being accessed a lot.

Anecdotally, it seems like it happens much more frequently when TINYIB_AUTOREFRESH is set to a very low value, but I don't know if that makes sense. It also tends to happen when navigating through the moderation section, presumably from the lastactive value being updated for staff accounts in the database on each page load.

The most common error I've seen lately is:
PHP Warning: SQLite3::exec(): database is locked in /var/www/html/inc/database/sqlite3.php on line 4
But the line number has been almost anything, especially including lines 15, 17, 19, and 33 recently. I've had user reports of this error occasionally eating posts as well, and it's possibly connected to some other issues I've experienced (such as threads sometimes only partially generating, creating a listing on the board index and catalog but without creating an associated res/X.html file).

I don't know if there's any good way for me to test or diagnose the cause of this. If I query the setting in the database with PRAGMA busy_timeout;, it claims to be set to 0 instead of the expected 60000, but I don't know if that has any actual relation to the timeout that's being set in sqlite3_link.php

Reopening for now, as it seems like this has been happening again when my board's at peak hours of activity. Bewilderingly, this includes moments when no posts are even being made, but the board is being accessed a lot. Anecdotally, it seems like it happens *much* more frequently when TINYIB_AUTOREFRESH is set to a very low value, but I don't know if that makes sense. It also tends to happen when navigating through the moderation section, presumably from the lastactive value being updated for staff accounts in the database on each page load. The most common error I've seen lately is: `PHP Warning: SQLite3::exec(): database is locked in /var/www/html/inc/database/sqlite3.php on line 4` But the line number has been almost anything, especially including lines 15, 17, 19, and 33 recently. I've had user reports of this error occasionally eating posts as well, and it's possibly connected to some other issues I've experienced (such as threads sometimes only partially generating, creating a listing on the board index and catalog but without creating an associated res/X.html file). I don't know if there's any good way for me to test or diagnose the cause of this. If I query the setting in the database with `PRAGMA busy_timeout;`, it claims to be set to 0 instead of the expected 60000, but I don't know if that has any actual relation to the timeout that's being set in sqlite3_link.php
ChristopherCormier reopened this issue 11 months ago
Owner

Please try 5d191f3513 from the sqlitefix branch, which enables write-ahead logging.

Please try 5d191f351316793c4b31a3819ddd7bc29cfae41c from the sqlitefix branch, which enables [write-ahead logging](https://www.sqlite.org/wal.html).
Owner

Also, please try replacing this line

$db->busyTimeout(60000);

with

$db->exec("PRAGMA busy_timeout = 60000");

and let me know if busy_timeout is updated.

Also, please try replacing this line `$db->busyTimeout(60000);` with `$db->exec("PRAGMA busy_timeout = 60000");` and let me know if `busy_timeout` is updated.

Sorry for the long wait, I only just now got around to adopting those fixes. I've replaced that line in my sqlite3_link.php file with $db->exec("PRAGMA busy_timeout = 60000"); and $db->exec("PRAGMA journal_mode = wal"); as shown here: https://i.imgur.com/9d5D416.png

Time will tell if this fixes the problem, but I'll note that querying busy_timeout through the interactive shell still returns 0, even after TinyIB has modified the database with this new code added.

That said, perhaps this issue stems from my inexperience with SQL. Even if I copy the database to a separate file that's unmanaged by TinyIB, manually run PRAGMA busy_timeout = 60000; on it, confirm it saved by querying the pragma immediately after, and then save and exit the database, the value has reset whenever I freshly open it again.

Is this pragma value meant to be persistently stored in the database after being set? Or is it only remembered as long as your connection with the database remains open? I suspect I'm doing something wrong.

Sorry for the long wait, I only just now got around to adopting those fixes. I've replaced that line in my sqlite3_link.php file with `$db->exec("PRAGMA busy_timeout = 60000");` and `$db->exec("PRAGMA journal_mode = wal");` as shown here: https://i.imgur.com/9d5D416.png Time will tell if this fixes the problem, but I'll note that querying busy_timeout through the interactive shell still returns `0`, even after TinyIB has modified the database with this new code added. That said, perhaps this issue stems from my inexperience with SQL. Even if I copy the database to a separate file that's unmanaged by TinyIB, manually run `PRAGMA busy_timeout = 60000;` on it, confirm it saved by querying the pragma immediately after, and then save and exit the database, the value has reset whenever I freshly open it again. Is this pragma value meant to be persistently stored in the database after being set? Or is it only remembered as long as your connection with the database remains open? I suspect I'm doing something wrong.

Update: Doesn't seem to have made a difference unfortunately. During another spike of activity, my error.log saw a sizable number of "PHP Warning: SQLite3::exec(): database is locked in [...]/inc/database/sqlite3.php on line 4" entries.

Worth noting that "PRAGMA journal_mode;" does print wal now, but "PRAGMA busy_timeout;" continues printing 0

Update: Doesn't seem to have made a difference unfortunately. During another spike of activity, my error.log saw a sizable number of "PHP Warning: SQLite3::exec(): database is locked in [...]/inc/database/sqlite3.php on line 4" entries. Worth noting that "PRAGMA journal_mode;" *does* print `wal` now, but "PRAGMA busy_timeout;" continues printing `0`
Owner

Thanks. I will look into this more when I have some time. If this is being caused by a race condition, #251 should fix this.

Thanks. I will look into this more when I have some time. If this is being caused by a race condition, #251 should fix this.
Owner

Please let me know whether this was indeed fixed by #251 (which I've just implemented).

Please let me know whether this was indeed fixed by #251 (which I've just implemented).

My boards have been less active anyways lately so I can't say for sure, but I haven't had any instances of this error since pulling in 57b50658a2 a couple months ago. I'll close this for now and assume it's fixed. Thank you for your work as always!

My boards have been less active anyways lately so I can't say for sure, but I haven't had any instances of this error since pulling in 57b50658a2 a couple months ago. I'll close this for now and assume it's fixed. Thank you for your work as always!
ChristopherCormier closed this issue 1 month ago
Sign in to join this conversation.
No Milestone
No Assignees
2 Participants
Notifications
Due Date

No due date set.

Dependencies

This issue currently doesn't have any dependencies.

Loading…
There is no content yet.