FS#6619 - Game freezes at start-up if the -n command line option is used.

Attached to Project: OpenTTD
Opened by sincx (sincx) - Tuesday, 05 September 2017, 06:53 GMT
Type Bug
Category Core
Status New
Assigned To No-one
Operating System All
Severity Critical
Priority Normal
Reported Version 1.7.1
Due in Version Undecided
Due Date Undecided
Percent Complete 0%
Votes 0
Private No


The game freezes at start-up if the -n command line option is used to connect to a specific server.

On 1.6.0, using the command line option "openttd -n [server ip]" works fine.
On 1.7.1 or 1.7.0, using the same command line to connect to the exact same server (after adjusting the version of the game running on the server to match) results in a freeze at the Scanning NewGRFs screen.

Comment by James (james1101) - Tuesday, 05 September 2017, 21:51 GMT
Able to reproduce on Win 7 Ultimate.
Ram: 4 GB, CPU: 2 cores, 2 GHz

Task Manager shows the client's CPU usage averaging near 0.
Comment by Peter (gpsoft) - Monday, 27 November 2017, 12:21 GMT
I can also reproduce it on Windows 8.1.
It looks like a deadlock on mutexes.
1.6.1 is still fine, 1.7.0 locks up.
I will try to debug further and find exactly where it broke.
Comment by Peter (gpsoft) - Monday, 27 November 2017, 23:19 GMT
The problem was introduced in revision 27775 (it does not happen in revision 27774).
That revision added the AcquireBlitterLock() and ReleaseBlitterLock() interface, which appears to lock (enter a critical section on) the _draw_mutex object.
At the same time PaintWindowThread is also inside the _draw_mutex critical section, so the two threads deadlock. There may be a further interaction with _modal_progress_paint_mutex and _modal_progress_work_mutex, but I am not sure about that; possibly only _draw_mutex is deadlocked.
I included the stack trace taken in the deadlocked state. It was taken while running the source code of version 1.7.0 (revision 27840).
The author of the modification was frosch; the commit message of revision 27775 is:
-Fix [ FS#6510 ]: Insufficient thread synchronisation when switching blitters. (JGR)
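
To illustrate the kind of lock-up described above, here is a minimal standalone C++ sketch. It is not OpenTTD code and may not match the exact sequence in the game: draw_mutex, WorkerNeedsDrawMutex(), ScanWhileHoldingDrawMutex() and ScanAfterReleasingDrawMutex() are hypothetical names made up for this example. It assumes one thread holds the drawing mutex while waiting for a worker that needs the same mutex. A std::timed_mutex with a timeout is used so the sketch reports the failure instead of actually hanging.

```cpp
#include <chrono>
#include <future>
#include <mutex>

// Hypothetical stand-in for _draw_mutex. A timed mutex lets the sketch
// give up after a timeout instead of blocking forever.
static std::timed_mutex draw_mutex;

// Stand-in for a worker thread that needs the drawing lock (the role
// AcquireBlitterLock() plays in the report). Returns true if the lock
// was obtained within the timeout.
bool WorkerNeedsDrawMutex(std::chrono::milliseconds timeout)
{
	std::unique_lock<std::timed_mutex> lock(draw_mutex, timeout);
	return lock.owns_lock();
}

// Assumed problem pattern: the caller keeps draw_mutex locked while it
// waits for the worker, so the worker can never get the lock.
// With a plain blocking lock in the worker this would wait forever.
bool ScanWhileHoldingDrawMutex()
{
	std::lock_guard<std::timed_mutex> hold(draw_mutex);
	std::future<bool> worker = std::async(std::launch::async,
			WorkerNeedsDrawMutex, std::chrono::milliseconds(200));
	return worker.get(); // waits for the worker while still holding the lock
}

// Same work, but the lock is released before waiting for the worker.
bool ScanAfterReleasingDrawMutex()
{
	{
		std::lock_guard<std::timed_mutex> hold(draw_mutex);
		// ... drawing work that needs the lock would go here ...
	}
	std::future<bool> worker = std::async(std::launch::async,
			WorkerNeedsDrawMutex, std::chrono::milliseconds(200));
	return worker.get();
}
```

In this sketch ScanWhileHoldingDrawMutex() reports that the worker could not take the lock, while ScanAfterReleasingDrawMutex() lets it through. A thread blocked this way uses no CPU, which would be consistent with the near-zero CPU usage reported earlier.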
Comment by Peter (gpsoft) - Tuesday, 28 November 2017, 00:07 GMT
It seems the difference in the other cases is that AcquireBlitterLock() is called from the main thread.
The patch seems to fix this problem (although there might be another reason to run that particular code from a new thread named "ottd:newgrf-scan"). With this patch I run it from the main thread and it no longer locks up.
Comment by frosch (frosch) - Sunday, 11 March 2018, 12:34 GMT
This disables the 'Cancel scanning' button, doesn't it?