FS#2038 - Corrupt savegames after autoreplace fail

Attached to Project: OpenTTD
Opened by Technofrood (Technofrood) - Sunday, 25 May 2008, 13:49 GMT
Last edited by frosch (frosch) - Saturday, 16 August 2008, 14:18 GMT
Type Bug
Category Vehicles → Autoreplace
Status Closed
Assigned To No-one
Operating System Windows
Severity Critical
Priority Urgent
Reported Version trunk
Due in Version 0.7.0
Due Date Undecided
Percent Complete 100%
Votes 0
Private No


When building r13232 (also happens on r13027) in Visual Studio 2005, a failed auto replace can result in a corrupt save game.

As far as I can tell if any train which has cargo in any of its wagons fails an auto replace any save games created after the failed auto replace will fail to load with a too many orders error.

This does not happen if you build with BuildOTTD.

I have attached a save file which uses UKRS and PB's Industries (Happens without industries as well), to see the problem send the train in the station to the depot, then when it is in the depot or after it leaves the depot save the game, you should be unable to load the save game.
This task depends upon
 FS#2212 - autoreplace rewrite 

Closed by  frosch (frosch)
Saturday, 16 August 2008, 14:18 GMT
Reason for closing:  Fixed
Additional comments about closing:  in r14083
Comment by TheThiefMaster (TheThiefMaster) - Friday, 30 May 2008, 09:36 GMT
Following is EXACTLY what happens:
When autoreplace begins, it backs up all train info, to restore if the replace fails.
Trains are now C++ entities, and it's backing them up with memcpy(). This is bad practice for a start.
Trains contain a cargo packets list. This class contains a pointer to the actual list data.
To avoid keeping a pointer to list data that could get played with or destroyed during the autoreplace, the list is swapped with an empty list (from a local instance of list) before copying the train.
In GCC/BuildOTTD, an empty list contains no dangerous data, it's all NULL etc.
In VC++, and empty list has a pointer to a "head/tail" list item. Accessible in the debugger as emptylist._myHead
At the end of that loop, the local (empty) list is destroyed. In VC++, this frees the "_myHead". The list in the backed-up train now points to freed memory.
On the next loop, the list is recreated. This new list ends up with the SAME "_myHead" pointer as on the last loop.
This second backed-up train's cargo list now points to the same data.
When the replace fails, the backup trains are memcpy'd over the originals.
The original cargo lists aren't properly destroyed, and leak their data.
The backup ("empty") lists all point to the same data.
All the cargo is added back to the train from a seperate backup.
It tries to split it among the carriages, but all the cargo ends up in all the lists because they all have the same head pointer.
However, the list's "size" is set to how big the list for that carriage SHOULD be. i.e. the size of the list doesn't match the amount of data in the list.
When saving in this state, the "size" of the list is saved, followed by ALL the data in the list (which doesn't match the size), which means the savegame can't be loaded.
And that's if it doesn't crash first from the list head being in freed memory.
Comment by Michael Lutz (michi_cc) - Monday, 07 July 2008, 22:54 GMT
The attached patch fixes the problem at least for Visual Studio 2008. More testing, especially with gcc is needed.

It replaces the empty list hack with a proper solution by using in-place new.
It doesn't fix the underlying problem that the current vehicle backup code is
rather ugly and very much prone to break again, memcpy'ing C++ objects being the
most prominent, but solving this would be a lot of work.
Comment by TheThiefMaster (TheThiefMaster) - Tuesday, 08 July 2008, 09:38 GMT
I'm pretty sure that will cause a memory leak on every autoreplace of the size of the "empty list" element.
When restoring the backup, it will memcopy this over the vehicle's existing list (should be empty), losing the pointer to the empty list element from the old vehicle. I'm not sure about this one as I don't have the code to hand.
When not restoring the backup, the memory from the backup is freed without running the destructor of the vehicle, or, more importantly, this list, so the pointer to the empty list element gets lost again.
Comment by Michael Lutz (michi_cc) - Tuesday, 08 July 2008, 17:48 GMT
Restoring the backup is okay as the old vehicle is properly destructed beforehand. But you're right
that not restoring the backup will lead to a memory leak, thanks for spotting that.

I've prepared an updated patch that should not leak memory.
When memcpy'ing the backup vehicle, just don't to anything about
the cargo packet list. The original cargo list will probably be
destroyed later, but actually we don't care about that, as the
cargo packets are saved separately anyway. Instead, simply
construct a new list in-place on restoring the backup.