-
Bug
-
Resolution: Unresolved
-
None
-
20w14a
-
None
-
Plausible
-
Chunk loading
-
Low
-
Platform
The problem lies within the method
public void updateLevels() { super.updateLevels(); if (!this.positionsAffected.isEmpty()) { LongIterator longIterator = this.positionsAffected.iterator(); while(longIterator.hasNext()) { long l = longIterator.nextLong(); int i = this.distances.get(l); int j = this.getLevel(l); if (i != j) { ChunkTicketManager.this.levelUpdateListener.updateLevel(new ChunkPos(l), () -> { return this.distances.get(l); }, j, (ix) -> { if (ix >= this.distances.defaultReturnValue()) { this.distances.remove(l); } else { this.distances.put(l, ix); } }); this.updateTicket(l, j, this.isWithinViewDistance(i), this.isWithinViewDistance(j)); } } this.positionsAffected.clear(); } }
It basically aims to go over all chunks that changed the distance to the nearest player, compared to the last execution, and schedule additions and removals of chunk tickets accordingly. The problem is that distances which seems to be used as a reference for the old value does in fact not necessarily contain the values of the last execution, but its values are only set in a callback to ChunkTaskPrioritySystem. Hence the values might be stale, in case the ChunkTaskPrioritySystem does not handle this task fast enough, and hence some changes might not be properly detected. As a result, if we switch the distance to some new value and then quickly back to the old one, before the ChunkTaskPrioritySystem has processed the first switch, then updateLevels() will not see the second switch, as it still reads the original values of distances, and hence will neither schedule a second update to distances nor a new chunk ticket task. Hence, when all tasks have finished, the result can contain wrong values for distances and incorrect chunk tickets.
More precisely, you can do the following steps to simulate some lag in the ChunkTaskPrioritySystem and trigger the issue. This method uses breakpoints as a reliable way of simulating certain conditions that can occur in practice, but might be not so reliably reproducible by accident on some setups. However, the chances for encountering the issue by accident might still be "large" on weaker hardware that runs out of threads to process the ChunkTaskPrioritySystem sufficiently fast.
- Load any world and teleport far away from the spawn chunks (e.g. create a new world and /tp ~2000 ~ ~)
- Spawn some entities that can later be used as indicator of chunks being unloaded wrongly
- Set a breakpoint to stop the ChunkTaskPrioritySystem used in ChunkTicketManager, e.g. by setting a breakpoint in the line if (ix >= this.distances.defaultReturnValue()) {. It is important that this breakpoint is configured to only pause the thread hitting it; we want the server thread to continue throughout the rest of the steps.
- Teleport far away from your location (not near the spawnchunks) (e.g. /tp ~2000 ~ ~)
- Now the breakpoint should be hit by some Server-Worker
- Teleport back precisely to your original chunk (e.g. /tp ~-2000 ~ ~)
- Disable the breakpoint and resume the Server-Worker
- At this point the distances got broken and are at a high value for your original (= current) location and a small value for the location you teleported to, which is exactly the opposite of what it should be. Also, chunk tickets got removed for your original location and chunk tickets were scheduled for the location you teleported to. However, as the addition of chunk tickets sanity checks whether the action is still valid, using the correct distanceFromNearestPlayer instead of the broken distances, these scheduled chunk tickets actually won't be added, but distances is broken nevertheless. Hence you end up with no chunk tickets left (except for the spawn chunks)
- Observe that the entities around you won't move or do anything at all, indicating that the chunks are indeed unloaded. Furthermore, the ServerChunkCache is way smaller than it should be.
- As soon as you move one chunk, the distance tracking will be updated again. Since the broken distances indicate that currently no player is around, this update will now schedule all the chunk tickets again. The ServerChunkCache goes up and the world around you springs back to life.
- Teleport to the other location (/tp ~2000 ~ ~). There distances wrongly indicates that some player was around, but no chunk tickets are actually present by the above remark. Hence, when arriving there, no new tickets will be created, and as a result the chunks won't load, no matter how long you wait or move around. Only after traveling to an unaffected area, chunks will start loading again.
The obvious solution to this problem would be to update the values of distances directly, rather than in a callback. I don't know for sure why it is done this way, but I think the idea is that the chunk level/priority changes atomically from the point of view of the ChunkTaskPrioritySystem by putting the level/priority under its control. This way, all tasks get scheduled with this well-defined priority and the priority of all pending tasks can be updated atomically.
In view of this, the proper solution would be to leave distances as it is, purely for the ChunkTaskPrioritySystem and add another map tracking the previous values (or actually make positionsAffected a map tracking those previous values).
In fact, there is another problem related to this discussion. All invocations to the ChunkTaskPrioritySystem resolve the level/priority lazily, in the context of the ChunkTaskPrioritySystem, except for updateTickets(...) called by the above method, which forwards the eagerly resolved distance. Priorities should always be resolved lazily in the context of ChunkTaskPrioritySystem, as otherwise the priority might be changed concurrently while scheduling a task and then all the automatic priority tracking and rewriting will fail. Since this is properly done for all other invocations, I think this is just an oversight here. So, inside updateTickets(...) the eagerly resolved priority () -> return distance; should be changed to the lazily resolved () -> this.distances.get(pos).
Finally, with these changes, Long2IntMaps.synchronize(...) can be removed from distances as it will now only be accessed in the context of a single ChunkTaskPrioritySystem.
Best,
PhiPro