Flow Fields

Flow fields are a technique applicable when many units need to pathfind to the same location. It is an extension of traditional pathfinding algorithms, but instead of only calculating one path per unit it calculates a path for every cell on the grid, reusing information from other known paths. This has a high immediate cost, but once the flow field has been built it can be reused indefinitely with no additional cost: Units that pass over the same tile reuse data, and if a unit is pushed off the path it had planned, a new one is already available.

Flow fields have two components: an integration field that records the distance to target, and a gradient field that informs units where to move next.

The integration field is calculated first using an existing pathfinding technique, which is typically breadth-first search. Data reuse is trivial since each node record has total distance travelled and how we got here. However, instead of stopping once a path from A to B is found, the pathfinder runs until it exhausts the grid, generating a node record for every connected cell without running the backtracking step. The integration field is then set to each cell’s distance to the target.

Note: It is technically more accurate to say “cost” here, but “distance” reads much more intuitively. For most purposes, these are the same.

The gradient field can be calculated in two ways. The “correct” way is to take the 2D derivative of the integration field, similar to an edge detection kernel. However, this may lead to an edge case where a unit may move left or right to get closer to its target, but the forces even out and cause it to run into a wall. A more practical and performant solution is to reuse data from node records and ignore the integration field entirely. This is one case where race conditions are surprisingly useful.


I implemented flow fields for Lanternbound, a tower defense with free building on a grid. This is a good use case since it has enough units that it can’t afford to pathfind for each of them, and the target is stationary (for the most part).

In practice, units may want to consider more than one target. One approach is to create separate parallel grids for each, and have steering blend them every frame. However, if all units use the same target prioritization, this can be precalculated on one grid: instead of starting with one entry in the priority queue, add more.

Either way when handling multiple targets, one must be careful with prioritization criteria, as it can generate entirely different types of conic sections. If targets are prioritized by proportion (cost to A is twice as bad as cost to B), the area of influence for the lower-priority target will be a circle, and units approaching from far away may path around A to get to B. However, if targets are prioritized by a fixed offset, the area of influence will be a parabola.

n*dist(A,P) <= dist(B,P)
For n 0.5 to 1
Rendered using Desmos
n+dist(A,P) <= dist(B,P)
For n -2 to 0
Rendered using Desmos

This implementation is also interruptible, performing a limited adaptive amount of work in the time between frames that would otherwise send the main thread to sleep. However, this introduces several issues. First, cells also must carry information about whether they are dirty and/or the ID of the last update cycle. Second, new updates will either block an update that is already running, or will be blocked until that update is complete, at which point the update may be out of date, especially if it is a moving target. This can be somewhat remedied by increasing the work per frame, but this may just hurt performance for less-powerful hardware.

Further improvements

Interruptible pathfinding does not work well with frequent updates or moving targets. However, this might be a flaw in how my implementation prioritizes new updates, or might just be that the maximum allowed steps per frame is set too low. Also, as can be seen in the video, the dirty flag often just doesn’t work correctly, and cells that should be recalculated aren’t.

Since this algorithm only needs input from adjacent cells it is easy to parallelize, and it may be possible to implement this on the GPU through a compute shader or a pixel shader. This will likely speed up processing for larger graphs, as it decreases the number of cycles needed to solve, but smaller graphs may instead be bottlenecked by transferring the buffer or texture between GPU and CPU.