ICFP 2023
Mon 4 - Sat 9 September 2023 Seattle, Washington, United States
Mon 4 Sep 2023 16:00 - 16:30 at Grand Crescent - HIW: Session 3 Chair(s): Iavor Diatchki

Call-by-name evaluation and lazy evaluation both evaluate values only when demanded, but the defining feature of lazy evaluation is that it guarantees that values will only be computed once. This is achieved by representing the suspended computation as a thunk in memory, and then updating that representation with the result once it has been computed. Fundamental though it may be, in some cases call-by-name is preferable:

  • Some data in Haskell programs is really control information, with each piece of data containing some kind of instruction of what to do next, as well as a link to the next instruction; an important example is streaming abstractions such as conduits, pipes, etc. In this case, lazy evaluation results in a long linked list from each instruction to the next; if for whatever reason something is retaining the first instruction in this list (due to accidental sharing), garbage collection cannot clean up the list as it is being generated and executed, and we end up with a memory leak. The solution is to prevent the accidental sharing, but doing so can be fiendishly difficult.

  • Accidental sharing is however not our only foe. If for whatever reason the thunk for one of these instructions is allocated early but evaluated much later, it might end up being moved to the garbage collector’s old generation. Evaluation of this instruction will now result in a linked list of instructions that can be garbage collected, but since we have a link from the old generation to the new (where most instructions will be allocated), doing so requires a major GC; in large applications, this may be very expensive. Since this data is always short-lived, one might hope that a minor GC should always suffice.

Without thunk update, we can sever the ties from one instruction to the next and the problem disappears: the evaluation of an instruction would reveal a link to the next instruction, but if we don’t update the original thunk, we never create the linked list. In this talk, we discuss how the dupIO package (a revamped version of the old ghc-dup package) can be used to address these problems, albeit in a low-level manner; ideally this problem would be addressed in a much more fundamental way.

Mon 4 Sep

Displayed time zone: Pacific Time (US & Canada) change