Severing ties: the need for non-updateable thunks
Call-by-name evaluation and lazy evaluation both evaluate values only when demanded, but the defining feature of lazy evaluation is that it guarantees that values will only be computed once. This is achieved by representing the suspended computation as a thunk in memory, and then updating that representation with the result once it has been computed. Fundamental though it may be, in some cases call-by-name is preferable:
-
Some data in Haskell programs is really control information, with each piece of data containing some kind of instruction of what to do next, as well as a link to the next instruction; an important example is streaming abstractions such as conduits, pipes, etc. In this case, lazy evaluation results in a long linked list from each instruction to the next; if for whatever reason something is retaining the first instruction in this list (due to accidental sharing), garbage collection cannot clean up the list as it is being generated and executed, and we end up with a memory leak. The solution is to prevent the accidental sharing, but doing so can be fiendishly difficult.
-
Accidental sharing is however not our only foe. If for whatever reason the thunk for one of these instructions is allocated early but evaluated much later, it might end up being moved to the garbage collector’s old generation. Evaluation of this instruction will now result in a linked list of instructions that can be garbage collected, but since we have a link from the old generation to the new (where most instructions will be allocated), doing so requires a major GC; in large applications, this may be very expensive. Since this data is always short-lived, one might hope that a minor GC should always suffice.
Without thunk update, we can sever the ties from one instruction to the next and the problem disappears: the evaluation of an instruction would reveal a link to the next instruction, but if we don’t update the original thunk, we never create the linked list. In this talk, we discuss how the dupIO
package (a revamped version of the old ghc-dup
package) can be used to address these problems, albeit in a low-level manner; ideally this problem would be addressed in a much more fundamental way.
Mon 4 SepDisplayed time zone: Pacific Time (US & Canada) change
16:00 - 17:30 | |||
16:00 30mTalk | Severing ties: the need for non-updateable thunks HIW Edsko de Vries Well-Typed LLP | ||
16:30 30mTalk | ghc-specter: a GHC plugin that inspects the GHC state on live HIW Ian-Woo Kim Mercury Technologies, Inc | ||
17:00 15mTalk | Execution domains in GHC/Haskell (Lightning Talk) HIW Ben Gamari Well-Typed LLP | ||
17:15 15mTalk | Kudzu (Lightning Talk) HIW |