4 回答
Referential transparency is about equality statements and variable references. When you say x = y and your language is referentially transparent, then you can replace every occurence of x by y (modulo scope).
If you haven't specified x = ()
, then you can't replace x
by ()
safely, such as in your case. This is because you are wrong about the inhabitants of ()
, because in Haskell there are two of them: One is the only constructor of ()
, namely ()
. The other is the value that is never calculated. You may call it bottom or undefined:
x :: ()
x = x
You certainly can't replace any occurrence of x
by ()
here, because that would chance semantics. The existence of bottom in the language semantics allows some awkward edge cases, especially when you have a seq
combinator, where you can even prove every monad wrong. That's why for many formal discussions we disregard the existence of bottom.
However, referential transparency does not suffer from that. Haskell is still referentially transparent and as such a purely functional language.
Your definition of referential transparency is incorrect. Referential transparency does not mean you can replace x :: ()
with () :: ()
and everything stays the same; it means you can replace all occurrences of a variable with its definition and everything stays the same. seq
and rseq
are not in conflict with referential transparency if you use this definition.
seq
is a red herring here.
unitseq :: () -> a -> a
unitseq x y = case x of () -> y
This has the same semantics as seq
on unit, and requires no magic. In fact, seq
only has something "magic" when working on things you can't pattern match on -- i.e., functions.
Replacing undefined :: ()
with ()
has the same ill effects with unitseq
as with the more magical seq
.
In general, we think of values not only as "what" they are, but how defined they are. From this perspective, it should be clear why we can't go implementing definedness-changing transformations willy-nilly.
Thanks all for the inspiring answers. After thinking about it more, I come to the following views:
View 1: Considering terminating expressions only
If we restrict ourselves to a normalizing system, then seq
(or rseq
etc.) has no influence on the result of a program. It can change the amount of memory the program uses and increase the CPU time (by evaluating expressions we won't actually need), but the result is the same. So in this case, the rule x :: () --> () :: ()
is admissible - nothing is lost, CPU time and memory can be saved.
View 2: Curry-Howard isomorphism
Since Haskell is Turing-complete and so any type is inhabited by an infinite loop, the C-H corresponding logic system is inconsistent (as pointed out by Vitus) - anything can be proved. So actually any rule is admissible in such a system. This is probably the biggest problem in my original thinking.
View 3: Expansion rules
Natural deduction rules give us reductions and expansions for different type operators. For example for ->
it's β-reduction and η-expansion, for (,)
it is
fst (x, y)
--reduce-->x
(and similarly forsnd
)
x : (a,b)
--expand-->(fst x, snd x)
etc. With the presence of bottom, the reducing rules are still admissible (I believe), but the expansion rules not! In particular, for ->
:
rseq (undefined :: Int -> Int) = undefined
but it's η-expansion
rseq (\x -> (undefined :: Int -> Int) x) = ()
or for (,)
:
case undefined of (a,b) -> ()
but
case (fst undefined, snd undefined) of (a,b) -> ()
etc. So in the same way, the expansion rule for ()
is not admissible.
View 4: Lazy pattern matching
Allowing x :: ()
to expand to () :: ()
could be viewed as a special case of forcing all pattern matching on single-constructor data types to be lazy. I've never seen lazy pattern matching to be used on a multi-constructor data type, so I believe this change would allow us to get rid of the ~
lazy pattern matching symbol completely. However, it would change the semantic of the language to be more lazy (see also Daniel Fischer's comment to the question). And moreover (as described in the question) break Strategy
.