I am trying to share as much code as possible between emulators and a CLaSH implementations for CPUs. As part of this, I am writing instruction fetching & decoding as something along the lines of
fetchInstr :: (Monad m) => m Word8 -> m Instr
This is trivial to run in the emulator using a monad that has a program counter in its state and direct access to memory. For the hardware version, I make a fixed-size buffer (since the instruction byte-length is bounded) and in every cycle, short-circuit the fetching if there is not enough data in the buffer yet.
data Failure
= Underrun
| Overrun
deriving Show
data Buffer n dat = Buffer
{ bufferContents :: Vec n dat
, bufferNext :: Index (1 + n)
}
deriving (Show, Generic, Undefined)
instance (KnownNat n, Default dat) => Default (Buffer n dat) where
def = Buffer (pure def) 0
remember :: (KnownNat n) => Buffer n dat -> dat -> Buffer n dat
remember Buffer{..} x = Buffer
{ bufferContents = replace bufferNext x bufferContents
, bufferNext = bufferNext + 1
}
newtype FetchM n dat m a = FetchM{ unFetchM :: ReaderT (Buffer n dat) (StateT (Index (1 + n)) (ExceptT Failure m)) a }
deriving newtype (Functor, Applicative, Monad)
runFetchM :: (Monad m, KnownNat n) => Buffer n dat -> FetchM n dat m a -> m (Either Failure a)
runFetchM buf act = runExceptT $ evalStateT (runReaderT (unFetchM act) buf) 0
fetch :: (Monad m, KnownNat n) => FetchM n dat m dat
fetch = do
Buffer{..} <- FetchM ask
idx <- FetchM get
when (idx == maxBound) overrun
when (idx >= bufferNext) underrun
FetchM $ modify (+ 1)
return $ bufferContents !! idx
where
overrun = FetchM . lift . lift . throwE $ Overrun
underrun = FetchM . lift . lift . throwE $ Underrun
The idea is that this would be used by storing a Buffer n dat
in the CPU's state during instruction fetching, and remember
ing values coming in from memory while there is a buffer underrun:
case cpuState of
Fetching buf -> do
buf' <- remember buf <$> do
modify $ \s -> s{ pc = succ pc }
return cpuInMem
instr_ <- runFetchM buf' $ fetchInstr fetch
instr <- case instr_ of
Left Underrun -> goto (Fetching buf') >> abort
Left Overrun -> errorX "Overrun"
Right instr -> return instr
goto $ Fetching def
exec instr
This works just fine in the CLaSH simulator.
The problem is, if I start using it this way, it needs a lot bigger inlining limit for CLaSH to be able to synthesize it. For example, on a CHIP-8 implementation, this commit starts using the above-described FetchM
. Before this change, an inlining depth of just 100 is enough to get through the CLaSH synthesizer; after this change, 300 is not enough and 1000 causes CLaSH to just churn until it runs out of memory.
What is so evil about FetchM
that the inliner chokes on it?