I need to process a sequence of historical tick data with millisecond timestamps. I need to be able to keep only the opening ticks of given timespans (hour, minute, etc.). The sequence may have gaps longer than the span, in which case the first tick after the gap must be picked as the opening one; otherwise the opening tick is the first one at or after the calendar start of the corresponding timespan.
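To make the rule concrete, here is a minimal sketch with made-up timestamps. It assumes timestamps are int64 milliseconds and an hour is 3,600,000 ms; the names hourMs, sampleTicks and hourIds are only for illustration and are not part of my actual code.

```fsharp
// Assumption for illustration: timestamps are int64 milliseconds.
let hourMs = 3600000L  // one hour in milliseconds

// Made-up ticks: two in hour 1, two in hour 2, then a gap until hour 5.
let sampleTicks = [ 3600005L; 3600900L; 7200001L; 7200500L; 18000300L ]

// Integer division assigns each tick the id of the hour it falls in.
let hourIds = sampleTicks |> List.map (fun t -> t / hourMs)
// hourIds = [1L; 1L; 2L; 2L; 5L]
// A tick is an opening tick when its id differs from the previous tick's id,
// so the expected openers here are 3600005L, 7200001L and 18000300L.
```

The last tick is picked even though hours 3 and 4 have no ticks at all, which is the gap case described above.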
The first thing that comes to mind is the following stateful filtering function, opensTimespan: Timespan -> (Timestamp -> bool), which captures the timespan id of each gap-opening or interval-opening tick in a closure so that it persists between invocations:
    let opensTimespan (interval: Timespan) =
        let lastTakenId = ref -1L // Timestamps are positive
        fun (tickAt: Timestamp) ->
            let tickId = tickAt / interval
            if tickId <> !lastTakenId then
                lastTakenId := tickId
                true
            else
                false
and can be applied like this:
    let hourlyTicks =
        readTicks @"EURUSD-history.zip" "EURUSD-2012-04.csv"
        |> Seq.filter (opensTimespan HOUR)
        |> Seq.toList
This works fine, but opensTimespan having a side effect is definitely not idiomatic.
One alternative uses the fact that deciding whether a tick is an opening one requires only a pair of timestamps, its own and the previous tick's. This leads to the following stateless filtering function, opensTimespanF: Timespan -> Timestamp * Timestamp -> bool:
    let opensTimespanF interval (ticksPair: Timestamp * Timestamp) =
        fst ticksPair / interval <> snd ticksPair / interval
that can be applied as:
    let hourlyTicks =
        seq {
            yield 0L
            yield! readTicks @"EURUSD-history.zip" "EURUSD-2012-04.csv"
        }
        |> Seq.pairwise
        |> Seq.filter (opensTimespanF HOUR)
        |> Seq.map snd
        |> Seq.toList
This approach, being purely functional, produces equivalent results with only a slight (~11%) performance penalty.
What other ways of approaching this task in a purely functional manner might I be missing?
Thank you.