For a loop in a list
You're right, writing for i in [0 .. 10] do something
creates a list and has significant overhead. Although you can also omit the square brackets, in which case it just creates a lazy sequence (and it turns out the compiler even optimizes that case). I usually prefer to write in 0 .. 100 do
it because it looks the same as code that iterates over a sequence.
Using the function #time
for F # interactive, do a simple analysis:
for i in [ 0 .. 10000000 ] do // 3194ms (yikes!)
last <- i
for i in 0 .. 10000000 do // 3ms
last <- i
for i = 0 to 10000000 do // 3ms
last <- i
for i in seq { 0 .. 10000000 } do // 709ms (smaller yikes!)
last <- i
So it turns out that the compiler is actually optimizing in 0 .. 10000000 do
in the same way as the loop 0 to 10000000 do
. You can force it to create a lazy sequence explicitly (the latter case), which is faster than a list, but still very slow.
source to share
Giving a slightly different answer, but hopefully interesting for some
You are correct that the F # compiler cannot apply the fast loop optimization in this case. The good news is the F # compiler is open source and we can improve its behavior.
So here's a freebie from me:
fast-for-loop optimization happens in tastops.fs . It's pretty primitive at the moment, a great opportunity for us to improve.
// Detect the compiled or optimized form of a 'for <elemVar> in <startExpr> .. <finishExpr> do <bodyExpr>' expression over integers
// Detect the compiled or optimized form of a 'for <elemVar> in <startExpr> .. <step> .. <finishExpr> do <bodyExpr>' expression over integers when step is positive
let (|CompiledInt32ForEachExprWithKnownStep|_|) g expr =
match expr with
| Let (_enumerableVar, RangeInt32Step g (startExpr, step, finishExpr), _,
Let (_enumeratorVar, _getEnumExpr, spBind,
TryFinally (WhileLoopForCompiledForEachExpr (_guardExpr, Let (elemVar,_currentExpr,_,bodyExpr), m), _cleanupExpr))) ->
let spForLoop = match spBind with SequencePointAtBinding(spStart) -> SequencePointAtForLoop(spStart) | _ -> NoSequencePointAtForLoop
Some(spForLoop,elemVar,startExpr,step,finishExpr,bodyExpr,m)
| _ ->
None
let DetectFastIntegerForLoops g expr =
match expr with
| CompiledInt32ForEachExprWithKnownStep g (spForLoop,elemVar,startExpr,step,finishExpr,bodyExpr,m)
// fast for loops only allow steps 1 and -1 steps at the moment
when step = 1 || step = -1 ->
mkFastForLoop g (spForLoop,m,elemVar,startExpr,(step = 1),finishExpr,bodyExpr)
| _ -> expr
The problem is that RangeInt32Step
it only detects patterns like 0..10
and 0..1..10
. He misses, for example,[0..10]
Let's introduce another active template SeqRangeInt32Step
that matches these expressions:
let (|SeqRangeInt32Step|_|) g expr =
match expr with
// detect '[n .. m]'
| Expr.App(Expr.Val(toList,_,_),_,[TType_var _],
[Expr.App(Expr.Val(seq,_,_),_,[TType_var _],
[Expr.Op(TOp.Coerce, [TType_app (seqT, [TType_var _]); TType_var _],
[RangeInt32Step g (startExpr, step, finishExpr)], _)],_)],_)
when
valRefEq g toList (ValRefForIntrinsic g.seq_to_list_info) &&
valRefEq g seq g.seq_vref &&
tyconRefEq g seqT g.seq_tcr ->
Some(startExpr, step, finishExpr)
| _ -> None
How do you know if this is what you need to match the pattern? The approach I often take is that I make a simple F # program with the correct properties and put a compile-time breakpoint to test the expression. From this I create a template to match:
Let's put the two templates together:
let (|ExtractInt32Range|_|) g expr =
match expr with
| RangeInt32Step g range -> Some range
| SeqRangeInt32Step g range -> Some range
| _ -> None
CompiledInt32ForEachExprWithKnownStep
updated to use ExtractInt32Range
overRangeInt32Step
The complete solution will look something like this:
let (|SeqRangeInt32Step|_|) g expr =
match expr with
// detect '[n .. m]'
| Expr.App(Expr.Val(toList,_,_),_,[TType_var _],
[Expr.App(Expr.Val(seq,_,_),_,[TType_var _],
[Expr.Op(TOp.Coerce, [TType_app (seqT, [TType_var _]); TType_var _],
[RangeInt32Step g (startExpr, step, finishExpr)], _)],_)],_)
when
valRefEq g toList (ValRefForIntrinsic g.seq_to_list_info) &&
valRefEq g seq g.seq_vref &&
tyconRefEq g seqT g.seq_tcr ->
Some(startExpr, step, finishExpr)
| _ -> None
let (|ExtractInt32Range|_|) g expr =
match expr with
| RangeInt32Step g range -> Some range
| SeqRangeInt32Step g range -> Some range
| _ -> None
// Detect the compiled or optimized form of a 'for <elemVar> in <startExpr> .. <finishExpr> do <bodyExpr>' expression over integers
// Detect the compiled or optimized form of a 'for <elemVar> in <startExpr> .. <step> .. <finishExpr> do <bodyExpr>' expression over integers when step is positive
let (|CompiledInt32ForEachExprWithKnownStep|_|) g expr =
match expr with
| Let (_enumerableVar, ExtractInt32Range g (startExpr, step, finishExpr), _,
Let (_enumeratorVar, _getEnumExpr, spBind,
TryFinally (WhileLoopForCompiledForEachExpr (_guardExpr, Let (elemVar,_currentExpr,_,bodyExpr), m), _cleanupExpr))) ->
let spForLoop = match spBind with SequencePointAtBinding(spStart) -> SequencePointAtForLoop(spStart) | _ -> NoSequencePointAtForLoop
Some(spForLoop,elemVar,startExpr,step,finishExpr,bodyExpr,m)
| _ ->
None
Using a simple test program
let print v =
printfn "%A" v
[<EntryPoint>]
let main argv =
for x in [0..10] do
print x
0
Before optimization, the relevant C # code would look something like this (IL code is better to check, but can be a little difficult to understand if not used):
// Test
[EntryPoint]
public static int main(string[] argv)
{
FSharpList<int> fSharpList = SeqModule.ToList<int>(Operators.CreateSequence<int>(Operators.OperatorIntrinsics.RangeInt32(0, 1, 10)));
IEnumerator<int> enumerator = ((IEnumerable<int>)fSharpList).GetEnumerator();
try
{
while (enumerator.MoveNext())
{
Test.print<int>(enumerator.Current);
}
}
finally
{
IDisposable disposable = enumerator as IDisposable;
if (disposable != null)
{
disposable.Dispose();
}
}
return 0;
}
F # creates a list and then uses an enumerator to iterate over it. Unsurprisingly, it is quite slow compared to the classic loop.
After applying the optimization, we get this code:
// Test
[EntryPoint]
public static int main(string[] argv)
{
for (int i = 0; i < 11; i++)
{
Test.print<int>(i);
}
return 0;
}
Significant improvement.
So, steal this code, submit a PR to https://github.com/Microsoft/visualfsharp/ and bask in glory. Of course you need to add unit tests and emit IL code tests which can be somewhat tricky, to find the level you want, check commit for inspiration
PS. Probably should also support[|0..10|]
seq {0..10}
PS. It is for v in 0L..10L do print v
also for v in 0..2..10 do print v
as well as also inefficiently implemented in F #.
source to share
The previous form requires a special language construct (for var from ... to ... by), this is the way ancient programming languages followed:
- 'do' in Fortran
- for var: = expr to expr in Pascal
- and etc.
The last form (for var in some ways) is more pronounced. It works on simple lists, but also generators (like in python), etc. You may not need to create a complete list before running the list. This allows you to write loops on potentially infinite lists.
In any case, a decent compiler / interpreter should recognize the rather frequent special case [expr1..expr2] and avoid evaluating and storing an intermediate list.
source to share