In this post, I will show simple loop, its alternative implementation with imaginary JIT compiler that “optimizes” its internals. I will also show speed benchmark.

Loops are core building blocks in every program. In Python, they iterate over sequence. However in C, they keep looping while some condition is meet.

Take for example:

// loop1.c
// gcc -c loop1.c && gcc -o loop1 loop1.o && time ./loop1
// clang -c loop1.c && clang -o loop1 loop1.o && time ./loop1
#include <stdio.h>
#include <stdlib.h>

int f() {
    int i;
    int r = 0;
    int l = 100000000;

    for (i = 0; i < l; i++) {
        r += i;
    }

    return r;
}

int main(int argc, char ** argv) {
    int x = f();
    printf("%d\n", x);
    return 0;
}

Tracing JIT would recognize loop and try to optimize it. If code above is in C, there is really no need to do such a thing. However, if your code is in Python, JIT will dramatically improve speed over loops which “captured” variables stay with same type.

Anyway, we will keep C example, and try to work on it. Code above would be JIT compiled to look like following:

// loop2.c
// gcc -c loop2.c && gcc -o loop2 loop2.o && time ./loop2
// clang -c loop2.c && clang -o loop2 loop2.o && time ./loop2
#include <stdio.h>
#include <stdlib.h>

void _loop(int i, int l, int * r) {
    for (; i < l; i++) {
        *r = *r + i;
    }
}

int f() {
    int i;
    int r = 0;
    int l = 100000000;
    int _count = 0;

    for (i = 0; i < l; i++) {
        r += i;
        _count++;

        if (_count > 10000) {
            _loop(i, l, &r);
            break;
        }
    }

    return r;
}

int main(int argc, char ** argv) {
    int x = f();
    printf("%d\n", x);
    return 0;
}

Speed benchmark:

compiler loop1 loop2
gcc 0.293s 0.261s
clang 0.284s 0.284s