• My YouTube Channel

    I started my YouTube channel in 2020, during the COVID-19 pandemic. It was the perfect time for me to finally organize my thoughts. After years of thinking about it, I just started recording and publishing videos. My friends and colleagues had asked me for years to record my talks and publish them for others to see. All opinions and advice are my own, so take them with a grain of salt; I am just a regular guy doing interesting things online.

    I cover a broad range of topics in computer science and engineering, including BSD, Linux, data structures, algorithms, machine learning, and programming language design and implementation. Please check out my YouTube channel and subscribe if you like the topics I cover. Feel free to leave comments and suggestions.

    I am also active on Twitter, so you can get in touch with me over there as well.

  • Language Virtual Machine

    Hi there,

    This is the start of a series about language virtual machines. I will first show native examples in C and in a few dynamic languages such as Python, Ruby and Lua, and compare their speeds.

    Once we have a sense of what speeds to expect from each programming language implementation, we will explore approaches to VM design and implementation.

    Our main goal is to show the speed of the simplest possible functional VMs, along with some basic optimizations.

    The VMs will be written in C, following the C99 standard and in some cases C11.

    The benchmarks will be run on the following system (my home desktop computer):

    $ lscpu 
    Architecture:          x86_64
    CPU op-mode(s):        32-bit, 64-bit
    Byte Order:            Little Endian
    CPU(s):                4
    On-line CPU(s) list:   0-3
    Thread(s) per core:    1
    Core(s) per socket:    4
    Socket(s):             1
    NUMA node(s):          1
    Vendor ID:             AuthenticAMD
    CPU family:            18
    Model:                 1
    Model name:            AMD A8-3850 APU with Radeon(tm) HD Graphics
    Stepping:              0
    CPU MHz:               2900.000
    CPU max MHz:           2900.0000
    CPU min MHz:           800.0000
    BogoMIPS:              5792.15
    Virtualization:        AMD-V
    L1d cache:             64K
    L1i cache:             64K
    L2 cache:              1024K
    NUMA node0 CPU(s):     0-3
    
    $ uname -nsrm
    Linux arch 4.1.4-1-ARCH x86_64
    
    $ cat /proc/meminfo | grep MemTotal
    MemTotal:        8170204 kB
  • Loops

    In this post, I will show a simple loop and an alternative implementation that mimics what an imaginary JIT compiler would do when it “optimizes” the loop internals. I will also show a speed benchmark.

    Loops are core building blocks of every program. In Python, a for loop iterates over a sequence. In C, however, a loop keeps running while some condition is met.

    Take for example:

    // loop1.c
    // gcc -c loop1.c && gcc -o loop1 loop1.o && time ./loop1
    // clang -c loop1.c && clang -o loop1 loop1.o && time ./loop1
    #include <stdio.h>
    #include <stdlib.h>
    
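    // Sums 0..l-1. The total exceeds INT_MAX for this l (signed overflow),
    // so the printed value is not meaningful; the loop is here for timing.
    int f() {
        int i;
        int r = 0;
        int l = 100000000;
    
        for (i = 0; i < l; i++) {
            r += i;
        }
    
        return r;
    }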
    
    int main(int argc, char ** argv) {
        int x = f();
        printf("%d\n", x);
        return 0;
    }

    A tracing JIT would recognize the loop and try to optimize it. If the code above is in C, there is really no need to do such a thing. However, if your code is in Python, a JIT can dramatically improve the speed of loops whose “captured” variables keep the same type.

    Anyway, we will keep the C example and work with it. After JIT compilation, the code above would behave like the following:

    // loop2.c
    // gcc -c loop2.c && gcc -o loop2 loop2.o && time ./loop2
    // clang -c loop2.c && clang -o loop2 loop2.o && time ./loop2
    #include <stdio.h>
    #include <stdlib.h>
    
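    // Stand-in for the code a JIT would emit for the rest of the loop:
    // continues from index i and accumulates into *r.
    void _loop(int i, int l, int * r) {
        for (; i < l; i++) {
            *r = *r + i;
        }
    }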
    
    int f() {
        int i;
        int r = 0;
        int l = 100000000;
        int _count = 0;
    
        for (i = 0; i < l; i++) {
            r += i;
            _count++;
    
            if (_count > 10000) {
                // i was already added above, so resume at i + 1
                _loop(i + 1, l, &r);
                break;
            }
        }
    
        return r;
    }
    
    int main(int argc, char ** argv) {
        int x = f();
        printf("%d\n", x);
        return 0;
    }

    Speed benchmark:

    compiler  loop1   loop2
    gcc       0.293s  0.261s
    clang     0.284s  0.284s
  • Welcome!

    I’ve decided to start writing about things that I think about daily. For almost 10 years, I’ve been thinking about successfully implementing a VM for Lua or Python.

    Lua has served more of an educational purpose in my life. I have learned a lot from its simple implementation.

    Python is my language of choice. I like its elegance and simplicity, but I hate its internal ecosystem. It is huge. I wish we could install modules such as subprocess or re using pip instead of having them bundled with Python implementations.

    This blog will be mainly about implementing virtual machines. The practices might not be the best, but the idea is to implement cool stuff so others can learn from it.

    I will write for readers with an intermediate level of knowledge. I expect that you understand compilation, basic VM principles, JIT compilation, etc.