Get it? ’Cause this one is about multi-threading a raytracer. On second thought, I hadn’t said that yet, so no one could get it. But yes, we’re back to my terrible jokes; they are what I live for.
This blog is less about the raytracer and more about the multi-threading that I used to achieve the performance increase, and the mildly upsetting news that I got from it. What better way to learn about multi-threading and optimization than a raytracer that takes approximately 80 seconds to render on my “AMD FX(tm)-8350 Eight-Core Processor” without any threading, and 18 seconds with my basic threading (roughly a 4.4x speedup)? So what did I do, you ask? Aside from attempting to compile a bunch of times because I had no idea why things were broken, I set up my threading to take full advantage of however many cores and threads I had. This was actually remarkably easy, as C++ has a handy “#include <thread>” which lets you do all of your multi-threading without an external library or anything like that. By including ‘thread’ I had access to everything I needed to multi-thread. One such function, std::thread::hardware_concurrency(), handily returns an unsigned integer with the number of threads available.
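A minimal sketch of that thread-count query. The helper name `workerThreadCount` is made up for illustration and is not from the raytracer itself. The standard describes the returned value as a hint and allows it to be 0 when the count cannot be determined, so the sketch falls back to a single thread in that case.

```cpp
#include <thread>

// Hypothetical helper: decides how many worker threads to launch.
// std::thread::hardware_concurrency() is only a hint and may return 0
// when the value cannot be determined, so fall back to one thread.
unsigned workerThreadCount() {
    const unsigned hint = std::thread::hardware_concurrency();
    return hint == 0 ? 1u : hint;
}
```

On a machine with 2 cores and 4 hardware threads this would typically report 4, since the count is of hardware threads rather than physical cores.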
From there I needed to loop, creating one thread per iteration and having each one begin rendering its own section of the image, with the number of sections based on how many threads are available. This was also fairly easy, as creating a thread is a surprisingly simple process that looks very similar to calling a function.
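One way the "section per thread" idea could look, as a sketch under assumptions: the image is split into horizontal bands of rows, and `renderRows`, `RowBand`, `splitRows` and `renderBandOnThread` are hypothetical names standing in for whatever the real raytracer uses. The last function shows how close constructing a `std::thread` is to an ordinary function call.

```cpp
#include <cassert>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical stand-in for the raytracer: fills rows [rowBegin, rowEnd)
// of the image. A real renderer would trace a ray per pixel here.
void renderRows(std::vector<int>& pixels, int width, int rowBegin, int rowEnd) {
    for (int y = rowBegin; y < rowEnd; ++y) {
        for (int x = 0; x < width; ++x) {
            pixels[y * width + x] = y;  // placeholder "colour"
        }
    }
}

// Half-open range of rows handed to one thread.
struct RowBand {
    int begin;
    int end;
};

// Splits `height` rows into `threadCount` contiguous bands. The first
// (height % threadCount) bands get one extra row so every row is covered.
std::vector<RowBand> splitRows(int height, unsigned threadCount) {
    assert(threadCount > 0);
    const int count = static_cast<int>(threadCount);
    const int base = height / count;
    const int extra = height % count;
    std::vector<RowBand> bands;
    int begin = 0;
    for (int i = 0; i < count; ++i) {
        const int rows = base + (i < extra ? 1 : 0);
        bands.push_back({begin, begin + rows});
        begin += rows;
    }
    return bands;
}

// Renders a single band on its own thread, then waits for it to finish.
void renderBandOnThread(std::vector<int>& pixels, int width, RowBand band) {
    // Same arguments as calling renderRows directly; std::ref is needed
    // because std::thread copies its arguments by default.
    std::thread worker(renderRows, std::ref(pixels), width, band.begin, band.end);
    worker.join();
}
```

Splitting by rows is only one possible choice. Bands of equal height can take very different amounts of time if one part of the scene is more expensive to trace than another.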
^Again, surprisingly easy^
In this case I stored pointers to the threads I was creating, as simply creating a thread object caused issues once the number of threads depended on the thread count. I needed a list of the threads so I could join them at the end, but without pointers the list would try to make a copy of each thread (std::thread objects cannot be copied, only moved) and do all sorts of funky horribleness. Once I had added them all to the list, I looped over it, joined each thread with another handy function, join(), then cleared the list.
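For comparison, here is a sketch of the same create, store, join, clear pattern without pointers. This is a swapped-in alternative, not the approach described above: it relies on `std::vector::emplace_back` constructing each `std::thread` directly inside the vector, so no copy is ever attempted. `renderBand` and `renderAll` are hypothetical names, and the "rendering" is just a placeholder that marks rows as done.

```cpp
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Hypothetical per-thread job: marks every row in [rowBegin, rowEnd) as done.
void renderBand(std::vector<int>& rowDone, std::size_t rowBegin, std::size_t rowEnd) {
    for (std::size_t y = rowBegin; y < rowEnd; ++y) {
        rowDone[y] = 1;
    }
}

// Launches `threadCount` workers over rowDone.size() rows, then joins them all.
void renderAll(std::vector<int>& rowDone, unsigned threadCount) {
    if (threadCount == 0) {
        threadCount = 1;  // guard against a zero thread-count hint
    }
    const std::size_t height = rowDone.size();
    std::vector<std::thread> workers;
    workers.reserve(threadCount);
    for (unsigned i = 0; i < threadCount; ++i) {
        // Each worker gets a disjoint range of rows, so no two threads
        // write to the same element.
        const std::size_t begin = height * i / threadCount;
        const std::size_t end = height * (i + 1) / threadCount;
        // emplace_back builds the thread in place: std::thread is
        // move-only, so it is never copied into the vector.
        workers.emplace_back(renderBand, std::ref(rowDone), begin, end);
    }
    for (std::thread& worker : workers) {
        worker.join();  // wait for every band to finish
    }
    workers.clear();
}
```

Every thread has to be joined (or detached) before its `std::thread` object is destroyed, otherwise the program terminates, which is why the join loop comes before the list is cleared.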
I almost feel like this is utter blasphemy, given how easy it was to multi-thread, as I had expected multi-threading to be this super hard thing that took a long time to learn. Because a simple implementation like this came so easily, it has become an incredibly encouraging way for me to approach optimization, and something that beckons further research.
As promised, here is the mildly upsetting news I discovered from this. While developing and testing this threading I was using my laptop (2 cores, 4 threads), so when I went to test on my desktop with the “AMD FX(tm)-8350 Eight-Core Processor” I was super excited to see it run much faster than my laptop. The problem came when I ran the raytracer and it could only see 8 threads, which by my laptop’s logic meant 4 cores and a very confused me. As it turns out, at least for a while, AMD counted cores differently: the FX-8350 is built from 4 modules, each holding two integer cores that share a single floating-point unit. So for floating-point-heavy work like raytracing, what I thought was an awesome 8 core that I got a few years back behaves a lot more like a 4 core, and that broke my heart a little bit.