As a part of two semesters research of undergraduate research at Taylor University, I development a multithreaded, tile-based software rasterizer. The pipeline rasterizes and shades four fragments in parallel using SSE instructions, and utilizes and extensive custom written SIMD optimized math library for all vertex transformations. Features supported include custom vertex and fragment shaders written purely in C++ (thanks to some operator overloading tricks), perspective correct texture mapping, clipping and backface culling, the flexibility to select different render targets, and early Z rejection. In addition, rendering order is preserved. On a Hyper-Threaded Quad-Core Intel I7 mobile processor, the rasterizer performs a full four times faster with eight threads than with one. It utilizes SDL for frame buffer, thread, and bitmap management. In the following link, you can find the code repository and research paper.
Multithreaded Software Rasterizer