simd example code


Sure, here is a simple C example that uses SIMD (Single Instruction, Multiple Data) instructions through Intel's AVX (Advanced Vector Extensions) intrinsics. The older SSE (Streaming SIMD Extensions) intrinsics follow the same pattern with 128-bit registers. This code multiplies two arrays of floats element-wise using AVX instructions.

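First, make sure both your compiler and your CPU support AVX. Any reasonably recent GCC (14, for example) provides these intrinsics. Save the code as simd_example.c and compile it with gcc -o simd_example simd_example.c -mavx; the -mavx flag lets the compiler emit AVX instructions.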

#include <stdio.h>
#include <immintrin.h>  // AVX intrinsics

#define ARRAY_SIZE 8  // AVX can process 8 floats at once (256 bits / 32 bits per float)

void multiply_arrays(const float *a, const float *b, float *result, int size) {
    int i = 0;

    // Process full blocks of 8 floats with AVX
    for (; i + 8 <= size; i += 8) {
        // Load 8 floats from each array into AVX registers
        __m256 vec_a = _mm256_loadu_ps(&a[i]);
        __m256 vec_b = _mm256_loadu_ps(&b[i]);

        // Multiply the two vectors
        __m256 vec_result = _mm256_mul_ps(vec_a, vec_b);

        // Store the result back to the result array
        _mm256_storeu_ps(&result[i], vec_result);
    }

    // Scalar loop for leftover elements when size is not a multiple of 8
    for (; i < size; i++) {
        result[i] = a[i] * b[i];
    }
}

int main(void) {
    float a[ARRAY_SIZE] = {1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0};
    float b[ARRAY_SIZE] = {1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0};
    float result[ARRAY_SIZE];

    multiply_arrays(a, b, result, ARRAY_SIZE);

    // Print each product, then end the line
    for (int i = 0; i < ARRAY_SIZE; i++) {
        printf("%f ", result[i]);
    }
    printf("\n");

    return 0;
}


  1. Headers: We include <immintrin.h> which provides the AVX intrinsics.
  2. Array Size: We define ARRAY_SIZE as 8 because one 256-bit AVX register holds exactly 8 floats (32 bits each).
  3. Function multiply_arrays:
    • Uses AVX intrinsics to load, multiply, and store 8 floats at a time.
    • _mm256_loadu_ps loads 8 floats from memory into an AVX register.
    • _mm256_mul_ps multiplies two AVX registers containing floats.
    • _mm256_storeu_ps stores the result from an AVX register back to memory.
  4. Main Function:
    • Initializes two float arrays a and b.
    • Calls multiply_arrays to multiply the arrays element-wise.
    • Prints the resulting array.
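
The same load, multiply, store pattern also works with the SSE intrinsics mentioned at the start, just with 128-bit registers that hold 4 floats instead of 8. Here is a minimal sketch of that variant. The function name multiply_arrays_sse is made up for this illustration, and the scalar loop at the end is one common way to handle sizes that are not a multiple of 4, not the only one.

```c
#include <stddef.h>
#include <xmmintrin.h>  // SSE intrinsics

// Hypothetical SSE counterpart of multiply_arrays: 4 floats per iteration.
void multiply_arrays_sse(const float *a, const float *b, float *result, size_t size) {
    size_t i = 0;

    // Process full blocks of 4 floats with 128-bit SSE registers
    for (; i + 4 <= size; i += 4) {
        __m128 vec_a = _mm_loadu_ps(&a[i]);  // unaligned load of 4 floats
        __m128 vec_b = _mm_loadu_ps(&b[i]);
        _mm_storeu_ps(&result[i], _mm_mul_ps(vec_a, vec_b));
    }

    // Scalar loop for the remaining 0 to 3 elements
    for (; i < size; i++) {
        result[i] = a[i] * b[i];
    }
}
```

Since SSE is part of the x86-64 baseline, this version should build with a plain gcc invocation on a 64-bit x86 target, without -mavx.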

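To compile and run the program:

gcc -o simd_example simd_example.c -mavx
./simd_example

This prints the element-wise product of the two arrays, computed with AVX SIMD instructions. For the inputs above, the output is:

1.000000 4.000000 9.000000 16.000000 25.000000 36.000000 49.000000 64.000000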
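
One caveat: a binary built with -mavx will crash with an illegal-instruction error on a CPU that lacks AVX. If that matters, a runtime check could look roughly like the sketch below. The helper name cpu_has_avx is made up for this example, and it assumes GCC or Clang on x86, where the __builtin_cpu_init and __builtin_cpu_supports builtins are available.

```c
// Hypothetical helper: returns 1 if the running CPU reports AVX support, else 0.
// Assumes GCC or Clang targeting x86; these builtins are compiler-specific.
int cpu_has_avx(void) {
    __builtin_cpu_init();  // initialize the CPU feature data before querying it
    return __builtin_cpu_supports("avx") ? 1 : 0;
}
```

A program could call cpu_has_avx() once at startup and fall back to a plain scalar loop when it returns 0. For the check to be reliable, the AVX code would usually live in its own source file compiled with -mavx, so the compiler does not emit AVX instructions in the code that runs before the check.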