Wednesday, 1 July 2015

C++ array to Halide Image (and back)

I'm getting started with Halide, and although I've grasped the basic tenets of its design, I'm struggling with the particulars (read: magic) required to schedule computations efficiently.

I've posted below a minimal working example (MWE) that uses Halide to copy an array from one location to another. I had assumed this would compile down to only a handful of instructions and take less than a microsecond to run. Instead, it produces 4000 lines of assembly and takes 40 ms to run! Clearly there is a significant hole in my understanding.

  1. What is the canonical way of wrapping an existing array in a Halide::Image?
  2. How should the function copy be scheduled to perform the copy efficiently?

Minimal working example

#include <Halide.h>

using namespace Halide;

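// Copy an M x N (rows x columns) array of bytes from in_ptr to out_ptr.
// Renamed from _copy: names starting with an underscore are reserved
// in the global namespace.
void copy_array(uint8_t* in_ptr, uint8_t* out_ptr, const int M, const int N) {

    // Wrap the caller's arrays as N x M (width x height) 8-bit images.
    Image<uint8_t> in(Buffer(UInt(8), N, M, 0, 0, in_ptr));
    Image<uint8_t> out(Buffer(UInt(8), N, M, 0, 0, out_ptr));

    // Define the pipeline: every output pixel equals the input pixel.
    Var x,y;
    Func copy;
    copy(x,y) = in(x,y);

    // Evaluate the pipeline into the output image.
    copy.realize(out);
}

int main(void) {
    // Zero-initialize so the copy does not read indeterminate values.
    uint8_t in[10000] = {0}, out[10000] = {0};
    copy_array(in, out, 100, 100);
}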

Compilation Flags

clang++ -O3 -march=native -std=c++11 -Iinclude -Lbin copy.cpp -lHalide
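
Baseline for comparison

To make the "handful of instructions, under a microsecond" expectation concrete, the copy can be sketched in plain C++ without Halide. This is only a rough reference point, assuming a contiguous row-major array; `copy_baseline` and `time_ns` are hypothetical helper names introduced here for illustration and are not part of Halide.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical baseline (not part of Halide): copy an M x N array of
// bytes, assumed contiguous and row-major, with a single memcpy.
void copy_baseline(const uint8_t* in_ptr, uint8_t* out_ptr, int M, int N) {
    assert(M >= 0 && N >= 0);
    std::memcpy(out_ptr, in_ptr,
                static_cast<std::size_t>(M) * static_cast<std::size_t>(N));
}

// Hypothetical helper: wall-clock time of a single call to f, in nanoseconds.
template <typename F>
long long time_ns(F&& f) {
    auto t0 = std::chrono::steady_clock::now();
    f();
    auto t1 = std::chrono::steady_clock::now();
    return std::chrono::duration_cast<std::chrono::nanoseconds>(t1 - t0).count();
}
```

Calling `time_ns([&] { copy_baseline(in, out, 100, 100); })` gives a figure to set against the time taken by the Halide version. A single measurement is noisy, so averaging over many calls would be more reliable.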
