gpgpu - Metal compute function limitations




I have found that, with MTLBuffers, computationally intensive shader functions tend to stop calculating before all threadgroups are done. When I use MTLComputePipelineState and MTLComputeCommandEncoder to blur an image with very big blur radii, the resulting image is only halfway processed, and one can actually see the half-finished threadgroups. I did not narrow it down to the exact blur radius, but 16 pixels works fine, while 32 is too much: not even half of the groups are computed.

So are there any limitations on how long a shader function call may take to finish, or something like that? I have read most of the documentation on how to use the Metal framework and cannot recall stumbling upon any such statements.

Edit

Since in my case the problem was not a simple timeout but an internal error, I'm going to add some code.

The most expensive part is a block-matching algorithm that finds matching blocks in two images (i.e. consecutive frames of a movie):

    // exhaustive-search block-matching algorithm
    kernel void naivemotion(
        texture2d<float, access::read>  inputimage1 [[ texture(0) ]],
        texture2d<float, access::read>  inputimage2 [[ texture(1) ]],
        texture2d<float, access::write> outputimage [[ texture(2) ]],
        uint2 gid [[ thread_position_in_grid ]]
    )
    {
        // area to search for matches
        float searchsize = 10.0;
        int searchradius = searchsize/2;

        // window size to search in
        int kernelsize = 6;
        int kernelradius = kernelsize/2;

        // this will store the motion direction
        float2 vector = float2(0.0, 0.0);
        float2 maxvector = float2(searchsize, searchsize/2);
        float maxvectorlength = length(maxvector);

        // maximum error caused by noise
        float error = kernelsize*kernelsize*(10.0/255.0);

        for (int y = -searchradius; y < searchradius; ++y)
        {
            for (int x = 0; x < searchsize; ++x)
            {
                float diff = 0;
                for (int b = -kernelradius; b < kernelradius; ++b)
                {
                    for (int a = -kernelradius; a < kernelradius; ++a)
                    {
                        uint2 textureindex(gid.x + x + a, gid.y + y + b);
                        float4 targetcolor = inputimage2.read(textureindex).rgba;
                        float4 referencecolor = inputimage1.read(gid).rgba;

                        float targetgray = 0.299*targetcolor.r + 0.587*targetcolor.g + 0.114*targetcolor.b;
                        float referencegray = 0.299*referencecolor.r + 0.587*referencecolor.g + 0.114*referencecolor.b;

                        diff = diff + abs(targetgray - referencegray);
                    }
                }
                if ( error > diff )
                {
                    error = diff;
                    // vertical motion is rather irrelevant, and negative values
                    // can't be stored, so take the absolute value
                    vector = float2(x, abs(y));
                }
            }
        }
        float intensity = length(vector)/maxvectorlength;
        outputimage.write(float4(normalize(vector), intensity, 1), gid);
    }
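One detail of the kernel above that may be worth checking, whether or not it relates to the error: `gid` is unsigned, while `y`, `a` and `b` can be negative, so `gid.x + x + a` is evaluated in unsigned arithmetic and wraps around near the top and left image borders. The following host-side C++ sketch only illustrates that arithmetic. `rawCoordinate` and `clampedCoordinate` are hypothetical helper names, not part of Metal, and clamping is just one possible way to keep the index inside the texture.

```cpp
#include <cstdint>

// Mirrors `gid.x + x + a` from the kernel: an unsigned base plus signed
// offsets. The signed operands are converted to unsigned before the addition.
constexpr std::uint32_t rawCoordinate(std::uint32_t gid,
                                      std::int32_t offset,
                                      std::int32_t tap) {
    return gid + offset + tap;
}

// Hypothetical alternative: do the sum in a wider signed type and clamp the
// result to the valid range [0, size - 1].
constexpr std::uint32_t clampedCoordinate(std::uint32_t gid,
                                          std::int32_t offset,
                                          std::int32_t tap,
                                          std::uint32_t size) {
    const std::int64_t c = std::int64_t(gid) + offset + tap;
    return c < 0 ? 0u
                 : (c >= std::int64_t(size) ? size - 1 : std::uint32_t(c));
}

// At the left border, an offset of -3 wraps to a huge unsigned value.
static_assert(rawCoordinate(0, 0, -3) == 4294967293u, "wraps around");
static_assert(clampedCoordinate(0, 0, -3, 960) == 0u, "clamped to 0");
```

In the shader itself the same idea would presumably use the texture's width and height as the upper bounds.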

I am using this shader on a 960x540px image. With a searchsize of 9 and a kernelsize of 8, the shader runs on the whole image. After changing searchsize to 10, the shader stops with error code 1.
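To put numbers on the two configurations, here is a small C++ sketch that counts the texture reads implied by the kernel's loop bounds. It only quantifies the workload and does not establish what triggers the error. The helper names (`loopCount`, `readsPerThread`, `readsPerFrame`) are made up for this illustration.

```cpp
#include <cstdint>

// Number of iterations of `for (int i = lo; i < hi; ++i)`.
constexpr std::int64_t loopCount(int lo, int hi) {
    return hi > lo ? hi - lo : 0;
}

// Texture reads issued by one thread of naivemotion, following the loop
// bounds in the kernel: y in [-searchradius, searchradius), x in
// [0, searchsize), a and b in [-kernelradius, kernelradius).
constexpr std::int64_t readsPerThread(int searchsize, int kernelsize) {
    const int searchradius = searchsize / 2;  // truncates, as in the kernel
    const int kernelradius = kernelsize / 2;
    const std::int64_t yIters = loopCount(-searchradius, searchradius);
    const std::int64_t xIters = loopCount(0, searchsize);
    const std::int64_t window = loopCount(-kernelradius, kernelradius);
    return yIters * xIters * window * window * 2;  // two read() calls per tap
}

// Reads for a whole frame, assuming one thread per pixel.
constexpr std::int64_t readsPerFrame(int width, int height,
                                     int searchsize, int kernelsize) {
    return std::int64_t(width) * height * readsPerThread(searchsize, kernelsize);
}

static_assert(readsPerThread(9, 8) == 9216, "searchsize 9, kernelsize 8");
static_assert(readsPerThread(10, 8) == 12800, "searchsize 10, kernelsize 8");
```

Under these assumptions, going from a searchsize of 9 to 10 raises the per-thread count from 9216 to 12800 reads, or from roughly 4.8 to 6.6 billion reads over the 960x540 frame.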

gpgpu metal
