Shared Memory in CUDA using Julia and CUDAnative
This series continues to outline what I am learning while building a machine learning algorithm (Neuroevolution) in Julia with CUDAnative.
As mentioned in the title, I am using CUDAnative to program on GPUs.
My setup is a late-2016 MacBook Pro with a GeForce GTX 1080 Ti. How I connect a GeForce GTX 1080 Ti to a MacBook Pro may be worth an article of its own.
Shared memory is typically used to share data between the threads of a block, in combination with a synchronization barrier. It also has the added benefit of being much faster than global memory. However, only a few dozen kilobytes of shared memory are available to all the threads in a block, so one cannot simply dump everything into shared memory just to get the speed increase.
using CUDAdrv, CUDAnative

function kernel(x)
    i = threadIdx().x
    # A single Int64 in shared memory, visible to every thread in the block.
    shared = @cuStaticSharedMem(Int64, 1)
    # Only the first thread writes the value.
    if i == 1
        shared[1] = 255
    end
    # Barrier: ensure the write has happened before any thread reads it.
    sync_threads()
    x[i] = shared[1]
    return nothing
end
d_x = CuArray{Int64,1}(10)   # uninitialized device array of 10 Int64s
@cuda (1, 10) kernel(d_x)    # launch 1 block of 10 threads
x = Array(d_x)               # copy the result back to the host
println(x)
OUTPUT: [255, 255, 255, 255, 255, 255, 255, 255, 255, 255]
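Shared memory really pays off when threads cooperate on intermediate results. Below is a minimal sketch of a block-level sum reduction built from the same pieces used above (@cuStaticSharedMem, sync_threads, and the @cuda launch). The kernel name reduce_kernel, the hard-coded block size of 16 (needed because @cuStaticSharedMem wants a compile-time size), and the array names d_in/d_out are my own choices for illustration.

using CUDAdrv, CUDAnative

function reduce_kernel(x, out)
    i = threadIdx().x
    # Stage one element per thread into shared memory.
    shared = @cuStaticSharedMem(Int64, 16)
    shared[i] = x[i]
    sync_threads()
    # Tree reduction: halve the number of active threads at each step.
    s = blockDim().x ÷ 2
    while s > 0
        if i <= s
            shared[i] += shared[i + s]
        end
        # Barrier so partial sums are visible before the next step reads them.
        sync_threads()
        s ÷= 2
    end
    # The first thread writes the block's total back to global memory.
    if i == 1
        out[1] = shared[1]
    end
    return nothing
end

d_in = CuArray(collect(Int64, 1:16))   # upload 1..16 to the device
d_out = CuArray{Int64,1}(1)
@cuda (1, 16) reduce_kernel(d_in, d_out)
println(Array(d_out))

If everything works, this should print [136], the sum of 1 through 16. Note that sync_threads() sits outside the if so that every thread reaches the barrier, and the loop runs the same number of times on every thread. The same pattern extends to multiple blocks by having each block write its partial sum to out[blockIdx().x].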