Class Kernel
- All Implemented Interfaces:
Cloneable
To write a new kernel, a developer extends the Kernel class and overrides the Kernel.run() method.
To execute this kernel, the developer creates a new instance of it and calls Kernel.execute(int globalSize) with a suitable 'global size'. At runtime
Aparapi will attempt to convert the Kernel.run() method (and any method called directly or indirectly
by Kernel.run()) into OpenCL for execution on GPU devices made available via the OpenCL platform.
Note that Kernel.run() is not called directly. Instead,
the Kernel.execute(int globalSize) method will cause the overridden Kernel.run()
method to be invoked once for each global id from 0 to globalSize - 1.
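The dispatch contract described above can be pictured with a plain-Java stand-in. This is only a sketch: `SequentialDispatch` and its `IntConsumer` body are hypothetical names that do not exist in Aparapi, and the loop is sequential, whereas a real runtime may run the invocations in parallel and in any order.

```java
import java.util.function.IntConsumer;

// Hypothetical stand-in for the dispatch contract; not part of Aparapi.
final class SequentialDispatch {
    // Invokes body once for each global id in 0..globalSize-1.
    static void execute(int globalSize, IntConsumer body) {
        for (int gid = 0; gid < globalSize; gid++) {
            body.accept(gid);
        }
    }

    // Same computation as the SquareKernel example, expressed against the stand-in.
    static int[] squares(int[] values) {
        int[] squares = new int[values.length];
        execute(values.length, gid -> squares[gid] = values[gid] * values[gid]);
        return squares;
    }
}
```

Because every invocation writes only to its own index, the result does not depend on the order in which the global ids are visited.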
On the first call to Kernel.execute(int _globalSize), Aparapi will determine the EXECUTION_MODE of the kernel.
This decision is made dynamically based on two factors:
- Whether OpenCL is available (appropriate drivers are installed and the OpenCL and Aparapi dynamic libraries are included on the system path).
- Whether the bytecode of the run() method (and every method that can be called directly or indirectly from the run() method) can be converted into OpenCL.
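The two factors above amount to a simple conjunction. The sketch below is illustrative only: `ModeDecision`, `Mode`, and `choose` are hypothetical names, and the Java thread pool is assumed as the alternative because this page names it as the fallback elsewhere. The real selection logic may consider more than these two inputs.

```java
// Hypothetical sketch of the two-factor decision; names are illustrative only.
final class ModeDecision {
    enum Mode { OPENCL, JAVA_THREAD_POOL }

    // OpenCL is chosen only when it is available AND the bytecode can be converted.
    static Mode choose(boolean openClAvailable, boolean bytecodeConvertible) {
        return (openClAvailable && bytecodeConvertible) ? Mode.OPENCL : Mode.JAVA_THREAD_POOL;
    }
}
```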
Below is an example Kernel that calculates the square of a set of input values.
class SquareKernel extends Kernel{
    private int values[];
    private int squares[];
    public SquareKernel(int values[]){
        this.values = values;
        squares = new int[values.length];
    }
    public void run() {
        int gid = getGlobalId();
        squares[gid] = values[gid]*values[gid];
    }
    public int[] getSquares(){
        return(squares);
    }
}
To execute this kernel, first create a new instance of it and then call execute(Range _range).
int[] values = new int[1024];
// fill values array
Range range = Range.create(values.length); // create a range 0..1023
SquareKernel kernel = new SquareKernel(values);
kernel.execute(range);
When execute(Range) returns, all the executions of Kernel.run() have completed and the results are available in the squares array.
int[] squares = kernel.getSquares();
for (int i=0; i<values.length; i++){
    System.out.printf("%4d %4d %8d\n", i, values[i], squares[i]);
}
A different approach to creating kernels that avoids extending Kernel is to write an anonymous inner class:
final int[] values = new int[1024];
// fill the values array
final int[] squares = new int[values.length];
final Range range = Range.create(values.length);
Kernel kernel = new Kernel(){
    public void run() {
        int gid = getGlobalId();
        squares[gid] = values[gid]*values[gid];
    }
};
kernel.execute(range);
for (int i=0; i<values.length; i++){
    System.out.printf("%4d %4d %8d\n", i, values[i], squares[i]);
}
- Version:
- Alpha, 21/09/2010
-
Nested Class Summary
Nested Classes (Modifier and Type: Description)
- static @interface: We can use this Annotation to 'tag' intended constant buffers.
- class
- static enum: Deprecated.
- final class: This class is for internal Kernel state management.
- static @interface: We can use this Annotation to 'tag' intended local buffers.
- static @interface: Annotation which can be applied to either a getter (with usual java bean naming convention relative to an instance field), or to any method with void return type, which prevents both the method body and any calls to the method being emitted in the generated OpenCL.
- protected static @interface: This annotation is for internal use only.
- protected static @interface: This annotation is for internal use only.
- static @interface: We can use this Annotation to 'tag' __private (unshared) array fields.
-
Field Summary
FieldsModifier and TypeFieldDescriptionprivate static final IntBinaryOperatorprivate static final ValueCache<Class<?>, Map<String, Boolean>, RuntimeException> private static final ValueCache<Class<?>, Map<String, Boolean>, RuntimeException> private booleanstatic final StringWe can use this suffix to 'tag' intended constant buffers.private Iterator<Kernel.EXECUTION_MODE> Deprecated.private Kernel.EXECUTION_MODEDeprecated.private final LinkedHashSet<Kernel.EXECUTION_MODE> Deprecated.private KernelRunnerprivate Kernel.KernelStatestatic final StringWe can use this suffix to 'tag' intended local buffers.private static final doubleprivate static Loggerprivate static final ValueCache<Class<?>, Map<String, Boolean>, RuntimeException> private static final ValueCache<Class<?>, Map<String, String>, RuntimeException> private static final IntBinaryOperatorprivate static final IntBinaryOperatorprivate static final ValueCache<Class<?>, Map<String, Boolean>, RuntimeException> private static final IntBinaryOperatorprivate static final doublestatic final StringWe can use this suffix to 'tag' __private buffers.(package private) booleanprivate static final IntBinaryOperator -
Constructor Summary
Constructors -
Method Summary
Modifier and TypeMethodDescriptionprotected doubleabs(double _d) Delegates to eitherMath.abs(double)(Java) orfabs(double)(OpenCL).protected floatabs(float _f) Delegates to eitherMath.abs(float)(Java) orfabs(float)(OpenCL).protected intabs(int n) Delegates to eitherMath.abs(int)(Java) orabs(int)(OpenCL).protected longabs(long n) Delegates to eitherMath.abs(long)(Java) orabs(long)(OpenCL).protected doubleacos(double a) Delegates to eitherMath.acos(double)(Java) oracos(double)(OpenCL).protected floatacos(float a) Delegates to eitherMath.acos(double)(Java) oracos(float)(OpenCL).protected final doubleacospi(double a) protected final floatacospi(float a) voidaddExecutionModes(Kernel.EXECUTION_MODE... platforms) Deprecated.protected doubleasin(double _d) Delegates to eitherMath.asin(double)(Java) orasin(double)(OpenCL).protected floatasin(float _f) Delegates to eitherMath.asin(double)(Java) orasin(float)(OpenCL).protected final doubleasinpi(double a) protected final floatasinpi(float a) protected doubleatan(double _d) Delegates to eitherMath.atan(double)(Java) oratan(double)(OpenCL).protected floatatan(float _f) Delegates to eitherMath.atan(double)(Java) oratan(float)(OpenCL).protected doubleatan2(double _d1, double _d2) Delegates to eitherMath.atan2(double, double)(Java) oratan2(double, double)(OpenCL).protected floatatan2(float _f1, float _f2) Delegates to eitherMath.atan2(double, double)(Java) oratan2(float, float)(OpenCL).protected final doubleatan2pi(double y, double x) protected final floatatan2pi(float y, double x) protected final doubleatanpi(double a) protected final floatatanpi(float a) protected intatomicAdd(int[] _arr, int _index, int _delta) Atomically adds_deltavalue to_indexelement of array_arr(Java) or delegates toatomic_add(volatile int*, int)(OpenCL).protected final intatomicAdd(AtomicInteger p, int val) protected final intatomicAnd(AtomicInteger p, int val) protected final intatomicCmpXchg(AtomicInteger p, int expectedVal, int newVal) protected final 
intprotected final intprotected final intprotected final intatomicMax(AtomicInteger p, int val) protected final intatomicMin(AtomicInteger p, int val) protected final intatomicOr(AtomicInteger p, int val) protected final voidatomicSet(AtomicInteger p, int val) protected final intatomicSub(AtomicInteger p, int val) protected final intatomicXchg(AtomicInteger p, int newVal) protected final intatomicXor(AtomicInteger p, int val) private static <K,V, T extends Throwable>
ValueCache<Class<?>, Map<K, V>, T> cacheProperty(ValueCache.ThrowingValueComputer<Class<?>, Map<K, V>, T> throwingValueComputer) voidInvoking this method flags that once the current pass is complete execution should be abandoned.protected final doublecbrt(double a) protected final floatcbrt(float a) protected doubleceil(double _d) Delegates to eitherMath.ceil(double)(Java) orceil(double)(OpenCL).protected floatceil(float _f) Delegates to eitherMath.ceil(double)(Java) orceil(float)(OpenCL).voidFrees the bulk of the resources used by this kernel, by setting array sizes in non-primitiveKernelArgs to 1 (0 size is prohibited) and invoking kernel execution on a zero size range.clone()When using a Java Thread Pool Aparapi uses clone to copy the initial instance to each thread.protected intclz(int _i) Delegates to eitherInteger.numberOfLeadingZeros(int)(Java) orclz(int)(OpenCL).protected longclz(long _l) Delegates to eitherLong.numberOfLeadingZeros(long)(Java) orclz(long)(OpenCL).Force pre-compilation of the kernel for a given device, without executing it.Force pre-compilation of the kernel for a given device, without executing it.protected doublecos(double _d) Delegates to eitherMath.cos(double)(Java) orcos(double)(OpenCL).protected floatcos(float _f) Delegates to eitherMath.cos(double)(Java) orcos(float)(OpenCL).protected final doublecosh(double x) protected final floatcosh(float x) protected final doublecospi(double a) protected final floatcospi(float a) protected RangecreateRange(int _range) private static Stringvoiddispose()Release any resources associated with this Kernel.execute(int _range) Start execution of_rangekernels.execute(int _range, int _passes) Start execution of_passesiterations over the_rangeof kernels.Start execution of_rangekernels.Start execution of_passesiterations of_rangekernels.Start execution ofglobalSizekernels for the given entrypoint.Start execution ofglobalSizekernels for the given entrypoint.voidexecuteFallbackAlgorithm(Range _range, int 
_passId) IfhasFallbackAlgorithm()has been overriden to return true, this method should be overriden so as to apply a single pass of the kernel's logic to the entire _range.protected doubleexp(double _d) Delegates to eitherMath.exp(double)(Java) orexp(double)(OpenCL).protected floatexp(float _f) Delegates to eitherMath.exp(double)(Java) orexp(float)(OpenCL).protected final doubleexp10(double a) protected final floatexp10(float a) protected final doubleexp2(double a) protected final floatexp2(float a) protected final doubleexpm1(double x) protected final floatexpm1(float x) protected doublefloor(double _d) Delegates to eitherMath.floor(double)(Java) orfloor(double)(OpenCL).protected floatfloor(float _f) Delegates to eitherMath.floor(double)(Java) orfloor(float)(OpenCL).protected doublefma(double a, double b, double c) Delegates to either {code}a*b+c{code} (Java) orfma(double, double, double)(OpenCL).protected floatfma(float a, float b, float c) Delegates to either {code}a*b+c{code} (Java) orfma(float, float, float)(OpenCL).get(boolean[] array) Enqueue a request to return this buffer from the GPU.get(boolean[][] array) Enqueue a request to return this buffer from the GPU.get(boolean[][][] array) Enqueue a request to return this buffer from the GPU.get(byte[] array) Enqueue a request to return this buffer from the GPU.get(byte[][] array) Enqueue a request to return this buffer from the GPU.get(byte[][][] array) Enqueue a request to return this buffer from the GPU.get(char[] array) Enqueue a request to return this buffer from the GPU.get(char[][] array) Enqueue a request to return this buffer from the GPU.get(char[][][] array) Enqueue a request to return this buffer from the GPU.get(double[] array) Enqueue a request to return this buffer from the GPU.get(double[][] array) Enqueue a request to return this buffer from the GPU.get(double[][][] array) Enqueue a request to return this buffer from the GPU.get(float[] array) Enqueue a request to return this buffer from the 
GPU.get(float[][] array) Enqueue a request to return this buffer from the GPU.get(float[][][] array) Enqueue a request to return this buffer from the GPU.get(int[] array) Enqueue a request to return this buffer from the GPU.get(int[][] array) Enqueue a request to return this buffer from the GPU.get(int[][][] array) Enqueue a request to return this buffer from the GPU.get(long[] array) Enqueue a request to return this buffer from the GPU.get(long[][] array) Enqueue a request to return this buffer from the GPU.get(long[][][] array) Enqueue a request to return this buffer from the GPU.doubleDetermine the total execution time of all previous Kernel.execute(range) calls for all threads that ran this kernel for the device used in the last kernel execution.doubleDetermine the total execution time of all produced profile reports from all threads that executed the current kernel on the specified device.doubleDetermine the total execution time of all previous kernel executions called from the current thread, calling this method, that executed the current kernel on the specified device.private static StringgetArgumentsLetters(Method method) private static booleangetBoolean(ValueCache<Class<?>, Map<String, Boolean>, RuntimeException> methodNamesCache, ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry) intdoubleDetermine the time taken to convert bytecode to OpenCL for first Kernel.execute(range) call.intDeprecated.doubleDetermine the execution time of the previous Kernel.execute(range) called from the last thread that ran and executed on the most recently used device.protected final intDetermine the globalId of an executing kernel.protected final intgetGlobalId(int _dim) protected final intDetermine the value that was passed toKernel.execute(int globalSize)method.protected final intgetGlobalSize(int _dim) protected final intDetermine the groupId of an executing kernel.protected final intgetGroupId(int _dim) int[]getKernelCompileWorkGroupSize(Device device) 
Retrieves the specified work-group size in the compiled kernel for the specified device or intermediate language for the device.longgetKernelLocalMemSizeInUse(Device device) Retrieves the amount of local memory used in the specified device by this kernel instance.intgetKernelMaxWorkGroupSize(Device device) Retrieves the maximum work-group size allowed for this kernel when running on the specified device.longRetrieves that minimum private memory in use per work item for this kernel instance and the specified device.intRetrieves the preferred work-group multiple in the specified device for this kernel instance.protected final intDetermine the local id of an executing kernel.protected final intgetLocalId(int _dim) protected final intDetermine the size of the group that an executing kernel is a member of.protected final intgetLocalSize(int _dim) static StringgetMappedMethodName(ClassModel.ConstantPool.MethodReferenceEntry _methodReferenceEntry) protected final intDetermine the number of groups that will be used to execute a kernelprotected final intgetNumGroups(int _dim) protected final intDetermine the passId of an executing kernel.Get the profiling information from the last successful call to Kernel.execute().getProfileReportCurrentThread(Device device) Retrieves the most recent complete report available for the current thread calling this method for the current kernel instance and executed on the given device.getProfileReportLastThread(Device device) Retrieves a profile report for the last thread that executed this kernel on the given device.private static <V,T extends Throwable>
VgetProperty(ValueCache<Class<?>, Map<String, V>, T> cache, ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry, V defaultValue) private static StringgetReturnTypeLetter(Method meth) final Deviceprotected final voidWait for all kernels in the current work group to rendezvous at this call before continuing execution.
It will also enforce memory ordering, such that modifications made by each thread in the work-group, to the memory, before entering into this barrier call will be visible by all threads leaving the barrier.booleanFalse by default.booleanDeprecated.protected doublehypot(double a, double b) protected floathypot(float a, float b) protected doubleIEEEremainder(double _d1, double _d2) Delegates to eitherMath.IEEEremainder(double, double)(Java) orremainder(double, double)(OpenCL).protected floatIEEEremainder(float _f1, float _f2) Delegates to eitherMath.IEEEremainder(double, double)(Java) orremainder(float, float)(OpenCL).static voidbooleanisAllowDevice(Device _device) booleanbooleanbooleanFor dev purposes (we should remove this for production) determine whether this Kernel uses explicit memory managementstatic booleanisMappedMethod(ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry) static booleanisOpenCLDelegateMethod(ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry) private static booleanisRelevant(Method method) booleanprotected final voidWait for all kernels in the current work group to rendezvous at this call before continuing execution.
It will also enforce memory ordering, such that modifications made by each thread in the work-group, to the memory, before entering into this barrier call will be visible by all threads leaving the barrier.protected final voidWait for all kernels in the current work group to rendezvous at this call before continuing execution.
It will also enforce memory ordering, such that modifications made by each thread in the work-group, to the memory, before entering into this barrier call will be visible by all threads leaving the barrier.protected doublelog(double _d) Delegates to eitherMath.log(double)(Java) orlog(double)(OpenCL).protected floatlog(float _f) Delegates to eitherMath.log(double)(Java) orlog(float)(OpenCL).protected final doublelog10(double a) protected final floatlog10(float a) protected final doublelog1p(double x) protected final floatlog1p(float x) protected final doublelog2(double a) protected final floatlog2(float a) protected final doublemad(double a, double b, double c) protected final floatmad(float a, float b, float c) private static <A extends Annotation>
ValueCache<Class<?>, Map<String, Boolean>, RuntimeException> markedWith(Class<A> annotationClass) protected doublemax(double _d1, double _d2) Delegates to eitherMath.max(double, double)(Java) orfmax(double, double)(OpenCL).protected floatmax(float _f1, float _f2) Delegates to eitherMath.max(float, float)(Java) orfmax(float, float)(OpenCL).protected intmax(int n1, int n2) Delegates to eitherMath.max(int, int)(Java) ormax(int, int)(OpenCL).protected longmax(long n1, long n2) Delegates to eitherMath.max(long, long)(Java) ormax(long, long)(OpenCL).protected doublemin(double _d1, double _d2) Delegates to eitherMath.min(double, double)(Java) orfmin(double, double)(OpenCL).protected floatmin(float _f1, float _f2) Delegates to eitherMath.min(float, float)(Java) orfmin(float, float)(OpenCL).protected intmin(int n1, int n2) Delegates to eitherMath.min(int, int)(Java) ormin(int, int)(OpenCL).protected longmin(long n1, long n2) Delegates to eitherMath.min(long, long)(Java) ormin(long, long)(OpenCL).private floatnative_rsqrt(float _f) private floatnative_sqrt(float _f) protected final doublenextAfter(double start, double direction) protected final floatnextAfter(float start, float direction) protected intpopcount(int _i) Delegates to eitherInteger.bitCount(int)(Java) orpopcount(int)(OpenCL).protected longpopcount(long _i) Delegates to eitherLong.bitCount(long)(Java) orpopcount(long)(OpenCL).protected doublepow(double _d1, double _d2) Delegates to eitherMath.pow(double, double)(Java) orpow(double, double)(OpenCL).protected floatpow(float _f1, float _f2) Delegates to eitherMath.pow(double, double)(Java) orpow(float, float)(OpenCL).private KernelRunnerput(boolean[] array) Tag this array so that it is explicitly enqueued before the kernel is executedput(boolean[][] array) Tag this array so that it is explicitly enqueued before the kernel is executedput(boolean[][][] array) Tag this array so that it is explicitly enqueued before the kernel is executedput(byte[] array) Tag this array 
so that it is explicitly enqueued before the kernel is executedput(byte[][] array) Tag this array so that it is explicitly enqueued before the kernel is executedput(byte[][][] array) Tag this array so that it is explicitly enqueued before the kernel is executedput(char[] array) Tag this array so that it is explicitly enqueued before the kernel is executedput(char[][] array) Tag this array so that it is explicitly enqueued before the kernel is executedput(char[][][] array) Tag this array so that it is explicitly enqueued before the kernel is executedput(double[] array) Tag this array so that it is explicitly enqueued before the kernel is executedput(double[][] array) Tag this array so that it is explicitly enqueued before the kernel is executedput(double[][][] array) Tag this array so that it is explicitly enqueued before the kernel is executedput(float[] array) Tag this array so that it is explicitly enqueued before the kernel is executedput(float[][] array) Tag this array so that it is explicitly enqueued before the kernel is executedput(float[][][] array) Tag this array so that it is explicitly enqueued before the kernel is executedput(int[] array) Tag this array so that it is explicitly enqueued before the kernel is executedput(int[][] array) Tag this array so that it is explicitly enqueued before the kernel is executedput(int[][][] array) Tag this array so that it is explicitly enqueued before the kernel is executedput(long[] array) Tag this array so that it is explicitly enqueued before the kernel is executedput(long[][] array) Tag this array so that it is explicitly enqueued before the kernel is executedput(long[][][] array) Tag this array so that it is explicitly enqueued before the kernel is executedvoidRegisters a new profile report observer to receive profile reports as they're produced.protected doublerint(double _d) Delegates to eitherMath.rint(double)(Java) orrint(double)(OpenCL).protected floatrint(float _f) Delegates to eitherMath.rint(double)(Java) 
orrint(float)(OpenCL).protected longround(double _d) Delegates to eitherMath.round(double)(Java) orround(double)(OpenCL).protected intround(float _f) Delegates to eitherMath.round(float)(Java) orround(float)(OpenCL).protected doublersqrt(double _d) Computes inverse square root usingMath.sqrt(double)(Java) or delegates torsqrt(double)(OpenCL).protected floatrsqrt(float _f) Computes inverse square root usingMath.sqrt(double)(Java) or delegates torsqrt(double)(OpenCL).abstract voidrun()The entry point of a kernel.voidsetAutoCleanUpArrays(boolean autoCleanUpArrays) Property which if true enables automatic calling ofcleanUpArrays()following each execution.voidsetExecutionMode(Kernel.EXECUTION_MODE _executionMode) Deprecated.voidsetExecutionModeWithoutFallback(Kernel.EXECUTION_MODE _executionMode) voidsetExplicit(boolean _explicit) For dev purposes (we should remove this for production) allow us to define that this Kernel uses explicit memory managementvoidDeprecated.protected doublesin(double _d) Delegates to eitherMath.sin(double)(Java) orsin(double)(OpenCL).protected floatsin(float _f) Delegates to eitherMath.sin(double)(Java) orsin(float)(OpenCL).protected final doublesinh(double x) Delegates to eitherMath.sinh(double)(Java) orsinh(double)(OpenCL).protected final floatsinh(float x) Delegates to eitherMath.sinh(double)(Java) orsinh(float)(OpenCL).protected final doublesinpi(double a) Backed by eitherMath.sin(double)(Java) orsinpi(double)(OpenCL).protected final floatsinpi(float a) Backed by eitherMath.sin(double)(Java) orsinpi(float)(OpenCL).protected doublesqrt(double _d) Delegates to eitherMath.sqrt(double)(Java) orsqrt(double)(OpenCL).protected floatsqrt(float _f) Delegates to eitherMath.sqrt(double)(Java) orsqrt(float)(OpenCL).protected doubletan(double _d) Delegates to eitherMath.tan(double)(Java) ortan(double)(OpenCL).protected floattan(float _f) Delegates to eitherMath.tan(double)(Java) ortan(float)(OpenCL).protected final doubletanh(double x) Delegates to 
eitherMath.tanh(double)(Java) ortanh(double)(OpenCL).protected final floattanh(float x) Delegates to eitherMath.tanh(float)(Java) ortanh(float)(OpenCL).protected final doubletanpi(double a) Backed by eitherMath.tan(double)(Java) ortanpi(double)(OpenCL).protected final floattanpi(float a) Backed by eitherMath.tan(double)(Java) ortanpi(float)(OpenCL).private static StringtoClassShortNameIfAny(Class<?> retClass) protected doubletoDegrees(double _d) Delegates to eitherMath.toDegrees(double)(Java) ordegrees(double)(OpenCL).protected floattoDegrees(float _f) Delegates to eitherMath.toDegrees(double)(Java) ordegrees(float)(OpenCL).protected doubletoRadians(double _d) Delegates to eitherMath.toRadians(double)(Java) orradians(double)(OpenCL).protected floattoRadians(float _f) Delegates to eitherMath.toRadians(double)(Java) orradians(float)(OpenCL).private static StringtoSignature(ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry) (package private) static StringtoSignature(Method method) toString()voidDeprecated.static booleanusesAtomic32(ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry) static booleanusesAtomic64(ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry)
-
Field Details
-
logger
-
LOCAL_SUFFIX
We can use this suffix to 'tag' intended local buffers. So either name the buffer
int[] buffer_$local$ = new int[1024];
Or use the Annotation form
@Local int[] buffer = new int[1024];
- See Also:
-
CONSTANT_SUFFIX
We can use this suffix to 'tag' intended constant buffers. So either name the buffer
int[] buffer_$constant$ = new int[1024];
Or use the Annotation form
@Constant int[] buffer = new int[1024];
- See Also:
-
PRIVATE_SUFFIX
We can use this suffix to 'tag' __private buffers. So either name the buffer
int[] buffer_$private$32 = new int[32];
Or use the Annotation form
@PrivateMemorySpace(32) int[] buffer = new int[32];
- See Also:
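To make the naming convention concrete, here is a minimal sketch of how a field name could be classified by its suffix. The suffix literals are taken from the examples above. `BufferTags`, `Space`, `spaceOf`, and `privateSize` are hypothetical names, and treating an untagged name as `GLOBAL` is an assumption rather than documented behaviour.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical classifier for the suffix convention; not part of Aparapi.
final class BufferTags {
    enum Space { GLOBAL, LOCAL, CONSTANT, PRIVATE }

    // A private buffer name ends in _$private$ followed by the array size.
    private static final Pattern PRIVATE = Pattern.compile(".*_\\$private\\$(\\d+)$");

    static Space spaceOf(String fieldName) {
        if (fieldName.endsWith("_$local$")) return Space.LOCAL;
        if (fieldName.endsWith("_$constant$")) return Space.CONSTANT;
        if (PRIVATE.matcher(fieldName).matches()) return Space.PRIVATE;
        return Space.GLOBAL; // assumption: untagged arrays are ordinary global buffers
    }

    // Returns the size encoded in a private-tagged name, or -1 if the name is not tagged.
    static int privateSize(String fieldName) {
        Matcher m = PRIVATE.matcher(fieldName);
        return m.matches() ? Integer.parseInt(m.group(1)) : -1;
    }
}
```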
-
kernelRunner
-
autoCleanUpArrays
private boolean autoCleanUpArrays -
kernelState
-
LOG_2_RECIPROCAL
private static final double LOG_2_RECIPROCAL -
PI_RECIPROCAL
private static final double PI_RECIPROCAL- See Also:
-
minOperator
-
maxOperator
-
andOperator
-
orOperator
-
xorOperator
-
typeToLetterMap
-
useNullForLocalSize
boolean useNullForLocalSize -
executionModes
Deprecated. -
currentMode
Deprecated. -
executionMode
Deprecated. -
mappedMethodFlags
-
openCLDelegateMethodFlags
private static final ValueCache<Class<?>, Map<String, Boolean>, RuntimeException> openCLDelegateMethodFlags -
atomic32Cache
-
atomic64Cache
-
mappedMethodNamesCache
private static final ValueCache<Class<?>, Map<String, String>, RuntimeException> mappedMethodNamesCache
-
-
Constructor Details
-
Kernel
public Kernel()
-
-
Method Details
-
getGlobalId
protected final int getGlobalId()
Determine the globalId of an executing kernel.
The kernel implementation uses the globalId to determine which of the executing kernels (in the global domain space) this invocation is expected to deal with.
For example in a SquareKernel implementation:
class SquareKernel extends Kernel{
    private int values[];
    private int squares[];
    public SquareKernel(int values[]){
        this.values = values;
        squares = new int[values.length];
    }
    public void run() {
        int gid = getGlobalId();
        squares[gid] = values[gid]*values[gid];
    }
    public int[] getSquares(){
        return(squares);
    }
}
Each invocation of SquareKernel.run() retrieves its globalId by calling getGlobalId(), and then computes the value of squares[gid] for a given value of values[gid].
- Returns:
- The globalId for the Kernel being executed
- See Also:
-
getGlobalId
protected final int getGlobalId(int _dim) -
getGroupId
protected final int getGroupId()
Determine the groupId of an executing kernel.
When Kernel.execute(int globalSize) is invoked for a particular kernel, the runtime will break the work into various 'groups'.
A kernel can use getGroupId() to determine which group it is currently dispatched to.
The following code would capture the groupId for each kernel and map it against globalId.
final int[] groupIds = new int[1024];
Kernel kernel = new Kernel(){
    public void run() {
        int gid = getGlobalId();
        groupIds[gid] = getGroupId();
    }
};
kernel.execute(groupIds.length);
for (int i=0; i<groupIds.length; i++){
    System.out.printf("%4d %4d\n", i, groupIds[i]);
}
- Returns:
- The groupId for this Kernel being executed
- See Also:
-
getGroupId
protected final int getGroupId(int _dim) -
getPassId
protected final int getPassId()
Determine the passId of an executing kernel.
When Kernel.execute(int globalSize, int passes) is invoked for a particular kernel, the runtime will execute the whole range once per pass, for the requested number of passes.
A kernel can use getPassId() to determine which pass it is in. This is ideal for 'reduce' type phases.
- Returns:
- The passId for this Kernel being executed
- See Also:
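A 'reduce' type phase driven by the pass id can be sketched in plain Java as follows. This is an emulation under stated assumptions, not Aparapi code: `MultiPassSum` is a hypothetical class, the `passId` is passed as a parameter instead of being read through `getPassId()`, and each pass is assumed to finish completely before the next one starts.

```java
// Hypothetical plain-Java emulation of execute(range, passes); not Aparapi code.
final class MultiPassSum {
    // One invocation of the kernel body for a given global id and pass id.
    // In pass p, every element at a multiple of 2^(p+1) absorbs its neighbour 2^p away.
    static void run(int[] data, int gid, int passId) {
        int stride = 1 << passId;
        if (gid % (2 * stride) == 0 && gid + stride < data.length) {
            data[gid] += data[gid + stride];
        }
    }

    // Runs `passes` passes over 0..range-1, one full pass after another.
    static void execute(int[] data, int range, int passes) {
        for (int passId = 0; passId < passes; passId++) {
            for (int gid = 0; gid < range; gid++) {
                run(data, gid, passId);
            }
        }
    }

    // Sums the input with ceil(log2(n)) passes; the total ends up in element 0.
    static int sum(int[] input) {
        int[] data = input.clone();
        int passes = 0;
        while ((1 << passes) < data.length) passes++;
        execute(data, data.length, passes);
        return data.length == 0 ? 0 : data[0];
    }
}
```

Within a single pass, the elements that are written are never read by another invocation of the same pass, so the body stays correct if the invocations of one pass run concurrently.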
-
getLocalId
protected final int getLocalId()
Determine the local id of an executing kernel.
When Kernel.execute(int globalSize) is invoked for a particular kernel, the runtime will break the work into various 'groups'.
getLocalId() can be used to determine the relative id of the current kernel within a specific group.
The following code would capture the local id for each kernel and map it against globalId.
final int[] localIds = new int[1024];
Kernel kernel = new Kernel(){
    public void run() {
        int gid = getGlobalId();
        localIds[gid] = getLocalId();
    }
};
kernel.execute(localIds.length);
for (int i=0; i<localIds.length; i++){
    System.out.printf("%4d %4d\n", i, localIds[i]);
}
- Returns:
- The local id for this Kernel being executed
- See Also:
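The relationship between global, group, and local ids can be summarised with simple index arithmetic. The sketch below assumes a one-dimensional range, equally sized groups, and no global offset, which matches the usual OpenCL convention but is not guaranteed by this page. `WorkItemIds` is a hypothetical helper.

```java
// Hypothetical 1-D index arithmetic, assuming equally sized groups and no global offset.
final class WorkItemIds {
    // Which group a given global id falls into.
    static int groupId(int globalId, int localSize) {
        return globalId / localSize;
    }

    // Position of a given global id inside its group.
    static int localId(int globalId, int localSize) {
        return globalId % localSize;
    }

    // Inverse mapping: rebuild the global id from the group and local ids.
    static int globalId(int groupId, int localId, int localSize) {
        return groupId * localSize + localId;
    }
}
```

With a local size of 64, for instance, global id 130 is the third item (local id 2) of the third group (group id 2).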
-
getLocalId
protected final int getLocalId(int _dim) -
getLocalSize
protected final int getLocalSize()
Determine the size of the group that an executing kernel is a member of.
When Kernel.execute(int globalSize) is invoked for a particular kernel, the runtime will break the work into various 'groups'. getLocalSize() allows a kernel to determine the size of the current group.
Note that groups may not all be the same size. In particular, if (global size)%(# of compute devices)!=0, the runtime can choose to dispatch kernels to groups with differing sizes.
- Returns:
- The size of the currently executing group.
- See Also:
-
getLocalSize
protected final int getLocalSize(int _dim) -
getGlobalSize
protected final int getGlobalSize()
Determine the value that was passed to the Kernel.execute(int globalSize) method.
- Returns:
- The value passed to Kernel.execute(int globalSize) causing the current execution.
- See Also:
-
getGlobalSize
protected final int getGlobalSize(int _dim) -
getNumGroups
protected final int getNumGroups()
Determine the number of groups that will be used to execute a kernel.
When Kernel.execute(int globalSize) is invoked, the runtime will split the work into multiple 'groups'. getNumGroups() returns the total number of groups that will be used.
- Returns:
- The number of groups that kernels will be dispatched into.
- See Also:
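For the simple case where every group has the same size, the group count follows directly from the global and local sizes. This is a sketch under that assumption only: `GroupCount` is a hypothetical helper, and it rejects inputs where the groups would differ in size instead of modelling them.

```java
// Hypothetical helper, valid only when all groups have the same size.
final class GroupCount {
    static int numGroups(int globalSize, int localSize) {
        if (localSize <= 0 || globalSize % localSize != 0) {
            throw new IllegalArgumentException("globalSize must be a multiple of a positive localSize");
        }
        return globalSize / localSize;
    }
}
```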
-
getNumGroups
protected final int getNumGroups(int _dim) -
run
public abstract void run()
The entry point of a kernel.
Every kernel must override this method.
-
hasFallbackAlgorithm
public boolean hasFallbackAlgorithm()
False by default. In the event that all preferred devices fail to execute a kernel, it is possible to supply an alternate (possibly non-parallel) execution algorithm by overriding this method to return true, and overriding executeFallbackAlgorithm(Range, int) with the alternate algorithm. -
executeFallbackAlgorithm
If hasFallbackAlgorithm() has been overridden to return true, this method should be overridden so as to apply a single pass of the kernel's logic to the entire _range.
This is not normally required, as fallback to JavaDevice.THREAD_POOL will implement the algorithm in parallel. However, in the event that thread pool execution may be prohibitively slow, this method might implement a "quick and dirty" approximation to the desired result (for example, a simple box-blur as opposed to a gaussian blur in an image processing application). -
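The pairing of the two overrides can be pictured with a small stand-in. Everything here is hypothetical: `FallbackSketch` only mirrors the method names described above, it takes an `int` range instead of a `Range`, and `executeAfterDeviceFailure` is an assumed summary of what happens once the preferred devices have failed.

```java
// Hypothetical stand-in that mirrors the method names above; it is not the Aparapi Kernel class.
abstract class FallbackSketch {
    boolean usedFallback = false;

    abstract void run(int gid);

    boolean hasFallbackAlgorithm() { return false; }

    // Applies one pass of the kernel's logic to the whole range (an int here, not a Range).
    void executeFallbackAlgorithm(int range, int passId) {
        throw new UnsupportedOperationException("no fallback supplied");
    }

    // Assumed behaviour once the preferred devices have failed.
    final void executeAfterDeviceFailure(int range, int passId) {
        if (hasFallbackAlgorithm()) {
            usedFallback = true;
            executeFallbackAlgorithm(range, passId);
        } else {
            for (int gid = 0; gid < range; gid++) run(gid);
        }
    }
}

final class DoublerSketch extends FallbackSketch {
    final int[] out;

    DoublerSketch(int n) { out = new int[n]; }

    @Override void run(int gid) { out[gid] = 2 * gid; }

    @Override boolean hasFallbackAlgorithm() { return true; }

    @Override void executeFallbackAlgorithm(int range, int passId) {
        // Same result as run(), computed in a single sequential loop.
        for (int i = 0; i < range; i++) out[i] = i + i;
    }
}
```

The two overrides go together: returning true from `hasFallbackAlgorithm()` without overriding `executeFallbackAlgorithm` would, in this sketch, end in an exception.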
cancelMultiPass
public void cancelMultiPass()
Invoking this method flags that once the current pass is complete execution should be abandoned. Due to the complexity of intercommunication between java (or C) and executing OpenCL, this is the best we can do for general cancellation of execution at present. OpenCL 2.0 should introduce pipe mechanisms which will support mid-pass cancellation easily.
Note that in the case of thread-pool/pure java execution we could do better already, using Thread.interrupt() (and/or other means) to abandon execution mid-pass. However at present this is not attempted.
- See Also:
-
getCancelState
public int getCancelState() -
getCurrentPass
public int getCurrentPass() - See Also:
-
isExecuting
public boolean isExecuting() - See Also:
-
clone
When using a Java thread pool, Aparapi uses clone to copy the initial instance to each thread. If you choose to override
clone() you are responsible for delegating to super.clone(). -
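The delegation requirement can be illustrated with plain Java. The sketch below is not Aparapi code: `CloneSketch` and its fields `shared` and `scratch` are hypothetical stand-ins for a kernel-like class. It shows an override that calls `super.clone()` first and then adjusts only the state that should be private to each copy.

```java
// Hypothetical stand-in for a kernel-like class; not an Aparapi type.
final class CloneSketch implements Cloneable {
    int[] shared = new int[8];   // results array: every copy should see the same one
    int[] scratch = new int[4];  // working storage that each copy should own

    @Override
    public CloneSketch clone() {
        try {
            // Delegate to super.clone() so the runtime type and all fields are copied.
            CloneSketch copy = (CloneSketch) super.clone();
            // super.clone() is shallow, so 'shared' still refers to the same array.
            // Only the scratch buffer is duplicated.
            copy.scratch = scratch.clone();
            return copy;
        } catch (CloneNotSupportedException e) {
            // Cannot happen because this class implements Cloneable.
            throw new AssertionError(e);
        }
    }
}
```

Whether a field should stay shared or be duplicated depends on the kernel; the point of the sketch is only that the override starts from `super.clone()`.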
acos
protected float acos(float a) Delegates to either Math.acos(double) (Java) or acos(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
a - value to delegate to Math.acos(double) / acos(float) - Returns:
Math.acos(double) cast to float / acos(float) - See Also:
-
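The float overloads in this family are described as calling the double-precision `Math` method and casting the result to float. A minimal sketch of that Java-side pattern follows, assuming the cast happens after the `Math` call; `FloatDelegationSketch` is a hypothetical name and this is not Aparapi's source.

```java
final class FloatDelegationSketch {
    // Assumed shape of the Java path: widen to double, call Math, narrow back to float.
    static float acos(float a) {
        return (float) Math.acos(a);
    }

    // The double overload delegates without any narrowing.
    static double acos(double a) {
        return Math.acos(a);
    }

    public static void main(String[] args) {
        float single = acos(0.5f);
        double full = acos(0.5);
        System.out.println(single);
        System.out.println(full);
        // The gap between the two results is the float rounding error.
        System.out.println(Math.abs(single - full));
    }
}
```

An OpenCL device computes `acos(float)` in single precision throughout, so its result may differ from the Java result in the last bits.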
acos
protected double acos(double a) Delegates to either Math.acos(double) (Java) or acos(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
a - value to delegate to Math.acos(double) / acos(double) - Returns:
Math.acos(double) / acos(double) - See Also:
-
asin
protected float asin(float _f) Delegates to either Math.asin(double) (Java) or asin(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
_f - value to delegate to Math.asin(double) / asin(float) - Returns:
Math.asin(double) cast to float / asin(float) - See Also:
-
asin
protected double asin(double _d) Delegates to either Math.asin(double) (Java) or asin(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
_d - value to delegate to Math.asin(double) / asin(double) - Returns:
Math.asin(double) / asin(double) - See Also:
-
atan
protected float atan(float _f) Delegates to either Math.atan(double) (Java) or atan(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
_f - value to delegate to Math.atan(double) / atan(float) - Returns:
Math.atan(double) cast to float / atan(float) - See Also:
-
atan
protected double atan(double _d) Delegates to either Math.atan(double) (Java) or atan(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
_d - value to delegate to Math.atan(double) / atan(double) - Returns:
Math.atan(double) / atan(double) - See Also:
-
atan2
protected float atan2(float _f1, float _f2) Delegates to either Math.atan2(double, double) (Java) or atan2(float, float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
_f1 - value to delegate to first argument of Math.atan2(double, double) / atan2(float, float) _f2 - value to delegate to second argument of Math.atan2(double, double) / atan2(float, float) - Returns:
Math.atan2(double, double) cast to float / atan2(float, float) - See Also:
-
atan2
protected double atan2(double _d1, double _d2) Delegates to either Math.atan2(double, double) (Java) or atan2(double, double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
_d1 - value to delegate to first argument of Math.atan2(double, double) / atan2(double, double) _d2 - value to delegate to second argument of Math.atan2(double, double) / atan2(double, double) - Returns:
Math.atan2(double, double) / atan2(double, double) - See Also:
-
ceil
protected float ceil(float _f) Delegates to either Math.ceil(double) (Java) or ceil(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
_f - value to delegate to Math.ceil(double) / ceil(float) - Returns:
Math.ceil(double) cast to float / ceil(float) - See Also:
-
ceil
protected double ceil(double _d) Delegates to either Math.ceil(double) (Java) or ceil(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
_d - value to delegate to Math.ceil(double) / ceil(double) - Returns:
Math.ceil(double) / ceil(double) - See Also:
-
cos
protected float cos(float _f) Delegates to either Math.cos(double) (Java) or cos(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
_f - value to delegate to Math.cos(double) / cos(float) - Returns:
Math.cos(double) cast to float / cos(float) - See Also:
-
cos
protected double cos(double _d) Delegates to either Math.cos(double) (Java) or cos(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
_d - value to delegate to Math.cos(double) / cos(double) - Returns:
Math.cos(double) / cos(double) - See Also:
-
exp
protected float exp(float _f) Delegates to either Math.exp(double) (Java) or exp(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
_f - value to delegate to Math.exp(double) / exp(float) - Returns:
Math.exp(double) cast to float / exp(float) - See Also:
-
exp
protected double exp(double _d) Delegates to either Math.exp(double) (Java) or exp(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
_d - value to delegate to Math.exp(double) / exp(double) - Returns:
Math.exp(double) / exp(double) - See Also:
-
abs
protected float abs(float _f) Delegates to either Math.abs(float) (Java) or fabs(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
_f - value to delegate to Math.abs(float) / fabs(float) - Returns:
Math.abs(float) / fabs(float) - See Also:
-
popcount
protected int popcount(int _i) Delegates to either Integer.bitCount(int) (Java) or popcount(int) (OpenCL). - Parameters:
_i - value to delegate to Integer.bitCount(int) / popcount(int) - Returns:
Integer.bitCount(int) / popcount(int) - See Also:
-
popcount
protected long popcount(long _i) Delegates to either Long.bitCount(long) (Java) or popcount(long) (OpenCL). - Parameters:
_i - value to delegate to Long.bitCount(long) / popcount(long) - Returns:
Long.bitCount(long) / popcount(long) - See Also:
-
clz
protected int clz(int _i) Delegates to either Integer.numberOfLeadingZeros(int) (Java) or clz(int) (OpenCL). - Parameters:
_i - value to delegate to Integer.numberOfLeadingZeros(int) / clz(int) - Returns:
Integer.numberOfLeadingZeros(int) / clz(int) - See Also:
-
clz
protected long clz(long _l) Delegates to either Long.numberOfLeadingZeros(long) (Java) or clz(long) (OpenCL). - Parameters:
_l - value to delegate to Long.numberOfLeadingZeros(long) / clz(long) - Returns:
Long.numberOfLeadingZeros(long) / clz(long) - See Also:
-
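The Java-side behaviour of popcount and clz follows directly from the JDK methods named above. The sketch below mirrors the documented signatures in a hypothetical class, `BitCountSketch`, so the values can be checked without an OpenCL device; it is an illustration, not Aparapi's implementation.

```java
final class BitCountSketch {
    // Number of one-bits, as returned by the JDK methods named in the entries above.
    static int popcount(int i) {
        return Integer.bitCount(i);
    }

    static long popcount(long l) {
        return Long.bitCount(l);
    }

    // Number of zero bits above the highest one-bit.
    static int clz(int i) {
        return Integer.numberOfLeadingZeros(i);
    }

    // The documented long overload returns long, so the int result is widened.
    static long clz(long l) {
        return Long.numberOfLeadingZeros(l);
    }

    public static void main(String[] args) {
        System.out.println(popcount(0b1011)); // three bits set
        System.out.println(clz(1));           // 31 leading zeros in a 32-bit int
        System.out.println(clz(1L));          // 63 leading zeros in a 64-bit long
    }
}
```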
abs
protected double abs(double _d) Delegates to either Math.abs(double) (Java) or fabs(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
_d - value to delegate to Math.abs(double) / fabs(double) - Returns:
Math.abs(double) / fabs(double) - See Also:
-
abs
protected int abs(int n) Delegates to either Math.abs(int) (Java) or abs(int) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
n - value to delegate to Math.abs(int) / abs(int) - Returns:
Math.abs(int) / abs(int) - See Also:
-
abs
protected long abs(long n) Delegates to either Math.abs(long) (Java) or abs(long) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
n - value to delegate to Math.abs(long) / abs(long) - Returns:
Math.abs(long) / abs(long) - See Also:
-
floor
protected float floor(float _f) Delegates to either Math.floor(double) (Java) or floor(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
_f - value to delegate to Math.floor(double) / floor(float) - Returns:
Math.floor(double) cast to float / floor(float) - See Also:
-
floor
protected double floor(double _d) Delegates to either Math.floor(double) (Java) or floor(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
_d - value to delegate to Math.floor(double) / floor(double) - Returns:
Math.floor(double) / floor(double) - See Also:
-
max
protected float max(float _f1, float _f2) Delegates to either Math.max(float, float) (Java) or fmax(float, float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
_f1 - value to delegate to first argument of Math.max(float, float) / fmax(float, float) _f2 - value to delegate to second argument of Math.max(float, float) / fmax(float, float) - Returns:
Math.max(float, float) / fmax(float, float) - See Also:
-
max
protected double max(double _d1, double _d2) Delegates to either Math.max(double, double) (Java) or fmax(double, double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
_d1 - value to delegate to first argument of Math.max(double, double) / fmax(double, double) _d2 - value to delegate to second argument of Math.max(double, double) / fmax(double, double) - Returns:
Math.max(double, double) / fmax(double, double) - See Also:
-
max
protected int max(int n1, int n2) Delegates to either Math.max(int, int) (Java) or max(int, int) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
n1 - value to delegate to first argument of Math.max(int, int) / max(int, int) n2 - value to delegate to second argument of Math.max(int, int) / max(int, int) - Returns:
Math.max(int, int) / max(int, int) - See Also:
-
max
protected long max(long n1, long n2) Delegates to either Math.max(long, long) (Java) or max(long, long) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
n1 - value to delegate to first argument of Math.max(long, long) / max(long, long) n2 - value to delegate to second argument of Math.max(long, long) / max(long, long) - Returns:
Math.max(long, long) / max(long, long) - See Also:
-
min
protected float min(float _f1, float _f2) Delegates to either Math.min(float, float) (Java) or fmin(float, float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
_f1 - value to delegate to first argument of Math.min(float, float) / fmin(float, float) _f2 - value to delegate to second argument of Math.min(float, float) / fmin(float, float) - Returns:
Math.min(float, float) / fmin(float, float) - See Also:
-
min
protected double min(double _d1, double _d2) Delegates to either Math.min(double, double) (Java) or fmin(double, double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
_d1 - value to delegate to first argument of Math.min(double, double) / fmin(double, double) _d2 - value to delegate to second argument of Math.min(double, double) / fmin(double, double) - Returns:
Math.min(double, double) / fmin(double, double) - See Also:
-
min
protected int min(int n1, int n2) Delegates to either Math.min(int, int) (Java) or min(int, int) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
n1 - value to delegate to first argument of Math.min(int, int) / min(int, int) n2 - value to delegate to second argument of Math.min(int, int) / min(int, int) - Returns:
Math.min(int, int) / min(int, int) - See Also:
-
min
protected long min(long n1, long n2) Delegates to either Math.min(long, long) (Java) or min(long, long) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
n1 - value to delegate to first argument of Math.min(long, long) / min(long, long) n2 - value to delegate to second argument of Math.min(long, long) / min(long, long) - Returns:
Math.min(long, long) / min(long, long) - See Also:
-
log
protected float log(float _f) Delegates to either Math.log(double) (Java) or log(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
_f - value to delegate to Math.log(double) / log(float) - Returns:
Math.log(double) cast to float / log(float) - See Also:
-
log
protected double log(double _d) Delegates to either Math.log(double) (Java) or log(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
_d - value to delegate to Math.log(double) / log(double) - Returns:
Math.log(double) / log(double) - See Also:
-
pow
protected float pow(float _f1, float _f2) Delegates to either Math.pow(double, double) (Java) or pow(float, float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
_f1 - value to delegate to first argument of Math.pow(double, double) / pow(float, float) _f2 - value to delegate to second argument of Math.pow(double, double) / pow(float, float) - Returns:
Math.pow(double, double) cast to float / pow(float, float) - See Also:
-
pow
protected double pow(double _d1, double _d2) Delegates to either Math.pow(double, double) (Java) or pow(double, double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
_d1 - value to delegate to first argument of Math.pow(double, double) / pow(double, double) _d2 - value to delegate to second argument of Math.pow(double, double) / pow(double, double) - Returns:
Math.pow(double, double) / pow(double, double) - See Also:
-
IEEEremainder
protected float IEEEremainder(float _f1, float _f2) Delegates to either Math.IEEEremainder(double, double) (Java) or remainder(float, float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
_f1 - value to delegate to first argument of Math.IEEEremainder(double, double) / remainder(float, float) _f2 - value to delegate to second argument of Math.IEEEremainder(double, double) / remainder(float, float) - Returns:
Math.IEEEremainder(double, double) cast to float / remainder(float, float) - See Also:
-
IEEEremainder
protected double IEEEremainder(double _d1, double _d2) Delegates to either Math.IEEEremainder(double, double) (Java) or remainder(double, double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
_d1 - value to delegate to first argument of Math.IEEEremainder(double, double) / remainder(double, double) _d2 - value to delegate to second argument of Math.IEEEremainder(double, double) / remainder(double, double) - Returns:
Math.IEEEremainder(double, double) / remainder(double, double) - See Also:
-
toRadians
protected float toRadians(float _f) Delegates to either Math.toRadians(double) (Java) or radians(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
_f - value to delegate to Math.toRadians(double) / radians(float) - Returns:
Math.toRadians(double) cast to float / radians(float) - See Also:
-
toRadians
protected double toRadians(double _d) Delegates to either Math.toRadians(double) (Java) or radians(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
_d - value to delegate to Math.toRadians(double) / radians(double) - Returns:
Math.toRadians(double) / radians(double) - See Also:
-
toDegrees
protected float toDegrees(float _f) Delegates to either Math.toDegrees(double) (Java) or degrees(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
_f - value to delegate to Math.toDegrees(double) / degrees(float) - Returns:
Math.toDegrees(double) cast to float / degrees(float) - See Also:
-
toDegrees
protected double toDegrees(double _d) Delegates to either Math.toDegrees(double) (Java) or degrees(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
_d - value to delegate to Math.toDegrees(double) / degrees(double) - Returns:
Math.toDegrees(double) / degrees(double) - See Also:
-
rint
protected float rint(float _f) Delegates to either Math.rint(double) (Java) or rint(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
_f - value to delegate to Math.rint(double) / rint(float) - Returns:
Math.rint(double) cast to float / rint(float) - See Also:
-
rint
protected double rint(double _d) Delegates to either Math.rint(double) (Java) or rint(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
_d - value to delegate to Math.rint(double) / rint(double) - Returns:
Math.rint(double) / rint(double) - See Also:
-
round
protected int round(float _f) Delegates to either Math.round(float) (Java) or round(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
_f - value to delegate to Math.round(float) / round(float) - Returns:
Math.round(float) / round(float) - See Also:
-
round
protected long round(double _d) Delegates to either Math.round(double) (Java) or round(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
_d - value to delegate to Math.round(double) / round(double) - Returns:
Math.round(double) / round(double) - See Also:
-
sin
protected float sin(float _f) Delegates to either Math.sin(double) (Java) or sin(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
_f - value to delegate to Math.sin(double) / sin(float) - Returns:
Math.sin(double) cast to float / sin(float) - See Also:
-
sin
protected double sin(double _d) Delegates to either Math.sin(double) (Java) or sin(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
_d - value to delegate to Math.sin(double) / sin(double) - Returns:
Math.sin(double) / sin(double) - See Also:
-
sqrt
protected float sqrt(float _f) Delegates to either Math.sqrt(double) (Java) or sqrt(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
_f - value to delegate to Math.sqrt(double) / sqrt(float) - Returns:
Math.sqrt(double) cast to float / sqrt(float) - See Also:
-
sqrt
protected double sqrt(double _d) Delegates to either Math.sqrt(double) (Java) or sqrt(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
_d - value to delegate to Math.sqrt(double) / sqrt(double) - Returns:
Math.sqrt(double) / sqrt(double) - See Also:
-
tan
protected float tan(float _f) Delegates to either Math.tan(double) (Java) or tan(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
_f - value to delegate to Math.tan(double) / tan(float) - Returns:
Math.tan(double) cast to float / tan(float) - See Also:
-
tan
protected double tan(double _d) Delegates to either Math.tan(double) (Java) or tan(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
_d - value to delegate to Math.tan(double) / tan(double) - Returns:
Math.tan(double) / tan(double) - See Also:
-
acospi
protected final double acospi(double a) -
acospi
protected final float acospi(float a) -
asinpi
protected final double asinpi(double a) -
asinpi
protected final float asinpi(float a) -
atanpi
protected final double atanpi(double a) -
atanpi
protected final float atanpi(float a) -
atan2pi
protected final double atan2pi(double y, double x) -
atan2pi
protected final float atan2pi(float y, double x) -
cbrt
protected final double cbrt(double a) -
cbrt
protected final float cbrt(float a) -
cosh
protected final double cosh(double x) -
cosh
protected final float cosh(float x) -
cospi
protected final double cospi(double a) -
cospi
protected final float cospi(float a) -
exp2
protected final double exp2(double a) -
exp2
protected final float exp2(float a) -
exp10
protected final double exp10(double a) -
exp10
protected final float exp10(float a) -
expm1
protected final double expm1(double x) -
expm1
protected final float expm1(float x) -
log2
protected final double log2(double a) -
log2
protected final float log2(float a) -
log10
protected final double log10(double a) -
log10
protected final float log10(float a) -
log1p
protected final double log1p(double x) -
log1p
protected final float log1p(float x) -
mad
protected final double mad(double a, double b, double c) -
mad
protected final float mad(float a, float b, float c) -
fma
protected float fma(float a, float b, float c) Delegates to either a*b+c (Java) or fma(float, float, float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
a - value to delegate to first argument of fma(float, float, float) b - value to delegate to second argument of fma(float, float, float) c - value to delegate to third argument of fma(float, float, float) - Returns:
- a * b + c /
fma(float, float, float) - See Also:
-
fma
protected double fma(double a, double b, double c) Delegates to either a*b+c (Java) or fma(double, double, double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
a - value to delegate to first argument of fma(double, double, double) b - value to delegate to second argument of fma(double, double, double) c - value to delegate to third argument of fma(double, double, double) - Returns:
- a * b + c /
fma(double, double, double) - See Also:
-
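The precision note matters more for fma than for most entries, because a*b+c rounds twice (after the multiply and after the add) while a fused multiply-add rounds once. The sketch below contrasts the two in plain Java. It uses `Math.fma` (available since Java 9) as a stand-in for the fused OpenCL result; `FmaSketch` and the chosen inputs are illustrative assumptions, not Aparapi code.

```java
final class FmaSketch {
    // The Java path described above: multiply, round, add, round.
    static float unfused(float a, float b, float c) {
        return a * b + c;
    }

    // Single rounding of the exact a*b+c (Java 9+), standing in for a fused device result.
    static float fused(float a, float b, float c) {
        return Math.fma(a, b, c);
    }

    public static void main(String[] args) {
        // a*a is exactly 1 + 2^-11 + 2^-24; the last term is lost when the product is rounded to float.
        float a = 1.0f + 0x1.0p-12f;
        float c = -(1.0f + 0x1.0p-11f);
        System.out.println(unfused(a, a, c)); // the low-order term has been rounded away
        System.out.println(fused(a, a, c));   // the low-order term 2^-24 survives
    }
}
```

With these inputs the unfused form returns zero and the fused form returns 2^-24, so code that depends on the low-order bits may behave differently between the Java and OpenCL paths.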
nextAfter
protected final double nextAfter(double start, double direction) -
nextAfter
protected final float nextAfter(float start, float direction) -
sinh
protected final double sinh(double x) Delegates to either Math.sinh(double) (Java) or sinh(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
x - value to delegate to Math.sinh(double) / sinh(double) - Returns:
Math.sinh(double) / sinh(double) - See Also:
-
sinh
protected final float sinh(float x) Delegates to either Math.sinh(double) (Java) or sinh(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
x - value to delegate to Math.sinh(double) / sinh(float) - Returns:
Math.sinh(double) / sinh(float) - See Also:
-
sinpi
protected final double sinpi(double a) Backed by either Math.sin(double) (Java) or sinpi(double) (OpenCL). This method is equivalent to Math.sin(a * Math.PI). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
a - value to delegate to sinpi(double) or Java equivalent - Returns:
sinpi(double) or Java equivalent - See Also:
-
sinpi
protected final float sinpi(float a) Backed by either Math.sin(double) (Java) or sinpi(float) (OpenCL). This method is equivalent to Math.sin(a * Math.PI). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
a - value to delegate to sinpi(float) or Java equivalent - Returns:
sinpi(float) or Java equivalent - See Also:
-
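The stated Java equivalent, Math.sin(a * Math.PI), can be tried directly. The sketch below assumes the float overload simply casts the double result; `SinPiSketch` is a hypothetical name. It also shows one place where the equivalence is only approximate: Math.PI is a rounded value of π, so the result at a = 1 is tiny but not zero.

```java
final class SinPiSketch {
    // Java-side equivalent named in the entries above.
    static double sinpi(double a) {
        return Math.sin(a * Math.PI);
    }

    // Assumed float variant: compute in double, then narrow.
    static float sinpi(float a) {
        return (float) Math.sin(a * Math.PI);
    }

    public static void main(String[] args) {
        System.out.println(sinpi(0.5)); // sine of a quarter turn
        System.out.println(sinpi(1.0)); // mathematically zero, but Math.PI is not exactly pi
    }
}
```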
tanh
protected final double tanh(double x) Delegates to either Math.tanh(double) (Java) or tanh(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
x - value to delegate to Math.tanh(double) / tanh(double) - Returns:
Math.tanh(double) / tanh(double) - See Also:
-
tanh
protected final float tanh(float x) Delegates to either Math.tanh(double) (Java) or tanh(float) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
x - value to delegate to Math.tanh(double) / tanh(float) - Returns:
Math.tanh(double) / tanh(float) - See Also:
-
tanpi
protected final double tanpi(double a) Backed by either Math.tan(double) (Java) or tanpi(double) (OpenCL). This method is equivalent to Math.tan(a * Math.PI). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
a - value to delegate to tanpi(double) or Java equivalent - Returns:
tanpi(double) or Java equivalent - See Also:
-
tanpi
protected final float tanpi(float a) Backed by either Math.tan(double) (Java) or tanpi(float) (OpenCL). This method is equivalent to Math.tan(a * Math.PI). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
a - value to delegate to tanpi(float) or Java equivalent - Returns:
tanpi(float) or Java equivalent - See Also:
-
rsqrt
protected float rsqrt(float _f) Computes inverse square root using Math.sqrt(double) (Java) or delegates to rsqrt(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
_f - value to delegate to Math.sqrt(double) / rsqrt(double) - Returns:
(1.0f / Math.sqrt(double) cast to float) / rsqrt(double) - See Also:
-
rsqrt
protected double rsqrt(double _d) Computes inverse square root using Math.sqrt(double) (Java) or delegates to rsqrt(double) (OpenCL). User should note the differences in precision between Java and OpenCL's implementation of arithmetic functions to determine whether the difference in precision is acceptable. - Parameters:
_d - value to delegate to Math.sqrt(double) / rsqrt(double) - Returns:
(1.0f / Math.sqrt(double)) / rsqrt(double) - See Also:
-
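On the Java side the inverse square root is described as one divided by Math.sqrt. A minimal sketch of that computation follows; `RsqrtSketch` is a hypothetical name, and the position of the float cast (before the division) is an assumption, since the entry above does not pin it down.

```java
final class RsqrtSketch {
    // Assumed float variant: take the square root in double, narrow, then divide.
    static float rsqrt(float f) {
        return 1.0f / (float) Math.sqrt(f);
    }

    static double rsqrt(double d) {
        return 1.0 / Math.sqrt(d);
    }

    public static void main(String[] args) {
        System.out.println(rsqrt(4.0f));
        System.out.println(rsqrt(4.0));
        System.out.println(rsqrt(0.0)); // division by zero yields positive infinity
    }
}
```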
native_sqrt
private float native_sqrt(float _f) -
native_rsqrt
private float native_rsqrt(float _f) -
atomicAdd
protected int atomicAdd(int[] _arr, int _index, int _delta) Atomically adds the _delta value to the _index element of array _arr (Java) or delegates to atomic_add(volatile int*, int) (OpenCL). - Parameters:
_arr - array for which an element value needs to be atomically incremented by _delta _index - index of the _arr array that needs to be atomically incremented by _delta _delta - value by which the _index element of the _arr array needs to be atomically incremented - Returns:
- previous value of
_index element of _arr array - See Also:
-
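The contract described for atomicAdd (add a delta to one array element and return the previous value) matches `AtomicIntegerArray.getAndAdd` in the JDK. The sketch below uses that class to illustrate the semantics. It is not how Aparapi implements the method, which takes a plain `int[]`; `AtomicAddSketch` and its helper methods are hypothetical.

```java
import java.util.concurrent.atomic.AtomicIntegerArray;

final class AtomicAddSketch {
    // Adds delta to arr[index] atomically and returns the value held before the add.
    static int atomicAdd(AtomicIntegerArray arr, int index, int delta) {
        return arr.getAndAdd(index, delta);
    }

    // Several threads bump the same element; no increments are lost.
    static int countWithThreads(int threads, int incrementsPerThread) {
        AtomicIntegerArray counters = new AtomicIntegerArray(1);
        Thread[] workers = new Thread[threads];
        for (int t = 0; t < threads; t++) {
            workers[t] = new Thread(() -> {
                for (int i = 0; i < incrementsPerThread; i++) {
                    atomicAdd(counters, 0, 1);
                }
            });
            workers[t].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counters.get(0);
    }

    // Shows the return value: the element held 5 before 3 was added.
    static int previousValueDemo() {
        AtomicIntegerArray arr = new AtomicIntegerArray(new int[] {5});
        return atomicAdd(arr, 0, 3);
    }
}
```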
atomicGet
-
atomicSet
-
atomicAdd
-
atomicSub
-
atomicXchg
-
atomicInc
-
atomicDec
-
atomicCmpXchg
-
atomicMin
-
atomicMax
-
atomicAnd
-
atomicOr
-
atomicXor
-
localBarrier
protected final void localBarrier() Wait for all kernels in the current work group to rendezvous at this call before continuing execution.
It will also enforce memory ordering, such that modifications made to memory by each thread in the work group before entering this barrier call will be visible to all threads leaving the barrier.
Note1: In OpenCL this executes as barrier(CLK_LOCAL_MEM_FENCE), which behaves differently from Java, because it only guarantees that modifications made to the local memory space are visible to all threads leaving the barrier.
Note2: OpenCL requires that all threads enter the same if blocks and iterate the same number of times in all loops (for, while, ...).
Note3: The Java versions of localBarrier(), globalBarrier() and localGlobalBarrier() are identical. -
globalBarrier
protected final void globalBarrier() Wait for all kernels in the current work group to rendezvous at this call before continuing execution.
It will also enforce memory ordering, such that modifications made to memory by each thread in the work group before entering this barrier call will be visible to all threads leaving the barrier.
Note1: In OpenCL this executes as barrier(CLK_GLOBAL_MEM_FENCE), which behaves differently from Java, because it only guarantees that modifications made to the global memory space are visible to all threads in the work group leaving the barrier.
Note2: OpenCL requires that all threads enter the same if blocks and iterate the same number of times in all loops (for, while, ...).
Note3: The Java versions of localBarrier(), globalBarrier() and localGlobalBarrier() are identical. -
localGlobalBarrier
protected final void localGlobalBarrier()
Wait for all kernels in the current work group to rendezvous at this call before continuing execution.
It also enforces memory ordering: modifications made to memory by each thread in the work group before entering this barrier call will be visible to all threads leaving the barrier.
Note1: When in doubt, use this barrier instead of localBarrier() or globalBarrier(), despite the possible performance loss.
Note2: In OpenCL this executes as barrier(CLK_LOCAL_MEM_FENCE | CLK_GLOBAL_MEM_FENCE), which behaves the same as in Java, because it guarantees visibility of modifications made to any of the memory spaces to all threads in the work group leaving the barrier.
Note3: In OpenCL, all threads must enter the same if blocks and must iterate the same number of times in all loops (for, while, ...).
Note4: The Java version is identical for localBarrier(), globalBarrier() and localGlobalBarrier(). -
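The rendezvous-plus-visibility behaviour described for the barrier methods can be modelled in plain Java. This is only a sketch under stated assumptions: it uses java.util.concurrent.CyclicBarrier rather than anything from Aparapi, a plain array stands in for work-group memory, and BarrierSketch and exchangeWithNeighbour are hypothetical names.

```java
import java.util.concurrent.BrokenBarrierException;
import java.util.concurrent.CyclicBarrier;

/** Plain-Java illustration of a work-group barrier; not Aparapi's implementation. */
class BarrierSketch {

    /** Each worker writes its own slot, waits at the barrier, then reads its neighbour's slot. */
    static int[] exchangeWithNeighbour(int groupSize) {
        int[] shared = new int[groupSize]; // stands in for memory shared by the work group
        int[] result = new int[groupSize];
        CyclicBarrier barrier = new CyclicBarrier(groupSize);
        Thread[] workers = new Thread[groupSize];
        for (int i = 0; i < groupSize; i++) {
            final int localId = i;
            workers[i] = new Thread(() -> {
                shared[localId] = localId * 10; // write phase
                try {
                    barrier.await(); // every worker must reach this same call
                } catch (InterruptedException | BrokenBarrierException e) {
                    throw new IllegalStateException(e);
                }
                // Read phase: writes made before the barrier are visible here.
                result[localId] = shared[(localId + 1) % groupSize];
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return result;
    }
}
```

With a group size of 4 the result is {10, 20, 30, 0}. The sketch also shows why every thread has to reach the barrier: if one worker skipped the await() call, the others would wait indefinitely.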
hypot
protected float hypot(float a, float b) -
hypot
protected double hypot(double a, double b) -
getKernelState
-
prepareKernelRunner
-
registerProfileReportObserver
Registers a new profile report observer to receive profile reports as they are produced. This is the recommended method when the client application wants a single observer to receive all the execution profiles for the current kernel instance, on all devices and over all client threads running the kernel.
Note1: A report is generated by a thread that finishes executing a kernel. In multithreaded execution environments it is up to the observer implementation to handle thread safety.
Note2: To cancel the report subscription, set the observer to null.
- Parameters:
observer - the observer instance that will receive the profile reports
-
getProfileReportLastThread
Retrieves a profile report for the last thread that executed this kernel on the given device. A report will only be available if at least one thread executed the kernel on the device.
Note1: If the profile report is intended to be kept in memory, the object should be cloned with ProfileReport.clone().
- Parameters:
device - the relevant device where the kernel executed
- Returns:
- the profiling report for the most recent execution
- null, if no profiling report is available for such thread
- See Also:
-
getProfileReportCurrentThread
Retrieves the most recent complete report available for the current thread calling this method for the current kernel instance and executed on the given device.
Note1: If the profile report is intended to be kept in memory, the object should be cloned with ProfileReport.clone().
Note2: If the thread didn't execute this kernel on the specified device, it will return null.
- Parameters:
device - the relevant device where the kernel executed
- Returns:
- the profiling report for the most recent execution
- null, if no profiling report is available for such thread
- See Also:
-
getExecutionTime
public double getExecutionTime()
Determine the execution time of the previous Kernel.execute(range) call, made from the last thread that ran, on the most recently used device.
Note1: This is kept for backwards compatibility only; usage of either getProfileReportLastThread(Device) or registerProfileReportObserver(IProfileReportObserver) is encouraged instead.
Note2: Calling this method is not recommended when using more than a single thread to execute the same kernel, or when running kernels on more than one device concurrently.
Note that for the first call this will include the conversion time.
- Returns:
- The time spent executing the kernel (ms)
- NaN, if no profile report is available
- See Also:
-
getConversionTime
public double getConversionTime()
Determine the time taken to convert bytecode to OpenCL for the first Kernel.execute(range) call.
Note1: This is kept for backwards compatibility only; usage of either getProfileReportLastThread(Device) or registerProfileReportObserver(IProfileReportObserver) is encouraged instead.
Note2: Calling this method is not recommended when using more than a single thread to execute the same kernel, or when running kernels on more than one device concurrently.
Note that for the first call this will include the conversion time.
- Returns:
- The time spent preparing the kernel for execution using GPU
- NaN, if no profile report is available
- See Also:
-
getAccumulatedExecutionTimeCurrentThread
Determine the total execution time of all previous executions of the current kernel on the specified device that were started from the thread calling this method.
Note1: This is the recommended method to retrieve the accumulated execution time for a single thread, even when doing multithreading for the same kernel and device.
Note that this will include the initial conversion time.
- Parameters:
device - the device of interest where the kernel executed
- Returns:
- The total time spent executing the kernel (ms)
- NaN, if no profiling information is available
- See Also:
-
getAccumulatedExecutionTimeAllThreads
Determine the total execution time across all profile reports produced by all threads that executed the current kernel on the specified device.
Note1: This is the recommended method to retrieve the accumulated execution time, even when doing multithreading for the same kernel and device.
Note that this will include the initial conversion time.
- Parameters:
device - the device of interest where the kernel executed
- Returns:
- The total time spent executing the kernel (ms)
- NaN, if no profiling information is available
- See Also:
-
getAccumulatedExecutionTime
public double getAccumulatedExecutionTime()
Determine the total execution time of all previous Kernel.execute(range) calls for all threads that ran this kernel for the device used in the last kernel execution.
Note1: This is kept for backwards compatibility only; usage of getAccumulatedExecutionTimeAllThreads(Device) is encouraged instead.
Note2: Calling this method is not recommended when using more than a single thread to execute the same kernel on multiple devices concurrently.
Note that this will include the initial conversion time.
- Returns:
- The total time spent executing the kernel (ms)
- NaN, if no profiling information is available
- See Also:
-
execute
Start execution of _range kernels.
When kernel.execute(globalSize) is invoked, Aparapi will schedule the execution of globalSize kernels. If the execution mode is GPU then the kernels will execute as OpenCL code on the GPU device. Otherwise, if the mode is JTP, the kernels will execute as a pool of Java threads on the CPU.
- Parameters:
_range - The number of Kernels that we would like to initiate.
-
toString
-
execute
Start execution of _range kernels.
When kernel.execute(_range) is invoked, Aparapi will schedule the execution of _range kernels. If the execution mode is GPU then the kernels will execute as OpenCL code on the GPU device. Otherwise, if the mode is JTP, the kernels will execute as a pool of Java threads on the CPU.
Since adding the new Range class, this method offers backward compatibility and merely defers to return (execute(Range.create(_range), 1));.
- Parameters:
_range - The number of Kernels that we would like to initiate.
-
createRange
-
execute
Start execution of _passes iterations of _range kernels.
When kernel.execute(_range, _passes) is invoked, Aparapi will schedule the execution of _range kernels. If the execution mode is GPU then the kernels will execute as OpenCL code on the GPU device. Otherwise, if the mode is JTP, the kernels will execute as a pool of Java threads on the CPU.
- Parameters:
_passes - The number of passes to make
- Returns:
- The Kernel instance (this) so we can chain calls to put(arr).execute(range).get(arr)
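One way to picture "_passes iterations of _range kernels" is as a nested loop. The sketch below is a sequential plain-Java model, not Aparapi's scheduler. It assumes that each pass covers the whole range before the next pass begins, and the names PassesSketch and Body are hypothetical stand-ins for a kernel and its run() method.

```java
/** Sequential plain-Java model of execute(_range, _passes); not Aparapi's scheduler. */
class PassesSketch {

    /** Hypothetical callback standing in for an overridden Kernel.run(). */
    interface Body {
        void run(int globalId, int passId);
    }

    /** Invokes body once per (pass, global id) pair, finishing each pass before the next starts. */
    static void execute(int range, int passes, Body body) {
        for (int passId = 0; passId < passes; passId++) {
            for (int globalId = 0; globalId < range; globalId++) {
                body.run(globalId, passId);
            }
        }
    }

    /** Adds 1 to every element on each of three passes. */
    static int[] demo() {
        int[] data = {1, 2, 3, 4};
        execute(data.length, 3, (globalId, passId) -> data[globalId] += 1);
        return data;
    }
}
```

Here demo() returns {4, 5, 6, 7}: each of the four elements is visited once per pass, for three passes in total.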
-
execute
Start execution of _passes iterations over the _range of kernels.
When kernel.execute(_range) is invoked, Aparapi will schedule the execution of _range kernels. If the execution mode is GPU then the kernels will execute as OpenCL code on the GPU device. Otherwise, if the mode is JTP, the kernels will execute as a pool of Java threads on the CPU.
Since adding the new Range class, this method offers backward compatibility and merely defers to return (execute(Range.create(_range), 1));.
- Parameters:
_range - The number of Kernels that we would like to initiate.
-
execute
Start execution of globalSize kernels for the given entrypoint.
When kernel.execute("entrypoint", globalSize) is invoked, Aparapi will schedule the execution of globalSize kernels. If the execution mode is GPU then the kernels will execute as OpenCL code on the GPU device. Otherwise, if the mode is JTP, the kernels will execute as a pool of Java threads on the CPU.
- Parameters:
_entrypoint - is the name of the method we wish to use as the entrypoint to the kernel
- Returns:
- The Kernel instance (this) so we can chain calls to put(arr).execute(range).get(arr)
-
execute
Start execution of globalSize kernels for the given entrypoint.
When kernel.execute("entrypoint", globalSize) is invoked, Aparapi will schedule the execution of globalSize kernels. If the execution mode is GPU then the kernels will execute as OpenCL code on the GPU device. Otherwise, if the mode is JTP, the kernels will execute as a pool of Java threads on the CPU.
- Parameters:
_entrypoint - is the name of the method we wish to use as the entrypoint to the kernel
- Returns:
- The Kernel instance (this) so we can chain calls to put(arr).execute(range).get(arr)
-
compile
Force pre-compilation of the kernel for a given device, without executing it.
- Parameters:
_device - the device for which the kernel is to be compiled
- Returns:
- the Kernel instance (this) so we can chain calls
- Throws:
CompileFailedException - if compilation failed for some reason
-
compile
Force pre-compilation of the kernel for a given device, without executing it.
- Parameters:
_entrypoint - is the name of the method we wish to use as the entrypoint to the kernel
_device - the device for which the kernel is to be compiled
- Returns:
- the Kernel instance (this) so we can chain calls
- Throws:
CompileFailedException - if compilation failed for some reason
-
getKernelMinimumPrivateMemSizeInUsePerWorkItem
public long getKernelMinimumPrivateMemSizeInUsePerWorkItem(Device device) throws QueryFailedException
Retrieves the minimum private memory in use per work item for this kernel instance and the specified device.
- Parameters:
device - the device where the kernel is intended to run
- Returns:
- the number of bytes used per work item
- Throws:
QueryFailedException - if the query couldn't complete
-
getKernelLocalMemSizeInUse
Retrieves the amount of local memory used in the specified device by this kernel instance.
- Parameters:
device - the device where the kernel is intended to run
- Returns:
- the number of bytes of local memory in use for the specified device and current kernel
- Throws:
QueryFailedException - if the query couldn't complete
-
getKernelPreferredWorkGroupSizeMultiple
Retrieves the preferred work-group multiple in the specified device for this kernel instance.
- Parameters:
device - the device where the kernel is intended to run
- Returns:
- the preferred work group multiple
- Throws:
QueryFailedException - if the query couldn't complete
-
getKernelMaxWorkGroupSize
Retrieves the maximum work-group size allowed for this kernel when running on the specified device.
- Parameters:
device - the device where the kernel is intended to run
- Returns:
- the maximum work-group size
- Throws:
QueryFailedException - if the query couldn't complete
-
getKernelCompileWorkGroupSize
Retrieves the work-group size specified in the compiled kernel for the specified device, or in the intermediate language for the device.
- Parameters:
device - the device where the kernel is intended to run
- Returns:
- the work-group size specified in the compiled kernel
- Throws:
QueryFailedException - if the query couldn't complete
-
isAutoCleanUpArrays
public boolean isAutoCleanUpArrays() -
setAutoCleanUpArrays
public void setAutoCleanUpArrays(boolean autoCleanUpArrays)
Property which, if true, enables automatic calling of cleanUpArrays() following each execution. -
cleanUpArrays
public void cleanUpArrays()
Frees the bulk of the resources used by this kernel, by setting array sizes in non-primitive KernelArgs to 1 (0 size is prohibited) and invoking kernel execution on a zero size range. Unlike dispose(), this does not prohibit further invocations of this kernel, as sundry resources such as OpenCL queues are not freed by this method.
This allows a "dormant" Kernel to remain in existence without undue strain on GPU resources, which may be strongly preferable to disposing a Kernel and recreating another one later, as creation/use of a new Kernel (specifically creation of its associated OpenCL context) is expensive.
Note that where the underlying array field is declared final, for obvious reasons it is not resized to zero.
-
dispose
public void dispose()
Release any resources associated with this Kernel.
When the execution mode is CPU or GPU, Aparapi stores some OpenCL resources in a data structure associated with the kernel instance. The dispose() method must be called to release these resources.
If execute(int _globalSize) is called after dispose() is called, the results are undefined. -
isRunningCL
public boolean isRunningCL() -
getTargetDevice
-
isAllowDevice
- Returns:
- true by default, may be overridden to allow vetoing of a device or devices by a given Kernel instance.
-
getExecutionMode
Deprecated. See Kernel.EXECUTION_MODE
Return the current execution mode. Before a Kernel executes, this return value will be the execution mode as determined by the setting of the EXECUTION_MODE enumeration. By default, this setting is either GPU if OpenCL is available on the target system, or JTP otherwise. This default setting can be changed by calling setExecutionMode().
After a Kernel executes, the return value will be the mode in which the Kernel actually executed.
- Returns:
- The current execution mode.
- See Also:
-
setExecutionMode
Deprecated. See Kernel.EXECUTION_MODE
Set the execution mode.
This should be regarded as a request. The real mode will be determined at runtime based on the availability of OpenCL and the characteristics of the workload.
- Parameters:
_executionMode - the requested execution mode.
- See Also:
-
setExecutionModeWithoutFallback
-
setFallbackExecutionMode
Deprecated. -
descriptorToReturnTypeLetter
-
getReturnTypeLetter
-
toClassShortNameIfAny
-
getMappedMethodName
public static String getMappedMethodName(ClassModel.ConstantPool.MethodReferenceEntry _methodReferenceEntry) -
isMappedMethod
public static boolean isMappedMethod(ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry) -
isOpenCLDelegateMethod
public static boolean isOpenCLDelegateMethod(ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry) -
usesAtomic32
public static boolean usesAtomic32(ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry) -
usesAtomic64
public static boolean usesAtomic64(ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry) -
setExplicit
public void setExplicit(boolean _explicit)
For dev purposes (we should remove this for production) allow us to define that this Kernel uses explicit memory management
- Parameters:
_explicit - (true if we want explicit memory management)
-
isExplicit
public boolean isExplicit()
For dev purposes (we should remove this for production) determine whether this Kernel uses explicit memory management
- Returns:
- (true if the kernel is using explicit memory management)
-
put
Tag this array so that it is explicitly enqueued before the kernel is executed.
- Parameters:
array -
- Returns:
- This kernel so that we can use the 'fluent' style API
-
put
Tag this array so that it is explicitly enqueued before the kernel is executed.
- Parameters:
array -
- Returns:
- This kernel so that we can use the 'fluent' style API
-
put
Tag this array so that it is explicitly enqueued before the kernel is executed.
- Parameters:
array -
- Returns:
- This kernel so that we can use the 'fluent' style API
-
put
Tag this array so that it is explicitly enqueued before the kernel is executed.
- Parameters:
array -
- Returns:
- This kernel so that we can use the 'fluent' style API
-
put
Tag this array so that it is explicitly enqueued before the kernel is executed.
- Parameters:
array -
- Returns:
- This kernel so that we can use the 'fluent' style API
-
put
Tag this array so that it is explicitly enqueued before the kernel is executed.
- Parameters:
array -
- Returns:
- This kernel so that we can use the 'fluent' style API
-
put
Tag this array so that it is explicitly enqueued before the kernel is executed.
- Parameters:
array -
- Returns:
- This kernel so that we can use the 'fluent' style API
-
put
Tag this array so that it is explicitly enqueued before the kernel is executed.
- Parameters:
array -
- Returns:
- This kernel so that we can use the 'fluent' style API
-
put
Tag this array so that it is explicitly enqueued before the kernel is executed.
- Parameters:
array -
- Returns:
- This kernel so that we can use the 'fluent' style API
-
put
Tag this array so that it is explicitly enqueued before the kernel is executed.
- Parameters:
array -
- Returns:
- This kernel so that we can use the 'fluent' style API
-
put
Tag this array so that it is explicitly enqueued before the kernel is executed.
- Parameters:
array -
- Returns:
- This kernel so that we can use the 'fluent' style API
-
put
Tag this array so that it is explicitly enqueued before the kernel is executed.
- Parameters:
array -
- Returns:
- This kernel so that we can use the 'fluent' style API
-
put
Tag this array so that it is explicitly enqueued before the kernel is executed.
- Parameters:
array -
- Returns:
- This kernel so that we can use the 'fluent' style API
-
put
Tag this array so that it is explicitly enqueued before the kernel is executed.
- Parameters:
array -
- Returns:
- This kernel so that we can use the 'fluent' style API
-
put
Tag this array so that it is explicitly enqueued before the kernel is executed.
- Parameters:
array -
- Returns:
- This kernel so that we can use the 'fluent' style API
-
put
Tag this array so that it is explicitly enqueued before the kernel is executed.
- Parameters:
array -
- Returns:
- This kernel so that we can use the 'fluent' style API
-
put
Tag this array so that it is explicitly enqueued before the kernel is executed.
- Parameters:
array -
- Returns:
- This kernel so that we can use the 'fluent' style API
-
put
Tag this array so that it is explicitly enqueued before the kernel is executed.
- Parameters:
array -
- Returns:
- This kernel so that we can use the 'fluent' style API
-
put
Tag this array so that it is explicitly enqueued before the kernel is executed.
- Parameters:
array -
- Returns:
- This kernel so that we can use the 'fluent' style API
-
put
Tag this array so that it is explicitly enqueued before the kernel is executed.
- Parameters:
array -
- Returns:
- This kernel so that we can use the 'fluent' style API
-
put
Tag this array so that it is explicitly enqueued before the kernel is executed.
- Parameters:
array -
- Returns:
- This kernel so that we can use the 'fluent' style API
-
get
Enqueue a request to return this buffer from the GPU. This method blocks until the array is available.
- Parameters:
array -
- Returns:
- This kernel so that we can use the 'fluent' style API
-
get
Enqueue a request to return this buffer from the GPU. This method blocks until the array is available.
- Parameters:
array -
- Returns:
- This kernel so that we can use the 'fluent' style API
-
get
Enqueue a request to return this buffer from the GPU. This method blocks until the array is available.
- Parameters:
array -
- Returns:
- This kernel so that we can use the 'fluent' style API
-
get
Enqueue a request to return this buffer from the GPU. This method blocks until the array is available.
- Parameters:
array -
- Returns:
- This kernel so that we can use the 'fluent' style API
-
get
Enqueue a request to return this buffer from the GPU. This method blocks until the array is available.
- Parameters:
array -
- Returns:
- This kernel so that we can use the 'fluent' style API
-
get
Enqueue a request to return this buffer from the GPU. This method blocks until the array is available.
- Parameters:
array -
- Returns:
- This kernel so that we can use the 'fluent' style API
-
get
Enqueue a request to return this buffer from the GPU. This method blocks until the array is available.
- Parameters:
array -
- Returns:
- This kernel so that we can use the 'fluent' style API
-
get
Enqueue a request to return this buffer from the GPU. This method blocks until the array is available.
- Parameters:
array -
- Returns:
- This kernel so that we can use the 'fluent' style API
-
get
Enqueue a request to return this buffer from the GPU. This method blocks until the array is available.
- Parameters:
array -
- Returns:
- This kernel so that we can use the 'fluent' style API
-
get
Enqueue a request to return this buffer from the GPU. This method blocks until the array is available.
- Parameters:
array -
- Returns:
- This kernel so that we can use the 'fluent' style API
-
get
Enqueue a request to return this buffer from the GPU. This method blocks until the array is available.
- Parameters:
array -
- Returns:
- This kernel so that we can use the 'fluent' style API
-
get
Enqueue a request to return this buffer from the GPU. This method blocks until the array is available.
- Parameters:
array -
- Returns:
- This kernel so that we can use the 'fluent' style API
-
get
Enqueue a request to return this buffer from the GPU. This method blocks until the array is available.
- Parameters:
array -
- Returns:
- This kernel so that we can use the 'fluent' style API
-
get
Enqueue a request to return this buffer from the GPU. This method blocks until the array is available.
- Parameters:
array -
- Returns:
- This kernel so that we can use the 'fluent' style API
-
get
Enqueue a request to return this buffer from the GPU. This method blocks until the array is available.
- Parameters:
array -
- Returns:
- This kernel so that we can use the 'fluent' style API
-
get
Enqueue a request to return this buffer from the GPU. This method blocks until the array is available.
- Parameters:
array -
- Returns:
- This kernel so that we can use the 'fluent' style API
-
get
Enqueue a request to return this buffer from the GPU. This method blocks until the array is available.
- Parameters:
array -
- Returns:
- This kernel so that we can use the 'fluent' style API
-
get
Enqueue a request to return this buffer from the GPU. This method blocks until the array is available.
- Parameters:
array -
- Returns:
- This kernel so that we can use the 'fluent' style API
-
get
Enqueue a request to return this buffer from the GPU. This method blocks until the array is available.
- Parameters:
array -
- Returns:
- This kernel so that we can use the 'fluent' style API
-
get
Enqueue a request to return this buffer from the GPU. This method blocks until the array is available.
- Parameters:
array -
- Returns:
- This kernel so that we can use the 'fluent' style API
-
get
Enqueue a request to return this buffer from the GPU. This method blocks until the array is available.
- Parameters:
array -
- Returns:
- This kernel so that we can use the 'fluent' style API
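The 'fluent' style that put and get make possible, as in put(arr).execute(range).get(arr), can be modelled with a small stand-in class. This is a hedged sketch only: FluentTransferSketch is hypothetical, it simulates device memory with private array copies, and it does not use or reproduce any Aparapi code.

```java
import java.util.IdentityHashMap;
import java.util.Map;

/** Hypothetical stand-in for a kernel with explicit buffer transfers; not Aparapi code. */
class FluentTransferSketch {
    // Simulated device memory: one private copy per host array, keyed by identity.
    private final Map<int[], int[]> deviceBuffers = new IdentityHashMap<>();
    private final int[] data;

    FluentTransferSketch(int[] data) {
        this.data = data;
    }

    /** Copies the host array to the simulated device and returns this for chaining. */
    FluentTransferSketch put(int[] array) {
        deviceBuffers.put(array, array.clone());
        return this;
    }

    /** Doubles the first globalSize elements of the device copy; the host array is untouched. */
    FluentTransferSketch execute(int globalSize) {
        int[] device = deviceBuffers.get(data);
        if (device == null) {
            throw new IllegalStateException("data was never put()");
        }
        for (int gid = 0; gid < globalSize; gid++) {
            device[gid] *= 2;
        }
        return this;
    }

    /** Copies the device buffer back into the host array and returns this for chaining. */
    FluentTransferSketch get(int[] array) {
        int[] device = deviceBuffers.get(array);
        if (device == null) {
            throw new IllegalStateException("array was never put()");
        }
        System.arraycopy(device, 0, array, 0, array.length);
        return this;
    }
}
```

With int[] values = {1, 2, 3}, the chain new FluentTransferSketch(values).put(values).execute(values.length).get(values) leaves values as {2, 4, 6}. Until get is called the host array still holds its original contents, which is the point of managing transfers explicitly.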
-
getProfileInfo
Get the profiling information from the last successful call to Kernel.execute().
- Returns:
- A list of ProfileInfo records
-
addExecutionModes
Deprecated. See Kernel.EXECUTION_MODE.
Set a possible fallback path for execution modes. For example, setExecutionFallbackPath(GPU,CPU,JTP) will try to use the GPU; if that fails it will fall back to OpenCL CPU, and finally it will try JTP.
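An ordered fallback path of this kind can be modelled as a queue that is consumed until one mode works. The sketch below is plain Java written for illustration: the Mode enum, the firstWorking helper and the class name are hypothetical, and the code makes no claim about how Aparapi stores or walks its own path.

```java
import java.util.ArrayDeque;
import java.util.Arrays;
import java.util.Deque;
import java.util.function.Predicate;

/** Plain-Java model of an ordered fallback path; all names here are hypothetical. */
class FallbackPathSketch {
    enum Mode { GPU, CPU, JTP }

    private final Deque<Mode> remaining = new ArrayDeque<>();

    /** Records the modes to try, in preference order. */
    void addExecutionModes(Mode... modes) {
        remaining.addAll(Arrays.asList(modes));
    }

    /** Reports whether another mode is left to try. */
    boolean hasNextExecutionMode() {
        return !remaining.isEmpty();
    }

    /** Walks the path in order and returns the first mode accepted by works, or null if none is. */
    Mode firstWorking(Predicate<Mode> works) {
        while (hasNextExecutionMode()) {
            Mode candidate = remaining.poll();
            if (works.test(candidate)) {
                return candidate;
            }
        }
        return null; // every mode on the path failed
    }
}
```

If the path is GPU, CPU, JTP and the predicate rejects GPU, firstWorking returns CPU and JTP stays queued as the next fallback.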
-
hasNextExecutionMode
Deprecated.
- Returns:
- whether there is another execution path we can try
-
tryNextExecutionMode
Deprecated. See Kernel.EXECUTION_MODE.
Try the next execution path in the list; if there aren't any more, then give up. -
getBoolean
private static boolean getBoolean(ValueCache<Class<?>, Map<String, Boolean>, RuntimeException> methodNamesCache, ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry) -
markedWith
private static <A extends Annotation> ValueCache<Class<?>, Map<String, Boolean>, RuntimeException> markedWith(Class<A> annotationClass) -
toSignature
-
getArgumentsLetters
-
isRelevant
-
getProperty
private static <V, T extends Throwable> V getProperty(ValueCache<Class<?>, Map<String, V>, T> cache, ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry, V defaultValue) throws T
- Throws:
T
-
toSignature
private static String toSignature(ClassModel.ConstantPool.MethodReferenceEntry methodReferenceEntry) -
cacheProperty
private static <K, V, T extends Throwable> ValueCache<Class<?>, Map<K, V>, T> cacheProperty(ValueCache.ThrowingValueComputer<Class<?>, Map<K, V>, T> throwingValueComputer) -
invalidateCaches
public static void invalidateCaches()
-
EXECUTION_MODEs are used, as a more sophisticated Device preference mechanism is in place, see KernelManager.