MathGroup Archive: March 2013 [00208]


OpenCL yes, Cuda no

To: mathgroup at smc.vnet.net
Subject: [mg130188] OpenCL yes, Cuda no
From: János Löbb <janos at lobb.com>
Date: Mon, 18 Mar 2013 05:36:08 -0400 (EDT)

Hi,

The help says that if OpenCL is not working with an NVIDIA card, you should check whether CUDA works, and if it does, then OpenCL might also work.

Well, OpenCL looks like it is working:

Needs["OpenCLLink`"]
In[15]:= OpenCLQ[]
Out[15]= True

In[16]:= OpenCLInformation[]
Out[16]= {1->{Version->OpenCL 1.1 CUDA 4.2.1,Name->NVIDIA CUDA,Vendor->NVIDIA Corporation,
  Extensions->{cl_khr_byte_addressable_store,cl_khr_icd,cl_khr_gl_sharing,
    cl_nv_compiler_options,cl_nv_device_attribute_query,cl_nv_pragma_unroll},
  1->{Type->GPU,Name->Tesla T10 Processor,Version->OpenCL 1.0 CUDA,
    Extensions->{cl_khr_byte_addressable_store,cl_khr_icd,cl_khr_gl_sharing,
      cl_nv_compiler_options,cl_nv_device_attribute_query,cl_nv_pragma_unroll,
      cl_khr_global_int32_base_atomics,cl_khr_global_int32_extended_atomics,
      cl_khr_local_int32_base_atomics,cl_khr_local_int32_extended_atomics,cl_khr_fp64},
    Driver Version->304.54,Vendor->NVIDIA Corporation,Profile->FULL_PROFILE,Vendor ID->4318,
    Compute Units->30,Core Count->240,Maximum Work Item Dimensions->3,
    Maximum Work Item Sizes->{512,512,64},Maximum Work Group Size->512,
    Preferred Vector Width Character->1,Preferred Vector Width Short->1,
    Preferred Vector Width Integer->1,Preferred Vector Width Long->1,
    Preferred Vector Width Float->1,Preferred Vector Width Double->1,
    Maximum Clock Frequency->1440,Address Bits->32,Maximum Memory Allocation Size->1073692672,
    Image Support->True,Maximum Read Image Arguments->128,Maximum Write Image Arguments->8,
    Maximum Image2D Width->4096,Maximum Image2D Height->16383,Maximum Image3D Width->2048,
    Maximum Image3D Height->2048,Maximum Image3D Depth->2048,Maximum Samplers->16,
    Maximum Parameter Size->4352,Memory Base Address Align->2048,
    Memory Data Type Align Size->128,
    Floating Point Precision Configuration->{Infinity,NaNs,Round to Nearest,
      Round to Infinity,Round to Zero,IEEE754-2008 Fused MAD},
    Global Memory Cache Type->None,Global Memory Cache Line Size->0,
    Global Memory Cache Size->0,Global Memory Size->4294770688,
    Maximum Constant Buffer Size->65536,Maximum Constant Arguments->9,
    Local Memory Type->Local,Local Memory Size->16384,Error Correction Support->False,
    Profiling Timer Resolution->1000,Endian Little->True,Available->True,
    Compiler Available->True,Execution Capabilities->{Kernel Execution},
    Command Queue Properties->{Out of Order Execution,Profiling Enabled}},
  2->{Type->GPU,Name->Tesla T10 Processor,Version->OpenCL 1.0 CUDA,
    Extensions->{cl_khr_byte_addressable_store,cl_khr_icd,cl_khr_gl_sharing,
      cl_nv_compiler_options,cl_nv_device_attribute_query,cl_nv_pragma_unroll,
      cl_khr_global_int32_base_atomics,cl_khr_global_int32_extended_atomics,
      cl_khr_local_int32_base_atomics,cl_khr_local_int32_extended_atomics,cl_khr_fp64},
    Driver Version->304.54,Vendor->NVIDIA Corporation,Profile->FULL_PROFILE,Vendor ID->4318,
    Compute Units->30,Core Count->240,Maximum Work Item Dimensions->3,
    Maximum Work Item Sizes->{512,512,64},Maximum Work Group Size->512,
    Preferred Vector Width Character->1,Preferred Vector Width Short->1,
    Preferred Vector Width Integer->1,Preferred Vector Width Long->1,
    Preferred Vector Width Float->1,Preferred Vector Width Double->1,
    Maximum Clock Frequency->1440,Address Bits->32,Maximum Memory Allocation Size->1073692672,
    Image Support->True,Maximum Read Image Arguments->128,Maximum Write Image Arguments->8,
    Maximum Image2D Width->4096,Maximum Image2D Height->16383,Maximum Image3D Width->2048,
    Maximum Image3D Height->2048,Maximum Image3D Depth->2048,Maximum Samplers->16,
    Maximum Parameter Size->4352,Memory Base Address Align->2048,
    Memory Data Type Align Size->128,
    Floating Point Precision Configuration->{Infinity,NaNs,Round to Nearest,
      Round to Infinity,Round to Zero,IEEE754-2008 Fused MAD},
    Global Memory Cache Type->None,Global Memory Cache Line Size->0,
    Global Memory Cache Size->0,Global Memory Size->4294770688,
    Maximum Constant Buffer Size->65536,Maximum Constant Arguments->9,
    Local Memory Type->Local,Local Memory Size->16384,Error Correction Support->False,
    Profiling Timer Resolution->1000,Endian Little->True,Available->True,
    Compiler Available->True,Execution Capabilities->{Kernel Execution},
    Command Queue Properties->{Out of Order Execution,Profiling Enabled}}}}

But:

Needs["CUDALink`"]
In[18]:= CUDAQ[]
Out[18]= False

This is a CentOS 6.3 machine with Mathematica 9.0 Home Edition. The CUDA 5.0 toolkit with the 304.54 driver was installed. When I first issued Needs["CUDALink`"], it downloaded a lot of files from some Wolfram server, although the CUDA 5.0 driver was already installed. It is true that the driver is not in the expected location, but the needed soft links are all set in the right places. For example, all the samples from CUDA 5 compile, and those that need just OpenCL 1.0 or 1.1 run at the Linux prompt. So I am wondering why CUDAQ[] does not return True.
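
A minimal sketch of checks that might narrow this down, assuming the standard CUDALink diagnostic functions are available in this installation. The two environment variable names are, as far as I understand the CUDALink setup documentation, the ones used to point CUDALink at libraries in non-default locations; treat them as an assumption and verify them against the documentation for your version.

```mathematica
Needs["CUDALink`"]

(* Driver version as seen by CUDALink; a failure here would suggest
   that the NVIDIA driver library is not being found *)
CUDADriverVersion[]

(* Information on the CUDA resources that Needs["CUDALink`"] downloaded *)
CUDAResourcesInformation[]

(* Assumption: these variables tell CUDALink where the driver and CUDA
   libraries live; each returns $Failed if the variable is not set *)
Environment["NVIDIA_DRIVER_LIBRARY_PATH"]
Environment["CUDA_LIBRARY_PATH"]

(* Re-test after any change to the environment, in a fresh kernel *)
CUDAQ[]
```

If the two Environment calls return $Failed and the libraries are reachable only through soft links, setting those variables before starting Mathematica may be worth trying.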

Thanks ahead,
János
