Regarding DX IL:
Well, I can only generate it with fxc, right? It also seems I can't feed DX IL back to DX via fxc, D3DCompile or CreateComputeShader. So what is it for, apart from IHVs using it as the base for their drivers? That means no modifying the IL and compiling from it. ATI SKA also gets it but doesn't generate from it.
Also, is the DX IL spec public or available anywhere?
Regarding OGL-DX interop through OCL:
With the new DX extensions for OCL (which Nvidia has only published and AMD is shipping), is it possible to
use them for OGL-DX interop (using clCreateContext with cl_context_properties holding both the OGL context and the D3D context entries)?
Will it work someday? On one vendor at least? The OGL extension says it could be possible.
Also, what about wgl_dx_interop: is it going to be supported on Vista/7 with D3D9, 10 and 11?
Is it going to be introduced (at least as a spec txt) in the Fermi GL extensions this month?
Regarding OCL binaries:
I found that AMD OpenCL 2.01 supports binaries (both CPU and GPU targets), both getting them and building from them, although AMD's release notes list that as a missing feature.
Perhaps since 2.0.
The CPU target binary should be cross-CPU, i.e. work with all CPUs (AMD, Intel) across generations, even Atoms.
There is a flag for requiring only SSE2 instead of the current SSE3. Will it then generate only SSE2 code and run even on a P4?
GPU support is good but worse than Nvidia's: the first chars of the binary are CLBC (CL byte code? similar to DXBC) and it contains device assembly code, so a binary built on my 5xxx will not work on a 4xxx. AMD IL would be better, as it would work on all supported GPUs.
Well, at least it seems that OCL generates AMD IL v2 on my 5xxx, and I don't know if that works on 4xxx.
It also seems to be an ELF binary, and it carries other info besides the code, so you can't just modify the code, as some headers record the code size, etc.
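A quick way to check the two observations above on a dumped binary is to look for the marker bytes. This is only a minimal sketch: `sniff_cl_binary` is a made-up helper name, the `CLBC` prefix is just what I saw in my own binaries (not a documented format), and the only thing taken from a real spec is the standard ELF magic.

```python
ELF_MAGIC = b"\x7fELF"   # standard ELF identification bytes
CLBC_TAG = b"CLBC"       # prefix observed in this post, NOT a documented format


def sniff_cl_binary(blob: bytes) -> dict:
    """Report which known markers appear in a dumped OpenCL program binary."""
    return {
        "starts_with_clbc": blob.startswith(CLBC_TAG),
        "starts_with_elf": blob.startswith(ELF_MAGIC),
        # Offset of the first ELF magic anywhere in the blob, or -1 if absent.
        "elf_offset": blob.find(ELF_MAGIC),
        "size": len(blob),
    }
```

Feed it the bytes of a binary saved to disk, e.g. `sniff_cl_binary(open(path, "rb").read())`. It only says where the markers are; it tells nothing about whether the code section can be patched safely.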
How do OCL GPU binaries compare to the CAL ELF binaries from calclAssemble?
Are the formats going to be published, similar to the CAL ELF binaries? Well, at least those were published some time ago, but I don't know if they are up to date, or even usable now that device assembly seems not possible, or at least not officially supported, on 5xxx.
Also remember that Nvidia gets PTX, so current OCL binaries should work with Fermi according to the Fermi compatibility guide.
Straight PTX also allows modifying the code. Possible, but the 1.5 spec is still not published (this month?).
Anyway, I didn't mention it last time, but with decuda git now having most GT200 arch instructions (SM 1.3), you could theoretically write a CUDA wrapper that intercepts the cubin, uses decuda to get PTX, and feeds that to the CUDA stack. I don't know why Nvidia doesn't do that. Well, they must have their reasons regarding precision,
mul24 not being a native instruction, etc.
I have also ported/fixed Swan for Windows and added better OpenCL translation from CUDA kernels.
Trying to get the CAL++ fixes for Windows too.
Today's news:
*CeBIT: GeForce 480 boxes show 1.5GB RAM and 8-pin + 6-pin connectors.
ATI's competition will be a 950MHz/5000MHz 5870 and a 5970 with 4GB at 850MHz.
Also, a dual Fermi by Asus seems possible at Computex.
*http://www.geosenseforwindows.com/ supplies a sensor driver for Windows for use with the location APIs.
It includes a Google Maps enabled demo and works with the weather gadget.
So I hope the Qt Location API in the Mobility pack gets Win7 location API support.
*CeBIT: Gigabyte shows a laptop with a docking station holding an Nvidia GTX 2xx for laptops, and a netbook with multitouch that converts to a tablet.
*Hardware-accelerated graphics and text in Firefox: DirectWrite and Direct2D in the Firefox nightly for Windows 7.
*glu3 soon.
Old news:
*Flash 10.3 beta 3 supports GPU decoding for fluid HD YouTube on netbooks with GMA500 (720p) and Broadcom CrystalHD (1080p), with the new GMA500 and CrystalHD drivers.
As it's based on DXVA, it seems they now have proper DXVA in the drivers. Is it DXVA 1 or 2? I suppose 1, as it also works on XP, but maybe on Vista it uses DXVA 2.0?
*C3DL 2.0 now WebGL and beyond
*OpenSceneGraph 1.96 supports OGL ES 1.x and 2.0 and GL 3.x; iPhone coming soon.
OCL tip:
Images on today's hardware have caches, so you get most of the benefits of local memory without the difficulty. The caches are small (~32kB L1, ~768kB L2) so you need a lot of locality to make it work.
Writing to images is very slow. Avoid it if you can.
AMD OpenCL running on a Pentium P4
using SET CPU_ENABLE_ALL=1
http://moozoo.dyndns.org/misc/OpenCLonP4.jpg
There are no special instructions needed to get it to work.
I have Windows XP SP3
This is what I did
With Visual Studio Professional 2008 installed:
Install ati-stream-sdk-v2.01-xp32.exe
Add CPU_ENABLE_ALL with value 1 via My Computer->Properties->Advanced->Environment Variables.
As per SDK instructions copy
My Documents\ATI Stream\samples\opencl\bin\x86\BoxFilter_Input.bmp
to
My Documents\ATI Stream\samples\opencl\cl\app\BoxFilter
Open
My Documents\ATI Stream\samples\opencl\OpenCLSamples.sln
Modify SDKApplication.cpp so that
SDKSample::SDKSample(std::string sampleName)
and
SDKSample::SDKSample(const char* sampleName)
have
deviceType = "cpu";
instead of
deviceType = "gpu";
Rebuild Solution.
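If you would rather not set the variable system-wide, the environment-variable step above can probably be done per process instead. A minimal sketch, assuming the runtime reads CPU_ENABLE_ALL from the process environment when the sample starts; both function names are made up for illustration and the executable path is whatever your build produced.

```python
import os
import subprocess


def env_with_cpu_enable_all(base_env=None):
    """Return a copy of the environment with CPU_ENABLE_ALL=1 added."""
    env = dict(os.environ if base_env is None else base_env)
    env["CPU_ENABLE_ALL"] = "1"
    return env


def run_sample(exe_path):
    """Launch one sample with the variable set only for that process."""
    # exe_path is a placeholder: pass the path of the sample you built.
    return subprocess.run([exe_path], env=env_with_cpu_enable_all()).returncode
```

The copy is deliberate: the parent environment stays untouched, so other programs are not affected by the setting.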
D3D IL documentation:
http://msdn.microsoft.com/en-gb/library/ms800355.aspx