Thursday, March 18, 2010

raw data..

games:
*Metro 2033 and Just Cause 2 demo available! (Fermi launch titles?)
*Assassin's Creed 2 and Bad Company 2 this month also..
*Command & Conquer 4: Tiberian Twilight
*3D Vision CD 1.23 has Direct3D 11 support! (so it lists support for the D3D11 Fermi supersled demo)


Internet Explorer 9 preview with Direct2D and DirectWrite support


*3D texture based separable convolution, extension of SDK example
code:
http://forums.nvidia.com/index.php?showtopic=163382
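
For reference, the idea behind a separable 3D convolution can be sketched on the CPU: instead of one pass with a k*k*k kernel, run three 1D passes, one per axis. This is only a minimal host-side sketch and not the forum code (which uses 3D textures on the GPU). The `Volume` type, the function names and the clamp-to-edge border handling are assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Dense 3D volume stored x-fastest: index = x + nx * (y + ny * z).
struct Volume {
    std::size_t nx, ny, nz;
    std::vector<float> data;
    Volume(std::size_t x, std::size_t y, std::size_t z, float fill = 0.0f)
        : nx(x), ny(y), nz(z), data(x * y * z, fill) {}
    float& at(std::size_t x, std::size_t y, std::size_t z) {
        return data[x + nx * (y + ny * z)];
    }
    float at(std::size_t x, std::size_t y, std::size_t z) const {
        return data[x + nx * (y + ny * z)];
    }
};

// One 1D pass along a single axis (0 = x, 1 = y, 2 = z).
// Borders are clamped to the edge; the kernel is applied without flipping,
// which is identical to convolution for symmetric kernels.
Volume convolveAxis(const Volume& in, const std::vector<float>& kernel, int axis) {
    assert(kernel.size() % 2 == 1 && axis >= 0 && axis <= 2);
    const long radius = static_cast<long>(kernel.size() / 2);
    const long dims[3] = {static_cast<long>(in.nx), static_cast<long>(in.ny),
                          static_cast<long>(in.nz)};
    Volume out(in.nx, in.ny, in.nz);
    for (long z = 0; z < dims[2]; ++z) {
        for (long y = 0; y < dims[1]; ++y) {
            for (long x = 0; x < dims[0]; ++x) {
                float sum = 0.0f;
                for (long k = -radius; k <= radius; ++k) {
                    long p[3] = {x, y, z};
                    p[axis] += k;
                    if (p[axis] < 0) p[axis] = 0;
                    if (p[axis] >= dims[axis]) p[axis] = dims[axis] - 1;
                    sum += kernel[static_cast<std::size_t>(k + radius)] *
                           in.at(static_cast<std::size_t>(p[0]),
                                 static_cast<std::size_t>(p[1]),
                                 static_cast<std::size_t>(p[2]));
                }
                out.at(static_cast<std::size_t>(x), static_cast<std::size_t>(y),
                       static_cast<std::size_t>(z)) = sum;
            }
        }
    }
    return out;
}

// Full separable filter: the same 1D kernel applied along x, then y, then z.
Volume convolveSeparable3D(const Volume& in, const std::vector<float>& kernel) {
    return convolveAxis(convolveAxis(convolveAxis(in, kernel, 0), kernel, 1), kernel, 2);
}
```

With a kernel of width k this costs about 3k multiply-adds per voxel instead of k^3, which is why the separable form is worth the extra passes. A host version like this is also handy as a reference to check GPU output against.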

*The binary format for Fermi is similar to PTX: see Luebke's post on the gpgpu-sim mailing list
Someone from PathScale says he has all the info on this and other low-level details, presumably the PTX 1.5 and 2.0 specs (binary format spec?), and also info for an open source CUDA driver for BSD etc..?
*GPGPU-3 papers available!
http://www.ece.neu.edu/groups/nucar/GPGPU/GPGPU-FinalProgram.pdf
*CULA 1.2 available with some eigenvector/eigenvalue routines..

*"GPU Sample Sort" paper for the upcoming IPDPS 2010 conference?
It is possible to achieve much higher sorting rates for NV devices than with the Satish/CUDPP methods. You might be interested in our radix CUDA sorting results here at UVA. We demonstrate 480M pairs/sec, and 550M keys/sec on our GTX285 (with other devices evaluated as well). Interestingly enough, our keys-only results on the NV GT200 architecture are superior to the cycle-accurate sorting results from the (defunct) 32-core Larrabee.
Where is the source?

http://www.cs.virginia.edu/~dgm4d/papers/RadixSortTR.pdf
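
Since the source is not linked, here is a minimal CPU sketch of the general structure an LSD radix sort on key/value pairs follows: per digit, build a histogram, turn it into an exclusive prefix sum, then scatter. This is not the UVA implementation and says nothing about how they reach their rates. The 8-bit digit size, the `Pair` alias and the function name are assumptions for illustration.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

using Pair = std::pair<std::uint32_t, std::uint32_t>;  // (key, value)

// Least-significant-digit radix sort on 32-bit keys, 8 bits per pass.
// Each pass is stable, so after four passes the pairs are sorted by key
// and pairs with equal keys keep their original relative order.
void radixSortPairs(std::vector<Pair>& items) {
    constexpr int kBits = 8;
    constexpr std::size_t kBuckets = std::size_t{1} << kBits;
    std::vector<Pair> scratch(items.size());

    for (int shift = 0; shift < 32; shift += kBits) {
        // 1. Histogram of the current digit.
        std::array<std::size_t, kBuckets> offset{};
        for (const Pair& p : items) {
            ++offset[(p.first >> shift) & (kBuckets - 1)];
        }
        // 2. Exclusive prefix sum: offset[b] becomes the first output slot of bucket b.
        std::size_t running = 0;
        for (std::size_t b = 0; b < kBuckets; ++b) {
            const std::size_t count = offset[b];
            offset[b] = running;
            running += count;
        }
        // 3. Scatter each pair to its slot, in input order (keeps the pass stable).
        for (const Pair& p : items) {
            scratch[offset[(p.first >> shift) & (kBuckets - 1)]++] = p;
        }
        items.swap(scratch);
    }
}
```

The GPU versions parallelize the same three steps; the prefix sum is the part that maps onto the scan primitives found in libraries such as CUDPP.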

Other new sorting papers:
*Revisiting Sorting for GPGPU Stream Architectures
*"GPU Sample Sort" paper for the upcoming IPDPS 2010:
N. Leischner, V. Osipov, and P. Sanders. GPU sample sort. In Proc. Int'l Parallel and Distributed Processing Symposium (IPDPS), to appear, 2010 (currently available at http://arxiv1.library.cornell.edu/abs/0909.5649).

*CUFFT does support streams... and it seems to include the 3D FFT performance improvements from the SC08 paper.
Apple's FFT code now seems to work on Nvidia OpenCL, but it is 2x-3x slower than CUFFT..

2D to 3D video conversion:
we have RealD and other DirectShow plugins..
now:
ArcSoft Sim3D Plus HD coming in Q2..
and PowerDVD 10..
*TrueTheater™ Stabilizer
*TrueTheater™ 3D
*TrueTheater Noise Reduction
PowerDVD 10 Mark II: Consumers who purchase PowerDVD 10 Ultra 3D will receive a FREE UPGRADE that enables support of the Blu-ray 3D format and 2D to 3D conversion of video files. Available this summer.
Blu-ray 3D playback requires FREE "Mark II" upgrade which will be available soon.
Lots of betas coming:
Qt 4.7
Intel compiler 12
VMware Workstation 7.1
Others in March:
OpenRL
Heaven 2.0

http://www.cs.utk.edu/~dongarra/WEB-PAGES/cscads-libtune-09/
1st CUDA Developers' Conference
http://www.smithinst.ac.uk/Events/CUDA2009
see
"Looking after the 7 dwarfs: numerical libraries / frameworks for GPUs" Mike Giles
 also
"The Art of Performance Tuning for CUDA and Manycore Architectures"
David Tarjan (NVIDIA)
Kevin Skadron (U. Virginia)
Paulius Micikevicius (NVIDIA)


CUDPP 1.1.1 svn has Fermi support
Cusp has AMG geometric multigrid..
http://forums.nvidia.com/index.php?showtopic=163382&st=0&#entry1022104

See DirectX 9.0 on OpenGL ES 2.0 -> http://www.gametree.tv/ (Linux SDK)

Coming in Spring 2010, the GameTree.tv Publishing SDK for Intel CE hardware will include the tools you need to optimize and debug your game for the GameTree.tv Gaming Platform, plus the ability to order Intel CE hardware.
Developer Tools & Documentation      available      available
OpenGL ES 1.1 and 2.0
 - Windows Game Development and Emulation
 - Linux Desktop Runtime SDK     available     available
Direct3D® support
 - Fixed-Function
 - Shader Model 1.0 and 2.0 API
 - Linux Desktop Emulation SDK     available     available
Debugging With Visual Studio     Coming March 2010     available
GameTree.tv Developer Forums     Coming Soon     available
Publish Games For Commercial Sale       
Detailed Hardware Setup Documentation   
Hardware Order Process       
Developer Relations Support   
fglrx 8.72.5 has Ubuntu 10.04 support and OpenGL 3.2.97xx (partial OpenGL 3.3/4.0 support?)


Nvidia theater GDC notes:
DMM2
DMM2 free version: 1500 objects max (Star Wars: The Force Unleashed does not use more); adds interop with PhysX and Bullet
also DirectCompute and OpenCL simulation

shipping September/October beta
plastic simulation and fracture mode are still not ready.. it calculates stress on the volume, so breaking is physically based..
uses FP32 for GPU support and SSE..

3D Vision on Unreal Engine 3 shipping in April..
3D Vision SDK soon: code samples etc., developer tricks for Surround
Surround recommends gfx400 in SLI in the "release 256 driver"

Khronos GDC sessions are published; they have
info on AMD OpenCL physics: SPH and soft bodies, no rigid bodies; this is Bullet work..
also the FEM simulation is DMM2 work..
No more interesting talk slides?: FFT profiling for OpenCL by an Nvidia employee

PhysXLab with destruction (precalculated) is in beta now, with Unreal Engine 3 integration

New Unigine 2.0 this month on the 26th: has Linux support? And Windows OpenGL tessellation support with Fermi/5xxx cards?
Nsight, GTX 480: March 8 release
Nexus 1.0: OpenGL and OpenCL analyzer, not a hardware debugger but like gDEBugger GL+CL


Fermi games: Just Cause 2 (D3D10 only), Metro 2033 (D3D11 optional)
http://nvidia.fullviewmedia.com/gdc2010/agenda.html
OpenGL 4.0 support in OpenGL Extensions Viewer and in GLEW trunk!
Assassin's Creed 2, Bad Company 2
ATI Open 3D
Nvidia 3DTV

CUDA and Visual Studio:

QUOTE
- create empty cuda projects through "project.."

You can just create an ordinary console project and then add .cu files to this project (see next point).

QUOTE
- add new .cu files through "add new item" (renaming c++ or txt in .cu files causes build errors)

If you add the CUDA build rules (Cuda.rules, distributed with the SDK) then VS will automatically detect the .cu files and pass them to nvcc to compile these to standard .obj files, the standard linker (link.exe) will then link these with the rest of your application's .obj files.

QUOTE
- doesn't highlight code in .cu files

See the instructions in (SDK_INSTALL_DIR)\C\doc\syntax_highlighting\visual_studio_8

QUOTE
- must copy a thousand times cutil64.dll around till it releases the program ...

Cutil is used to minimise code replication between the SDK samples. I'd advise understanding what you actually need and implementing it yourself. For example, most people only want the cuda safe call macros and you would be better off handling the error in a manner suitable for your app rather than just calling exit().
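
A minimal sketch of that advice: wrap each runtime call in a check that throws, so the application decides how to recover instead of exiting. To keep the example compilable without the CUDA toolkit, it uses stand-in names (`FakeCudaError`, `fakeSuccess`, `fakeGetErrorString`), all of which are hypothetical. In a real project you would include `<cuda_runtime.h>` and use `cudaError_t`, `cudaSuccess` and `cudaGetErrorString` instead. `CHECK_CALL` and `checkCall` are illustrative names too, not part of cutil or the CUDA API.

```cpp
#include <stdexcept>
#include <string>

// Stand-ins (assumption) so the sketch builds without the CUDA toolkit.
// Replace with cudaError_t / cudaSuccess / cudaGetErrorString in real code.
enum FakeCudaError { fakeSuccess = 0, fakeErrorMemoryAllocation = 2 };

inline const char* fakeGetErrorString(FakeCudaError e) {
    return e == fakeSuccess ? "no error" : "out of memory";
}

// Throws on failure instead of calling exit(), so callers can clean up,
// retry with a smaller problem size, or fall back to a CPU path.
inline void checkCall(FakeCudaError status, const char* expr, const char* file, int line) {
    if (status != fakeSuccess) {
        throw std::runtime_error(std::string(file) + ":" + std::to_string(line) + ": " +
                                 expr + " failed: " + fakeGetErrorString(status));
    }
}

// Records the failing expression and its source location automatically.
#define CHECK_CALL(call) checkCall((call), #call, __FILE__, __LINE__)
```

Usage would look like `CHECK_CALL(cudaMalloc(&ptr, bytes));` inside a `try` block, with the `catch` deciding what a failure means for the app. That covers what most people take from cutil without shipping cutil64.dll.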

QUOTE
- must add a "thousand" new libraries not to cause build errors

By "thousand" do you mean one (cudart.lib)?! Ok, so you're using cutil so you need cutil64.lib too. But by definition using any library (and the CUDA API is provided through a library) you have to link with libraries.

QUOTE
- and even then it's not sure if it runs

Can't help with that one (without more info).

I would advise the following.

Preparation:
  • Set up syntax highlighting
  • Set up Intellisense

Development:
  • Create a new, empty, console project (or you can use an existing project if you have one)
  • Add your .c, .cpp and .cu files
  • Add the Cuda.rules
  • Modify C/C++ code generation to use /MT in release, /MTd in debug
  • Do the same for the Cuda code generation
  • Add cudart.lib to all configurations (i.e. release and debug)
  • Build, run, debug etc.

Proceedings of the 24th IEEE International Parallel and Distributed Processing Symposium

GPU papers:

Session 2: Scientific Computing with GPUs
Improving Numerical Reproducibility and Stability in Large-Scale Numerical Simulations on GPUs
Implementing the Himeno Benchmark with CUDA on GPU Clusters
Direct Self-Consistent Field Computations on GPU Clusters
Parallelization of Tau-Leap Coarse-Grained Monte Carlo Simulations on GPUs

A High-Performance Fault-Tolerant Software Framework for Memory on Commodity GPUs

Sort
High Performance Comparison-Based Sorting Algorithm on Many-Core GPUs
GPU Sample Sort
Highly Scalable Parallel Sorting

Session 9: Software Support for Using GPUs
Object-Oriented Stream Programming using Aspects
Optimal Loop Unrolling For GPGPU Programs
Speculative Execution on Multi-GPU Systems
Dynamic Load Balancing on Single- and Multi-GPU Systems


Fisheye Lens Distortion Correction on Multicore and Hardware Accelerator Platforms
Large-Scale Multi-Dimensional Document Clustering on GPU Clusters

Dynamically Tuned Push-Relabel Algorithm for the Maximum Flow Problem on CPU-GPU-Hybrid Platforms

Optimization of Linked List Prefix Computations on Multithreaded GPUs Using CUDA

Inter-Block GPU Communication via Fast Barrier Synchronization
