Radeon OpenCL now supports double-precision compute

clpeak

Platform: Clover
  Device: AMD BONAIRE
    Driver version  : 10.6.0-devel (Linux x64)
    Compute units   : 14
    Clock frequency : 1050 MHz

    Global memory bandwidth (GBPS)
      float   : 55.14
      float2  : 56.52
      float4  : 54.39
      float8  : 38.98
      float16 : 24.86

    Single-precision compute (GFLOPS)
      float   : 1109.28
      float2  : 960.17
      float4  : 1109.53
      float8  : 1023.15
      float16 : 1075.14

    Double-precision compute (GFLOPS)
      double   : 113.89
      double2  : 113.82
      double4  : 113.68
      double8  : 113.42
      double16 : 112.92

    Integer compute (GIOPS)
      int   : 344.50
      int2  : 329.74
      int4  : 347.39
      int8  : 353.00
      int16 : 351.91

    Transfer bandwidth (GBPS)
      enqueueWriteBuffer         : 4.59
      enqueueReadBuffer          : 1.31
      enqueueMapBuffer(for read) : 8.45
        memcpy from mapped ptr   : 4.68
      enqueueUnmap(after write)  : 1429.37
        memcpy to mapped ptr     : 4.38

    Kernel launch latency : 473.19 us
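
As a rough sanity check on the compute figures above, the measured throughput can be compared with a theoretical peak. The sketch below takes the 14 compute units and 1050 MHz clock from the clpeak output. It assumes 64 shader lanes per compute unit, one multiply-add (2 FLOPs) per lane per clock, and a 1/16 double-precision rate for Bonaire; those three figures are assumptions and do not appear in the output.

```python
# Values reported by clpeak above.
COMPUTE_UNITS = 14
CLOCK_GHZ = 1.050

# Assumptions (not from the clpeak output): GCN layout and Bonaire FP64 rate.
LANES_PER_CU = 64
FLOPS_PER_LANE_PER_CLOCK = 2   # one multiply-add counted as 2 FLOPs
FP64_RATE = 1 / 16


def peak_gflops(rate=1.0):
    """Theoretical peak in GFLOPS for the given fraction of the FP32 rate."""
    return COMPUTE_UNITS * LANES_PER_CU * FLOPS_PER_LANE_PER_CLOCK * CLOCK_GHZ * rate


measured = {"float4": 1109.53, "double": 113.89}
sp_peak = peak_gflops()
dp_peak = peak_gflops(FP64_RATE)

print(f"FP32: {measured['float4']:.2f} of {sp_peak:.1f} GFLOPS "
      f"({measured['float4'] / sp_peak:.0%})")
print(f"FP64: {measured['double']:.2f} of {dp_peak:.1f} GFLOPS "
      f"({measured['double'] / dp_peak:.0%})")
```

Under these assumptions the double-precision result lands at roughly 97% of the theoretical limit, while single precision reaches about 59% of it.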

clinfo
Number of platforms                               1
  Platform Name                                   Clover
  Platform Vendor                                 Mesa
  Platform Version                                OpenCL 1.1 MESA 10.6.0-devel
  Platform Profile                                FULL_PROFILE
  Platform Extensions                             cl_khr_icd
  Platform Extensions function suffix             MESA

  Platform Name                                   Clover
Number of devices                                 1
  Device Name                                     AMD BONAIRE
  Device Vendor                                   X.Org
  Device Vendor ID                                0x1002
  Device Version                                  OpenCL 1.1 MESA 10.6.0-devel
  Driver Version                                  10.6.0-devel
  Device OpenCL C Version                         OpenCL C 1.1
  Device Type                                     GPU
  Device Profile                                  FULL_PROFILE
  Max compute units                               14
  Max clock frequency                             1050MHz
  Max work item dimensions                        3
  Max work item sizes                             256x256x256
  Max work group size                             256
  Preferred work group size multiple              1
  Preferred / native vector sizes                 
    char                                                16 / 16      
    short                                                8 / 8       
    int                                                  4 / 4       
    long                                                 2 / 2       
    half                                                 0 / 0        (n/a)
    float                                                4 / 4       
    double                                               2 / 2        (cl_khr_fp64)
  Half-precision Floating-point support           (n/a)
  Single-precision Floating-point support         (core)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 No
    Round to infinity                             No
    IEEE754-2008 fused multiply-add               No
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Double-precision Floating-point support         (cl_khr_fp64)
    Denormals                                     Yes
    Infinity and NANs                             Yes
    Round to nearest                              Yes
    Round to zero                                 Yes
    Round to infinity                             Yes
    IEEE754-2008 fused multiply-add               Yes
    Support is emulated in software               No
    Correctly-rounded divide and sqrt operations  No
  Address bits                                    32, Little-Endian
  Global memory size                              1073741824 (1024MiB)
  Error Correction support                        No
  Max memory allocation                           268435456 (256MiB)
  Unified memory for Host and Device              Yes
  Minimum alignment for any data type             128 bytes
  Alignment of base address                       128 bits (16 bytes)
  Global Memory cache type                        None
  Image support                                   No
  Local memory type                               Local
  Local memory size                               32768 (32KiB)
  Max constant buffer size                        268435456 (256MiB)
  Max number of constant args                     0
  Max size of kernel argument                     1024
  Queue properties                                
    Out-of-order execution                        No
    Profiling                                     Yes
  Profiling timer resolution                      0ns
  Execution capabilities                          
    Run OpenCL kernels                            Yes
    Run native kernels                            No
  Device Available                                Yes
  Compiler Available                              Yes
  Device Extensions                               cl_khr_fp64

NULL platform behavior
  clGetPlatformInfo(NULL, CL_PLATFORM_NAME, ...)  Clover
  clGetDeviceIDs(NULL, CL_DEVICE_TYPE_ALL, ...)   Success [MESA]
  clCreateContext(NULL, ...) [default]            Success [MESA]
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CPU)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_GPU)  Success (1)
    Platform Name                                 Clover
    Device Name                                   AMD BONAIRE
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ACCELERATOR)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_CUSTOM)  No devices found in platform
  clCreateContextFromType(NULL, CL_DEVICE_TYPE_ALL)  Success (1)
    Platform Name                                 Clover
    Device Name                                   AMD BONAIRE

ICD loader properties
  ICD loader Name                                 OpenCL ICD Loader
  ICD loader Vendor                               OCL Icd free software
  ICD loader Version                              2.1.3
  ICD loader Profile                              OpenCL 1.2
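
To check for double-precision support from a script, one option is to scan clinfo output for the cl_khr_fp64 extension. This is a minimal sketch that parses text in the layout shown above. The helper name `devices_with_fp64` is made up for illustration, and it assumes that clinfo prints "Device Name" before "Device Extensions" for each device.

```python
import re


def devices_with_fp64(clinfo_text):
    """Return names of devices whose 'Device Extensions' line lists cl_khr_fp64."""
    result = []
    current = None
    for line in clinfo_text.splitlines():
        # clinfo separates the field label from its value with two or more spaces.
        parts = re.split(r"\s{2,}", line.strip(), maxsplit=1)
        if len(parts) != 2:
            continue
        key, value = parts
        if key == "Device Name":
            current = value
        elif key == "Device Extensions" and current is not None:
            if "cl_khr_fp64" in value.split():
                result.append(current)
    return result


sample = """\
  Device Name                                     AMD BONAIRE
  Device Type                                     GPU
  Device Extensions                               cl_khr_fp64
"""
print(devices_with_fp64(sample))
```

In practice the text could come from `subprocess.run(["clinfo"], capture_output=True, text=True).stdout` instead of the embedded sample.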


Half-Life 2 Cinematic Mod on radeonsi (Linux Wine nine|csmt vs Windows 8)


View on YouTube


Gears on Gallium 2015.02.28

Released an updated Gears on Gallium:
openSUSE – Factory
Mesa-git – 10.6_git2015.02.28
Kernel – 3.19.0
libdrm – 2.4.99_git2015.02.27
Mesa-demos – 9.1.0_git2015.02.28
wine – 1.7.37-gallium-nine+staging
xorg-server – 1.17.0
xf86-video-ati – 7.99.99_2015.02.27
xf86-video-intel – 2.99.99_2015.02.27
xf86-video-nouveau – 1.1.99_2015.01.22
KDE – 4.14.3
LLVM – 3.7svn
Phoronix Test Suite – 5.4.1
Steam – 1.0.0.48
LXDE – 0.8.0

The root and gog users have empty passwords.
The image is a hybrid ISO, so it can be burned to a CD or written to a USB flash drive (use dd for writing).
Download 910 MB
md5 406864e5586b8446eed08ad398303bc3
sha256 4ffaf9602dbad5e54d48f446ef13a3d67642c757727917eede89297f360e2d32
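
The download can be checked against the checksums above before writing it to a disc or flash drive. A minimal sketch using Python's hashlib; the function names are illustrative, and the ISO filename in the usage note is a placeholder because the post does not give one.

```python
import hashlib

# Checksums published above for the 2015.02.28 image.
EXPECTED_MD5 = "406864e5586b8446eed08ad398303bc3"
EXPECTED_SHA256 = "4ffaf9602dbad5e54d48f446ef13a3d67642c757727917eede89297f360e2d32"


def file_digests(path, chunk_size=1 << 20):
    """Return (md5, sha256) hex digests, reading the file in chunks."""
    md5 = hashlib.md5()
    sha256 = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
            sha256.update(chunk)
    return md5.hexdigest(), sha256.hexdigest()


def image_is_intact(path):
    """True if both digests match the published values."""
    return file_digests(path) == (EXPECTED_MD5, EXPECTED_SHA256)
```

For example, `image_is_intact("gears-on-gallium.iso")` (hypothetical filename) should return True for an undamaged download.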

How to use Wine on the live CD:
It is better to make /home/gog/.wine a symlink to an external HDD or USB drive (the RAM disk is only 700 MB), or to add ramdisk_size=1024000 (1 GB) to the kernel parameters to increase the RAM disk size.
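
The symlink approach might look like the sketch below. The target directory `/mnt/usb/wine` is a made-up mount point for illustration; only `/home/gog/.wine` comes from the post. The helper name `link_wine_prefix` is also illustrative.

```python
from pathlib import Path


def link_wine_prefix(prefix, target):
    """Make `prefix` a symlink to `target`, creating `target` if needed.

    Refuses to touch an existing prefix so that no data is overwritten.
    """
    prefix, target = Path(prefix), Path(target)
    if prefix.exists() or prefix.is_symlink():
        raise FileExistsError(f"{prefix} already exists; move it away first")
    target.mkdir(parents=True, exist_ok=True)
    prefix.symlink_to(target, target_is_directory=True)
    return prefix


# Example call (hypothetical mount point):
# link_wine_prefix("/home/gog/.wine", "/mnt/usb/wine")
```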


View on YouTube


Driver: San Francisco on radeonsi (gallium-nine vs Windows 8 Catalyst)


View on YouTube

League of Legends performance comparison (gallium-nine vs wine-csmt vs wined3d vs opengl)


View on YouTube

Games on Xwayland + vaapi hardware h264 encoding (Next Car Game in Wine with gallium-nine)

Radeon HD 7790
Intel i5 3330
openSUSE Factory

Weston is running on the Intel GPU.
The game is running on the Radeon 7790 with DRI_PRIME.
The screen recorder uses VA-API for real-time hardware H.264 encoding.
Wine uses gallium-nine for best performance.

Enable/disable vaapi-recorder:
super+shift+space q
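
DRI_PRIME is an environment variable read by Mesa, so GPU offloading is selected per process. Below is a minimal sketch of launching a program on the secondary GPU. `run_on_secondary_gpu` is an illustrative helper, and the value `1` assumes the Radeon is the first non-default GPU on the system.

```python
import os
import subprocess


def run_on_secondary_gpu(command):
    """Run `command` with DRI_PRIME=1 set, leaving the parent environment untouched."""
    env = dict(os.environ, DRI_PRIME="1")
    return subprocess.run(command, env=env, check=True)


# Example (the game path is a placeholder):
# run_on_secondary_gpu(["wine", "game.exe"])
```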


View on YouTube

Gallium-nine debug 15.02.16

Gallium-nine debug – live cd for testing gallium-nine
openSUSE – Factory
Mesa – 10.6_git2015.02.15(iXit Mesa-3d-master)
Kernel – 3.19
libdrm – 2.4.99_git2015.02.10
Mesa-demos – 9.1.0_git2014.07.06
wine – 1.7.36-gallium-nine+staging
xorg-server – 1.17.0 (modesetting+page flipping)
xf86-video-ati – 7.99.99_2015.02.10
xf86-video-intel – 2.99.99_2015.02.10
xf86-video-nouveau – 1.1.99_2014.10.25
KDE – 4.14.3
LLVM – 3.7

The root and gog users have empty passwords.
The image is a hybrid ISO, so it can be burned to a CD or written to a USB flash drive (use dd for writing).
Download 748 MB
md5 c93814c930986a35a3718d91d25ee8a7
sha256 d9cc7ac7b4d430d5413e781c461d46308ae112da0ca1c61dade742c43e03846b

How to use Wine with gallium-nine:
It is better to make /home/gog/.wine a symlink to an external HDD or USB drive (the RAM disk is only 700-900 MB), or to add ramdisk_size=2000000 (~2 GB) to the kernel parameters.
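
For reference, the kernel documentation describes ramdisk_size as a size in kilobytes. The small sketch below converts the value above, and the 1024000 suggested for the 2015.02.28 image, into larger units. It assumes that a kilobyte here means 1024 bytes.

```python
def ramdisk_kib_to_mib(kib):
    """Convert a ramdisk_size value (KiB) to MiB."""
    return kib / 1024


for value in (1024000, 2000000):
    mib = ramdisk_kib_to_mib(value)
    print(f"ramdisk_size={value} -> {mib:.0f} MiB ({mib / 1024:.2f} GiB)")
```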

https://wiki.ixit.cz/d3d9_debugging 


View on YouTube


Weston Xwayland gallium-nine performance

Intel i5 3330
Radeon HD 7790

openSUSE Factory
Linux bb 3.19.0-desktop
OpenGL version string: 3.3 Mesa 10.6.0-devel (git-7df256a pontostroy:X11)
Wine 1.7.36-gallium-nine
X-server 1.17.0
wayland 1.7

Tomb Raider 2013 Benchmark

Weston + Xwayland 75.4

X-server + openbox 74.4

Mafia 2 Benchmark

Weston + Xwayland 53.1

X-server + openbox 53.6

GTA 4 Benchmark

Weston + Xwayland 35.52

X-server + openbox 37.86

Street Fighter 4 Benchmark

Weston + Xwayland 149.19

X-server + openbox 152.36
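
To put the scores above side by side, the relative difference between the two setups can be computed per game. This is only a sketch of the arithmetic. It assumes the scores are frame rates where higher is better, which the post does not state explicitly.

```python
# (Weston + Xwayland, X-server + openbox) scores from the benchmarks above.
results = {
    "Tomb Raider 2013": (75.4, 74.4),
    "Mafia 2": (53.1, 53.6),
    "GTA 4": (35.52, 37.86),
    "Street Fighter 4": (149.19, 152.36),
}


def relative_diff_percent(xwayland, xorg):
    """Xwayland score relative to the X-server score, in percent."""
    return (xwayland - xorg) / xorg * 100


for game, (xwayland, xorg) in results.items():
    print(f"{game}: {relative_diff_percent(xwayland, xorg):+.1f}%")
```

With these numbers Xwayland stays within about 2% of the plain X server in three of the four games; GTA 4 is the exception at roughly 6% behind.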


Intel ilo gallium-nine benchmark

Intel i5 3330 (HD 2500)
openSUSE Factory
Linux bb 3.19.0-rc7-8-desktop
OpenGL version string: 2.1 Mesa 10.6.0-devel (git-7df256a pontostroy:X11)
xf86-video-intel 3.00.99~git20150214-1.1

The ilo Gallium driver for Intel has very limited 3D support and cannot run almost 99% of OpenGL or DX9 games.

3DMark 2006

ilo + nine (screenshot: ilo=3dm2006)

i965 + wine (screenshot: i965-3dm2006)

Harvest: Massive Encounter

ilo + nine 36 fps

ilo + wine 61 fps

i965 + wine 90 fps


xf86-video-intel, xf86-video-ati, modesetting vs GtkPerf

Intel i5 3330
Radeon HD 7790
Radeon HD 6770

openSUSE Factory
Linux bb 3.19.0-rc7-8-desktop
OpenGL version string: 3.3 Mesa 10.6.0-devel (git-e68b67b pontostroy:X11)
xf86-video-ati 7.99.99~git20150115-1.3
xf86-video-intel 3.00.99~git20150207-1.1
X-server 1.17.0
Gtk 2 Theme – QtCurve

Radeon HD 7790

Modesetting (glamor) 35.75

xf86-video-ati (glamor) 34.99

Radeon HD 6770

Modesetting 35.46

xf86-video-ati (glamor) 36.14

xf86-video-ati (EXA) 42.43

Intel HD 2500

xf86-video-intel (uxa) 33.29

xf86-video-intel (sna) 26.20

GtkPerf does not work with the modesetting driver.
